the health transformation knowledge portal
Joaquim Cardoso MSc
March 7, 2024
What is the message?
A new report from plagiarism detector Copyleaks reveals that 60% of OpenAI’s GPT-3.5 outputs contain some form of plagiarism, sparking concerns among content creators and raising legal questions about copyright infringement.
This summary is based on the article “New report: 60% of OpenAI model’s responses contain plagiarism”, published by Axios and written by Megan Morrone on February 22, 2024.
ONE PAGE SUMMARY
What are the key points?
Copyleaks found that 60% of GPT-3.5 outputs contained plagiarism, with varying degrees of similarity ranging from identical text to paraphrased content.
Plagiarism detection tools like Copyleaks aim to turn identifying plagiarism into a precise science, using proprietary scoring methods to assess the originality of content.
The report analyzed GPT-3.5 outputs across 26 subjects, with the highest similarity scores observed in computer science and physics, and the lowest in theater and humanities.
OpenAI, the developer of GPT-3.5, responded to the findings by stating that its models are designed to learn concepts and solve problems, and that it has measures in place to prevent inadvertent memorization.
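To make the idea of a similarity score concrete, here is a minimal sketch in Python. It is not Copyleaks' method, which is proprietary and not described in the report. It assumes a simple stand-in technique: the share of a candidate text's three-word sequences that also appear in a source text. The function names `word_ngrams` and `overlap_score` are hypothetical, chosen only for illustration.

```python
def word_ngrams(text: str, n: int = 3) -> set:
    """Return the set of n-word sequences in the text (lowercased)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_score(candidate: str, source: str, n: int = 3) -> float:
    """Fraction of the candidate's n-word sequences also found in the source.

    Illustrative assumption only: a real detector would also need to
    handle paraphrase, punctuation, and very large reference corpora.
    """
    candidate_grams = word_ngrams(candidate, n)
    if not candidate_grams:
        return 0.0  # candidate is shorter than n words
    source_grams = word_ngrams(source, n)
    return len(candidate_grams & source_grams) / len(candidate_grams)


source = "the quick brown fox jumps over the lazy dog"
print(overlap_score(source, source))                        # identical text
print(overlap_score("the quick brown fox sleeps", source))  # partial overlap
print(overlap_score("an entirely different sentence here", source))
```

A score like this only catches verbatim overlap. Detecting paraphrased content, which the report also counts as plagiarism, requires more than exact matching.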
What are the key statistics?
60% of GPT-3.5 outputs contained some form of plagiarism.
Individual similarity scores varied widely by subject, with the highest in computer science (100%) and the lowest in theater (0.9%).
What are the key examples?
One example cited is The New York Times' lawsuit against Microsoft and OpenAI, which alleges copyright infringement through the AI systems’ “widescale copying.”
Conclusion
The report suggests that plagiarism is common in GPT-3.5 outputs and points to potential legal ramifications for organizations like OpenAI.
It highlights the need for stricter safeguards against plagiarism in AI models and raises questions about the ethical use of AI technology in content generation.