In May, Informa, a multibillion-dollar British multinational, announced in a business update that it had signed an agreement with Microsoft for “access to advanced learning content and data, as well as a partnership to explore expert AI applications.” Informa is the parent company of Taylor & Francis, which publishes a wide range of academic and technical books and journals, so the data in question may include the content of those books and journals.
According to reports published last week, the authors of the content do not appear to have been consulted or even informed of the agreement. Furthermore, they say they were not given the opportunity to opt out and that they will not receive any income in return.
Academics are just the latest in a series of groups of what we might call content creators to take offense at having their work ingested by the generative AI models that are now rushing to suck up the products of human culture. Newspapers, visual artists and record labels are already suing AI companies.
While it’s not yet clear how Informa will respond to the rumblings of discontent, the agreement serves as a reminder to authors that they need to be aware of the contractual terms of the publishing deals they sign.
What does the Informa agreement contain?
Informa’s update outlined four priority areas of the agreement with Microsoft:
- increase Informa productivity
- develop an automated citation tool
- develop AI-powered research assistant software (perhaps like the system being tested by the online academic library JSTOR)
- provide Microsoft with access to data to “help improve the relevance and performance of AI systems.”
Informa will receive more than £8 million (A$15.5 million) for initial access to the data, followed by recurring payments of an unspecified amount over the next three years.
It’s unclear what Microsoft plans to do with its access to the data, but one likely scenario is that the content of academic books and papers will be added to the training data for generative AI models like ChatGPT. In principle, this should make the output of AI systems more accurate, although existing AI models have come under heavy criticism, not only for regurgitating training data without citation (which can be considered a kind of plagiarism), but also for inventing false information and attributing it to real sources.
However, the update also states that “the agreement protects intellectual property rights, including limits on verbatim text excerpts and alignment on the importance of detailed citation references.”
The “limits on verbatim text excerpts” mentioned probably relate to the US fair use doctrine, which permits certain uses of copyrighted material.
Many generative AI companies are currently facing copyright infringement lawsuits over their use of training data, and their defense will likely rely on fair use claims.
The importance of detailed citation references may relate to the concept of attribution in copyright law. This is a moral right belonging to authors. It provides that the creator of a work must be identified and credited as its author when the work is reproduced.
How does scientific publishing usually work?
Most academics are not paid for most of their scientific publications and make no profit from them. Instead, writing articles for journals and conferences is considered part of the work of a full-time, permanent position. Publication enhances an academic’s credibility and promotes their research.
The basic process typically goes like this: An author researches and writes an original article, then submits it to a journal editor for peer review. Most reviewers and editorial board members also do not receive compensation for their work.
In fact, some journals may require authors to pay an “article processing charge” to cover publishing and other costs. This can amount to thousands of dollars for an open access publication. As a general rule, the more prestigious the publication, the higher the price.
If a paper successfully passes peer review, the author will be asked to sign a publishing agreement. Its terms and conditions may cover logistical aspects such as the publication date of the article, the format (print, online or both) and the distribution of royalties (if any). Provisions will also be made regarding copyright and ownership of the article.
An author must generally also grant exclusive rights to the publisher to distribute and publish the article. This may mean that the author cannot publish the article elsewhere, and the publisher may also be able to sublicense the article to a third party, such as an AI company.
Sometimes publishers ask an author to assign copyright in the article to them through a copyright transfer agreement.
In essence, this means that the author transfers all of their rights as copyright holder of the work to the publisher. The publisher may then reproduce, communicate, distribute, or license the work to others as it wishes.
It is possible to grant only limited rights, rather than all rights, and this is something authors should consider.
Content exploration
It is essential that authors understand the implications of licenses and assignments and think carefully about what they are agreeing to when they sign a contract. In light of the recent trend of publishers striking deals with generative AI companies, publishers’ AI policies should also be closely examined.
In the United States, a standard collective licensing solution for content use in internal AI systems has recently been published, which sets out the rights and remuneration of copyright holders. Similar licenses for the use of content in AI systems are expected to arrive in the Australian market very soon.
The types of deals being struck between academic publishers and AI companies have raised broader concerns among many academics. Do we want scientific research to be reduced to content for knowledge exploration by AI? There are no clear answers about the ethics and morality of such practices.