Microsoft’s AI CEO: Web Content (Without a Robots.txt File) Is ‘Freeware’ for AI Training


Slashdot reader joshuark shared this report from Windows Central:

Microsoft may have opened a can of worms with recent comments from the tech giant’s AI CEO, Mustafa Suleyman. The CEO spoke with CNBC’s Andrew Ross Sorkin at the Aspen Ideas Festival earlier this week. In his remarks, Suleyman asserted that all content shared on the web is available to be used for AI training, unless a content producer decides otherwise.

The discussion as a whole was interesting, but this particular question was very direct. The CNBC interviewer said, “There are a number of authors here… and a number of journalists as well. And it seems like a lot of the information that’s been trained on over the years has come from the web – and some of it’s from the open web, some of it’s not, and we’ve heard stories about how OpenAI was turning YouTube videos into transcripts and then training on those transcripts.”

The question then becomes: “Who is supposed to own the intellectual property, who is supposed to get value from it, and, to put it very bluntly, whether AI companies have actually stolen the world’s intellectual property.” Suleyman begins his answer (at the 14:40 mark) with “Yeah, I think — look, that’s a very fair point.”

SULEYMAN: “I think that for content that’s already on the open Web, the social contract since the ’90s is that this content is fair use. Anybody can copy it, recreate it, reproduce it. It’s freeware, if you will. That’s the understanding.

“There is a separate category where a website or publisher or news organization has explicitly said, ‘Do not scrape or crawl my information for any reason other than to index me so that other people can find this content.’ That’s a gray area and I think it’s going to play out in court.”

Q: And what does it mean when you say “it’s a gray area”?

SULEYMAN: “Well, yes — so far, some people have taken this information… but it’s going to be brought before the courts, and I think it’s rightly so…

“You know, the information economy is about to change dramatically because we’re going to reduce the cost of producing knowledge to zero marginal cost. This is a very hard thing for people to understand, but in 15 or 20 years, we’re going to be producing new scientific cultural knowledge at almost zero marginal cost. It’s going to be widely available, open access, and available to everyone. And I think that’s going to be a real inflection point in the history of our species. Because what are we, collectively, as a human organism, if not an engine of intellectual production? We produce knowledge. Our science makes us better. And so what we really want in the world, I think, are new engines that can supercharge discovery and invention.”
