Saturday, September 23, 2023

The 50 Books Used to “Train” ChatGPT: From Harry Potter to 1984

I am James Novak, a passionate and experienced news writer with the ultimate goal of delivering the most accurate and timely information to my readers. I work in the news department at a website dedicated to providing reliable and up-to-date information about technology. My articles are widely circulated, often featured on major publications, and have been read by millions of people around the world. With over four years of writing experience in various fields such as tech startups, industry trends, cybersecurity, AI/ML advances, and more, I bring an informed perspective to all topics I write on. Beyond my published work online and in print media outlets, I'm also an avid speaker at local events where I share my insights on current issues related to technology.

Researcher David Bamman of the University of California, Berkeley, and a team of colleagues discovered which texts the chatbot has “read”

From Harry Potter to 1984, by way of Gone with the Wind and Beloved by Nobel laureate Toni Morrison: 50 books, including many science fiction and fantasy classics, were used to train the ChatGPT AI model. Researcher David Bamman of the University of California, Berkeley, discovered this quite by chance.

The study

The researcher, whose work involves extracting data from classic literature on topics such as the relationships between the characters in a novel, was working on Jane Austen’s masterpiece Pride and Prejudice when he decided to put his questions to ChatGPT. The software gave precise answers, but there was no way to tell how the chatbot had come to know the material, because the inner workings of large language models are a black box. So Bamman and his team tested ChatGPT’s knowledge of a number of books and assigned each one a score. The higher the score, the more likely it was that the book was part of the software’s training dataset. They then gathered their findings in a study reported by Business Insider.

The books known to ChatGPT

The list of 50 novels that helped train ChatGPT includes such classics as Moby Dick, The Scarlet Letter, The Color Purple, The Remains of the Day, and The Grapes of Wrath. But the books the AI model knows best are science fiction and fantasy. At the top of the list are J.K. Rowling’s Harry Potter and the Philosopher’s Stone and George Orwell’s 1984, followed by landmark works such as The Lord of the Rings, Fahrenheit 451, and Brave New World, as well as Neuromancer by William Gibson and Do Androids Dream of Electric Sheep? by Philip K. Dick. Then come A Game of Thrones, The Hitchhiker’s Guide to the Galaxy, and The Da Vinci Code. The list of books absorbed by ChatGPT also includes a few novels from Ian Fleming’s James Bond series. “The resources these AI models are trained on will influence the kind of models themselves and the values they present,” says Bamman. “What happens when a bot devours fiction about all sorts of dark, dystopian worlds? Could this influence these models in ways that are not related to literary or narrative things? There is still a lot of work to be done in this direction. We do not yet have an answer to this question,” the researcher concluded.

Source: TG 24 Sky
