Throughout 2022, AI researchers kept repeating the same refrain: “Please slow down.” The pace of AI news this year was relentless; as soon as you understood where things stood, a new paper or breakthrough made that understanding obsolete.
When it comes to generative AI, which can produce novel works of text, images, audio, and video, 2022 was arguably the year we hit the knee of the curve. After a decade of research, deep learning finally crossed into commercial applications this year, letting millions of people try the technology for the first time. AI innovations raised eyebrows, sparked debates, triggered existential crises, and inspired awe.
Here are the seven biggest AI news stories of the past year. It was hard to keep the list to just seven; had we not, we would still be writing about this year’s events well into 2023 and beyond.
April: DALL-E 2 dreams in images
In April, OpenAI unveiled DALL-E 2, a deep-learning image synthesis model that stunned people with its uncanny ability to generate images from text prompts. Trained on hundreds of millions of images pulled from the Internet and powered by a technique called diffusion, DALL-E 2 could combine concepts in inventive new ways.
Twitter was soon flooded with images of astronauts on horseback, teddy bears wandering ancient Egypt, and other nearly photorealistic works. The last time we heard from DALL-E, version 1 of the model was struggling to render a low-resolution avocado armchair; suddenly, version 2 was depicting our wildest dreams at 1024×1024 resolution.
Citing worries about misuse, OpenAI initially restricted access to DALL-E 2 to 200 beta testers, and content filters blocked sexual and violent prompts. After gradually admitting more than a million people to a closed beta, OpenAI finally opened DALL-E 2 to everyone in late September. But as we’ll see, by then another contender had emerged in the image-synthesis arena.
July: Google engineer believes LaMDA is sentient
In June, The Washington Post reported that a Google engineer named Blake Lemoine had been placed on paid leave after he became convinced that Google’s LaMDA (Language Model for Dialogue Applications) was sentient and deserved rights comparable to those of a human.
Lemoine, who worked in Google’s Responsible AI organisation, began chatting with LaMDA about religion and philosophy and came to believe there was a genuine intelligence behind the text. “I know a person when I talk to it,” Lemoine told the Post. “It doesn’t matter whether they have a brain made of meat in their head. Or if they have a billion lines of code. I talk to them. And I hear what they have to say, and that is how I decide what is and isn’t a person.”
Google responded that LaMDA was only telling Lemoine what he wanted to hear and was not, in fact, sentient. Like the text-generation model GPT-3, LaMDA had been trained on millions of books and websites. It responded to Lemoine’s input (a prompt that includes the entire text of the conversation) by predicting the most likely words to follow, without any deeper understanding.
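To make that mechanism concrete, here is a minimal illustrative sketch of how such a system operates, using the openly available GPT-2 model via the Hugging Face transformers library as a stand-in (LaMDA itself is not public); the prompt text and sampling settings are our own assumptions, not Google’s.

```python
# Minimal sketch: a "conversation" with a language model is just one long text
# prompt, and the model appends statistically likely words. GPT-2 stands in for
# LaMDA here purely for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The entire dialogue so far is packed into a single prompt string.
dialogue = (
    "The following is a conversation between a human and an AI.\n"
    "Human: Do you ever think about what it means to be alive?\n"
    "AI:"
)

# The model has no beliefs or inner life; it only predicts probable continuations.
result = generator(dialogue, max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])
```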
Along the way, Lemoine was accused of violating Google’s confidentiality policy by sharing details of his group’s work with outside parties. Google fired him in late July for breaking its data security policies. As we’ll see, he wasn’t the only person in 2022 to get swept up in the hype over an AI’s large language model.
July: DeepMind’s AlphaFold predicts nearly all known protein structures
In July, DeepMind announced that its AlphaFold AI model had predicted the shapes of nearly all known proteins from almost every organism on Earth with a sequenced genome. First introduced in the summer of 2021, AlphaFold had earlier predicted the structure of nearly every human protein; a year later, its database had grown to more than 200 million protein structures.
DeepMind made these predicted structures available in a public database hosted by the European Bioinformatics Institute at the European Molecular Biology Laboratory (EMBL-EBI), allowing researchers around the world to access and use the data for medical and biological research.
Knowing the structure of proteins, the fundamental building blocks of life, can help scientists control or modify them, which is especially useful for developing new drugs. “Almost every medicine that has come to market over the past few years has been designed partly through knowledge of protein structures,” said Janet Thornton, a senior scientist and director emeritus at EMBL-EBI. That makes knowing all of them a very big deal.
August: Stable Diffusion open-sources image synthesis
On August 22, Stability AI and CompVis released Stable Diffusion 1.4, an image synthesis model comparable to OpenAI’s DALL-E 2. But where DALL-E 2 launched as a closed model with substantial restrictions, Stable Diffusion debuted as an open source project, complete with source code and checkpoint files. (The model was reportedly trained in the cloud at a cost of about $600,000.) Its openness meant synthetic content could be generated without restriction, and unlike DALL-E 2, Stable Diffusion could be run locally and privately on a PC with a capable GPU.
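For a sense of what running it locally looks like in practice, here is a minimal sketch using the open source Hugging Face diffusers library with the publicly released Stable Diffusion 1.4 weights; the library choice, prompt, and settings are our own illustrative assumptions rather than part of the official release.

```python
# Minimal sketch: generating an image entirely on a local machine with the
# open Stable Diffusion 1.4 weights via the diffusers library.
# Requires a CUDA-capable GPU with several GB of VRAM.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,  # half precision to fit consumer GPUs
)
pipe = pipe.to("cuda")

# No prompt or image ever leaves the PC.
image = pipe("a teddy bear exploring ancient Egypt, photorealistic").images[0]
image.save("teddy_bear_egypt.png")
```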
Not everyone greeted Stability AI’s move as a technological victory, however. Critics warned that the software could be used to create child sexual abuse material, non-consensual pornography, fabricated alternate histories, and political misinformation. Artists argued that it could appropriate the styles of working artists and potentially put them out of business. The way the model’s image dataset had been assembled also proved problematic when one woman discovered that her private medical photos had been scraped from the web with no way to have them removed, and bias in the training dataset drew criticism as well.
Meanwhile, hobbyists embraced Stable Diffusion wholeheartedly and quickly built an open source ecosystem around it. Some companies built their own websites and apps on top of the engine, and thanks to a technique called Dreambooth, which makes it easy to fine-tune the Stable Diffusion model, numerous derivative models trained on specific subject matter (such as Disney art, shoes, or pornography) soon emerged. Now at version 2.1, Stable Diffusion remains a dominant force in image synthesis.
August: Reaction from artists after AI art wins a state fair competition
In early August, Colorado resident Jason Allen entered three AI-generated images in the Colorado State Fair’s fine arts competition. Before the end of the month, he revealed that one of them, Théâtre d’Opéra Spatial, had taken first place in the Digital Arts/Digitally-Manipulated Photography category. When word of the victory spread, the reaction online was explosive.
Allen had used Midjourney, a commercial image synthesis model accessed through a customised Discord server; it is comparable to Stable Diffusion but has a distinct visual aesthetic. He printed the three images on canvas and entered them in the competition. The symbolic victory of AI over human artists set off a heated debate on social media about the nature of art and what it means to be an artist.
Relatedly, a broader cultural debate over the ethics of AI-generated art has erupted. The computer scientists who created the technology see image synthesis as an inevitable and positive step forward, while artists who have trained for decades see it as an existential threat. People have exchanged death threats on social media, and artist communities have protested or banned AI-generated work. The argument rages on today and may not be settled any time soon.
November: Meta’s Cicero excels at Diplomacy
In late November, Meta unveiled Cicero, an AI bot that can beat humans at online games of Diplomacy on webDiplomacy.net. That’s a significant accomplishment because Diplomacy is largely a social game: winning requires intensive persuasion, cooperation, and negotiation with other players. In effect, Meta built a bot that could convince people they were playing with another human.
To hone Cicero’s negotiating abilities, Meta trained its large language model component on text drawn from the Internet as well as transcripts of 40,000 human-played Diplomacy games from webDiplomacy.net. Meanwhile, Meta built a strategic reasoning component that could assess the state of the game, predict how the other players would act, and then choose its moves accordingly.
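As a rough illustration of that two-part design (and emphatically not Meta’s actual code), the sketch below shows the control loop in miniature: a strategy module plans a move and an intent, and a dialogue model turns that intent into a negotiation message. Every function and value here is a hypothetical stand-in.

```python
# Purely illustrative sketch of a planner + dialogue-model loop; all functions
# and data are hypothetical stand-ins, not Meta's Cicero implementation.

def plan_strategy(game_state):
    # A strategic module would evaluate the board and predict other players'
    # likely moves, then return a planned order plus an intent to propose.
    return {"order": "A VIE-GAL", "ally": "RUSSIA", "proposal": "support my move into Galicia"}

def generate_message(intent):
    # A Diplomacy-tuned language model would generate this text, conditioned on
    # the dialogue history and the planned intent; canned text stands in here.
    return (f"To {intent['ally']}: if you {intent['proposal']}, "
            "I'll back your move in the south next turn.")

def play_turn(game_state):
    intent = plan_strategy(game_state)
    return intent["order"], generate_message(intent)

if __name__ == "__main__":
    order, message = play_turn({"season": "Spring 1901"})
    print("Planned order:", order)
    print("Outgoing message:", message)
```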
Meta believes Cicero’s lessons could help power a new generation of video games with smarter NPCs, or lower barriers to communication between humans and AI across multi-session conversations. Of course, the same techniques applied in other social settings could also be used to deceive or scam people by impersonating humans.
December: ChatGPT talks to the world
On the last day of November, OpenAI unveiled ChatGPT, a chatbot built on a fine-tuned version of its GPT-3 large language model. OpenAI made it freely available on its website in order to gather feedback from the public on how to make the model more accurate and potentially less harmful.
Five days after ChatGPT’s debut, OpenAI CEO Sam Altman tweeted that it had amassed more than a million users. People used it to generate recipes, write poetry, simulate a Linux terminal session, help with programming tasks, and much more. Researchers also quickly figured out how to bypass its guardrails against answering potentially harmful questions by using prompt injection attacks.
Although ChatGPT largely channelled what GPT-3 had been able to do since 2020 (with some substantial improvements under the hood), its free price meant it was the first time a mass audience experienced what OpenAI’s GPT technology can do. It tantalised with its apparent ability to understand difficult questions, if only it could deliver consistently accurate answers. OpenAI’s CEO admits that part is still a work in progress. But the gate is now open, and we can glimpse a future increasingly powered by AI.