Large language models, commonly known as LLMs, have recently become a cultural phenomenon, with models such as DALL-E, ChatGPT, and Copilot capturing the public's imagination and making AI a household term (one should also not forget Meta AI's infamous Galactica). LLMs are not a new concept; they have been around for quite some time. Google Search and Gmail, for example, which we interact with on a daily basis, are powered by Google's language models, including BERT (Bidirectional Encoder Representations from Transformers). Similarly, Apple deploys its Transformer models (the "T" in GPT) on the Apple Neural Engine to enable a variety of experiences, such as panoptic segmentation in the camera, on-device scene analysis in Photos, image captioning for accessibility, and machine translation, among many others.
In reality, LLMs becoming a household name can largely be attributed to OpenAI, which made its models available for the public to try and test, raising the 'cool quotient' of these models while also improving them. In contrast, major technology companies such as Apple, Google, and Meta have quietly incorporated their language models into their own products and software. While that strategy served them well, OpenAI's case can be viewed as a classic example of building in public. Many products have been built on OpenAI's public API (for example, GPT-3 powering Jasper and Notion, or GPT-3.5 behind various WhatsApp chatbot integrations), and in some cases OpenAI's models are tightly incorporated into a software offering (DALL-E for Shutterstock, Codex for GitHub Copilot).
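To make that integration pattern concrete, here is a minimal sketch of how a product might have called OpenAI's GPT-3 completion API at the time; the model name, prompt, and key handling are illustrative assumptions, not details taken from any of the products named above.

```python
# Minimal sketch: calling OpenAI's text-completion API (pre-1.0 `openai` package).
# The model name and prompt are illustrative; supply your own API key.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; never hard-code real keys

response = openai.Completion.create(
    model="text-davinci-003",    # a GPT-3-family model
    prompt="Rewrite the following note as a friendly email:\n...",
    max_tokens=150,
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```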
Is Scaling Sufficient?
A simple example: a five-year-old child, even if they processed roughly ten images per second (one every ~100 milliseconds), would have consumed in their entire lifetime only about as much data as Google, Instagram, and YouTube produce in a matter of hours. Yet the child can reason far better than any AI to date, on well under a thousandth of the data LLMs require. While 'text-to-anything' applications have certainly given language models short-lived fame, their future looks bleak because theirs is a data-intensive approach, and at the rate they are being deployed, we may be approaching a point where our very source of data ends up being AI-produced; for instance, ChatGPT-generated output may soon populate much of the internet.
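A rough back-of-envelope calculation makes the comparison concrete (the numbers below are illustrative assumptions, not measurements):

```python
# Back-of-envelope: how many "frames" might a five-year-old have processed?
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # ~31.5 million
IMAGES_PER_SECOND = 10                  # one image every ~100 ms
WAKING_FRACTION = 2 / 3                 # assume ~16 waking hours per day

lifetime_images = IMAGES_PER_SECOND * SECONDS_PER_YEAR * 5 * WAKING_FRACTION
print(f"~{lifetime_images:.1e} images over five years")  # on the order of 10^9

# Large platforms ingest data on that scale within hours, yet the child
# reasons far better than models trained on orders of magnitude more data.
```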
As a result, some have suggested abandoning the term "Artificial Intelligence" in favor of something more apt, such as "cultural synthesizer," for systems such as GPT-3 and DALL-E, which lack reasoning and higher-level theory-building.
Alternative Approaches
In recent years, several approaches to addressing the “cognition problem” have emerged.
According to Dave Ferrucci, founder of Elemental Cognition, the way forward is a "hybrid" approach that uses language models to generate hypotheses as an "output" and then performs reasoning on top using "causal models". A human-in-the-loop is used to develop such an approach.
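The pattern Ferrucci describes can be sketched roughly as follows; every name here is a hypothetical placeholder for whatever language model, causal model, and review process a real system would plug in.

```python
# Illustrative sketch of a "hybrid" pipeline: the language model only proposes
# hypotheses; an explicit causal model and a human reviewer do the vetting.
def generate_hypotheses(observation, llm):
    """Use the language model purely as a hypothesis generator."""
    return llm(f"List plausible explanations for: {observation}")

def consistent_with(hypothesis, causal_model):
    """Check the hypothesis against an explicit causal model, not statistics."""
    return causal_model.entails(hypothesis)  # hypothetical interface

def human_accepts(hypothesis):
    """Human-in-the-loop: an expert confirms or rejects surviving candidates."""
    return input(f"Accept '{hypothesis}'? [y/n] ").strip().lower() == "y"

def explain(observation, llm, causal_model):
    candidates = generate_hypotheses(observation, llm)
    plausible = [h for h in candidates if consistent_with(h, causal_model)]
    return [h for h in plausible if human_accepts(h)]
```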
In addition, according to Ben Goertzel (founder of the SingularityNET Foundation and the AGI Society), current deep learning programs will not lead to AGI, but systems that "leap beyond their training" towards "open-ended growth" are now quite viable. In his view, Jürgen Schmidhuber's concept of meta-learning, which Schmidhuber describes as a general system that can "learn all of these things and, depending on the circumstances, the environment, and the goal programming, it will invent deep learning that is properly suited for this type of problem, for this class of problems, and so on," is the way forward.
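In its very simplest form, the idea of "learning which learner to use" can be illustrated by an outer loop that tries several inner learning procedures and keeps the one that generalizes best; this toy sketch (using scikit-learn) only conveys the flavor of the idea and is nowhere near Schmidhuber's full formulation.

```python
# Toy sketch: an outer "meta" loop selects the inner learner best suited to a task.
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

def meta_select(X, y):
    """Return the candidate learner with the best cross-validated score on this task."""
    candidates = [
        LogisticRegression(max_iter=1000),
        DecisionTreeClassifier(),
        KNeighborsClassifier(),
    ]
    scored = [(cross_val_score(m, X, y, cv=5).mean(), m) for m in candidates]
    return max(scored, key=lambda s: s[0])[1]
```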
In short, the frameworks underlying the more well-known models lack cognitive abilities. Other approaches, such as OpenCog Hyperon, NARS, and SOFAI, are working in these areas, although in a less glamorous and attention-grabbing manner than models like ChatGPT or GPT-3.
Author: Toshank Bhardwaj, AI Content Creator