In the classic 2014 blog post “Always Bet On Text”, Graydon Hoare persuasively articulates the many advantages of text over other data modalities: it is the most flexible communication technology; it is the most durable; it is the cheapest and most efficient; it is the most useful and versatile socially; it can convey ideas with a precisely controlled level of ambiguity and precision; it can be indexed, searched, corrected, summarized, filtered, quoted, translated. In Hoare’s words, “It is not a coincidence that all of literature and poetry, history and philosophy, mathematics, logic, programming and engineering rely on textual encodings for their ideas.”
Every industry, every company, every business transaction in the world relies on language. Without language, society and the economy would grind to a halt.
The ability to automate language thus offers entirely unprecedented opportunities for value creation. Compared to text-to-image AI, whose impacts will be felt most keenly in select industries, AI-generated language will transform the way that every company in every sector in the world works.
To illustrate the depth and breadth of the coming transformation, let’s walk through some example applications.
From Sales to Science
The first true “killer application” for generative text, in terms of commercial adoption, has proven to be copywriting: that is, AI-generated website copy, social media posts, blog posts and other marketing-related written content.
AI-powered copywriting has seen stunning revenue growth over the past year. Jasper, one of the leading startups in this category, launched a mere 18 months ago and will reportedly do $75 million in revenue this year, making it one of the fastest-growing software startups ever. Jasper just announced a $125 million fundraise valuing the company at $1.5 billion. Unsurprisingly, a raft of competitors has emerged to chase this market.
But copywriting is just the beginning.
Many pieces of the broader marketing and sales stack are ripe to be automated with large language models (LLMs). Expect to see generative AI products that will, for instance: automate outbound emails from sales development representatives (SDRs); accurately answer questions from interested buyers about the product; handle email correspondence with prospective customers as they move through the sales funnel; provide real-time coaching and feedback to human sales agents on calls; summarize sales discussions and suggest next steps; and more. As more of the sales process is automated, human representatives will be freed up to focus on the uniquely human aspects of selling, like customer empathy and relationship building.
In the world of law, generative AI will largely automate contract drafting. Much of the back-and-forth between legal teams on deal documents will come to be carried out by LLM-powered software tools that understand each client’s particular priorities and preferences and automatically hash out the language in transaction documents accordingly. Post-signing, generative AI tools will greatly simplify contract management for companies of all sizes.
Language models’ powerful ability to summarize and answer questions about text documents will likewise transform legal research, discovery and various other parts of the litigation process.
In healthcare, generative language models will help clinicians compose medical notes. They will summarize electronic health records and answer questions about a patient’s medical history. They will help automate time-intensive administrative processes like revenue cycle management, insurance claims processing and prior authorizations. Before long, they will be able to propose diagnoses and treatment regimens for individual patients by combining an in-depth understanding of the existing research literature with a given patient’s particular biomarkers and symptoms.
Generative AI will transform the world of customer service and call centers, across industries: from hospitality to ecommerce, from healthcare to financial services. The same goes for internal IT and HR helpdesks.
Language models can already automate much of the work that happens before, during and after customer service conversations, including in-call agent coaching and after-call documentation and summarization. Soon, paired with generative text-to-speech technology, they will be able to handle most customer service engagements end-to-end, with no human needed—not in the stilted, brittle, rules-based way that automated call centers have worked for years, but in fluid natural language that is effectively indistinguishable from a human agent.
To put it simply: nearly all of the interactions that you as a consumer will need to have with a brand or company, on any topic, can and will be automated.
The way that we handle structured data—a foundational business activity at the heart of most organizations—will be transformed by generative language models. Recent research out of Stanford shows that language models are remarkably effective at completing various data cleaning and integration tasks—e.g., entity matching, error detection, data imputation—even though they weren’t trained for these activities. A fun demo recently posted on Twitter hints at the ways that generative AI will transform how we work with programs like Microsoft Excel.
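To make this concrete, here is a minimal sketch of how a language model might be prompted to perform entity matching between two product records. The `complete` function is a hypothetical stand-in for whatever text-completion call is being used, and the prompt format is illustrative rather than the one used in the Stanford research.

```python
# A minimal sketch of LLM-based entity matching, in the spirit of the research
# described above. `complete` is a placeholder for any text-in, text-out LLM
# call (hypothetical); the prompt format is illustrative only.

from typing import Callable

def entities_match(record_a: str, record_b: str, complete: Callable[[str], str]) -> bool:
    prompt = (
        "Do the following two product records refer to the same real-world entity?\n"
        f"Record A: {record_a}\n"
        f"Record B: {record_b}\n"
        "Answer Yes or No."
    )
    return complete(prompt).strip().lower().startswith("yes")

# Example usage (the model sees only text; no task-specific training is required):
# entities_match("Apple MacBook Pro 13in 2020",
#                "MacBook Pro 13-inch (2020), Apple",
#                complete=my_llm_call)
```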
News reporting and journalism will become highly automated. While human investigative journalists will continue to chase down stories, the production of the articles themselves will increasingly be handed over to generative AI models. Before long, the majority of the online content that we consume in our daily lives will be AI-generated.
In government, lawmakers will rely on LLMs to help draft legislation. Regulators will employ them to help translate laws into detailed regulations and codes. Bureaucrats from the federal to the municipal level will use them to help streamline the many functions of the administrative state, from processing permitting applications to handing out petty fines.
In academia, generative language models will be used to draft grant proposals, to synthesize and interrogate the existing body of literature, and—yes—to write research papers (both by students and professors). A scandal involving students using generative language tools to write their school essays for them is no doubt just around the corner.
The process of scientific discovery itself will be accelerated by generative language models. LLMs will be able to digest the entire corpus of published research and knowledge in a given field, assimilate key underlying concepts and relationships, and propose solutions and promising future research directions.
This is not a speculative future possibility; it has already been done. A group of researchers from UC Berkeley and Lawrence Berkeley National Laboratory recently showed that large language models can capture latent knowledge from the existing literature on materials science and then propose new materials to investigate.
It is worth quoting directly from their paper, which was published in Nature: “Here we show that materials science knowledge present in the published literature can be efficiently encoded as information-dense word embeddings without human supervision. Without any explicit insertion of chemical knowledge, these embeddings capture complex materials science concepts such as the underlying structure of the periodic table and structure-property relationships in materials. Furthermore, we demonstrate that an unsupervised method can recommend materials for functional applications several years before their discovery.”
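As a rough analogy to the technique (not the paper’s actual pipeline), one could train standard word2vec-style embeddings on tokenized abstracts and then query the vector space for materials that sit near a functional property of interest. The sketch below assumes gensim 4.x, and the corpus and query terms are placeholders.

```python
# Rough analogy to the approach described in the paper: train unsupervised word
# embeddings on tokenized abstracts, then ask which terms lie closest to a
# property of interest in the embedding space. Assumes gensim 4.x; the tiny
# corpus and query terms here are placeholders, not the paper's data.

from gensim.models import Word2Vec

tokenized_abstracts = [
    ["the", "thermoelectric", "performance", "of", "Bi2Te3", "was", "measured"],
    ["CuGaTe2", "exhibits", "promising", "thermoelectric", "properties"],
    # ... in practice, millions of tokenized sentences from the literature
]

model = Word2Vec(sentences=tokenized_abstracts, vector_size=200, window=8,
                 min_count=1, sg=1)

# Materials whose embeddings sit closest to "thermoelectric" become candidates
# for investigation, even if that pairing never appears explicitly in the text.
print(model.wv.most_similar("thermoelectric", topn=5))
```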
Beyond Natural Language
One of the most promising commercial applications of generative language models does not involve natural language at all: LLMs promise to revolutionize the creation of software.
Whether it’s Python, Ruby or Java, software programming happens via languages. As with natural languages like English or Swahili, programming languages are symbolically represented, with their own internally consistent syntax and semantics. It therefore makes sense that the same powerful new AI methods that can gain incredible fluency with natural language can likewise learn programming languages.
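To see why, consider that to a code-generation model a program is simply more text to continue. In the sketch below, the prompt is an ordinary function signature and docstring; the completion is hand-written here to suggest the kind of output such a model typically produces, not an actual model response.

```python
# Illustration of code generation as text continuation: the "prompt" is just
# source text, and a code model extends it token by token. The completion below
# is hand-written for illustration, not captured from a real model.

prompt = '''def is_palindrome(s: str) -> bool:
    """Return True if s reads the same forwards and backwards, ignoring case."""
'''

illustrative_completion = '''    normalized = "".join(c.lower() for c in s if c.isalnum())
    return normalized == normalized[::-1]
'''

print(prompt + illustrative_completion)
```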
Today’s world runs on software. The global software market is estimated at half a trillion dollars, and software has become the lifeblood of the modern economy. The ability to automate its production therefore represents a staggeringly large opportunity.
The first mover and 800-pound gorilla in this category is Microsoft. Together with its subsidiary GitHub and its close partner OpenAI, Microsoft launched an AI coding companion product named Copilot earlier this year. Copilot is powered by Codex, a large language model from OpenAI (which in turn is based on GPT-3).
Soon thereafter, Amazon launched its own AI pair programming tool named CodeWhisperer. Google has developed a comparable tool of its own, though it uses the tool only internally and does not offer it publicly.
These products are only a few months old, but it is already becoming evident how transformative they will be.
In a recent study, Google found that employees who used its AI code completion tool saw a 6% reduction in coding time compared to those not using the tool, with 3% of those employees’ code being written by the AI.
Recent data from GitHub is even more remarkable: the company found in a recent experiment that using Copilot can reduce the time required for a software engineer to complete a coding task by 55%. According to GitHub’s CEO, up to 40% of the code written by developers who use Copilot is now being produced by the AI.
Now imagine scaling these productivity gains across all of Google, all of Microsoft—all of today’s software industry. Untold billions of dollars of value creation are up for grabs.
Is Microsoft’s Copilot destined to own this market? Not necessarily.
For one thing, many organizations will not feel comfortable exposing their full internal codebases to a big tech player like Microsoft in the cloud, and will prefer to work with a neutral startup that deploys its solution on-premise. This will be particularly true in highly regulated industries like financial services and healthcare.
In addition, Copilot faces an interesting organizational challenge: the product is jointly built and maintained by Microsoft, GitHub and OpenAI. These are three different organizations with different teams, cultures and cadences. This space is moving at breakneck speed right now; rapid product iterations and short development cycles will be essential as the technology and market evolve. The Microsoft/GitHub/OpenAI triad may struggle with coordination and agility as they seek to compete with more nimble startups in this category.
Most importantly, software development is an enormous, sprawling field. The market for AI-generated software will not be winner-take-all. Just as there is a deep, diverse ecosystem of tools for different parts of today’s software engineering stack, a number of different winners will emerge in the world of AI code generation.
For instance, successful startups might be built that focus solely on automating code maintenance, or code review, or documentation, or front-end development. A wave of promising new startups has already emerged to pursue these opportunities.
Zooming Out
Having walked through a wide range of possible commercial applications for generative language models, let’s close with three big-picture points.
First, some readers, especially those who have not spent much time working first-hand with today’s language models, may be asking themselves: are the use cases described here actually plausible? Will generative language models really be able to effectively and reliably draw up a contract, or email back and forth with a sales prospect, or draft a piece of legislation—not just in a highly controlled demo or research setting, but when faced with all of the messiness of the real world?
The answer is yes.
We have delved into the technology breakthroughs underpinning today’s language AI revolution in detail in previous articles. But one important factor is worth mentioning here: the vast majority of content that humans produce—messages we write, ideas we articulate, proposals we put forth—is unoriginal.
This may sound harsh. But the fact is that most website copy, most email exchanges, most customer service conversations, even most laws contain little true novelty. The exact words vary, but the underlying structure, semantics and concepts are predictable and consistent, echoing language that has been written or spoken a million times before.
Today’s AI has become powerful enough to learn these underlying structures, semantics and concepts from the vast corpora of existing text on which it has been trained—and to convincingly replicate them with new output when prompted.
Our current state-of-the-art language models could not produce writing with the disruptive originality of, say, Friedrich Nietzsche, whose unprecedented ideas reframed centuries of prior thought. But how much of the content that humans generate on a day-to-day basis—in any of the use cases described above, or in any other setting—falls into that category?
We will find that LLMs are effective at automating a surprisingly large amount of humanity’s language production—those parts that are essentially unoriginal.
The second big-picture point: one important reason why generative language models will become so powerful is that any output from a language model can in turn serve as the input to a language model. This is because language models’ input and output modalities are the same: text in, text out. This is a key difference between language models and text-to-image models. This may sound like an arcane detail, but it has profound implications for generative AI.
Why does this matter? Because it enables what has come to be known as “prompt chaining.”
Even though large language models are incredibly capable, many tasks that we will want them to complete are too complex to be carried out by a single run of the model—i.e., tasks that require intermediate actions or multi-step reasoning. Prompt chaining enables users to break one broad goal into various simpler subtasks that the language model can tackle in succession, with the output of one subtask serving as the input of the next.
Clever prompt chaining enables LLMs to carry out far more sophisticated activities than would otherwise be possible. Prompt chaining also enables models to retrieve information from external tools (e.g., searching Google, pulling information from a given URL), by incorporating this action as one of the steps in the chain.
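Here is a minimal sketch of what a chain looks like in practice: one broad request is decomposed into three smaller model calls, with each step’s output folded into the next step’s prompt. The `complete` function is a hypothetical stand-in for any text-in, text-out LLM call, and the particular decomposition is illustrative.

```python
# A minimal prompt-chaining sketch: one broad goal is split into subtasks, and
# the output of each model call becomes part of the next call's prompt.
# `complete` is a stand-in for any text-in, text-out LLM call (hypothetical).

from typing import Callable

def answer_about_document(question: str, document: str,
                          complete: Callable[[str], str]) -> str:
    # Step 1: pull out the passages of the document relevant to the question.
    relevant = complete(
        f"Document:\n{document}\n\nQuote the passages relevant to: {question}"
    )
    # Step 2: condense those passages.
    summary = complete(f"Summarize the following passages:\n{relevant}")
    # Step 3: compose the final answer, grounded in the summary.
    return complete(
        f"Using this summary:\n{summary}\n\nAnswer the question: {question}"
    )
```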
An illustrative example of prompt chaining comes from Dust, a new startup building tools to help people work with generative language models. Dust built a web search assistant that can answer a user’s question (e.g., “Why was the Suez Canal blocked in March 2021?”) by searching Google, taking the top 3 results, pulling the content from those sites, summarizing it, and then synthesizing a final answer that includes citations.
Another fun prompt chaining example: an app that, when provided with the URL of a research paper, automatically generates a Twitter thread summarizing the paper’s main points.
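Under the hood, an app like that is itself just a short chain with one external tool step. A hedged sketch, assuming the URL returns the paper’s text as plain text and reusing the hypothetical `complete` call from the sketch above:

```python
# Sketch of the paper-to-thread app as a prompt chain with one external tool
# step. Assumes the URL returns the paper's text (or abstract) as plain text,
# and reuses the hypothetical `complete` LLM call from the earlier sketch.

import requests
from typing import Callable

def paper_to_thread(url: str, complete: Callable[[str], str]) -> str:
    paper_text = requests.get(url, timeout=30).text          # external tool step
    key_points = complete(
        f"List the main findings of this paper as short bullet points:\n{paper_text}"
    )
    return complete(
        "Turn these bullet points into a numbered Twitter thread, "
        f"one tweet per point, under 280 characters each:\n{key_points}"
    )
```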
Prompt chaining will make the creation of LLM-powered applications more composable, extensible and interpretable. It will enable the creation of complex software programs with generalized capabilities. There is no equivalent to this recursive richness in text-to-image AI.
This brings us to our third and final point: one of the most important considerations in productizing and operationalizing LLMs will be how and when to have a human in the loop.
At least initially, most generative language applications will not be deployed in a fully automated way. Some level of human oversight of their outputs will continue to be prudent or necessary. What exactly this looks like will vary considerably depending on the application.
In the near term, the most natural mode of engagement for human users of LLM applications will be iterative and collaborative: that is, the end user will be the human in the loop. The human user will, say, present the model with an initial prompt (or prompt chain) to generate a given output; review the output and then tweak the prompt to improve the quality of the output; run the model many times on the same prompt in order to select the most relevant versions of the model’s output; and then manually refine this output before deploying the language for its intended use.
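Sketched as code, that workflow amounts to a sampling loop with a person at the end. A minimal sketch, again treating `complete` as a hypothetical stand-in for an LLM call that samples with nonzero temperature:

```python
# Minimal sketch of the iterative, human-in-the-loop workflow described above:
# sample several drafts from the same prompt, let the user pick one, and hand
# it back for manual refinement. `complete` is a hypothetical LLM call that
# samples with nonzero temperature, so repeated calls yield different drafts.

from typing import Callable, List

def draft_candidates(prompt: str, complete: Callable[[str], str],
                     n: int = 5) -> List[str]:
    return [complete(prompt) for _ in range(n)]

def pick_draft(candidates: List[str]) -> str:
    for i, draft in enumerate(candidates):
        print(f"--- Draft {i} ---\n{draft}\n")
    choice = int(input("Which draft would you like to refine? "))
    return candidates[choice]   # the user then edits this text by hand
```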
This type of workflow will be effective for many of the example applications discussed above: drafting contracts, writing news articles, composing academic grant proposals. If the AI system can produce a draft that is 50%, or 75%, or 90% complete out of the box, that translates to massive time savings and value creation.
For some lower-stakes use cases—say, writing outbound sales emails or website copy—the technology will soon be advanced and robust enough that users motivated by the potential productivity gains will feel comfortable automating the application end-to-end, with no human in the loop at all.
At the other end of the spectrum, some safety-critical use cases—say, using generative models to diagnose and propose treatments for individual patients—will for the foreseeable future require a human in the loop to review and approve the models’ output before any real action is taken.
But make no mistake: generative language technology is improving fast—almost unbelievably fast. Within months, expect industry leaders like OpenAI and Cohere to release new models that represent dramatic, step-change improvements in language capabilities compared to today’s models (which themselves are already breathtakingly powerful).
Over the longer term, the trend will be decisive and inevitable: as these models get better, and as the products built on top of them become easier to use and more deeply embedded in existing workflows, we will hand over more responsibility for more of society’s day-to-day functions to AI, with little or no human oversight. More and more of the use cases described above will be carried out end-to-end, in a closed-loop manner, by language models that we have empowered to decide and act.
This may sound startling, even terrifying, to readers today. But we will increasingly acclimate to the reality that machines can carry out many of these functions more effectively, more quickly, more affordably and more reliably than humans could.
Massive disruption, vast value creation, painful job dislocation and many new multi-billion-dollar AI-first companies are around the corner.