This article is part of our series that explores the business of artificial intelligence.
Since GPT-2, there has been a lot of excitement around applications of large language models (LLMs). And in the past few years, we’ve seen LLMs being used for many exciting tasks, such as writing articles, designing websites, creating images, and even writing code.
But as I’ve argued before, there is a wide gap between showing a new technology do something cool and using that same technology to create a successful product with a working business model.
I think Microsoft just released the first real LLM product with the public release of GitHub Copilot last week. It is an application that has a strong product/market fit, has huge added value, is hard to beat, is cost-effective, has very strong distribution channels, and can become a source of great profit.
The release of GitHub Copilot is a reminder of two things. First, LLMs are fascinating, but they are useful when applied to specific tasks rather than treated as a step toward artificial general intelligence. Second, the nature of LLMs gives large technology companies such as Microsoft and Google an unfair advantage in commercializing them. LLMs are not democratic.
Copilot is an AI programming tool that installs as an extension to popular IDEs like Visual Studio and VS Code. It provides suggestions as you write code, kind of like auto-completion but for programming. Its capabilities range from completing a line of code to creating entire blocks of code, such as functions and classes.
Copilot is powered by Codex, a version of OpenAI’s famous GPT-3 model, a large language model that made headlines for its ability to perform a wide range of tasks. However, unlike GPT-3, Codex is fine-tuned for programming tasks only. And it delivers impressive results.
The success of GitHub Copilot and Codex highlights an important fact. When it comes to putting LLMs to real use, specialization beats generalization. When Copilot was first introduced in 2021, CNBC reported: “… when OpenAI was first training [GPT-3], the startup had no intention of teaching it how to help code, [OpenAI CTO Greg] Brockman said. It was meant more as a general purpose language model [emphasis mine] that could, for example, generate articles, fix incorrect grammar and translate from one language to another.”
But while GPT-3 found mild success in a variety of applications, Copilot and Codex proved to be great hits in one particular area. Codex cannot write poetry or articles like GPT-3, but it has proven to be very useful for developers of various experience levels. Codex is also much smaller than GPT-3, meaning it is more memory- and compute-efficient. And given that it is trained for a specific task, as opposed to the open and ambiguous world of human language, it is less prone to the pitfalls that models like GPT-3 often fall into.
It is worth noting, however, that just as GPT-3 knows nothing about human language, Copilot knows nothing about computer code. It is a transformer model trained on millions of code repositories. Given a prompt (e.g., a piece of code or a text description), it tries to predict the next sequence of instructions that makes the most sense.
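To make the idea of "predicting what comes next" concrete, here is a deliberately tiny sketch in Python. It is not how Codex works internally: Codex is a large transformer, while this toy swaps in a simple bigram counter that only looks at the previous token. The corpus and the function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(token_sequences):
    """Count, for every token, which tokens follow it in the training data."""
    counts = defaultdict(Counter)
    for tokens in token_sequences:
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts


def predict_next(model, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = model.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical "code repository": three tokenized lines of Python.
corpus = [
    ["for", "i", "in", "range", "(", "n", ")", ":"],
    ["for", "item", "in", "items", ":"],
    ["for", "i", "in", "range", "(", "10", ")", ":"],
]

model = train_bigram_model(corpus)
print(predict_next(model, "in"))  # range
```

The toy model suggests `range` after `in` only because that pair is the most frequent one in its training data, not because it understands loops. A real model conditions on far more context, but the principle of picking a statistically likely continuation is the same.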
With its huge training corpus and massive neural network, Copilot mostly makes good predictions. But sometimes it makes silly mistakes that even the most novice programmer would avoid. It doesn’t think about programs the way a programmer does. It can’t design software, reason in steps, or think about user requirements, user experience, and all the other things that go into building successful applications. It is not a substitute for human programmers.
Copilot’s product/market fit
One of the milestones for any product is achieving product/market fit, or proving that it can solve a problem better than alternative solutions on the market. In this regard, Copilot has been a resounding success.
GitHub released Copilot as a preview product last June, and it has since been used by more than one million developers.
According to GitHub, in files where Copilot is enabled, it accounts for an impressive 40 percent of the written code. Developers and engineers I spoke with last week say that while there are limitations to Copilot’s capabilities, there’s no arguing that it greatly improves their performance.
For some use cases, Copilot competes with StackOverflow and other code forums, where users must search for the solution to a specific problem they are facing. In this case, the added value of Copilot is obvious and tangible: less frustration and distraction, more focus. Instead of leaving their IDE and searching the web for a solution, developers simply type a description or documentation string (docstring) of the functionality they want, and Copilot does most of the work for them.
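As an illustration of that workflow, the snippet below shows the kind of prompt a developer might write (a function signature plus a docstring) followed by the kind of body an assistant could fill in. This is a hand-written example, not actual Copilot output, and the function name is hypothetical.

```python
def is_palindrome(text: str) -> bool:
    """Return True if text reads the same forwards and backwards,
    ignoring case and any non-alphanumeric characters."""
    # The developer writes only the signature and docstring above.
    # The lines below stand in for a suggested completion (written by hand here).
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(is_palindrome("A man, a plan, a canal: Panama"))  # True
```

The developer still has to read the suggestion and check it against the docstring, which is the same review they would give a snippet copied from a forum answer.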
In other cases, Copilot competes with manually writing tedious code, such as configuring matplotlib charts in Python (a super frustrating task). While Copilot’s output may need some tweaking, it relieves most of the burden on developers.
In many other use cases, Copilot has been able to establish itself as a superior solution to problems that many developers face every day. Developers told me about things like running test cases, setting up web servers, documenting code, and many other tasks that previously required manual effort and were difficult. Copilot helped them save a lot of time in their daily work.
Distribution and profitability
Product/market fit is only one of several components to creating a successful product. If you have a good product but can’t find the right distribution channels to convey its value in a way that’s cost-effective and profitable, then you’re doomed. At the same time, you’ll need a plan to maintain your competitive edge, prevent other companies from replicating your success, and make sure you can continue to deliver value over time.
To make Copilot a successful product, Microsoft had to bring several very important pieces together, including technology, infrastructure, and market.
First, it needed the right technology, which it acquired thanks to its exclusive license to OpenAI’s technology. As of 2019, OpenAI stopped open-sourcing its technology and instead licensed it to its financial backers, chief among them Microsoft. Codex and Copilot were created from GPT-3 with the help of OpenAI’s scientists.
Other major technology companies have succeeded in creating large language models that are comparable to GPT-3. But there’s no denying that LLMs are very expensive to train and run.
“For a model that’s 10 times smaller than Codex, the model behind Copilot (which has 12B parameters according to the paper), it takes hundreds of dollars to run the evaluation on the benchmark they used in their paper,” Loubna Ben Allal, machine learning engineer at Hugging Face, told TechTalks. Ben Allal mentioned another benchmark used to evaluate Codex, which cost thousands of dollars to run on her own smaller model.
“There are also security issues, because to evaluate the model you have to execute untrusted programs that might be malicious. Sandboxes are commonly used for security,” Ben Allal said.
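The evaluation loop described here (run generated code against tests and see whether it passes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the harness used for Codex: the helper name `passes_tests` is hypothetical, and a separate process with a timeout is only a weak stand-in for a real sandbox, since it does not restrict file or network access.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(candidate_source: str, test_source: str, timeout_s: float = 5.0) -> bool:
    """Run generated code plus its tests in a separate Python process.

    Returns True only if the process exits cleanly before the timeout.
    NOTE: this isolates crashes and infinite loops, nothing more. Real
    evaluations of untrusted code should use a proper sandbox or container.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(candidate_source + "\n\n" + test_source + "\n")
        try:
            result = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False
        return result.returncode == 0


# Hypothetical model outputs for the prompt "write add(a, b)".
good_candidate = "def add(a, b):\n    return a + b"
bad_candidate = "def add(a, b):\n    return a - b"
unit_test = "assert add(2, 3) == 5"

print(passes_tests(good_candidate, unit_test))  # True
print(passes_tests(bad_candidate, unit_test))   # False
```

Repeating this for many generated samples across many problems is part of why evaluation gets expensive: every sample means another model call and another isolated run.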
Leandro von Werra, another ML engineer at Hugging Face, estimated the training costs to be between tens and hundreds of thousands of dollars, depending on the model size and the number of experiments needed to get it right.
“Inference is one of the biggest challenges,” von Werra added in comments to TechTalks. “Although almost anyone with the resources can train a 10B model these days, getting inference latency low enough to feel responsive to the user is an engineering challenge.”
This is where Microsoft’s second advantage comes into play. The company has been able to create a large cloud infrastructure that is specialized for machine learning models like Codex. It runs inference and provides suggestions in milliseconds. And more importantly, Microsoft is able to run and provide Copilot at a very affordable price. Copilot is currently available for $10/month or $100/year, and it will be made available for free to students and maintainers of popular open source repositories.
Most developers I spoke to were very happy with the pricing model because it saves them far more in time than it costs.
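The arithmetic behind that judgment is simple enough to sketch. The two prices come from the published plans above; the hourly rate is a hypothetical figure chosen only to illustrate the break-even point, so substitute your own.

```python
MONTHLY_PRICE_USD = 10   # published monthly price
ANNUAL_PRICE_USD = 100   # published annual price


def annual_plan_savings() -> int:
    """Dollars saved per year by paying annually instead of monthly."""
    return 12 * MONTHLY_PRICE_USD - ANNUAL_PRICE_USD


def break_even_minutes_per_month(hourly_rate_usd: float) -> float:
    """Minutes of developer time the tool must save each month to pay for itself."""
    return MONTHLY_PRICE_USD / hourly_rate_usd * 60


# Assumption for illustration only: a developer whose time costs $50/hour.
print(annual_plan_savings())             # 20
print(break_even_minutes_per_month(50))  # 12.0
```

Under that assumed rate, the subscription pays for itself once it saves 12 minutes of work a month.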
Abhishek Thakur, another ML engineer at Hugging Face whom I spoke to earlier this week, said, “As a machine learning engineer, I know a lot goes into building products like these, especially Copilot, which provides suggestions with sub-millisecond latency. To build an infrastructure that serves these kinds of models for free is not possible in the real world over a longer period of time.”
However, launching an affordable code-generation LLM is not impossible.
“Regarding the compute needed to build these models and the data required: it is completely feasible, and there are several Codex replications, such as InCoder from Meta and CodeGen from Salesforce (now available for free on the Hugging Face Hub), matching Codex’s performance,” von Werra said. “There’s definitely a bit of engineering involved in building the models into a quick and nice product, but it seems like a lot of companies could do that if they wanted to.”
However, this is where the third piece of the puzzle comes in. Microsoft’s acquisition of GitHub gave it access to the largest developer market, making it easier for the company to get Copilot into the hands of millions of users. Microsoft also owns Visual Studio and VS Code, two of the most popular IDEs with hundreds of millions of users. This reduces the friction for developers to adopt Copilot as opposed to another similar product.
With its pricing, efficiency and market reach, Microsoft seems to have cemented its position as the leader in the emerging market for AI-assisted software development. The market may take other turns. What is certain (as I have pointed out before) is that large language models will open up many opportunities to create new applications and markets. But they won’t change the fundamentals of sound product management.
This article was originally published by Ben Dickson on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the dark side of technology, the darker implications of new technologies and what to watch out for. You can read the original article here.