There’s been a great deal of hype and excitement in the artificial intelligence (AI) world around a newly developed technology known as GPT-3. Put simply, it’s an AI that is better at creating content that has a language structure – human or machine language – than anything that has come before it.
GPT-3 was created by OpenAI, a research business co-founded by Elon Musk, and has been described as the most important and useful advance in AI for years.
But there’s some confusion over exactly what it does (and indeed doesn’t do), so here I will try to break it down into simple terms for any non-techy readers interested in understanding the fundamental principles behind it. I’ll also cover some of the problems it raises, as well as why some people think its significance has been somewhat overinflated by hype.
What is GPT-3?
Starting with the very basics, GPT-3 stands for Generative Pre-trained Transformer 3 – it’s the third version of the tool to be released.
In short, this means that it generates text using algorithms that are pre-trained – they’ve already been fed all of the data they need to carry out their task. Specifically, they’ve been fed around 570 GB of text information gathered by crawling the internet (a publicly available dataset known as CommonCrawl) along with other texts selected by OpenAI, including the text of Wikipedia.
If you ask it a question, you would expect the most useful response to be an answer. If you ask it to carry out a task such as creating a summary or writing a poem, you will get a summary or a poem.
More technically, it has also been described as the largest artificial neural network ever created – I will cover that further down.
What can GPT-3 do?
GPT-3 can create anything that has a language structure – which means it can answer questions, write essays, summarize long texts, translate languages, take memos, and even create computer code.
In fact, in one demo available online, it is shown creating an app that looks and functions similarly to the Instagram application, using a plugin for the software tool Figma, which is widely used for app design.
This is, of course, pretty revolutionary, and if it proves to be usable and useful in the long term, it could have huge implications for the way software and apps are developed in the future.
As the code itself isn’t available to the public yet (more on that later), access is only available to selected developers through an API maintained by OpenAI. Since the API was made available in June this year, examples have emerged of poetry, prose, news reports, and creative fiction.
This article is particularly interesting: in it, you can see GPT-3 making a quite persuasive attempt at convincing us humans that it doesn’t mean any harm, although its robotic honesty means it is forced to admit that “I know that I will not be able to avoid destroying humankind,” if evil people make it do so!
How does GPT-3 work?
In terms of where it fits within the general categories of AI applications, GPT-3 is a language prediction model. This means that it is an algorithmic structure designed to take one piece of language (an input) and transform it into what it predicts is the most useful following piece of language for the user.
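To make the idea of "language prediction" a little more concrete, here is a deliberately tiny sketch in Python. It is not how GPT-3 works internally; it simply counts which word most often follows another in a small made-up corpus and uses that count to predict the next word. The corpus and the function names (build_bigram_counts, predict_next) are my own illustrative assumptions.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only);
# GPT-3's real training text is vastly larger.
CORPUS = (
    "the house has a red door . "
    "the car has a red roof . "
    "the house has a red door ."
)

def build_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

counts = build_bigram_counts(CORPUS)
print(predict_next(counts, "red"))  # "door" follows "red" twice, "roof" once
```

The principle is the same as in the description above: an input goes in, and the most likely following piece of language comes out. The difference is one of scale and sophistication, since this toy only ever looks at a single preceding word.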
It can do this thanks to the training analysis it has carried out on the vast body of text used to “pre-train” it. Unlike other algorithms that, in their raw state, have not been trained, GPT-3 arrives ready to use: OpenAI has already expended the huge amount of compute resources necessary for it to understand how languages work and are structured. The compute time necessary to achieve this is said to have cost OpenAI $4.6 million.
To learn how to build language constructs, such as sentences, it employs semantic analytics – studying not just the words and their meanings, but also gathering an understanding of how the usage of words differs depending on other words also used in the text.
It’s also a form of machine learning termed unsupervised learning because the training data does not include any information on what is a “right” or “wrong” response, as is the case with supervised learning. All of the information it needs to calculate the probability that its output will be what the user needs is gathered from the training texts themselves.
This is done by studying the usage of words and sentences, then taking them apart and attempting to rebuild them itself.
For example, during training, the algorithms may encounter the phrase “the house has a red door.” It is then given the phrase again, but with a word missing – such as “the house has a red X.”
It then scans all of the text in its training data – hundreds of billions of words, arranged into meaningful language – and determines what word it should use to recreate the original phrase.
To start with, it will probably get it wrong – potentially millions of times. But eventually, it will come up with the right word. By checking its original input data, it will know it has the correct output, and “weight” is assigned to the algorithm process that provided the correct answer. This means that it gradually “learns” what methods are most likely to come up with the correct response in the future.
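The guess, check, and reinforce loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not GPT-3's actual training procedure: real neural networks adjust their weights with gradient-based optimization rather than the simple "add one when right" rule used here. The candidate words, the number of rounds, and the function name train_fill_in are all hypothetical.

```python
import random

def train_fill_in(answer, candidates, rounds=200, seed=0):
    """Repeatedly guess the missing word and reinforce correct guesses.

    Each candidate starts with the same weight. A guess is drawn at random
    in proportion to the current weights; when it matches the original
    phrase, that candidate's weight is increased.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    weights = {word: 1.0 for word in candidates}
    for _ in range(rounds):
        guess = rng.choices(
            candidates, weights=[weights[w] for w in candidates]
        )[0]
        if guess == answer:  # check the guess against the original text
            weights[guess] += 1.0
    return weights

# "the house has a red X": which word should replace X?
weights = train_fill_in("door", ["roof", "car", "door", "window"])
print(max(weights, key=weights.get))
```

Early on, every candidate is equally likely, so most guesses are wrong. Each correct guess makes "door" more likely to be chosen next time, which is the sense in which the process gradually "learns".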
The scale of this dynamic “weighting” process is what makes GPT-3 the largest artificial neural network ever created. It has been pointed out that in some ways, what it does is nothing that new, as transformer models of language prediction have been around for many years. However, the number of weights the algorithm dynamically holds in its memory and uses to process each query is 175 billion – ten times more than its closest rival, produced by Nvidia.
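To get a feel for what 175 billion weights means in practice, a quick back-of-the-envelope calculation helps. The weight count comes from the text above; the storage sizes per weight (2 or 4 bytes) are my own assumptions, as the precision OpenAI actually uses is not stated here.

```python
PARAMETERS = 175_000_000_000  # number of weights quoted in the article

def weights_size_gb(n_params, bytes_per_weight):
    """Approximate storage for the weights alone, in gigabytes (10**9 bytes)."""
    return n_params * bytes_per_weight / 1e9

# Assumed storage formats: 2 bytes (16-bit) or 4 bytes (32-bit) per weight.
for label, nbytes in [("16-bit", 2), ("32-bit", 4)]:
    print(f"{label}: {weights_size_gb(PARAMETERS, nbytes):,.0f} GB")
```

Under those assumptions, the weights alone would occupy hundreds of gigabytes, far more than a typical laptop or single consumer graphics card can hold in memory.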
What are some of the problems with GPT-3?
GPT-3’s ability to produce language has been hailed as the best that has yet been seen in AI; however, there are some important considerations.
The CEO of OpenAI himself, Sam Altman, has said, “The GPT-3 Hype is too much. AI is going to change the world, but GPT-3 is just an early glimpse.”
Firstly, it is a hugely expensive tool to use right now because of the vast amount of compute power needed to carry out its function. This means the cost of using it would be beyond the budget of smaller organizations.
Secondly, it is a closed or black-box system. OpenAI has not revealed the full details of how its algorithms work, so anyone relying on it to answer questions or create products useful to them would not, as things stand, be entirely sure how they had been created.
Thirdly, the output of the system is still not perfect. While it can handle tasks such as creating short texts or basic applications, its output becomes less useful (in fact, described as “gibberish”) when it is asked to produce something longer or more complex.
These are clearly issues that we can expect to be addressed over time – as compute power continues to drop in price, standardization around openness of AI platforms is established, and algorithms are fine-tuned with increasing volumes of data.
All in all, it’s a fair conclusion that GPT-3 produces results that are leaps and bounds ahead of what we have seen previously. Anyone who has seen the output of AI language tools knows the results can be variable, and GPT-3’s output undeniably seems like a step forward. When we see it properly in the hands of the public and available to everyone, its performance should become even more impressive.