On November 30, 2022, a tool called ChatGPT was released on the internet. It created quite a stir, especially among the artificial intelligence (AI) crowd, because it seemed to ‘know’ every topic under the sun; it could answer questions and carry on a conversation. Experts in the AI community have called this an epochal moment, stressing how powerful ChatGPT is. The tool interacts with humans in natural language and is impressive because, aside from answering general queries, it has many other functions. ChatGPT was developed by OpenAI, a research institute and company that focuses on developing artificial intelligence technology in a responsible and safe way. OpenAI was founded in 2015 by a group of entrepreneurs and researchers, including Elon Musk, Sam Altman, and Greg Brockman.
ChatGPT is much more than a chat bot. For example, you can ask it to write a program or even a simple software application. It can also do creative tasks such as writing a story. It can explain scientific concepts and answer a wide range of factual questions. Strictly speaking, ChatGPT is built on what is called a language model. A language model is software that, given some words as input, produces a sequence of words that forms a plausible, semantically related continuation of that input; in practical terms, this lets it perform tasks like answering questions and carrying on a conversation with humans. Language models are often used in natural language processing (NLP) applications such as speech recognition, automatic translation, and text generation.
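To make the idea of a language model concrete, here is a minimal sketch in Python. It is nowhere near how ChatGPT works internally: it only counts which word tends to follow which in a tiny made-up corpus and then predicts the most frequent follower. The corpus and the function names are assumptions chosen purely for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only).
CORPUS = (
    "the cat sat on the mat . "
    "the cat ate the fish . "
    "the dog sat on the rug ."
)

def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    words = text.split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

def generate(model, start, length=5):
    """Extend `start` by repeatedly appending the most likely next word."""
    output = [start]
    for _ in range(length):
        next_word = predict_next(model, output[-1])
        if next_word is None:
            break
        output.append(next_word)
    return " ".join(output)

model = train_bigram_model(CORPUS)
print(predict_next(model, "sat"))        # the only word seen after "sat" is "on"
print(generate(model, "the", length=4))
```

A real language model replaces these simple counts with a neural network that has billions of adjustable numbers, but the task is the same: given the words so far, propose what comes next.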
It is also a neural network. A neural network can be thought of as a large network of simple computing units, loosely modelled on neurons, that adjusts its internal settings (called parameters) based on the feedback given to it during stages of training, so that its output of words steadily improves. The training data is typically a huge corpus of text. In ChatGPT’s case, one of the later training stages uses feedback from human reviewers, a technique called Reinforcement Learning from Human Feedback. All these technologies are part of machine learning, the branch of artificial intelligence that has been witnessing tremendous advancements.
To understand how a language model works, we should also look at “word embedding”, which represents each word as a list of numbers (a vector) that can be manipulated inside computers. When a neural network processes these numbers, it can differentiate words according to context: for example, when “shoot” appears with “gun”, the neural network expects that the words that follow will mostly be ones like “bullets” or “victims”, whereas when “shoot” appears with “camera”, it expects words like “picture” or “pixel”. With a neural network design called the “Transformer”, which lets every word in a passage be weighed against every other word, a neural network can capture the context of a sentence or a paragraph far more accurately. This “comprehension” can be used for multiple purposes like answering a question, summarising a paragraph or an article, translating documents and so on.
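The “shoot” example can be sketched in a few lines of Python. The numbers below are invented by hand for illustration (real embeddings are learned during training and are far longer), and averaging the context words is only a crude stand-in for what a Transformer does; the function names are likewise assumptions.

```python
import math

# Hand-made 3-number "embeddings". Real models learn vectors with hundreds
# or thousands of numbers; these values are invented for illustration.
# Rough meaning of each position: [weapon-related, photo-related, action]
EMBEDDINGS = {
    "shoot":   [0.5, 0.5, 0.9],
    "gun":     [0.9, 0.0, 0.3],
    "camera":  [0.0, 0.9, 0.2],
    "bullets": [0.8, 0.0, 0.4],
    "picture": [0.0, 0.8, 0.1],
}

def cosine_similarity(a, b):
    """Closeness of two vectors by direction: near 1.0 means very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def context_vector(words):
    """Average the vectors of the context words (a crude simplification)."""
    vectors = [EMBEDDINGS[w] for w in words]
    return [sum(values) / len(vectors) for values in zip(*vectors)]

def likeliest_next(context_words, candidates):
    """Pick the candidate whose vector is closest to the context vector."""
    context = context_vector(context_words)
    return max(candidates, key=lambda w: cosine_similarity(context, EMBEDDINGS[w]))

print(likeliest_next(["shoot", "gun"], ["bullets", "picture"]))
print(likeliest_next(["shoot", "camera"], ["bullets", "picture"]))
```

With these invented vectors, the first call picks “bullets” and the second picks “picture”: the same word “shoot” leads to different expectations depending on its neighbours.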
ChatGPT follows a series of language models that OpenAI has released since 2018. That year, OpenAI released the first Generative Pre-Training (GPT) language model. Here, generative means that it is a type of neural network that can create new content, drawing on patterns learned from the content it was trained on, called training data. This makes it suitable for creative tasks like writing a new story.
The original GPT was itself built on the Transformer mentioned above; an improved and much larger model, “Generative Pre-trained Transformer 2” or GPT-2, was released in 2019. GPT-3, with an even larger neural network, was launched in 2020. The GPT-3.5 series of models finished training in early 2022, and ChatGPT is fine-tuned from a model in that series. Each successive generation is more advanced than its predecessor. For example, GPT-3 has 175 billion parameters. These large language models have been trained on a large share of the text publicly available on the internet, along with many other text documents, which makes them highly informed.
ChatGPT is fine-tuned to provide conversational responses, as opposed to essay-type content, because the neural network behind it has been additionally trained on conversational transcripts and refined using human feedback on its responses.
There are a few other language models that are popular in the AI community, like BERT (Bidirectional Encoder Representations from Transformers) from Google. However, ChatGPT seems to be the most powerful for conversational purposes.
The quality of the output of ChatGPT or any language model can be measured using standard techniques. One such technique is the “Recall-Oriented Understudy for Gisting Evaluation”, or ROUGE, metric, which compares the model’s output against a reference text written by humans and reports how much of the reference is covered by the output. For language models like GPT that are also used in translation, another metric called BLEU (Bilingual Evaluation Understudy) is employed; it measures how much of the translated output overlaps with one or more reference translations.
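The overlap idea behind both metrics can be shown with a simplified sketch in Python. It handles single words only: the first function corresponds to the simplest ROUGE variant (ROUGE-1 recall), and the second to the single-word precision that BLEU starts from. Full BLEU also looks at longer word sequences and penalises overly short output, which this sketch leaves out. The function names and the example sentences are assumptions for illustration.

```python
from collections import Counter

def overlap_count(candidate_tokens, reference_tokens):
    """Count shared words, never crediting a word more often than it
    appears in the other text (so repeating a word does not help)."""
    shared = Counter(candidate_tokens) & Counter(reference_tokens)
    return sum(shared.values())

def rouge1_recall(candidate, reference):
    """Fraction of the reference's words that also appear in the candidate."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return overlap_count(cand, ref) / len(ref) if ref else 0.0

def unigram_precision(candidate, reference):
    """Fraction of the candidate's words that also appear in the reference
    (the single-word building block of BLEU, without its other parts)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return overlap_count(cand, ref) / len(cand) if cand else 0.0

reference = "the cat sat on the mat"
candidate = "the cat sat on a mat today"

print(f"ROUGE-1 recall:    {rouge1_recall(candidate, reference):.2f}")
print(f"Unigram precision: {unigram_precision(candidate, reference):.2f}")
```

Here five words are shared, so the recall is 5 out of the reference’s 6 words and the precision is 5 out of the candidate’s 7 words. The two numbers differ because one asks how much of the expected text was covered, while the other asks how much of the produced text was justified.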
In addition to the conversational nature of the tool, its creative generating capability is very appealing. ChatGPT can become a powerful pedagogical tool on any topic for anyone, because we can instruct it to “explain it to me like I am a six year old”. It can explain in simple terms anything from philosophy to cooking recipes, including new recipes of its own. If you are in the mood for some fun, you can ask ChatGPT to narrate a new story to you!
Is ChatGPT the most powerful NLP tool? For conversational purposes, the answer appears to be yes. However, it may not be equally powerful in specialised contexts. For example, if a doctor needs an automatic conversational assistant for medical queries, the neural networks behind ChatGPT would need to have been trained on specialised medical data. Given how broad its abilities already are, extending even to writing programs, it should eventually be possible to make it knowledgeable on any specialised topic. For general purposes, ChatGPT can be considered the most powerful for now. (The tool can be accessed by anyone at https://chat.openai.com/chat)
S. Varahasimhan is a senior employee at a software product MNC in Chennai.