Every business is scrambling to keep up with the rapid advances in artificial intelligence. From ChatGPT to image generation, it isn’t obvious where to begin, and it can be a truly daunting task. In this series of posts, I’ll walk you through creating a helpful assistant that’s trained on your company’s information.
Definitions
Let’s list some common terms to level-set your understanding.
Machine Learning: A field of artificial intelligence that enables computers to learn and make predictions from data without explicit programming. It involves training algorithms on datasets to recognize patterns and make decisions or predictions based on that learned knowledge.
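To make that concrete, here’s a minimal sketch using scikit-learn with made-up data. Instead of hand-coding a pass/fail rule, the model learns the pattern from examples:

```python
from sklearn.linear_model import LogisticRegression

# Made-up training data: hours studied vs. whether the student passed.
hours_studied = [[1], [2], [3], [4], [5], [6], [7], [8]]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

# No pass/fail rule is programmed explicitly; the model learns it from the data.
model = LogisticRegression()
model.fit(hours_studied, passed)

print(model.predict([[2.5], [6.5]]))  # likely [0 1]
```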
Generative AI: A type of artificial intelligence that can create new, original content, such as text, images, or music, by learning from patterns in existing data. It uses deep learning techniques, like neural networks, to generate output that is often indistinguishable from human-created content.
Intent: In AI classification, an “intent” represents the specific goal or purpose behind a user’s input, helping the AI system understand what the user wants to achieve. It’s a key component of natural language understanding and plays a pivotal role in recognizing and responding to user interactions effectively.
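As a toy illustration, here’s a deliberately naive intent matcher in plain Python. The intents and their example vocabulary are hypothetical; a production system would use a trained classifier or an LLM instead of word overlap:

```python
import re

# Hypothetical intents, each with a small example vocabulary.
INTENT_VOCAB = {
    "check_order_status": {"where", "order", "track", "package", "shipping"},
    "request_refund": {"refund", "money", "back", "return", "cancel"},
}

def classify_intent(user_input: str) -> str:
    """Pick the intent whose vocabulary overlaps most with the input."""
    words = set(re.findall(r"[a-z']+", user_input.lower()))
    return max(INTENT_VOCAB, key=lambda intent: len(words & INTENT_VOCAB[intent]))

print(classify_intent("Where is my package?"))  # check_order_status
print(classify_intent("I want my money back"))  # request_refund
```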
Fine-tuning: Adjusting an AI model’s pre-trained weights on a specific task or dataset by training it further, typically with a smaller learning rate, to adapt its knowledge and optimize its performance for the target task. This process helps the model leverage its pre-trained knowledge while tailoring it to the specific requirements of the new task.
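Here’s a minimal sketch of that idea in PyTorch. A randomly initialized backbone and fake data stand in for real pre-trained weights and a real dataset; the key moves are freezing what the model already knows, attaching a new head, and training with a small learning rate:

```python
import torch
import torch.nn as nn

# Stand-in for a pre-trained model: a frozen backbone plus a new task head.
backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
head = nn.Linear(64, 2)  # new layer for our 2-class target task

for param in backbone.parameters():
    param.requires_grad = False  # keep the pre-trained knowledge intact

# Small learning rate: adapt the model, don't overwrite it.
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(32, 128), torch.randint(0, 2, (32,))  # fake batch
for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(head(backbone(x)), y)
    loss.backward()
    optimizer.step()
```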
Retrieval-Augmented Generation (RAG): A natural language processing technique that combines two key components: a generative model and a retrieval mechanism. It enhances text generation by allowing the model to retrieve and incorporate relevant information from external sources, such as a database or a knowledge base, to produce more contextually informed and coherent text outputs.
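Here’s a bare-bones sketch of that retrieve-then-generate flow. The embed() function below is a random stub (so the retrieval is meaningless as written) and the LLM call is left as a comment; in a real system, both would call actual models:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stub: a real system would call an embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(8)
    return v / np.linalg.norm(v)

documents = [
    "Our return policy allows refunds within 30 days.",
    "Support hours are 9am-5pm, Monday through Friday.",
]
doc_vectors = [embed(d) for d in documents]

def answer(question: str) -> str:
    q = embed(question)
    # Retrieval: find the most relevant document by cosine similarity.
    best = documents[int(np.argmax([q @ d for d in doc_vectors]))]
    # Generation: hand the retrieved context plus the question to the model.
    prompt = f"Context: {best}\n\nQuestion: {question}\nAnswer:"
    return prompt  # a real system would return llm(prompt)

print(answer("Can I get my money back?"))
```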
Vector: A mathematical representation used to store and manipulate data. It consists of a list of numerical values arranged in a specific order, often in a one-dimensional array, and is commonly employed to represent features, attributes, or data points in machine learning and data analysis tasks.
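In code, a vector is just an ordered array of numbers:

```python
import numpy as np

# A 4-dimensional vector describing a (made-up) house:
# [square feet, bedrooms, bathrooms, age in years]
house = np.array([1800.0, 3.0, 2.0, 25.0])
print(house.shape)  # (4,)
```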
Semantic Search: A search technique that goes beyond traditional keyword matching and takes into account the meaning and context of words in order to deliver more relevant search results. It aims to understand the user’s intent and the content of documents to provide a deeper level of understanding and accuracy in search results.
Embedding: The process of converting high-dimensional data, such as text, images, or categorical information, into lower-dimensional numerical representations that capture meaningful relationships and patterns. These embeddings enable machine learning models to effectively process and generalize from complex data, improving their performance in tasks like natural language understanding, image recognition, and recommendation systems.
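This sketch ties embeddings and semantic search together. It assumes the sentence-transformers library and the all-MiniLM-L6-v2 model, which are illustrative choices, not requirements:

```python
from sentence_transformers import SentenceTransformer, util

# Embed the documents and the query into the same vector space.
model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "How to reset your password",
    "Quarterly revenue grew 10%",
    "Steps to recover a locked account",
]
doc_embeddings = model.encode(documents)

# Semantic search: "can't log in" shares no keywords with the answer,
# but the embeddings capture that the meanings are close.
query_embedding = model.encode("I can't log in")
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
print(documents[int(scores.argmax())])  # likely the locked-account doc
```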
Transformers: A class of machine learning models that excel in natural language processing tasks by utilizing self-attention mechanisms, allowing them to capture contextual information efficiently. They have revolutionized various applications, including language translation, text generation, and sentiment analysis, and are known for their scalability and ability to handle large datasets.
Large Language Models (LLMs): Sophisticated artificial intelligence systems that use deep learning techniques to process and generate human-like text. These models are trained on vast amounts of text data, enabling them to understand and generate natural language text for a wide range of tasks, such as text generation, translation, and question-answering.
We’ve been talking about AI since the 1950s. So why all the hype in the last couple of years? A few things have caused this Cambrian explosion:
- Cloud Computing – cheap and accessible compute
- GPU Chips – fast parallel compute
- Transformers – the ‘Attention Is All You Need’ architecture (2017)
- LLMs – building models with billions of parameters had an unexpected result
It turns out that a new type of model introduced in 2017, called a ‘Transformer’, solved a lot of problems with memory. Let’s say we have a long passage of text whose first paragraph mentions a person named Linda who became a millionaire by selling her business.
Earlier models had difficulty remembering information from the beginning of a passage once there was a lot of text. So, getting questions like ‘How did she get rich?’ or ‘What was her name?’ right was hard. A Transformer model is very efficient at using ‘attention’ to determine which information in a passage of text is important to keep track of.
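If you’re curious what ‘attention’ actually computes, here’s a minimal NumPy sketch of scaled dot-product attention, the core operation from that 2017 paper (simplified to one head, with no learned projections):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Every token scores every other token for relevance,
    then takes a relevance-weighted average of their values."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise relevance scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V  # blend the values by relevance

# 5 tokens, each represented by an 8-dimensional vector (made-up numbers).
tokens = np.random.randn(5, 8)
out = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention
print(out.shape)  # (5, 8)
```

This is how the model can connect ‘she’ in a later question back to ‘Linda’ in the first paragraph: the attention weights let every token look directly at every other token, no matter how far apart they are.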
With this breakthrough, companies like OpenAI built massive models using the internet as their source of knowledge. GPT-3, the model ChatGPT was originally built on, has 175 billion parameters, and newer models keep growing. The initial intent of these models was simply to predict the next word. For example:
The man opened his car [door]
She smiled when she looked through the [window]
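You can see this next-word objective in action with a small open model. This sketch assumes the Hugging Face transformers library and uses GPT-2, a much smaller, older relative of ChatGPT trained on the same objective:

```python
from transformers import pipeline

# GPT-2 was trained to predict the next token given everything before it.
generator = pipeline("text-generation", model="gpt2")

result = generator("The man opened his car", max_new_tokens=1)
print(result[0]["generated_text"])  # e.g. "The man opened his car door"
```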
The side effect of this approach was that the model picked up some reasoning skills. It would know things that it wasn’t specifically trained on. For example:
What’s 2 + 2? 4
Write a movie script in the style of Quentin Tarantino.
Now that we have a basic overview of generative AI and large language models, we’ll dive into more details in the next article.