
Unravel the mysteries of machine learning, Generative AI, LLMs, and so much more (and how it all works together) — in this comprehensive AI guide.

AI isn't just a futuristic concept anymore.
It's here, ready to transform how you approach your work by streamlining your daily tasks and unlocking new possibilities.
But to really maximize AI, you need to know the basics.
The essential AI knowledge in this article will demystify the rapidly changing AI landscape and equip you with empowering, practical know-how so you can start using AI effectively… and fast.
In this article:
‘AI’ and ‘ML’ get thrown around a lot… What do they mean?
Artificial intelligence is a broad term referring to a machine’s ability to perform tasks that would typically require human intelligence (e.g., speech recognition, language comprehension, and making decisions or predictions based on data).
What’s not considered AI?
An algorithm is a set of step-by-step instructions that guide machines in performing tasks and making decisions. Algorithms are used across the entire spectrum of AI models.
An AI model refers to a program or algorithm trained and programmed by a human on specific data to achieve an explicitly defined task.
A few types of AI models include:
An AI system refers to the entire infrastructure and framework required for building and deploying AI.
Before we dive into more technical terminology, we think that it’s imperative for you to understand how AI is currently affecting — and could possibly affect — our world.
Many people are concerned about the rapid development of AI and ensuring that it is aligned with societal values and principles.
Some key terminology in this space includes:
Alignment is the process of ensuring that an AI system’s goals align with human values and interests. It is a crucial aspect of the larger concept of responsible AI.
Responsible AI refers to the ethical and responsible use of AI technology — ensuring that AI systems are designed and implemented in a way that respects human rights, diversity, and privacy.
A critical approach to building responsible AI, explainability refers to making AI models — and how they make certain decisions — transparent and easy to understand.
A "black box" is when the internal workings and decision-making processes of an AI model are not easily understood or explained, even by the developers who created it.
"Black boxes" raise concerns related to trust and accountability.
Most responsible AI and alignment research focuses on ensuring that the integration of AI technology into society has a positive impact.
However, we must also plan for singularity, which is a hypothetical point in the future where AI systems become capable of designing and improving themselves without human intervention — surpassing human comprehension in a way that leads to rapid and unpredictable societal changes.
The continued progression of intelligence in AI systems is not only expected but, in some ways, is the goal.
So where is AI intelligence currently at… and, perhaps more importantly, where is it going?
Now that we understand AI’s broader societal implications, let’s explore the more technical aspects of how it all works.
To understand machine learning, it's helpful to think of it as a toolbox filled with different tools that each solve different problems.
Just like tools in a toolbox, there are various machine learning approaches (i.e., tools) — each with its own strengths and weaknesses. To get the result you desire, it’s crucial to use the right tool for the job.
Imagine you work for Amazon and need to build an AI model that recommends products to customers based on their past purchases.
To pull this off, you could choose any of the following machine learning approaches:
The most common approach is supervised learning.
Supervised learning involves teaching an ML model to produce helpful responses by providing a labeled dataset, curated by humans, so that the model can learn the relationship between inputs and the ideal outputs.
The model can then make predictions for new input data based on the patterns it learned from the labeled examples.
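To make this concrete, here is a minimal sketch of supervised learning: a hand-labeled dataset of hypothetical customer features, plus a simple nearest-neighbor rule standing in for a real recommendation model. All products and numbers are invented for illustration.

```python
# Toy supervised learning: a 1-nearest-neighbor "recommender" trained on
# human-labeled examples. Features and labels are made up for illustration.

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(labeled_data, new_input):
    """Return the label of the closest training example."""
    _, best_label = min(labeled_data, key=lambda pair: distance(pair[0], new_input))
    return best_label

# Labeled dataset: (customer features, ideal output) pairs chosen by humans.
# Features here are [past electronics purchases, past book purchases].
training_data = [
    ([5, 0], "recommend: laptop accessories"),
    ([4, 1], "recommend: laptop accessories"),
    ([0, 6], "recommend: new releases in books"),
    ([1, 5], "recommend: new releases in books"),
]

# A new customer who mostly buys electronics gets the matching recommendation.
print(predict(training_data, [6, 1]))
```

Real supervised models generalize far beyond memorized neighbors, but the core loop is the same: labeled examples in, a learned input-to-output mapping out.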
Unsupervised learning is when a model identifies patterns in a dataset without any explicitly labeled outputs.
In other words, it could determine that customers who buy a laptop also tend to purchase a wireless mouse and a laptop case — even without you telling it to look for those patterns.
Unsupervised learning is particularly useful in situations where accurately labeling a large volume of diverse, intricate data would be a prohibitively time-consuming and expensive undertaking for a human to perform.
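A minimal sketch of the idea, using invented shopping baskets: the program surfaces frequently co-purchased pairs without ever being told which patterns to look for.

```python
# Toy unsupervised pattern discovery: count which product pairs appear
# together in baskets, with no labels guiding the search. Data is invented.
from collections import Counter
from itertools import combinations

baskets = [
    {"laptop", "wireless mouse", "laptop case"},
    {"laptop", "wireless mouse"},
    {"headphones", "phone charger"},
    {"laptop", "laptop case"},
]

pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The most frequent pairs emerge from the data itself.
print(pair_counts.most_common(2))
```

Real unsupervised methods (clustering, dimensionality reduction) are far more sophisticated, but the principle is the same: structure emerges from the data rather than from human labels.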
You could also use reinforcement learning, which refers to the model learning by receiving rewards or penalties based on its actions.
Reinforcement learning is akin to teaching the model to play a game in which it:
Reinforcement learning would be a good option if:
The most powerful approach of all (and the one Amazon has actually used to build its recommendation system) is called ‘deep learning’.
Deep learning is a process of training a program called a neural network — which is loosely inspired by the human brain’s structure and function, consisting of many layers of interconnected nodes that work together to process data.
Now that we know the tools in the toolbox, let’s discuss how the tools can be used (i.e., the tasks the tools can perform).
While there are a growing number of ways machine learning is being used, the eight most common include:
Fairly self-explanatory, prediction involves the AI model predicting the likelihood of a certain outcome, typically framed as probabilities.
Classification refers to an AI model identifying patterns in the input data and then using those patterns to predict the category (or label) for new, unseen data points. It can be thought of as categorizing data.
For example, an AI model trained on images of different types of animals (labeled as "cat," "dog," "bird") could then be used to classify a new image and predict whether it shows a cat, dog, or bird.
Natural language processing (NLP) focuses on an AI model understanding and processing language.
NLP is a key function of large language models (LLMs), which are models that process and generate human language (such as ChatGPT). NOTE: We’ll cover more about LLMs in the “A Deep Dive Into LLMs” section.
Computer vision refers to an AI model analyzing and interpreting visual data (e.g., images or videos) from cameras or sensors.
The term speech recognition is a bit misleading since it involves both recognizing speech and transcribing it into written text.
Anomaly detection identifies unusual or abnormal patterns in data.
Clustering involves identifying groupings and patterns in data without explicitly defining the criteria.
Generation, often referred to as generative AI (or 'GenAI'), involves using AI to create new data or content.
Generative AI is seeing an explosion of use cases — and it's particularly exciting when combined with other tasks.
To understand generative AI, it's helpful to familiarize yourself with some essential terms.
Generative AI can be applied across a variety of modalities, which refers to the type of data being processed or generated (e.g., text, image, video, code, speech, music, 3D model, and more).
An input is the data (e.g., text, images, sensor data, or many other types of relevant information) provided to an AI system to explain a problem, situation, or request.
Inputs are fundamental throughout the entire lifecycle of an AI model — from training to deployment and usage.
A prompt (which is a type of input) is the instruction or query a human gives an AI model, providing it with sufficient information to generate the user’s intended output.
Prompts can take many forms (e.g., questions, code snippets, images, or videos). While the most common prompts today are text prompts, prompts can be any modality.
A few prompting-related terms you might hear include:
An inference is the process of an AI model applying the information it learned during training to generate an actionable result (e.g., generating an image).
A completion (also called an ‘output’) refers to the response a model generates — whether that be text, an image, or other modality.
Generative AI models use tokens, vectors, and embeddings to understand inputs and generate completions/outputs.
A token is the smallest unit of data used by AI models to process inputs (including, but not limited to, prompts) and generate outputs.
Tokens represent elements such as words or pixels, depending on the modality.
A vector is a mathematical representation of a token.
Each token gets its own set of numbers that represent its meaning and context — enabling the model to interpret the token in a 'language' it understands.
Embeddings are like supercharged vectors that not only represent tokens but also capture deep meanings and relationships between tokens.
Embeddings help AI models better understand the nuances and overall context of the data.
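A small sketch of the idea, using hand-picked (not learned) three-dimensional vectors; real embeddings are learned during training and have hundreds or thousands of dimensions. Cosine similarity is a standard way to measure how related two embedded tokens are.

```python
# Toy embeddings: each token maps to a small vector, and cosine similarity
# measures relatedness. These vectors are invented for illustration only.
import math

embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.85, 0.75, 0.2],
    "car": [0.1, 0.2, 0.95],
}

def cosine(a, b):
    """Cosine similarity between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine(embeddings["cat"], embeddings["dog"]))  # high: related meanings
print(cosine(embeddings["cat"], embeddings["car"]))  # low: unrelated meanings
```

The key point: similar meanings end up as nearby vectors, which is what lets a model reason about relationships it was never explicitly told about.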
AI architecture involves the AI system’s underlying infrastructure needed to develop, train, deploy, use, and manage AI models — including:
When creating an AI system, the first step after choosing an AI model (e.g., rule-based system or machine learning) is to acquire the computing hardware needed to run it.
Computing power refers to the capability of hardware systems to perform the complex computations required for training and running machine learning models efficiently, and it is a critical ingredient of any machine learning project.
The amount of computing power plays a crucial role in the performance of machine learning algorithms — especially deep learning models, which rely on vast amounts of data and computations.
The availability of high-end computational resources enables the rapid advancement of deep learning models by facilitating parallel processing and faster computations.
High-end computational resources needed for machine learning include:
Graphics Processing Units (GPUs)
GPUs are typically used to render realistic graphics and visuals in modern video gaming and virtual reality systems — but they are also highly prominent in AI development for rapidly processing large datasets and complex algorithms.
Someone building an AI model could either:
Tensor Processing Units (TPUs)
Developed by Google, TPUs are specialized AI accelerators that increase the speed of training and deployment as well as the overall performance of machine learning models.
This boost of speed and performance makes them especially well-suited for generative AI tasks like image and text generation.
The two most common ways to interact with an AI model include:
In cases where you're using an AI model via a web interface provided by a company (e.g., ChatGPT, which is provided by OpenAI), the only software you really need is a web browser (e.g., Chrome, Safari, etc.).
The web browser acts as the client software, allowing you to send queries and receive responses from ChatGPT without needing to install any additional machine learning libraries or frameworks on your local machine.
However, if you wanted to run an AI model locally on your computer, working with it directly requires the installation of:
Open Source
Open source means that an AI model has been made freely available for anyone to use, modify, and distribute — allowing for greater collaboration, transparency, and innovation by enabling developers and researchers to access and build upon the existing model.
Machine Learning Framework
A machine learning framework is a tool that simplifies the development of AI models without requiring software developers, data scientists, and machine learning engineers to delve into the complex underlying algorithms.
Machine learning frameworks offer a range of functionalities to facilitate and streamline model building and training — catering to various needs and preferences.
Library
In the context of machine learning, a library refers to a collection of pre-written code that provides functions and tools for building and implementing machine learning models.
There are many ways to design a model — with one of the most prominent being the design of neural network architecture, which refers to the configuration of a neural network.
If you were creating an AI system, you may choose to go with a neural network architecture if you wanted the model to perform a complex task where relationships within the data might be non-linear and intricate, such as with:
Transformers are currently one of the most discussed AI architectures. You may have heard of them because they are the “T” in GPT (Generative Pre-trained Transformer) — which powers OpenAI’s ChatGPT.
Introduced in 2017, a transformer is a type of neural network architecture that is based on the self-attention mechanism, which allows the model to pay attention to different parts of the input data simultaneously (as opposed to one element at a time) as it learns.
This ability helps it develop a deeper understanding of its training data by learning more complex relationships within the data. Due to their flexibility and power, transformers have been widely adopted for training large language models (LLMs) on extensive datasets.
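The self-attention mechanism described above can be sketched in a few lines. The token vectors below are invented, and a real transformer adds learned query/key/value projections, multiple attention heads, and many stacked layers; this sketch only shows the core computation.

```python
# Minimal scaled dot-product self-attention: every token attends to every
# other token at once, with weights from a softmax over query-key dot products.
import math

def softmax(xs):
    exps = [math.exp(x - max(xs)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over one sequence of token vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)  # how much this token attends to each token
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(len(values[0]))])
    return outputs

# Three toy token vectors serve as queries, keys, and values at once.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = self_attention(tokens, tokens, tokens)
print(out)  # each output row is a weighted blend of all token vectors
```

Because each output is a mixture of every input position, the model can relate distant parts of a sequence in a single step — the property that made transformers so effective for language.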
Generative Adversarial Networks (GANs) are a type of algorithm that pits two neural networks against each other to improve the quality of the generated data (i.e., the two neural networks are trained simultaneously in a competitive manner).
The two networks include a:
The networks iterate until the generator becomes adept at producing data that the discriminator struggles to distinguish from real data.
GANs can be used in a variety of applications (such as generating realistic images) and allow models to generate realistic data that can be used for training other machine learning models.
While a model may leverage transformer architecture and incorporate some GAN-like elements, it is the training process that shapes the model's specific behavior. To influence that behavior, a model may have a distinct training:
Constitutional AI and diffusion models are great examples of this.
Constitutional AI models are embedded with a set of ethical guidelines (such as avoiding harm, respecting preferences, and providing accurate information) within their functioning so that they produce harmless results.
Training Objective: May be trained to understand legal principles & make predictions about legal outcomes
Training Data: May be trained on legal documents, court cases, and historical precedents
Currently the most popular option for image and video generation, diffusion models offer realistic outputs by degrading and reconstructing data systematically. Here’s how it works:
Training Objective: May be trained to analyze & generate social media content
Training Data: May be trained on social media posts, news articles, or other sources of information
Training and deploying AI models happens in four phases:
Most generative AI models are pre-trained on large datasets of complex, unstructured data. Unstructured data refers to data that is not organized in a predefined manner.
This first level of training creates what’s called a “foundation model” (also referred to as a “base model”).
Foundation models are designed to be general-purpose (i.e., capable of performing a diverse array of tasks by encompassing a broad spectrum of knowledge and capabilities).
However, this inherent generality can render them less adept at specific tasks compared to models that are tailored for those particular functions.
To tailor an AI model to excel at specific tasks, it needs further training — which may entail:
Specialized prompts guide a model's existing knowledge and capabilities toward a specific task or output format without altering the model's original state.
A few types of specialized prompting methods include:
AI models can be tailored to a specific task through fine-tuning, which means running extra training on a smaller, carefully chosen dataset.
Unlike prompt engineering, which leaves the base model untouched, fine-tuning changes some of the model’s weights so the new knowledge or style becomes permanent. Modern methods such as LoRA or QLoRA make this cheaper and faster by updating only small “adapter” layers instead of the entire network.
Because the model itself is altered, fine-tuning can cause overfitting—the model may memorize the limited training examples and struggle with new, unseen inputs. To prevent this, teams:
With these safeguards, fine-tuning produces a model that meets the task requirements while still performing well on fresh data.
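To illustrate why adapter methods like LoRA are so much cheaper than full fine-tuning, here is a back-of-the-envelope parameter count for a single weight matrix. The matrix size and rank are illustrative, not taken from any specific model.

```python
# LoRA-style low-rank adaptation: instead of updating a full d x d weight
# matrix W, train two small matrices A (d x r) and B (r x d) and use W + A @ B.
# Sizes below are illustrative.
d, r = 1024, 8

full_update_params = d * d        # parameters touched by full fine-tuning
adapter_params = d * r + r * d    # parameters in the low-rank adapters

print(full_update_params)                                    # 1048576
print(adapter_params)                                        # 16384
print(round(adapter_params / full_update_params * 100, 2))   # 1.56 (% of full)
```

Training roughly 1.5% of the parameters per adapted matrix is what makes fine-tuning feasible on modest hardware, and it also reduces the risk of catastrophically overwriting the base model's knowledge.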
Generative AI models pick each next word by looking at a list of probabilities. Temperature is a single number that scales those probabilities before the model makes its choice. A low value such as 0 forces the model to grab the top-ranked word every time, so responses stay predictable and factual. A high value like 0.8 or 1 spreads out the odds, letting lower-ranked words win more often and producing more varied or creative text.
Changing temperature does not retrain the model; it only adjusts how “bold” the sampler is when choosing the next token. Lower temperatures are best for tasks that demand accuracy and consistency, while higher temperatures can inspire brainstorming, storytelling, or other open-ended writing.
Because higher randomness can also raise the chance of nonsense or “hallucinations,” many production systems start at a conservative setting (0 – 0.3) and increase it only when extra creativity outweighs the risk.
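The effect of temperature can be seen directly by scaling a few made-up word scores before the softmax step. Real models do this over vocabularies of tens of thousands of tokens; the three candidate scores here are invented.

```python
# How temperature reshapes next-token probabilities before sampling.
import math

def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature (must be > 0;
    implementations special-case temperature 0 as plain argmax)."""
    scaled = [l / temperature for l in logits]
    exps = [math.exp(s - max(scaled)) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # raw scores for three candidate words

low = apply_temperature(logits, 0.1)   # near-deterministic: top word dominates
high = apply_temperature(logits, 1.5)  # flatter: lower-ranked words win more often

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the low setting the top word absorbs nearly all the probability mass; at the high setting the runners-up keep a meaningful share, which is exactly the predictable-versus-creative trade-off described above.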
Once the model has been trained and customized, it is either:
Continuous monitoring and feedback loops track the model's performance and ensure that it remains effective over time.
This could involve refining algorithms, retraining the model with fresh data to keep it up-to-date, adding new features, implementing new techniques to handle specific tasks or scenarios better, and more.
While generative AI encompasses a broader range of tasks beyond language generation (including image and video generation, music composition, and more), large language models (LLMs) are specifically designed for tasks revolving around natural language generation and comprehension.
LLMs operate by using extensive datasets to learn patterns and relationships between words and phrases — enabling them to generate coherent and contextually relevant text outputs based on given prompts or inputs.
Despite their impressive capabilities, LLMs:
Although LLMs showcase a wide range of capabilities, they also face and present certain challenges.
Because LLMs demand considerable computational resources, training cost is a significant concern.
Compounding this, there is also concern about a possible GPU shortage, which would exacerbate the problem by making it challenging for organizations to secure the necessary hardware.
The availability of foundation models helps mitigate training costs. However, the costs of inference and fine-tuning these models for specific tasks still persist — creating a potential bottleneck for even the foundation model providers themselves.
While fine-tuning can improve an AI model’s performance at certain tasks (like summarizing), it's not as effective for expanding the AI's knowledge base with entirely new information or improving its understanding of facts.
Instead, adding more varied and targeted data during training (aka data augmentation) is currently seen as a better approach.
However, this is still an area where researchers are actively exploring and learning, so there aren't definitive answers yet.
Context and knowledge are essential for models to perform well — making sourcing more high-quality data a challenge that’s very top of mind.
A context window is the maximum amount of text an LLM can read and remember at once. Recent models have stretched this limit: Gemini 1.5 Pro tests up to 1 million tokens (with 2 million planned), Claude 3 offers 200K tokens for most users and even larger windows for select customers, and GPT-4o’s public API handles 128K tokens — enough to keep full books, long contracts, or multithreaded chats in view without immediate truncation.
Bigger windows still carry trade-offs. Memory and compute costs grow with length, so very long prompts can run slower and cost more, even with speed-ups like FlashAttention 2. If input exceeds the limit, the model may drop early context, forget details, or repeat itself.
Because no window is infinite, many systems pair LLMs with retrieval-augmented generation (RAG). Instead of stuffing every detail into one prompt, RAG fetches only the most relevant snippets on demand, grounding answers and keeping costs down.
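A toy version of the retrieval step in RAG, using simple word overlap in place of real embedding search; the snippets and the question are invented for illustration.

```python
# Toy RAG retrieval: score stored snippets against the question and keep
# only the best match to include in the prompt. Data is invented.

snippets = [
    "The refund policy allows returns within 30 days of purchase.",
    "Our headquarters are located in Springfield.",
    "Shipping is free for orders over fifty dollars.",
]

def retrieve(question, documents, top_k=1):
    """Rank documents by shared words with the question; return the top k."""
    q_words = set(question.lower().split())
    scored = sorted(documents,
                    key=lambda doc: len(q_words & set(doc.lower().split())),
                    reverse=True)
    return scored[:top_k]

question = "What is the refund policy for returns?"
context = retrieve(question, snippets)
print(context[0])  # only the relevant snippet goes into the prompt
```

Production RAG systems replace word overlap with embedding similarity and vector databases, but the shape is the same: fetch the few most relevant snippets on demand instead of stuffing everything into one giant prompt.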
Hallucinations refer to large language models’ tendency to generate plausible-sounding but incorrect or nonsensical information in order to complete the task at hand — posing challenges to the reliability of LLMs.
In the world of LLMs, latency (i.e., the speed at which you get a response) and cost are directly opposed to each other — meaning that faster responses are more expensive, and reduced costs may lead to slower responses.
When companies use LLMs for critical parts of products, they may face a problem because slow responses can make their product ideas impractical.
AI model providers (like Google for Gemini) currently make these decisions for us. However, we can expect more granular control in the future, which would provide more flexibility for optimizing interactions with AI models. In fact, ChatGPT already offers some of this — allowing users to toggle features like web browsing and DALL-E options on and off.
Because of how LLMs are currently trained, a model is unlikely to be fully up to date — lacking knowledge of recent events, trends, or technological advancements, which limits its relevance and accuracy in certain applications.
However, some applications or systems built using LLMs (such as Gemini and Perplexity) may incorporate real-time information by integrating with external APIs or data sources. In such cases, the integration of real-time data sources is handled by the system built around the LLM rather than by the LLM itself.
LLMs are trained to learn general language patterns, grammar, and semantic relationships between words and phrases. Just like other generative AI, training an LLM involves:
During pre-training, the LLM is trained on an extensive dataset with the objective of predicting the next word in a sentence given the previous words.
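The next-word-prediction objective can be illustrated with a toy bigram model built from word counts; real LLMs learn vastly richer patterns over billions of examples, but the objective is the same. The corpus here is invented.

```python
# Toy next-word prediction: count which word follows each word in a tiny
# corpus, then predict the most common follower.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows each word in the training text.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequently observed next word."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

Scaling this idea up — from counting word pairs to a transformer predicting tokens over enormous datasets — is, at heart, what pre-training an LLM is.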
An LLM goes from predicting the next word in a sentence to understanding complex text and semantic meaning through further training and fine-tuning on a large dataset of text.
Additionally, LLMs are equipped with attention mechanisms that help them focus on the relevant parts of the input text — enabling them to understand and generate more sophisticated language by learning more complex relationships within the text data.
This process of fine-tuning allows the model to learn patterns, relationships, and contextual information within the text — enabling it to generate coherent, human-like responses.
AI research teams like OpenAI are actively working to make LLMs as easy to use and helpful as possible. The progress from the first iteration of ChatGPT to GPT-4 clearly shows this.
The first version feels noticeably more like auto-complete, whereas GPT-4 offers a more interactive and conversational experience, making it feel as if you are having a helpful conversation with an assistant.
It’s undeniable that AI has revolutionized the way businesses operate and the way individuals interact with technology.
Two terms currently gaining prominence in AI are automation and agents. But before we delve into those, it’s essential to define the concepts of:
Automation is creating a new level of intelligent workflows that transform how your business operates — turning complex challenges into manageable solutions by delegating tasks to machines.
Automation tools streamline processes by linking data across various tools and needing minimal ongoing manual intervention once established. The automation tools available range from user-friendly low-code/no-code platforms to more sophisticated systems that offer extensive customization for experienced developers.
The key differentiator among the three types of automation lies in their independence and cognitive capabilities.
Traditional Automation: Traditional automation is designed to execute repetitive tasks based on explicit, predefined rules determined by humans and does not learn or evolve.
AI Automation: AI automation uses artificial intelligence capabilities like machine learning and natural language processing (NLP) to enable machines to learn from data and experiences, recognize patterns, and make human-like decisions. AI automations do adapt and evolve over time.
AI Agents: Similar to AI automation, AI agents are designed to perceive their environment and make human-like decisions. However, unlike AI automations, AI agents take autonomous actions (i.e., without needing any human input).
Automation involves six key concepts, including:
APIs play a crucial role in automation by facilitating the real-time exchange of data between different software (such as customer databases, CRM platforms, and social media analytics tools) within an automation workflow.
Almost every automation tool on the market is essentially an API wrapper, which refers to software that provides a more user-friendly way to work with an API by removing the complexity of directly interacting with the API.
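As a sketch of what a wrapper does, here is a hypothetical client that hides endpoint and payload details behind one friendly method. The URL and the low-level call are stand-ins invented for this example, not a real API.

```python
# Toy API wrapper: callers use a simple method and never touch endpoints,
# HTTP verbs, or payload shapes. The service below is entirely hypothetical.

def raw_api_call(endpoint, method, payload):
    """Stand-in for a low-level HTTP request to some automation service."""
    return {"status": 200, "endpoint": endpoint, "echo": payload}

class CrmClient:
    """User-friendly wrapper around the raw API."""

    BASE = "https://api.example-crm.invalid/v1"  # hypothetical URL

    def add_contact(self, name, email):
        payload = {"name": name, "email": email}
        return raw_api_call(f"{self.BASE}/contacts", "POST", payload)

crm = CrmClient()
response = crm.add_contact("Ada Lovelace", "ada@example.com")
print(response["status"])
```

This is exactly the value automation platforms sell: one clean `add_contact`-style action per tool, with the API plumbing handled out of sight.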