The Gist

  • AI essentials. Artificial intelligence, machine learning and deep learning are interconnected fields focusing on developing intelligent systems capable of learning, reasoning and problem-solving like humans.
  • Model varieties. Generative AI, conversational AI, ethical AI and explainable AI are subfields addressing unique challenges and applications, such as data generation, humanlike interactions, ethical considerations and transparency.
  • Building blocks. Terms like activation function, artificial neural network, attention mechanism and backpropagation describe essential components and processes in AI models that enable complex learning and problem-solving.

With all the recent buzz about artificial intelligence, it’s tough to keep up with all the terms that are used to define AI, and even more challenging to understand how AI works. Here are some key terms related to AI, machine learning, natural language processing (NLP), natural language understanding (NLU), as well as subfields like generative AI, conversational AI, ethical AI and explainable AI. 


AI Definitions 

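Activation Function: An activation function applies a nonlinear transformation to a neuron's weighted sum of inputs. This nonlinearity is what allows neural networks to learn complex relationships; without it, a stack of layers would collapse into a single linear transformation.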
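
As a rough illustration, the sketch below applies two widely used activation functions, ReLU and sigmoid, to a single neuron's raw output. The input values, weights and bias are made up for the example and are not taken from any particular model.

```python
import math

def relu(x: float) -> float:
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)

def sigmoid(x: float) -> float:
    """Logistic sigmoid: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative (assumed) values for one neuron.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

# The neuron's raw output is a weighted sum of its inputs plus a bias.
pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias

print("raw output:", pre_activation)
print("after ReLU:", relu(pre_activation))
print("after sigmoid:", sigmoid(pre_activation))
```

Here the raw output is negative, so ReLU maps it to zero while sigmoid maps it to a value a little below 0.5.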

Artificial Intelligence (AI): Artificial Intelligence is a subdomain of computer science focused on developing systems or machines that exhibit humanlike intelligence in tasks such as problem-solving, learning, reasoning, perception and natural language understanding. 

Artificial Neural Network: Inspired by the biological neural networks found in animal brains, artificial neural networks (ANNs) serve as the basis for numerous machine learning and deep learning models. They consist of interconnected nodes, or neurons, arranged into layers to create complex computing systems.

Attention Mechanism: Attention mechanisms improve neural networks by focusing on important input elements, which enhances performance in tasks such as machine translation, text summarization and image captioning.

Backpropagation: Backpropagation is a crucial algorithm for training neural networks, especially deep learning models, by adjusting weights to minimize the error between actual and predicted outputs. It computes the loss function gradient (a mathematical concept that is used to optimize the performance of neural networks) for each weight using the chain rule and iterates backward through layers for optimization.
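
The chain-rule step can be sketched on a deliberately tiny network with one hidden unit and two weights. The network shape, the tanh activation, the squared-error loss, the starting weights and the learning rate are all assumptions chosen to keep the example short.

```python
import math

def forward(w1: float, w2: float, x: float):
    """Forward pass: input -> hidden activation -> output."""
    h = math.tanh(w1 * x)   # hidden unit
    y_hat = w2 * h          # predicted output
    return h, y_hat

def loss_and_grads(w1: float, w2: float, x: float, y: float):
    """Return the squared error and its gradient for each weight."""
    h, y_hat = forward(w1, w2, x)
    loss = (y_hat - y) ** 2
    # Backward pass: apply the chain rule from the loss back to each weight.
    dloss_dyhat = 2.0 * (y_hat - y)
    dloss_dw2 = dloss_dyhat * h            # output-layer weight
    dloss_dh = dloss_dyhat * w2            # error sent back to the hidden unit
    dh_dz = 1.0 - h ** 2                   # derivative of tanh
    dloss_dw1 = dloss_dh * dh_dz * x       # hidden-layer weight
    return loss, dloss_dw1, dloss_dw2

# Assumed starting weights, one training example and a learning rate.
w1, w2 = 0.5, -0.3
x, y = 1.5, 1.0
learning_rate = 0.1

initial_loss = loss_and_grads(w1, w2, x, y)[0]
for _ in range(200):
    _, g1, g2 = loss_and_grads(w1, w2, x, y)
    w1 -= learning_rate * g1
    w2 -= learning_rate * g2
final_loss = loss_and_grads(w1, w2, x, y)[0]

print("loss before training:", initial_loss)
print("loss after training:", final_loss)
```

Real frameworks compute these gradients automatically for millions of weights, but the backward pass follows the same pattern.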

Bag of Words (BoW): A representation that treats a document as an unordered collection (a multiset, or "bag") of its words, ignoring grammar and word order but retaining how often each word occurs.
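
A minimal sketch of the idea, using Python's built-in Counter. The simple lowercase-and-split tokenizer is an assumption for illustration; production systems usually tokenize more carefully.

```python
import re
from collections import Counter

def bag_of_words(document: str) -> Counter:
    """Count how often each word occurs, discarding word order."""
    tokens = re.findall(r"[a-z']+", document.lower())
    return Counter(tokens)

bow = bag_of_words("The cat sat on the mat. The mat was warm.")
print(bow["the"], bow["mat"], bow["cat"])  # prints: 3 2 1

# Word order does not matter: both sentences produce the same bag.
print(bag_of_words("dog bites man") == bag_of_words("man bites dog"))  # prints: True
```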

Bias: For neural networks, bias is an extra parameter that shifts the activation function along the input axis. In the broader AI context, bias refers to systematic errors in model predictions stemming from prejudices or assumptions in the training data.

Convolutional Neural Networks (CNNs): A CNN is a deep learning model for processing gridlike data such as images using convolutional layers with filters to detect spatial hierarchies and recognize patterns at various scales.

Cross-Entropy Loss: A loss function often employed in classification tasks for assessing the difference between predicted and true probabilities.
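
For a single example, cross-entropy can be computed in a few lines. The three-class probabilities below are invented for illustration, and the small `eps` floor is an assumed safeguard against taking the logarithm of zero.

```python
import math

def cross_entropy(true_probs, predicted_probs, eps: float = 1e-12) -> float:
    """Cross-entropy between a true distribution and a predicted one."""
    return -sum(
        t * math.log(max(p, eps))
        for t, p in zip(true_probs, predicted_probs)
    )

true_label = [0.0, 1.0, 0.0]            # the correct class is the second one
confident_right = [0.05, 0.90, 0.05]    # model favors the correct class
confident_wrong = [0.90, 0.05, 0.05]    # model favors the wrong class

loss_right = cross_entropy(true_label, confident_right)
loss_wrong = cross_entropy(true_label, confident_wrong)
print(loss_right, loss_wrong)
```

The loss is small when the model assigns high probability to the correct class and grows quickly when it is confidently wrong.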

Conversational AI: Conversational AI involves technologies that allow computers to engage in humanlike conversations, using NLP, NLU and natural language generation (NLG) techniques. Common applications that use conversational AI include chatbots, voice assistants or other applications enabling natural language-based human-machine interactions.

Clustering: An unsupervised machine learning method that groups data points by similarity without relying on pre-existing labels. Popular algorithms include K-means, hierarchical clustering and density-based spatial clustering of applications with noise (DBSCAN).
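
As one possible illustration, here is a stripped-down K-means for one-dimensional data. The data points and starting centroids are assumptions chosen so the result is easy to follow; real implementations handle many dimensions and choose starting centroids more carefully.

```python
def k_means_1d(points, centroids, iterations: int = 10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)  # prints: [1.5, 10.0]
print(clusters)   # prints: [[1.0, 1.5, 2.0], [9.0, 10.0, 11.0]]
```

No labels are supplied; the two groups emerge purely from how close the points are to one another.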

DALL-E: An AI model developed by OpenAI, which combines the capabilities of generative models and large language models to create images from text descriptions. DALL-E is based on a variant of the GPT-3 architecture, with modifications that enable it to generate images instead of text.

Deep Learning: Deep learning is a subfield of machine learning that emphasizes multilayered artificial neural networks that learn intricate patterns from large datasets, advancing applications like image recognition, speech recognition and natural language processing.

Dimensionality Reduction: Dimensionality reduction techniques minimize dataset features while maintaining essential information, enhancing computational efficiency and addressing the curse of dimensionality. 

Ethical AI: The practice of designing artificial intelligence systems that adhere to ethical principles and values, including considerations of fairness, accountability, transparency and the impact of AI on society. 

Feature Engineering: Feature engineering involves creating or modifying input variables to enhance machine learning model performance, including tasks such as scaling, normalization and encoding categorical variables, often requiring domain knowledge for selecting relevant and informative features.
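
Two of the tasks named above, scaling and encoding categorical variables, can be sketched as follows. The function names and sample values are hypothetical, and min-max scaling and one-hot encoding are only one common choice for each task.

```python
def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values identical: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def one_hot_encode(categories):
    """Turn each category into a vector with a single 1."""
    vocabulary = sorted(set(categories))
    return [[1 if c == v else 0 for v in vocabulary] for c in categories]

print(min_max_scale([10.0, 20.0, 30.0]))       # prints: [0.0, 0.5, 1.0]
print(one_hot_encode(["red", "blue", "red"]))  # prints: [[0, 1], [1, 0], [0, 1]]
```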

Feature Selection: Identifying and selecting crucial features from a dataset for machine learning model building, reducing overfitting, enhancing performance and minimizing computational complexity. Feature selection techniques include filter, wrapper and embedded methods.

Generative Adversarial Networks (GANs): A deep-learning model that includes two neural networks: a generator that creates fake data and a discriminator that distinguishes between real and fake data. The networks compete with each other in a gamelike way, with the generator trying to create more realistic data to fool the discriminator.

Generative AI: A class of machine learning models that can generate new data samples (e.g., a conversation, an answer to a question or an image) resembling the training data. These models, such as generative adversarial networks, have been used in various applications, including image synthesis, text generation, data augmentation and chatbots.

Gradient Descent: An optimization algorithm that minimizes a function, often used in training machine learning models to reduce the loss function. It iteratively adjusts model parameters in the direction of the negative gradient, ideally converging to a local minimum.
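
A minimal sketch of the update rule on a one-variable function. The function f(x) = (x - 3)^2, the starting point, the learning rate and the number of steps are all assumptions picked for illustration.

```python
def gradient_descent(gradient, start: float, learning_rate: float = 0.1, steps: int = 100) -> float:
    """Repeatedly step against the gradient to reduce the function's value."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

# f(x) = (x - 3)^2 has its minimum at x = 3; its gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2.0 * (x - 3.0), start=0.0)
print(minimum)  # a value very close to 3.0
```

Training a model works the same way in principle, except that x becomes the full set of model parameters and the function is the loss.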

Hyperparameters: Settings chosen by the user rather than learned from data, such as the learning rate, batch size and number of hidden layers in a neural network. Hyperparameter tuning seeks the values that work best for a specific model and dataset.

Image-Text Pairs: These consist of images and their related textual descriptions or labels, used to train AI models like image captioning systems or visual question-answering models, enabling them to grasp the connection between visual and textual data.

Input Token: Units of meaning in text that are fed to AI models for training or NLP tasks. They can be words, phrases or characters. They are processed by encoders in Seq2Seq models or other NLP architectures to capture sequence structure and meaning.


Large Language Model: AI models, typically built with deep learning, aimed at understanding and generating humanlike text. Trained on extensive text datasets, they capture complex language patterns. Examples include OpenAI's GPT series (GPT-3, GPT-4), which powers ChatGPT. Products such as Microsoft's Bing chat and Google Bard are built on large language models.

Loss Function: A function that measures the difference between the predicted output of a machine learning model and the actual output or target. The training aims to minimize the loss function, achieved using optimization algorithms such as gradient descent or stochastic (randomly determined) gradient descent.
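
Mean squared error is one common loss function for numeric predictions and makes a compact example. The target and prediction values below are invented for illustration.

```python
def mean_squared_error(targets, predictions) -> float:
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

targets = [1.0, 2.0, 3.0]
perfect = [1.0, 2.0, 3.0]
off_by_two = [1.0, 2.0, 5.0]

print(mean_squared_error(targets, perfect))     # prints: 0.0
print(mean_squared_error(targets, off_by_two))  # one error of 2, squared and averaged over 3 values
```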

Output Token: Single units of meaning in the text generated by sequence-to-sequence models or other NLP tasks. They can be words, phrases or characters, and a Seq2Seq model's decoder creates them one by one to form the complete output sequence.

Overfitting: Overfitting occurs when a machine learning model learns the training data too well, including noise and random changes, rather than just the main pattern. This causes the model to perform poorly on new data. 

Reinforcement Learning (RL): A machine learning approach in which a program, called an agent, learns to make decisions by interacting with an environment, which may be simulated or real. It gets feedback as rewards or penalties and tries to get the highest total reward over time. It's used in areas like video games, robotics and recommendation systems.

Sequence-to-Sequence Models (Seq2Seq): Deep learning structures used to convert an input series of data into an output series. They have an encoder to process the input and a decoder to create the output.

Soft Weights: In neural network attention, soft weights are the probabilities assigned to input elements when attention scores are calculated. Because every input receives some weight rather than a single input being selected outright, the model can attend to multiple inputs at once, making the attention process smoother and more adaptable.
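
The contrast between soft and hard weighting can be sketched with the softmax function, a standard way to turn raw scores into probabilities. The scores are made-up values, and the use of softmax here is an assumption, since the definition above does not name a specific function.

```python
import math

def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # illustrative attention scores for three inputs

soft_weights = softmax(scores)
# A "hard" alternative puts all weight on the single best-scoring input.
hard_weights = [1.0 if s == max(scores) else 0.0 for s in scores]

print(soft_weights)
print(hard_weights)  # prints: [1.0, 0.0, 0.0]
```

With soft weights the highest-scoring input still dominates, but the other inputs keep a nonzero share of the attention.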

Stable Diffusion: An open-source deep learning model released in 2022, developed by Stability AI in collaboration with academic researchers. This model is mainly used for generating detailed images from text descriptions, but it can also be applied to other tasks including inpainting, outpainting and image-to-image translations as guided by a text prompt. Stability AI also offers DreamStudio, a web application for generating images with Stable Diffusion.

Supervised Learning: Supervised learning is a machine learning approach where models are trained using labeled data and learn to map inputs to outputs by minimizing the difference between predicted and actual targets. It includes common tasks such as classification and regression.

Tokenization: Tokenization refers to the process of breaking text into tokens, which are the smallest units of meaning. These tokens can be words, subwords, phrases or characters, depending on the level of granularity chosen. Tokenization is a crucial preprocessing step for many natural language processing tasks.
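
Two levels of granularity can be sketched with Python's standard library. The regular expression used for word-level splitting is a simplifying assumption; production tokenizers, especially the subword tokenizers used by large language models, are considerably more involved.

```python
import re

def word_tokenize(text: str):
    """Split text into words and standalone punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

def char_tokenize(text: str):
    """Split text into individual characters."""
    return list(text)

print(word_tokenize("AI learns fast."))  # prints: ['AI', 'learns', 'fast', '.']
print(char_tokenize("AI"))               # prints: ['A', 'I']
```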

Tokens: As mentioned above, tokens are the smallest units of meaning that a model works with. Tokens serve as the input for AI models that process text, such as large language models, sequence-to-sequence models and classifiers.

Unsupervised Learning: A machine learning approach where models are trained using only input data, without corresponding target outputs. The goal of unsupervised learning is to discover patterns, structures, or relationships within the data on its own, without any prior knowledge or guidance.

Value Vector: The attention mechanism in neural networks includes value vectors that store information about input elements. These vectors combine with attention scores to create a context vector, which is used to generate the output. The attention scores determine the importance of input elements, allowing the model to focus on the most relevant information.
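
How value vectors and attention scores combine into a context vector can be sketched for a single query. The query and key vectors, the dot-product scoring and the scaling by the square root of the dimension follow the common scaled dot-product formulation, which is an assumption here since the definition above does not specify how scores are computed. All numbers are invented.

```python
import math

def attention(query, keys, values):
    """Return the attention weights and the resulting context vector."""
    scale = math.sqrt(len(query))
    # Score each input by how well its key matches the query.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Turn scores into weights that sum to 1 (softmax).
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: weighted sum of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context

query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]        # the first key matches the query best
values = [[10.0, 0.0], [0.0, 10.0]]    # information carried by each input

weights, context = attention(query, keys, values)
print(weights)
print(context)
```

Because the first key matches the query more closely, the first value vector contributes more to the context vector.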

Weight: Parameters in a neural network that set the strength of the connections between neurons. They are adjusted during training so that the network's outputs more closely match the target data.


AI Articles for a Deeper Dive

Now that we’ve covered the terms used to describe the inner workings of AI applications, here are some articles for a deeper dive into specific aspects of AI. They look at how AI can or will affect marketing, advertising, content creation and SEO, as well as issues such as AI’s impact on privacy, legal ramifications and regulations.

Quick AI Links for Additional Information

Here are some quick links to some of the most popular AI organizations and generative AI models: