
Sapience Glossary

Read me for a centralized list of all the AI fancy-speak.

💡 Like any technical field, AI is filled with jargon. We’ve put this glossary together as a cheat sheet to the industry’s jargon. When members of the technology team use these phrases, you can use this page to decode what they’re saying!

Note on proper nouns: This glossary contains both general AI terms and Sapience-specific terms with definitions, e.g. “Project” and “File Manager”. Things within Sapience are always given proper-noun case in these docs: if you see File Manager, that is referring to a Sapience ‘thing’.

 

Agent: There is little consensus in the industry about what an AI agent truly is. In our lexicon, an AI agent means an agent deployed onto the Sapience platform. Agents generally have knowledge, skills, and behavior. For details on how Agents work in Sapience, see:

 

AI: artificial intelligence.

 

Agentic RAG: see Retrieval Augmented Generation.

 

Agentic: Describes AI systems that do more than produce a one-shot response; such behavior is often referred to as agentic under the hood. What this means is that the AI system goes through a loop: it analyzes the query in more detail, comes up with an action plan, and then executes that plan. The term agentic has the same definitional problems as the term agent, and there is still great debate in the industry as to what truly agentic behavior means.

 

Anthropic: One of the three major labs (OpenAI and Google being the others) creating Frontier AI models. Anthropic is closely aligned with Amazon and creates the Claude series of models.

 

Claude: The Frontier model family from Anthropic. It currently ships in two main forms: Claude 4 Sonnet and Claude 4 Opus. Sonnet is their fast model; Opus is their long-thinking reasoner.

 

Context: This is AI lingo for data. Under the hood, any given AI model like Claude or ChatGPT has a limited amount of information it can take as input. For most models this is thousands of words; for some advanced models it is as many as a million. Advanced agents like the ones in Sapience use these very large context windows to teach the AI system new things on the fly.
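
To make the idea of a context window concrete, here is a minimal sketch of budgeting a document against a model’s limit. The 4-characters-per-token ratio is only a rough rule of thumb for English text, and the limits and helper names here are illustrative, not Sapience’s actual logic:

```python
# Rough sketch: will this document fit in a model's context window?
# The ~4 characters-per-token ratio is a common rule of thumb for English;
# real systems count with the model's actual tokenizer (see Token below).

def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_context(document: str, context_window: int = 128_000,
                    reserved_for_reply: int = 4_000) -> bool:
    """Leave headroom for the model's answer when budgeting context."""
    return estimate_tokens(document) <= context_window - reserved_for_reply

small_note = "Meeting moved to 3pm."
huge_report = "word " * 800_000  # stand-in for a very large document

print(fits_in_context(small_note))   # True
print(fits_in_context(huge_report))  # False: needs chunking or a bigger window
```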

 

Conversation: a Sapience concept. This is a series of messages and tool calls between at least one human and one Agent. A chat with a Team of Agents involves at least one human and 2+ Agents; Sapience supports N humans and M Agents in a given chat. See this for more:

 

Frontier Model: There are thousands of AI models available today. The big models from labs pursuing AGI (Artificial General Intelligence) are known as Frontier models. The three main companies creating Frontier models are OpenAI, Anthropic, and Google, in roughly that order of importance.

 

Gemini: The brand name for a series of Frontier models from Google.

 

Generative AI: These days the simple term AI is used far more often than generative AI, but generative AI is actually the more accurate descriptor. There are many types of AI systems, and generative AI is a subset of the broader AI world. Generative AI systems were popularized in November 2022 by the launch of ChatGPT. Modern generative AI systems are based on large language models, also called LLMs, a type of neural network.

 

GPT: Generative pretrained transformer. This refers to a subtype of AI system: deep neural networks built on the transformer architecture, pretrained on vast amounts of text and typically fine-tuned afterwards with techniques like reinforcement learning.

 

GPT-4o: OpenAI’s longtime flagship model for consumers and previously the “brain” behind ChatGPT. Fast. Cheap. OpenAI’s lineup changes frequently; GPT-4o has since been succeeded by the GPT-5 series (see below).

 

GPT-4.1: OpenAI’s large-context model. Sapience uses this in parts because it offers a 1-million-token context window (see Token below). Fast. Moderately expensive.

 

GPT-5, 5.1, 5.2: OpenAI’s latest workhorse models. Each is a “mini reasoner”: not as smart as o3, but faster and cheaper. This is what is behind ChatGPT today.

 

Hallucination: A generous word for a lie from an AI system. A common failure mode of all large language models. Sapience has anti-hallucination technology built in.

 

Hybrid search: a combination of keyword search and semantic search; this is what Sapience uses. Semantic search is an AI search concept where we look for the user’s meaning and search for all of the concepts tied to their query. Keyword search looks for exact phrases, like classic Google search.
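
A minimal sketch of the idea, with made-up scores (this is not Sapience’s actual implementation; it assumes both scores are already normalized to 0–1):

```python
# Minimal sketch of hybrid search: blend a keyword score (e.g. from BM25)
# with a semantic score (e.g. cosine similarity of embeddings).
# The scores below are stand-ins, not real search output.

def hybrid_score(keyword_score: float, semantic_score: float,
                 alpha: float = 0.5) -> float:
    """Weighted blend; alpha=1.0 is pure keyword, 0.0 is pure semantic."""
    return alpha * keyword_score + (1 - alpha) * semantic_score

docs = {
    "leave_policy.pdf":  {"keyword": 0.9, "semantic": 0.4},
    "pto_overview.docx": {"keyword": 0.1, "semantic": 0.8},  # no exact keyword hit
}
ranked = sorted(docs,
                key=lambda d: hybrid_score(docs[d]["keyword"], docs[d]["semantic"]),
                reverse=True)
print(ranked)  # ['leave_policy.pdf', 'pto_overview.docx']
```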

 

LLM: see large language model.

 

Large language model: All of the AI systems you have heard about are large language models. This includes Gemini from Google, Claude from Anthropic, and all of OpenAI’s models, including the GPT series and the o-series reasoners. The phrase derives from how the neural network is built: it is trained on trillions of words of human language (including code).

 

Model: the core of an AI system. They have names like GPT-4o, GPT-4.1, o1, o3, Claude-4-Opus. Think of them as different versions of the core product from the frontier AI labs.

 

o3: a reasoning model from OpenAI. Very powerful. Expensive. We use it in Sapience.

 

o3-pro: OpenAI’s most powerful model. Slow and expensive, but brilliant. We use it in Sapience.

 

OpenAI: the most famous and most important AI company on earth. If you only track one company, make it OpenAI. Its CEO is Sam Altman.

 

Opus: the most powerful reasoning model from Anthropic.

 

Output Mode: A Sapience term. See FAQ above.

 

RAG: see Retrieval Augmented Generation.

 

Retrieval Augmented Generation:

RAG is an AI architecture that enhances the capabilities of large language models (LLMs) by integrating an information retrieval component into the text generation process. This approach allows LLMs to access and incorporate up-to-date, domain-specific, or proprietary information from external data sources—such as databases, document repositories, or APIs—at inference time, rather than relying solely on their static, pre-trained knowledge.

Expand me for more:

Core Components and Workflow

RAG systems typically operate through the following key stages:

| Stage | Description |
| --- | --- |
| Indexing | External data (structured, semi-structured, or unstructured) is converted into embeddings (vector representations) and stored in a vector database for efficient retrieval. |
| Retrieval | Upon receiving a user query, the system retrieves the most relevant documents or data chunks from the external knowledge base using similarity search in the vector space. |
| Augmentation | The retrieved information is injected into the prompt or context provided to the LLM, often through prompt engineering techniques. |
| Generation | The LLM generates a response that is grounded in both its pre-trained knowledge and the newly retrieved, context-specific information. |
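
As an illustration of the retrieval, augmentation, and generation stages above, here is a minimal sketch. `embed`, `vector_db`, and `llm_generate` are hypothetical placeholders, not a specific library’s API, and indexing is assumed to have already happened offline:

```python
# Illustrative sketch of the RAG stages above. embed(), vector_db, and
# llm_generate() are hypothetical placeholders, not a real library's API.
# Indexing is assumed to have already happened offline.

def answer_with_rag(query: str, embed, vector_db, llm_generate,
                    top_k: int = 3) -> str:
    # Retrieval: find the stored chunks whose embeddings are closest to the query's.
    query_vec = embed(query)
    chunks = vector_db.search(query_vec, top_k=top_k)

    # Augmentation: inject the retrieved text into the prompt.
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

    # Generation: the LLM grounds its answer in the retrieved context.
    return llm_generate(prompt)
```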

Key Characteristics

  • Grounded Generation: RAG ensures that generated outputs are based on authoritative, relevant, and current information, reducing the risk of hallucinations or outdated responses
  • No Retraining Required: RAG enables LLMs to incorporate new information without the need for costly and time-consuming retraining, as updates to the knowledge base are immediately accessible
  • Transparency and Source Attribution: Many RAG implementations can cite or reference the sources used, increasing transparency and user trust
  • Domain Adaptability: RAG is especially valuable for applications requiring domain-specific knowledge, such as enterprise chatbots, healthcare, legal research, and technical support

Technical Advantages

  • Improved Accuracy: By referencing external, authoritative data, RAG mitigates the limitations of LLMs trained on static datasets
  • Reduced Hallucination: RAG constrains the model’s outputs to facts present in the retrieved documents, decreasing the likelihood of fabricated or incorrect information
  • Cost Efficiency: Organizations can adapt LLMs to new domains or update knowledge simply by refreshing the external data sources, avoiding the need for model retraining

Example Use Case

When an employee asks a chatbot, “How much annual leave do I have?”, a RAG-enabled system retrieves the relevant HR policy and the employee’s leave record from internal databases, augments the prompt with this data, and the LLM generates a precise, context-aware answer.

Comparison: Traditional LLM vs. RAG

| Feature | Traditional LLM | RAG-Enabled LLM |
| --- | --- | --- |
| Knowledge Source | Static, pre-trained data | Dynamic, external sources (databases, APIs, docs) |
| Update Frequency | Requires retraining for new data | Instantaneous via external data refresh |
| Factual Accuracy | May hallucinate or be outdated | Grounded in retrieved, current information |
| Domain Adaptation | Costly and slow (retraining needed) | Fast and flexible (update knowledge base only) |
| Source Attribution | Rarely available | Often possible (can cite retrieved documents) |
 
 

Vector Store: a specialized type of database that stores embeddings, the AI-native numerical representation of meaning. Sapience uses multiple vector stores to power its semantic search capabilities.
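
To make the concept concrete, here is a toy in-memory vector store. Real vector stores (including whatever Sapience runs internally) add persistence and approximate-nearest-neighbor indexes to handle scale; this sketch only shows the core idea of similarity search over embeddings:

```python
# A toy in-memory vector store. Real vector stores add persistence and
# approximate-nearest-neighbor indexes; this shows only the core idea.
import numpy as np

class TinyVectorStore:
    def __init__(self):
        self.vectors: list[np.ndarray] = []
        self.payloads: list[str] = []

    def add(self, vector: np.ndarray, payload: str) -> None:
        # Normalize once so search reduces to a dot product.
        self.vectors.append(vector / np.linalg.norm(vector))
        self.payloads.append(payload)

    def search(self, query: np.ndarray, top_k: int = 3) -> list[str]:
        query = query / np.linalg.norm(query)
        scores = np.array([v @ query for v in self.vectors])  # cosine similarity
        best = np.argsort(scores)[::-1][:top_k]
        return [self.payloads[i] for i in best]
```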

 

Semantic Search: an advanced information retrieval technique used in Sapience that focuses on understanding the contextual meaning and intent behind a user’s search query, rather than simply matching keywords or phrases. Unlike traditional keyword-based (lexical) search, which retrieves results based on exact word matches, semantic search interprets the meaning of words, their relationships, and the broader context to deliver more relevant and accurate results.

Expand me for more:

Key concepts:

  • Intent Recognition: Semantic search aims to discern what the user actually wants, even if their query is ambiguous or phrased differently from the content in the database.
  • Context Awareness: It considers factors such as previous searches, user location, and the relationships between terms to refine results.
  • Natural Language Understanding: Leveraging natural language processing (NLP) and machine learning (ML), semantic search can handle synonyms, polysemy (words with multiple meanings), and paraphrased queries.
  • Vector Embeddings: Words, sentences, or documents are transformed into high-dimensional vectors (embeddings) that capture semantic similarity, allowing the system to match queries and content based on meaning rather than exact wording (see the sketch after this list).
  • Knowledge Graphs and Ontologies: These structures model relationships between entities (like people, places, or concepts), further enhancing the system’s ability to interpret queries and retrieve contextually relevant information.
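
Here is a tiny, self-contained sketch of the vector-embeddings idea. The three-dimensional vectors are hand-made toys (real embedding models produce hundreds or thousands of dimensions), but they show how cosine similarity matches meaning rather than exact wording:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hand-made 3-dimensional "embeddings" (real models use hundreds of dimensions).
# Read the dimensions loosely as [portable computer, graphics work, cooking].
query = np.array([0.9, 0.8, 0.0])  # "best laptops for graphic design students"
doc_a = np.array([0.8, 0.9, 0.1])  # "notebooks with color-accurate displays"
doc_b = np.array([0.1, 0.0, 0.9])  # "easy weeknight pasta recipes"

print(cosine(query, doc_a))  # high: similar meaning, different words
print(cosine(query, doc_b))  # low: unrelated topic
```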

Comparison Table: Lexical vs. Semantic Search

| Feature | Lexical (Keyword) Search | Semantic Search |
| --- | --- | --- |
| Matching Method | Exact keyword match | Meaning, context, and intent |
| Handles Synonyms | No | Yes |
| Understands Context | No | Yes |
| Disambiguates Terms | No | Yes |
| Personalization | Limited | Supports personalization via user context |
| Core Technologies | Inverted index, tokenization | NLP, ML, vector embeddings, knowledge graphs |

If a user searches for "best laptops for graphic design students":

  • Keyword search returns pages containing those exact words.
  • Semantic search interprets the need for laptops with strong graphics capabilities, ample RAM, and color-accurate displays, returning results even if those exact keywords aren't present.

Technologies Used

  • Natural Language Processing (NLP)
  • Machine Learning (ML)
  • Vector Search and Embeddings
  • Knowledge Graphs and Ontologies
 

SLM: see small language model.

 

Small Language Model: the baby brother of the large language models. Small, fast, cheap. Designed to run on the edge (e.g. a laptop or modern phone).

 

Sonnet: The standard model used when you interact with Claude from Anthropic. It is their equivalent of ChatGPT.

 

Token: AI jargon for a chunk of text, roughly a word or piece of a word. Short, common words like “a” are a single token, while long or rare words like “arachnophobia” are split into several tokens. As a rule of thumb, one token is about three-quarters of an English word.
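
You can see this for yourself with tiktoken, OpenAI’s open-source tokenizer library (other model families use their own tokenizers, so counts vary by model):

```python
# Counting tokens with tiktoken, OpenAI's open-source tokenizer library.
# Other model families use their own tokenizers, so counts vary by model.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for word in ["a", "arachnophobia"]:
    tokens = enc.encode(word)
    print(word, "->", len(tokens), "token(s)")
# Short common words like "a" are one token; rare long words like
# "arachnophobia" are split into several sub-word tokens.
```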

 

Tools: This is AI-engineer speak for a piece of technology made available to an AI agent. For example, you can give an AI agent a tool that lets it search the web, and we do this in Sapience. Equally, you could give it a tool to search a database, which we also do. Even basic operations like editing a file are difficult for AI systems; models cannot do them natively and require tools. Whenever we talk about Sapience taking actions or having skills under the hood, what we are really talking about is providing the AI model with the appropriate context and data on the one hand, and tools to use that context and data on the other. For more on how Sapience uses tools and features on agents, go here:
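
Here is a generic sketch of what giving a model a tool looks like: a schema the model can see, and a dispatcher that runs real code when the model asks to use it. The shape mirrors common function-calling APIs; it is not Sapience’s internal format, and `search_web` here is a stand-in:

```python
# Generic sketch of exposing a "tool" to a model: a name, a schema the model
# can call, and a dispatcher that runs the real code. Not Sapience's format.

def search_web(query: str) -> str:
    """Stand-in for a real web-search integration."""
    return f"Top results for: {query}"

TOOLS = {
    "search_web": {
        "function": search_web,
        "schema": {
            "name": "search_web",
            "description": "Search the web and return the top results.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
}

def dispatch(tool_name: str, arguments: dict) -> str:
    """When the model asks to call a tool, run it and return the result."""
    return TOOLS[tool_name]["function"](**arguments)

print(dispatch("search_web", {"query": "Sapience docs"}))
```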

 

Traceability: The ability to see where an AI system got its information. This is a fundamental capability lacking in all large language models and is one of the great features of Sapience: we allow the user to trace every answer back to source data, including both the document it comes from and the individual page and paragraph.
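
One simple way to implement this, sketched below: every retrieved chunk carries source metadata, so an answer can cite the document, page, and paragraph it came from. The field names are illustrative, not Sapience’s actual schema:

```python
# Sketch of traceability: retrieved chunks carry source metadata so answers
# can be traced back to a document, page, and paragraph. Field names are
# illustrative, not Sapience's actual schema.
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    document: str
    page: int
    paragraph: int

def format_citation(chunk: Chunk) -> str:
    return f'"{chunk.text}" ({chunk.document}, p. {chunk.page}, para. {chunk.paragraph})'

chunk = Chunk("Employees accrue 1.5 days of leave per month.",
              "HR-Policy-2024.pdf", page=12, paragraph=3)
print(format_citation(chunk))
```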

 