URL: https://retrieval-augmented-generation.com/

HANDS-ON
RETRIEVAL-AUGMENTED GENERATION

Jeroen Herczeg


A BOOK WITH INTERACTIVE EXAMPLES THAT TEACHES YOU HOW TO IMPLEMENT
RETRIEVAL-AUGMENTED GENERATION APPLICATIONS THAT ARE PRODUCTION-READY,
EFFECTIVE, AND SAFE.

Get the first chapter →


WHAT IS RETRIEVAL-AUGMENTED GENERATION (RAG)?

Retrieval-Augmented Generation (RAG) is an architecture that combines
information retrieval with large language models (LLMs) to improve the quality
and relevance of the generated text. A RAG system first retrieves relevant
information and then adds it to the prompt sent to the LLM. This gives the LLM
access to up-to-date or private information, so it can give factual answers
backed by verifiable sources.
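
The retrieve-then-augment flow described above can be sketched in a few lines
of Python. This is a minimal illustration, not the book's implementation: the
`DOCUMENTS` corpus, the keyword-overlap scoring, and the function names
`tokenize`, `retrieve`, and `build_prompt` are all assumptions made for this
example. A real system would typically use embeddings and a vector database for
retrieval, and would send the resulting prompt to an LLM.

```python
import re

# Hypothetical in-memory corpus; a real system would query a document store.
DOCUMENTS = [
    "RAG retrieves relevant documents before generating an answer.",
    "Tokenization splits text into smaller units called tokens.",
    "Hybrid search combines keyword search with semantic search.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=2):
    """Rank documents by how many terms they share with the query.

    Keyword overlap is an assumed stand-in for semantic similarity.
    Documents that share no terms with the query are dropped.
    """
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, context_docs):
    """Insert the retrieved documents into the prompt as numbered sources."""
    context = "\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(context_docs, start=1)
    )
    return (
        "Answer the question using only the numbered sources below, "
        "and cite the source numbers you used.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    question = "What does hybrid search combine?"
    # The printed prompt is what would be sent to the LLM.
    print(build_prompt(question, retrieve(question, DOCUMENTS)))
```

Numbering the sources in the prompt is one simple way to let the model point
back to the passages it relied on, which is what makes the answers verifiable.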


DESIGNED TO TEACH YOU PRACTICAL, HANDS-ON METHODS FOR IMPLEMENTING
RETRIEVAL-AUGMENTED GENERATION (RAG).

When you first learn about RAG, it might come across as a simple system meant to
improve the accuracy of a large language model. But once you start implementing
it, you realize that it is quite complicated and requires a good grasp of both
retrieval and generation techniques.

In this book, we will start by examining how large language models work, as well
as their limitations and challenges. Next, we'll take a close look at the RAG
architecture and how it can improve the performance of a language model.
Finally, we'll discuss the most frequent difficulties encountered when
developing a RAG application.

 * Learn how to ingest PDFs and other documents that include multimedia content.
 * Discover the essential elements of RAG and how to choose the right components
   for your system.
 * Understand how to integrate external memory to enhance conversation
   continuity and context in RAG applications.
 * Master advanced optimization techniques, including re-ranking and hybrid
   search, to improve the efficiency and effectiveness of your RAG system.
 * Learn best practices for developing RAG systems that prioritize safety and
   security in production environments.
 * Gain insights into testing and evaluating the performance of your RAG system
   to ensure reliability and accuracy.
 * Discover strategies for forecasting and managing the operational costs
   associated with running a RAG system.

Throughout the book, you will learn through live interactive examples that will
help solidify your understanding. By the end, you will have the confidence to
deploy powerful RAG applications that solve real-world problems.

Get the first chapter for free straight to your inbox →

01 Table of contents


GET A LOOK AT ALL OF THE CONTENT COVERED IN THE BOOK. EVERYTHING YOU NEED TO
KNOW IS INSIDE.

“Hands-On Retrieval-Augmented Generation” consists of 240 tightly edited
pages designed to teach you everything you need to know about
Retrieval-Augmented Generation with no unnecessary filler.


RETRIEVAL-AUGMENTED GENERATION

 1. coming soon Introduction
    
    What is Retrieval-Augmented Generation?

 2. coming soon Understanding the Challenges of Large Language Models
    
    Hallucinations and Inaccuracies
    Knowledge Gaps and Cutoff
    Limited Contextual Understanding
    Observability
    

 3. coming soon Use cases
    
    Question-Answering System
    Conversational Agent
    Real-time Event Commentary
    Content Generation

 4. coming soon Building a Naive RAG System
    
    Components of a RAG System
    Retrieval Implementation
    Generation Implementation

 5. coming soon Advantages and Limitations of RAG
    
    Benefits of RAG
    Potential Drawbacks


LARGE LANGUAGE MODELS

 1. coming soon Foundation
    
    Model Architecture
    Weights and Biases
    Tokenization
    Training
    Inference
    Settings

 2. coming soon Context Window
    
    Sliding Window
    Attention Mechanism
    Memory

 3. coming soon Prompt Engineering
    
    Anatomy of a Prompt
    Zero-shot Prompting
    Few-shot Prompting
    Chain-of-Thought Prompting
    

 4. Fine-Tuning
    
    Transfer Learning
    Domain-Specific Fine-Tuning


DATA INGESTION

 1. coming soon Data Sources
    
    Document Formats
    SERP and REST APIs
    Web Scraping
    Databases
    PDFs, Images, and Multimedia

 2. coming soon Data Preprocessing
    
    Text Splitting
    Converting Unstructured to Structured Data
    Dealing with Noisy Data


VECTOR SEARCH

 1. coming soon Introduction
    
    Keyword Search vs Semantic Search

 2. coming soon Understanding Vectors
    
    Definition of a Vector
    Norms, Distances, and Similarities
    Vector Operations

 3. coming soon Creating Embeddings
    
    What are Embeddings?
    Types of Embeddings
    Training Embeddings
    Pre-trained Embeddings

 4. coming soon Measuring Similarity
    
    Cosine Similarity
    Euclidean Distance
    Dot product

 5. coming soon Approximate Nearest Neighbor (ANN) Search
    
    Introduction to ANN Search
    ANN Algorithms
    Trade-offs in ANN Search

 6. coming soon Vector Databases
    
    Introduction to Vector Databases
    Vector Search Engines
    Vector Indexing and Querying


BUILDING A RAG PIPELINE

 1. coming soon LlamaIndex
    
    Introduction
    Implementation

 2. coming soon LangChain
    
    Introduction
    Implementation

 3. coming soon Haystack
    
    Introduction
    Implementation


ADVANCED TECHNIQUES

 1. coming soon Re-ranking
    
    Introduction
    Re-ranking Strategies
    

 2. coming soon Hybrid Search
    
    Introduction
    Hybrid Search Strategies
    

 3. coming soon Multi-Modal Search
    
    Introduction
    Multi-Modal Search Strategies
    

 4. coming soon Advanced Embedding Techniques
    
    Contextual Embeddings
    Dynamic Embeddings


DEPLOYMENT

 1. coming soon Model Serving
    
    Hosted Services
    On-premises Deployment
    Model Versioning
    
    

 2. coming soon Continuous Integration and Deployment
    
    CI/CD Pipelines
    Evaluation
    Monitoring

 3. coming soon Scalability and Performance
    
    Horizontal and Vertical Scaling
    Performance Tuning


PRODUCTION-READY

 1. coming soon Operational
    
    Cost Analysis
    Monitoring

 2. coming soon Security
    
    Data Privacy
    Model Security
    Compliance

02 Interactive examples


ACCELERATING YOUR UNDERSTANDING.

Explore interactive examples deployed on Hugging Face Spaces to accelerate your
understanding as you progress through the book.

 * LLM CHAT (running)
 * TOKENIZATION (running)
 * VECTOR SEARCH (running)
 * HYBRID SEARCH (running)
 * EVALUATION (running)

MORE COMING

03 Pre-order


BECOME AN EARLY READER.

Enter your email address and I’ll send you the first chapter from the book for
free.



> “We are currently living in a remarkable time. Artificial intelligence is
> advancing at an unprecedented rate. With access to the most advanced AI
> models, we are now able to develop software features that were previously
> difficult or even impossible to create. The future of AI is not just about
> algorithms and data. It's about the people who harness these models to solve
> real-world problems.”
> 
> — Jeroen Herczeg, Author

04 Author


JEROEN HERCZEG


HEY THERE, I’M THE AUTHOR.

I have worked in software engineering for over two decades, specializing in
building and maintaining efficient, reliable, and scalable systems. In 2015, I
discovered my passion for artificial intelligence, and I have since been
learning more about the field and how to apply it in practice. As a speaker at
various meetups, I have always been passionate about learning and sharing my
knowledge with others.


Copyright © 2024 Jeroen Herczeg
All rights reserved.