
Overview

The RAG (Retrieval-Augmented Generation) Chat API answers questions over uploaded documents. Each document is processed, chunked, embedded, and indexed for semantic search. At query time, the most relevant chunks are passed to Google Gemini as context for the answer.

Base Path: /rag-chat/

Key Features

Document Upload

Upload PDF documents for indexing

Vector Search

Semantic similarity search using embeddings

AI Responses

Google Gemini-powered contextual answers

Query History

Track and retrieve past queries

Endpoints

Upload Document

POST /rag-chat/api/upload/

Upload and process a PDF document for RAG indexing
URL: /rag-chat/api/upload/
Method: POST
Content-Type: multipart/form-data
Auth: Optional (user tracked if authenticated)
Implemented in rag_chat/views.py:105 as UploadDocumentView.
Form fields:
file (file, required): PDF file to upload (must have a .pdf extension)
title (string, optional): Display name for the document (defaults to the filename)

Processing Pipeline

  1. Validation - Checks file type (PDF only)
  2. Database Record - Creates DocumentCollection with status processing
  3. Text Extraction - Extracts text from PDF pages
  4. Chunking - Splits text into 500-char chunks with 50-char overlap
  5. Embeddings - Generates vector embeddings for each chunk
  6. Indexing - Stores chunks and embeddings in database
  7. Status Update - Sets status to indexed or error
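
Step 4 of the pipeline can be illustrated with a minimal sketch of fixed-size character chunking with overlap. The function name chunk_text is hypothetical and this is not the project's DocumentLoader code; it only shows what 500-character chunks with a 50-character overlap imply, namely that each chunk starts 450 characters after the previous one.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper: illustrates the documented 500/50 settings,
    not the project's actual DocumentLoader implementation.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # 450 with the documented defaults
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks
```

With these defaults, a 1000-character text yields three chunks, and the last 50 characters of one chunk repeat as the first 50 of the next, so a sentence cut at a boundary still appears whole in one of them.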

Example Request

curl -X POST https://your-domain.com/rag-chat/api/upload/ \
  -F "file=@/path/to/manual.pdf" \
  -F "title=User Manual v2.0"
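
The same upload can be prepared from Python. The sketch below uses only the standard library and builds the multipart/form-data request without sending it. The helper name build_upload_request is hypothetical, your-domain.com is the placeholder from the curl example, and the client-side extension check is an assumption that mirrors the documented PDF-only rule.

```python
import urllib.request
import uuid


def build_upload_request(base_url: str, filename: str, pdf_bytes: bytes,
                         title: str = "") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for the upload endpoint."""
    # Client-side guard (assumption): fail early instead of waiting for a 400.
    if not filename.lower().endswith(".pdf"):
        raise ValueError("the API only accepts files with a .pdf extension")
    boundary = uuid.uuid4().hex
    body = b""
    if title:
        # Optional "title" form field
        body += (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="title"\r\n\r\n'
            f"{title}\r\n"
        ).encode("utf-8")
    # Required "file" form field, followed by the raw PDF bytes
    body += (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    body += pdf_bytes + f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rag-chat/api/upload/",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the upload; a successful call should come back with status 201 and the JSON body described below.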

Success Response (201)

collection_id (integer, required): Unique identifier for the uploaded document collection
title (string, required): Document title
status (string, required): Processing status: indexed, processing, pending, or error
chunk_count (integer, required): Number of text chunks created from the document
page_count (integer, required): Total pages in the PDF document

{
  "collection_id": 3,
  "title": "User Manual v2.0",
  "status": "indexed",
  "chunk_count": 45,
  "page_count": 12
}

Error Response (400)

{
  "error": "No se envió archivo"
}
{
  "error": "Solo PDFs soportados"
}

Error Response (500)

{
  "error": "Failed to extract text from PDF"
}

Query RAG

POST /rag-chat/api/query/

Ask a question and get an AI-powered answer based on uploaded documents
URL: /rag-chat/api/query/
Method: POST
Content-Type: application/json
Auth: Optional (query saved if authenticated)
Implemented in rag_chat/views.py:190 as QueryRAGView.
Request body:
collection_id (integer, required): ID of the document collection to query
query (string, required): User's question (cannot be empty)
top_k (integer, default: 3): Number of relevant chunks to retrieve (1-10 recommended)

How It Works

  1. Query Embedding - Converts user question to vector
  2. Similarity Search - Finds top-k most relevant document chunks
  3. Context Building - Constructs prompt with relevant passages
  4. AI Generation - Calls Google Gemini API with context
  5. Response - Returns answer with source citations
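
Step 3 above (context building) can be sketched as follows. The prompt wording, the function name build_prompt, and the chunk dictionary keys ("content", "page") are assumptions modelled on the response fields documented for this endpoint; the actual prompt used in rag_chat/views.py is not shown on this page.

```python
def build_prompt(question: str, chunks: list[dict]) -> str:
    """Assemble a grounded prompt from retrieved chunks (illustrative wording)."""
    passages = []
    for number, chunk in enumerate(chunks, start=1):
        # Label each passage so the model can refer back to it
        pages = ", ".join(str(p) for p in chunk.get("page", []))
        label = f"[{number}] (page {pages})" if pages else f"[{number}]"
        passages.append(f"{label} {chunk['content']}")
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Numbering the passages is one way to make source citations traceable; any scheme that keeps the page metadata next to the text would serve the same purpose.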

Example Request

curl -X POST https://your-domain.com/rag-chat/api/query/ \
  -H "Content-Type: application/json" \
  -d '{
    "collection_id": 3,
    "query": "How do I reset my password?",
    "top_k": 3
  }'
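
A Python equivalent of this request might look like the sketch below. It validates the documented constraints before building the request and does not send anything. The helper name build_query_request is hypothetical, and rejecting top_k outside 1-10 is a client-side choice based on the recommended range, not a rule the server is documented to enforce.

```python
import json
import urllib.request


def build_query_request(base_url: str, collection_id: int, query: str,
                        top_k: int = 3) -> urllib.request.Request:
    """Build (but do not send) the JSON POST for /rag-chat/api/query/."""
    if not query.strip():
        raise ValueError("query cannot be empty")
    if not 1 <= top_k <= 10:
        raise ValueError("top_k outside the recommended 1-10 range")
    payload = {"collection_id": collection_id, "query": query, "top_k": top_k}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rag-chat/api/query/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```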

Success Response (200)

answer (string, required): AI-generated answer based on document context
sources (array, required): Array of source chunks used to generate the answer
  sources[].content (string): Excerpt from the relevant document chunk (truncated to 200 chars)
  sources[].page (array): Page number(s) where this content appears
  sources[].similarity (number): Similarity score (0-1) indicating relevance
collection_title (string, required): Title of the document collection queried

{
  "answer": "To reset your password, go to the Settings page and click 'Reset Password'. You will receive an email with instructions to create a new password.",
  "sources": [
    {
      "content": "[Página 5] Password Reset: Navigate to Settings > Security > Reset Password. An email will be sent to your registered address...",
      "page": [5],
      "similarity": 0.892
    },
    {
      "content": "[Página 12] For security purposes, password reset links expire after 24 hours. If your link has expired, request a new one...",
      "page": [12],
      "similarity": 0.754
    },
    {
      "content": "[Página 3] Account security features include two-factor authentication and password reset functionality...",
      "page": [3],
      "similarity": 0.698
    }
  ],
  "collection_title": "User Manual v2.0"
}
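
Because each source carries a similarity score, a client can decide how much of the returned evidence to display. The sketch below keeps only sources at or above a cutoff and orders them best first. The function name strong_sources and the 0.7 cutoff are illustrative assumptions, not values used by the API.

```python
def strong_sources(response: dict, min_similarity: float = 0.7) -> list[dict]:
    """Return sources at or above a similarity cutoff, best match first."""
    kept = [
        source for source in response.get("sources", [])
        if source.get("similarity", 0.0) >= min_similarity
    ]
    return sorted(kept, key=lambda source: source["similarity"], reverse=True)
```

Applied to the example response above with the default cutoff, this keeps the two excerpts from pages 5 and 12 and drops the 0.698 match.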

Error Responses

400 - Bad Request
{
  "error": "Query vacío"
}
{
  "error": "collection_id requerido"
}
{
  "error": "Colección no lista (status: processing)"
}
404 - Not Found
{
  "error": "Colección no encontrada"
}
500 - Server Error
{
  "error": "Error API: 500 - API key invalid"
}
Requires GOOGLE_API_KEY environment variable. See rag_chat/views.py:40 for implementation.

List Documents

GET /rag-chat/api/documents/

Retrieve all uploaded document collections
URL: /rag-chat/api/documents/
Method: GET
Auth: Optional (returns all documents)
Implemented in rag_chat/views.py:272 as ListDocumentsView.

Example Request

curl -X GET https://your-domain.com/rag-chat/api/documents/

Success Response (200)

documents (array, required): Array of document collection objects
  documents[].id (integer): Document collection ID
  documents[].title (string): Document title
  documents[].status (string): Processing status: pending, processing, indexed, or error
  documents[].page_count (integer): Number of pages in the document
  documents[].chunk_count (integer): Number of indexed chunks
  documents[].created_at (string): ISO 8601 timestamp of upload
  documents[].error (string|null): Error message if status is error

{
  "documents": [
    {
      "id": 3,
      "title": "User Manual v2.0",
      "status": "indexed",
      "page_count": 12,
      "chunk_count": 45,
      "created_at": "2026-03-06T14:30:00Z",
      "error": null
    },
    {
      "id": 2,
      "title": "API Documentation",
      "status": "processing",
      "page_count": 8,
      "chunk_count": 0,
      "created_at": "2026-03-05T10:15:00Z",
      "error": null
    },
    {
      "id": 1,
      "title": "Old Manual",
      "status": "error",
      "page_count": 0,
      "chunk_count": 0,
      "created_at": "2026-03-01T08:00:00Z",
      "error": "Failed to extract text from PDF"
    }
  ]
}
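
Since uploads report their progress through the status field, a client typically looks up one collection in this listing before querying it. A minimal sketch, assuming the response has already been decoded from JSON; the helper name collection_state is hypothetical.

```python
from typing import Optional


def collection_state(listing: dict, collection_id: int) -> tuple[str, Optional[str]]:
    """Return (status, error) for one collection from a /documents/ response."""
    for doc in listing.get("documents", []):
        if doc.get("id") == collection_id:
            return doc["status"], doc.get("error")
    raise KeyError(f"collection {collection_id} not in listing")
```

A status of indexed means the collection can be queried; error comes with a message in the second element of the tuple.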

Delete Document

DELETE /rag-chat/api/document/<int:collection_id>/

Remove a document collection and all associated chunks
URL: /rag-chat/api/document/<collection_id>/
Method: DELETE
Auth: Optional
Implemented in rag_chat/views.py:300 as DeleteDocumentView.
Path parameters:
collection_id (integer, required): ID of the document collection to delete
Deleting a collection cascades to delete all associated DocumentChunk records and removes the physical file from storage.

Example Request

curl -X DELETE https://your-domain.com/rag-chat/api/document/3/

Success Response (200)

message (string, required): Confirmation message

{
  "message": "Documento \"User Manual v2.0\" eliminado correctamente"
}

Error Responses

404 - Not Found
{
  "error": "Documento no encontrado"
}
500 - Server Error
{
  "error": "Failed to delete file from storage"
}

Query History

GET /rag-chat/api/history/

Retrieve past RAG queries (for all users or current user)
URL: /rag-chat/api/history/
Method: GET
Auth: Optional (returns all history)
Implemented in rag_chat/views.py:333 as QueryHistoryView.
Query parameters:
limit (integer, default: 20): Maximum number of queries to return

Example Request

curl -X GET "https://your-domain.com/rag-chat/api/history/?limit=10"

Success Response (200)

history (array, required): Array of past query objects
  history[].id (integer): Query record ID
  history[].query (string): User's original question
  history[].response (string): AI-generated response (truncated to 200 chars)
  history[].collection (string|null): Title of the document collection queried
  history[].created_at (string): ISO 8601 timestamp of the query

{
  "history": [
    {
      "id": 42,
      "query": "How do I reset my password?",
      "response": "To reset your password, go to the Settings page and click 'Reset Password'. You will receive an email with instructions to create a new password...",
      "collection": "User Manual v2.0",
      "created_at": "2026-03-06T14:35:22Z"
    },
    {
      "id": 41,
      "query": "What are the system requirements?",
      "response": "The system requires Python 3.8+, Django 5.2+, MySQL database, and at least 2GB RAM for optimal performance...",
      "collection": "API Documentation",
      "created_at": "2026-03-06T13:20:15Z"
    }
  ]
}
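
The created_at values end in "Z", which datetime.fromisoformat only accepts from Python 3.11 onward. The sketch below normalises the suffix so the timestamps parse on older versions too, then sorts history entries newest first. The function names are hypothetical, and it assumes every entry carries a created_at string in the format shown above.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2026-03-06T14:35:22Z into an aware datetime."""
    # Replace the trailing "Z" with an explicit UTC offset for older Pythons.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def newest_first(history: list[dict]) -> list[dict]:
    """Sort history entries by created_at, most recent first."""
    return sorted(
        history,
        key=lambda item: parse_timestamp(item["created_at"]),
        reverse=True,
    )
```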

Data Models

DocumentCollection

Defined in rag_chat/models.py:6
| Field | Type | Description |
| --- | --- | --- |
| id | Integer | Primary key |
| user | ForeignKey | Uploader (nullable) |
| title | String | Document name |
| file | FileField | PDF file path |
| file_type | String | File format (default: pdf) |
| page_count | Integer | Total pages |
| chunk_count | Integer | Total chunks indexed |
| status | String | pending, processing, indexed, error |
| error_message | Text | Error details |
| created_at | DateTime | Upload timestamp |
| updated_at | DateTime | Last modification |

DocumentChunk

Defined in rag_chat/models.py:37
| Field | Type | Description |
| --- | --- | --- |
| id | Integer | Primary key |
| collection | ForeignKey | Parent document |
| chunk_index | Integer | Position in document |
| content | Text | Chunk text |
| embedding | JSONField | Vector embedding (list of floats) |
| metadata | JSONField | Page numbers, sections, etc. |
| created_at | DateTime | Creation timestamp |

RAGQuery

Defined in rag_chat/models.py:74
| Field | Type | Description |
| --- | --- | --- |
| id | Integer | Primary key |
| user | ForeignKey | Querying user |
| collection | ForeignKey | Document queried |
| query | Text | User question |
| response | Text | AI-generated answer |
| chunks_used | JSONField | Array of chunk IDs |
| created_at | DateTime | Query timestamp |

Technical Details

Embeddings Generation

The system uses sentence transformers to generate embeddings. See rag_chat/embeddings.py for the implementation.

from .embeddings import get_embedding_generator

# Encode a batch of text chunks into embedding vectors
generator = get_embedding_generator()
texts = ["example text chunk"]
embeddings = generator.encode_documents(texts, show_progress=True)

Vector search is a database-backed cosine similarity search, provided by DatabaseVectorStore in rag_chat/vector_store.py:

from .vector_store import DatabaseVectorStore

# collection_id: ID of an indexed collection; query_vector: embedding of the question
vector_store = DatabaseVectorStore(collection_id, dimension=384)
results = vector_store.search(query_vector, k=3)
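
To make the cosine similarity search concrete, here is a minimal pure-Python sketch of the technique. It is not the DatabaseVectorStore implementation; the function names and the (chunk_id, embedding) input shape are assumptions chosen for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector: list[float],
          stored: list[tuple[str, list[float]]],
          k: int = 3) -> list[tuple[str, float]]:
    """Score every stored embedding against the query and keep the k best."""
    scored = [
        (chunk_id, cosine_similarity(query_vector, embedding))
        for chunk_id, embedding in stored
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

This brute-force scan compares the query against every stored chunk, which is simple and adequate for small collections; larger ones usually call for a dedicated vector index.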

Google Gemini Integration

Implemented in rag_chat/views.py:27 as _call_google_api_with_context().
Model: gemini-2.0-flash
Endpoint: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent
Requires the GOOGLE_API_KEY environment variable. Calls have a 30-second timeout.
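
A call to this endpoint could be prepared as in the sketch below, which uses only the standard library and does not send the request. It is not the code in _call_google_api_with_context(); the helper names are hypothetical, and passing the key in the x-goog-api-key header is one supported option that may differ from what the project does.

```python
import json
import urllib.request

GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-2.0-flash:generateContent"
)


def build_gemini_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request with a single text part."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=GEMINI_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def extract_text(response_json: dict) -> str:
    """Join the text parts of the first candidate in a generateContent response."""
    parts = response_json["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)
```

To mirror the documented timeout, the request would be sent with urllib.request.urlopen(request, timeout=30), with the key read from os.environ["GOOGLE_API_KEY"].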

Document Processing

Chunk Size: 500 characters
Overlap: 50 characters
Supported Formats: PDF only
See rag_chat/document_loader.py for the DocumentLoader implementation.

Example Workflow

1. Upload Document

curl -X POST https://your-domain.com/rag-chat/api/upload/ -F "file=@/path/to/manual.pdf"
Receive collection_id: 3

2. Check Status

curl https://your-domain.com/rag-chat/api/documents/
Verify status is indexed

3. Query Document

curl -X POST https://your-domain.com/rag-chat/api/query/ \
  -H "Content-Type: application/json" \
  -d '{"collection_id": 3, "query": "How do I...?"}'
Receive answer with sources

4. View History

curl "https://your-domain.com/rag-chat/api/history/?limit=5"
See past queries and responses

Error Handling

Common Issues

Cause: Missing environment variable
Solution: Set GOOGLE_API_KEY in .env or the environment

Cause: Uploaded file is not a PDF
Solution: Convert the document to PDF format

Cause: Document still being indexed
Solution: Wait for processing to complete, then retry

Cause: No chunks matched the query above threshold
Solution: Rephrase the query or upload more comprehensive documents
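
The "still being indexed" case lends itself to an automatic retry. The sketch below retries a query while the API answers 400 with a "status: processing" message and gives up on any other error. The function name query_when_ready, the send_query callable (expected to return a status code and decoded JSON body), and the attempt and delay defaults are all assumptions for illustration.

```python
import time
from typing import Callable


def query_when_ready(send_query: Callable[[], tuple[int, dict]],
                     attempts: int = 5, delay: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call send_query until the collection is ready or attempts run out."""
    for attempt in range(1, attempts + 1):
        status, body = send_query()
        if status == 200:
            return body
        # Only the "collection not ready" 400 is worth retrying.
        not_ready = status == 400 and "status: processing" in body.get("error", "")
        if not not_ready or attempt == attempts:
            raise RuntimeError(f"query failed ({status}): {body.get('error')}")
        sleep(delay)
    raise RuntimeError("attempts must be at least 1")
```

Matching on the error text is brittle if the message wording changes; checking the collection status through the documents endpoint before querying is a sturdier alternative.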

Next Steps

Authentication

Learn about user authentication

API Overview

Explore other API endpoints
