Overview
The RAG (Retrieval-Augmented Generation) API allows you to upload documents and create knowledge bases that your AI agents can search. Documents are uploaded to Google Cloud Storage and indexed in Vertex AI Search for semantic retrieval.
Workflow: First create a DataStore using POST /rag/datastores, then upload files to that DataStore using POST /rag/upload. Files are indexed asynchronously via background tasks.
Authentication
All RAG endpoints require API key authentication with specific scopes:
| Endpoint | Required Scope |
|---|---|
| POST /rag/datastores | rag:upload |
| GET /rag/datastores | rag:read |
| GET /rag/datastores/{id}/files | rag:read |
| DELETE /rag/datastores/{id} | rag:delete |
| DELETE /rag/datastores/{id}/files/{filename} | rag:delete |
| POST /rag/upload | rag:upload |
| GET /rag/uploads | rag:read |
| GET /rag/uploads/{upload_id} | rag:read |
Pass your API key in the X-API-Key header of every request.
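A minimal Python sketch of attaching the key. Only the X-API-Key header name comes from this page; the helper name and the JSON Content-Type are illustrative assumptions.

```python
def auth_headers(api_key: str) -> dict:
    """Return request headers carrying the API key (helper name is illustrative)."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "X-API-Key": api_key,                # header name from this page
        "Content-Type": "application/json",  # assumption: request bodies are JSON
    }
```

The key must carry the scope listed for the endpoint being called, for example rag:read for the GET endpoints.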
Supported File Types
The following file types can be uploaded to RAG DataStores:
| MIME Type | Extension | Description |
|---|---|---|
| application/pdf | .pdf | PDF documents |
| application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx | Microsoft Word documents |
| application/vnd.openxmlformats-officedocument.presentationml.presentation | .pptx | Microsoft PowerPoint presentations |
| text/plain | .txt | Plain text files |
| text/html | .html | HTML documents |
| application/json | .json | JSON files |
| application/x-ndjson | .jsonl | Newline-delimited JSON |
| text/markdown | .md, .markdown | Markdown documents |
File Limits: Maximum 10MB per file, maximum 10 files per upload request.
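Checking files on the client before uploading avoids a round trip that would end in a 400 response. The sketch below mirrors the table and the per-file limit above; the function name is illustrative, and reading "10MB" as 10 * 1024 * 1024 bytes is an assumption, since the exact byte count is not stated here.

```python
import os

# Extension -> MIME type, copied from the supported file types table.
SUPPORTED_TYPES = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".txt": "text/plain",
    ".html": "text/html",
    ".json": "application/json",
    ".jsonl": "application/x-ndjson",
    ".md": "text/markdown",
    ".markdown": "text/markdown",
}
MAX_FILE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" means 10 MiB


def check_file(filename: str, size_bytes: int) -> str:
    """Return the MIME type for a file, or raise ValueError if it cannot be uploaded."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported file type: {ext or '(no extension)'}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError(f"{filename} exceeds the 10MB per-file limit")
    return SUPPORTED_TYPES[ext]
```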
Create DataStore
Create a new DataStore to organize uploaded documents.
Endpoint
POST /rag/datastores
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| display_name | string | Yes | Human-readable name for the DataStore (e.g., “Company Policies”) |
| description | string | No | Optional description of the DataStore’s purpose |
Request Example
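A minimal Python sketch of this request using only the standard library. The base URL is a hypothetical placeholder and the function names are illustrative; the path, header, and body fields follow this page.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical; replace with your API host


def build_create_datastore_request(api_key, display_name, description=None):
    """Build (but do not send) the POST /rag/datastores request."""
    body = {"display_name": display_name}
    if description is not None:
        body["description"] = description  # optional field
    return urllib.request.Request(
        f"{BASE_URL}/rag/datastores",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def create_datastore(api_key, display_name, description=None):
    """Send the request and return the parsed JSON response."""
    req = build_create_datastore_request(api_key, display_name, description)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The datastore_id field of the response is the value to pass to the upload endpoint.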
Response
Status Code: 200 OK
Response Fields
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether creation was successful |
| datastore_id | string | Unique ID for the DataStore (use this for uploads) |
| display_name | string | Human-readable name |
| description | string | DataStore description (null if not provided) |
| message | string | Success message |
Error Responses
409 Conflict - DataStore Already Exists
Upload Files
Upload files to an existing DataStore. File contents are sent as Base64-encoded strings and indexed asynchronously.
Endpoint
POST /rag/upload
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| datastore_id | string | Yes | ID of existing DataStore (from create response) |
| files | array | Yes | Array of files to upload (min: 1, max: 10) |
| files[].filename | string | Yes | Filename with extension (e.g., “policy.pdf”) |
| files[].content | string | Yes | Base64-encoded file content |
| files[].content_type | string | No | MIME type (auto-detected from extension if not provided) |
Request Example
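A sketch of building the request body in Python. The function name is illustrative; the field names and the 1 to 10 file limit come from the request body table above. The optional content_type is omitted here so the server can detect it from the extension.

```python
import base64


def build_upload_payload(datastore_id, files):
    """Build the JSON body for POST /rag/upload.

    `files` is a list of (filename, raw_bytes) pairs.
    """
    if not 1 <= len(files) <= 10:
        raise ValueError("an upload request takes between 1 and 10 files")
    return {
        "datastore_id": datastore_id,
        "files": [
            {
                "filename": filename,
                # Base64 output is ASCII, so it can be embedded in JSON as text.
                "content": base64.b64encode(raw).decode("ascii"),
            }
            for filename, raw in files
        ],
    }
```

The resulting dictionary can be serialized with json.dumps and sent with the X-API-Key header. Each entry of uploads in the response carries an upload_id for tracking the file.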
Response
Status Code: 200 OK
Response Fields
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether upload was initiated |
| datastore_id | string | Target DataStore ID |
| datastore_name | string | Target DataStore display name |
| uploads | array | Status of each uploaded file |
| uploads[].upload_id | string | Unique upload ID (UUID) |
| uploads[].filename | string | Original filename |
| uploads[].status | string | Upload status (processing) |
| message | string | Summary message |
Error Responses
404 Not Found - DataStore Not Found
400 Bad Request - Unsupported File Type
400 Bad Request - File Too Large
400 Bad Request - Invalid Base64
List DataStores
Retrieve all DataStores in your organization.
Endpoint
GET /rag/datastores
Request Example
Response
Status Code: 200 OK
Response Fields
| Field | Type | Description |
|---|---|---|
| datastores | array | List of DataStores |
| datastores[].datastore_id | string | Unique DataStore ID |
| datastores[].display_name | string | Human-readable name |
| datastores[].description | string | DataStore description (null if not set) |
| datastores[].document_count | integer | Number of completed uploads |
| datastores[].created_at | string | Creation timestamp (ISO 8601) |
| datastores[].is_active | boolean | Whether DataStore is active |
| total | integer | Total number of DataStores |
List Uploads
Retrieve a paginated list of file uploads for your organization.
Endpoint
GET /rag/uploads
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| limit | integer | No | Maximum items to return (default: 50, max: 100) |
| offset | integer | No | Number of items to skip (default: 0) |
| status | string | No | Filter by status: pending, processing, indexing, completed, failed |
| datastore_id | string | No | Filter by DataStore ID |
Request Example
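A sketch of building the query string and walking pages in Python. The function names and the base_url argument are illustrative; the parameter names, defaults, and status values follow the query parameters table above. Treating 1 as the minimum limit is an assumption.

```python
from urllib.parse import urlencode

VALID_STATUSES = {"pending", "processing", "indexing", "completed", "failed"}


def uploads_url(base_url, limit=50, offset=0, status=None, datastore_id=None):
    """Build the GET /rag/uploads URL with the documented query parameters."""
    if not 1 <= limit <= 100:  # max of 100 is documented; min of 1 is assumed
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must not be negative")
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    params = {"limit": limit, "offset": offset}
    if status is not None:
        params["status"] = status
    if datastore_id is not None:
        params["datastore_id"] = datastore_id
    return f"{base_url}/rag/uploads?{urlencode(params)}"


def page_offsets(total, limit):
    """Offsets needed to fetch `total` items in pages of `limit`."""
    return list(range(0, total, limit))
```

After the first response, the total field tells the caller how many further pages to request with page_offsets.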
Response
Status Code: 200 OK
Upload Status Values
| Status | Description |
|---|---|
| pending | Upload received, waiting for processing |
| processing | File being uploaded to GCS |
| indexing | File being indexed in Vertex AI Search |
| completed | Indexing complete, document is searchable |
| failed | Upload or indexing failed |
Response Fields
| Field | Type | Description |
|---|---|---|
| uploads | array | List of uploads |
| uploads[].upload_id | string | Unique upload ID (UUID) |
| uploads[].filename | string | Original filename |
| uploads[].file_size | integer | File size in bytes |
| uploads[].file_type | string | MIME type |
| uploads[].datastore_id | string | Target DataStore ID |
| uploads[].datastore_name | string | Target DataStore name |
| uploads[].status | string | Current status |
| uploads[].uploaded_at | string | Upload timestamp (ISO 8601) |
| uploads[].indexed_at | string | Indexing completion timestamp (null if not indexed) |
| uploads[].error_message | string | Error description if failed |
| total | integer | Total matching uploads |
| limit | integer | Items per page |
| offset | integer | Current offset |
Get Upload Status
Retrieve the detailed status of a specific upload.
Endpoint
GET /rag/uploads/{upload_id}
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| upload_id | string | Yes | Upload ID (UUID format) |
Request Example
Response
Status Code: 200 OK
Response Fields
| Field | Type | Description |
|---|---|---|
| upload_id | string | Unique upload ID (UUID) |
| filename | string | Original filename |
| file_size | integer | File size in bytes |
| file_type | string | MIME type |
| status | string | Current status |
| ready | boolean | true if document is searchable (status === completed) |
| datastore_id | string | Target DataStore ID |
| datastore_name | string | Target DataStore name |
| uploaded_at | string | Upload timestamp (ISO 8601) |
| indexed_at | string | Indexing completion timestamp (null if not indexed) |
| error_message | string | Error description if failed |
Error Responses
404 Not Found
List DataStore Files
Retrieve all files in a specific DataStore with their indexing status.
Endpoint
GET /rag/datastores/{datastore_id}/files
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| datastore_id | string | Yes | DataStore ID |
Request Example
Response
Status Code: 200 OK
Response Fields
| Field | Type | Description |
|---|---|---|
| datastore_id | string | DataStore ID |
| datastore_name | string | DataStore display name |
| files | array | List of files in the DataStore |
| files[].name | string | Filename |
| files[].size | integer | File size in bytes |
| files[].content_type | string | MIME type |
| files[].uploaded_at | string | Upload timestamp (ISO 8601) |
| files[].status | string | Indexing status (pending, indexing, completed, failed) |
| total_files | integer | Total number of files |
| indexed_files | integer | Number of fully indexed files |
Error Responses
404 Not Found
Delete DataStore
Delete a DataStore and all its contents (files and indexed documents).
Endpoint
DELETE /rag/datastores/{datastore_id}
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| datastore_id | string | Yes | DataStore ID to delete |
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| force | boolean | No | If true, delete even if DataStore is connected to agents (default: false) |
Request Example
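A sketch of building this request in Python with the standard library. The base URL is a hypothetical placeholder and the function name is illustrative. Sending the boolean as the literal string true is an assumption about how the server parses the force parameter.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # hypothetical; replace with your API host


def build_delete_datastore_request(api_key, datastore_id, force=False):
    """Build (but do not send) the DELETE request for a DataStore."""
    url = f"{BASE_URL}/rag/datastores/{quote(datastore_id, safe='')}"
    if force:
        url += "?force=true"  # assumption: lowercase "true" is accepted
    return urllib.request.Request(
        url, headers={"X-API-Key": api_key}, method="DELETE"
    )
```

Leaving force at its default means a DataStore that is still connected to agents is rejected with 409 Conflict rather than deleted.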
Response
Status Code: 200 OK
Response Fields
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether deletion was successful |
| datastore_id | string | Deleted DataStore ID |
| message | string | Success message |
Error Responses
404 Not Found
409 Conflict - DataStore In Use
Delete File
Delete a single file from a DataStore while preserving the DataStore itself.
Endpoint
DELETE /rag/datastores/{datastore_id}/files/{filename}
Partial Deletion: Only the specified file is removed. The DataStore and other files remain intact.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| datastore_id | string | Yes | DataStore ID |
| filename | string | Yes | Name of the file to delete (URL-encode it if it contains special characters) |
Request Example
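Because the filename is part of the URL path, it needs percent-encoding when it contains spaces or other special characters. A sketch in Python; the function name and the base_url argument are illustrative.

```python
from urllib.parse import quote


def delete_file_url(base_url, datastore_id, filename):
    """Build the DELETE URL for one file, percent-encoding both path segments."""
    # safe="" also encodes "/", so a filename can never add extra path segments.
    return (
        f"{base_url}/rag/datastores/{quote(datastore_id, safe='')}"
        f"/files/{quote(filename, safe='')}"
    )
```

Send a DELETE request to the resulting URL with the X-API-Key header; the key needs the rag:delete scope.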
Response
Status Code: 200 OK
Response Fields
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether deletion was successful |
| datastore_id | string | DataStore ID |
| filename | string | Deleted filename |
| remaining_files | integer | Number of files remaining in the DataStore |
| message | string | Success message |
Error Responses
404 Not Found - DataStore
500 Internal Server Error - File Not Found
Best Practices
Polling for Upload Completion
When an upload is in processing or indexing status, poll the status endpoint to check for completion.
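One way to structure the polling loop in Python is sketched below. The function name, the 2-second interval, and the 5-minute timeout are assumptions, not documented values. fetch_status is a caller-supplied function that performs GET /rag/uploads/{upload_id} and returns the parsed JSON body.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_upload(fetch_status, upload_id, interval=2.0, timeout=300.0,
                    sleep=time.sleep):
    """Poll until the upload is completed or failed, or the timeout elapses.

    Returns the last status body; raises TimeoutError if the upload is still
    pending, processing, or indexing when the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        info = fetch_status(upload_id)
        if info["status"] in TERMINAL_STATUSES:
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"upload {upload_id} still {info['status']} after {timeout}s"
            )
        sleep(interval)
```

Passing the HTTP call in as a function keeps the loop independent of any particular HTTP client. On a failed result, the error_message field describes what went wrong.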
Batch File Uploads
Upload multiple related files in a single request (up to 10 files):
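When there are more than 10 files, they have to be split across several requests. A small sketch in Python; the function name is illustrative and the default batch size mirrors the documented per-request limit.

```python
def chunk_files(files, batch_size=10):
    """Split a list of files into consecutive batches of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]
```

Each batch can then be sent as one POST /rag/upload request against the same datastore_id.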
Organizing DataStores
Create separate DataStores for different topics or document types:
- By Department: “HR Policies”, “Engineering Docs”, “Sales Materials”
- By Project: “Project Alpha”, “Project Beta”
- By Type: “Legal Contracts”, “Technical Specs”, “Meeting Notes”
File Size Optimization
For large documents:
- Split large PDFs into smaller chapters
- Remove unnecessary images from documents
- Use text formats (TXT, MD) when formatting isn’t important
- Compress images before including them in documents
