Cloudflare AI Search (AutoRAG) Integration: Build a Knowledge Assistant for Your Blog

Introduction

I recently added an AI assistant to my blog that can search posts and answer reader questions. It is built with Cloudflare AI Search (formerly called AutoRAG). I ran into several pitfalls along the way, so this post documents the full setup from start to finish.

Preview: click the chat button in the lower-right corner and ask, “Which posts are about Serverless?” It retrieves related content, answers the question, and shows citations.

What is Cloudflare AI Search?

AI Search is Cloudflare’s RAG (Retrieval-Augmented Generation) service. In short:

- You upload documents (web pages, PDFs, Markdown, etc.)
- Cloudflare automatically builds vector indexes
- On user queries, it retrieves relevant documents first, then generates answers grounded in those documents
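
The retrieve-then-generate flow can be illustrated with a toy sketch. This is not how Cloudflare implements it: real RAG systems compare embedding vectors, while this example scores documents by simple word overlap to stay dependency-free. The names `Doc`, `retrieve`, and `buildPrompt` are hypothetical and exist only for illustration.

```typescript
// Toy illustration of retrieve-then-generate (assumption: word overlap
// stands in for the vector similarity a real RAG service would use).
interface Doc {
  title: string;
  text: string;
}

// Lowercase the input and split it into alphanumeric words.
function tokenize(s: string): string[] {
  return s.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Score = fraction of query words that also appear in the document.
function overlapScore(query: string, doc: Doc): number {
  const queryWords = tokenize(query);
  const docWords = tokenize(doc.title + " " + doc.text);
  if (queryWords.length === 0) return 0;
  const hits = queryWords.filter((w) => docWords.indexOf(w) !== -1).length;
  return hits / queryWords.length;
}

// Step 1: retrieve the k best-matching documents, dropping non-matches.
function retrieve(query: string, docs: Doc[], k: number): Doc[] {
  return docs
    .map((doc) => ({ doc, s: overlapScore(query, doc) }))
    .filter((x) => x.s > 0)
    .sort((a, b) => b.s - a.s)
    .slice(0, k)
    .map((x) => x.doc);
}

// Step 2: build a prompt that grounds the model in the retrieved sources.
function buildPrompt(query: string, context: Doc[]): string {
  const sources = context
    .map((d, i) => "[" + (i + 1) + "] " + d.title + ": " + d.text)
    .join("\n");
  return "Answer using only the sources below.\n" + sources + "\n\nQuestion: " + query;
}
```

The prompt from `buildPrompt` would then be sent to a generation model; numbering the sources is one simple way to let the answer cite them.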

Compared with self-hosted RAG:

- Free quota: 100,000 neural AI calls per day
- Zero ops: no vector DB or embedding infrastructure to manage
- All-in-one: retrieval + generation through one API

1. Create AI Search

1.1 Open the dashboard

1.2 Create an instance

Click Create AI Search and fill in:

- Name: any name, for example my-blog-search (you will reference it later in code)
- Model: the generation model; @cf/meta/llama-3.3-70b-instruct-fp8-fast is recommended for its speed

After creation, you will see the instance details page.

2. Add data sources

AI Search supports two ways to ingest data:

2.1 Option A: Website URL (recommended)

Great for public websites. Cloudflare crawls pages automatically.

In AI Search details, click use template
Enter your site URL, for example https://blog.example.com
Click Start Indexing

After crawling, you can view all indexed pages in the Overview tab.

Pros: auto-updates when site content changes
Cons: your site must be publicly reachable

2.2 Option B: Upload files

Best for local/private documents.

In AI Search details, choose file-based source and configure it as needed.

3. Test in dashboard

After data ingestion:

- Open the AI Search instance details
- Go to Playground
- Ask something like “Which posts are about deployment?”
- Check the answer and cited source documents

4. Code integration (practical)

Now the key part: how to call AI Search from your app. The example below uses an Astro blog project deployed on Cloudflare Pages.

4.1 Configure AI binding

First, add the AI binding to wrangler.jsonc (or the equivalent section in wrangler.toml):

{
  "name": "my-blog",
  "compatibility_date": "2025-08-11",
  "pages_build_output_dir": "./dist",
  "ai": {
    "binding": "AI"
  }
}

With this binding in place, Cloudflare Pages Functions can access AI services via env.AI (context.env.AI inside a handler).

4.2 Call AI Search (key snippet)

const result = await env.AI.autorag('my-blog-search').aiSearch({
  query: query.trim(),
  model: '@cf/meta/llama-3.3-70b-instruct-fp8-fast',
  rewrite_query: true, // rewrite the user query before retrieval
  max_num_results: 5, // number of retrieved results
  ranking_options: { score_threshold: 0.3 }, // minimum relevance score
  reranking: { enabled: true, model: '@cf/baai/bge-reranker-base' },
  stream: true, // stream the answer as it is generated
});
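
To show where that call might live, here is a minimal sketch of a Pages Function that wraps it. Several things are assumptions rather than facts from this post: the file path `functions/api/chat.ts`, the JSON request shape `{ "query": "..." }`, the 500-character limit, and the local `Env` types (a real project would normally take these from `@cloudflare/workers-types`). It also assumes that with `stream: true` the binding resolves to a `Response` whose body is the SSE stream.

```typescript
// functions/api/chat.ts (hypothetical path)
// Minimal local types so the sketch is self-contained.
interface AiSearchBinding {
  aiSearch(params: Record<string, unknown>): Promise<Response>;
}
interface Env {
  AI: { autorag(name: string): AiSearchBinding };
}

// Returns the trimmed query, or null when the body has no usable query.
// The 500-character cap is an arbitrary illustrative limit.
function readQuery(body: unknown): string | null {
  if (typeof body !== "object" || body === null) return null;
  const q = (body as { query?: unknown }).query;
  if (typeof q !== "string") return null;
  const trimmed = q.trim();
  return trimmed.length > 0 && trimmed.length <= 500 ? trimmed : null;
}

export async function onRequestPost(context: {
  request: Request;
  env: Env;
}): Promise<Response> {
  let body: unknown;
  try {
    body = await context.request.json();
  } catch {
    return new Response("Invalid JSON", { status: 400 });
  }

  const query = readQuery(body);
  if (query === null) {
    return new Response("Missing or invalid query", { status: 400 });
  }

  // Assumption: the streaming call resolves to a Response with an SSE body.
  const upstream = await context.env.AI.autorag("my-blog-search").aiSearch({
    query,
    rewrite_query: true,
    max_num_results: 5,
    stream: true,
  });

  // Pass the stream through to the browser unchanged.
  return new Response(upstream.body, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}
```

Validating the query before calling the binding keeps empty or oversized input from consuming quota.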

4.3 Parameter guide

Parameter	Description	Suggested value
`model`	Generation model	`@cf/meta/llama-3.3-70b-instruct-fp8-fast`
`rewrite_query`	Rewrite the user query before retrieval	`true`
`max_num_results`	Number of retrieved results	`3-10`
`ranking_options.score_threshold`	Relevance threshold (0-1)	`0.3` (too high may return nothing)
`reranking.enabled`	Enable reranking	`true`
`stream`	Stream the response	`true` (better UX)
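
One way to keep these values in a single place is a small helper that applies the suggested defaults and clamps overrides to the ranges in the table. This is only a sketch: `buildSearchOptions`, `SearchOptions`, and the override field names are hypothetical, not part of any Cloudflare API.

```typescript
// Shape of the options object passed to aiSearch() in this post.
interface SearchOptions {
  query: string;
  model: string;
  rewrite_query: boolean;
  max_num_results: number;
  ranking_options: { score_threshold: number };
  reranking: { enabled: boolean; model: string };
  stream: boolean;
}

// Hypothetical override names, chosen for this example only.
interface SearchOverrides {
  maxResults?: number;
  scoreThreshold?: number;
  stream?: boolean;
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Applies the suggested defaults from the table and clamps overrides
// to the suggested ranges (3-10 results, threshold between 0 and 1).
function buildSearchOptions(
  query: string,
  overrides: SearchOverrides = {}
): SearchOptions {
  return {
    query: query.trim(),
    model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
    rewrite_query: true,
    max_num_results: clamp(Math.round(overrides.maxResults ?? 5), 3, 10),
    ranking_options: {
      score_threshold: clamp(overrides.scoreThreshold ?? 0.3, 0, 1),
    },
    reranking: { enabled: true, model: "@cf/baai/bge-reranker-base" },
    stream: overrides.stream ?? true,
  };
}
```

For example, `buildSearchOptions(userInput, { scoreThreshold: 0.2 })` loosens the relevance filter if queries come back empty, while leaving every other setting at its default.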

5. Streaming response handling (important)

When stream: true is enabled, AI Search returns data in SSE (Server-Sent Events) format, so the client must parse the event stream rather than treat the body as a single JSON document.
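
A minimal client-side parsing sketch follows. It assumes each event arrives as a `data: <JSON>` line whose JSON carries the generated text in a `response` string field, and that the stream ends with a `data: [DONE]` line; verify both against the actual output of your instance. The function names `parseSseBuffer` and `readAnswer` are hypothetical.

```typescript
interface SseParseResult {
  texts: string[]; // text fragments decoded from complete events
  done: boolean; // true once the [DONE] sentinel was seen
  rest: string; // incomplete trailing line to prepend to the next chunk
}

// Parses every complete line in the buffer and returns the leftover partial line.
// Assumption: events look like `data: {"response":"..."}` and end with `data: [DONE]`.
function parseSseBuffer(buffer: string): SseParseResult {
  const texts: string[] = [];
  let done = false;
  const lines = buffer.split("\n");
  // The last element is "" (buffer ended on a newline) or a partial line.
  const rest = lines.pop() || "";
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].replace(/\r$/, "");
    if (line.indexOf("data:") !== 0) continue; // skip blank lines and other fields
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    try {
      const parsed = JSON.parse(payload);
      if (parsed && typeof parsed.response === "string") {
        texts.push(parsed.response);
      }
    } catch {
      // Ignore events whose payload is not valid JSON.
    }
  }
  return { texts, done, rest };
}

// Reads a streaming Response and calls onText for each text fragment.
async function readAnswer(
  res: Response,
  onText: (text: string) => void
): Promise<void> {
  if (!res.body) return;
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const result = parseSseBuffer(buffer);
    buffer = result.rest;
    result.texts.forEach(onText);
    if (result.done) break;
  }
}
```

Carrying `rest` over between reads matters because network chunks can split an event in the middle of a line; parsing each chunk in isolation would drop or corrupt those events.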

6. Deployment configuration

6.1 Configure Pages binding

Use either this dashboard binding or the config-file binding from 4.1; the config-file approach in 4.1 is recommended.

In Cloudflare Dashboard:

Open your Pages project → Settings → Bindings
Find AI Bindings
Click Add binding:
- Variable name: AI
- AI Search: select your instance
Click Save

6.2 Deploy

pnpm build
npx wrangler pages deploy dist

Summary

Cloudflare AI Search significantly lowers the entry barrier for RAG: no need to run vector databases or embedding pipelines yourself. It also provides a generous daily free tier, which is usually enough for lightweight knowledge-assistant use cases.