Agentic Search (Cloudflare Worker)

  • Config: components/agentic_search/wrangler.toml
  • Source: components/agentic_search/src/

AI-powered conversational search built on Google Gemini. The worker is invoked only through a service binding from search_proxy and is not directly accessible.

Worker Names

Env     | Worker Name
--------|------------------------
Staging | staging-agentic-search
Preprod | preprod-agentic-search
Prod    | prod-agentic-search

No custom domain (RPC-only via service binding from search_proxy).

Bindings

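Binding             | Type            | Purpose
--------------------|-----------------|---------------------------------------------------------------
SEARCH_PROXY_WORKER | Service Binding | Calls back to search_proxy for document retrieval
CONVERSATION_DO     | Durable Objects | ConversationSqlDO - multi-turn conversation history (SQLite)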

Durable Objects: ConversationSqlDO

Each conversation gets its own Durable Object instance with SQLite storage.

CREATE TABLE conversation (
    conversation_id TEXT PRIMARY KEY,
    data TEXT NOT NULL,           -- JSON serialized ConversationContext
    created_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL
)

Conversations auto-expire after 30 days of inactivity (via DO alarms).
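
The expiry rule above can be sketched as a pair of pure functions. This is only an illustration, not the worker's actual code: the names nextAlarmTime and onAlarm are hypothetical, and the sketch assumes that created_at and updated_at are epoch milliseconds and that inactivity is measured from updated_at.

```typescript
// Assumption: "30 days of inactivity" is measured from the last update.
const INACTIVITY_TTL_MS = 30 * 24 * 60 * 60 * 1000;

// Row shape mirroring the conversation table above.
interface ConversationRow {
  conversation_id: string;
  data: string; // JSON-serialized ConversationContext
  created_at: number; // assumed to be epoch milliseconds
  updated_at: number; // assumed to be epoch milliseconds
}

type AlarmDecision =
  | { action: "delete" }
  | { action: "reschedule"; at: number };

// Time at which the cleanup alarm should fire for this row.
function nextAlarmTime(row: ConversationRow): number {
  return row.updated_at + INACTIVITY_TTL_MS;
}

// Decide what to do when the alarm fires: delete the conversation if it
// has been idle long enough, otherwise re-arm the alarm for later.
function onAlarm(row: ConversationRow, nowMs: number): AlarmDecision {
  const expiresAt = nextAlarmTime(row);
  if (nowMs >= expiresAt) {
    return { action: "delete" };
  }
  return { action: "reschedule", at: expiresAt };
}
```

Inside a Durable Object, the reschedule time would typically be passed to ctx.storage.setAlarm() and the delete branch would clear storage with ctx.storage.deleteAll(); whether ConversationSqlDO does exactly this is not stated here.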

Caching Strategy

Three-tier cache for agentic queries:

  1. Cloudflare Cache API (fastest, TTL ~5 min)
  2. In-request cache (within single request)
  3. DynamoDB (AgenticCachedQueriesTable, TTL ~1 hour)
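
A tiered lookup like the one listed above can be sketched as follows. This is a minimal illustration under stated assumptions: CacheTier, lookupTiers, and memoryTier are hypothetical names, the tiers are consulted in whatever order the caller passes them, and back-filling earlier tiers on a hit is an assumption rather than documented behavior.

```typescript
// Hypothetical common interface for any cache tier.
interface CacheTier<V> {
  name: string;
  get(key: string): Promise<V | undefined>;
  set(key: string, value: V): Promise<void>;
}

// Check each tier in order; on a hit, copy the value into the tiers that
// were checked earlier (and missed) so the next lookup is served sooner.
async function lookupTiers<V>(
  tiers: CacheTier<V>[],
  key: string,
): Promise<{ value: V; tier: string } | undefined> {
  for (const [index, tier] of tiers.entries()) {
    const value = await tier.get(key);
    if (value !== undefined) {
      const earlier = tiers.slice(0, index);
      await Promise.all(earlier.map((t) => t.set(key, value)));
      return { value, tier: tier.name };
    }
  }
  return undefined; // miss in every tier
}

// In-memory stand-in for a real tier, useful for local experiments.
function memoryTier<V>(name: string): CacheTier<V> {
  const store = new Map<string, V>();
  return {
    name,
    get: async (key) => store.get(key),
    set: async (key, value) => {
      store.set(key, value);
    },
  };
}
```

In the real worker each tier would wrap the Cloudflare Cache API, a per-request map, or DynamoDB, each with its own TTL handling, which this sketch leaves out.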

Cache key: {account_id}#{index_name}#{normalized_query}
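
As a rough illustration of the key format, the sketch below builds a key from its three parts. The function names are hypothetical, and the normalization rules (trim, lowercase, collapse whitespace) are an assumption; the worker's actual normalization may differ.

```typescript
// Assumed normalization: trim, lowercase, and collapse runs of whitespace.
function normalizeQuery(query: string): string {
  return query.trim().toLowerCase().replace(/\s+/g, " ");
}

// Build a key in the {account_id}#{index_name}#{normalized_query} format.
function buildCacheKey(accountId: string, indexName: string, query: string): string {
  return `${accountId}#${indexName}#${normalizeQuery(query)}`;
}

// Example with made-up identifiers:
// buildCacheKey("acct-1", "products", "  Red   Running Shoes ")
//   yields "acct-1#products#red running shoes"
```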

Automatic Filtering (Facet Context)

If agentic_config.filter_facets.enabled is set to true in the per-index settings, the worker will:

  1. Prefetch facet values from Marqo for configured fields
  2. Cache the facet context in Cloudflare Cache API (short TTL; default 5 minutes)
  3. Inject facet values into the LLM system prompt so the LLM can optionally construct Marqo filter DSL

Facet cache key:

  • facets:{account_id}:{index_name}
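
The facet key and the prompt injection step might look roughly like this. It is a sketch only: FacetContext, facetCacheKey, and renderFacetPrompt are hypothetical names, and the prompt wording is invented for illustration rather than taken from the worker.

```typescript
// Hypothetical shape: field name mapped to its known facet values.
type FacetContext = Record<string, string[]>;

// Build a key in the facets:{account_id}:{index_name} format.
function facetCacheKey(accountId: string, indexName: string): string {
  return `facets:${accountId}:${indexName}`;
}

// Render facet values as a text block that could be appended to the
// LLM system prompt (the wording here is an assumption).
function renderFacetPrompt(facets: FacetContext): string {
  const lines = Object.entries(facets).map(
    ([field, values]) => `- ${field}: ${values.join(", ")}`,
  );
  return ["Filterable fields and their known values:", ...lines].join("\n");
}
```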

Docs:

  • docs/components/agentic_search/automatic-filters.md

Stream Metadata: appliedFilter

When the worker performs category searches, it may combine a client-provided filter with an agent-constructed filter. The filter that was actually applied is surfaced back to clients as appliedFilter on each streamed categoryHits[] item.

If the agent filter returns zero hits, the worker retries without the agent filter and surfaces:

  • filterDropped: true
  • originalFilter (the agent filter that was dropped)
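
The combine-then-retry behavior described above can be sketched as below. This is an illustration under assumptions: searchCategory, combineFilters, and SearchFn are hypothetical names, and joining the two filters with AND is assumed, since the text does not say how the worker combines them.

```typescript
interface CategoryResult<H> {
  hits: H[];
  appliedFilter: string | undefined; // the filter actually used
  filterDropped?: boolean; // set only when the agent filter was dropped
  originalFilter?: string; // the dropped agent filter
}

// Hypothetical search callback; the real worker calls search_proxy.
type SearchFn<H> = (filter: string | undefined) => Promise<H[]>;

// Assumption: both filters are joined with AND when both are present.
function combineFilters(clientFilter?: string, agentFilter?: string): string | undefined {
  if (clientFilter && agentFilter) {
    return `(${clientFilter}) AND (${agentFilter})`;
  }
  return clientFilter || agentFilter || undefined;
}

async function searchCategory<H>(
  search: SearchFn<H>,
  clientFilter?: string,
  agentFilter?: string,
): Promise<CategoryResult<H>> {
  const combined = combineFilters(clientFilter, agentFilter);
  const hits = await search(combined);
  if (hits.length > 0 || !agentFilter) {
    return { hits, appliedFilter: combined };
  }
  // Zero hits with an agent filter: retry with the client filter only.
  const retryFilter = clientFilter || undefined;
  const retryHits = await search(retryFilter);
  return {
    hits: retryHits,
    appliedFilter: retryFilter,
    filterDropped: true,
    originalFilter: agentFilter,
  };
}
```

A client reading the stream can then treat filterDropped: true as a signal that the results are broader than the agent intended.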

Docs:

  • docs/components/agentic_search/applied-filters.md

External Dependencies

Service                   | Credentials
--------------------------|------------------------------------------
Google Gemini API         | GOOGLE_API_KEY (from Secrets Manager)
DynamoDB (cached queries) | AGENTIC_AWS_ACCESS_KEY_ID/SECRET

Typical Investigation Paths

Agentic search failing:

  1. Tail worker: npx wrangler tail {env}-agentic-search
  2. Check Google API key is set (env var GOOGLE_API_KEY)
  3. Check search_proxy worker (upstream dependency)

Conversation context lost:

  1. Durable Objects persist data, so check the DO console in the Cloudflare dashboard for the conversation's instance
  2. The 30-day inactivity cleanup may have already removed the conversation

Cache not working:

  1. Check the DynamoDB table: query AgenticCachedQueriesTable for the account
  2. Check Cloudflare Cache API headers in worker logs