Image Uploads and imageIds (/converse)

This feature adds short-lived, R2-backed image uploads for the converse path. Clients upload image bytes once, receive server-issued imageId values, and then send those IDs in /agentic-search/converse requests.

Scope

Implemented:

  • POST /api/v1/indexes/:index/agentic-search/images
  • /agentic-search/converse supports imageIds and imageUrls
  • Max 5 images total across imageIds + imageUrls

Not implemented:

  • /agentic-search does not accept imageIds

End-to-End Flow

  1. Client uploads 1-5 images as data URLs to /agentic-search/images.
  2. Server validates, normalizes, and stores each image in R2.
  3. Server returns { clientImageId, imageId, mimeType, expiresAt }[].
  4. Client calls /agentic-search/converse with a base64-encoded JSON payload containing imageIds and/or imageUrls.
  5. Worker resolves imageIds, fetches URL images, and sends all valid images to Gemini as inlineData.

API Contract

Upload endpoint

POST /api/v1/indexes/:index/agentic-search/images

Request body:

{
  "images": [
    {
      "clientImageId": "img-1",
      "dataUrl": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..."
    }
  ]
}

Response body:

{
  "images": [
    {
      "clientImageId": "img-1",
      "imageId": "img_abc123...",
      "mimeType": "image/png",
      "expiresAt": "2026-02-25T10:30:00.000Z"
    }
  ]
}

Notes:

  • clientImageId is caller-provided and echoed back.
  • imageId is server-generated and must be used in converse payloads.
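A minimal client-side sketch of the upload contract follows. The interface and function names (UploadRequestImage, UploadedImage, buildUploadBody, imageIdsByClientId) are hypothetical, as is the "img-<n>" naming for clientImageId; only the field names and the 1..5 image limit come from this document.

```typescript
// Request and response item shapes, copied from the contract above.
interface UploadRequestImage {
  clientImageId: string;
  dataUrl: string;
}

interface UploadedImage {
  clientImageId: string;
  imageId: string;
  mimeType: string;
  expiresAt: string; // ISO 8601 timestamp
}

// Builds the POST body. The "img-<n>" naming is only an illustration;
// any caller-chosen string works as clientImageId.
function buildUploadBody(dataUrls: string[]): { images: UploadRequestImage[] } {
  if (dataUrls.length < 1 || dataUrls.length > 5) {
    throw new RangeError("upload accepts 1..5 images");
  }
  return {
    images: dataUrls.map((dataUrl, i) => ({ clientImageId: `img-${i + 1}`, dataUrl })),
  };
}

// Maps each caller-provided clientImageId to the server-issued imageId.
function imageIdsByClientId(images: UploadedImage[]): Record<string, string> {
  const byClientId: Record<string, string> = {};
  for (const image of images) {
    byClientId[image.clientImageId] = image.imageId;
  }
  return byClientId;
}
```

The lookup is useful when the client wants to correlate its local previews with the IDs it later sends to converse.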

Converse payload

GET /api/v1/indexes/:index/agentic-search/converse?payload=<base64-json>

Image fields:

  • imageUrls?: string[]
  • imageIds?: string[]

Example payload (before base64 encoding):

{
  "q": "compare these",
  "imageUrls": ["https://example.com/a.png"],
  "imageIds": ["img_abc123"]
}
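The sketch below shows one way a client might turn this payload into the payload query parameter. It assumes standard (not URL-safe) base64 over the UTF-8 bytes of the JSON, followed by URL encoding; the document does not state which base64 variant the server expects. ConversePayload, encodePayload, buildConverseUrl, and the baseUrl argument are hypothetical.

```typescript
interface ConversePayload {
  q: string;
  imageUrls?: string[];
  imageIds?: string[];
}

// btoa only accepts Latin-1 characters, so encode the JSON as UTF-8 bytes first.
function encodePayload(payload: ConversePayload): string {
  const bytes = new TextEncoder().encode(JSON.stringify(payload));
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Assumption: standard base64 may contain "+", "/" and "=", so it is URL-encoded.
function buildConverseUrl(baseUrl: string, index: string, payload: ConversePayload): string {
  const path = `/api/v1/indexes/${encodeURIComponent(index)}/agentic-search/converse`;
  return `${baseUrl}${path}?payload=${encodeURIComponent(encodePayload(payload))}`;
}
```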

Limits and Validation

  • Upload accepts 1..5 images.
  • Converse enforces imageUrls.length + imageIds.length <= 5 (400 otherwise).
  • Upload data URLs must match data:<mime>;base64,<payload>.
  • MIME type must be image/*, and the image must then pass image processing validation.
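The first three rules can be approximated with a small pre-check like the one below. This is a sketch, not the server's validator: the regular expression, parseImageDataUrl, and withinConverseImageLimit are hypothetical, and it assumes padded standard base64 with no extra data URL parameters.

```typescript
interface ParsedDataUrl {
  mimeType: string;
  base64: string;
}

// Returns null unless the input looks like data:<mime>;base64,<payload>
// with an image/* MIME type.
function parseImageDataUrl(dataUrl: string): ParsedDataUrl | null {
  const match = /^data:([^;,]+);base64,([A-Za-z0-9+/]+={0,2})$/.exec(dataUrl);
  if (!match) return null;
  const mimeType = match[1].toLowerCase();
  if (!mimeType.startsWith("image/")) return null;
  return { mimeType, base64: match[2] };
}

// Mirrors the converse rule: imageUrls.length + imageIds.length <= 5.
function withinConverseImageLimit(imageUrls: string[] = [], imageIds: string[] = []): boolean {
  return imageUrls.length + imageIds.length <= 5;
}
```

A syntactically valid data URL can still be rejected later by image processing, so this check only filters obviously malformed input.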

Supported upload formats:

  • Resizable: JPEG, PNG, WebP, BMP, TIFF, ICO (image/x-icon) (max 5 MiB input)
  • Passthrough-only: HEIC, HEIF (max 1 MiB input)
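The per-format limits can be expressed as a lookup table, sketched below. Only image/x-icon is named in this document; the other MIME strings are the commonly used ones and are assumptions here, as is measuring the limit against decoded bytes rather than the base64 text. The names MAX_INPUT_BYTES, decodedByteLength, and isWithinSizeLimit are hypothetical.

```typescript
const MIB = 1024 * 1024;

// Assumed MIME strings for the formats listed above (only image/x-icon is from the source).
const MAX_INPUT_BYTES: Record<string, number> = {
  "image/jpeg": 5 * MIB,
  "image/png": 5 * MIB,
  "image/webp": 5 * MIB,
  "image/bmp": 5 * MIB,
  "image/tiff": 5 * MIB,
  "image/x-icon": 5 * MIB,
  "image/heic": 1 * MIB, // passthrough-only
  "image/heif": 1 * MIB, // passthrough-only
};

// Size of the decoded payload, assuming padded base64 (length is a multiple of 4).
function decodedByteLength(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}

// Unsupported MIME types have no limit entry and are treated as not acceptable.
function isWithinSizeLimit(mimeType: string, base64: string): boolean {
  const limit: unknown = MAX_INPUT_BYTES[mimeType];
  return typeof limit === "number" && decodedByteLength(base64) <= limit;
}
```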

Processing behavior:

  • Images that exceed the input size limit for their format are rejected (400).
  • Images larger than 1024px on either side are resized to fit within 1024px per side, preserving aspect ratio.
  • Normalization uses the same image pipeline as URL-based image processing in converse.
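The resize rule reduces to simple arithmetic on the longest side. The sketch below illustrates it; fitWithin is a hypothetical helper, and the rounding behavior is an assumption, since the actual pipeline may round differently.

```typescript
interface Dimensions {
  width: number;
  height: number;
}

// Scales dimensions down so neither side exceeds maxSide, preserving aspect ratio.
// Images that already fit are returned unchanged (no upscaling).
function fitWithin(width: number, height: number, maxSide: number = 1024): Dimensions {
  const longest = Math.max(width, height);
  if (longest <= maxSide) return { width, height };
  const scale = maxSide / longest;
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}
```

For example, a 4000x3000 input scales by 1024/4000 to 1024x768, while an 800x600 input is left as is.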

Storage and Expiry (R2)

Bucket binding:

  • AGENTIC_IMAGES_BUCKET

Object key format:

  • v1/<systemAccountId>/<indexName>/<imageId>

Custom metadata:

  • systemAccountId
  • indexName
  • mimeType
  • expiresAt

Expiry model:

  • expiresAt = now + 24h (returned to client and checked during resolution)
  • R2 lifecycle should delete v1/* objects after 1 day
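The key format and expiry model can be sketched as two small helpers. The function names and the TTL_MS constant are hypothetical; the key layout, metadata fields, and 24-hour lifetime are taken from the lists above.

```typescript
const TTL_MS = 24 * 60 * 60 * 1000; // 24h

// v1/<systemAccountId>/<indexName>/<imageId>
function buildObjectKey(systemAccountId: string, indexName: string, imageId: string): string {
  return `v1/${systemAccountId}/${indexName}/${imageId}`;
}

// Custom metadata stored with the object; expiresAt = now + 24h as an ISO 8601 string.
function buildCustomMetadata(
  systemAccountId: string,
  indexName: string,
  mimeType: string,
  now: Date,
): Record<string, string> {
  return {
    systemAccountId,
    indexName,
    mimeType,
    expiresAt: new Date(now.getTime() + TTL_MS).toISOString(),
  };
}
```

Storing expiresAt in metadata lets resolution reject an expired image even if the lifecycle rule has not yet deleted the object.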

Runtime Semantics in /converse

When imageIds are present, each image is resolved by key and validated for:

  • object existence
  • account/index binding match
  • non-expired expiresAt
  • non-empty payload
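The four checks can be sketched as a pure function over an already-fetched object. StoredImage, Resolution, and validateStoredImage are hypothetical names, and the reason codes are illustrative rather than the actual SSE error strings.

```typescript
interface StoredImage {
  customMetadata: {
    systemAccountId: string;
    indexName: string;
    mimeType: string;
    expiresAt: string;
  };
  bytes: Uint8Array;
}

type Resolution =
  | { ok: true; mimeType: string; bytes: Uint8Array }
  | { ok: false; reason: "not_found" | "binding_mismatch" | "expired" | "empty" };

// Applies the checks in the order listed above. `stored` is null when the key does not exist.
function validateStoredImage(
  stored: StoredImage | null,
  systemAccountId: string,
  indexName: string,
  now: Date,
): Resolution {
  if (stored === null) return { ok: false, reason: "not_found" };
  const meta = stored.customMetadata;
  if (meta.systemAccountId !== systemAccountId || meta.indexName !== indexName) {
    return { ok: false, reason: "binding_mismatch" };
  }
  // NaN compares false, so an unparseable timestamp is treated as expired.
  if (!(Date.parse(meta.expiresAt) > now.getTime())) return { ok: false, reason: "expired" };
  if (stored.bytes.byteLength === 0) return { ok: false, reason: "empty" };
  return { ok: true, mimeType: meta.mimeType, bytes: stored.bytes };
}
```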

If both imageIds and imageUrls are present:

  • ID resolution and URL fetch/processing start in parallel
  • successful images are merged into one Gemini message

Failure handling is per image (soft-fail):

  • invalid/missing/expired imageId emits an SSE error
  • failed URL fetch/process emits an SSE error
  • stream continues with remaining valid images
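The parallel start and per-image soft-fail behavior can be sketched with Promise.allSettled. Everything named here (InlineImage, ImageLoader, collectImages, the injected resolveId, fetchUrl, and emitError callbacks) is hypothetical and stands in for the worker's real resolution, fetch, and SSE code.

```typescript
interface InlineImage {
  mimeType: string;
  data: string; // base64 image bytes
}

type ImageLoader = (ref: string) => Promise<InlineImage>;

async function collectImages(
  imageIds: string[],
  imageUrls: string[],
  resolveId: ImageLoader,
  fetchUrl: ImageLoader,
  emitError: (message: string) => void,
): Promise<InlineImage[]> {
  // Start every load before awaiting anything so IDs and URLs run concurrently.
  const pending = [
    ...imageIds.map((id) => ({ ref: id, promise: resolveId(id) })),
    ...imageUrls.map((url) => ({ ref: url, promise: fetchUrl(url) })),
  ];
  const settled = await Promise.allSettled(pending.map((p) => p.promise));

  // Keep successes, report each failure individually, and never abort the batch.
  const images: InlineImage[] = [];
  settled.forEach((result, i) => {
    if (result.status === "fulfilled") {
      images.push(result.value);
    } else {
      emitError(`image ${pending[i].ref} failed: ${String(result.reason)}`);
    }
  });
  return images;
}
```

Because allSettled never rejects, one bad image cannot prevent the remaining images from reaching the model.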

Upload Failure Semantics

Upload is all-or-nothing:

  1. Validate/process all input images.
  2. Write all objects to R2.
  3. If any write fails, delete previously written objects.
  4. Return 500 (Failed to upload images).
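Steps 2 through 4 can be sketched as a write loop with best-effort rollback. ObjectStore is a hypothetical minimal interface, not the R2 binding's actual type, and writing sequentially is an assumption; the real handler may write in parallel.

```typescript
// Hypothetical minimal store interface used only for this sketch.
interface ObjectStore {
  put(key: string, body: Uint8Array): Promise<void>;
  delete(key: string): Promise<void>;
}

async function putAllOrNothing(
  store: ObjectStore,
  objects: { key: string; body: Uint8Array }[],
): Promise<void> {
  const written: string[] = [];
  try {
    for (const obj of objects) {
      await store.put(obj.key, obj.body);
      written.push(obj.key);
    }
  } catch (err) {
    // Best-effort rollback: remove whatever was written before the failure.
    await Promise.allSettled(written.map((key) => store.delete(key)));
    // The caller would map this to the 500 response.
    throw new Error("Failed to upload images");
  }
}
```

If a rollback delete itself fails, the orphaned object is still bounded by the 1-day lifecycle rule described under Storage and Expiry.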

Code References

  • components/agentic_search/src/images/upload-images.ts
  • components/agentic_search/src/images/resolve-image-ids.ts
  • components/agentic_search/src/converse/converse.ts
  • components/agentic_search/src/image-utils.ts
  • components/search_proxy/src/app.ts
  • components/search_proxy/src/env.ts