When I built Astra Docs Chat, the overview post described the stack in one paragraph: Hugo page on the front, Langflow and Astra DB behind it. The piece that makes that safe for a public site is small and easy to skip over: a Cloudflare Pages Function at /api/astra-chat that proxies chat requests to a private Langflow instance.
This post is that piece in detail: why the browser never talks to Langflow directly, what the proxy validates, how streaming is translated, and how to check that secrets stay server-side.
If you have not read the parent post yet, start here: Building Astra Docs Chat.
Try the live chat: Astra Docs Chat
The problem
Langflow exposes a run API. Calling it requires an x-api-key header and knowledge of your instance URL. Putting either of those in static JavaScript would expose them to every visitor.
The usual Langflow embed widget sidesteps this by running inside Langflow’s own origin or by baking credentials into a hosted config: fine for internal tools, not what I wanted for a public Hugo page.
The pattern I already used on Document Analyzer applies here too: the browser only talks to my domain. Everything sensitive stays in Cloudflare secrets and runs at request time on the edge.
Browser → jamieede.com/api/astra-chat → Langflow (private)
                    ↑ secrets live here
Same-origin requests also avoid CORS headaches. Langflow’s CORS settings become belt-and-braces; the visitor never crosses origins.
Where the proxy lives
Pages Functions sit beside the Hugo public/ output. One file handles the route:
jamieedecom/
└── functions/
    └── api/
        └── astra-chat.js   → POST /api/astra-chat
Deploy is unchanged: hugo build then wrangler pages deploy public. The function ships with the static site: no separate Worker repo.
Secrets, not config files
Two environment variables, set as Cloudflare Pages secrets:
| Variable | Purpose |
|---|---|
| LANGFLOW_URL | Base URL of the private Langflow instance (no trailing slash) |
| LANGFLOW_API_KEY | API key Langflow expects on /api/v1/run/... |
Set them once per project:
wrangler pages secret put LANGFLOW_URL --project-name jamieedecom
wrangler pages secret put LANGFLOW_API_KEY --project-name jamieedecom
If either is missing at runtime, the function returns 502 Chat service unavailable: generic on purpose, no hint about which secret is wrong.
I removed an early astra-chat-config.json from static/ for exactly this reason: nothing upstream belongs in assets the browser can fetch.
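As a rough sketch of that missing-secret check (not the site's actual code): the helper names `missingSecretsError` and `toErrorResponse` are hypothetical, and `env` stands for the bindings object a Pages Function receives on its request context. Only the variable names and the 502 body come from the text above.

```javascript
// Hypothetical helper: returns a generic error descriptor when either secret
// is missing, or null when both are present.
function missingSecretsError(env) {
  if (!env?.LANGFLOW_URL || !env?.LANGFLOW_API_KEY) {
    // Same status and body whichever secret is absent, so the response
    // gives no hint about what is misconfigured.
    return { status: 502, error: 'Chat service unavailable' };
  }
  return null;
}

// Hypothetical helper: turns a { status, error } pair into a JSON response.
function toErrorResponse({ status, error }) {
  return new Response(JSON.stringify({ error }), {
    status,
    headers: { 'Content-Type': 'application/json' },
  });
}
```

A handler would call `missingSecretsError(env)` before doing anything else and return `toErrorResponse(...)` when the result is not null.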
Request validation
The front end POSTs JSON:
{
  "message": "How do I create a collection?",
  "session_id": "550e8400-e29b-41d4-a716-446655440000"
}
The proxy checks:
- Body parses as JSON: otherwise 400 Invalid JSON
- message is a non-empty string after trim: otherwise 400 Message required
- message length ≤ 2000 characters: otherwise 400 Message too long
- session_id is optional; if absent, the proxy generates a UUID so Langflow can keep conversational context
That length cap is a simple abuse guard. RAG prompts do not need novel-length questions.
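The checks above can be sketched as one pure function. This is an illustration under assumptions, not the deployed code: `validateChatRequest` and `MAX_MESSAGE_LENGTH` are hypothetical names, the length is measured on the trimmed message, and "characters" is approximated by JavaScript's `.length` (UTF-16 code units).

```javascript
const MAX_MESSAGE_LENGTH = 2000; // cap taken from the list above

// Hypothetical validator. Takes the raw request body as text and returns
// { ok: true, message, sessionId } or { ok: false, status, error }.
function validateChatRequest(rawBody) {
  let body;
  try {
    body = JSON.parse(rawBody);
  } catch {
    return { ok: false, status: 400, error: 'Invalid JSON' };
  }

  // Anything that is not a string counts as a missing message.
  const message = typeof body?.message === 'string' ? body.message.trim() : '';
  if (!message) {
    return { ok: false, status: 400, error: 'Message required' };
  }
  if (message.length > MAX_MESSAGE_LENGTH) {
    return { ok: false, status: 400, error: 'Message too long' };
  }

  // Reuse the caller's session id when present, otherwise mint a UUID.
  // crypto.randomUUID() is a global in the Workers runtime and recent Node.
  const sessionId =
    typeof body.session_id === 'string' && body.session_id
      ? body.session_id
      : crypto.randomUUID();

  return { ok: true, message, sessionId };
}
```

Returning a plain result object instead of a `Response` keeps the rules testable without a running function.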
Forwarding to Langflow
On a valid request, the proxy calls:
POST {LANGFLOW_URL}/api/v1/run/datastax-astra-chat?stream=true
Headers:
Content-Type: application/json
x-api-key: {LANGFLOW_API_KEY}
Body:
{
  "input_value": "<message>",
  "output_type": "chat",
  "input_type": "chat",
  "session_id": "<session_id>"
}
datastax-astra-chat is the Langflow flow endpoint name: the chat half of the RAG graph (retrieve from Astra DB, then call the LLM). Ingest uses a separate endpoint; that is a different post.
A 60-second abort timeout maps to 504 Request timed out. Langflow 401/403 and other upstream failures map to 502 Chat service unavailable: again, no credential details in the response.
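A sketch of the forwarding step, under stated assumptions: `buildLangflowRequest` and `callLangflow` are hypothetical names, the `fetchImpl` parameter exists only so the function can be exercised without a network, and the timer is cleared once response headers arrive (so the 60 seconds cover time-to-first-byte, which is the only point where a 504 status can still be sent). The URL, headers, body fields, and status mapping follow the description above.

```javascript
// Hypothetical helper: builds the upstream URL and fetch options.
function buildLangflowRequest(env, message, sessionId) {
  return {
    url: `${env.LANGFLOW_URL}/api/v1/run/datastax-astra-chat?stream=true`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-api-key': env.LANGFLOW_API_KEY,
      },
      body: JSON.stringify({
        input_value: message,
        output_type: 'chat',
        input_type: 'chat',
        session_id: sessionId,
      }),
    },
  };
}

// Hypothetical helper: calls Langflow with a 60-second abort timeout and maps
// every failure to a generic status, never echoing upstream details.
async function callLangflow(env, message, sessionId, fetchImpl = fetch) {
  const { url, init } = buildLangflowRequest(env, message, sessionId);
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 60_000);
  try {
    const upstream = await fetchImpl(url, { ...init, signal: controller.signal });
    if (!upstream.ok) {
      // Covers 401/403 and any other non-2xx upstream status.
      return { ok: false, status: 502, error: 'Chat service unavailable' };
    }
    return { ok: true, upstream };
  } catch (err) {
    // An aborted fetch rejects with an error named "AbortError".
    return err?.name === 'AbortError'
      ? { ok: false, status: 504, error: 'Request timed out' }
      : { ok: false, status: 502, error: 'Chat service unavailable' };
  } finally {
    clearTimeout(timer);
  }
}
```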
Translating the stream
Langflow streams newline-delimited JSON (NDJSON). Each line is an event object. Token chunks look like:
{"event":"token","data":{"chunk":"Hello"}}
The chat UI on this site (same pattern as the analyzer) expects Server-Sent Events:
data: {"chunk":"Hello"}
The proxy reads the upstream body, parses line by line, and re-emits SSE:
// Forward only token events that carry text, re-framed as one SSE message.
if (event.event === 'token' && event.data?.chunk) {
  controller.enqueue(
    encoder.encode(`data: ${JSON.stringify({ chunk: event.data.chunk })}\n\n`)
  );
}
When Langflow sends an end event, the proxy emits data: [DONE] so the front end knows to stop reading.
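The line-by-line parsing is the fiddly part, because a network chunk can end in the middle of a JSON line. A minimal sketch of a translator that buffers partial lines, assuming decoded text input: `createNdjsonToSse` is a hypothetical name, and silently skipping malformed lines is an assumption rather than documented behaviour.

```javascript
// Hypothetical translator: feed it decoded text chunks from the upstream
// body, get back an array of complete SSE frames for each push.
function createNdjsonToSse() {
  let buffer = '';
  return {
    push(text) {
      buffer += text;
      const lines = buffer.split('\n');
      // The last piece has no trailing newline yet, so keep it for later.
      buffer = lines.pop();

      const frames = [];
      for (const line of lines) {
        if (!line.trim()) continue;
        let event;
        try {
          event = JSON.parse(line);
        } catch {
          continue; // assumption: ignore lines that are not valid JSON
        }
        if (event?.event === 'token' && event.data?.chunk) {
          frames.push(`data: ${JSON.stringify({ chunk: event.data.chunk })}\n\n`);
        } else if (event?.event === 'end') {
          frames.push('data: [DONE]\n\n');
        }
      }
      return frames;
    },
  };
}
```

Inside a `ReadableStream`, each returned frame would be passed through `encoder.encode(...)` and enqueued, as in the snippet above.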
Blob fallback
Not every model path streams token-by-token. Sometimes the useful text only appears on the end event. The proxy tracks whether any token events arrived; if not, it extracts the full message from event.data.result.message and sends it as a single chunk. Without that fallback, some runs would look like empty responses in the UI.
Stream errors mid-flight enqueue a short “Response interrupted.” chunk before [DONE], so the user sees partial output instead of a hung cursor.
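Both behaviours can be sketched as two small functions. The names `framesForEnd` and `framesForStreamError` are hypothetical, and the sketch assumes `event.data.result.message` is a plain string; the real Langflow payload may nest the text more deeply.

```javascript
// Formats one SSE frame carrying a text chunk.
function sseChunk(text) {
  return `data: ${JSON.stringify({ chunk: text })}\n\n`;
}

// Hypothetical helper: frames to emit when the end event arrives.
// `sawToken` is true if at least one token event was already forwarded.
function framesForEnd(endEvent, sawToken) {
  const frames = [];
  if (!sawToken) {
    // Nothing streamed, so fall back to the full message on the end event.
    const message = endEvent?.data?.result?.message;
    if (typeof message === 'string' && message) {
      frames.push(sseChunk(message));
    }
  }
  frames.push('data: [DONE]\n\n');
  return frames;
}

// Hypothetical helper: frames to emit when the upstream stream fails mid-flight.
function framesForStreamError() {
  return [sseChunk('Response interrupted.'), 'data: [DONE]\n\n'];
}
```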
What the browser sees
Open DevTools → Network on Astra Docs Chat, send a message, and inspect the /api/astra-chat request:
- Request URL is same-origin (jamieede.com or your preview hostname)
- Request headers contain no API key
- Response is text/event-stream with data: {"chunk":"..."} lines
You should not see your Langflow hostname or x-api-key anywhere in client-side code or network traffic. If you do, something is miswired.
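To check the response contract outside DevTools, a saved response body can be run through a small parser. This is a sketch for manual verification only: `parseSseBody` is a hypothetical name, and it handles just the two frame shapes described in this post, not the full SSE format.

```javascript
// Hypothetical checker: takes a complete SSE response body as text and
// returns the concatenated chunks plus whether [DONE] was seen.
function parseSseBody(text) {
  const chunks = [];
  let done = false;
  // Frames are separated by a blank line.
  for (const frame of text.split('\n\n')) {
    const line = frame.trim();
    if (!line.startsWith('data:')) continue;
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') {
      done = true;
      break;
    }
    try {
      const parsed = JSON.parse(payload);
      if (typeof parsed?.chunk === 'string') chunks.push(parsed.chunk);
    } catch {
      // Ignore frames whose payload is not JSON.
    }
  }
  return { text: chunks.join(''), done };
}
```

A body that parses to non-empty `text` with `done` set to true matches the contract; anything else is worth a closer look at the proxy.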
Local testing: run hugo server for the page and wrangler pages dev public (or your usual preview setup) so /api/astra-chat hits the function with secrets loaded.
Error behaviour (summary)
| Condition | Status | Body |
|---|---|---|
| Missing/empty message | 400 | { "error": "Message required" } |
| Message too long | 400 | { "error": "Message too long" } |
| Invalid JSON | 400 | { "error": "Invalid JSON" } |
| Missing secrets / Langflow down | 502 | { "error": "Chat service unavailable" } |
| Upstream timeout (60s) | 504 | { "error": "Request timed out" } |
The front end maps non-OK responses to an inline error bubble, using error from the JSON body when present.
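On the browser side, that mapping might look roughly like this. The function name `errorTextFromBody` and the fallback wording are hypothetical; only the `error` field comes from the table above.

```javascript
// Hypothetical helper: picks the text for the error bubble from the body of
// a non-OK response, falling back to a generic line when there is no usable
// `error` field (for example, an HTML error page from the edge).
function errorTextFromBody(bodyText, fallback = 'Something went wrong. Please try again.') {
  try {
    const parsed = JSON.parse(bodyText);
    return typeof parsed?.error === 'string' && parsed.error ? parsed.error : fallback;
  } catch {
    return fallback;
  }
}
```

A caller would use it as `if (!res.ok) showError(errorTextFromBody(await res.text()))`, where `showError` stands in for whatever renders the bubble.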
Why Pages Functions and not a standalone Worker?
I could have deployed a dedicated Worker on a subdomain. Pages Functions won on three counts: one file colocated with the Hugo site, one deploy pipeline (Gitea → Cloudflare Pages), and the same pattern I wanted to reuse across tools.
The analyzer still calls a separate Worker URL for historical reasons; Astra Docs Chat keeps everything under jamieede.com. Both approaches hide secrets: pick based on how much you want in one repo vs. a shared API Worker.
For a deeper look at the analyzer’s Worker-side streaming and caching, see Rebuilding the document analyzer on Cloudflare’s full stack.
Next in the series
This post is the edge layer. The trilogy continues with batch ingest (how 271 doc pages became vectors) and the Hugo streaming chat UI (the browser side of the same SSE contract). Those follow-ups are next in the Astra Docs Chat series.
Open Astra Docs Chat and ask something: the proxy is the invisible part that keeps it public-safe.