Proxying Langflow from Cloudflare Pages Functions


When I built Astra Docs Chat, the overview post described the stack in one paragraph: Hugo page on the front, Langflow and Astra DB behind it. The piece that makes that safe for a public site is small and easy to skip over: a Cloudflare Pages Function at /api/astra-chat that proxies chat requests to a private Langflow instance.

This post is that piece in detail: why the browser never talks to Langflow directly, what the proxy validates, how streaming is translated, and how to check that secrets stay server-side.

If you have not read the parent post yet, start here: Building Astra Docs Chat.

Try the live chat: Astra Docs Chat


Langflow exposes a run API. Calling it requires an x-api-key header and knowledge of your instance URL. Putting either of those in static JavaScript would expose them to every visitor.

The usual Langflow embed widget sidesteps this by running inside Langflow’s own origin or by baking credentials into a hosted config: fine for internal tools, not what I wanted for a public Hugo page.

The pattern I already used on Document Analyzer applies here too: the browser only talks to my domain. Everything sensitive stays in Cloudflare secrets and runs at request time on the edge.

Browser  →  jamieede.com/api/astra-chat  →  Langflow (private)
                ↑ secrets live here

Same-origin requests also avoid CORS headaches. Langflow’s CORS settings become belt-and-braces; the visitor never crosses origins.


Pages Functions sit beside the Hugo public/ output. One file handles the route:

jamieedecom/
└── functions/
    └── api/
        └── astra-chat.js   →  POST /api/astra-chat

Deploy is unchanged: hugo build then wrangler pages deploy public. The function ships with the static site: no separate Worker repo.


Two environment variables, set as Cloudflare Pages secrets:

Variable          Purpose
LANGFLOW_URL      Base URL of the private Langflow instance (no trailing slash)
LANGFLOW_API_KEY  API key Langflow expects on /api/v1/run/...

Set them once per project:

wrangler pages secret put LANGFLOW_URL --project-name jamieedecom
wrangler pages secret put LANGFLOW_API_KEY --project-name jamieedecom

If either is missing at runtime, the function returns 502 Chat service unavailable: generic on purpose, no hint about which secret is wrong.
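A minimal sketch of that guard, assuming the secrets arrive on the env object that Pages Functions pass in via context.env. The helper names jsonError and getUpstreamConfig are hypothetical, chosen for illustration rather than taken from the real function.

```javascript
// Hypothetical helper: JSON error response with a deliberately generic message.
function jsonError(status, message) {
  return new Response(JSON.stringify({ error: message }), {
    status,
    headers: { 'Content-Type': 'application/json' },
  });
}

// Hypothetical helper: returns the upstream config, or null when either
// secret is missing. The caller answers null with the same generic 502,
// so the response never reveals which secret was absent.
function getUpstreamConfig(env) {
  const url = env.LANGFLOW_URL;
  const apiKey = env.LANGFLOW_API_KEY;
  if (!url || !apiKey) return null;
  return { url, apiKey };
}

// Usage inside the handler (context.env is supplied by Pages Functions):
//   const config = getUpstreamConfig(context.env);
//   if (!config) return jsonError(502, 'Chat service unavailable');
```

Collapsing both failure cases into one null keeps the error path identical whichever secret is missing.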

I removed an early astra-chat-config.json from static/ for exactly this reason: nothing upstream belongs in assets the browser can fetch.


The front end POSTs JSON:

{
  "message": "How do I create a collection?",
  "session_id": "550e8400-e29b-41d4-a716-446655440000"
}

The proxy checks:

  1. Body parses as JSON: otherwise 400 Invalid JSON
  2. message is a non-empty string after trim: otherwise 400 Message required
  3. message length ≤ 2000 characters: otherwise 400 Message too long
  4. session_id is optional; if absent, the proxy generates a UUID so Langflow can keep conversational context

That length cap is a simple abuse guard. RAG prompts do not need novel-length questions.
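The checks above can be sketched as one function that takes the raw body text. This is an illustration under assumptions: validateChatBody is a hypothetical name, and the sketch applies the 2000-character cap to the trimmed message, which the list above does not spell out.

```javascript
const MAX_MESSAGE_LENGTH = 2000;

// Hypothetical helper: returns { ok: true, message, sessionId } on success,
// or { ok: false, status, error } describing the 400 to send back.
function validateChatBody(rawBody) {
  let body;
  try {
    body = JSON.parse(rawBody);
  } catch {
    return { ok: false, status: 400, error: 'Invalid JSON' };
  }

  // Anything that is not a string counts as a missing message.
  const message = typeof body?.message === 'string' ? body.message.trim() : '';
  if (!message) {
    return { ok: false, status: 400, error: 'Message required' };
  }
  // Assumption: the cap is measured after trimming.
  if (message.length > MAX_MESSAGE_LENGTH) {
    return { ok: false, status: 400, error: 'Message too long' };
  }

  // session_id is optional; generate a UUID so Langflow can keep context.
  const sessionId =
    typeof body.session_id === 'string' && body.session_id
      ? body.session_id
      : crypto.randomUUID();

  return { ok: true, message, sessionId };
}
```

Returning a plain result object instead of a Response keeps the validation easy to unit test without the Workers runtime.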


On a valid request, the proxy calls:

POST {LANGFLOW_URL}/api/v1/run/datastax-astra-chat?stream=true

Headers:

  • Content-Type: application/json
  • x-api-key: {LANGFLOW_API_KEY}

Body:

{
  "input_value": "<message>",
  "output_type": "chat",
  "input_type": "chat",
  "session_id": "<session_id>"
}

datastax-astra-chat is the Langflow flow endpoint name: the chat half of the RAG graph (retrieve from Astra DB, then call the LLM). Ingest uses a separate endpoint; that is a different post.

A 60-second abort timeout maps to 504 Request timed out. Langflow 401/403 and other upstream failures map to 502 Chat service unavailable: again, no credential details in the response.
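Here is one way the upstream call and its error mapping might look, as a sketch rather than the function's actual code. buildLangflowRequest and callLangflow are hypothetical names, fetchImpl and timeoutMs are injectable only to make the sketch testable, and the timer here covers the wait for response headers, not the whole stream.

```javascript
const FLOW_ENDPOINT = 'datastax-astra-chat';
const TIMEOUT_MS = 60_000;

// Hypothetical helper: build the URL and fetch options for the run API.
function buildLangflowRequest(config, message, sessionId, signal) {
  return {
    url: `${config.url}/api/v1/run/${FLOW_ENDPOINT}?stream=true`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-api-key': config.apiKey,
      },
      body: JSON.stringify({
        input_value: message,
        output_type: 'chat',
        input_type: 'chat',
        session_id: sessionId,
      }),
      signal,
    },
  };
}

// Hypothetical helper: returns { ok: true, response } or
// { ok: false, status, error } with a generic, credential-free message.
async function callLangflow(config, message, sessionId, fetchImpl = fetch, timeoutMs = TIMEOUT_MS) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const { url, init } = buildLangflowRequest(config, message, sessionId, controller.signal);
    const response = await fetchImpl(url, init);
    // 401, 403 and every other upstream failure collapse into one 502.
    if (!response.ok) {
      return { ok: false, status: 502, error: 'Chat service unavailable' };
    }
    return { ok: true, response };
  } catch {
    // An aborted signal means our own timer fired; anything else is a network error.
    return controller.signal.aborted
      ? { ok: false, status: 504, error: 'Request timed out' }
      : { ok: false, status: 502, error: 'Chat service unavailable' };
  } finally {
    clearTimeout(timer);
  }
}
```

Checking controller.signal.aborted in the catch block is what separates a timeout (504) from any other fetch failure (502).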


Langflow streams newline-delimited JSON (NDJSON). Each line is an event object. Token chunks look like:

{"event":"token","data":{"chunk":"Hello"}}

The chat UI on this site (same pattern as the analyzer) expects Server-Sent Events:

data: {"chunk":"Hello"}

The proxy reads the upstream body, parses line by line, and re-emits SSE:

if (event.event === 'token' && event.data?.chunk) {
  controller.enqueue(
    encoder.encode(`data: ${JSON.stringify({ chunk: event.data.chunk })}\n\n`)
  );
}

When Langflow sends an end event, the proxy emits data: [DONE] so the front end knows to stop reading.

Not every model path streams token-by-token. Sometimes the useful text only appears on the end event. The proxy tracks whether any token events arrived; if not, it extracts the full message from event.data.result.message and sends it as a single chunk. Without that fallback, some runs would look like empty responses in the UI.

Stream errors mid-flight enqueue a short “Response interrupted.” chunk before [DONE], so the user sees partial output instead of a hung cursor.
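Putting the streaming rules together, a sketch of the translation could look like the following. It is an illustration, not the deployed code: createNdjsonToSse and toSseStream are hypothetical names, event.data.result.message is assumed to be a plain string, and the loop ignores backpressure for brevity.

```javascript
// Hypothetical helper: stateful converter from NDJSON text to SSE frames.
function createNdjsonToSse() {
  let buffer = '';
  let sawToken = false;
  let done = false;
  const frame = (payload) => `data: ${payload}\n\n`;

  function handleLine(line) {
    const out = [];
    if (done || !line.trim()) return out;
    let event;
    try {
      event = JSON.parse(line);
    } catch {
      return out; // skip lines that are not valid JSON
    }
    if (event?.event === 'token' && event.data?.chunk) {
      sawToken = true;
      out.push(frame(JSON.stringify({ chunk: event.data.chunk })));
    } else if (event?.event === 'end') {
      // Fallback: no tokens streamed, so send the full message as one chunk.
      const full = event.data?.result?.message;
      if (!sawToken && typeof full === 'string' && full) {
        out.push(frame(JSON.stringify({ chunk: full })));
      }
      out.push(frame('[DONE]'));
      done = true;
    }
    return out;
  }

  return {
    // Feed decoded text; returns SSE frames for every complete line so far.
    push(text) {
      buffer += text;
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep the trailing partial line for next time
      return lines.flatMap(handleLine);
    },
    // Call once upstream ends or fails; guarantees exactly one [DONE].
    finish(interrupted = false) {
      const out = interrupted ? [] : handleLine(buffer);
      buffer = '';
      if (!done) {
        if (interrupted) out.push(frame(JSON.stringify({ chunk: 'Response interrupted.' })));
        out.push(frame('[DONE]'));
        done = true;
      }
      return out;
    },
  };
}

// Hypothetical wiring: wrap the upstream body in an SSE-producing stream.
function toSseStream(upstreamBody) {
  const reader = upstreamBody.getReader();
  const decoder = new TextDecoder();
  const encoder = new TextEncoder();
  const converter = createNdjsonToSse();
  const emit = (controller, frames) =>
    frames.forEach((f) => controller.enqueue(encoder.encode(f)));

  return new ReadableStream({
    async start(controller) {
      try {
        for (;;) {
          const { value, done } = await reader.read();
          if (done) break;
          emit(controller, converter.push(decoder.decode(value, { stream: true })));
        }
        emit(controller, converter.finish());
      } catch {
        emit(controller, converter.finish(true)); // upstream failed mid-flight
      } finally {
        controller.close();
      }
    },
  });
}
```

The buffer matters because network chunks do not align with line boundaries: a single JSON event can arrive split across two reads. The handler would return the result of toSseStream(response.body) in a Response with Content-Type: text/event-stream.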


Open DevTools → Network on Astra Docs Chat, send a message, and inspect the /api/astra-chat request:

  • Request URL is same-origin (jamieede.com or your preview hostname)
  • Request headers contain no API key
  • Response is text/event-stream with data: {"chunk":"..."} lines

You should not see your Langflow hostname or x-api-key anywhere in client-side code or network traffic. If you do, something is miswired.

Local testing: run hugo server for the page and wrangler pages dev public (or your usual preview setup) so /api/astra-chat hits the function with secrets loaded.


Condition                        Status  Body
Missing/empty message            400     { "error": "Message required" }
Message too long                 400     { "error": "Message too long" }
Invalid JSON                     400     { "error": "Invalid JSON" }
Missing secrets / Langflow down  502     { "error": "Chat service unavailable" }
Upstream timeout (60s)           504     { "error": "Request timed out" }

The front end maps non-OK responses to an inline error bubble, using error from the JSON body when present.
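On the browser side, that mapping might be sketched like this. errorTextFromResponse and the fallback wording are hypothetical; the only part taken from the contract above is the error field in the JSON body.

```javascript
// Hypothetical helper: turn a non-OK fetch Response into text for the
// inline error bubble. Prefers the proxy's `error` field when present.
async function errorTextFromResponse(
  response,
  fallback = 'Something went wrong. Please try again.'
) {
  try {
    const body = await response.json();
    if (body && typeof body.error === 'string' && body.error) {
      return body.error;
    }
  } catch {
    // Body was not JSON (for example an HTML error page); use the fallback.
  }
  return fallback;
}

// Usage in the chat UI:
//   const res = await fetch('/api/astra-chat', { method: 'POST', body });
//   if (!res.ok) showErrorBubble(await errorTextFromResponse(res));
// showErrorBubble is a placeholder for whatever renders the bubble.
```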


I could have deployed a dedicated Worker on a subdomain. Pages Functions won: one file colocated with the Hugo site, one deploy pipeline (Gitea → Cloudflare Pages), and the same pattern I wanted to reuse across tools.

The analyzer still calls a separate Worker URL for historical reasons; Astra Docs Chat keeps everything under jamieede.com. Both approaches hide secrets: pick based on how much you want in one repo vs. a shared API Worker.

For a deeper look at the analyzer’s Worker-side streaming and caching, see Rebuilding the document analyzer on Cloudflare’s full stack.


This post is the edge layer. The trilogy continues with batch ingest (how 271 doc pages became vectors) and the Hugo streaming chat UI (the browser side of the same SSE contract). Those follow-ups are next in the Astra Docs Chat series.

Open Astra Docs Chat and ask something: the proxy is the invisible part that keeps it public-safe.
