After working with distributed databases at DataStax, I kept doing the same thing: open five doc tabs, search, skim, repeat. The official Astra DB Serverless documentation is thorough, but there is just a lot of it.
I wanted a single place to ask normal questions (How do I create a collection?, What are PCU groups?, How does hybrid search work?) and get answers grounded in the real docs, not a model’s best guess from training data.
That became Astra Docs Chat: a free chat page on this site, backed by retrieval-augmented search over 271 DataStax documentation pages.
Try it here: Astra Docs Chat
What it does
You type a question (or pick a starter prompt). The answer streams in with markdown: code samples, lists, step-by-step detail when the docs have it.
There is no account, no API key, and nothing to install. Refresh the page and your conversation can continue where you left off.
It is still an AI: answers can be wrong, especially on edge cases the docs barely mention. Version one does not link you back to the source pages (that is on the list for later), but in day-to-day use it has been good at Astra terminology, APIs, and how things fit together.
The stack
At a high level:
- Corpus: markdown exports of the public DataStax doc set
- Langflow: two flows on a private instance: one to ingest docs into a vector database, one to answer questions
- Astra DB: vector store (datastax_astra_docs collection)
- Embeddings: OpenAI text-embedding-3-small for the indexed content
- Chat model: DeepSeek deepseek-v4-flash, streamed
- This site: Hugo for the page, plus a small Cloudflare Pages Function that proxies chat requests
Same pattern as the Document Analyzer on this site: the browser only talks to jamieede.com. Langflow credentials and the upstream URL live in server-side secrets on a Cloudflare Pages Function, not in JavaScript.
You → jamieede.com/astra-chat
→ Cloudflare Pages Function (/api/astra-chat)
→ Langflow (RAG + DeepSeek)
→ Astra DB (doc search)
← streamed answer
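The middle hop of that path can be sketched as a small function that builds the upstream call from server-side secrets. This is an illustration under stated assumptions, not the site's actual code: the secret names (LANGFLOW_URL, LANGFLOW_FLOW_ID, LANGFLOW_API_KEY), the run endpoint path, the payload fields, and the x-api-key header are all assumed here.

```typescript
// Hypothetical secret names; on Cloudflare Pages these would be configured as
// encrypted environment variables and never shipped to the browser.
interface ProxyEnv {
  LANGFLOW_URL: string;     // e.g. "https://langflow.example.internal"
  LANGFLOW_FLOW_ID: string; // id of the RAG flow
  LANGFLOW_API_KEY: string;
}

interface UpstreamCall {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the server-to-server request. The path and payload shape are
// assumptions about the Langflow run endpoint, not confirmed by the post.
function buildUpstreamCall(env: ProxyEnv, message: string, sessionId: string): UpstreamCall {
  const base = env.LANGFLOW_URL.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/api/v1/run/${encodeURIComponent(env.LANGFLOW_FLOW_ID)}?stream=true`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-api-key": env.LANGFLOW_API_KEY, // stays on the server
      },
      body: JSON.stringify({
        input_value: message,
        input_type: "chat",
        output_type: "chat",
        session_id: sessionId,
      }),
    },
  };
}
```

Inside a Pages Function, the result would be handed to fetch(call.url, call.init) and the response body streamed back to the visitor, so the browser never sees the upstream URL or key.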
What happens when you send a message
1. Your question hits the edge
The chat UI posts to /api/astra-chat on this domain. The Pages Function checks the message (required, length limit) and attaches a session id so follow-up questions stay in context.
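A minimal sketch of that check, assuming a JSON body with message and sessionId fields. The field names, the 2000-character limit, and the status codes are hypothetical; the post only says the message is required and length-limited.

```typescript
// Hypothetical limit; the post only says a length limit exists.
const MAX_MESSAGE_CHARS = 2000;

type ChatCheck =
  | { ok: true; message: string; sessionId: string }
  | { ok: false; status: number; error: string };

// Fallback id generator so the sketch runs anywhere; in a Worker,
// crypto.randomUUID() would be the more usual choice.
function makeSessionId(): string {
  return `s-${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`;
}

function checkChatBody(body: unknown, newId: () => string = makeSessionId): ChatCheck {
  if (typeof body !== "object" || body === null) {
    return { ok: false, status: 400, error: "Expected a JSON object" };
  }
  const { message, sessionId } = body as { message?: unknown; sessionId?: unknown };
  if (typeof message !== "string" || message.trim() === "") {
    return { ok: false, status: 400, error: "message is required" };
  }
  if (message.length > MAX_MESSAGE_CHARS) {
    return { ok: false, status: 413, error: "message is too long" };
  }
  // Reuse the caller's session id for follow-ups, otherwise mint a new one.
  const id = typeof sessionId === "string" && sessionId !== "" ? sessionId : newId();
  return { ok: true, message: message.trim(), sessionId: id };
}
```

Returning a tagged result rather than throwing keeps the handler flat: a failed check maps directly to an error response, and a passed check carries the session id forward.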
2. Langflow retrieves, then answers
Langflow runs the RAG flow: search the ingested documentation in Astra DB, build a prompt from the relevant chunks, then call DeepSeek. Tokens stream back as they are generated.
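In this setup Langflow does the prompt assembly, so the following is only an illustration of the "build a prompt from the relevant chunks" step. The DocChunk shape, the topK default, and the prompt wording are all assumptions.

```typescript
interface DocChunk {
  source: string; // hypothetical: page title or path
  text: string;
  score: number;  // similarity score from the vector search, higher is closer
}

// Keep the best-scoring chunks and lay them out ahead of the question.
function buildRagPrompt(question: string, chunks: DocChunk[], topK = 4): string {
  const context = [...chunks]            // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)   // best match first
    .slice(0, topK)
    .map((c, i) => `[${i + 1}] ${c.source}\n${c.text.trim()}`)
    .join("\n\n");
  return [
    "Answer the question using the documentation excerpts below.",
    "",
    "Excerpts:",
    context,
    "",
    `Question: ${question.trim()}`,
  ].join("\n");
}
```

Numbering the excerpts costs nothing now and would make it easier to add source citations later, since the model can be asked to refer to excerpts by number.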
3. The proxy adapts the stream for the browser
Langflow and the analyzer tool use slightly different streaming formats. The Pages Function translates Langflow's output into the simple event stream the front end already knows how to render, and includes a fallback for when the upstream response arrives as one blob instead of many tokens.
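A rough sketch of that translation, assuming the upstream sends one JSON event per line with a token chunk inside, and that the front end expects Server-Sent Events frames. Both wire formats here, including the [DONE] end marker, are assumptions for illustration rather than the formats the site actually uses.

```typescript
// Format one token as a Server-Sent Events frame.
function sseFrame(token: string): string {
  return `data: ${JSON.stringify({ token })}\n\n`;
}

// Hypothetical end-of-stream marker for the front end.
const DONE_FRAME = "data: [DONE]\n\n";

// Translate one upstream line into zero or more SSE frames.
// Assumed upstream shape: {"event":"token","data":{"chunk":"..."}} per line.
function translateLine(line: string): string[] {
  const trimmed = line.trim();
  if (trimmed === "") return [];
  try {
    const evt = JSON.parse(trimmed) as { event?: string; data?: { chunk?: unknown } };
    const chunk = evt.data?.chunk;
    if (evt.event === "token" && typeof chunk === "string") return [sseFrame(chunk)];
    if (evt.event === "end") return [DONE_FRAME];
    return []; // ignore event types the front end does not render
  } catch {
    return []; // skip malformed lines rather than break the stream
  }
}

// Fallback: upstream returned the whole answer at once. Re-chunk it on
// whitespace boundaries so the UI still renders progressively.
function blobToFrames(answer: string): string[] {
  const pieces = answer.match(/\s*\S+\s*/g) ?? [];
  return [...pieces.map(sseFrame), DONE_FRAME];
}
```

Keeping the translation as pure string-to-frames functions means the same code can sit behind a TransformStream in the Worker or be exercised directly in tests.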
4. You see markdown as it arrives
The UI re-renders the assistant message on each chunk (via marked), so code blocks and lists appear progressively. Long code lines wrap inside the chat panel instead of breaking the layout.
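The re-render-on-each-chunk loop can be sketched as a small accumulator. The render and paint callbacks are stand-ins introduced for this example: in the browser, render would be something like marked.parse and paint would assign the HTML to the message element.

```typescript
// Accumulate streamed tokens and re-render the full message on every chunk.
// `render` stands in for a markdown renderer; `paint` stands in for
// updating the DOM with the resulting HTML.
function createStreamRenderer(
  render: (markdown: string) => string,
  paint: (html: string) => void,
) {
  let buffer = "";
  return {
    push(token: string): void {
      buffer += token;
      // Re-parse the whole buffer, not just the new token, so a code fence
      // opened in an earlier chunk renders correctly once it closes.
      paint(render(buffer));
    },
    text(): string {
      return buffer;
    },
  };
}
```

Re-parsing the whole message per chunk is quadratic in principle, but chat answers are short enough that it is usually not noticeable. Rendered model output should be sanitized before it is inserted into the page.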
Getting 271 pages into Astra DB
Before chat was useful, the docs had to exist as searchable vectors. That was a one-time batch job, not part of the normal site deploy:
- Export or collect the doc pages as markdown
- Run an ingest flow in Langflow for each file: upload, split, embed, write to Astra DB
- Resume on failure (it takes a few hours end-to-end)
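The resume-on-failure part of that job can be sketched as a checkpointed loop. The ingestOne and saveCheckpoint callbacks are hypothetical placeholders: the first would run the Langflow ingest flow for one file, the second would persist the set of finished files somewhere durable.

```typescript
// Files still to ingest, given the set already completed in earlier runs.
function pendingFiles(all: string[], done: ReadonlySet<string>): string[] {
  return all.filter((f) => !done.has(f));
}

// Ingest each pending file and record progress after every success, so a
// crash or timeout only costs the file that was in flight.
async function ingestAll(
  all: string[],
  done: Set<string>,
  ingestOne: (file: string) => Promise<void>,          // hypothetical: upload, split, embed, write
  saveCheckpoint: (done: Set<string>) => Promise<void>, // hypothetical: persist progress
): Promise<string[]> {
  const failed: string[] = [];
  for (const file of pendingFiles(all, done)) {
    try {
      await ingestOne(file);
    } catch {
      failed.push(file); // not added to `done`, so the next run retries it
      continue;
    }
    done.add(file);
    await saveCheckpoint(done);
  }
  return failed;
}
```

Processing files one at a time is slow but keeps the checkpoint trivially correct; a batch that takes hours and runs once is a reasonable place to trade speed for simplicity.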
I am not auto-refreshing when DataStax ships doc updates yet. For a personal reference tool, a manual re-ingest now and then is enough.
Why build it this way
Custom UI instead of Langflow’s embed widget: matches the rest of the site and keeps the experience simple (one panel, starter prompts, no iframe chrome).
Proxy instead of calling Langflow from the browser: API keys stay off the client. Visitors only see requests to this domain.
Langflow for the graph: ingest and RAG are already modeled there (file → split → embed → store; chat → retrieve → LLM). I did not want to reimplement that orchestration in a Worker.
DeepSeek for chat: cheaper streaming responses than the original OpenAI chat node in the template, with acceptable quality on technical Q&A.
Astra DB as the vector store: natural fit given the subject matter, and the Langflow bundle already supported it.
What it can’t do
- Source citations with links back to doc pages
- Automatic re-ingest when documentation changes
- A full chat history UI (session continuity only)
If retrieval finds nothing useful, the model may still answer from general knowledge. Tightening that with a stricter “docs only” prompt is a sensible next step.
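A stricter prompt can be paired with a check that runs before the model is called at all. The sketch below is that complementary technique, a relevance gate on retrieval scores, rather than the prompt change itself. The threshold value, the refusal text, and the ScoredChunk shape are hypothetical.

```typescript
interface ScoredChunk {
  text: string;
  score: number; // similarity score, higher is closer
}

// Hypothetical threshold; a sensible value depends on the embedding model
// and similarity metric, and would need tuning against real queries.
const MIN_RELEVANCE = 0.75;

const NO_ANSWER = "I could not find this in the indexed documentation.";

type GateResult =
  | { answerable: true; context: ScoredChunk[] }
  | { answerable: false; reply: string };

// Keep only chunks that clear the threshold. If none do, return a fixed
// refusal so the model never gets the chance to answer from general knowledge.
function gateOnRetrieval(chunks: ScoredChunk[], minScore: number = MIN_RELEVANCE): GateResult {
  const relevant = chunks.filter((c) => c.score >= minScore);
  if (relevant.length === 0) {
    return { answerable: false, reply: NO_ANSWER };
  }
  return { answerable: true, context: relevant };
}
```

A gate like this is blunt: set the threshold too high and valid questions get refused, too low and it does nothing. It still makes the failure mode explicit instead of leaving it to the model's judgment.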
Try it
Open Astra Docs Chat and ask something you would otherwise search the docs for. If you build something similar (RAG over a fixed corpus, Langflow behind a proxy, Astra as the store), I would be interested to hear how you approached it on LinkedIn.