Swapping DeepSeek in for OpenAI chat in a Langflow RAG flow


The Langflow DataStax RAG template ships with an OpenAI chat node. For Astra Docs Chat, I replaced it with DeepSeek for streaming answers: cheaper at volume, and acceptable on technical Q&A over retrieved docs.

Context: Building Astra Docs Chat · Langflow chat flow · Astra DB vector store

Embeddings stayed on OpenAI text-embedding-3-small; only the generation step changed. See the chunking and embedding post for why those stay paired.

Try it: Astra Docs Chat


| Component | v1 template | This project |
|---|---|---|
| Language model | OpenAI (e.g. gpt-4o-mini) | DeepSeek deepseek-v4-flash |
| Streaming | optional | enabled (stream: true) |
| Temperature | template default | 0.1 (slightly factual bias) |
| Embeddings | OpenAI | unchanged |
| Astra DB | unchanged | unchanged |

The DeepSeek API key is stored as the Langflow global variable DEEPSEEK_API_KEY, the same pattern as OPENAI_API_KEY and the Astra tokens.

The chat endpoint name stays datastax-astra-chat. The Pages Function is model-agnostic: it forwards input_value and transforms the stream (proxy post).


  1. Open the DataStax Astra Docs RAG flow in Langflow
  2. Remove or bypass the template OpenAI Language Model node (LanguageModelComponent-cAjdO in the exported template)
  3. Add DeepSeek Language Model component (or generic OpenAI-compatible node pointed at DeepSeek base URL)
  4. Set model to deepseek-v4-flash (or deepseek-chat if you prefer quality over speed)
  5. Enable stream
  6. Connect: Prompt output → DeepSeek input → ChatOutput
  7. Store the API key in a global variable, not in the node field, which is exported with the flow JSON

Smoke test in Playground, then against the run endpoint, before touching the public site:

curl -s --compressed \
  -H "x-api-key: $LANGFLOW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input_value":"What is a collection in Astra DB?","output_type":"chat","input_type":"chat","session_id":"smoke-test"}' \
  "$LANGFLOW_URL/api/v1/run/datastax-astra-chat?stream=false"

Compare answer grounding to the OpenAI node on the same retrieved context.


  • Cost: long doc-style answers add up on OpenAI chat pricing; DeepSeek flash-tier models are cheaper for a public, unauthenticated chat
  • Streaming: required for the Hugo UI cursor UX (streaming chat UI post); both providers support it
  • Quality: on “explain this Astra feature from context” tasks, quality has been acceptable for a personal reference tool

I did not run a formal benchmark, only spot-checks against known doc passages after the swap. Edge-case API details still need human verification against the live docs.


Langflow emits NDJSON token events when streaming works. Some model paths deliver the full message only on the end event.

The Pages Function tracks whether any tokens arrived; if not, it extracts text from event.data.result.message and sends one SSE chunk (proxy code). Test both paths after a model swap.
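The fallback described above can be sketched as a pure function. This is a minimal illustration, not the actual proxy code: it assumes token events carry their text in data.chunk and that data.result.message on the end event is a plain string. The names ndjsonToSse, parseEvent and toSse, and the { text } SSE payload, are hypothetical. A real Pages Function would process the body incrementally rather than as one string.

```typescript
// Assumed event shape: token events carry data.chunk; the end event
// carries data.result.message (the path named in the post).
interface LangflowEvent {
  event?: string;
  data?: {
    chunk?: string;
    result?: { message?: string };
  };
}

// Format one piece of text as an SSE data frame (hypothetical payload shape).
function toSse(text: string): string {
  return `data: ${JSON.stringify({ text })}\n\n`;
}

// Parse one NDJSON line; return null for malformed input.
function parseEvent(line: string): LangflowEvent | null {
  try {
    return JSON.parse(line) as LangflowEvent;
  } catch {
    return null;
  }
}

// Convert an NDJSON body into SSE frames, falling back to the full
// message on the end event when no token events arrived.
function ndjsonToSse(ndjson: string): string[] {
  const frames: string[] = [];
  let sawToken = false;

  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue;
    const evt = parseEvent(line);
    if (evt === null) continue; // skip malformed lines, keep the stream alive

    const chunk = evt.data?.chunk;
    const message = evt.data?.result?.message;

    if (evt.event === "token" && typeof chunk === "string") {
      sawToken = true;
      frames.push(toSse(chunk));
    } else if (evt.event === "end" && !sawToken && typeof message === "string" && message.length > 0) {
      frames.push(toSse(message)); // single-chunk fallback path
    }
  }
  return frames;
}
```

Feeding it one body that contains token events and one that contains only an end event exercises both paths that need testing after a model swap.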

The Hugo UI re-parses markdown on every chunk with marked. Watch code fences and numbered lists during streaming.
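One way to reduce flicker from half-written code fences is to close any open fence on the accumulated buffer before each re-parse. The sketch below is an assumption about how that could look, not what the Hugo UI currently does. closeOpenFence is a hypothetical helper, and it only handles backtick fences by toggling state on each fence line, which ignores tilde fences and nested longer fences.

```typescript
// Three backticks, written with escapes so this snippet stays fence-safe.
const FENCE = "\u0060\u0060\u0060";

// Return the buffer with a closing fence appended when a code block is
// still open, so a partial stream parses as a complete block.
function closeOpenFence(partial: string): string {
  let open = false;
  for (const line of partial.split("\n")) {
    // Any line starting with a fence toggles the open/closed state.
    if (line.trim().indexOf(FENCE) === 0) open = !open;
  }
  if (!open) return partial;

  const endsWithNewline = partial.charAt(partial.length - 1) === "\n";
  return endsWithNewline ? partial + FENCE : partial + "\n" + FENCE;
}
```

The result would be passed to marked on each chunk, for example marked.parse(closeOpenFence(buffer)), while the raw buffer keeps accumulating unchanged.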


  1. Instruction following: refusal behaviour when context is empty (guardrails)
  2. Markdown formatting: code fences and lists in streamed output
  3. Latency: flash models vary; the 60 s proxy timeout covers slow runs
  4. API compatibility: DeepSeek uses OpenAI-compatible chat completions; Langflow’s component handles base URL + model name
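Item 4 can be illustrated with the request shape itself. The sketch below builds an OpenAI-style chat completions request in which only the base URL and model name vary between providers. buildChatRequest is a hypothetical helper, the base URL in the usage note should be checked against the provider's current docs, and none of this describes how Langflow's component is implemented internally.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build an OpenAI-compatible chat completions request. Swapping providers
// means changing baseUrl, apiKey and model; the payload shape is the same.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[]
): ChatRequest {
  return {
    url: baseUrl.replace(/\/+$/, "") + "/chat/completions",
    headers: {
      "Authorization": "Bearer " + apiKey,
      "Content-Type": "application/json"
    },
    // stream and temperature mirror the settings used in this project.
    body: JSON.stringify({ model, messages, stream: true, temperature: 0.1 })
  };
}
```

Calling buildChatRequest("https://api.deepseek.com", key, "deepseek-v4-flash", messages) and buildChatRequest("https://api.openai.com/v1", key, "gpt-4o-mini", messages) yields requests that differ only in URL, key and model, which is why the revert path is a node-level change.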

If quality drops on a class of questions (dense REST tables, version-specific defaults), test one model change in Playground before production.


Swap the Language Model component back to OpenAI in Langflow and point its API key field at the OPENAI_API_KEY global variable. Nothing needs redeploying on Hugo as long as the endpoint name is unchanged, and the ingested embeddings are unaffected.


| Model tier | Typical use |
|---|---|
| deepseek-v4-flash | Public chat, fast answers, lower cost |
| deepseek-chat | Higher quality, still cheaper than flagship OpenAI for many prompts |
| OpenAI gpt-4o-mini | Revert path if DeepSeek drifts on Astra-specific details |

Generation cost dominates per-message spend; embedding cost dominates refresh (re-ingest post).
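The per-message side of that comparison is simple arithmetic: prompt tokens (question plus retrieved context) and answer tokens, each multiplied by a per-token price. The sketch below shows the calculation only. messageCost and the Pricing shape are hypothetical, and the prices are placeholders chosen for easy arithmetic, not real provider rates.

```typescript
// Prices expressed per one million tokens, in whatever currency the provider bills.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Estimated generation cost of one chat message.
function messageCost(inputTokens: number, outputTokens: number, p: Pricing): number {
  return (inputTokens * p.inputPerMillion + outputTokens * p.outputPerMillion) / 1e6;
}

// Placeholder prices for illustration only; substitute current published rates.
const placeholder: Pricing = { inputPerMillion: 1, outputPerMillion: 2 };

// 3000 prompt tokens and 500 answer tokens:
// (3000 * 1 + 500 * 2) / 1e6 = 0.004
const perMessage = messageCost(3000, 500, placeholder);
```

Running the same token counts through each provider's actual rates gives a like-for-like comparison before committing to a model tier.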


Series index: Building Astra Docs Chat

Open Astra Docs Chat and, if you are evaluating, compare its feel and detail with asking the same questions in Langflow Playground with each model.
