Complete Tutorial: Docker Desktop + n8n + Qdrant + Embedding Auto-Ingestion Pipeline
A fully local AI knowledge base system you can run right away: Webhook → Embedding → n8n processing → Qdrant vector database → searchable knowledge base. It covers the pitfalls I personally encountered (Webhook parsing, vector format, HTTP JSON errors, payload structure, etc.); this is the “bug-proof” edition.
🧱 1. Overall Architecture
User Request (Webhook)
↓
n8n Workflow
↓
Embedding Model (Ollama / OpenAI)
↓
Set Node (standardize structure)
↓
HTTP Request (write to Qdrant)
↓
Qdrant Vector DB
↓
Subsequent retrieval / RAG
🐳 2. Launch Qdrant + n8n with Docker Desktop
1️⃣ docker-compose.yml
version: "3.9"
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
    volumes:
      - ./qdrant_data:/qdrant/storage
  n8n:
    image: n8nio/n8n
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=localhost
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
      - NODE_ENV=production
    volumes:
      - ./n8n_data:/home/node/.n8n
Start up:
docker compose up -d
Access:
- n8n: http://localhost:5678
- Qdrant: http://localhost:6333
🧠 3. Create a Qdrant Collection
curl -X PUT http://localhost:6333/collections/test \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": {
      "size": 768,
      "distance": "Cosine"
    }
  }'
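The collection above is created with `size` 768, so every vector written to it later must have exactly 768 elements. As a minimal sketch, the same request body can be built and checked in Python before anything is sent. The helper names `collection_config` and `check_dimension` are hypothetical and are not part of Qdrant's API.

```python
import json

def collection_config(size: int, distance: str = "Cosine") -> dict:
    """Build the JSON body for PUT /collections/<name> (same shape as the curl call above)."""
    if size <= 0:
        raise ValueError("vector size must be a positive integer")
    return {"vectors": {"size": size, "distance": distance}}

def check_dimension(vector: list, config: dict) -> bool:
    """Return True when the vector length matches the collection's configured size."""
    return len(vector) == config["vectors"]["size"]

config = collection_config(768)
print(json.dumps(config))
```

Running a check like `check_dimension` on the first embedding you receive is a cheap way to catch a model/collection mismatch before the write to Qdrant fails.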
4. n8n Workflow Design (Core)
🧩 Step 1: Webhook (Entry Point)
Node: Webhook
- Method: POST
- Path: /qdrant-ingest
- Body Content Type: JSON (required)
Test payload:
{
"text": "Artificial intelligence is the technology that simulates human intelligence"
}
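⚠️ Common Pitfall
If the webhook output shows:
body: "{ \"text\": \"xxx\" }"
then the body arrived as a string and the JSON was not parsed. You must enable JSON mode, and the caller should send the request with a Content-Type: application/json header so the body is treated as JSON.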
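To make the difference between a parsed and an unparsed body concrete, here is a small Python sketch (for example for a test client outside n8n). It accepts either form and always returns an object. The function name `normalize_body` is a hypothetical helper, not something n8n provides.

```python
import json

def normalize_body(body):
    """Return the webhook body as a dict, parsing it first if it arrived as a JSON string."""
    if isinstance(body, str):
        body = json.loads(body)
    if not isinstance(body, dict):
        raise TypeError("webhook body must be a JSON object")
    return body

# The unparsed form from the pitfall above: a string that merely contains JSON.
unparsed = "{ \"text\": \"xxx\" }"
parsed = normalize_body(unparsed)
print(parsed["text"])  # xxx
```

With the string form, `body.text` does not exist; only after parsing is `text` reachable as a field.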
🧩 Step 2: Embedding (Ollama example)
HTTP Request Node:
POST http://host.docker.internal:11434/api/embeddings
Body:
{
"model": "nomic-embed-text",
"prompt": "{{$json.body.text}}"
}
✅ Correct output should be:
{
"embedding": [0.1, 0.2, 0.3, ...]
}
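The request and response shapes above can be sketched in Python as follows. This is only an illustration of the data handling, with no network call. `build_embedding_request` and `extract_embedding` are hypothetical helper names, and the fallback for a string-encoded embedding is an assumption added to guard against the "embedding treated as a string" problem this tutorial runs into.

```python
import json

def build_embedding_request(text: str, model: str = "nomic-embed-text") -> dict:
    """Build the body for POST /api/embeddings, matching the example above."""
    return {"model": model, "prompt": text}

def extract_embedding(response) -> list:
    """Return the embedding as a list of floats from a dict or JSON-string response."""
    if isinstance(response, str):
        response = json.loads(response)
    embedding = response.get("embedding")
    # Assumption: if the array was accidentally stringified upstream, parse it back.
    if isinstance(embedding, str):
        embedding = json.loads(embedding)
    if not isinstance(embedding, list) or not embedding:
        raise ValueError("response has no usable 'embedding' array")
    return [float(x) for x in embedding]

request_body = build_embedding_request("Artificial intelligence is ...")
vector = extract_embedding({"embedding": [0.1, 0.2, 0.3]})
print(len(vector))  # 3
```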
🧩 Step 3: Set Node (Critical structure cleanup)
This is the step where it is easiest to get stuck.
➤ Add Node: Set
Mode: Manual Mapping
Set these fields:
① id (Number)
{{ Date.now() }}
② vector (Array)
{{ $json.embedding }}
③ payload (Object) 🔥 Critical
{
"text": "{{$node["Webhook"].json.body.text}}"
}
✅ Correct Set output structure:
{
"id": 1700000000000,
"vector": [0.1, 0.2, ...],
"payload": {
"text": "Artificial intelligence..."
}
}
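The structure the Set node produces can be mirrored in a few lines of Python, which is handy for building test fixtures. This is a sketch only: `build_point` is a hypothetical helper, and the millisecond timestamp id simply imitates `Date.now()` from the Set node.

```python
import time

def build_point(text: str, embedding: list, point_id=None) -> dict:
    """Assemble one point with the three fields Qdrant expects: id, vector, payload."""
    if point_id is None:
        # Millisecond timestamp, analogous to Date.now() in the Set node.
        point_id = int(time.time() * 1000)
    return {
        "id": point_id,
        "vector": list(embedding),   # must stay an array, never a string
        "payload": {"text": text},   # must stay an object, never a string
    }

point = build_point("Artificial intelligence...", [0.1, 0.2], point_id=1700000000000)
print(point["id"])  # 1700000000000
```

Note that timestamp ids can collide if two requests arrive within the same millisecond, and a later write with the same id overwrites the earlier point.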
🧩 Step 4: HTTP Request (Write to Qdrant)
URL
http://qdrant:6333/collections/test/points
Method
PUT
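Body (⚠️ Correct format)
{
  "points": [
    {
      "id": {{ $json.id }},
      "vector": {{ JSON.stringify($json.vector) }},
      "payload": {{ JSON.stringify($json.payload) }}
    }
  ]
}
JSON.stringify ensures that the array and the object are inserted into the body as valid JSON rather than as plain strings.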
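The serialization problem behind this step is not specific to n8n. The sketch below shows the same class of error in Python rather than in an n8n expression: pasting an object into a JSON template as text produces an invalid body, while a real JSON serializer produces a valid one. `build_upsert_body` and `is_valid_json` are hypothetical helper names.

```python
import json

def build_upsert_body(point: dict) -> str:
    """Serialize one point into the body for PUT /collections/<name>/points."""
    return json.dumps({"points": [point]})

def is_valid_json(text: str) -> bool:
    """Return True if the string parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

point = {
    "id": 1700000000000,
    "vector": [0.1, 0.2],
    "payload": {"text": "Artificial intelligence..."},
}

# Proper serialization: arrays and objects become valid JSON.
good_body = build_upsert_body(point)

# Naive string interpolation: the payload is rendered with Python's own
# dict formatting (single quotes), which is not valid JSON.
naive_body = '{"points": [{"id": %s, "vector": %s, "payload": %s}]}' % (
    point["id"], point["vector"], point["payload"]
)

print(is_valid_json(good_body))   # True
print(is_valid_json(naive_body))  # False
```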
Summary of why the earlier errors occurred
① JSON Body errors
Reason: vector / payload were not properly JSON-serialized
② payload undefined
Reason: Webhook body wasn't parsed as JSON
③ vector is not an array
Reason: embedding was treated as a string
🧪 5. Verify data was ingested successfully
curl -X POST http://localhost:6333/collections/test/points/scroll \
  -H "Content-Type: application/json" \
  -d '{
    "limit": 10,
    "with_payload": true,
    "with_vectors": true
  }'
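Once the scroll call returns, a quick way to confirm that each point has the right shape is to summarize the response. The sketch below assumes the response nests the points under `result.points`. The `sample_response` is a hand-written, hypothetical example in that shape, not real output, and `summarize_scroll` is a hypothetical helper.

```python
def summarize_scroll(response: dict) -> list:
    """Return id, stored text, and vector length for each point in a scroll response."""
    points = response.get("result", {}).get("points", [])
    summary = []
    for p in points:
        payload = p.get("payload") or {}
        vector = p.get("vector") or []
        summary.append({"id": p["id"], "text": payload.get("text"), "dim": len(vector)})
    return summary

# Hypothetical sample shaped like a scroll response (values are illustrative only).
sample_response = {
    "result": {
        "points": [
            {
                "id": 1700000000000,
                "payload": {"text": "Artificial intelligence..."},
                "vector": [0.1, 0.2, 0.3],
            }
        ],
        "next_page_offset": None,
    },
    "status": "ok",
}

for row in summarize_scroll(sample_response):
    print(row)
```

In a real run, `dim` should equal the collection size (768 in this tutorial) and `text` should contain the original webhook text.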
🎯 6. Final stable workflow (recommended version)
Webhook
↓
Embedding Node (Ollama / OpenAI)
↓
Set Node (standardize structure)
↓
HTTP Request → Qdrant
🧠 7. Key lessons learned (very important)
1. n8n JSON rules
| Mistake | Correct |
|---|---|
| Paste arrays/objects into the JSON body with bare {{ }} | Serialize them inside the expression, e.g. {{ JSON.stringify(...) }} |
| vector as string | vector as array |
| payload as string | payload as object |
2. Qdrant requirements
id: unsigned integer / UUID
vector: float[] (length must equal the collection's size, 768 here)
payload: object
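These requirements can be checked before a point is sent. The following is a minimal validation sketch: `validate_point` is a hypothetical helper, and the error messages are illustrative, not Qdrant's own.

```python
import uuid

def validate_point(point: dict, expected_size: int) -> list:
    """Return a list of human-readable problems; an empty list means the point looks valid."""
    errors = []

    pid = point.get("id")
    if isinstance(pid, bool) or not isinstance(pid, (int, str)):
        errors.append("id must be an unsigned integer or a UUID string")
    elif isinstance(pid, int) and pid < 0:
        errors.append("id must not be negative")
    elif isinstance(pid, str):
        try:
            uuid.UUID(pid)
        except ValueError:
            errors.append("id string is not a valid UUID")

    vector = point.get("vector")
    if not isinstance(vector, list):
        errors.append("vector must be an array of floats")
    elif not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector):
        errors.append("vector must contain only numbers")
    elif len(vector) != expected_size:
        errors.append(f"vector has {len(vector)} elements, expected {expected_size}")

    if not isinstance(point.get("payload"), dict):
        errors.append("payload must be an object")

    return errors

good = {"id": 1700000000000, "vector": [0.1, 0.2], "payload": {"text": "hi"}}
bad = {"id": 1700000000000, "vector": "[0.1, 0.2]", "payload": "{\"text\": \"hi\"}"}
print(validate_point(good, expected_size=2))  # []
print(validate_point(bad, expected_size=2))
```

The `bad` example reproduces two of the mistakes from the table above: the vector and the payload are both strings.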
3. Webhook must enable JSON mode
Otherwise:
body = string ❌
body.text = undefined ❌
8. Next steps / upgrade directions
From here, the system can be upgraded to:
🔥 Enterprise-grade AI Knowledge Base
- RAG-based Q&A retrieval
- Multi-collection classification (doc / chat / blog)
- Automatic chunk splitting
- Deduplication on write
- Vector update strategies
- Switching between multiple Embedding models
- LangChain / Flowise integration
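Two of the items above, automatic chunk splitting and deduplication on write, can be sketched in a few lines. This is one possible approach under stated assumptions, not the tutorial's implementation. Chunks are fixed-size character windows with a small overlap, and the point id is a UUID derived from the chunk text, so writing the same text twice targets the same id instead of creating a duplicate. `split_into_chunks` and `content_id` are hypothetical helpers.

```python
import uuid

def split_into_chunks(text: str, max_chars: int = 200, overlap: int = 20) -> list:
    """Split text into overlapping character windows of at most max_chars."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reached the end of the text
        start += max_chars - overlap
    return chunks

def content_id(text: str) -> str:
    """Derive a deterministic UUID from the text, so identical text maps to the same id."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, text.strip()))

chunks = split_into_chunks("a" * 450, max_chars=200, overlap=20)
print([len(c) for c in chunks])  # [200, 200, 90]
print(content_id("hello") == content_id(" hello "))  # True
```

Replacing the `Date.now()` id with a content-derived UUID turns a repeated ingest of the same text into an overwrite rather than a second point.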
A natural follow-up to this tutorial is an n8n + Qdrant + RAG Q&A system (production-ready), including:
- Chat UI (Web)
- Vector retrieval
- LLM responses (Ollama / GPT)
- Multi-turn conversation memory
- Knowledge base tiering
This could serve directly as the basis for an AI SaaS MVP.