DocMind Architecture | Enterprise RAG Knowledge Base System

DocMind
Enterprise RAG Knowledge Base System

Full-stack AI knowledge base system with RAG (Retrieval-Augmented Generation), document management, hybrid search, streaming Q&A, and an autonomous agent with tool calling.

25K+ Lines of Code

15 API Modules

12 Data Models

11 Agent Tools

311 Total Tests

12 Optimization Rounds

FastAPI | SQLAlchemy 2.0 | Vue 3 | TypeScript | Naive UI | MySQL 8 | Redis | Elasticsearch 8 | Kafka | MinIO | DeepSeek API | LangChain | Pinia | Vue Router | Prometheus | Grafana

System Architecture

Multi-layered architecture with clear separation of concerns: Presentation Layer (Vue 3) -> API Layer (FastAPI) -> Business Layer (Services) -> Data Layer (Models + Infrastructure).

CLIENT LAYER

Vue 3 SPA

Naive UI Components

Pinia State Management

Vue Router

SSE / WebSocket Client

Axios HTTP Client

HTTP / WebSocket / SSE

API LAYER

Auth Router

Chat Router

Documents Router

Knowledge Router

Agent Router

Monitoring Router

Workflow Router

+ 7 more routers

Dependency Injection / Service Layer

BUSINESS LAYER

RAG Pipeline

Agent Engine

Document Parser

Embedding Service

Knowledge Service

Workflow Engine

Graph RAG Service

Memory Service

Semantic Chunker

ORM / SDK / Client Libraries

DATA LAYER

MySQL 8

Redis

Elasticsearch 8

Kafka

MinIO

Prometheus

Container Network / Docker Compose

INFRASTRUCTURE

Circuit Breaker

Rate Limiter

Distributed Tracing

Structured Logging

Health Checks

Cache Layer

Kafka Worker

Backend Modules

Python (FastAPI) backend following the Router -> Service -> Model pattern, with 15 API endpoints, 17 business services, and 12 SQLAlchemy models.

API Endpoints

backend/app/api/v1/endpoints/

14 FastAPI routers handling REST API, WebSocket, and SSE streaming endpoints.

auth.py chat.py documents.py knowledge.py agent.py monitoring.py workflow.py users.py organizations.py prompts.py manuals.py memory.py files.py notifications.py

Business Services

backend/app/services/

Core business logic: authentication, document processing, embeddings, RAG, workflows, and graph-based retrieval.

auth_service.py embedding_service.py document_parser.py knowledge_service.py rag_service.py graph_rag_service.py workflow_engine.py memory_service.py semantic_chunker.py file_service.py masking_service.py permission_service.py audit_service.py organization_service.py

RAG Pipeline

backend/app/rag/

Retrieval-Augmented Generation pipeline with hybrid search, reranking, context compression, and evaluation.

pipeline.py retriever.py reranker.py context_window.py context_compressor.py query_processor.py evaluator.py cache.py metrics.py

Agent System

backend/app/agent/

ReAct-style autonomous agent with self-registering tools, skill learning, and subagent delegation.

registry.py loop.py context.py skills.py subagent.py tools.py service.py

Data Models

backend/app/models/

SQLAlchemy 2.0 models using modern Mapped + mapped_column syntax. 12 domain models with relationships.

user.py chat.py document.py knowledge_job.py organization.py rbac.py prompt.py manual.py workflow.py notification.py user_audit.py

Core Infrastructure

backend/app/core/

Infrastructure layer: database, cache, search, messaging, security, observability, and configuration.

database.py redis.py elasticsearch.py kafka_client.py minio_client.py security.py circuit_breaker.py prometheus.py tracing.py logging.py middleware.py config/

Frontend Modules

Vue 3 + TypeScript + Naive UI frontend with 20+ views, Pinia state management, composables, and real-time streaming support.

Page Views

frontend/src/views/

20+ page components covering chat, knowledge management, agent, dashboard, monitoring, workflow, and admin features.

chat/ knowledge/ agent/ dashboard/ monitoring/ workflow/ documents/ search/ conversations/ upload/ login/ users/ organizations/ prompts/ manual/ notifications/ profile/ admin/ firsthome/ system-about/

State Management

frontend/src/stores/

Pinia stores for global state management: application state, chat sessions, user auth, workflow, and notifications.

app.ts chat.ts user.ts workflow.ts notification.ts

Composables

frontend/src/composables/

Reusable composition functions for chat connections, message handling, error management, and data prefetching.

useDebounce.ts useErrorHandler.ts usePrefetch.ts

API Client

frontend/src/api/

TypeScript API client modules with typed request/response interfaces for all backend endpoints.

auth.ts chat.ts conversation.ts knowledge.ts agent.ts rag.ts search.ts workflow.ts monitoring.ts user.ts organization.ts prompt.ts manual.ts memory.ts notification.ts audit.ts

RAG Pipeline

Intelligent query processing with complexity classification, adaptive strategy selection, hybrid search (BM25 + vector), RRF fusion, reranking, and context compression.

1 Query Processing Flow

User Query -> Complexity Classifier -> Strategy Selection -> Query Expansion

Strategy: keyword_only | hybrid | hybrid_hyde

2 Hybrid Search

BM25 Keyword Search + Vector Similarity Search -> RRF Fusion -> Reranker -> Context Window

3 Generation

Context Compression -> Citation Tracking -> LLM Generation -> SSE Stream Response

1 Query Complexity Classification

Classifies each query as simple, moderate, or complex in order to select the optimal retrieval strategy.

query_processor.py
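As a rough illustration of the classification step, the sketch below labels a query from two crude lexical signals: its length and the number of reasoning cue words it contains. The source does not describe the features or thresholds used in query_processor.py, so `REASONING_CUES`, the thresholds, and the function name `classify_complexity` are all assumptions made for this example.

```python
import re

# Hypothetical cue words; the real classifier's features are not documented here.
REASONING_CUES = {"why", "how", "compare", "explain", "difference", "versus", "tradeoff"}


def classify_complexity(query: str) -> str:
    """Return 'simple', 'moderate', or 'complex' from crude lexical signals."""
    tokens = re.findall(r"\w+", query.lower())
    cue_count = sum(1 for token in tokens if token in REASONING_CUES)
    # Assumed thresholds: several cue words or a very long query means "complex".
    if cue_count >= 2 or len(tokens) > 25:
        return "complex"
    if cue_count == 1 or len(tokens) > 8:
        return "moderate"
    return "simple"
```

A production classifier would more plausibly use a trained model or an LLM call; the value of a cheap heuristic like this one is that it adds almost no latency before retrieval starts.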

2 Adaptive Strategy Selection

Uses keyword_only for simple factual queries, hybrid for moderate queries, and hybrid_hyde for complex reasoning queries.

pipeline.py
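The mapping from complexity label to retrieval strategy can be sketched as a small lookup table. The three strategy names come from the text above; the `Strategy` enum, the `select_strategy` function, and the fallback to hybrid for unknown labels are assumptions of this sketch rather than details of pipeline.py.

```python
from enum import Enum


class Strategy(str, Enum):
    KEYWORD_ONLY = "keyword_only"
    HYBRID = "hybrid"
    HYBRID_HYDE = "hybrid_hyde"


_STRATEGY_BY_COMPLEXITY = {
    "simple": Strategy.KEYWORD_ONLY,   # simple factual lookups
    "moderate": Strategy.HYBRID,       # BM25 + vector search
    "complex": Strategy.HYBRID_HYDE,   # hybrid search plus HyDE query expansion
}


def select_strategy(complexity: str) -> Strategy:
    # Falling back to hybrid for unrecognized labels is an assumption of this sketch.
    return _STRATEGY_BY_COMPLEXITY.get(complexity, Strategy.HYBRID)
```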

3 Hybrid Search + RRF Fusion

BM25 keyword search and dense vector search run side by side; their ranked results are merged via Reciprocal Rank Fusion (RRF) to improve recall.

retriever.py
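Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every ranked list it appears in, so it needs only ranks and not comparable scores. The sketch below is a minimal standalone version; `rrf_fuse` is a hypothetical name, k = 60 is the commonly used default rather than a value confirmed by the source, and the actual retriever.py may delegate fusion to Elasticsearch instead.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_hits = ["d1", "d2", "d3"]      # illustrative keyword ranking
vector_hits = ["d3", "d1", "d4"]    # illustrative vector ranking
fused = rrf_fuse([bm25_hits, vector_hits])
```

With these inputs, documents that appear in both lists (d1 and d3) rise above those found by only one retriever, which is the behavior that makes RRF a reasonable default for hybrid search.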

4 Cross-Encoder Reranking

A neural reranker scores each query-document pair to produce a precise relevance ranking.

reranker.py

5 Context Window + Compression

Smart context assembly with token budget management and redundancy removal.

context_window.py / context_compressor.py
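One way to picture token budgeting with redundancy removal is a greedy pass over chunks that are already sorted by relevance. The sketch below makes several assumptions that are not in the source: token counts are approximated by whitespace splitting, redundancy is measured with Jaccard word overlap, and the names `assemble_context` and `max_overlap` are hypothetical.

```python
def approx_tokens(text: str) -> int:
    # Whitespace splitting is a rough stand-in for a real tokenizer.
    return len(text.split())


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two strings."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def assemble_context(chunks: list[str], token_budget: int,
                     max_overlap: float = 0.8) -> list[str]:
    """Greedily keep chunks (assumed sorted best-first) within the budget."""
    selected: list[str] = []
    used = 0
    for chunk in chunks:
        cost = approx_tokens(chunk)
        if used + cost > token_budget:
            continue  # too large; a smaller later chunk may still fit
        if any(jaccard(chunk, kept) >= max_overlap for kept in selected):
            continue  # near-duplicate of a chunk already selected
        selected.append(chunk)
        used += cost
    return selected
```

Skipping an oversized chunk instead of stopping lets smaller, lower-ranked chunks fill the remaining budget; stopping at the first overflow would be the stricter alternative.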

6 Citation Tracking + LLM Generation

Sources are tracked and cited in responses, which are streamed via SSE for a real-time user experience.

evaluator.py / metrics.py

Agent System

ReAct-style autonomous agent with a self-registering tool registry, parallel execution, context compression, skill learning, and subagent delegation.

Input

User Query: natural language question
Context: conversation history
Metadata: user, org, permissions

Output

Final Answer: SSE streaming response
Tool Results: intermediate outputs
Reasoning Trace: step-by-step log

ReAct Agent Loop

Thought -> Action -> Observation -> Repeat

Loop Engine

Max iterations control

Early stopping detection

Error recovery & retry

Token budget management
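The loop controls listed above can be sketched as a bounded Thought -> Action -> Observation cycle. This is a minimal illustration under stated assumptions, not the contents of loop.py: the `policy` callable stands in for the LLM that decides the next step, and `Step`, `AgentResult`, `run_react`, and `hit_limit` are hypothetical names. Token budget management is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Step:
    thought: str
    action: Optional[str] = None        # tool name, or None when finishing
    action_input: str = ""
    final_answer: Optional[str] = None  # set when the agent decides to stop


@dataclass
class AgentResult:
    answer: str
    trace: list[str] = field(default_factory=list)
    hit_limit: bool = False


def run_react(query: str,
              policy: Callable[[str, list[str]], Step],
              tools: dict[str, Callable[[str], str]],
              max_iterations: int = 5) -> AgentResult:
    trace: list[str] = []
    for _ in range(max_iterations):          # max iterations control
        step = policy(query, trace)
        trace.append(f"Thought: {step.thought}")
        if step.final_answer is not None:    # early stop: answer is ready
            return AgentResult(step.final_answer, trace)
        tool = tools.get(step.action or "")
        try:
            observation = tool(step.action_input) if tool else f"unknown tool: {step.action}"
        except Exception as exc:             # error recovery: report and continue
            observation = f"tool error: {exc}"
        trace.append(f"Action: {step.action}({step.action_input})")
        trace.append(f"Observation: {observation}")
    return AgentResult("Stopped: iteration limit reached.", trace, hit_limit=True)
```

Feeding tool errors back as observations, rather than raising, gives the policy a chance to pick a different tool on the next iteration.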

Context Engine

Context compression

Memory management

Conversation window

Token counting

Skill Learning

Pattern recognition

Success/failure tracking

Skill composition

Auto-optimization

SubAgent

Task delegation

Specialized agents

Result aggregation

Hierarchical planning

Tool Registry (11 tools)

Document Search

Knowledge Query

Web Search

Code Analysis

Data Analysis

Summarize

Translate

Conversation

Memory Store

Prompt Template

File Operations
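A self-registering registry is commonly built with a decorator that records each tool at import time, so adding a tool means writing one decorated function. The sketch below shows that pattern; `ToolRegistry`, its methods, and the snake_case tool names are hypothetical and are not taken from registry.py, and the two tool bodies are placeholders.

```python
from typing import Callable


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, name: str, description: str = ""):
        """Decorator that adds the wrapped function to the registry."""
        def decorator(func: Callable[..., str]) -> Callable[..., str]:
            if name in self._tools:
                raise ValueError(f"tool {name!r} is already registered")
            func.description = description  # attach metadata for prompt building
            self._tools[name] = func
            return func
        return decorator

    def get(self, name: str) -> Callable[..., str]:
        return self._tools[name]

    def names(self) -> list[str]:
        return sorted(self._tools)


registry = ToolRegistry()


@registry.register("summarize", "Return the first sentence of a text")
def summarize(text: str) -> str:
    # Placeholder logic; a real tool would call the LLM.
    return text.split(".")[0].strip() + "."


@registry.register("document_search", "Look up documents by keyword")
def document_search(query: str) -> str:
    # Placeholder logic; a real tool would query the search index.
    return f"results for {query}"
```

The agent can then list `registry.names()` in its prompt and dispatch an action with `registry.get(name)(argument)`.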

Tool Execution

Parallel execution support

Timeout & retry policies

Result validation

SSE real-time visualization
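Parallel execution with timeout and retry policies can be sketched with asyncio: each call is wrapped in `asyncio.wait_for` and retried a bounded number of times, and independent calls are run together with `asyncio.gather`. The helper names, the default timeout and retry count, and the choice to return a failure string instead of raising are assumptions of this example.

```python
import asyncio
from typing import Awaitable, Callable

AsyncTool = Callable[[str], Awaitable[str]]


async def call_with_policy(tool: AsyncTool, arg: str,
                           timeout: float = 1.0, retries: int = 2) -> str:
    """Run one tool call with a per-attempt timeout and bounded retries."""
    last_error = "no attempts made"
    for _ in range(retries + 1):
        try:
            return await asyncio.wait_for(tool(arg), timeout=timeout)
        except asyncio.TimeoutError:
            last_error = f"timeout after {timeout}s"
        except Exception as exc:
            last_error = f"{type(exc).__name__}: {exc}"
    return f"failed: {last_error}"


async def run_tools_parallel(calls: list[tuple[AsyncTool, str]]) -> list[str]:
    """Run independent tool calls concurrently; results keep the input order."""
    return list(await asyncio.gather(*(call_with_policy(t, a) for t, a in calls)))
```

Returning a failure string keeps one broken tool from cancelling its siblings in the same `gather` call; a stricter design could raise and let the loop engine decide.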

Data Flow

End-to-end data flow from document upload, through the processing pipeline, to intelligent query response.

1 Document Ingestion Flow

File Upload -> MinIO Storage -> DB Record -> Kafka Message -> Worker Process -> Parse & Chunk -> Embedding -> Elasticsearch

2 Query Response Flow

User Query -> WebSocket/SSE -> RAG Pipeline -> Hybrid Search -> Rerank -> Context Assembly -> DeepSeek LLM -> Stream Response
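The final streaming step of this flow can be illustrated with the Server-Sent Events wire format, in which each event is a block of `field: value` lines terminated by a blank line. The sketch below formats generated tokens as SSE events and closes with the cited sources; the event names `token` and `done` and the payload keys `delta` and `sources` are hypothetical, since the source does not specify the event schema.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Serialize one Server-Sent Event; a blank line terminates the event."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # json.dumps emits a single line, so one data field is enough.
    lines.append(f"data: {json.dumps(data, ensure_ascii=False)}")
    return "\n".join(lines) + "\n\n"


def stream_answer(tokens: Iterable[str], sources: list[str]) -> Iterator[str]:
    """Yield one event per generated token, then a final event with citations."""
    for token in tokens:
        yield sse_event({"delta": token}, event="token")
    yield sse_event({"sources": sources}, event="done")
```

In a FastAPI application, a generator like `stream_answer` would typically be wrapped in a streaming response with the `text/event-stream` media type.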

Infrastructure

Production-grade infrastructure with caching, messaging, search, storage, and observability.

MySQL 8

Primary relational database. 12 SQLAlchemy 2.0 models with async support, connection pooling, and migration management.

Redis

In-memory cache for sessions, rate limiting, token blacklists, and distributed locks. Pub/Sub for real-time notifications.

Elasticsearch 8

Hybrid search engine supporting BM25 keyword search and dense vector similarity search with RRF fusion.

Kafka

Async message queue for document processing pipeline. Decouples upload from heavy parsing and embedding operations.

MinIO

S3-compatible object storage for document files. Supports multiple file formats: PDF, DOCX, TXT, MD, HTML.

Prometheus + Grafana

17+ RAG metrics, 18 Grafana dashboard panels. Custom metrics for latency, throughput, cache hits, and quality scores.

Security Layers

Defense-in-depth security architecture with authentication, authorization, input validation, and infrastructure hardening.

1 Authentication

JWT token-based authentication with access/refresh token rotation and token blacklisting on logout.

JWT | Token Blacklist | bcrypt | Refresh Rotation

2 Authorization

Role-Based Access Control (RBAC) with organization-level multi-tenancy and granular permission management.

RBAC | Multi-Tenancy | Permission Check

3 Input Validation

Pydantic schema validation, SSRF prevention, path traversal protection, and AST-based code sandboxing.

Pydantic | SSRF Prevention | Path Traversal | AST Sandbox
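Two of the checks named above can be sketched with the standard library alone. `safe_join` rejects user-supplied paths that resolve outside a base directory, and `is_url_allowed` rejects URLs whose host is a private, loopback, or link-local IP literal. Both function names are hypothetical, and the URL check is deliberately incomplete: it does not resolve hostnames, which a real SSRF defense would need to do before connecting.

```python
import ipaddress
from pathlib import Path
from urllib.parse import urlparse


def safe_join(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting any path that escapes it."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError("path escapes base directory")
    return candidate


def is_url_allowed(url: str) -> bool:
    """Allow only http(s) URLs that do not point at internal IP literals."""
    parsed = urlparse(url)
    if parsed.scheme not in {"http", "https"} or not parsed.hostname:
        return False
    host = parsed.hostname
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A hostname, not an IP literal. This sketch lets it through; a real
        # check must resolve it and test every resolved address.
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast)
```

Resolving the path before comparing is what defeats `..` segments and absolute paths; comparing raw strings would not.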

4 Rate Limiting & Protection

Redis-based rate limiting, brute force protection, CORS policy, and request throttling per user/IP.

Rate Limiting | Brute Force | CORS | Throttling
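Per-user or per-IP throttling is often implemented as a sliding window: keep the timestamps of recent requests for each key and reject a request once the window is full. The sketch below holds that state in process memory purely for illustration; the system described here uses Redis, and the class name `SlidingWindowLimiter` and its parameters are assumptions.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """In-memory stand-in for a Redis-backed sliding-window rate limiter."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so tests can control time
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Return True and record the request if `key` is under its limit."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

The key could be a user ID, a client IP, or a combination such as `login:<ip>` for brute-force protection on the login endpoint.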

5 Data Protection

Sensitive data masking, encrypted storage, audit logging, and secure file handling with content type validation.

Data Masking | Audit Log | Content Validation
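Sensitive data masking is frequently done with pattern-based substitution before text is logged or returned. The sketch below masks email addresses and 11-digit phone numbers; which fields masking_service.py actually covers is not stated in the source, so the two patterns and the name `mask_sensitive` are assumptions chosen for illustration.

```python
import re

# Keep the first character of the local part and the full domain.
EMAIL_RE = re.compile(
    r"\b([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@([A-Za-z0-9.-]+\.[A-Za-z]{2,})\b"
)
# 11-digit numbers: keep the first three and the last four digits.
PHONE_RE = re.compile(r"\b(\d{3})\d{4}(\d{4})\b")


def mask_sensitive(text: str) -> str:
    """Mask email addresses and 11-digit phone numbers in free text."""
    text = EMAIL_RE.sub(r"\1***@\2", text)
    text = PHONE_RE.sub(r"\1****\2", text)
    return text
```

Keeping a few leading and trailing characters preserves enough context for support and audit work while hiding the identifying portion.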

Testing & Quality

Comprehensive test coverage with 311 total tests, 12 optimization rounds, and production-grade monitoring.

216 Backend Tests

95 Frontend Tests

12 Optimization Rounds

100% Refactor Complete

17+ RAG Metrics

Prometheus metrics covering latency, throughput, cache hit rate, retrieval quality, reranker scores, and generation metrics.

18 Grafana Panels

Comprehensive dashboards for system health, RAG pipeline performance, API latency distribution, and error rate tracking.

52/53 Refactor Tasks

Systematic code quality improvements: service extraction, error handling, type safety, performance optimization, and documentation.