Enterprise RAG Knowledge Base System
Full-stack AI knowledge base system with RAG (Retrieval-Augmented Generation), document management, hybrid search, streaming Q&A, and an autonomous Agent with tool calling.
System Architecture
Multi-layered architecture with clear separation of concerns: Presentation Layer (Vue 3) -> API Layer (FastAPI) -> Business Layer (Services) -> Data Layer (Models + Infrastructure).
Backend Modules
Python (FastAPI) backend following the Router -> Service -> Model pattern, with 15 API endpoints, 17 business services, and 12 SQLAlchemy models.
API Endpoints
Business Services
RAG Pipeline
Agent System
Data Models
Core Infrastructure
Frontend Modules
Vue 3 + TypeScript + Naive UI frontend with 20+ views, Pinia state management, composables, and real-time streaming support.
Page Views
State Management
Composables
API Client
RAG Pipeline
Intelligent query processing with complexity classification, adaptive strategy selection, hybrid search (BM25 + Vector), RRF fusion, reranking, and context compression.
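User Query
Complexity Classifier
Strategy Selection
Query Expansion
Keyword Search
Vector Similarity Search
Reciprocal Rank Fusion
Reranking
Context Window
Context Compression
Citation Tracking
LLM Generation
Streaming Response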
Query Complexity Classification
Classifies each query as simple, moderate, or complex so that a matching retrieval strategy can be selected.
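A minimal heuristic sketch of such a classifier is shown here. The cue words and length thresholds are illustrative assumptions, not the system's documented logic; the actual classifier may well be model-based.

```python
import re

# Hypothetical cue words; the real classifier's features are not documented here.
REASONING_CUES = {"why", "how", "compare", "explain", "difference", "tradeoff"}


def classify_complexity(query: str) -> str:
    """Return 'simple', 'moderate', or 'complex' using crude heuristics."""
    tokens = re.findall(r"\w+", query.lower())
    cue_hits = sum(1 for t in tokens if t in REASONING_CUES)
    # Thresholds below are assumptions chosen only for illustration.
    if cue_hits >= 2 or len(tokens) > 20:
        return "complex"
    if cue_hits == 1 or len(tokens) > 8:
        return "moderate"
    return "simple"
```

A short factual lookup lands in the simple bucket, while a query with several reasoning cues is treated as complex.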
Adaptive Strategy Selection
Uses keyword_only for simple factual queries, hybrid for moderate queries, and hybrid_hyde for complex reasoning queries.
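The mapping from complexity label to strategy can be sketched as a lookup table. The strategy names come from the description above; the `Strategy` enum, the `select_strategy` function, and the fallback to hybrid for unknown labels are assumptions made for illustration.

```python
from enum import Enum


class Strategy(str, Enum):
    KEYWORD_ONLY = "keyword_only"
    HYBRID = "hybrid"
    HYBRID_HYDE = "hybrid_hyde"


_STRATEGY_BY_COMPLEXITY = {
    "simple": Strategy.KEYWORD_ONLY,
    "moderate": Strategy.HYBRID,
    "complex": Strategy.HYBRID_HYDE,
}


def select_strategy(complexity: str) -> Strategy:
    # Assumption: unknown labels fall back to plain hybrid retrieval.
    return _STRATEGY_BY_COMPLEXITY.get(complexity, Strategy.HYBRID)
```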
Hybrid Search + RRF Fusion
BM25 keyword search + dense vector search, merged via Reciprocal Rank Fusion for optimal recall.
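Reciprocal Rank Fusion scores each document by summing 1 / (k + rank) over every ranked list it appears in. The sketch below assumes each retriever returns an ordered list of document IDs and uses k = 60, a commonly used default; the value used in this system is not stated.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence


def rrf_fuse(rankings: Iterable[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]     # hypothetical keyword results
vector_hits = ["b", "d", "a"]   # hypothetical vector results
fused = rrf_fuse([bm25_hits, vector_hits])
```

Documents found by both retrievers ("a" and "b") rise above those found by only one, which is the property that makes RRF useful without any score normalization.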
Cross-Encoder Reranking
Neural reranker scores query-document pairs for precise relevance ranking.
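The reranking step can be sketched with a pluggable pair scorer. In the sketch below, `overlap_score` is a toy token-overlap function standing in for the neural cross-encoder, which is not specified here; only the control flow (score every query-document pair, sort, keep the top results) is meant to be representative.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    docs: List[str],
    score_pair: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each (query, doc) pair and return the top_k by descending score."""
    scored = [(doc, score_pair(query, doc)) for doc in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def overlap_score(query: str, doc: str) -> float:
    # Toy stand-in for a cross-encoder: fraction of query tokens found in doc.
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0
```

In a real deployment `score_pair` would wrap a model call, ideally batched, while the rest of the function stays the same.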
Context Window + Compression
Smart context assembly with token budget management and redundancy removal.
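A minimal sketch of budgeted context assembly follows. It assumes the chunks arrive sorted by relevance, treats redundancy as exact duplication after whitespace and case normalization, and counts tokens by whitespace splitting; a real pipeline would use the model's tokenizer and likely a fuzzier similarity check.

```python
from typing import Callable, List


def assemble_context(
    chunks: List[str],
    token_budget: int,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[str]:
    """Greedily keep unique chunks, in order, while staying within the budget."""
    seen = set()
    selected: List[str] = []
    used = 0
    for chunk in chunks:
        key = " ".join(chunk.lower().split())  # normalized form for dedup
        if key in seen:
            continue
        cost = count_tokens(chunk)
        if used + cost > token_budget:
            continue  # skip this chunk; a smaller later chunk may still fit
        seen.add(key)
        selected.append(chunk)
        used += cost
    return selected
```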
Citation Tracking + LLM Generation
Sources tracked and cited in responses, streamed via SSE for real-time user experience.
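Server-Sent Events are plain text frames made of `event:` and `data:` lines terminated by a blank line. The sketch below shows one way token deltas and a final citation list could be framed; the event names `token` and `done` and the payload fields are hypothetical, not the system's documented wire format.

```python
import json
from typing import Dict, Iterable, Iterator, List


def sse_event(data: Dict, event: str = "message") -> str:
    """Format one SSE frame: an event line, a data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


def stream_answer(tokens: Iterable[str], citations: List[Dict]) -> Iterator[str]:
    # Emit each generated token as it arrives, then the tracked sources.
    for tok in tokens:
        yield sse_event({"delta": tok}, event="token")
    yield sse_event({"citations": citations}, event="done")
```

A web framework's streaming response can consume this generator directly, provided the response is served with the `text/event-stream` content type.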
Agent System
ReAct-style autonomous agent with self-registering tool registry, parallel execution, context compression, skill learning, and subagent delegation.
Input
Natural language question
Conversation history
User, org, permissions
Output
SSE streaming response
Intermediate outputs
Step-by-step log
ReAct Agent Loop
Thought -> Action -> Observation -> Repeat
Loop Engine
Context Engine
Skill Learning
SubAgent
Tool Registry (11 tools)
Tool Execution
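The Thought -> Action -> Observation loop and a self-registering tool registry can be sketched together. Everything named here (`TOOLS`, the `tool` decorator, the `add` tool, `run_agent`, and `scripted_policy`) is hypothetical; the scripted policy stands in for the LLM call that would normally choose the next action.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]  # (thought, action, observation or final answer)

TOOLS: Dict[str, Callable[[str], str]] = {}


def tool(name: str):
    """Decorator that registers a function in the tool registry when defined."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return decorator


@tool("add")
def add(arg: str) -> str:
    a, b = arg.split()
    return str(int(a) + int(b))


def run_agent(question: str, policy, max_steps: int = 5):
    """Repeat Thought -> Action -> Observation until the policy finishes."""
    log: List[Step] = []
    for _ in range(max_steps):
        thought, action, arg = policy(question, log)
        if action == "finish":
            log.append((thought, action, arg))
            return arg, log
        if action in TOOLS:
            observation = TOOLS[action](arg)
        else:
            observation = f"unknown tool: {action}"
        log.append((thought, action, observation))
    return "step limit reached", log


def scripted_policy(question: str, log: List[Step]):
    # Stand-in for an LLM: call the tool once, then finish with its result.
    if not log:
        return ("I should add the numbers.", "add", "2 3")
    return ("I have the result.", "finish", log[-1][2])


answer, steps = run_agent("What is 2 + 3?", scripted_policy)
```

The `max_steps` cap is a common safeguard against a policy that never emits a finish action; the returned log corresponds to the step-by-step output listed above.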
Data Flow
End-to-end data flow from document upload through processing pipeline to intelligent query response.
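File Upload
Object Storage
Database Record
Message Queue
Async Processing
Parsing and Chunking
Embedding
Index Storage
User Query
Real-time Connection
Retrieval Augmentation
Hybrid Search
Reranking
Context Assembly
LLM Inference
Streaming Response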
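As an illustration of the chunking stage in the ingestion flow, the sketch below splits parsed text into fixed-size word windows with overlap. The window size, the overlap, and the use of words rather than model tokens are all assumptions; the system's actual chunking strategy is not described here.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into overlapping windows of at most chunk_size words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks
```

Overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of some duplicated text in the index.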
Infrastructure
Production-grade infrastructure with caching, messaging, search, storage, and observability.
MySQL 8
Primary relational database. 12 SQLAlchemy 2.0 models with async support, connection pooling, and migration management.
Redis
In-memory cache for sessions, rate limiting, token blacklists, and distributed locks. Pub/Sub for real-time notifications.
Elasticsearch 8
Hybrid search engine supporting BM25 keyword search and dense vector similarity search with RRF fusion.
Kafka
Async message queue for document processing pipeline. Decouples upload from heavy parsing and embedding operations.
MinIO
S3-compatible object storage for document files. Supports multiple file formats: PDF, DOCX, TXT, MD, HTML.
Prometheus + Grafana
17+ RAG metrics, 18 Grafana dashboard panels. Custom metrics for latency, throughput, cache hits, and quality scores.
Security Layers
Defense-in-depth security architecture with authentication, authorization, input validation, and infrastructure hardening.
Authentication
JWT token-based authentication with access/refresh token rotation and token blacklisting on logout.
Authorization
Role-Based Access Control (RBAC) with organization-level multi-tenancy and granular permission management.
Input Validation
Pydantic schema validation, SSRF prevention, path traversal protection, and AST-based code sandboxing.
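Two of these checks can be sketched with the standard library alone. The function names are hypothetical. For SSRF prevention the address check must be applied to the IP a hostname actually resolves to, not to the hostname string; the sketch covers only the address test itself.

```python
import ipaddress
from pathlib import Path


def is_internal_address(host_ip: str) -> bool:
    """True if the IP is private, loopback, link-local, or reserved."""
    ip = ipaddress.ip_address(host_ip)
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting any path that escapes it."""
    base = Path(base_dir).resolve()
    target = (base / user_path).resolve()
    if not target.is_relative_to(base):  # Path.is_relative_to needs Python 3.9+
        raise ValueError("path escapes base directory")
    return target
```

Resolving before comparing is what defeats `../` sequences and absolute paths, since both collapse to a location outside the base directory.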
Rate Limiting & Protection
Redis-based rate limiting, brute force protection, CORS policy, and request throttling per user/IP.
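The per-user/IP throttling logic can be illustrated with a sliding-window limiter. This sketch keeps its state in process memory purely for illustration; it is not the Redis-backed implementation, and the class name and parameters are assumptions.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within `window_seconds`."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        q = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```

The injectable `clock` makes the limiter easy to test; a shared store such as Redis would be needed for the same behavior across multiple API workers.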
Data Protection
Sensitive data masking, encrypted storage, audit logging, and secure file handling with content type validation.
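As one example of sensitive data masking, the sketch below hides the local part of email addresses before text is logged. The regular expression and the masking format are assumptions chosen for illustration; the fields the system actually masks are not listed here.

```python
import re

# Keep the first character of the local part and the full domain.
EMAIL_RE = re.compile(
    r"([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@([A-Za-z0-9.-]+\.[A-Za-z]{2,})"
)


def mask_emails(text: str) -> str:
    """Replace the local part of each email address with 'x***'."""
    return EMAIL_RE.sub(r"\1***@\2", text)
```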
Testing & Quality
Comprehensive test coverage with 311 tests, 12 optimization rounds, and production-grade monitoring.
17+ RAG Metrics
Prometheus metrics covering latency, throughput, cache hit rate, retrieval quality, reranker scores, and generation metrics.
18 Grafana Panels
Comprehensive dashboards for system health, RAG pipeline performance, API latency distribution, and error rate tracking.
52/53 Refactor Tasks
Systematic code quality improvements: service extraction, error handling, type safety, performance optimization, and documentation.