# Deployment & Operations Specification

## 1. Environment Configuration

### 1.1 Environment Variable Reference

```bash
# ========== PostgreSQL G5 connection ==========
POSTGRES_HOST_G5=10.8.8.80
POSTGRES_PORT_G5=5434
POSTGRES_DATABASE_G5=dbv6
POSTGRES_USER_G5=<your_username>
POSTGRES_PASSWORD_G5=<your_password>

# ========== Kafka consumer ==========
KAFKA_BROKERS=kafka.blv-oa.com:9092
KAFKA_TOPIC_HEARTBEAT=blwlog4Nodejs-oldrcu-heartbeat-topic
KAFKA_CONSUMER_INSTANCES=3 # configured consumer count (auto-scaled to the partition count)
KAFKA_BATCH_SIZE=100000 # messages per fetch
KAFKA_FETCH_MIN_BYTES=65536 # minimum bytes to wait for per fetch
KAFKA_COMMIT_INTERVAL_MS=200 # offset commit interval (ms)

# ========== Redis connection ==========
REDIS_HOST=10.8.8.109
REDIS_PORT=6379
REDIS_PASSWORD=<optional_password>

# ========== Buffering and deduplication ==========
HEARTBEAT_BUFFER_SIZE_MAX=5000 # maximum buffered entries
HEARTBEAT_BUFFER_WINDOW_MS=5000 # buffer time window (ms)
HEARTBEAT_WRITE_COOLDOWN_MS=30000 # write cooldown period (ms)

# ========== Logging and debugging ==========
LOG_LEVEL=info # debug | info | warn | error
NODE_ENV=production # development | production
```
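
The variables above could be read and validated once at startup so that a missing or malformed value fails fast. The following is a minimal sketch, not the service's actual loader: `HeartbeatConfig`, `requireString`, `readInt`, `loadConfig`, and every fallback default are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical shape for a subset of the variables listed above.
interface HeartbeatConfig {
  postgresHost: string;
  postgresPort: number;
  kafkaBrokers: string[];
  kafkaConsumerInstances: number;
  bufferSizeMax: number;
  bufferWindowMs: number;
  writeCooldownMs: number;
}

type Env = Record<string, string | undefined>;

// Read a required string variable, failing with a clear message when absent.
function requireString(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value.trim();
}

// Read a positive integer; use the fallback when the variable is unset.
// The fallback values passed below are assumptions, not documented defaults.
function readInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

function loadConfig(env: Env): HeartbeatConfig {
  return {
    postgresHost: requireString(env, "POSTGRES_HOST_G5"),
    postgresPort: readInt(env, "POSTGRES_PORT_G5", 5432),
    // KAFKA_BROKERS is treated as a comma-separated list (assumption).
    kafkaBrokers: requireString(env, "KAFKA_BROKERS")
      .split(",")
      .map((broker) => broker.trim()),
    kafkaConsumerInstances: readInt(env, "KAFKA_CONSUMER_INSTANCES", 1),
    bufferSizeMax: readInt(env, "HEARTBEAT_BUFFER_SIZE_MAX", 5000),
    bufferWindowMs: readInt(env, "HEARTBEAT_BUFFER_WINDOW_MS", 5000),
    writeCooldownMs: readInt(env, "HEARTBEAT_WRITE_COOLDOWN_MS", 30000),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to exercise with a plain object.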

## 2. Startup Procedure

### 2.1 Development Startup

```bash
# 1. Install dependencies
npm install

# 2. Configure environment variables
cp .env.example .env
# Edit .env and fill in the real database, Kafka, and Redis addresses

# 3. Run the development service
npm run dev

# Expected output:
# ✓ Redis connected & heartbeat started
# ✓ PostgreSQL G5 connected
# ✓ Kafka consumer scaling resolved
# ✓ Started 6 Kafka consumer(s)
# ✓ bls-oldrcu-heartbeat-backend started
```
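
The expected output reports 6 consumers even though `KAFKA_CONSUMER_INSTANCES` is set to 3, which fits the note that the consumer count is auto-scaled to the partition count. The exact scaling rule is not given here, so the sketch below is only one plausible reading: `resolveConsumerCount` is a hypothetical helper, and the fallback to the configured value when the partition count is unknown is an assumption.

```typescript
// Hypothetical helper: decide how many consumers to start.
// Assumption: when the topic's partition count is known, it overrides the
// configured value; otherwise the configured value (at least 1) is used.
function resolveConsumerCount(
  configured: number,
  partitionCount: number | undefined,
): number {
  if (partitionCount === undefined || partitionCount <= 0) {
    return Math.max(1, configured);
  }
  return partitionCount;
}
```

Under this reading, a configured value of 3 against a 6-partition topic yields 6 consumers, one per partition.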

### 2.2 Production Build

```bash
# 1. Build
npm run build

# Output: dist/index.js (about 22 KB)

# 2. Verify the build
node dist/index.js

# 3. Deploy with Docker (optional)
docker build -t bls-rcu-heartbeat:latest .
docker run -e POSTGRES_HOST_G5=... -e KAFKA_BROKERS=... bls-rcu-heartbeat:latest
```

## 3. Monitoring and Alerting

### 3.1 Key Metrics

#### Consumption Health

```
Metric: consumption rate (msg/s)
Target: > 10,000 msg/s
Warning threshold: < 5,000 msg/s

Metric: message validity rate (%)
Target: > 95%
Warning threshold: < 80%
Normal value: 99.9%
```
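
The thresholds above can be turned into a simple status check. This is a sketch under stated assumptions: the names `Health`, `Threshold`, `ratePerSecond`, `validityPercent`, and `classify` are hypothetical, and the intermediate `"degraded"` band (between the warning threshold and the target) is an interpretation, since only the target and the warning threshold are specified.

```typescript
type Health = "ok" | "degraded" | "warning";

interface Threshold {
  target: number;  // values above this are healthy
  warning: number; // values below this raise a warning
}

// Thresholds taken from the metrics listed above.
const CONSUMPTION_RATE: Threshold = { target: 10000, warning: 5000 }; // msg/s
const VALIDITY_RATE: Threshold = { target: 95, warning: 80 }; // percent

// Messages per second over a sampling interval.
function ratePerSecond(messageCount: number, elapsedMs: number): number {
  return elapsedMs > 0 ? (messageCount * 1000) / elapsedMs : 0;
}

// Share of valid messages; an empty sample counts as fully valid (assumption).
function validityPercent(validCount: number, totalCount: number): number {
  return totalCount > 0 ? (validCount / totalCount) * 100 : 100;
}

// Higher is better for both metrics, so one classifier covers them.
function classify(value: number, threshold: Threshold): Health {
  if (value < threshold.warning) return "warning";
  if (value > threshold.target) return "ok";
  return "degraded";
}
```

For example, 60,000 messages consumed in a 5-second sample is 12,000 msg/s, which classifies as `"ok"`, while 20,000 messages in the same interval (4,000 msg/s) classifies as `"warning"`.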

#### Buffer Health

```
Metric: buffer size (entries)
Target: < 1,000 (normal operation)
Warning threshold: > 3,000 (buffer backlog)

Metric: cooldown coverage (%)
Description: percentage of keys blocked by the cooldown period
Target: > 50%
```
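
To make the two buffer metrics concrete, here is a minimal sketch of a buffer with a per-key write cooldown. It is an illustration, not the service's implementation: the `Heartbeat` shape, the `HeartbeatBuffer` class, keeping only the newest entry per key, and computing coverage per flush are all assumptions. The constructor arguments correspond to `HEARTBEAT_BUFFER_SIZE_MAX`, `HEARTBEAT_BUFFER_WINDOW_MS`, and `HEARTBEAT_WRITE_COOLDOWN_MS`.

```typescript
// Hypothetical heartbeat record; the real message structure is not shown here.
interface Heartbeat {
  key: string;
  ts: number;
}

interface FlushResult {
  toWrite: Heartbeat[]; // entries that passed the cooldown check
  coveragePct: number;  // share of keys blocked by the cooldown
}

class HeartbeatBuffer {
  private pending = new Map<string, Heartbeat>();
  private lastWrite = new Map<string, number>();
  private windowStart: number | null = null;
  private maxSize: number;
  private windowMs: number;
  private cooldownMs: number;

  constructor(maxSize: number, windowMs: number, cooldownMs: number) {
    this.maxSize = maxSize;
    this.windowMs = windowMs;
    this.cooldownMs = cooldownMs;
  }

  // Current number of buffered keys (the "buffer size" metric).
  size(): number {
    return this.pending.size;
  }

  // Keep only the newest heartbeat per key; returns true when a flush is due
  // because the buffer is full or the time window has elapsed.
  add(hb: Heartbeat, now: number): boolean {
    if (this.windowStart === null) this.windowStart = now;
    this.pending.set(hb.key, hb);
    return (
      this.pending.size >= this.maxSize ||
      now - this.windowStart >= this.windowMs
    );
  }

  // Drain the buffer, skipping keys written less than cooldownMs ago.
  flush(now: number): FlushResult {
    const toWrite: Heartbeat[] = [];
    let blocked = 0;
    this.pending.forEach((hb) => {
      const last = this.lastWrite.get(hb.key);
      if (last !== undefined && now - last < this.cooldownMs) {
        blocked += 1;
      } else {
        toWrite.push(hb);
        this.lastWrite.set(hb.key, now);
      }
    });
    const total = this.pending.size;
    this.pending.clear();
    this.windowStart = null;
    return { toWrite, coveragePct: total === 0 ? 0 : (blocked / total) * 100 };
  }
}
```

Passing `now` explicitly instead of calling `Date.now()` inside the class is a design choice that keeps the timing logic deterministic under test. A production version would also need to evict old entries from `lastWrite` so that the map does not grow without bound.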

## 4. Troubleshooting

### 4.1 Slow Consumption

**Symptom**: consumption rate < 5,000 msg/s

**Checklist**:
1. Does the number of consumers match the number of Kafka partitions?
2. Is the network connection healthy?
3. Has database writing become the bottleneck?
4. Is there network latency or jitter?

### 4.2 High Message Validation Failure Rate

**Symptom**: invalidCount > 1% of totalMessages

**Checklist**:
1. Has the Kafka message structure changed?
2. Are the validation rules too strict?
3. Is the data source sending garbage data?

### 4.3 Database Connection Failure

**Symptom**: "PostgreSQL G5 connection failed" → exit(1)

**Checklist**:
1. Are the database host and port correct?
2. Is the database host reachable over the network?
3. Are the database username and password correct?
4. Is the database online?

---

**Last revised**: 2026-03-11