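feat: add Kafka consumer and message processing

- Add a Kafka consumer implementation with message handling and error handling.
- Implement the OffsetTracker class for tracking message offsets.
- Add message parsing and database insertion logic that builds database rows from Kafka messages.
- Implement UDP packet parsing for the different UDP message types.
- Add Redis error queue handling with a retry mechanism.
- Implement the Redis client and integration class, with logging and heartbeat support.
- Add Zod validation schemas to ensure Kafka messages are valid.
- Add logging and metrics collection utilities for system monitoring.
- Add a UUID generation utility for unique identifiers.
- Write unit tests for the processor logic.
- Configure the Vite build tool for Node.js builds.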
2026-03-14 17:33:19 +08:00
parent d62f83b4a4
commit 677dda80b9
101 changed files with 14904 additions and 0 deletions
--- a/bls-register-backend/openspec/changes/archive/2026-02-04-fix-kafka-partition-schema/proposal.md
+++ b/bls-register-backend/openspec/changes/archive/2026-02-04-fix-kafka-partition-schema/proposal.md
@@ -0,0 +1,17 @@
+# Change: Fix Kafka Partitioning and Schema Issues
+
+## Why
+Production deployment revealed issues with data ingestion:
+1. Kafka Topic name changed to include partition suffix.
+2. Legacy data contains timestamps in seconds, which resolve to dates in the 1970s when read as milliseconds and cause partition lookup failures in PostgreSQL (which expects ms).
+3. Variable-length fields (reboot reason, status) exceeded VARCHAR(10) limits, causing crashes.
+
+## What Changes
+- **Modified Requirement**: Update Kafka Topic to `blwlog4Nodejs-rcu-onoffline-topic-0`.
+- **New Requirement**: Implement heuristic timestamp conversion (seconds -> milliseconds) for values below 100,000,000,000 (1e11).
+- **New Requirement**: Truncate specific fields to VARCHAR(255) to prevent DB rejection.
+- **Modified Requirement**: Update DB Schema to VARCHAR(255) for robustness.
+
+## Impact
+- Affected specs: `onoffline`
+- Affected code: `src/processor/index.js`, `scripts/init_db.sql`
--- a/bls-register-backend/openspec/changes/archive/2026-02-04-fix-kafka-partition-schema/specs/onoffline/spec.md
+++ b/bls-register-backend/openspec/changes/archive/2026-02-04-fix-kafka-partition-schema/specs/onoffline/spec.md
@@ -0,0 +1,25 @@
+## MODIFIED Requirements
+### Requirement: Consume and Persist
+The system SHALL consume messages from blwlog4Nodejs-rcu-onoffline-topic-0 and write them to log_platform.onoffline.onoffline_record.
+
+#### Scenario: Writing non-reboot data
+- **GIVEN** RebootReason is empty or absent
+- **WHEN** the message is processed
+- **THEN** current_status equals CurrentStatus (truncated to 255 characters)
+
+## ADDED Requirements
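+### Requirement: Field Length Limits and Truncation
+The system SHALL truncate certain variable-length fields to the maximum length the database allows (VARCHAR(255)) to prevent write failures.
+
+#### Scenario: Over-length field handling
+- **GIVEN** LauncherVersion, CurrentStatus, or RebootReason exceeds 255 characters
+- **WHEN** the message is processed
+- **THEN** the field is truncated to its first 255 characters and stored
+
+### Requirement: Automatic Timestamp Unit Detection
+The system SHALL automatically detect whether the UnixTime field is in seconds or milliseconds and normalize it to milliseconds.
+
+#### Scenario: Second-level timestamp conversion
+- **GIVEN** UnixTime < 100000000000 (read as milliseconds, earlier than about 1973)
+- **WHEN** the timestamp is parsed
+- **THEN** the value is multiplied by 1000 to convert it to milliseconds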
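
Note on the two requirements above: the following is a minimal sketch of how the timestamp heuristic and the field truncation might look. The function names `normalizeUnixTime` and `truncateField` are hypothetical and are not taken from `src/processor/index.js`; only the 100000000000 threshold and the 255-character limit come from the spec.

```javascript
// Hypothetical helpers; only the two constants below come from the spec.
const SECONDS_THRESHOLD = 100000000000; // values below this are treated as seconds
const MAX_FIELD_LENGTH = 255;           // matches VARCHAR(255)

// Returns the timestamp in milliseconds, or null when the input is unusable.
function normalizeUnixTime(value) {
  if (value === null || value === undefined || value === '') return null;
  const n = Number(value);
  if (!Number.isFinite(n)) return null;
  return n < SECONDS_THRESHOLD ? n * 1000 : n;
}

// Truncates to at most `max` characters. Array.from splits by code point,
// so a surrogate pair is never cut in half.
function truncateField(value, max = MAX_FIELD_LENGTH) {
  if (value === null || value === undefined) return null;
  const chars = Array.from(String(value));
  return chars.length > max ? chars.slice(0, max).join('') : chars.join('');
}
```

With this threshold, a seconds value such as 1700000000 becomes 1700000000000, while a value that is already in milliseconds passes through unchanged. A genuine millisecond timestamp from before March 1973 would be misread as seconds, which is the trade-off the heuristic accepts.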
--- a/bls-register-backend/openspec/changes/archive/2026-02-04-fix-kafka-partition-schema/tasks.md
+++ b/bls-register-backend/openspec/changes/archive/2026-02-04-fix-kafka-partition-schema/tasks.md
@@ -0,0 +1,6 @@
+## 1. Implementation
+- [x] Update Kafka Topic in .env and config
+- [x] Implement timestamp unit detection and conversion in processor
+- [x] Implement field truncation logic in processor
+- [x] Update database schema definition (init_db.sql) to VARCHAR(255)
+- [x] Verify data ingestion with production stream
--- a/bls-register-backend/openspec/changes/archive/2026-02-04-optimize-kafka-consumption/proposal.md
+++ b/bls-register-backend/openspec/changes/archive/2026-02-04-optimize-kafka-consumption/proposal.md
@@ -0,0 +1,18 @@
+# Change: Optimize Kafka Consumption Performance
+
+## Why
+User reports extremely slow Kafka consumption. Current implementation processes and inserts messages one-by-one, which creates a bottleneck at the database network round-trip time (RTT).
+
+## What Changes
+- **New Requirement**: Implement Batch Processing for Kafka messages.
+- **Refactor**: Decouple message parsing from insertion in `processor`.
+- **Logic**: 
+  - Accumulate messages in a buffer (e.g., 500ms or 500 items).
+  - Perform Batch Insert into PostgreSQL.
+  - Implement Row-by-Row fallback for batch failures (to isolate bad data).
+  - Handle DB connection errors with retry loop at batch level.
+
+## Impact
+- Affected specs: `onoffline`
+- Affected code: `src/index.js`, `src/processor/index.js`
+- Performance: Expected 10x-100x throughput increase.
--- a/bls-register-backend/openspec/changes/archive/2026-02-04-optimize-kafka-consumption/specs/onoffline/spec.md
+++ b/bls-register-backend/openspec/changes/archive/2026-02-04-optimize-kafka-consumption/specs/onoffline/spec.md
@@ -0,0 +1,13 @@
+## ADDED Requirements
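+### Requirement: Batched Consumption and Writes
+The system SHALL buffer Kafka messages and write them to the database in batches to improve throughput.
+
+#### Scenario: Batch write
+- **GIVEN** many messages arrive within a short period (e.g., 500 messages)
+- **WHEN** the buffer is full or the timeout elapses (e.g., 200ms)
+- **THEN** a single batch database insert is executed
+
+#### Scenario: Fallback on write failure
+- **GIVEN** a batch write fails because of a data error (not a connection error)
+- **WHEN** the exception is caught
+- **THEN** the system automatically falls back to row-by-row writes to isolate the bad data and ensure valid data is stored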
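
Note on the batching requirement above: a rough sketch of a size-or-timeout buffer with row-by-row fallback is shown below. The tasks file names a `BatchProcessor`, but its real shape is not shown in this commit excerpt, so the constructor options and the `insertBatch`, `insertRow`, `isConnectionError`, and `onBadRow` callbacks here are assumptions made for illustration.

```javascript
// Hypothetical sketch; the option names and callbacks are assumptions.
class BatchProcessor {
  constructor({ insertBatch, insertRow, isConnectionError = () => false,
                onBadRow = () => {}, onError = () => {},
                maxSize = 500, maxWaitMs = 200 }) {
    Object.assign(this, { insertBatch, insertRow, isConnectionError,
                          onBadRow, onError, maxSize, maxWaitMs });
    this.buffer = [];
    this.timer = null;
  }

  // Buffers a row; flushes when the buffer is full, otherwise arms the timer.
  async add(row) {
    this.buffer.push(row);
    if (this.buffer.length >= this.maxSize) return this.flush();
    if (!this.timer) {
      this.timer = setTimeout(() => this.flush().catch(this.onError), this.maxWaitMs);
    }
  }

  async flush() {
    if (this.timer) { clearTimeout(this.timer); this.timer = null; }
    if (this.buffer.length === 0) return;
    const rows = this.buffer;
    this.buffer = [];
    try {
      await this.insertBatch(rows);
    } catch (err) {
      if (this.isConnectionError(err)) {
        // Put the rows back so a batch-level retry loop can try again.
        this.buffer = rows.concat(this.buffer);
        throw err;
      }
      // Data error: retry one row at a time to isolate the bad rows.
      for (const row of rows) {
        try { await this.insertRow(row); } catch (rowErr) { this.onBadRow(row, rowErr); }
      }
    }
  }
}
```

The batch-level retry loop for connection errors is left to the caller here: `flush` restores the rows to the buffer and rethrows, so the caller can wait and call `flush` again.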
--- a/bls-register-backend/openspec/changes/archive/2026-02-04-optimize-kafka-consumption/tasks.md
+++ b/bls-register-backend/openspec/changes/archive/2026-02-04-optimize-kafka-consumption/tasks.md
@@ -0,0 +1,5 @@
+## 1. Implementation
+- [ ] Refactor `src/processor/index.js` to export `parseMessageToRows`
+- [ ] Implement `BatchProcessor` logic in `src/index.js`
+- [ ] Update `handleMessage` to use `BatchProcessor`
+- [ ] Verify performance improvement
--- a/bls-register-backend/openspec/changes/archive/2026-03-04-2026-03-03-refactor-partition-indexes/proposal.md
+++ b/bls-register-backend/openspec/changes/archive/2026-03-04-2026-03-03-refactor-partition-indexes/proposal.md
@@ -0,0 +1,11 @@
+# Proposal: Refactor Partition Indexes
+
+## Goal
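+Use PostgreSQL's built-in support to change the index strategy applied when daily partitions are created, so that the code no longer creates indexes on each partition individually.
+
+## Context
+Currently, after dynamically creating a child partition, `PartitionManager` implicitly issues queries that create six single-column indexes on that partition. Because we run PostgreSQL 11+ and the initialization script already creates all of the indexes on the parent partitioned table `onoffline.onoffline_record`, those parent-table indexes are applied to every child partition automatically, so there is no need to add them manually when a partition is created.
+
+## Proposed Changes
+1. In `src/db/partitionManager.js`, remove `ensurePartitionIndexes`, the method that explicitly creates indexes on child partitions, and `ensureIndexesForExistingPartitions`, the function that loops over existing child partitions to check their indexes.
+2. In the partition update flows `ensurePartitions` and `ensurePartitionsForTimestamps`, remove the calls to `ensurePartitionIndexes`.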
--- a/bls-register-backend/openspec/changes/archive/2026-03-04-2026-03-03-refactor-partition-indexes/specs/onoffline/spec.md
+++ b/bls-register-backend/openspec/changes/archive/2026-03-04-2026-03-03-refactor-partition-indexes/specs/onoffline/spec.md
@@ -0,0 +1,11 @@
+# Spec Delta: onoffline-backend
+
+## MODIFIED Requirements
+
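+### Requirement: Database Partitioning Strategy
+The system SHALL use Range Partitioning with daily partitions and automatically maintain partition tables for the next 30 days; child tables rely on PostgreSQL's native mechanism to inherit the parent table's indexes.
+
+#### Scenario: Partition pre-creation
+- **GIVEN** system startup or the daily early-morning run
+- **WHEN** the partition maintenance task runs
+- **THEN** partition tables for the next 30 days are ensured to exist in the database, with no need to explicitly create single-column indexes on the child tables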
--- a/bls-register-backend/openspec/changes/archive/2026-03-04-2026-03-03-refactor-partition-indexes/tasks.md
+++ b/bls-register-backend/openspec/changes/archive/2026-03-04-2026-03-03-refactor-partition-indexes/tasks.md
@@ -0,0 +1,6 @@
+# Tasks: Refactor Partition Indexes
+
+- [x] refactor `src/db/partitionManager.js`: remove `ensurePartitionIndexes` and `ensureIndexesForExistingPartitions`.
+- [x] refactor `src/db/partitionManager.js`: update `ensurePartitions` and `ensurePartitionsForTimestamps` to remove calls to `ensurePartitionIndexes`.
+- [x] refactor `src/db/initializer.js` (and any other occurrences) to reflect the removal.
+- [x] update openspec requirements to clarify that index propagation relies on PostgreSQL parent-table indexes.
--- a/bls-register-backend/openspec/changes/archive/2026-03-04-remove-runtime-db-provisioning/proposal.md
+++ b/bls-register-backend/openspec/changes/archive/2026-03-04-remove-runtime-db-provisioning/proposal.md
@@ -0,0 +1,14 @@
+# Change: remove runtime db provisioning
+
+## Why
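+The service currently takes on database creation, table creation, and partition maintenance at runtime, which blurs its responsibility boundary and introduces DDL risk during startup. That capability has now been moved out to the root directory `SQL_Script/`, and the move needs to be formally recorded as a spec change through OpenSpec.
+
+## What Changes
+- Remove the requirement for database initialization at service startup and for scheduled partition maintenance.
+- Remove the requirement that the service automatically create missing partitions when a write fails.
+- State explicitly that database structure and partition maintenance are handled by external scripts (`SQL_Script/`).
+- Keep the service's core responsibilities: Kafka consumption, parsing, database writes, retries, and monitoring.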
+
+## Impact
+- Affected specs: `openspec/specs/onoffline/spec.md`
+- Affected code: `src/index.js`, `src/config/config.js`, `src/db/initializer.js`, `src/db/partitionManager.js`, `scripts/init_db.sql`, `scripts/verify_partitions.js`, `../SQL_Script/*`
--- a/bls-register-backend/openspec/changes/archive/2026-03-04-remove-runtime-db-provisioning/specs/onoffline/spec.md
+++ b/bls-register-backend/openspec/changes/archive/2026-03-04-remove-runtime-db-provisioning/specs/onoffline/spec.md
@@ -0,0 +1,32 @@
+## MODIFIED Requirements
+
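+### Requirement: Database Partitioning Strategy
+The system SHALL use Range Partitioning with daily partitions; the running service itself SHALL NOT perform database creation, table creation, partition creation, or scheduled partition maintenance.
+
+#### Scenario: Service startup does not execute DDL
+- **GIVEN** the service process starts
+- **WHEN** it enters the bootstrap process
+- **THEN** it initializes only the consumption, processing, and monitoring capabilities, and does not perform database creation, table schema initialization, or partition creation
+
+#### Scenario: Partitions are maintained by external scripts
+- **GIVEN** database objects need to be created or future partitions need to be added
+- **WHEN** the external SQL/JS tools are run
+- **THEN** database creation and partition maintenance are completed through the root directory `SQL_Script/`, rather than automatically by the service at runtime
+
+### Requirement: Batched Consumption and Writes
+The system SHALL buffer Kafka messages and write them to the database in batches to improve throughput; when a write fails, the system SHALL apply connection-recovery retries and the fallback strategy, but SHALL NOT create database partitions at runtime.
+
+#### Scenario: Batch write
+- **GIVEN** many messages arrive within a short period (e.g., 500 messages)
+- **WHEN** the buffer is full or the timeout elapses (e.g., 200ms)
+- **THEN** a single batch database insert is executed
+
+#### Scenario: Fallback on write failure
+- **GIVEN** a batch write fails because of a data error (not a connection error)
+- **WHEN** the exception is caught
+- **THEN** the system automatically falls back to row-by-row writes to isolate the bad data and ensure valid data is stored
+
+#### Scenario: Handling missing-partition errors
+- **GIVEN** the database returns a missing-partition error during a write
+- **WHEN** the service handles that error
+- **THEN** the service logs the error and handles it through the existing error-handling mechanism, without creating partitions at runtime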
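
Note on the missing-partition scenario above: one possible way to recognize the error and route it without issuing DDL is sketched below. It assumes the database driver exposes the SQLSTATE as `err.code` (as node-postgres does) and that PostgreSQL reports a missing partition with SQLSTATE 23514 and a "no partition of relation" message. The names `isMissingPartitionError` and `handleWriteError`, and the `logger` and `errorQueue` parameters, are hypothetical.

```javascript
// Hypothetical sketch; function and parameter names are assumptions.
// SQLSTATE 23514 (check_violation) is also used for ordinary CHECK failures,
// so the message is inspected too. The message text depends on the server's
// lc_messages setting, which makes this check best-effort.
function isMissingPartitionError(err) {
  return Boolean(err)
    && err.code === '23514'
    && /no partition of relation/i.test(String(err.message));
}

// Logs the failure and hands the row to the existing error path.
// No partition is created here; that stays with the external scripts.
function handleWriteError(err, row, { logger, errorQueue }) {
  if (isMissingPartitionError(err)) {
    logger.error('missing partition, run the external partition script: ' + err.message);
  } else {
    logger.error('write failed: ' + err.message);
  }
  errorQueue.push({ row, reason: err.message });
}
```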
--- a/bls-register-backend/openspec/changes/archive/2026-03-04-remove-runtime-db-provisioning/tasks.md
+++ b/bls-register-backend/openspec/changes/archive/2026-03-04-remove-runtime-db-provisioning/tasks.md
@@ -0,0 +1,12 @@
+## 1. Implementation
+- [x] 1.1 Remove runtime DB initialization from bootstrap flow (`src/index.js`).
+- [x] 1.2 Remove scheduled partition maintenance job from runtime service.
+- [x] 1.3 Remove runtime missing-partition auto-fix behavior.
+- [x] 1.4 Remove legacy DB provisioning modules and scripts from service project.
+- [x] 1.5 Add external SQL/JS provisioning scripts under root `SQL_Script/` for DB/schema/partition management.
+- [x] 1.6 Update project docs to point DB provisioning to `SQL_Script/`.
+
+## 2. Validation
+- [x] 2.1 Run `npm run lint` in `bls-onoffline-backend`.
+- [x] 2.2 Run `npm run build` in `bls-onoffline-backend`.
+- [x] 2.3 Run `openspec validate remove-runtime-db-provisioning --strict`.