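feat: update Kafka configuration and database management logic
- Add Kafka configuration items to .env.example: KAFKA_FETCH_MAX_BYTES, KAFKA_FETCH_MIN_BYTES, KAFKA_FETCH_MAX_WAIT_MS.
- Delete the room_status_sync proposal and related documents.
- Delete the fix_uint64_overflow proposal and related documents.
- Update the database manager to write data efficiently with COPY statements, replacing the batch INSERT logic.
- Implement integer overflow handling for heartbeat data, ensuring invalid data is persisted to the heartbeat_events_errors table.
- Update the processor spec so that room_status sync is triggered after heartbeat data is successfully written to the history table.
- Add a new document describing a case study of the new partitioning method.
- Archive old proposal and spec documents to keep the project tidy.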
@@ -14,7 +14,7 @@ The `room_status.room_status_moment` table is a shared table for real-time devic
* Map `version` to `agreement_ver` and `bright_g` to `bright_g`.

## Tasks
- [ ] Update `docs/room_status_moment.sql` with new columns and index.
- [ ] Update `docs/plan-room-status-sync.md` with new fields and finalized plan.
- [ ] Implement `upsertRoomStatus` in `DatabaseManager`.
- [ ] Integrate into `HeartbeatProcessor`.
- [x] Update `docs/room_status_moment.sql` with new columns and index.
- [x] Update `docs/plan-room-status-sync.md` with new fields and finalized plan.
- [x] Implement `upsertRoomStatus` in `DatabaseManager`.
- [x] Integrate into `HeartbeatProcessor`.
@@ -0,0 +1,9 @@
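## MODIFIED Requirements

### Requirement: Database table schema management
The system MUST support coordinated writes to the room_status real-time status table and the heartbeat history table.

#### Scenario: room_status schema and constraint support
- **WHEN** the heartbeat service performs a room_status upsert sync
- **THEN** the target table should have a unique constraint on (hotel_id, room_id, device_id) to support UPSERT
- **AND** the fields required for sync should exist and be writable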
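
The scenario above relies on a unique constraint to make the upsert well defined. A minimal sketch of how such a statement could be built is shown here; the helper name `buildRoomStatusUpsert` and the set of synced columns are assumptions for illustration, and only the conflict target (hotel_id, room_id, device_id) and the table name `room_status.room_status_moment` come from this change.

```javascript
// Hypothetical helper: builds a parameterized UPSERT for room_status.room_status_moment.
// Column names are interpolated into the SQL text, so they must come from a fixed
// allow-list in real code, never from message payloads.
const CONFLICT_KEYS = ['hotel_id', 'room_id', 'device_id'];

function buildRoomStatusUpsert(row) {
  const columns = Object.keys(row);
  for (const key of CONFLICT_KEYS) {
    if (!columns.includes(key)) throw new Error(`missing conflict key: ${key}`);
  }
  const placeholders = columns.map((_, i) => `$${i + 1}`);
  // Every non-key column is overwritten with the incoming value on conflict.
  const updates = columns
    .filter((c) => !CONFLICT_KEYS.includes(c))
    .map((c) => `${c} = EXCLUDED.${c}`);
  const action = updates.length > 0 ? `DO UPDATE SET ${updates.join(', ')}` : 'DO NOTHING';
  const text =
    `INSERT INTO room_status.room_status_moment (${columns.join(', ')}) ` +
    `VALUES (${placeholders.join(', ')}) ` +
    `ON CONFLICT (${CONFLICT_KEYS.join(', ')}) ${action}`;
  return { text, values: columns.map((c) => row[c]) };
}
```

Without the unique constraint, PostgreSQL rejects the `ON CONFLICT (hotel_id, room_id, device_id)` clause, which is why the scenario lists the constraint as a precondition.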
@@ -0,0 +1,9 @@
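## MODIFIED Requirements

### Requirement: Heartbeat data transformation
The system MUST trigger room_status sync after a successful write to the history table.

#### Scenario: Sync the status table after a successful history write
- **WHEN** a batch of heartbeat data is successfully written to the partitioned history table
- **THEN** the system should invoke the room_status upsert sync logic
- **AND** a sync failure should not block the main history write flow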
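
The ordering in this scenario can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `writeBatchThenSync` and `insertHeartbeatEvents` are hypothetical names, while `upsertRoomStatus` is the method named in this change's task list.

```javascript
// Sketch: `db` is any object exposing the two async methods used below.
async function writeBatchThenSync(db, rows, log = console) {
  // A failure here propagates to the caller: the history write is the primary path.
  await db.insertHeartbeatEvents(rows);
  try {
    // The sync only runs after the history write has succeeded.
    await db.upsertRoomStatus(rows);
    return { written: rows.length, synced: true };
  } catch (err) {
    // Sync errors are logged and swallowed so they never fail the batch.
    log.warn('room_status sync failed:', err.message);
    return { written: rows.length, synced: false };
  }
}
```

The history write sits outside the `try` block on purpose, so only sync errors are tolerated.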
@@ -0,0 +1,24 @@
# Change: Handle integer overflows, persist unprocessable data, and use COPY for extreme write performance

## Why
1. **Overflows**: Hardware devices report bitmasks and counters whose unsigned values (e.g. `uint64`) exceed PostgreSQL's signed integer ranges, causing out-of-range database errors.
2. **Missing History**: Previously, fully unprocessable rows either aborted with an error or vanished into Redis logs, causing data loss with no traceability in the database.
3. **Database Write Pressure**: The multi-row `INSERT INTO ... VALUES (...)` logic built queries with up to 30,000 parameter bindings per batch, creating heavy CPU load for query parsing, all of it handled synchronously over a single database connection.

## What Changes
- Use `pg-copy-streams` to replace batch `INSERT` with `COPY heartbeat.heartbeat_events FROM STDIN WITH (FORMAT text)`. Rows are formatted into an in-memory buffer and flushed through a stream pipeline, cutting parameter-parsing overhead by roughly 90%.
- Map arrays directly to PostgreSQL array text, avoiding the latency of SQL-side JSONB mapping, and safely escape quotes and other special characters in strings.
- Safely map oversized integers to their signed `int64`, `int32`, and `int16` two's-complement equivalents natively in JavaScript.
- Implement an upstream catch that redirects isolated PostgreSQL validation exceptions, along with JSON parsing failures, to the new `heartbeat_events_errors` table batch sink so that they are fully persisted.
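
To illustrate the two's-complement mapping listed above, here is a minimal sketch. The function names are hypothetical, and the `int32` and `int64` forms are one reasonable way to do the wrap, not necessarily the expressions used in `heartbeatProcessor.js`.

```javascript
// Wrap an unsigned 16-bit value into the signed int16 range (PostgreSQL smallint).
function toInt16(v) {
  return (v << 16) >> 16;
}

// Wrap an unsigned 32-bit value into the signed int32 range (PostgreSQL integer).
function toInt32(v) {
  return v | 0;
}

// Wrap an unsigned 64-bit value into the signed int64 range (PostgreSQL bigint).
// Accepts a BigInt or a decimal string, because values above 2^53 - 1 cannot be
// represented exactly as a JavaScript number.
function toInt64(v) {
  return BigInt.asIntN(64, BigInt(v));
}
```

The bit pattern is preserved in each case, so a reader of the table can recover the original unsigned value by reinterpreting the stored signed integer.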

## Impact
- Affected specs: `processor`
- Affected code:
  - `src/processor/heartbeatProcessor.js`
  - `src/db/databaseManager.js`
  - `package.json`

## Recorded Follow-ups (2026-03-03)
- The partition maintenance strategy is adjusted to avoid creating redundant per-partition indexes during daily partition pre-creation; parent-table indexes remain the primary source of index definitions.
- Legacy compatibility paths for accessing the old heartbeat schema are removed from the implementation scope, and the project baseline is aligned with the new partitioned schema.
- Kafka consume/write throughput tuning knobs are documented and exposed via runtime environment variables (`KAFKA_FETCH_MAX_BYTES`, `KAFKA_FETCH_MIN_BYTES`, `KAFKA_FETCH_MAX_WAIT_MS`, `KAFKA_MAX_IN_FLIGHT_MESSAGES`, `PROCESSOR_BATCH_SIZE`, `PROCESSOR_BATCH_TIMEOUT`).
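
As a rough sketch of how the tuning knobs in the last follow-up might be read at startup: the variable names come from the note above, but `loadTuning` is a hypothetical helper and every default value below is a placeholder assumption, not a value taken from the project's `.env.example`.

```javascript
// Placeholder defaults for illustration only.
const DEFAULTS = {
  KAFKA_FETCH_MAX_BYTES: 10485760,
  KAFKA_FETCH_MIN_BYTES: 1,
  KAFKA_FETCH_MAX_WAIT_MS: 500,
  KAFKA_MAX_IN_FLIGHT_MESSAGES: 5000,
  PROCESSOR_BATCH_SIZE: 1000,
  PROCESSOR_BATCH_TIMEOUT: 1000,
};

// Reads each knob from an env-like object, falling back to the default when the
// variable is unset or empty, and rejecting anything that is not a non-negative integer.
function loadTuning(env = process.env) {
  const out = {};
  for (const [name, fallback] of Object.entries(DEFAULTS)) {
    const raw = env[name];
    const parsed = raw === undefined || raw === '' ? fallback : Number(raw);
    if (!Number.isInteger(parsed) || parsed < 0) {
      throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
    }
    out[name] = parsed;
  }
  return out;
}
```

Failing fast on a malformed value keeps a typo in the environment from silently degrading throughput.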
@@ -0,0 +1,6 @@
## 1. Implementation
- [x] 1.1 Support all numeric overflow types (`int16`, `int32`, `int64`) using two's-complement conversions in JavaScript.
- [x] 1.2 Replace batch `INSERT` logic with `COPY ... FROM STDIN` streaming using `pg-copy-streams` to increase raw write throughput.
- [x] 1.3 Add a string formatter that maps arrays correctly into PostgreSQL's tab-separated COPY text format.
- [x] 1.4 Wire fallback inserts so that rows that fail completely are recorded in the dedicated `heartbeat_events_errors` table.
- [x] 1.5 Update the `openspec` specs to reflect these architectural and constraint changes.
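
Tasks 1.2 and 1.3 depend on encoding each row as one line of COPY text format. The sketch below shows one way that formatter could look; all function names are hypothetical, and it covers only numbers, strings, nulls, and flat arrays.

```javascript
// Escape a scalar for COPY text format: backslash, tab, newline, and carriage
// return must be backslash-escaped.
function escapeCopyText(s) {
  return s
    .replace(/\\/g, '\\\\')
    .replace(/\t/g, '\\t')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r');
}

// Render a flat JavaScript array as a PostgreSQL array literal, e.g. {1,NULL,"a b"}.
function toArrayLiteral(arr) {
  const items = arr.map((el) => {
    if (el === null || el === undefined) return 'NULL';
    if (typeof el === 'number' || typeof el === 'bigint') return String(el);
    // Strings are double-quoted; backslash and double quote are escaped inside.
    return `"${String(el).replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`;
  });
  return `{${items.join(',')}}`;
}

// NULL is written as \N; arrays get the array escaping first, then COPY escaping.
function encodeCopyField(value) {
  if (value === null || value === undefined) return '\\N';
  if (Array.isArray(value)) return escapeCopyText(toArrayLiteral(value));
  return escapeCopyText(String(value));
}

// One row becomes one tab-separated, newline-terminated line.
function encodeCopyRow(values) {
  return values.map(encodeCopyField).join('\t') + '\n';
}
```

With `pg-copy-streams`, lines like these would typically be written to the writable stream returned by `client.query(from('COPY heartbeat.heartbeat_events FROM STDIN WITH (FORMAT text)'))`. Note that array elements are escaped twice, once for the array literal and once for the COPY field.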
@@ -1,15 +0,0 @@
# Change: Handle integer overflows and persist unprocessable data

## Why
Hardware devices occasionally report `service_mask` or other bitmasks and counters (such as `power_state`) whose unsigned values (e.g. `uint64`, `uint32`, `uint16`) exceed PostgreSQL's signed integer ranges. This triggers out-of-range insertion errors, which fail the batch and force a fallback to individual row insertions. Previously, rows that failed on data range constraints either aborted with an error or were only logged to Redis, so fully invalid data and boundary constraint violations were lost from database history.

## What Changes
- Safely map oversized integers to their signed `int64`, `int32`, and `int16` two's-complement equivalents natively in JavaScript (e.g. `(v << 16) >> 16` for `int2`).
- Refine the loop mechanism in `databaseManager.js` so that errors caused purely by data-level constraint mismatches are not thrown during individual-row fallback.
- Extend `_emitRejectedRecord` to persist any unprocessable, validation-failing, or insert-failing raw records directly into a dedicated error table, `heartbeat_events_errors`.

## Impact
- Affected specs: `processor`
- Affected code:
  - `src/processor/heartbeatProcessor.js`
  - `src/db/databaseManager.js`
@@ -1,6 +0,0 @@
## 1. Implementation
- [x] 1.1 Update `heartbeatProcessor.js` to handle all numeric overflow types (`int16`, `int32`, `int64`) with two's complement.
- [x] 1.2 Prevent purely data-related Postgres failures from aborting the individual-row fallback within `databaseManager.js`.
- [x] 1.3 Add `insertHeartbeatEventsErrors` to `databaseManager.js` to sink rejected records.
- [x] 1.4 Wire `_emitRejectedRecord` in `heartbeatProcessor.js` to write all completely unprocessable heartbeats directly into the `heartbeat_events_errors` table.
- [x] 1.5 Update the `openspec` specs with the newly supported overflow and validation fallback states.
@@ -44,20 +44,12 @@
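- **AND** decide whether to retry based on configuration

### Requirement: Database table schema management
The system MUST provide a mechanism for defining and managing database table schemas.
The system MUST support coordinated writes to the room_status real-time status table and the heartbeat history table.

#### Scenario: Table schema initialization (high-throughput partitioned table)
- **WHEN** the system starts for the first time or the database is deployed
- **THEN** a heartbeat detail table partitioned daily by `ts_ms` should exist
- **AND** required fields should have NOT NULL constraints
- **AND** status fields should have CHECK constraints (restricting the allowed value range)
- **AND** the primary key should be a GUID (a 32-character hex string without hyphens) with a format CHECK
- **AND** the required indexes should exist (B-tree on hotel_id/power_state/guest_type/device_id; BRIN on service_mask; the expression index idx_service_mask_first_bit for first-bit queries on service_mask)

#### Scenario: Automatic partitioning
- **WHEN** data for a given day is written and that day's partition does not exist
- **THEN** the system should be able to create the corresponding daily partition automatically, or ensure that partitions are pre-created
- **AND** continuous writes should not be affected (high-throughput scenario)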
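
The daily partitioning described above can be sketched as follows. This assumes `ts_ms` holds epoch milliseconds, that the parent table uses range partitioning on it, and that days are cut at UTC midnight; the partition naming scheme `heartbeat_events_YYYYMMDD` and both function names are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the [from, to) bounds in epoch milliseconds for the UTC day containing tsMs.
function dayBounds(tsMs) {
  const from = Math.floor(tsMs / DAY_MS) * DAY_MS;
  return { from, to: from + DAY_MS };
}

// Builds idempotent DDL for that day's partition, so pre-creation jobs can
// run it repeatedly without failing when the partition already exists.
function buildCreatePartitionSql(tsMs) {
  const { from, to } = dayBounds(tsMs);
  const ymd = new Date(from).toISOString().slice(0, 10).replace(/-/g, '');
  return (
    `CREATE TABLE IF NOT EXISTS heartbeat.heartbeat_events_${ymd} ` +
    `PARTITION OF heartbeat.heartbeat_events ` +
    `FOR VALUES FROM (${from}) TO (${to})`
  );
}
```

Running this ahead of time for the next few days keeps partition creation off the write path, which is one way to satisfy the requirement that continuous writes are not affected.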
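#### Scenario: room_status schema and constraint support
- **WHEN** the heartbeat service performs a room_status upsert sync
- **THEN** the target table should have a unique constraint on (hotel_id, room_id, device_id) to support UPSERT
- **AND** the fields required for sync should exist and be writable

### Requirement: Array field storage and indexing
The system MUST support storing electricity and air-conditioner sub-device data as array columns, and MUST build element-query indexes for specified array columns.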
@@ -96,9 +88,9 @@
- **AND** common queries (hotel_id + time range) should trigger partition pruning

## ADDED Requirements
### Requirement: Partitioned table adds array columns and array-element indexes
The system SHALL add array columns to `heartbeat.heartbeat_events` for storing electricity and air-conditioner sub-device data, and provide array-element-level query indexes for specified array columns.

#### Scenario: Add array columns
- **WHEN** the database schema is deployed or upgraded
- **THEN** the table should contain elec_address, air_address, voltage, ampere, power, phase, energy, sum_energy, state, model, speed, set_temp, now_temp, solenoid_valve

#### Scenario: Array-element indexes
- **WHEN** queries need to filter on array elements of elec_address/air_address/state/model
- **THEN** the database should have GIN indexes to optimize containment queries

@@ -44,16 +44,12 @@
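- **AND** discard the data

### Requirement: Heartbeat data transformation
The system MUST be able to transform unpacked heartbeat data into the database storage format.
The system MUST trigger room_status sync after a successful write to the history table.

#### Scenario: Transform to v2 detail table fields
- **WHEN** heartbeat data passes validation
- **THEN** the system should output a data structure consistent with the v2 detail table fields
- **AND** add the necessary metadata

#### Scenario: Missing required fields
- **WHEN** heartbeat data is missing required fields
- **THEN** the system should treat it as invalid data and discard it
#### Scenario: Sync the status table after a successful history write
- **WHEN** a batch of heartbeat data is successfully written to the partitioned history table
- **THEN** the system should invoke the room_status upsert sync logic
- **AND** a sync failure should not block the main history write flow

### Requirement: Array field aggregation
The system MUST support aggregating the object arrays electricity[] and air_conditioner[] into database "column arrays" while preserving the original order.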
@@ -100,16 +96,16 @@

## ADDED Requirements
### Requirement: Aggregate array fields into column arrays
The system SHALL aggregate `electricity[]` and `air_conditioner[]`, in their original order, into column arrays in the database write structure.

#### Scenario: electricity aggregation
- **WHEN** the input contains an `electricity` array
- **THEN** the output should contain elec_address[], voltage[], ampere[], power[], phase[], energy[], sum_energy[]
- **AND** each array index corresponds one-to-one with the input array index

#### Scenario: air_conditioner aggregation
- **WHEN** the input contains an `air_conditioner` array
- **THEN** the output should contain air_address[], state[], model[], speed[], set_temp[], now_temp[], solenoid_valve[]
- **AND** each array index corresponds one-to-one with the input array index

#### Scenario: Type and missing-value handling
- **WHEN** electricity or air_conditioner is present but is not an array
- **THEN** the system should discard the message and log an error
- **WHEN** an array element field is missing or cannot be converted
- **THEN** the system should keep the array lengths aligned and write null
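
The aggregation scenarios above can be sketched as a single generic helper. The output column names come from the scenarios; the helper name `aggregateColumns` and the source field names in the column map (for example `address`) are assumptions about the payload shape, and type conversion is left out for brevity.

```javascript
// Maps each output column array to the input field it is read from.
// The input field names on the right-hand side are assumed, not confirmed.
const ELECTRICITY_COLUMNS = {
  elec_address: 'address',
  voltage: 'voltage',
  ampere: 'ampere',
  power: 'power',
  phase: 'phase',
  energy: 'energy',
  sum_energy: 'sum_energy',
};

function aggregateColumns(items, columnMap) {
  // A present-but-non-array input is an error; the caller is expected to
  // discard the message and log it.
  if (!Array.isArray(items)) {
    throw new TypeError('expected an array of sub-device objects');
  }
  const out = {};
  for (const [column, field] of Object.entries(columnMap)) {
    // One entry per input element keeps every column the same length and
    // preserves input order; missing fields become null.
    out[column] = items.map((item) => {
      const v = item == null ? undefined : item[field];
      return v === undefined ? null : v;
    });
  }
  return out;
}
```

For example, two electricity entries where the second lacks `voltage` produce `voltage: [220, null]` when the first reports 220, so index `i` in every column still refers to input element `i`.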