fix: preserve custom node/edge attributes during merge operations by abc-lee · Pull Request #2990 · HKUDS/LightRAG

abc-lee · 2026-04-27T13:32:31Z

Summary

Fixes a data-loss bug in which custom node/edge attributes are silently discarded during merge operations.

The Bug

_merge_nodes_then_upsert and _merge_edges_then_upsert reconstruct node_data/edge_data from scratch with only 7 hardcoded fields. This silently discards any custom attributes (e.g. brain_meta_*, community_id, or any user-extended fields) added by downstream users.

This is inconsistent with:

aedit_entity, which uses the {**node_data, **updated_data} pattern
amerge_entities, which explicitly collects all keys via _merge_attributes
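The difference between the two patterns can be sketched in a few lines. This is an illustration only, not the LightRAG code: the function names are hypothetical, and STANDARD_FIELDS is an assumed stand-in because the PR description does not list the seven hardcoded fields.

```python
# Assumed field names for illustration; the PR does not enumerate the real ones.
STANDARD_FIELDS = ("entity_id", "entity_type", "description", "source_id")


def rebuild_from_scratch(existing: dict, updates: dict) -> dict:
    """Lossy pattern: build a new dict that contains only the known fields."""
    merged = {**existing, **updates}
    return {key: merged[key] for key in STANDARD_FIELDS if key in merged}


def spread_merge(existing: dict, updates: dict) -> dict:
    """Preserving pattern: keep every existing key, let updates win on conflicts."""
    return {**existing, **updates}


existing_node = {
    "entity_id": "Apple",
    "entity_type": "organization",
    "description": "A technology company.",
    "source_id": "chunk-1",
    "community_id": 7,  # custom attribute added by a downstream user
}
updates = {"description": "A technology company based in Cupertino."}

lossy = rebuild_from_scratch(existing_node, updates)
preserved = spread_merge(existing_node, updates)

print("community_id" in lossy)      # custom attribute is gone
print("community_id" in preserved)  # custom attribute survives
```

Both functions apply the same update to description; only the first one drops community_id, which is the behavior this PR reports for the merge helpers.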

The Fix

Instead of building a new dict from scratch:

Start from the existing node/edge data dict
Update with standard fields (source_type, updated_at, chunk_id_list, etc.)
Custom attributes are preserved

This approach is minimal, targeted, and maintains backwards compatibility.
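A minimal sketch of the approach described above, under stated assumptions: merge_preserving_custom is a hypothetical helper rather than a function from the PR, and the field names (including brain_meta_origin as one instance of brain_meta_*) are made up for the example.

```python
from typing import Any, Optional


def merge_preserving_custom(existing: Optional[dict], standard_fields: dict) -> dict:
    """Start from the existing node/edge data, then overwrite the standard fields.

    Hypothetical helper: keys that are not in standard_fields are left as they were.
    """
    data: dict = dict(existing or {})  # copy, so the caller's dict is not mutated
    data.update(standard_fields)       # standard fields win on conflicts
    return data


# First insert: there is no prior data, so the result is just the standard fields.
standard: dict = {"entity_id": "Apple", "description": "A technology company."}
first_insert = merge_preserving_custom(None, standard)

# Later merge: the edge already carries a custom attribute from a downstream user.
existing_edge: dict = {"weight": 1.0, "description": "old", "brain_meta_origin": "manual"}
merged_edge = merge_preserving_custom(existing_edge, {"weight": 2.0, "description": "new"})
```

When no custom attributes exist, the output equals what a from-scratch build of the same standard fields would produce, which is the sense in which the change stays backwards compatible.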

Testing

All existing tests pass
The fix has been validated in the niu-agent downstream project

🤖 Generated with Claude Code

_merge_nodes_then_upsert and _merge_edges_then_upsert reconstruct node_data/edge_data from scratch with only 7 hardcoded fields, silently discarding any custom attributes added by downstream users. This is inconsistent with aedit_entity (which uses {**node_data, **updated_data}) and amerge_entities (which collects all keys via _merge_attributes).

Fix: start from existing node/edge dict and update with standard fields, so custom attributes (e.g. brain_meta_*, community_id) are preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

NetworkX uses Python dict keys for node identity, making node IDs case-sensitive. This causes duplicate nodes when LLM extraction returns different casing (e.g., 'Brain:Region:文档库' vs 'brain:region:文档库'). In a knowledge graph, 'Apple' and 'apple' are semantically the same entity.

Add a _normalize_node_id() static method that applies .lower() to all node IDs before they enter NetworkX. It is applied consistently across all node/edge operations: has_node, get_node, upsert_node, delete_node, has_edge, get_edge, upsert_edge, remove_edges, and the BFS entry point.

This fixes the issue at the storage layer, so all upstream paths (LLM extraction via operate.py, custom injection via ainsert_custom_kg, and direct API calls) automatically benefit without any changes.
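The normalization idea can be sketched with a plain dict standing in for the graph, since the commit message attributes the case sensitivity to dict keys. CaseInsensitiveNodeStore is a hypothetical class for illustration, not the LightRAG storage class, and only three of the listed operations are shown; merging attributes on upsert is an assumption of this sketch.

```python
class CaseInsensitiveNodeStore:
    """Dict-backed stand-in for a graph storage layer (illustration only)."""

    def __init__(self) -> None:
        self._nodes: dict = {}

    @staticmethod
    def _normalize_node_id(node_id: str) -> str:
        # Lowercase every ID before it is used as a key.
        return node_id.lower()

    def upsert_node(self, node_id: str, node_data: dict) -> None:
        key = self._normalize_node_id(node_id)
        # Assumption of this sketch: merge into any existing attributes.
        self._nodes[key] = {**self._nodes.get(key, {}), **node_data}

    def has_node(self, node_id: str) -> bool:
        return self._normalize_node_id(node_id) in self._nodes

    def get_node(self, node_id: str):
        return self._nodes.get(self._normalize_node_id(node_id))

    def node_count(self) -> int:
        return len(self._nodes)


store = CaseInsensitiveNodeStore()
store.upsert_node("Brain:Region:文档库", {"description": "first extraction"})
store.upsert_node("brain:region:文档库", {"community_id": 3})

print(store.node_count())             # both spellings map to one node
print(store.has_node("BRAIN:REGION:文档库"))
```

Because the normalization sits inside the store, callers can pass any casing and still reach the same node, which is why the commit applies it at the storage layer rather than in each upstream path.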

李磊 and others added 2 commits April 23, 2026 21:22

abc-lee wants to merge 2 commits into HKUDS:main from abc-lee:fix/merge-preserve-custom-attributes
