[GH-1160] Fix consolidation pipeline silently destroying user-defined tags #1164

o-love wants to merge 2 commits into MemMachine:main
Conversation
sscargal left a comment
The CI tests are failing with errors similar to this:
FAILED packages/server/server_tests/memmachine_server/semantic_memory/test_semantic_ingestion.py::test_user_tags_preserved_after_ingestion_and_consolidation[in_memory_semantic_storage-count_cache_episode_storage-sqlalchemy_pg_engine] - ModuleNotFoundError: No module named 'memmachine'
This is likely caused by the package refactor PR that was merged yesterday. The PR needs to use the new package layout and names.
Pull request overview
This PR fixes three compounding bugs in the semantic memory consolidation pipeline that caused user-defined tags (e.g. `bugfix`, `decision`, `progress`) to be silently destroyed and replaced by LLM-generated names (e.g. `Productivity Style`) during background consolidation, as reported in issue #1160.
Changes:
- Prompt fix: `build_consolidation_prompt()` now accepts an optional `tags` dict and injects the valid tag list into the prompt; `StructuredSemanticPrompt.consolidation_prompt` passes `tags=self.tags`.
- Serializer fix: new `_features_to_consolidation_format()` preserves `metadata.id` in the LLM input so the model can correctly populate `keep_memories`; `llm_consolidate_features` now uses this serializer.
- Tag validation fix: `_deduplicate_features()` derives `original_tag` from the input memories and overrides any LLM-invented tag on consolidated output features, logging a warning on mismatch.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `semantic_prompt_template.py` | Adds optional tags injection into the consolidation prompt |
| `semantic_model.py` | Passes `tags=self.tags` to `build_consolidation_prompt` |
| `semantic_llm.py` | Adds `_features_to_consolidation_format()` that preserves IDs; wires it into `llm_consolidate_features` |
| `semantic_ingestion.py` | Enforces the original tag from input memories in `_deduplicate_features`; warning log on LLM tag rename |
| `test_semantic_model.py` | Tests that tag names appear in the consolidation prompt |
| `test_semantic_llm.py` | Tests that the new serializer preserves IDs |
| `test_semantic_ingestion.py` | Four new tests for keep-list deletion, empty keep list, tag-rename rejection, and an end-to-end scenario |
Critical issue found: all five new `monkeypatch.setattr` calls in `test_semantic_ingestion.py` use the wrong module prefix `"memmachine.semantic_memory.semantic_ingestion"` instead of `"memmachine_server.semantic_memory.semantic_ingestion"`. Every existing test in the same file and every import across the codebase uses the `memmachine_server` package name. With the wrong path, pytest's monkeypatch cannot locate the target and will raise a `ModuleNotFoundError` at runtime (or in some configurations silently fail to patch), meaning the LLM mocks are never installed and the four new tests do not actually validate the bug fixes they were written for.
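The failure mode is easy to reproduce outside pytest: `monkeypatch.setattr` resolves a dotted-string target by importing its module part, so a wrong package prefix fails at import time, before anything is patched. A minimal stand-alone check (the helper name `can_patch_target` is ours, not pytest's):

```python
import importlib

def can_patch_target(dotted_target: str) -> bool:
    """Check the precondition monkeypatch.setattr("pkg.mod.attr", ...) relies on:
    the module part of the dotted target must be importable."""
    module_path, _, _attr = dotted_target.rpartition(".")
    try:
        importlib.import_module(module_path)
    except ModuleNotFoundError:
        return False
    return True

print(can_patch_target("os.path.join"))  # True: os.path is importable
# False on a checkout where only the memmachine_server package exists:
print(can_patch_target(
    "memmachine.semantic_memory.semantic_ingestion.llm_consolidate_features"
))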
Additionally, the warning log message in `semantic_ingestion.py` at lines 373–378 has its format arguments swapped: it prints "changed tag from 'Productivity Style' to 'bugfix'" (LLM-invented tag first), whereas the natural reading is "changed tag from 'bugfix' to 'Productivity Style'" (original tag first, then the LLM-invented one).
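A sketch of the corrected call with the arguments in the natural order. The function name and message text here are illustrative, not the actual MemMachine code:

```python
import logging

logger = logging.getLogger("semantic_ingestion")

def warn_tag_rename(original_tag: str, llm_tag: str) -> None:
    # Argument order fixed: original tag first, LLM-invented tag second.
    logger.warning(
        "Consolidation LLM changed tag from %r to %r; keeping original",
        original_tag,
        llm_tag,
    )

warn_tag_rename("bugfix", "Productivity Style")
# Logs: Consolidation LLM changed tag from 'bugfix' to 'Productivity Style'; keeping original
```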
Review comments (all since resolved) were left on:

- `packages/server/server_tests/memmachine_server/semantic_memory/test_semantic_ingestion.py` (4 comments, 3 on outdated diffs)
- `packages/server/src/memmachine_server/semantic_memory/semantic_ingestion.py` (1 comment, on an outdated diff)
Commit message:

> Three compounding bugs caused the consolidation LLM to replace user-defined tags (e.g. 'bugfix', 'progress') with invented names (e.g. 'Productivity Style'):
>
> 1. build_consolidation_prompt() never received valid tag names, so the LLM had no knowledge of which tags were allowed.
> 2. _features_to_llm_format() stripped metadata IDs before sending features to the consolidation LLM, making keep_memories unreliable.
> 3. _deduplicate_features() wrote LLM-returned tags directly to storage with no validation, allowing silent tag renaming.
>
> Fixes: pass tags to consolidation prompt, add ID-preserving serializer for consolidation, and enforce original tag on consolidated features.
Purpose of the change
Fix three compounding bugs in the semantic memory consolidation pipeline that cause user-defined tags (e.g. `bugfix`, `decision`, `progress`) to be silently deleted and replaced by LLM-generated names (e.g. `Productivity Style`) during background consolidation.

Description
Three bugs interact to produce the tag-loss behavior:
1. Consolidation prompt missing valid tags: `build_consolidation_prompt()` was called with no `tags` argument, so the LLM never knew which tags were valid. The prompt said "do not create new tag names" but never listed the allowed set. Fixed by accepting an optional `tags` parameter and injecting it into the prompt; `StructuredSemanticPrompt.consolidation_prompt` now passes `tags=self.tags`.

2. Feature IDs stripped before consolidation: `llm_consolidate_features()` used `_features_to_llm_format()`, which serializes as `{tag: {feature: value}}`, discarding `metadata.id`. The consolidation prompt asks the LLM to return `keep_memories` as a list of IDs it never received. Fixed by adding `_features_to_consolidation_format()`, which preserves IDs, and wiring it into `llm_consolidate_features()`.

3. No tag validation on consolidated output: `_deduplicate_features()` wrote `f.tag` from the LLM response directly to storage. Since consolidation groups features by tag, all input memories share one tag. Fixed by deriving the original tag from the input memories and overriding any LLM-invented tag with a warning log.

`RawSemanticPrompt` callers (writing_assistant, health, financial, CRM) provide their own hardcoded consolidation prompts and are not affected.

Fixes/Closes
Fixes #1160
Type of change
How Has This Been Tested?
Tests added to existing test files:

- `test_semantic_model.py`: `TestStructuredSemanticPrompt.test_consolidation_prompt_should_include_user_tags` verifies tag names appear in the consolidation prompt.
- `test_semantic_llm.py`: `TestConsolidationSerialization` verifies the new serializer preserves IDs and the update serializer still omits them.
- `test_semantic_ingestion.py`: 4 tests covering delete-by-keep-list behavior, empty `keep_memories`, tag rename rejection, and an end-to-end ingestion + consolidation scenario.

Test results: 1320 passed, 8 skipped, 0 failed. `ruff check` and `ruff format` clean.

Checklist
Screenshots/Gifs
N/A
Further comments
The diff is ~470 lines (above the 400-line guideline) because the three bugs are tightly coupled — splitting them into separate PRs would lose coherence and risk partial fixes that still allow tag loss.