# Service Architecture
The server is built around a "Shared Spec" architecture: both the server and the client rely on a common set of Pydantic models (found in `spec.py`) to ensure that data validation is consistent across the network boundary.
## Core Architectural Pillars
| Component | Responsibility |
|---|---|
| API Router | Manages RESTful endpoints, dependency injection (e.g., `get_memmachine`), and Prometheus metric collection. |
| Service Layer | Translates incoming API specifications (`AddMemoriesSpec`) into internal storage models (`EpisodeEntry`). |
| Memory Managers | Orchestrate the lifecycle of memory instances, ensuring that resources like vector database connections are cached and released efficiently. |
| Common Spec | Defines the "contract" between the server and the SDK, including enums for `MemoryType` and `EpisodeType`. |
## Technical Organization

The server's logic is partitioned to mirror the two primary memory types.

### Episodic Logic
Focuses on the high-frequency ingestion of conversational data. The server implementation handles:

- Metadata Casting: Ensuring raw JSON metadata is properly typed for the vector store.
- Context Injection: Managing the `short_term_memory` and `long_term_memory` components within a single session.
### Semantic Logic
Focuses on structured knowledge organization. The server implementation handles:

- Set & Category Management: Creating the hierarchical structures (Sets -> Categories -> Tags) that organize long-term facts.
- Template Application: Managing category templates for consistent knowledge extraction.
## Infrastructure Features
### Health & Monitoring
The server includes built-in endpoints for container orchestration (Kubernetes/Docker):

- `/health`: Returns service status and the semantic version.
- `/metrics`: Exposes Prometheus-formatted metrics for request latency and memory ingestion counts.
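For a sense of what `/metrics` returns, Prometheus scrapes a plain-text exposition format. The metric name below is hypothetical, and a real server would use a client library such as `prometheus_client` rather than hand-rolling this:

```python
def render_counter(name: str, help_text: str, samples: dict[str, float]) -> str:
    # Emit one counter family in the Prometheus text exposition format:
    # HELP/TYPE comment lines, then one "name{labels} value" line per sample.
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        lines.append(f"{name}{{{labels}}} {value}")
    return "\n".join(lines)


body = render_counter(
    "memory_ingestion_total",  # hypothetical metric name
    "Episodes ingested, by memory type.",
    {'memory_type="episodic"': 42.0},
)
```

This is the shape orchestration tooling expects from the `/metrics` endpoint; counters like the one above let operators graph ingestion throughput per memory type.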
### Dependency Injection
The server utilizes FastAPI dependencies to manage the `MemMachine` core instance, allowing for safe asynchronous access to the underlying storage engines across concurrent API requests.

