The MemMachine Server is a FastAPI-based backend designed to orchestrate complex memory operations across episodic and semantic stores. While the Client SDK provides a high-level interface for users, the Server Implementation handles the heavy lifting of reference counting, instance caching, and cross-store search coordination.

Service Architecture

The server is built around a “Shared Spec” architecture. Both the server and the client rely on a common set of Pydantic models (found in spec.py) to ensure that data validation is consistent across the network boundary.
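As a rough illustration of the shared-spec idea, the sketch below defines one model that both sides could import, so the same validation runs when the client builds a request and when the server decodes it. It uses standard-library dataclasses as a stand-in for Pydantic to stay dependency-free. The class `ExampleAddSpec`, its fields, and the enum member values are assumptions for illustration, not the contents of spec.py.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class MemoryType(str, Enum):
    # Member values are illustrative; the real values live in spec.py.
    EPISODIC = "episodic"
    SEMANTIC = "semantic"


@dataclass(frozen=True)
class ExampleAddSpec:
    """Hypothetical request model imported by both client and server."""

    content: str
    memory_type: MemoryType
    metadata: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Validation runs wherever the model is constructed, so both
        # sides of the network boundary reject the same bad payloads.
        if not self.content.strip():
            raise ValueError("content must not be empty")
        # Coerce plain strings (as found in decoded JSON) into the enum;
        # an unknown value raises ValueError.
        object.__setattr__(self, "memory_type", MemoryType(self.memory_type))

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "ExampleAddSpec":
        return cls(**json.loads(raw))
```

A client would call `to_json()` before sending and the server would call `from_json()` on receipt; because both paths go through the same constructor, a payload that one side accepts is accepted by the other.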

Core Architectural Pillars

| Component | Responsibility |
|---|---|
| API Router | Manages RESTful endpoints, dependency injection (e.g., get_memmachine), and Prometheus metric collection. |
| Service Layer | Translates incoming API specifications (AddMemoriesSpec) into internal storage models (EpisodeEntry). |
| Memory Managers | Orchestrates the lifecycle of memory instances, ensuring that resources like vector database connections are cached and released efficiently. |
| Common Spec | Defines the “Contract” between the server and the SDK, including Enums for MemoryType and EpisodeType. |
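The caching and release behavior attributed to the Memory Managers can be pictured as a reference-counted cache: the first caller for a key creates the instance, later callers share it, and the instance is closed only when the last caller releases it. The following is a minimal sketch under that assumption. `RefCountedCache`, `factory`, and `closer` are hypothetical names, not MemMachine APIs.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class RefCountedCache(Generic[T]):
    """Caches one instance per key; closes it when the last user releases it."""

    def __init__(
        self,
        factory: Callable[[str], Awaitable[T]],
        closer: Callable[[T], Awaitable[None]],
    ) -> None:
        self._factory = factory
        self._closer = closer
        self._instances: Dict[str, T] = {}
        self._refs: Dict[str, int] = {}
        # Serializes create/close so two callers never build the same key twice.
        self._lock = asyncio.Lock()

    async def acquire(self, key: str) -> T:
        async with self._lock:
            if key not in self._instances:
                self._instances[key] = await self._factory(key)
                self._refs[key] = 0
            self._refs[key] += 1
            return self._instances[key]

    async def release(self, key: str) -> None:
        async with self._lock:
            self._refs[key] -= 1
            if self._refs[key] == 0:
                # Last reference gone: evict and free the underlying resource.
                instance = self._instances.pop(key)
                del self._refs[key]
                await self._closer(instance)
```

Each `acquire` should be paired with exactly one `release`, typically in a `try`/`finally` block, so that an exception in a request handler does not leak a reference.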

Technical Organization

The server’s logic is partitioned to mirror the two primary memory types:

Episodic Logic

Focuses on the high-frequency ingestion of conversational data. The server implementation handles:
  • Metadata Casting: Ensuring raw JSON metadata is properly typed for the vector store.
  • Context Injection: Managing the short_term_memory and long_term_memory components within a single session.
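To make the metadata-casting step concrete, here is one way raw JSON metadata could be normalized before it reaches a vector store. The rules shown (keep scalars, drop nulls, serialize nested values to JSON text) are assumptions chosen for illustration, and `cast_metadata`, `EntrySketch`, and `to_entry` are hypothetical names rather than the server's actual functions.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Union

Scalar = Union[str, int, float, bool]


def cast_metadata(raw: Dict[str, Any]) -> Dict[str, Scalar]:
    """Flatten decoded JSON metadata into scalar values (assumed store constraint)."""
    typed: Dict[str, Scalar] = {}
    for key, value in raw.items():
        if value is None:
            continue  # drop nulls rather than store an untyped value
        if isinstance(value, (str, int, float, bool)):
            typed[str(key)] = value
        else:
            # Lists and nested objects are serialized to stable JSON text.
            typed[str(key)] = json.dumps(value, sort_keys=True)
    return typed


@dataclass(frozen=True)
class EntrySketch:
    """Stand-in for an internal storage model such as EpisodeEntry."""

    content: str
    metadata: Dict[str, Scalar]


def to_entry(content: str, raw_metadata: Dict[str, Any]) -> EntrySketch:
    # Translate an API-level payload into the internal storage shape.
    return EntrySketch(content=content, metadata=cast_metadata(raw_metadata))
```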

Semantic Logic

Focuses on structured knowledge organization. The server implementation handles:
  • Set & Category Management: Creating the hierarchical structures (Sets -> Categories -> Tags) that organize long-term facts.
  • Template Application: Managing category templates for consistent knowledge extraction.
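The Sets -> Categories -> Tags hierarchy and the idea of applying a category template can be sketched with plain data classes. This is only a model of the structure described above: `SemanticSet`, `Category`, `apply_template`, and the example template are hypothetical, and the real server may represent templates differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Category:
    name: str
    tags: List[str] = field(default_factory=list)


@dataclass
class SemanticSet:
    name: str
    categories: Dict[str, Category] = field(default_factory=dict)

    def apply_template(self, template: Dict[str, List[str]]) -> None:
        """Ensure every templated category and tag exists, without duplicates."""
        for category_name, tags in template.items():
            category = self.categories.setdefault(
                category_name, Category(category_name)
            )
            for tag in tags:
                if tag not in category.tags:
                    category.tags.append(tag)


# Hypothetical template: category name -> tags that extracted facts are filed under.
PROFILE_TEMPLATE: Dict[str, List[str]] = {
    "preferences": ["food", "music"],
    "biography": ["hometown", "occupation"],
}
```

Because `apply_template` only adds what is missing, applying the same template twice leaves the set unchanged, which keeps extraction consistent across sets that share a template.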

Infrastructure Features

Health & Monitoring

The server includes built-in endpoints for container orchestration (Kubernetes/Docker):
  • /health: Returns the service status and the server version (semantic versioning).
  • /metrics: Exposes Prometheus-formatted metrics for request latency and memory ingestion counts.
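For orientation, the sketch below shows what the bodies of these two endpoints might look like, written without any web framework. The health field names and the metric name are assumptions; only the layout of the Prometheus text exposition format (`# HELP`, `# TYPE`, then labeled samples) is standard.

```python
from typing import Dict


def health_payload(version: str) -> Dict[str, str]:
    # Field names are assumptions; the real /health response may differ.
    return {"status": "ok", "version": version}


def render_counter(name: str, help_text: str, samples: Dict[str, float]) -> str:
    """Render one counter in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for memory_type, value in sorted(samples.items()):
        # One sample line per label value, e.g. name{memory_type="episodic"} 3
        lines.append(f'{name}{{memory_type="{memory_type}"}} {value}')
    return "\n".join(lines) + "\n"
```

In a real deployment a metrics client library would normally produce this output; the hand-rolled renderer is here only to show the wire format a scraper reads.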

Dependency Injection

The server uses FastAPI dependencies to manage the MemMachine core instance, giving concurrent API requests safe asynchronous access to the underlying storage engines.
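The pattern can be approximated with plain asyncio: a provider lazily builds a single core instance behind a lock, and every request handler obtains that same instance. In a FastAPI application the provider function would typically be wired into routes with `Depends`; the sketch leaves the framework out to stay self-contained. `CoreSketch` and `CoreProvider` are hypothetical, and the body of `get_memmachine` shown here is an assumption, since only the function name appears in this overview.

```python
import asyncio
from typing import List, Optional, Tuple


class CoreSketch:
    """Stand-in for the MemMachine core instance."""

    def __init__(self) -> None:
        self.calls = 0

    async def add(self, text: str) -> int:
        await asyncio.sleep(0)  # yield control, as a real storage call would
        self.calls += 1
        return self.calls


class CoreProvider:
    def __init__(self) -> None:
        self._core: Optional[CoreSketch] = None
        self._lock = asyncio.Lock()
        self.builds = 0

    async def get_memmachine(self) -> CoreSketch:
        # The lock keeps concurrent first requests from building two cores.
        async with self._lock:
            if self._core is None:
                await asyncio.sleep(0)  # simulate asynchronous startup work
                self._core = CoreSketch()
                self.builds += 1
        return self._core


async def handle_request(provider: CoreProvider, text: str) -> int:
    core = await provider.get_memmachine()
    return await core.add(text)


async def main() -> Tuple[int, List[int]]:
    provider = CoreProvider()
    results = await asyncio.gather(
        *(handle_request(provider, f"memory {i}") for i in range(5))
    )
    return provider.builds, sorted(results)
```

Running `asyncio.run(main())` drives five concurrent handlers through one shared core, which is the property the dependency is meant to guarantee.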

Next Steps

To dive deeper into the specific implementation details of each module, explore the following sections: