Server Implementation

The MemMachine Server is a FastAPI-based backend that orchestrates memory operations across episodic and semantic stores. The Client SDK gives users a high-level interface; the server handles reference counting, instance caching, and cross-store search coordination.

Service Architecture

The server is built around a “Shared Spec” architecture. Both the server and the client rely on a common set of Pydantic models (found in `spec.py`) so that data validation is consistent across the network boundary.
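As a rough illustration of the shared-spec idea, the sketch below defines one set of Pydantic models that both sides could import. Only the names `AddMemoriesSpec`, `MemoryType`, and `EpisodeType` come from this page; the Enum members, the `MemoryMessage` model, every field name, and the two helper functions are assumptions made for the example and may not match the real `spec.py`.

```python
from enum import Enum
from typing import Any, Dict, List

from pydantic import BaseModel, Field, ValidationError


class MemoryType(str, Enum):
    # Member values are assumptions; the real Enum lives in spec.py.
    EPISODIC = "episodic"
    SEMANTIC = "semantic"


class EpisodeType(str, Enum):
    MESSAGE = "message"  # hypothetical member


class MemoryMessage(BaseModel):
    """Hypothetical message shape; field names are illustrative."""

    content: str
    producer: str
    episode_type: EpisodeType = EpisodeType.MESSAGE
    metadata: Dict[str, Any] = Field(default_factory=dict)


class AddMemoriesSpec(BaseModel):
    """Request body that the client builds and the server validates."""

    types: List[MemoryType]
    messages: List[MemoryMessage]


def parse_add_memories(payload: Dict[str, Any]) -> AddMemoriesSpec:
    """Validate a decoded JSON body; raises ValidationError on bad input."""
    return AddMemoriesSpec(**payload)


def is_valid_payload(payload: Dict[str, Any]) -> bool:
    """Return True if the payload satisfies the shared contract."""
    try:
        parse_add_memories(payload)
    except ValidationError:
        return False
    return True
```

Because the client and the server would import the same classes, a payload that passes validation on one side should pass on the other, which is the consistency the shared spec is meant to provide.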

Core Architectural Pillars

| Component | Responsibility |
| --- | --- |
| API Router | Manages RESTful endpoints, dependency injection (e.g., `get_memmachine`), and Prometheus metric collection. |
| Service Layer | Translates incoming API specifications (`AddMemoriesSpec`) into internal storage models (`EpisodeEntry`). |
| Memory Managers | Orchestrate the lifecycle of memory instances, ensuring that resources such as vector database connections are cached and released efficiently. |
| Common Spec | Defines the “Contract” between the server and the SDK, including Enums for `MemoryType` and `EpisodeType`. |
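The caching and release behavior attributed to the Memory Managers can be sketched with a small reference-counted cache. This is a minimal illustration, not the server's actual manager: the class name `InstanceCache`, its `acquire`/`release` methods, and the policy of closing an instance as soon as its count reaches zero are all assumptions.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Generic, TypeVar

T = TypeVar("T")


@dataclass
class _Slot(Generic[T]):
    instance: T
    ref_count: int = 0


class InstanceCache(Generic[T]):
    """Hypothetical reference-counted cache keyed by session or set ID."""

    def __init__(
        self,
        factory: Callable[[str], Awaitable[T]],
        closer: Callable[[T], Awaitable[None]],
    ) -> None:
        self._factory = factory  # builds an instance on a cache miss
        self._closer = closer    # releases resources held by an instance
        self._slots: Dict[str, _Slot[T]] = {}
        self._lock = asyncio.Lock()

    async def acquire(self, key: str) -> T:
        """Return the cached instance for key, creating it if needed."""
        async with self._lock:
            slot = self._slots.get(key)
            if slot is None:
                slot = _Slot(await self._factory(key))
                self._slots[key] = slot
            slot.ref_count += 1
            return slot.instance

    async def release(self, key: str) -> None:
        """Drop one reference; close the instance when none remain."""
        async with self._lock:
            slot = self._slots[key]
            slot.ref_count -= 1
            if slot.ref_count == 0:
                del self._slots[key]
                await self._closer(slot.instance)
```

A real manager might keep idle instances around for a while instead of closing them immediately; evicting at zero simply keeps the sketch short.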

Technical Organization

The server’s logic is partitioned to mirror the two primary memory types:

Episodic Logic

Focuses on the high-frequency ingestion of conversational data. The server implementation handles:

Metadata Casting: Ensuring raw JSON metadata is properly typed for the vector store.
Context Injection: Managing the `short_term_memory` and `long_term_memory` components within a single session.
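Metadata casting could look something like the sketch below, which flattens decoded JSON metadata before it is attached to a storage model. It assumes the vector store only accepts scalar metadata values, which this page does not state; `cast_metadata`, `to_episode_entry`, and the fields on `EpisodeEntry` are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Union

Scalar = Union[str, int, float, bool]


def cast_metadata(raw: Dict[str, Any]) -> Dict[str, Scalar]:
    """Flatten decoded JSON metadata into scalar values.

    Assumption: the vector store only filters on scalar fields, so
    nested lists and objects are stored as canonical JSON strings.
    """
    typed: Dict[str, Scalar] = {}
    for key, value in raw.items():
        if value is None:
            continue  # drop nulls rather than store an untyped field
        if isinstance(value, (str, int, float, bool)):
            typed[key] = value
        else:
            typed[key] = json.dumps(value, sort_keys=True)
    return typed


@dataclass
class EpisodeEntry:
    """Hypothetical storage model; the real fields may differ."""

    content: str
    producer: str
    metadata: Dict[str, Scalar] = field(default_factory=dict)


def to_episode_entry(
    content: str, producer: str, raw_metadata: Dict[str, Any]
) -> EpisodeEntry:
    """Build a storage entry with metadata already cast for the store."""
    return EpisodeEntry(
        content=content,
        producer=producer,
        metadata=cast_metadata(raw_metadata),
    )
```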

Semantic Logic

Focuses on structured knowledge organization. The server implementation handles:

Set & Category Management: Creating the hierarchical structures (Sets -> Categories -> Tags) that organize long-term facts.
Template Application: Managing category templates for consistent knowledge extraction.
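The Sets -> Categories -> Tags hierarchy and the role of a category template can be pictured with plain dataclasses. This is an illustrative data model only: the class names, their fields, and the `apply_template` method are assumptions, and the template is modeled as a simple mapping from category name to tag names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Tag:
    name: str
    description: str = ""


@dataclass
class Category:
    name: str
    tags: Dict[str, Tag] = field(default_factory=dict)


@dataclass
class SemanticSet:
    """Hypothetical top-level container for long-term facts."""

    set_id: str
    categories: Dict[str, Category] = field(default_factory=dict)

    def apply_template(self, template: Dict[str, List[str]]) -> None:
        """Create any categories and tags named in the template.

        Existing categories and tags are left untouched, so applying
        the same template twice has no further effect.
        """
        for category_name, tag_names in template.items():
            category = self.categories.setdefault(
                category_name, Category(category_name)
            )
            for tag_name in tag_names:
                category.tags.setdefault(tag_name, Tag(tag_name))
```

Applying one template to every new set gives each set the same categories and tags, which is one way to keep knowledge extraction consistent.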

Infrastructure Features

Health & Monitoring

The server includes built-in endpoints for container orchestration (Kubernetes/Docker):

`/health`: Returns the service status and its semantic version.
`/metrics`: Exposes Prometheus-formatted metrics for request latency and memory ingestion counts.
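The payloads behind these two endpoints might be produced along the lines below, using the `prometheus_client` library. The metric names, the `endpoint` label, the `/memories` path, and the keys in the health payload are assumptions for illustration; the server's real metric names and response shape are not given on this page.

```python
import time

from prometheus_client import (
    CollectorRegistry,
    Counter,
    Histogram,
    generate_latest,
)

# A private registry keeps the sketch isolated from global state.
registry = CollectorRegistry()

REQUEST_LATENCY = Histogram(
    "memmachine_request_latency_seconds",  # hypothetical metric name
    "Time spent handling an API request.",
    ["endpoint"],
    registry=registry,
)
MEMORIES_INGESTED = Counter(
    "memmachine_memories_ingested",  # hypothetical metric name
    "Number of memories accepted for ingestion.",
    registry=registry,
)


def health_payload(version: str) -> dict:
    """Body a /health handler could return (keys are assumptions)."""
    return {"status": "healthy", "version": version}


def record_ingestion(count: int, started_at: float) -> None:
    """Record one ingestion request that began at started_at."""
    MEMORIES_INGESTED.inc(count)
    REQUEST_LATENCY.labels(endpoint="/memories").observe(
        time.perf_counter() - started_at
    )


def metrics_payload() -> str:
    """Body a /metrics handler could return, in Prometheus text format."""
    return generate_latest(registry).decode("utf-8")
```

A handler would call `record_ingestion` once per request, and an orchestrator's liveness or readiness probe would poll `/health`.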

Dependency Injection

The server uses FastAPI dependencies to manage the MemMachine core instance, so that concurrent API requests can safely share asynchronous access to the underlying storage engines.

Next Steps

To dive deeper into the specific implementation details of each module, explore the following sections:

Memory Types

The atomic building blocks and Enums used by the server.

Episodic Memory

Deep dive into session management and search logic.

Semantic Memory

Understanding the knowledge graph and tag management.

Memory Manager

Reference counting and instance lifecycle details.
