This page provides a high-level introduction to Feast, an open-source feature store for machine learning. It explains the core problems Feast solves, the fundamental architecture, and key abstractions that enable feature management for both training and serving. For detailed system architecture, see System Architecture. For setup instructions, see Getting Started and CLI.
Sources: README.md26-38 docs/getting-started/quickstart.md4-14
Feast (Feature Store) is a system for managing and serving machine learning features across the full ML lifecycle. It provides a unified interface to:
The central entry point is the FeatureStore class sdk/python/feast/feature_store.py106-117 which orchestrates all operations.
Sources: sdk/python/feast/feature_store.py106-267 README.md29-36
Feast addresses three critical challenges in production ML systems:
| Problem | Feast Solution |
|---|---|
| Training-Serving Skew | Point-in-time joins build training data from the feature values that were available at each event timestamp, so no future data leaks into training. The same feature definitions power both offline (training) and online (serving) paths. |
| Online Feature Availability | Materialization jobs precompute and load features into low-latency online stores (Redis, DynamoDB, PostgreSQL, etc.) at scheduled intervals. |
| Feature Reuse & Versioning | A centralized Registry stores all feature definitions. FeatureService objects version the exact features used by each model. |
The get_historical_features() method sdk/python/feast/feature_store.py1585-1733 implements point-in-time correctness, while materialize() sdk/python/feast/feature_store.py1811-1943 handles the offline-to-online data flow.
Sources: README.md32-36 docs/getting-started/quickstart.md44-59 sdk/python/feast/feature_store.py1585-1943
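The idea behind a point-in-time join can be sketched in plain Python. This is a minimal illustration, not Feast's implementation: the function name `point_in_time_join`, the tuple layouts, and the TTL handling are assumptions made for this example. For each entity row, it picks the most recent feature value whose timestamp is not later than the event timestamp.

```python
from bisect import bisect_right
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional, Tuple

FeatureRow = Tuple[int, datetime, Any]  # (entity key, feature timestamp, value)
EntityRow = Tuple[int, datetime]        # (entity key, event timestamp)


def point_in_time_join(
    entity_rows: List[EntityRow],
    feature_rows: List[FeatureRow],
    ttl: Optional[timedelta] = None,
) -> List[Tuple[int, datetime, Any]]:
    """Attach to each entity row the latest feature value known at event time."""
    # Group feature history by entity key, sorted by timestamp.
    history: Dict[int, Tuple[List[datetime], List[Any]]] = {}
    for key, ts, value in sorted(feature_rows, key=lambda row: (row[0], row[1])):
        timestamps, values = history.setdefault(key, ([], []))
        timestamps.append(ts)
        values.append(value)

    joined = []
    for key, event_ts in entity_rows:
        timestamps, values = history.get(key, ([], []))
        # Index of the last feature timestamp <= event timestamp (or -1).
        idx = bisect_right(timestamps, event_ts) - 1
        value = None
        if idx >= 0 and (ttl is None or event_ts - timestamps[idx] <= ttl):
            value = values[idx]  # never a value from the future
        joined.append((key, event_ts, value))
    return joined
```

A row whose event timestamp precedes all feature values for its entity gets `None`, as does a row whose closest value is older than the optional `ttl`.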
Component Details:
| Component | File | Purpose |
|---|---|---|
| FeatureStore | feature_store.py106 | Central orchestrator for all feature operations |
| RepoConfig | repo_config.py251 | Configuration loaded from feature_store.yaml |
| Registry | infra/registry/base_registry.py | Metadata catalog for feature definitions |
| Provider | infra/provider.py49 | Abstraction layer for infrastructure operations |
| OfflineStore | infra/offline_stores/offline_store.py | Interface for historical feature retrieval |
| OnlineStore | infra/online_stores/online_store.py35 | Interface for low-latency feature serving |
| Entity | entity.py | Primary keys used for joining features |
| FeatureView | feature_view.py | Schema and source for a set of features |
| OnDemandFeatureView | on_demand_feature_view.py | Real-time feature transformations |
| FeatureService | feature_service.py | Versioned grouping of features for a model |
Sources: sdk/python/feast/feature_store.py106-267 sdk/python/feast/repo_config.py251-605 sdk/python/feast/infra/provider.py49-67
Key Methods:
- feast apply: repo_operations.py432-459 → apply_total() registers feature definitions to the registry
- get_historical_features(): feature_store.py1585-1733 performs point-in-time joins for training data
- materialize(): feature_store.py1811-1943 copies features from offline to online stores
- get_online_features(): feature_store.py2107-2312 retrieves latest features for inference
- push(): feature_store.py2705-2948 ingests streaming features to online/offline stores

Sources: sdk/python/feast/feature_store.py1585-2948 sdk/python/feast/repo_operations.py432-459 sdk/python/feast/feature_server.py388-480
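The offline-to-online flow behind materialization can be illustrated with a small in-memory sketch. The names `materialize_sketch` and `read_online`, the row layout, and the dict-based online store are hypothetical and chosen for this example. Feast's actual `materialize()` works through providers, batch engines, and real stores.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, Optional, Tuple

OfflineRow = Tuple[int, datetime, Dict[str, Any]]     # (entity key, timestamp, features)
OnlineStore = Dict[int, Tuple[datetime, Dict[str, Any]]]


def materialize_sketch(
    offline_rows: Iterable[OfflineRow],
    online_store: OnlineStore,
    start: datetime,
    end: datetime,
) -> int:
    """Copy the latest row per entity within [start, end] into the online store."""
    latest: Dict[int, Tuple[datetime, Dict[str, Any]]] = {}
    for key, ts, features in offline_rows:
        if start <= ts <= end and (key not in latest or ts > latest[key][0]):
            latest[key] = (ts, features)

    for key, (ts, features) in latest.items():
        current = online_store.get(key)
        # Never overwrite a fresher online value with an older offline one.
        if current is None or current[0] <= ts:
            online_store[key] = (ts, features)
    return len(latest)


def read_online(online_store: OnlineStore, key: int) -> Optional[Dict[str, Any]]:
    """Return the most recently materialized features for one entity, if any."""
    entry = online_store.get(key)
    return None if entry is None else entry[1]
```

The online store keeps only one row per entity, which is what makes lookups at inference time cheap compared with scanning offline history.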
The FeatureStore class is the primary API entry point. It is initialized with a path to a feature_store.yaml configuration file:
Key responsibilities:
- RepoConfig from YAML feature_store.py155-162

Sources: sdk/python/feast/feature_store.py106-267
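As a usage sketch, the snippet below constructs a `FeatureStore` from a repository path and reads online features. The wrapper function `fetch_driver_features`, the `driver_hourly_stats` feature view, its feature names, and the `driver_id` join key are assumptions borrowed from the style of the Feast quickstart. Your repository will define its own names. The import is placed inside the function only so the module can be loaded where Feast is not installed.

```python
from typing import Dict, List


def fetch_driver_features(repo_path: str, driver_ids: List[int]) -> Dict[str, list]:
    """Read the latest online feature values for the given drivers."""
    # Lazy import: requires the `feast` package at call time.
    from feast import FeatureStore

    # repo_path is the directory that contains feature_store.yaml.
    store = FeatureStore(repo_path=repo_path)

    response = store.get_online_features(
        # "<feature view>:<feature>" references; hypothetical names here.
        features=[
            "driver_hourly_stats:conv_rate",
            "driver_hourly_stats:acc_rate",
        ],
        entity_rows=[{"driver_id": driver_id} for driver_id in driver_ids],
    )
    return response.to_dict()
```

Calling `fetch_driver_features(".", [1001, 1002])` would only succeed in a repository where those definitions have been applied and materialized.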
The Provider interface infra/provider.py49-67 abstracts infrastructure operations. The default PassthroughProvider infra/passthrough_provider.py58-90 delegates to configured stores:
| Operation | Provider Method | Delegates To |
|---|---|---|
| Historical retrieval | get_historical_features() | OfflineStore.get_historical_features() |
| Online read | online_read() | OnlineStore.online_read() |
| Materialization | materialize_single_feature_view() | BatchEngine + OnlineStore |
| Infrastructure updates | update_infra() | OnlineStore.update() |
Sources: sdk/python/feast/infra/provider.py49-426 sdk/python/feast/infra/passthrough_provider.py58-316
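The delegation pattern in the table can be sketched with in-memory stand-ins. All class names below (`InMemoryOnlineStore`, `InMemoryOfflineStore`, `PassthroughSketch`) and their simplified signatures are hypothetical. The real provider and store interfaces take richer arguments such as the repo config and feature view objects.

```python
from typing import Any, Dict, List, Optional


class InMemoryOnlineStore:
    """Stand-in online store: one dict of rows per table."""

    def __init__(self) -> None:
        self.tables: Dict[str, Dict[str, Dict[str, Any]]] = {}

    def update(self, tables_to_keep: List[str]) -> None:
        # Create any missing tables; existing data is left untouched.
        for table in tables_to_keep:
            self.tables.setdefault(table, {})

    def online_read(self, table: str, entity_keys: List[str]) -> List[Optional[Dict[str, Any]]]:
        rows = self.tables.get(table, {})
        return [rows.get(key) for key in entity_keys]


class InMemoryOfflineStore:
    """Stand-in offline store: a list of historical rows per table."""

    def __init__(self, history: Dict[str, List[Dict[str, Any]]]) -> None:
        self.history = history

    def get_historical_features(self, table: str) -> List[Dict[str, Any]]:
        return list(self.history.get(table, []))


class PassthroughSketch:
    """Provider that adds no logic of its own and forwards to the stores."""

    def __init__(self, online_store: InMemoryOnlineStore, offline_store: InMemoryOfflineStore) -> None:
        self.online_store = online_store
        self.offline_store = offline_store

    def update_infra(self, tables: List[str]) -> None:
        self.online_store.update(tables)

    def online_read(self, table: str, entity_keys: List[str]) -> List[Optional[Dict[str, Any]]]:
        return self.online_store.online_read(table, entity_keys)

    def get_historical_features(self, table: str) -> List[Dict[str, Any]]:
        return self.offline_store.get_historical_features(table)
```

Because callers only talk to the provider, either store can be replaced without changing the calling code.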
The Registry stores feature metadata and materialization history. Three implementations exist:
The registry type is configured in feature_store.yaml:
Sources: sdk/python/feast/infra/registry/registry.py sdk/python/feast/infra/registry/sql.py sdk/python/feast/repo_config.py39-44
Feast uses type-to-class mappings to resolve store implementations:
This enables declarative configuration where changing type: 'redis' to type: 'postgres' swaps implementations without code changes repo_config.py493-527
Sources: sdk/python/feast/repo_config.py89-105 sdk/python/feast/repo_config.py493-527
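A type-to-class lookup of this kind can be sketched with `importlib`. The mapping below is deliberately hypothetical and points at standard-library classes so the example runs anywhere. Feast's real mappings live in repo_config.py and point at store implementations. Treating an unmapped value as a fully qualified class path is also an assumption of this sketch.

```python
import importlib
from typing import Dict

# Hypothetical short names mapped to dotted class paths.
CLASS_PATH_FOR_TYPE: Dict[str, str] = {
    "ordered": "collections.OrderedDict",
    "counter": "collections.Counter",
}


def resolve_class(type_name: str) -> type:
    """Resolve a short type name (or a dotted class path) to a class object."""
    path = CLASS_PATH_FOR_TYPE.get(type_name, type_name)
    module_name, _, class_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"unknown store type: {type_name!r}")
    # Import lazily, so only the selected implementation is loaded.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

Lazy resolution means optional dependencies for unused implementations never need to be installed.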
Features are defined declaratively using Python objects:
Running feast apply repo_operations.py432-459 parses these definitions using parse_repo() repo_operations.py114-220 and registers them to the registry via apply_total_with_repo_instance() repo_operations.py338-394
Sources: docs/getting-started/quickstart.md144-183 sdk/python/feast/repo_operations.py114-220 sdk/python/feast/repo_operations.py338-394
Feast configuration lives in feature_store.yaml:
The RepoConfig class repo_config.py251-605 validates and loads this configuration:
| Field | Type | Purpose |
|---|---|---|
| project | str | Unique project identifier repo_config.py254-257 |
| registry | RegistryConfig | Registry connection details repo_config.py267-272 |
| provider | str | Infrastructure provider (local/gcp/aws) repo_config.py264-265 |
| online_store | OnlineStoreConfig | Online store configuration repo_config.py275-276 |
| offline_store | OfflineStoreConfig | Offline store configuration repo_config.py281-282 |
| batch_engine | BatchEngineConfig | Materialization engine repo_config.py284-285 |
Sources: sdk/python/feast/repo_config.py251-605 docs/getting-started/quickstart.md108-117
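To make the validate-and-load step concrete, here is a much simplified stand-in using a standard-library dataclass. `RepoConfigSketch` is hypothetical. It models `registry` as a plain string path, and the default store types shown are assumptions for this example. The real `RepoConfig` covers many more fields and rules.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict


@dataclass
class RepoConfigSketch:
    project: str
    registry: str  # simplified: a path instead of a RegistryConfig
    provider: str = "local"
    online_store: Dict[str, Any] = field(default_factory=lambda: {"type": "sqlite"})
    offline_store: Dict[str, Any] = field(default_factory=lambda: {"type": "dask"})

    def __post_init__(self) -> None:
        if not self.project:
            raise ValueError("project must be a non-empty string")
        for name in ("online_store", "offline_store"):
            if "type" not in getattr(self, name):
                raise ValueError(f"{name} must declare a 'type'")

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "RepoConfigSketch":
        """Build a config from parsed YAML, rejecting unknown keys early."""
        unknown = set(raw) - {f.name for f in fields(cls)}
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        return cls(**raw)
```

Failing at load time on unknown keys or missing store types surfaces configuration mistakes before any infrastructure is touched.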
Feast supports multiple deployment patterns:
- DaskOfflineStore infra/offline_stores/dask.py
- get_app() feature_server.py211-321

The feature server exposes REST endpoints:

- POST /get-online-features feature_server.py323-351
- POST /push feature_server.py388-480
- POST /materialize feature_server.py530-552

Sources: sdk/python/feast/feature_server.py211-552 infra/feast-operator/README.md1-5 docs/how-to-guides/running-feast-in-production.md8-20
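A client call to the online features endpoint might look like the following sketch, which only builds the request and does not send it. The helper name `build_online_features_request`, the base URL and port, and the feature and entity names are assumptions. The payload shape (a `features` list plus an `entities` mapping of join key to values) follows the commonly documented request format, and should be checked against the server version you run.

```python
import json
from typing import Any, Dict, List
from urllib import request


def build_online_features_request(
    base_url: str,
    features: List[str],
    entities: Dict[str, List[Any]],
) -> request.Request:
    """Build (but do not send) a POST request for /get-online-features."""
    payload = {"features": features, "entities": entities}
    return request.Request(
        url=f"{base_url.rstrip('/')}/get-online-features",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server address and feature names.
req = build_online_features_request(
    "http://localhost:6566",
    ["driver_hourly_stats:conv_rate"],
    {"driver_id": [1001, 1002]},
)
```

Passing `req` to `urllib.request.urlopen` would send it to a running feature server.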
Feast defines a canonical type system in Protocol Buffers protos/feast/types/Value.proto with bidirectional conversions to external systems:
The type_map.py module type_map.py provides conversion functions between these type systems, enabling heterogeneous data pipelines (e.g., BigQuery → Redis, Snowflake → PostgreSQL).
Sources: sdk/python/feast/type_map.py protos/feast/types/Value.proto
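The hub-and-spoke idea behind a canonical type system can be sketched as follows. Everything here is illustrative: `CanonicalType`, both lookup tables, and the SQL type names are assumptions for this example, not the contents of Value.proto or type_map.py. Each external system only needs a mapping to and from the canonical types, rather than one mapping per pair of systems.

```python
from enum import Enum
from typing import Any, Dict


class CanonicalType(Enum):
    """Hypothetical canonical types acting as the hub."""
    INT64 = "INT64"
    DOUBLE = "DOUBLE"
    STRING = "STRING"
    BOOL = "BOOL"


# Spoke 1: Python value types -> canonical types.
FROM_PYTHON: Dict[type, CanonicalType] = {
    int: CanonicalType.INT64,
    float: CanonicalType.DOUBLE,
    str: CanonicalType.STRING,
    bool: CanonicalType.BOOL,
}

# Spoke 2: canonical types -> column types of an illustrative SQL store.
TO_SQL: Dict[CanonicalType, str] = {
    CanonicalType.INT64: "BIGINT",
    CanonicalType.DOUBLE: "DOUBLE PRECISION",
    CanonicalType.STRING: "TEXT",
    CanonicalType.BOOL: "BOOLEAN",
}


def python_value_to_sql_type(value: Any) -> str:
    """Convert via the canonical type instead of mapping systems pairwise."""
    # type() is used rather than isinstance() so that bool is not treated as int.
    canonical = FROM_PYTHON[type(value)]
    return TO_SQL[canonical]
```

Adding a new external system then means adding one table in each direction, without touching the existing ones.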
For detailed architectural information, see System Architecture.
To set up Feast and run your first feature pipeline, see Getting Started and CLI.
To contribute to Feast development, see Contributing and Development Setup.
For specific component documentation:
Sources: README.md244-251 docs/SUMMARY.md9-45