Feature Store

Centralized service for feature computation, storage, and serving with train/serve parity. Feast, Tecton, internal feature platforms. The substrate that prevents train/serve skew.

One feature definition. Two serving paths (training-time + inference-time). The store guarantees that the value served at inference is the same one the model trained on. Status: STUB, to be promoted to OUTLINE in Y5 Phase 40.

What this pattern is

A feature store is the centralized service for feature definition, computation, storage, and serving in ML systems. Its load-bearing property is train/serve parity: a feature defined once (e.g., “user’s average order value over the last 30 days”) is computed by the same logic whether the model is training (offline, against historical data, with point-in-time correctness) or serving (online, against a low-latency store). Feast is the canonical OSS implementation: feature definitions in code, offline store on Iceberg/Snowflake/BigQuery, online store on pgvector/Redis/DynamoDB. Internal feature platforms at Uber (Michelangelo), Airbnb (Bighead), and other large ML organizations follow the same shape.

Without a feature store, training pipelines and inference pipelines duplicate feature logic. Months later, a subtle difference appears between training (computed in Spark) and serving (computed in the application). The model performs worse online than offline, and nobody can find the discrepancy. The feature store is the discipline that eliminates this class of bug.

The pattern’s central abstraction is the feature view. A feature view names a set of features, their computation, their source data, and their freshness requirements. At training time, the view materializes as a batch join against historical data (point-in-time correct). At inference time, the view materializes as a low-latency lookup against the online store, whose values were pre-computed by a materialization pipeline. Both paths return the same values given the same inputs, because both derive from the shared definition.
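The shape of a feature view can be sketched in plain Python. This is an illustration under stated assumptions, not the API of Feast or any other product: the names FeatureView, avg_order_value_30d, get_historical, materialize, and get_online are hypothetical, the order schema is invented, and an in-memory dict stands in for the online store.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable

# Hypothetical order event: (user_id, event_time, amount).
Order = tuple[str, datetime, float]


def avg_order_value_30d(orders: list[Order], user_id: str, as_of: datetime) -> float:
    """The single shared definition: mean order amount in the 30 days before as_of."""
    start = as_of - timedelta(days=30)
    amounts = [amt for uid, ts, amt in orders if uid == user_id and start <= ts < as_of]
    return sum(amounts) / len(amounts) if amounts else 0.0


@dataclass
class FeatureView:
    name: str
    compute: Callable[[list[Order], str, datetime], float]
    # Stand-in for an online store such as Redis (assumption for this sketch).
    online: dict[str, float] = field(default_factory=dict)

    def get_historical(self, orders: list[Order], user_id: str, as_of: datetime) -> float:
        """Training path: evaluate the definition as of a past timestamp."""
        return self.compute(orders, user_id, as_of)

    def materialize(self, orders: list[Order], user_ids: list[str], now: datetime) -> None:
        """Pipeline step: pre-compute current values into the online store."""
        for uid in user_ids:
            self.online[uid] = self.compute(orders, uid, now)

    def get_online(self, user_id: str) -> float:
        """Inference path: a key lookup, with no computation at request time."""
        return self.online[user_id]


orders = [
    ("u1", datetime(2024, 1, 5), 20.0),
    ("u1", datetime(2024, 1, 20), 40.0),
    ("u1", datetime(2023, 11, 1), 500.0),  # outside the 30-day window
]
view = FeatureView("user_aov_30d", avg_order_value_30d)
now = datetime(2024, 2, 1)
view.materialize(orders, ["u1"], now)
```

Because both paths call the same compute function, view.get_online("u1") and view.get_historical(orders, "u1", now) return the same value here. The parity comes from sharing the definition, not from keeping two implementations aligned by hand.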

Feature stores also solve organizational problems, not just technical ones. Multiple teams need the same features (user tenure, order history, session behavior). Without a store, each team recomputes them, produces subtly different values, and ends up with feature sprawl. With a store, features are defined once, reused by many models, and centrally versioned. Feature discovery becomes a UI browse rather than tribal knowledge.

Concrete instances in the wild

  • Feast. OSS canonical implementation. K8s-native, supports Iceberg/Snowflake/BigQuery offline + pgvector/Redis/DynamoDB online. basecamp default.
  • Tecton. Commercial feature platform (founded by former Uber Michelangelo engineers; Feast’s creator joined later). Managed, more feature-rich.
  • Uber Michelangelo. The originating feature platform. Documented in blog posts (2017+).
  • Airbnb Bighead / Chronon. Bighead is Airbnb’s ML platform. Chronon is its open-sourced feature platform (formerly Zipline).
  • Databricks Feature Store. Databricks’ native feature store, integrated with MLflow.
  • AWS SageMaker Feature Store. AWS-managed feature store.
  • GCP Vertex AI Feature Store. GCP-managed equivalent.
  • Hopsworks. OSS feature store with focus on time-series features.
  • Featureform. OSS feature-as-code framework, sits on top of existing infrastructure.
  • Meta FBLearner Flow (historic). Meta’s internal feature platform. Public papers describe design.

Why this pattern matters

Without a feature store, ML deployment produces a specific and painful failure mode. The model trains on Spark-computed features. The application serves predictions using its own feature computation. Training and serving drift subtly — a floating-point rounding difference, a missing-value handling difference, a timezone-conversion difference. Offline evals look great. Online performance is significantly worse. Nobody can find the discrepancy because they’re looking at “the model” instead of “the features feeding the model.”
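To make the failure concrete, here is a small hypothetical illustration of two hand-written implementations of the same feature that disagree on missing-value handling and rounding. The function names and the specific choices (impute to 0 versus a -1 sentinel, rounding versus none) are invented for the example.

```python
import math
from typing import Optional


def session_minutes_training(seconds: Optional[float]) -> float:
    """Batch-pipeline version: missing values imputed to 0, result rounded to 2 places."""
    if seconds is None:
        return 0.0
    return round(seconds / 60.0, 2)


def session_minutes_serving(seconds: Optional[float]) -> float:
    """Application version written separately: missing values become -1, no rounding."""
    if seconds is None:
        return -1.0
    return seconds / 60.0


def skewed_inputs(inputs: list[Optional[float]]) -> list[Optional[float]]:
    """Return the raw inputs on which the two implementations disagree."""
    return [
        x
        for x in inputs
        if not math.isclose(
            session_minutes_training(x), session_minutes_serving(x), abs_tol=1e-9
        )
    ]


disagreements = skewed_inputs([120.0, None, 100.0])
```

The two functions agree on 120.0 but disagree on None and on 100.0, so the model sees different inputs in production than it saw in training, even though each function looks reasonable on its own.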

With a feature store, this class of bug goes away. Both paths reference the same feature definition, so both compute the same values from the same transformation. Train/serve parity holds by construction, not by hoping the two implementations stay in sync. Feature skew stops being a reason for production performance to diverge from offline evals.

The pattern also enables faster model development. New models can reuse existing features. Feature discovery is a UI browse. Feature reuse is a Python import. Iteration speed on model architecture goes up because the feature engineering burden is shared. Feature quality goes up because features get scrutinized by multiple teams (one team’s model regression triggers investigation of shared features).

Feature stores also solve operational problems. Fresh feature values (compute the user’s session length as of now) require low-latency online serving. Historical feature values (train on how the user looked six months ago) require point-in-time correct batch computation. Both requirements traditionally required separate infrastructure. Feature stores unify them behind one abstraction.
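Point-in-time correctness can be sketched with the standard library alone. The sketch below assumes each entity has a timestamp-sorted history of feature values. The function names value_as_of and point_in_time_join are hypothetical, and treating a value stamped exactly at the label time as visible (<=) is a choice made for this example; real systems may use a strict bound or add a TTL.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional

FeatureHistory = list[tuple[datetime, float]]  # sorted by timestamp


def value_as_of(history: FeatureHistory, as_of: datetime) -> Optional[float]:
    """Latest feature value with timestamp <= as_of, or None if none exists yet."""
    timestamps = [ts for ts, _ in history]
    i = bisect_right(timestamps, as_of)
    return history[i - 1][1] if i > 0 else None


def point_in_time_join(
    labels: list[tuple[str, datetime, int]],
    feature_history: dict[str, FeatureHistory],
) -> list[tuple[str, datetime, Optional[float], int]]:
    """Attach to each label row the feature value that was known at label time."""
    rows = []
    for entity_id, event_time, label in labels:
        feat = value_as_of(feature_history.get(entity_id, []), event_time)
        rows.append((entity_id, event_time, feat, label))
    return rows


history = {"u1": [(datetime(2024, 1, 1), 1.0), (datetime(2024, 3, 1), 5.0)]}
labels = [
    ("u1", datetime(2023, 12, 1), 0),  # before any feature value existed
    ("u1", datetime(2024, 2, 1), 1),   # must see 1.0, not the later 5.0
    ("u1", datetime(2024, 3, 1), 0),   # sees 5.0 under the <= convention
]
training_rows = point_in_time_join(labels, history)
```

The February label row gets the January value rather than the March one, which is what prevents future information from leaking into training data.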

For LLM workflows, the pattern shows up differently but still matters. Vector embeddings are a kind of feature. RAG retrieval context is a kind of feature. Model prompt templates are a kind of feature. Each needs versioning, discovery, and train/serve consistency. Feature-store principles apply even when the “features” don’t look like classical ML features.

The failure modes to know: feature stores become bottlenecks if online serving latency is too high (need aggressive caching and materialization); features accumulate without deprecation (thousands of unused features); teams work around the store rather than through it (needs organizational commitment); the store itself becomes critical infrastructure whose downtime breaks all serving. Each has known patterns for prevention.

Depth progression

STUB     ← you are here.
OUTLINE  Promoted when Y5 Phase 40 deploys Feast on basecamp.
DEEP     Promoted after Y5 Phase 40 — at least one model trained AND served
         via Feast, with verified train/serve parity.

Preview: what OUTLINE will answer

When Y5 Phase 40 promotes this entry to OUTLINE, it will name:

  • PROBLEM. How do you compute features once and serve them both to training (offline, point-in-time correct) and inference (online, low-latency)?
  • PRINCIPLES. One feature definition, two serving paths. Point-in-time correctness for training. Low-latency serving for inference. Shared definition guarantees parity. Feature reuse across models. Versioning and lineage per feature.
  • TRADE-OFFS. OSS self-hosted (Feast — flexible, ops burden) vs managed (Tecton, Databricks — easy, cost). Batch-only vs streaming feature computation. Framework-based (Feast declarative) vs SDK-based (Tecton Python). Materialized-online (fast, storage cost) vs computed-on-demand (slow, cheap).
  • TOOLS (time-stamped as of 2026-06): Feast (basecamp default), Tecton, Uber Michelangelo (internal), Airbnb Chronon (OSS), Databricks Feature Store, AWS SageMaker Feature Store, GCP Vertex AI Feature Store, Hopsworks, Featureform.

The DEEP promotion, after Y5 Phase 40 with a model trained and served via Feast, will add MASTERY (operating Feast on basecamp), COMPARE (Feast vs Tecton vs Databricks Feature Store), OPERATE (a specific train/serve parity verification), and CONTRIBUTE (a Feast connector or documentation improvement).
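As a rough idea of what a train/serve parity verification could look like, the sketch below compares values fetched from the offline path with values fetched from the online path for the same keys. It is not Feast code: parity_report is a hypothetical helper, the inputs are plain dicts keyed by (entity_id, feature_name), and the tolerances are arbitrary defaults.

```python
import math
from typing import Optional

Key = tuple[str, str]  # (entity_id, feature_name)
Values = dict[Key, Optional[float]]


def parity_report(
    offline: Values, online: Values, rel_tol: float = 1e-6, abs_tol: float = 1e-9
) -> dict[Key, tuple[Optional[float], Optional[float]]]:
    """Return {key: (offline_value, online_value)} for every key that disagrees.

    A key missing from one side is treated as None on that side.
    """
    mismatches = {}
    for key in offline.keys() | online.keys():
        a, b = offline.get(key), online.get(key)
        if a is None or b is None:
            if a is not b:  # exactly one side is missing or null
                mismatches[key] = (a, b)
        elif not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches[key] = (a, b)
    return mismatches


offline_values = {("u1", "aov"): 30.0, ("u2", "aov"): 12.5, ("u3", "aov"): None}
online_values = {
    ("u1", "aov"): 30.0000001,  # within tolerance
    ("u2", "aov"): 13.0,        # real mismatch
    ("u3", "aov"): None,        # null on both sides counts as agreement
    ("u4", "aov"): 1.0,         # present online only
}
report = parity_report(offline_values, online_values)
```

An empty report over a representative sample of entities is one plausible definition of "verified" parity. The sample size and tolerances are decisions for the actual promotion work.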

Canonical references

  • Feast documentation. Free at feast.dev.
  • Uber Michelangelo blog posts (2017-2020). Free at eng.uber.com.
  • Airbnb Chronon papers and blog posts. Free.
  • Chip Huyen, Designing Machine Learning Systems, chapter on feature engineering.
  • Willem Pienaar’s (Feast originator) talks on feature store patterns. Free.

Cross-references