Skip to content

Feature Stores

zenml.feature_stores special

Feature Store

Feature stores allow data teams to serve data via an offline store and an online low-latency store where data is kept in sync between the two. It also offers a centralized registry where features (and feature schemas) are stored for use within a team or wider organization.

As a data scientist working on training your model, your requirements for how you access your batch / 'offline' data will almost certainly be different from how you access that data as part of a real-time or online inference setting. Feast solves the problem of developing train-serve skew where those two sources of data diverge from each other.

base_feature_store

BaseFeatureStore (StackComponent, ABC) pydantic-model

Base class for all ZenML feature stores.

Source code in zenml/feature_stores/base_feature_store.py
class BaseFeatureStore(StackComponent, ABC):
    """Base class for all ZenML feature stores."""

    TYPE: ClassVar[StackComponentType] = StackComponentType.FEATURE_STORE
    FLAVOR: ClassVar[str]

    @abstractmethod
    def get_historical_features(
        self,
        entity_df: Union[pd.DataFrame, str],
        features: List[str],
        full_feature_names: bool = False,
    ) -> pd.DataFrame:
        """Returns the historical features for training or batch scoring.

        Args:
            entity_df: The entity dataframe or entity name.
            features: The features to retrieve.
            full_feature_names: Whether to return the full feature names.

        Returns:
            The historical features as a Pandas DataFrame.
        """

    @abstractmethod
    def get_online_features(
        self,
        entity_rows: List[Dict[str, Any]],
        features: List[str],
        full_feature_names: bool = False,
    ) -> Dict[str, Any]:
        """Returns the latest online feature data.

        Args:
            entity_rows: The entity rows to retrieve.
            features: The features to retrieve.
            full_feature_names: Whether to return the full feature names.

        Returns:
            The latest online feature data as a dictionary.
        """
get_historical_features(self, entity_df, features, full_feature_names=False)

Returns the historical features for training or batch scoring.

Parameters:

Name Type Description Default
entity_df Union[pandas.core.frame.DataFrame, str]

The entity dataframe or entity name.

required
features List[str]

The features to retrieve.

required
full_feature_names bool

Whether to return the full feature names.

False

Returns:

Type Description
DataFrame

The historical features as a Pandas DataFrame.

Source code in zenml/feature_stores/base_feature_store.py
@abstractmethod
def get_historical_features(
    self,
    entity_df: Union[pd.DataFrame, str],
    features: List[str],
    full_feature_names: bool = False,
) -> pd.DataFrame:
    """Returns the historical features for training or batch scoring.

    Args:
        entity_df: The entity dataframe or entity name.
        features: The features to retrieve.
        full_feature_names: Whether to return the full feature names.

    Returns:
        The historical features as a Pandas DataFrame.
    """
get_online_features(self, entity_rows, features, full_feature_names=False)

Returns the latest online feature data.

Parameters:

Name Type Description Default
entity_rows List[Dict[str, Any]]

The entity rows to retrieve.

required
features List[str]

The features to retrieve.

required
full_feature_names bool

Whether to return the full feature names.

False

Returns:

Type Description
Dict[str, Any]

The latest online feature data as a dictionary.

Source code in zenml/feature_stores/base_feature_store.py
@abstractmethod
def get_online_features(
    self,
    entity_rows: List[Dict[str, Any]],
    features: List[str],
    full_feature_names: bool = False,
) -> Dict[str, Any]:
    """Returns the latest online feature data.

    Args:
        entity_rows: The entity rows to retrieve.
        features: The features to retrieve.
        full_feature_names: Whether to return the full feature names.

    Returns:
        The latest online feature data as a dictionary.
    """