Top-level API for pareto-front

Pareto-front calculation functions.

pareto_collection(data, n)

Iteratively collect successive Pareto fronts until n samples have been gathered. Each pass takes the front of the rows not yet collected.

Parameters:

Name    Type                                                 Description                                              Default
data    Union[numpy.ndarray, pandas.core.frame.DataFrame]    Data on which to progressively collect Pareto fronts.    required
n       int                                                  Number of samples to collect.                            required

Returns:

Type         Description
DataFrame    A pandas DataFrame of length n.
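The procedure described above amounts to "peeling" one front after another. The following is a minimal, self-contained sketch of that idea, assuming larger values are better in every column. The names `naive_front` and `collect_fronts` are hypothetical stand-ins written for illustration; they are not the package's functions, and the brute-force dominance check is a swapped-in technique, not the package's algorithm.

```python
import numpy as np
import pandas as pd


def naive_front(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical O(n^2) helper: rows not dominated by any other row."""
    values = df.to_numpy()
    keep = []
    for row in values:
        # A row is dominated if some row is >= everywhere and > somewhere.
        dominated = np.any(
            np.all(values >= row, axis=1) & np.any(values > row, axis=1)
        )
        keep.append(not dominated)
    return df.loc[keep]


def collect_fronts(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """Hypothetical sketch: peel fronts until n rows are gathered."""
    remaining = df.copy()
    fronts = []
    collected = 0
    # Assumption for this sketch: stop early if the data runs out.
    while collected < n and len(remaining) > 0:
        front = naive_front(remaining)
        fronts.append(front)
        collected += len(front)
        remaining = remaining.drop(index=front.index)
    if not fronts:
        return df.iloc[0:0]
    # The last front may overshoot n, so trim.
    return pd.concat(fronts).head(n)


points = pd.DataFrame(
    {"x": [5, 4, 1, 3, 2], "y": [1, 4, 5, 3, 2]},
    index=["a", "b", "c", "d", "e"],
)
top3 = collect_fronts(points, 3)  # first front only
top4 = collect_fronts(points, 4)  # first front plus one row of the second
```

Here the first front is rows a, b and c; row d is dominated by b and only appears once the first front has been removed.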

Source code in pareto_front/__init__.py
def pareto_collection(data: Union[np.ndarray, pd.DataFrame], n: int) -> pd.DataFrame:
    """Iteratively collect successive Pareto fronts until n samples are gathered.

    :param data: Data on which to progressively collect Pareto fronts.
    :param n: Number of samples to collect.
    :returns: A pandas DataFrame of length `n`.
    """
    # Wrap the input so that NumPy arrays also expose the `.index` used below.
    in_consideration = pd.DataFrame(data).copy()
    samples = pd.DataFrame()

    while len(samples) < n:
        # Take the front of whatever has not been collected yet.
        pfront = pareto_front(in_consideration)
        samples = pd.concat([samples, pfront])
        # Drop the rows just collected before computing the next front.
        remaining_index = in_consideration.index.difference(pfront.index)
        in_consideration = in_consideration.loc[remaining_index]

    # The last front may overshoot `n`, so trim to exactly `n` rows.
    samples = samples.head(n)
    return samples
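The loop above removes each collected front from the remaining data with `Index.difference` followed by `.loc`. A small sketch of that bookkeeping step, using made-up labels and values, shows one side effect worth knowing: `Index.difference` sorts its result by default, so the remaining rows come back ordered by label.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({"x": [3, 1, 2]}, index=["c", "a", "b"])

# Pretend the row labelled "c" was the front found in this pass.
front = df.loc[["c"]]

# Labels present in `df` but absent from `front`, sorted by default.
remaining = df.index.difference(front.index)
rest = df.loc[remaining]
```

After this step `rest` contains the rows labelled a and b, in that order.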

pareto_indices(data)

Return the index labels of the Pareto-efficient rows of a columnar dataset. Larger values are treated as better in every column.

Inspired by: https://stackoverflow.com/questions/32791911/fast-calculation-of-pareto-front-in-python

Parameters:

Name    Type         Description                                           Default
data    DataFrame    A pandas DataFrame of shape (n_samples, n_dims).      required

Returns:

Type     Description
Index    Index labels of all samples that lie on the Pareto front, considering all dimensions.
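To make the notion of "lying on the Pareto front" concrete, here is a brute-force reference sketch, assuming larger values are better in every column. The function `naive_pareto_labels` and the column names are hypothetical and exist only for this illustration; the check is quadratic and is not how the package computes the front.

```python
import pandas as pd


def naive_pareto_labels(df: pd.DataFrame) -> pd.Index:
    """Hypothetical brute-force reference: labels of non-dominated rows."""
    labels = []
    for label, row in df.iterrows():
        others = df.drop(index=label)
        # `row` is dominated if another row is >= in all columns
        # and strictly > in at least one column.
        dominated = (
            (others >= row).all(axis=1) & (others > row).any(axis=1)
        ).any()
        if not dominated:
            labels.append(label)
    return pd.Index(labels)


scores = pd.DataFrame(
    {"speed": [9, 7, 3, 6], "accuracy": [2, 7, 9, 6]},
    index=["w", "x", "y", "z"],
)
front_labels = naive_pareto_labels(scores)
```

Row z (6, 6) is beaten by row x (7, 7) in both columns, so only w, x and y remain on the front.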

Source code in pareto_front/__init__.py
def pareto_indices(data: pd.DataFrame) -> pd.Index:
    """
    Return the index labels of the Pareto-efficient rows of a columnar dataset.

    Larger values are treated as better in every column.

    Inspired by: https://stackoverflow.com/questions/32791911/fast-calculation-of-pareto-front-in-python

    :param data: A pandas DataFrame of shape (n_samples, n_dims).
    :returns: Index labels of all samples that lie on the Pareto front, considering all dimensions.
    """
    # Sort by row sum, descending: a row can then never be dominated by a later row.
    pareto_front_indices = data.sum(axis=1).sort_values(ascending=False).index
    data = data.loc[pareto_front_indices]
    for i in range(data.shape[0]):
        n = data.shape[0]
        if i >= n:
            break
        # Rebuild the mask on every pass; `data` shrinks, so a mask carried over
        # from an earlier pass would hold stale entries for the surviving rows.
        undominated = np.ones(n, dtype=bool)
        # We use `.iloc` here b/c the sorted order of values is important.
        # A later row survives only if it beats row `i` in at least one column.
        undominated[i + 1 :] = (data.iloc[i + 1 :] > data.iloc[i]).any(axis=1)
        pareto_front_indices = pareto_front_indices[undominated]
        data = data.loc[undominated]
    return pareto_front_indices