Top-level API for pareto-front
Pareto-front calculation functions.
pareto_collection(data, n)
Progressively collect Pareto fronts, removing each front from the data before computing the next, until n samples have been gathered.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Union[numpy.ndarray, pandas.core.frame.DataFrame] | Data on which to progressively collect Pareto fronts. | required |
n | int | Number of samples to collect. | required |
Returns:
Type | Description |
---|---|
DataFrame | A pandas DataFrame of length `n`. |
Source code in pareto_front/__init__.py
def pareto_collection(data: Union[np.ndarray, pd.DataFrame], n: int) -> pd.DataFrame:
    """Progressively collect Pareto fronts until n samples have been gathered.
    :param data: Data on which to progressively collect Pareto fronts.
    :param n: Number of samples to collect.
    :returns: A pandas DataFrame of length `n`.
    """
    in_consideration = data.copy()
    samples = pd.DataFrame()
    while len(samples) < n:
        # Take the current front, then drop its rows before the next pass.
        pfront = pareto_front(in_consideration)
        samples = pd.concat([samples, pfront])
        remaining_index = in_consideration.index.difference(pfront.index)
        in_consideration = in_consideration.loc[remaining_index]
    # The last front may overshoot the target, so trim to the first n rows.
    samples = samples.head(n)
    return samples
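The layer-by-layer collection described above can be tried without the package installed. The following is a minimal sketch, not the package's implementation: `naive_front` and `collect_layers` are hypothetical names, `naive_front` is a brute-force stand-in for `pareto_front`, larger values are assumed to be better in every column, and the index is assumed to be unique.

```python
import numpy as np
import pandas as pd


def naive_front(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for pareto_front: keep rows that no other row dominates."""
    values = df.to_numpy()
    keep = []
    for row in values:
        # A row is dominated if some row is >= it everywhere and > it somewhere.
        at_least_as_good = np.all(values >= row, axis=1)
        strictly_better = np.any(values > row, axis=1)
        keep.append(not np.any(at_least_as_good & strictly_better))
    return df.loc[np.array(keep, dtype=bool)]


def collect_layers(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """Peel off successive fronts until n rows are gathered or the data runs out."""
    remaining = df.copy()
    layers = []
    collected = 0
    while collected < n and not remaining.empty:
        front = naive_front(remaining)
        layers.append(front)
        collected += len(front)
        # Assumes unique index labels, so dropping by label removes exactly the front.
        remaining = remaining.drop(index=front.index)
    return pd.concat(layers).head(n) if layers else df.head(0)


points = pd.DataFrame(
    {"x": [4, 1, 3, 2, 1], "y": [1, 4, 3, 2, 1]},
    index=["a", "b", "c", "d", "e"],
)
# First front: a, b, c. Second front: d. Third front: e.
first_four = collect_layers(points, 4)
first_two = collect_layers(points, 2)
```

Unlike the listing above, this sketch also stops when the data is exhausted, so asking for more rows than exist returns every row rather than looping.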
pareto_indices(data)
Return the Pareto efficient row subset of a columnar dataset.
Inspired by: https://stackoverflow.com/questions/32791911/fast-calculation-of-pareto-front-in-python
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | DataFrame | A pandas DataFrame of shape (n_samples, n_dims). | required |
Returns:
Type | Description |
---|---|
Index | Index labels of all samples which lie on the Pareto front considering all dimensions. |
Source code in pareto_front/__init__.py
def pareto_indices(data: pd.DataFrame) -> pd.Index:
    """
    Return the Pareto efficient row subset of a columnar dataset.
    Inspired by: https://stackoverflow.com/questions/32791911/fast-calculation-of-pareto-front-in-python
    :param data: A pandas DataFrame of shape (n_samples, n_dims).
    :returns: Index labels of all samples which lie on the Pareto front considering all dimensions.
    """
    # Sort by descending row sum: a row can then only be dominated by an earlier row.
    pareto_front_indices = data.sum(axis=1).sort_values(ascending=False).index
    data = data.loc[pareto_front_indices]
    for i in range(data.shape[0]):
        n = data.shape[0]
        if i >= n:
            break
        # Build a fresh mask on every pass so that flags computed for rows
        # already removed cannot be applied to the rows that replaced them.
        undominated = np.ones(n, dtype=bool)
        # We use `.iloc` here b/c the sorted order of values is important.
        # `axis` is passed by keyword; newer pandas rejects the positional form `.any(1)`.
        undominated[i + 1 :] = (data.iloc[i + 1 :] > data.iloc[i]).any(axis=1)
        pareto_front_indices = pareto_front_indices[undominated]
        data = data.loc[undominated]
    return pareto_front_indices
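The sort-then-filter idea behind this function can be illustrated in plain NumPy. The sketch below is an assumption-laden illustration rather than the package's code: `sweep_front` is a hypothetical name, and larger values are assumed to be better, as the `>` comparison in the listing suggests. After sorting by descending row sum, a row only needs to be compared against rows that were already accepted.

```python
import numpy as np


def sweep_front(values: np.ndarray) -> np.ndarray:
    """Return positions of the undominated rows of `values` (larger is better)."""
    # Visit rows from largest to smallest row sum; a dominating row always
    # has a row sum at least as large, so it is visited first.
    order = np.argsort(-values.sum(axis=1), kind="stable")
    kept = []  # positions (into `values`) of rows accepted so far
    for pos in order:
        row = values[pos]
        # Accept the row only if it beats every accepted row in some dimension.
        if all(np.any(row > values[k]) for k in kept):
            kept.append(int(pos))
    return np.array(kept, dtype=int)


values = np.array([[3.0, 3.0], [2.5, 2.5], [0.0, 4.0]])
front_positions = sweep_front(values)
print(front_positions.tolist())  # [0, 2]
```

Row 1 is dropped because row 0 is at least as good in both dimensions, while row 2 survives because it is better in the second dimension. As with the strict comparison in the listing, an exact duplicate of an accepted row is discarded.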