mlm_insights.core.metrics.regression_metrics package¶

Submodules¶

mlm_insights.core.metrics.regression_metrics.max_error module¶

class mlm_insights.core.metrics.regression_metrics.max_error.MaxError(config: Dict[str, ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', max_of_residual: float = 0.0)¶

Bases: DatasetMetricBase

MaxError computes the maximum residual error. This is a dataset-level metric.
It is an accurate metric that can process any column type, but only numerical (int, float) data types.
It falls under the regression category and is used to measure the error or performance of regression models,
that is, models that predict a numeric value.
The ground truth (target) and prediction columns must not contain any NaN values; otherwise,
InvalidTargetPredictionException is thrown.
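As a point of reference, the value described above can be reproduced outside the library with plain NumPy. The sketch below assumes the conventional definition of maximum residual error (the largest absolute difference between target and prediction). `max_error_reference` is a hypothetical helper and not part of mlm_insights, and a plain `ValueError` stands in for the library's InvalidTargetPredictionException.

```python
import numpy as np


def max_error_reference(y_true, y_pred):
    """Largest absolute residual (hypothetical helper, not the library's code)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if np.isnan(y_true).any() or np.isnan(y_pred).any():
        # Stand-in for InvalidTargetPredictionException in this sketch
        raise ValueError("target and prediction must not contain NaN")
    return float(np.max(np.abs(y_true - y_pred)))


# Same target and prediction values as the example further down this page
targets = [1, 2, 3, 4, 5]
predictions = [1.1, 2.5, 3.8, 5.1, 4.9]
result = max_error_reference(targets, predictions)
print(round(result, 6))
```

The largest residual comes from the fourth row (4 versus 5.1), which is why a value of about 1.1 is expected for this data.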

Configuration¶

None

Parameters¶

y_true: array-like of shape (n_samples,): Ground truth (correct) target values.
y_pred: array-like of shape (n_samples,): Estimated target values.

Returns¶

max_error: float: A non-negative floating point value (the best value is 0.0).

Exceptions¶

MissingRequiredParameterException
InvalidTargetPredictionException

Examples

from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.max_error import MaxError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd

def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    # All three columns belong to a single dictionary passed to pd.DataFrame
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(univariate_metric={},
                                  dataset_metrics=[MetricMetadata(klass=MaxError)])

    # Parentheses let the builder chain span several lines
    runner = (InsightsBuilder()
              .with_input_schema(input_schema)
              .with_data_frame(data_frame=data_frame)
              .with_metrics(metrics=metric_details)
              .with_engine(engine=EngineDetail(engine_name="native"))
              .build())

    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MaxError"])
    # {'value': 1.1}


if __name__ == "__main__":
    main()

Returns the standard metric result as:
{
    'metric_name': 'MaxError',
    'metric_description': 'MaxError metric computes the maximum residual error',
    'variable_count': 1,
    'variable_names': ['max_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}

compute(dataset: DataFrame, **kwargs: Any) → None¶

Computes the maximum residual error for the passed-in dataset.

Parameters¶

dataset: pd.DataFrame: DataFrame object for either the entire dataset or a partition on which a metric is being computed

classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) → MaxError¶

Create a MaxError metric using the configuration and kwargs

Parameters¶

config: Optional[Dict[str, ConfigParameter]]: Metric configuration

get_result(**kwargs: Any) → Dict[str, Any]¶: Returns the computed value of the metric

Returns¶

Dict[str, Any]: Dictionary with key as string and value as any metric property.

get_standard_metric_result(**kwargs: Any) → StandardMetricResult¶: This method returns metric output in standard format.

Returns¶

StandardMetricResult

max_of_residual: float = 0.0¶

merge(other_metric: MaxError, **kwargs: Any) → MaxError¶

Merge two MaxError metrics into one without mutating either of them.

Parameters¶

other_metric: MaxError: The other MaxError metric to be merged.

Returns¶

MaxError: A new instance of MaxError
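To illustrate why a merge like this is possible, the sketch below keeps only a running maximum (mirroring the `max_of_residual` field in the class signature) and combines two partitions by taking the larger of the two. `MaxErrorState` is a hypothetical stand-in written for this illustration; it is not the library class, and its `compute` returns a new object, whereas the library's `compute` returns None.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class MaxErrorState:
    """Hypothetical stand-in for the metric state (not the library class)."""
    max_of_residual: float = 0.0

    def compute(self, y_true, y_pred):
        residuals = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
        return MaxErrorState(max(self.max_of_residual, float(residuals.max())))

    def merge(self, other):
        # A new object is returned; neither input is mutated
        return MaxErrorState(max(self.max_of_residual, other.max_of_residual))


part_a = MaxErrorState().compute([1, 2, 3], [1.1, 2.5, 3.8])
part_b = MaxErrorState().compute([4, 5], [5.1, 4.9])
merged = part_a.merge(part_b)

# Computing over the whole dataset at once gives the same state
whole = MaxErrorState().compute([1, 2, 3, 4, 5], [1.1, 2.5, 3.8, 5.1, 4.9])
print(merged == whole)
```

Because the maximum of two maxima equals the maximum over the combined data, partitions can be processed independently and merged in any order.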

prediction_column: str = 'y_predict'¶

target_column: str = 'y_true'¶

mlm_insights.core.metrics.regression_metrics.mean_absolute_error module¶

class mlm_insights.core.metrics.regression_metrics.mean_absolute_error.MeanAbsoluteError(config: Dict[str, ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_residuals: float = 0.0)¶

Bases: DatasetMetricBase

Computes the Mean Absolute Error regression loss. This is a dataset-level metric.
It is an accurate metric that can process any column type, but only numerical (int, float) data types.
It falls under the regression category and is used to measure the error or performance of regression models,
that is, models that predict a numeric value.
The ground truth (target) and prediction columns must not contain any NaN values; otherwise,
InvalidTargetPredictionException is thrown.

Configuration¶

None

Parameters¶

y_true: array-like of shape (n_samples,) or (n_samples, n_outputs): Ground truth (correct) target values.
y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs): Estimated target values.

Returns¶

float: Mean Absolute Error

Exceptions¶

MissingRequiredParameterException
InvalidTargetPredictionException

Examples

from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.mean_absolute_error import MeanAbsoluteError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd


def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    # All three columns belong to a single dictionary passed to pd.DataFrame
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(univariate_metric={},
                                  dataset_metrics=[MetricMetadata(klass=MeanAbsoluteError)])

    # Parentheses let the builder chain span several lines
    runner = (InsightsBuilder()
              .with_input_schema(input_schema)
              .with_data_frame(data_frame=data_frame)
              .with_metrics(metrics=metric_details)
              .with_engine(engine=EngineDetail(engine_name="native"))
              .build())

    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MeanAbsoluteError"])


if __name__ == "__main__":
    main()

Returns the standard metric result as:
{
    'metric_name': 'MeanAbsoluteError',
    'metric_description': 'Computes Mean Absolute Error regression loss',
    'variable_count': 1,
    'variable_names': ['mean_absolute_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}

compute(dataset: DataFrame, **kwargs: Any) → None¶

Computes the numerator of the Mean Absolute Error for the passed-in dataset.

Parameters¶

dataset: pd.DataFrame: DataFrame object for either the entire dataset or a partition on which a metric is being computed
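The "numerator" wording suggests that the metric keeps running state (the `total_count` and `sum_of_residuals` fields in the class signature) rather than a finished value. A minimal sketch of that idea follows, assuming the residuals are accumulated as absolute differences; `accumulate_abs_residuals` is a hypothetical helper, not a library function.

```python
import numpy as np


def accumulate_abs_residuals(y_true, y_pred, total_count=0, sum_of_residuals=0.0):
    """Fold one partition into running (count, sum) state (hypothetical helper)."""
    residuals = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return total_count + int(residuals.size), sum_of_residuals + float(residuals.sum())


# Feed the example data in two partitions
count, total = accumulate_abs_residuals([1, 2, 3], [1.1, 2.5, 3.8])
count, total = accumulate_abs_residuals([4, 5], [5.1, 4.9], count, total)

# The division happens only once, when the result is requested
mean_absolute_error = total / count
print(count, round(mean_absolute_error, 6))
```

For the example data the absolute residuals are 0.1, 0.5, 0.8, 1.1 and 0.1, so this definition yields a mean of about 0.52.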

classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) → MeanAbsoluteError¶

Create a MeanAbsoluteError metric using the configuration and kwargs

Parameters¶

config: Optional[Dict[str, ConfigParameter]]: Metric configuration
features_metadata: FeatureMetadata: Contains input schema for each feature, supplied as a keyword argument

get_result(**kwargs: Any) → Dict[str, Any]¶: Returns the computed value of the metric

Returns¶

Dict[str, Any]: Dictionary with key as string and value as any metric property.

get_standard_metric_result(**kwargs: Any) → StandardMetricResult¶: This method returns metric output in standard format.

Returns¶

StandardMetricResult

merge(other_metric: MeanAbsoluteError, **kwargs: Any) → MeanAbsoluteError¶

Merge two MeanAbsoluteError metrics into one without mutating either of them.

Parameters¶

other_metric: MeanAbsoluteError: The other MeanAbsoluteError metric to be merged.

Returns¶

MeanAbsoluteError: A new instance of MeanAbsoluteError

prediction_column: str = 'y_predict'¶

sum_of_residuals: float = 0.0¶

target_column: str = 'y_true'¶

total_count: int = 0¶

mlm_insights.core.metrics.regression_metrics.mean_absolute_percentage_error module¶

class mlm_insights.core.metrics.regression_metrics.mean_absolute_percentage_error.MeanAbsolutePercentageError(config: Dict[str, ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_relative_error: float = 0.0)¶

Bases: DatasetMetricBase

Computes the Mean Absolute Percentage Error (MAPE) regression loss. This is a dataset-level metric.
It is an accurate metric that can process any column type, but only numerical (int, float) data types.
It falls under the regression category and is used to measure the error or performance of regression models,
that is, models that predict a numeric value.
The ground truth (target) and prediction columns must not contain any NaN values; otherwise,
InvalidTargetPredictionException is thrown.
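For orientation, MAPE is commonly defined as the mean of the absolute residuals divided by the absolute target values. The sketch below follows that definition with plain NumPy. `mape_reference` is a hypothetical helper, the epsilon guard against zero targets is borrowed from scikit-learn's convention, and returning a fraction rather than a value scaled by 100 is an assumption; none of this is confirmed as the library's behaviour.

```python
import numpy as np


def mape_reference(y_true, y_pred):
    """Mean of |y_true - y_pred| / |y_true| (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Avoid division by zero when a target value is exactly 0
    denominator = np.maximum(np.abs(y_true), np.finfo(float).eps)
    return float(np.mean(np.abs(y_true - y_pred) / denominator))


mape = mape_reference([1, 2, 3, 4, 5], [1.1, 2.5, 3.8, 5.1, 4.9])
print(round(mape, 4))
```

Because each residual is scaled by its own target, the same absolute error counts for more on small targets than on large ones.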

Configuration¶

None

Parameters¶

y_true: array-like of shape (n_samples,) or (n_samples, n_outputs): Ground truth (correct) target values.
y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs): Estimated target values.

Returns¶

float: Mean absolute percentage error (MAPE) regression loss

Exceptions¶

MissingRequiredParameterException
InvalidTargetPredictionException

Examples

from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.mean_absolute_percentage_error import MeanAbsolutePercentageError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd


def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    # All three columns belong to a single dictionary passed to pd.DataFrame
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(univariate_metric={},
                                  dataset_metrics=[MetricMetadata(klass=MeanAbsolutePercentageError)])

    # Parentheses let the builder chain span several lines
    runner = (InsightsBuilder()
              .with_input_schema(input_schema)
              .with_data_frame(data_frame=data_frame)
              .with_metrics(metrics=metric_details)
              .with_engine(engine=EngineDetail(engine_name="native"))
              .build())

    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MeanAbsolutePercentageError"])


if __name__ == "__main__":
    main()

Returns the standard metric result as:
{
    'metric_name': 'MeanAbsolutePercentageError',
    'metric_description': 'Mean absolute percentage error (MAPE) regression loss',
    'variable_count': 1,
    'variable_names': ['mean_absolute_percentage_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}

compute(dataset: DataFrame, **kwargs: Any) → None¶

Computes the sum of relative errors for the Mean Absolute Percentage Error for the passed-in dataset.

Parameters¶

dataset: pd.DataFrame: DataFrame object for either the entire dataset or a partition on which a metric is being computed

classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) → MeanAbsolutePercentageError¶

Create a MeanAbsolutePercentageError metric using the configuration and kwargs

Parameters¶

config: Optional[Dict[str, ConfigParameter]]: Metric configuration

get_result(**kwargs: Any) → Dict[str, Any]¶: Returns the computed value of the metric

Returns¶

Dict[str, Any]: Dictionary with key as string and value as any metric property.

get_standard_metric_result(**kwargs: Any) → StandardMetricResult¶: This method returns metric output in standard format.

Returns¶

StandardMetricResult

merge(other_metric: MeanAbsolutePercentageError, **kwargs: Any) → MeanAbsolutePercentageError¶

Merge two MeanAbsolutePercentageError metrics into one without mutating either of them.

Parameters¶

other_metric: MeanAbsolutePercentageError: The other MeanAbsolutePercentageError metric to be merged.

Returns¶

MeanAbsolutePercentageError: A new instance of MeanAbsolutePercentageError

prediction_column: str = 'y_predict'¶

sum_of_relative_error: float = 0.0¶

target_column: str = 'y_true'¶

total_count: int = 0¶

mlm_insights.core.metrics.regression_metrics.mean_squared_error module¶

class mlm_insights.core.metrics.regression_metrics.mean_squared_error.MeanSquaredError(config: Dict[str, ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_squared_residuals: float = 0.0)¶

Bases: DatasetMetricBase

Computes the Mean Squared Error regression loss. This is a dataset-level metric.
It is an accurate metric that can process any column type, but only numerical (int, float) data types.
It falls under the regression category and is used to measure the error or performance of regression models,
that is, models that predict a numeric value.
The ground truth (target) and prediction columns must not contain any NaN values; otherwise,
InvalidTargetPredictionException is thrown.

Configuration¶

None

Parameters¶

y_true: array-like of shape (n_samples,) or (n_samples, n_outputs): Ground truth (correct) target values.
y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs): Estimated target values.

Returns¶

float: Mean Squared Error

Exceptions¶

MissingRequiredParameterException
InvalidTargetPredictionException

Examples

from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.mean_squared_error import MeanSquaredError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd


def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    # All three columns belong to a single dictionary passed to pd.DataFrame
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(univariate_metric={},
                                  dataset_metrics=[MetricMetadata(klass=MeanSquaredError)])

    # Parentheses let the builder chain span several lines
    runner = (InsightsBuilder()
              .with_input_schema(input_schema)
              .with_data_frame(data_frame=data_frame)
              .with_metrics(metrics=metric_details)
              .with_engine(engine=EngineDetail(engine_name="native"))
              .build())

    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MeanSquaredError"])


if __name__ == "__main__":
    main()

Returns the standard metric result as:
{
    'metric_name': 'MeanSquaredError',
    'metric_description': 'Computes Mean Squared Error regression loss',
    'variable_count': 1,
    'variable_names': ['mean_squared_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}

compute(dataset: DataFrame, **kwargs: Any) → None¶

Computes the numerator of the Mean Squared Error for the passed-in dataset.

Parameters¶

dataset: pd.DataFrame: DataFrame object for either the entire dataset or a partition on which a metric is being computed

classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) → MeanSquaredError¶

Create a MeanSquaredError metric using the configuration and kwargs

Parameters¶

config: Optional[Dict[str, ConfigParameter]]: Metric configuration

get_result(**kwargs: Any) → Dict[str, Any]¶: Returns the computed value of the metric

Returns¶

Dict[str, Any]: Dictionary with key as string and value as any metric property.

get_standard_metric_result(**kwargs: Any) → StandardMetricResult¶: This method returns metric output in standard format.

Returns¶

StandardMetricResult

merge(other_metric: MeanSquaredError, **kwargs: Any) → MeanSquaredError¶

Merge two MeanSquaredError metrics into one without mutating either of them.

Parameters¶

other_metric: MeanSquaredError: The other MeanSquaredError metric to be merged.

Returns¶

MeanSquaredError: A new instance of MeanSquaredError
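A merge of this kind can be pictured as adding two pairs of running totals. The sketch below uses a small immutable dataclass whose fields mirror `total_count` and `sum_of_squared_residuals` from the class signature. `SquaredErrorState` and its methods are hypothetical names chosen for this illustration and do not reproduce the library's implementation.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class SquaredErrorState:
    """Hypothetical stand-in for the metric state (not the library class)."""
    total_count: int = 0
    sum_of_squared_residuals: float = 0.0

    @classmethod
    def from_partition(cls, y_true, y_pred):
        diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
        return cls(int(diff.size), float(np.sum(diff ** 2)))

    def merge(self, other):
        # Counts and sums are additive, so merging builds a new state from both
        return SquaredErrorState(
            self.total_count + other.total_count,
            self.sum_of_squared_residuals + other.sum_of_squared_residuals,
        )

    def mean_squared_error(self):
        return self.sum_of_squared_residuals / self.total_count


left = SquaredErrorState.from_partition([1, 2, 3], [1.1, 2.5, 3.8])
right = SquaredErrorState.from_partition([4, 5], [5.1, 4.9])
combined = left.merge(right)
print(combined.total_count, round(combined.mean_squared_error(), 6))
```

The frozen dataclass makes the "without mutating" property explicit: `left` and `right` are unchanged after the merge.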

prediction_column: str = 'y_predict'¶

sum_of_squared_residuals: float = 0.0¶

target_column: str = 'y_true'¶

total_count: int = 0¶

mlm_insights.core.metrics.regression_metrics.mean_squared_log_error module¶

class mlm_insights.core.metrics.regression_metrics.mean_squared_log_error.MeanSquaredLogError(config: Dict[str, ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_squared_log: float = 0.0)¶

Bases: DatasetMetricBase

Computes the Mean Squared Log Error regression loss. Use this metric only when both target and prediction values are positive.
This is a dataset-level metric. It is an accurate metric that can process any column type, but only numerical (int, float) data types.
It falls under the regression category and is used to measure the error or performance of regression models,
that is, models that predict a numeric value.
The ground truth (target) and prediction columns must not contain any NaN values; otherwise,
InvalidTargetPredictionException is thrown.
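The positivity requirement above follows from the logarithm in the formula. The sketch below assumes the common `log(1 + x)` form of the metric, as used by scikit-learn; the library's exact formula is not stated on this page. `msle_reference` is a hypothetical helper, and the `ValueError` is a stand-in for whatever the library does with negative input.

```python
import numpy as np


def msle_reference(y_true, y_pred):
    """Mean of (log1p(y_true) - log1p(y_pred))**2 (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if (y_true < 0).any() or (y_pred < 0).any():
        raise ValueError("MSLE is undefined for negative targets or predictions")
    # log1p(x) computes log(1 + x) accurately for small x
    return float(np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2))


msle = msle_reference([1, 2, 3, 4, 5], [1.1, 2.5, 3.8, 5.1, 4.9])
print(round(msle, 4))
```

Working on the log scale means the metric penalizes relative differences, so an error of 1 on a target of 2 weighs more than an error of 1 on a target of 200.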

Configuration¶

None

Parameters¶

y_true: array-like of shape (n_samples,) or (n_samples, n_outputs): Ground truth (correct) target values.
y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs): Estimated target values.

Returns¶

float: Mean Squared Log Error

Exceptions¶

MissingRequiredParameterException
InvalidTargetPredictionException

Examples

from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.mean_squared_log_error import MeanSquaredLogError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd


def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    # All three columns belong to a single dictionary passed to pd.DataFrame
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(univariate_metric={},
                                  dataset_metrics=[MetricMetadata(klass=MeanSquaredLogError)])

    # Parentheses let the builder chain span several lines
    runner = (InsightsBuilder()
              .with_input_schema(input_schema)
              .with_data_frame(data_frame=data_frame)
              .with_metrics(metrics=metric_details)
              .with_engine(engine=EngineDetail(engine_name="native"))
              .build())

    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MeanSquaredLogError"])


if __name__ == "__main__":
    main()

Returns the standard metric result as:
{
    'metric_name': 'MeanSquaredLogError',
    'metric_description': 'Computes Mean Squared Log Error regression loss',
    'variable_count': 1,
    'variable_names': ['mean_squared_log_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}

compute(dataset: DataFrame, **kwargs: Any) → None¶

Computes the numerator of the Mean Squared Log Error for the passed-in dataset.

Parameters¶

dataset: pd.DataFrame: DataFrame object for either the entire dataset or a partition on which a metric is being computed

classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) → MeanSquaredLogError¶

Create a MeanSquaredLogError metric using the configuration and kwargs

Parameters¶

config: Optional[Dict[str, ConfigParameter]]: Metric configuration

get_result(**kwargs: Any) → Dict[str, Any]¶: Returns the computed value of the metric

Returns¶

Dict[str, Any]: Dictionary with key as string and value as any metric property.

get_standard_metric_result(**kwargs: Any) → StandardMetricResult¶: This method returns metric output in standard format.

Returns¶

StandardMetricResult

merge(other_metric: MeanSquaredLogError, **kwargs: Any) → MeanSquaredLogError¶

Merge two MeanSquaredLogError metrics into one without mutating either of them.

Parameters¶

other_metric: MeanSquaredLogError: The other MeanSquaredLogError metric to be merged.

Returns¶

MeanSquaredLogError: A new instance of MeanSquaredLogError

prediction_column: str = 'y_predict'¶

sum_of_squared_log: float = 0.0¶

target_column: str = 'y_true'¶

total_count: int = 0¶

mlm_insights.core.metrics.regression_metrics.r2_score module¶

class mlm_insights.core.metrics.regression_metrics.r2_score.R2Score(config: Dict[str, ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_squared_residuals: float = 0.0)¶

Bases: DatasetMetricBase

Computes the R2 score between the target and prediction columns.
In the particular case where the target is constant, the R2 score
is not finite: it is either NaN (perfect predictions) or -Inf
(imperfect predictions). By default, these cases are replaced with 1.0 (perfect predictions) or 0.0 (imperfect predictions), respectively.
This is a dataset-level, accurate metric that can process any column type, but only numerical (int, float) data types.
It falls under the regression category and is used to measure the error or performance of regression models,
that is, models that predict a numeric value.
The ground truth (target) and prediction columns must not contain any NaN values; otherwise,
InvalidTargetPredictionException is thrown.
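The constant-target rule described above can be sketched with plain NumPy. The code assumes the usual definition of R2 as one minus the ratio of the residual sum of squares to the total sum of squares, and applies the 1.0 and 0.0 replacements stated in the text. `r2_reference` is a hypothetical helper and not the library's implementation.

```python
import numpy as np


def r2_reference(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot with the constant-target replacements."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = float(np.sum((y_true - y_pred) ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    if ss_tot == 0.0:
        # Constant target: the raw ratio would be 0/0 (NaN) or x/0 (-Inf)
        return 1.0 if ss_res == 0.0 else 0.0
    return 1.0 - ss_res / ss_tot


r2 = r2_reference([1, 2, 3, 4, 5], [1.1, 2.5, 3.8, 5.1, 4.9])
constant_perfect = r2_reference([3, 3, 3], [3, 3, 3])
constant_imperfect = r2_reference([3, 3, 3], [3, 4, 3])
print(round(r2, 3), constant_perfect, constant_imperfect)
```

The explicit branch keeps the result finite, which is convenient when the score is aggregated or plotted alongside other metrics.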

Configuration¶

None

Parameters¶

y_true: array-like of shape (n_samples,) or (n_samples, n_outputs): Ground truth (correct) target values.
y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs): Estimated target values.

Returns¶

float: R2 score

Exceptions¶

MissingRequiredParameterException
InvalidTargetPredictionException

Examples

from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.r2_score import R2Score
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd


def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    # All three columns belong to a single dictionary passed to pd.DataFrame
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(univariate_metric={},
                                  dataset_metrics=[MetricMetadata(klass=R2Score)])

    # Parentheses let the builder chain span several lines
    runner = (InsightsBuilder()
              .with_input_schema(input_schema)
              .with_data_frame(data_frame=data_frame)
              .with_metrics(metrics=metric_details)
              .with_engine(engine=EngineDetail(engine_name="native"))
              .build())

    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["R2Score"])


if __name__ == "__main__":
    main()

Returns the standard metric result as:
{
    'metric_name': 'R2Score',
    'metric_description': 'Computes R2-Score between target and prediction columns',
    'variable_count': 1,
    'variable_names': ['r2_score'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}

compute(dataset: DataFrame, **kwargs: Any) → None¶

Computes the numerator of the R2 score for the passed-in dataset.

Parameters¶

dataset: pd.DataFrame: DataFrame object for either the entire dataset or a partition on which a metric is being computed

classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) → R2Score¶

Factory Method to create an object. The configuration will be available in config.

Returns¶

MetricBase: An Instance of MetricBase.

get_required_shareable_feature_components(**kwargs: Any) → Dict[str, List[SFCMetaData]]¶: Returns the Shareable Feature Components (SFCs) that a metric requires to compute its state and values. Metrics that do not require SFCs need not override this property.

Returns¶

Dict with feature_name as key and a list of SFCMetadata as value. Each SFCMetadata must contain the klass attribute, which points to the SFC class.

get_result(**kwargs: Any) → Dict[str, Any]¶: Returns the computed value of the metric

Returns¶

Dict[str, Any]: Dictionary with key as string and value as any metric property.

get_standard_metric_result(**kwargs: Any) → StandardMetricResult¶: This method returns metric output in standard format.

Returns¶

StandardMetricResult

merge(other_metric: R2Score, **kwargs: Any) → R2Score¶

Merge two R2Score metrics into one without mutating either of them.

Parameters¶

other_metric: R2Score: The other R2Score metric to be merged.

Returns¶

R2Score: A new instance of R2Score

prediction_column: str = 'y_predict'¶

sum_of_squared_residuals: float = 0.0¶

target_column: str = 'y_true'¶

total_count: int = 0¶

mlm_insights.core.metrics.regression_metrics.root_mean_squared_error module¶

class mlm_insights.core.metrics.regression_metrics.root_mean_squared_error.RootMeanSquaredError(config: Dict[str, ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', prediction_score_column: str = 'y_score', total_count: int = 0, sum_of_squared_residuals: float = 0.0)¶

Bases: DatasetMetricBase

Computes the Root Mean Squared Error regression loss. This is a dataset-level metric.
It is an accurate metric that can process any column type, but only numerical (int, float) data types.
It falls under the regression category and is used to measure the error or performance of regression models,
that is, models that predict a numeric value.
The ground truth (target) and prediction columns must not contain any NaN values; otherwise,
InvalidTargetPredictionException is thrown.

Configuration¶

None

Parameters¶

y_true: array-like of shape (n_samples,) or (n_samples, n_outputs): Ground truth (correct) target values.
y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs): Estimated target values.

Returns¶

float: Root Mean Squared Error

Exceptions¶

MissingRequiredParameterException
InvalidTargetPredictionException

Examples

from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.root_mean_squared_error import RootMeanSquaredError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd


def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    # All three columns belong to a single dictionary passed to pd.DataFrame
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(univariate_metric={},
                                  dataset_metrics=[MetricMetadata(klass=RootMeanSquaredError)])

    # Parentheses let the builder chain span several lines
    runner = (InsightsBuilder()
              .with_input_schema(input_schema)
              .with_data_frame(data_frame=data_frame)
              .with_metrics(metrics=metric_details)
              .with_engine(engine=EngineDetail(engine_name="native"))
              .build())

    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["RootMeanSquaredError"])


if __name__ == "__main__":
    main()

Returns the standard metric result as:
{
    'metric_name': 'RootMeanSquaredError',
    'metric_description': 'Computes Root Mean Squared Error regression loss',
    'variable_count': 1,
    'variable_names': ['root_mean_squared_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}

compute(dataset: DataFrame, **kwargs: Any) → None¶

Computes the numerator of the Root Mean Squared Error for the passed-in dataset.

Parameters¶

dataset: pd.DataFrame: DataFrame object for either the entire dataset or a partition on which a metric is being computed

classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) → RootMeanSquaredError¶

Create a RootMeanSquaredError metric using the configuration and kwargs

Parameters¶

config: Optional[Dict[str, ConfigParameter]]: Metric configuration

get_result(**kwargs: Any) → Dict[str, Any]¶: Returns the computed value of the metric

Returns¶

Dict[str, Any]: Dictionary with key as string and value as any metric property.

get_standard_metric_result(**kwargs: Any) → StandardMetricResult¶: This method returns metric output in standard format.

Returns¶

StandardMetricResult

merge(other_metric: RootMeanSquaredError, **kwargs: Any) → RootMeanSquaredError¶

Merge two RootMeanSquaredError metrics into one without mutating either of them.

Parameters¶

other_metric: RootMeanSquaredError: The other RootMeanSquaredError metric to be merged.

Returns¶

RootMeanSquaredError: A new instance of RootMeanSquaredError
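One reason a metric like this would store `total_count` and `sum_of_squared_residuals` instead of a finished RMSE is that RMSE values from separate partitions cannot simply be averaged. The sketch below illustrates the difference with plain Python and NumPy. `squared_error_state` and `rmse` are hypothetical helpers, and the merge shown is an assumption about how such state combines, not the library's code.

```python
import math

import numpy as np


def squared_error_state(y_true, y_pred):
    """Return (total_count, sum_of_squared_residuals) for one partition."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return int(diff.size), float(np.sum(diff ** 2))


def rmse(state):
    count, sum_sq = state
    return math.sqrt(sum_sq / count)


state_a = squared_error_state([1, 2, 3], [1.1, 2.5, 3.8])
state_b = squared_error_state([4, 5], [5.1, 4.9])

# Merging adds the counts and the sums; the square root is taken only at the end
merged_state = (state_a[0] + state_b[0], state_a[1] + state_b[1])
merged_rmse = rmse(merged_state)

# Averaging the per-partition RMSE values gives a different number
naive_rmse = (rmse(state_a) + rmse(state_b)) / 2

print(round(merged_rmse, 4), round(naive_rmse, 4))
```

The merged value matches what a single pass over all five rows would give, while the naive average does not.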

prediction_column: str = 'y_predict'¶

prediction_score_column: str = 'y_score'¶

sum_of_squared_residuals: float = 0.0¶

target_column: str = 'y_true'¶

total_count: int = 0¶