mlm_insights.core.profiles package

Submodules

mlm_insights.core.profiles.profile module

class mlm_insights.core.profiles.profile.Profile(tags: Dict[str, str] | None = None)

Bases: object

Profile is a data structure that stores a summary of the data.

It includes the profile header as well as information about features, metrics and SFCs.

add_dataset_metric(dataset_metric_metadata: MetricMetadata, input_schema: Dict[str, FeatureType]) Profile

Add a dataset metric to Profile

Parameters

dataset_metric_metadata : MetricMetadata

Metadata used to define and configure a metric

input_schema : Dict[str, FeatureType]

Input schema.

Returns

Profile

Returns itself after adding the new dataset metric.

add_feature(feature: Feature) Profile

Add a feature to Profile

Parameters

feature : Feature

Details about the Feature.

Returns

Profile

Returns itself after adding the new feature.
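
Because add_feature returns the profile itself, calls can be chained. The sketch below illustrates that pattern with a hypothetical stand-in class, ChainingProfile, whose features are plain strings. It is not the library's Profile or Feature, and its add_feature signature is simplified for illustration only.

```python
from typing import Dict


class ChainingProfile:
    """Hypothetical stand-in for Profile; not the library class."""

    def __init__(self) -> None:
        # Maps a feature name to a simple string description (assumption).
        self._features: Dict[str, str] = {}

    def add_feature(self, name: str, kind: str) -> "ChainingProfile":
        # Store the feature, then return self so calls can be chained.
        self._features[name] = kind
        return self

    def get_feature(self, name: str) -> str:
        # A plain dict lookup raises KeyError for an unknown name.
        return self._features[name]


profile = ChainingProfile()
returned = profile.add_feature("age", "numeric").add_feature("city", "categorical")
```

Here returned and profile refer to the same object, so the chained form is equivalent to two separate calls.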

compute(input_data_frame: DataFrame, input_schema: Dict[str, FeatureType]) None

Compute the data summary and the metrics for each feature from the input dataframe.

Parameters

input_data_frame : pd.DataFrame

Input data from which the metrics and summary are computed.

input_schema : Dict[str, FeatureType]

Input schema.

delete_tag(key: str) Profile

Delete a Tag from the current Profile based on the provided key. If the key doesn’t exist, this method does nothing.

Parameters

key : str

Key of the tag to be deleted

Returns

Profile

get_dataset_metric(dataset_metric_metadata: MetricMetadata) DatasetMetricBase

Get the dataset metric using metric metadata

Parameters

dataset_metric_metadata : MetricMetadata

Metadata identifying the dataset metric to fetch

Returns

DatasetMetricBase

get_feature(name: str) Feature

Get the Feature object using name

Parameters

name : str

Name of the feature

Returns

Feature

Raises

KeyError: Raised if no Feature with the given name is present.

get_tag(key: str) Tag | None

Get the Tag for the given key. Returns None if the key is not present.

Parameters

key : str

Key of the tag to be fetched

Returns

Tag or None

get_tags() List[Tag]

Return all the tags for the current Profile

Returns

List[Tag]

marshall() bytes

Serialize the profile to a byte string.

Returns

bytes: Serialized byte string

merge(other_profile: Profile) Profile

Merge two profiles without mutating the input profiles.

Parameters

other_profile : Profile

Returns

Profile

Merged Profiles
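
The non-mutating contract described above can be illustrated with a small sketch. SketchSummary is a hypothetical stand-in that tracks only a row count and a running sum. The library's Profile holds far more state, and how it merges internally is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSummary:
    """Hypothetical mergeable summary: a row count and a running sum."""

    count: int
    total: float

    def merge(self, other: "SketchSummary") -> "SketchSummary":
        # Build a new object; neither input is modified.
        return SketchSummary(self.count + other.count, self.total + other.total)


left = SketchSummary(count=3, total=6.0)
right = SketchSummary(count=2, total=10.0)
merged = left.merge(right)

# The mean of the combined data is derived from the merged state.
merged_mean = merged.total / merged.count
```

Marking the dataclass as frozen makes accidental mutation of either input an error, which is one way to enforce a merge that leaves its inputs untouched.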

to_json(reference_profile: Profile | None = None, output_version: Version = Version.V2) Dict[str, Any]

Create a JSON view of the profile that contains the metrics and summary of the dataset.

Parameters

  • reference_profile: optional
    • Pass the reference profile to compare with, by default None

  • output_version: Version
    • Version of the output format, by default Version.V2

Returns

Dict[str, Any]: Dictionary of dataset results

to_pandas(reference_profile: Profile | None = None) DataFrame

Create a pandas view of the profile that contains the metrics and summary of the dataset.

Parameters

  • reference_profile: optional
    • Pass the reference profile to compare with, by default None

Returns

pd.DataFrame:

pandas DataFrame view of the profile.

classmethod unmarshall(serialized_str: bytes) Profile

Deserialize the byte string to a profile.

Parameters

serialized_str : bytes

Byte string produced by marshall

Returns

Profile:

Deserialized Profile
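
marshall and unmarshall form a round trip: an instance method produces bytes, and a classmethod rebuilds an object from them. The sketch below shows that shape with a hypothetical stand-in, RoundTripProfile, that carries only tags and encodes them as JSON. The wire format used by the library is not stated in this reference, so JSON here is purely an assumption for illustration.

```python
import json
from typing import Dict


class RoundTripProfile:
    """Hypothetical stand-in for Profile; carries tags only."""

    def __init__(self, tags: Dict[str, str]) -> None:
        self.tags = dict(tags)

    def marshall(self) -> bytes:
        # Encode the state as UTF-8 JSON bytes (assumed format, not the library's).
        return json.dumps({"tags": self.tags}, sort_keys=True).encode("utf-8")

    @classmethod
    def unmarshall(cls, serialized_str: bytes) -> "RoundTripProfile":
        # Decode the bytes and rebuild a fresh instance.
        payload = json.loads(serialized_str.decode("utf-8"))
        return cls(tags=payload["tags"])


original = RoundTripProfile({"env": "prod"})
blob = original.marshall()
restored = RoundTripProfile.unmarshall(blob)
```

Making unmarshall a classmethod means no existing instance is needed to restore a profile from storage.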

upsert_tag(tag: Tag) Profile

Insert a tag if it doesn’t exist; otherwise update the existing tag. The tag key determines which operation applies.

Parameters

tag : Tag

Tag to insert/update to the current profile

Returns

Profile
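
The documented tag semantics (upsert keyed on the tag key, get_tag returning None for a missing key, delete_tag ignoring a missing key) can be sketched with a dictionary. SketchTag and TaggedProfile are hypothetical stand-ins written for this example, not the library classes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SketchTag:
    """Hypothetical stand-in mirroring the documented key/value fields of Tag."""

    key: str
    value: str


class TaggedProfile:
    """Hypothetical stand-in that implements only the tag operations."""

    def __init__(self) -> None:
        self._tags: Dict[str, SketchTag] = {}

    def upsert_tag(self, tag: SketchTag) -> "TaggedProfile":
        # Assigning by key inserts a new tag or overwrites an existing one.
        self._tags[tag.key] = tag
        return self

    def get_tag(self, key: str) -> Optional[SketchTag]:
        # dict.get returns None when the key is absent.
        return self._tags.get(key)

    def get_tags(self) -> List[SketchTag]:
        return list(self._tags.values())

    def delete_tag(self, key: str) -> "TaggedProfile":
        # pop with a default does nothing when the key is absent.
        self._tags.pop(key, None)
        return self


profile = TaggedProfile()
profile.upsert_tag(SketchTag("env", "dev")).upsert_tag(SketchTag("env", "prod"))
profile.upsert_tag(SketchTag("team", "ml"))
profile.delete_tag("missing").delete_tag("team")
```

After these calls the profile holds a single tag, with key "env" and the updated value "prod".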

class mlm_insights.core.profiles.profile.Version(value)

Bases: Enum

Enumeration of the output format versions accepted by Profile.to_json.

V1 = 'v1'
V2 = 'v2'
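
Since Version is a standard Enum with string values, a member can be looked up from its value. The sketch below redeclares the enum locally so that it runs without the library installed. parse_version is a hypothetical helper introduced for this example and is not part of the package.

```python
from enum import Enum


class Version(Enum):
    """Local copy of the documented members, for illustration only."""

    V1 = "v1"
    V2 = "v2"


def parse_version(raw: str) -> Version:
    # Enum lookup by value; raises ValueError for an unknown string.
    return Version(raw.lower())


selected = parse_version("V2")
```

A helper like this could be used to turn a configuration string into the output_version argument of to_json.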

mlm_insights.core.profiles.profile_data_message module

class mlm_insights.core.profiles.profile_data_message.ProfileDataset

Bases: object

add_dataset_metric(dataset_metric_metadata: MetricMetadata, **kwargs: Any) None

Add a dataset metric to ProfileDataset

Parameters

dataset_metric_metadata : MetricMetadata

Metadata used to define and configure a metric

add_sdc(sdc_metadata: SDCMetaData, **kwargs: Any) None
classmethod deserialize(message: ProfileDatasetMessage) ProfileDataset
get_dataset_metric(dataset_metric_metadata: MetricMetadata) DatasetMetricBase

Get the dataset metric using metric metadata

Parameters

dataset_metric_metadata : MetricMetadata

Metadata identifying the dataset metric to fetch

Returns

DatasetMetricBase

get_sdc(sdc_metadata: SDCMetaData) ShareableDatasetComponent
serialize() ProfileDatasetMessage

mlm_insights.core.profiles.profile_header module

class mlm_insights.core.profiles.profile_header.ProfileHeader(tags: Dict[str, str])

Bases: object

The profile header contains metadata about the profile. The information in the header is not mergeable and is recalculated every time.

serialize() ProfileHeaderMessage
update_tags(tags: Dict[str, str]) None

mlm_insights.core.profiles.tags module

class mlm_insights.core.profiles.tags.Tag(key: str, value: str)

Bases: object

Data structure representing a Profile tag, which lets clients attach custom metadata to profiles.

key: str
value: str

Module contents