
getml_mlflow.marshalling

pipeline.download_artifact_pipeline

Downloads a getML pipeline artifact from an MLflow run and saves it as a new project.

This function downloads a pipeline artifact from MLflow and creates a new project with a name derived from the original project name and the pipeline ID: "original_project_name-pipeline_id".

If the project already exists (e.g., when calling this function multiple times with the same parameters), the existing project will be overwritten with the downloaded artifacts.
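
The naming rule and target location described above can be sketched as follows. The helper names `derive_project_name` and `target_project_path` are hypothetical and exist only for this illustration; the assumed default path mirrors the documented `$HOME/.getML/projects`.

```python
from pathlib import Path

# Assumed default, mirroring the documented $HOME/.getML/projects location.
ASSUMED_PROJECTS_PATH = Path.home() / ".getML" / "projects"


def derive_project_name(original_project_name: str, pipeline_id: str) -> str:
    """Build the new project name as "original_project_name-pipeline_id"."""
    return f"{original_project_name}-{pipeline_id}"


def target_project_path(
    original_project_name: str,
    pipeline_id: str,
    projects_path: Path = ASSUMED_PROJECTS_PATH,
) -> Path:
    """Return the directory the downloaded project would occupy."""
    return projects_path / derive_project_name(original_project_name, pipeline_id)


print(derive_project_name("interstate94", "l2TCiD"))  # interstate94-l2TCiD
```

Because the name depends only on the two inputs, calling the download function twice with the same arguments resolves to the same directory, which is why the second call replaces the first result.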

Experimental feature

This feature is experimental and may change in future releases.

PARAMETER DESCRIPTION
mlflow_client

An MLflow client instance to interact with MLflow.

TYPE: MlflowClient

run_id

The ID of the MLflow run containing the pipeline artifacts.

TYPE: str

pipeline_id

The ID of the pipeline to be downloaded.

TYPE: str

original_project_name

The name of the original getML project the pipeline was saved from. If None, uses the current project name.

TYPE: Optional[str] DEFAULT: None

projects_path

Path where getML projects are stored. Defaults to $HOME/.getML/projects.

TYPE: Path DEFAULT: DEFAULT_GETML_PROJECTS_PATH

RETURNS DESCRIPTION
Tuple[str, str]

A tuple containing:
- The name of the newly created getML project
- The ID of the downloaded pipeline

Example
from mlflow.tracking import MlflowClient
import getml
import getml_mlflow

# Initialize MLflow client
client = MlflowClient(tracking_uri="http://localhost:5000")

run_id = "abcdef1234567890"
pipeline_id = "l2TCiD"

# Download the pipeline artifact from a specific run. This creates a new project
# named "interstate94-l2TCiD" that contains the pipeline.
new_project, pipeline_id = getml_mlflow.marshalling.pipeline.download_artifact_pipeline(
    client, run_id, pipeline_id, original_project_name="interstate94"
)

# Switch to the new project and load the pipeline
getml.project.switch(new_project)
pipeline = getml.pipeline.load(pipeline_id)
Source code in getml_mlflow/marshalling/pipeline.py
def download_artifact_pipeline(
    mlflow_client: MlflowClient,
    run_id: str,
    pipeline_id: str,
    *,
    original_project_name: Optional[str] = None,
    projects_path: Path = DEFAULT_GETML_PROJECTS_PATH,
) -> Tuple[str, str]:
    """
    Downloads a getML pipeline artifact from an MLflow run and saves it as a new project.

    This function downloads a pipeline artifact from MLflow and creates a new project with
    a name derived from the original project name and the pipeline ID: 
    "original_project_name-pipeline_id".

    If the project already exists (e.g., when calling this function multiple times with 
    the same parameters), the existing project will be overwritten with the downloaded 
    artifacts.

    ??? warning "Experimental feature"
        This feature is experimental and may change in future releases.

    Args:
        mlflow_client: An MLflow client instance to interact with MLflow.

        run_id: The ID of the MLflow run containing the pipeline artifacts.

        pipeline_id: The ID of the pipeline to be downloaded.

        original_project_name: The name of the original getML project the pipeline was 
            saved from. If None, uses the current project name.

        projects_path: Path where getML projects are stored. Defaults to 
            `$HOME/.getML/projects`.

    Returns:
        A tuple containing:
            - The name of the newly created getML project
            - The ID of the downloaded pipeline

    Example:
        ```python
        # Initialize MLflow client
        client = MlflowClient(tracking_uri="http://localhost:5000")

        run_id = "abcdef1234567890"
        pipeline_id = "l2TCiD"

        # Download pipeline artifact from a specific run. This creates a new project 
        # named "interstate94-l2TCiD" with the pipeline
        new_project, pipeline_id = getml_mlflow.marshalling.pipeline.download_artifact_pipeline(
            client, run_id, pipeline_id, original_project_name="interstate94"
            )

        # You can now switch to the new project and load the pipeline
        getml.project.switch(new_project)
        pipeline = getml.pipeline.load(pipeline_id)
        ```

    """

    if original_project_name is None:
        original_project_name = getml.project.name

    with TemporaryDirectory() as temp_dir:
        mlflow_client.download_artifacts(
            run_id, f"pipeline/{original_project_name}", temp_dir
        )
        new_project_name: str = f"{original_project_name}-{pipeline_id}"
        project_path: Path = projects_path / new_project_name
        temp_project_path: Path = Path(temp_dir) / "pipeline" / original_project_name
        temp_project_path.rename(project_path)

    return (new_project_name, pipeline_id)
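
The listing above moves the downloaded directory with `Path.rename`. On POSIX systems, renaming onto an existing non-empty directory raises `OSError`, and a rename across filesystems can also fail, so the documented overwrite behavior may depend on the platform and on where the temporary directory lives. Below is a minimal sketch of one way to make the replacement explicit. The helper `replace_directory` is hypothetical and not part of getml_mlflow, and the file names are made up for the demonstration.

```python
import shutil
from pathlib import Path
from tempfile import TemporaryDirectory


def replace_directory(source: Path, target: Path) -> None:
    """Move `source` to `target`, removing any existing `target` first."""
    if target.exists():
        shutil.rmtree(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    # shutil.move falls back to copy-and-delete across filesystems.
    shutil.move(str(source), str(target))


with TemporaryDirectory() as work_dir:
    # Simulate a freshly downloaded project directory.
    downloaded = Path(work_dir) / "download" / "interstate94"
    downloaded.mkdir(parents=True)
    (downloaded / "pipeline.json").write_text("new")

    # Simulate a project left over from an earlier call.
    existing = Path(work_dir) / "projects" / "interstate94-l2TCiD"
    existing.mkdir(parents=True)
    (existing / "stale.json").write_text("old")

    replace_directory(downloaded, existing)
    print(sorted(p.name for p in existing.iterdir()))  # ['pipeline.json']
```

This sketch is not atomic: a failure between the removal and the move would leave no project directory at the target.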

pipeline.switch_to_artifact_pipeline

Downloads an artifact pipeline from MLflow, switches to the newly created project, and loads the pipeline.

This function simplifies the workflow of retrieving a pipeline stored as an MLflow artifact. It downloads the pipeline into a new getML project (named as "original_project_name-pipeline_id"), automatically switches to that project, and loads the pipeline for immediate use.
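
The order of the three steps (download, switch, load) can be illustrated with a small sketch that takes each step as a callable. The function `fetch_switch_load` and the stand-in lambdas are hypothetical; in real use the callables would wrap `download_artifact_pipeline`, `getml.project.switch`, and `getml.pipeline.load`.

```python
from typing import Callable, List, Tuple, TypeVar

PipelineT = TypeVar("PipelineT")


def fetch_switch_load(
    download: Callable[[], Tuple[str, str]],
    switch: Callable[[str], None],
    load: Callable[[str], PipelineT],
) -> PipelineT:
    """Run the three steps in order: download, switch project, load pipeline."""
    project_name, pipeline_id = download()
    switch(project_name)
    return load(pipeline_id)


# Stand-ins that record the call order instead of contacting MLflow or getML.
calls: List[Tuple[str, str]] = []
result = fetch_switch_load(
    download=lambda: ("interstate94-uPe3hR", "uPe3hR"),
    switch=lambda name: calls.append(("switch", name)),
    load=lambda pid: calls.append(("load", pid)) or f"pipeline:{pid}",
)
print(calls)
print(result)
```

The project switch has to happen before the load, because the pipeline ID is resolved within the currently active project.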

Experimental feature

This function is experimental and may change in future releases.

PARAMETER DESCRIPTION
mlflow_client

The MLflow client instance to use for retrieving the artifact.

TYPE: MlflowClient

run_id

The ID of the MLflow run containing the pipeline artifact.

TYPE: str

pipeline_id

The ID of the pipeline to download.

TYPE: str

original_project_name

The name of the original getML project the pipeline was saved from. If None, the current project name is used.

TYPE: Optional[str] DEFAULT: None

projects_path

Path to the getML projects directory. Defaults to $HOME/.getML/projects.

TYPE: Path DEFAULT: DEFAULT_GETML_PROJECTS_PATH

RETURNS DESCRIPTION
Pipeline

The loaded pipeline object in the newly created project.

TYPE: Pipeline

Examples:

import mlflow
from mlflow.tracking import MlflowClient
import getml_mlflow
import getml

# Connect to MLflow
client = MlflowClient("http://localhost:5000")

# Download the pipeline from the run and switch to the new project
pipeline = getml_mlflow.marshalling.pipeline.switch_to_artifact_pipeline(
    client,
    "2960ee40202744daa64aa83d180f0b2f",
    "uPe3hR",
)

# The pipeline is ready to use. `container` is assumed to be a
# getml.data.Container defined elsewhere; it is not created in this example.
predictions = pipeline.predict(container.test)
Source code in getml_mlflow/marshalling/pipeline.py
def switch_to_artifact_pipeline(
    mlflow_client: MlflowClient,
    run_id: str,
    pipeline_id: str,
    *,
    original_project_name: Optional[str] = None,
    projects_path: Path = DEFAULT_GETML_PROJECTS_PATH,
) -> Pipeline:
    """
    Downloads an artifact pipeline from MLflow, switches to the newly created project,
    and loads the pipeline.

    This function simplifies the workflow of retrieving a pipeline stored as an MLflow 
    artifact. It downloads the pipeline into a new getML project (named as 
    "original_project_name-pipeline_id"), automatically switches to that project, and 
    loads the pipeline for immediate use.

    ??? warning "Experimental feature"
        This function is experimental and may change in future releases.

    Args:
        mlflow_client: The MLflow client instance to use for retrieving the artifact.

        run_id: The ID of the MLflow run containing the pipeline artifact.

        pipeline_id: The ID of the pipeline to download.

        original_project_name: The name of the original project. If None, the current 
            project name is used. Defaults to None.

        projects_path: Path to the getML projects directory. Defaults to 
            `$HOME/.getML/projects`.

    Returns:
        Pipeline: The loaded pipeline object in the newly created project.

    Examples:
        ```python
        import mlflow
        from mlflow.tracking import MlflowClient
        import getml_mlflow
        import getml

        # Connect to MLflow
        client = MlflowClient("http://localhost:5000")

        # Download pipeline from run and switch to new project
        pipeline = getml_mlflow.marshalling.pipeline.switch_to_artifact_pipeline(
                   client,
                   "2960ee40202744daa64aa83d180f0b2f",
                   "uPe3hR"
                )

        # Pipeline is ready to use
        predictions = pipeline.predict(container.test)
        ```

    """
    if original_project_name is None:
        original_project_name = getml.project.name

    project_name, pipeline_id = download_artifact_pipeline(
        mlflow_client=mlflow_client,
        run_id=run_id,
        pipeline_id=pipeline_id,
        original_project_name=original_project_name,
        projects_path=projects_path,
    )
    getml.project.switch(project_name)
    return getml.pipeline.load(pipeline_id)