getml.engine

This module is a collection of utility functions for communicating with the getML Engine and managing its sessions.

In order to log into the Engine, open your favorite internet browser and enter http://localhost:1709 in the navigation bar. This connects the browser to a local TCP socket at port 1709 opened by the getML Monitor. Note that this is only possible from the same device.

Example

First of all, you need to start the getML Engine. Next, you need to create a new project or load an existing one.

getml.engine.list_projects()
getml.engine.set_project('test')
After doing all calculations for today you can shut down the getML Engine.

print(getml.engine.is_alive())
getml.engine.shutdown()
Note

The Python process and the getML Engine must be located on the same machine. If you intend to run the Engine on a remote host, make sure to start your Python session on that device as well. Also, when using SSH sessions, make sure to start Python using python & followed by disown, or using nohup python. This ensures that the Python process and all the scripts it runs won't be killed the moment your remote connection becomes unstable, and that you will be able to recover them later on (see remote_access).

All data frame objects and models in the getML Engine are bundled in projects. When loading an existing project, the current memory of the Engine is flushed, and all changes applied to DataFrame instances after the last call to their save method will be lost. Afterwards, all Pipelines will be loaded into memory automatically. The data frame objects will not be loaded automatically, since they consume significantly more memory than the pipelines. They can be loaded manually using load_data_frame or load.

The getML Engine reflects the separation of data into individual projects at the filesystem level as well. All data belonging to a single project is stored in a dedicated folder inside the 'projects' directory, which is located in the '.getML' folder in your home directory. These projects can be copied and shared between different platforms and architectures without any loss of information. However, you must copy the entire project folder, not just individual data frames or pipelines.
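The default on-disk layout described above can be inspected with the standard library alone. A minimal sketch, assuming the default '.getML/projects' location (which launch can override via project_directory); the helper names are illustrative, not part of the getml API:

```python
from pathlib import Path
from typing import List, Optional


def projects_dir(home: Optional[Path] = None) -> Path:
    """Return the default getML projects directory: '.getML/projects'
    inside the (given or current user's) home directory."""
    home = home or Path.home()
    return home / ".getML" / "projects"


def list_project_folders(home: Optional[Path] = None) -> List[str]:
    """List project folders found on disk, or [] if none exist yet."""
    root = projects_dir(home)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Because each project is just a folder, copying it (in its entirety) to the same location on another machine is enough to share it.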

delete_project

delete_project(name: str)

Deletes a project.

PARAMETER DESCRIPTION
name

Name of your project.

TYPE: str

Note

All data and models contained in the project directory will be permanently lost.

Source code in getml/engine/helpers.py
def delete_project(name: str):
    """Deletes a project.

    Args:
        name:
            Name of your project.

    Note:
        All data and models contained in the project directory will be
        permanently lost.

    """
    _delete_project(name)

is_alive

is_alive() -> bool

Checks if the getML Engine is running.

RETURNS DESCRIPTION
bool

True if the getML Engine is running and ready to accept commands, and False otherwise.

Source code in getml/communication.py
def is_engine_alive() -> bool:
    """Checks if the getML Engine is running.

    Returns:
            True if the getML Engine is running and ready to accept
            commands and False otherwise.

    """

    if not _list_projects_impl(running_only=True):
        return False

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect(("localhost", port))
    except ConnectionRefusedError:
        return False
    finally:
        sock.close()

    return True
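The liveness check in the source above boils down to a plain TCP connect-and-close probe, which can be reproduced with the standard library. A minimal sketch; the function name is hypothetical and the port you probe must be the one your Monitor actually listens on:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Probe a TCP port the same way is_engine_alive does:
    attempt a connection, then close the socket immediately."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except OSError:  # ConnectionRefusedError, timeouts, etc.
        return False
    finally:
        sock.close()
    return True
```

Note that the real is_engine_alive additionally consults the list of running projects before probing the socket, so a successful connect alone is a weaker check.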

is_engine_alive

is_engine_alive() -> bool

Checks if the getML Engine is running.

RETURNS DESCRIPTION
bool

True if the getML Engine is running and ready to accept commands, and False otherwise.

Source code in getml/communication.py
def is_engine_alive() -> bool:
    """Checks if the getML Engine is running.

    Returns:
            True if the getML Engine is running and ready to accept
            commands and False otherwise.

    """

    if not _list_projects_impl(running_only=True):
        return False

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect(("localhost", port))
    except ConnectionRefusedError:
        return False
    finally:
        sock.close()

    return True

launch

launch(
    allow_push_notifications: bool = True,
    allow_remote_ips: bool = False,
    home_directory: Optional[str] = None,
    http_port: Optional[int] = None,
    in_memory: bool = True,
    install: bool = False,
    launch_browser: bool = True,
    log: bool = False,
    project_directory: Optional[str] = None,
    proxy_url: Optional[str] = None,
    token: Optional[str] = None,
)

Launches the getML Engine.

PARAMETER DESCRIPTION
allow_push_notifications

Whether you want to allow the getML Monitor to send push notifications to your desktop.

TYPE: bool DEFAULT: True

allow_remote_ips

Whether you want to allow remote IPs to access the http-port.

TYPE: bool DEFAULT: False

home_directory

The directory which should be treated as the home directory by getML. getML will create a hidden folder named '.getML' in said directory. All binaries will be installed there.

TYPE: Optional[str] DEFAULT: None

http_port

The local port of the getML Monitor. This port can only be accessed from your local computer, unless you set allow_remote_ips=True.

TYPE: Optional[int] DEFAULT: None

in_memory

Whether you want the Engine to process everything in memory.

TYPE: bool DEFAULT: True

install

Reinstalls getML, even if it is already installed.

TYPE: bool DEFAULT: False

launch_browser

Whether you want to automatically launch your browser.

TYPE: bool DEFAULT: True

log

Whether you want the Engine log to appear in the logfile (more detailed logging). The Engine log also appears in the 'Log' page of the Monitor.

TYPE: bool DEFAULT: False

project_directory

The directory in which to store all of your projects.

TYPE: Optional[str] DEFAULT: None

proxy_url

The URL of any proxy server that redirects to the getML Monitor.

TYPE: Optional[str] DEFAULT: None

token

The token used for authentication. Authentication is required when remote IPs are allowed to access the Monitor. If authentication is required and no token is passed, a random hexcode will be generated as the token.

TYPE: Optional[str] DEFAULT: None
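The auto-generated fallback token described above ("a random hexcode") can be approximated with the standard library. A hypothetical sketch of that behavior, not the Engine's actual generator:

```python
import secrets


def make_token(n_bytes: int = 16) -> str:
    """Generate a random hex token, as a stand-in for the token
    that is auto-generated when authentication is required but
    no token is passed. (n_bytes=16 is an assumed default.)"""
    return secrets.token_hex(n_bytes)
```

Using secrets (rather than random) is the idiomatic choice here, since the token guards remote access to the Monitor.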

Source code in getml/engine/_launch.py
def launch(
    allow_push_notifications: bool = True,
    allow_remote_ips: bool = False,
    home_directory: Optional[str] = None,
    http_port: Optional[int] = None,
    in_memory: bool = True,
    install: bool = False,
    launch_browser: bool = True,
    log: bool = False,
    project_directory: Optional[str] = None,
    proxy_url: Optional[str] = None,
    token: Optional[str] = None,
):
    """
    Launches the getML Engine.

    Args:
      allow_push_notifications:
        Whether you want to allow the getML Monitor to send push notifications to your desktop.

      allow_remote_ips:
        Whether you want to allow remote IPs to access the http-port.

      home_directory:
        The directory which should be treated as the home directory by getML.
        getML will create a hidden folder named '.getML' in said directory.
        All binaries will be installed there.

      http_port:
        The local port of the getML Monitor.
        This port can only be accessed from your local computer,
        unless you set `allow_remote_ips=True`.

      in_memory:
        Whether you want the Engine to process everything in memory.

      install:
        Reinstalls getML, even if it is already installed.

      launch_browser:
        Whether you want to automatically launch your browser.

      log:
        Whether you want the Engine log to appear in the logfile (more detailed logging).
        The Engine log also appears in the 'Log' page of the Monitor.

      project_directory:
        The directory in which to store all of your projects.

      proxy_url:
        The URL of any proxy server that redirects to the getML Monitor.

      token:
        The token used for authentication.
        Authentication is required when remote IPs are allowed to access the Monitor.
        If authentication is required and no token is passed,
        a random hexcode will be generated as the token."""

    if _is_monitor_alive():
        print("getML Engine is already running.")
        return
    if platform.system() != System.LINUX:
        raise OSError(
            PLATFORM_NOT_SUPPORTED_NATIVELY_ERROR_MSG_TEMPLATE.format(
                platform=platform.system(),
                docker_docs_url=DOCKER_DOCS_URL,
                compose_file_url=COMPOSE_FILE_URL,
            )
        )
    home_path = _make_home_path(home_directory)
    binary_path = _find_binary(home_path)
    log_path = _make_log_path(home_path)
    logfile = open(str(log_path), "w", encoding="utf-8")
    cmd = _Options(
        allow_push_notifications=allow_push_notifications,
        allow_remote_ips=allow_remote_ips,
        home_directory=str(home_path),
        http_port=http_port,
        in_memory=in_memory,
        install=install,
        launch_browser=launch_browser,
        log=log,
        project_directory=project_directory,
        proxy_url=proxy_url,
        token=token,
    ).to_cmd(binary_path)
    cwd = str(binary_path.parent)
    print(f"Launching {' '.join(cmd)} in {cwd}...")
    Popen(cmd, cwd=cwd, shell=False, stdout=logfile, stdin=logfile, stderr=logfile)
    while not _is_monitor_alive():
        sleep(0.1)
    print(f"Launched the getML Engine. The log output will be stored in {log_path}.")
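The launch pattern above (spawn a subprocess and redirect its output to a logfile) can be sketched with a harmless placeholder command. The command and logfile path here are illustrative, not what launch actually runs, and unlike launch this sketch waits for the process to finish:

```python
import sys
from pathlib import Path
from subprocess import Popen
from tempfile import TemporaryDirectory


def run_with_logfile(cmd, log_path: Path) -> int:
    """Spawn cmd, capturing stdout and stderr in log_path,
    and wait for it to exit (launch itself does not wait)."""
    with open(log_path, "w", encoding="utf-8") as logfile:
        proc = Popen(cmd, stdout=logfile, stderr=logfile)
        return proc.wait()


# Usage with a placeholder command standing in for the Engine binary:
with TemporaryDirectory() as d:
    log = Path(d) / "engine.log"
    code = run_with_logfile([sys.executable, "-c", "print('started')"], log)
    output = log.read_text(encoding="utf-8")
```

Redirecting into a logfile like this is what makes the Engine's output recoverable after the launching Python session ends.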

list_projects

list_projects() -> List[str]

List all projects on the getML Engine.

RETURNS DESCRIPTION
List[str]

The names of all the projects.

Source code in getml/engine/helpers.py
def list_projects() -> List[str]:
    """
    List all projects on the getML Engine.

    Returns:
            Lists the name of all the projects.
    """
    return _list_projects_impl(running_only=False)

list_running_projects

list_running_projects() -> List[str]

List all projects on the getML Engine that are currently running.

RETURNS DESCRIPTION
List[str]

The names of all currently running projects.

Source code in getml/engine/helpers.py
def list_running_projects() -> List[str]:
    """
    List all projects on the getML Engine that are currently running.

    Returns:
        Lists the name of all the projects currently running.
    """
    return _list_projects_impl(running_only=True)

set_project

set_project(name: str)

Creates a new project or loads an existing one.

If there is no project called name present on the Engine, a new one will be created.

PARAMETER DESCRIPTION
name

Name of the new project.

TYPE: str

Source code in getml/engine/helpers.py
def set_project(name: str):
    """Creates a new project or loads an existing one.

    If there is no project called `name` present on the Engine, a new one will
    be created.

    Args:
           name: Name of the new project.
    """
    _set_project(name)

shutdown

shutdown()

Shuts down the getML Engine.

Warning

All changes applied to a DataFrame after the last call to its save method will be lost.

Source code in getml/engine/helpers.py
def shutdown():
    """Shuts down the getML Engine.

    Warning:
        All changes applied to the [`DataFrame`][getml.DataFrame]
        after calling their [`save`][getml.DataFrame.save]
        method will be lost.

    """
    _shutdown()

suspend_project

suspend_project(name: str)

Suspends a project that is currently running.

PARAMETER DESCRIPTION
name

Name of your project.

TYPE: str

Source code in getml/engine/helpers.py
def suspend_project(name: str):
    """Suspends a project that is currently running.

    Args:
        name:
            Name of your project.
    """
    _suspend_project(name)