seaflow.runners package

Submodules

seaflow.runners.dagrunner module

Module Introduction

This module provides a runtime class for process pipelines and directed acyclic graphs (DAGs). Its main function is to run process pipelines and DAGs using data caching and hook objects. The main technologies are networkx, metaprogramming, hooks, and Ray.

  • Design pattern:

    Simple factory pattern

  • Key points:

    1. networkx

    2. Metaprogramming

    3. Hooks

  • Main functions:

    1. Algorithm orchestration

    2. Automatic optimization, including caching and heterogeneous parallel computing
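The module names the simple factory pattern as its design pattern, but the source does not show how it is applied. As a minimal sketch of the pattern in general, assuming a factory function that maps a string key to a runner class (`make_runner`, `SequenceRunnerSketch`, and `DagRunnerSketch` are hypothetical names, not part of seaflow):

```python
class SequenceRunnerSketch:
    """Hypothetical stand-in for a sequential runner (not the seaflow class)."""

    def execute(self):
        return "sequence"


class DagRunnerSketch:
    """Hypothetical stand-in for a DAG runner (not the seaflow class)."""

    def execute(self):
        return "dag"


def make_runner(kind):
    """Simple factory: choose and instantiate a runner class from a string key."""
    runners = {"sequence": SequenceRunnerSketch, "dag": DagRunnerSketch}
    if kind not in runners:
        raise ValueError(f"unknown runner kind: {kind!r}")
    return runners[kind]()
```

The caller only names the kind of runner it wants; the choice of concrete class stays in one place.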

Usage examples

Class Description

class seaflow.runners.dagrunner.DAGBaseRunner(scheduler, catalog, hook_manager)

Bases: object

Class Introduction:

This is the abstract base class for directed acyclic graph (DAG) process-scheduling runners. It mainly uses data caching and hook objects to run process scheduling objects. The main technical operation is instantiation.

Attribute Function:

Initializes the attributes of the process-scheduling runner class.

Parameters:
  • scheduler (object) - Process scheduling object.

  • catalog (object) - Data catalog object.

  • hook_manager (object) - Hook manager object.

abstract classmethod execute(is_release)

Method Function:

Define an abstract method for executing process scheduling.

Parameters:
  • is_release (bool) - Whether to clear the cache after executing the node.
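The relationship between an abstract runner and its concrete subclasses can be sketched with Python's `abc` module. This is a minimal illustration, not the seaflow source: `BaseRunnerSketch` and `EchoRunner` are hypothetical names, and `execute` is written as a plain abstract instance method for simplicity.

```python
from abc import ABC, abstractmethod


class BaseRunnerSketch(ABC):
    """Hypothetical abstract runner that stores the three collaborators."""

    def __init__(self, scheduler, catalog, hook_manager):
        self.scheduler = scheduler
        self.catalog = catalog
        self.hook_manager = hook_manager

    @abstractmethod
    def execute(self, is_release):
        """Run the scheduled process; subclasses must implement this."""


class EchoRunner(BaseRunnerSketch):
    """Hypothetical concrete runner that only reports how it was called."""

    def execute(self, is_release=True):
        return f"ran with is_release={is_release}"
```

Instantiating `BaseRunnerSketch` directly raises `TypeError`, so every usable runner has to supply its own `execute`.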

class seaflow.runners.dagrunner.DAGRunnerFunction(scheduler=<seaflow.schedulers.dag.DAGer object>, catalog=<seaflow.ios.io.LocalDataCatalog object>, hook_manager=<seaflow.hooks.hook.HookManager object>)

Bases: DAGBaseRunner

Class Introduction:

This is a directed acyclic graph (DAG) executor that schedules the execution of parameterless functions. Its main functions include algorithm orchestration, automatic cache optimization, and parallel computing. Its main technologies include networkx, Ray, DAGs, and metaprogramming.

Attribute Function:

Initializes the attributes of the DAG runner class, inherited from DAGBaseRunner.

Parameters:

  • scheduler (object) - Process scheduling object.

  • catalog (object) - Data catalog object.

  • hook_manager (object) - Hook manager object.

  • done_nodes (object) - Set of completed nodes.

  • remaining_nodes (object) - Set of remaining nodes.

execute(init_method='kahn', dispath_method='kahn', exec_method='ray', is_release=True)

Method Function:

Executes the algorithm DAG.

Parameters:
  • cuber_controller (object) - The target controller; here, the instance itself (self).

  • init_method (str) - Initialization method.

  • dispath_method (str) - Scheduling method.

  • exec_method (str) - Execution method.

  • is_release (bool) - Whether to clear the cache after executing the node.

Returns:

  • result (str) - Execution success result information.
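The default value 'kahn' for init_method and dispath_method suggests Kahn's algorithm for topological sorting, although the source does not confirm this. Under that assumption, the following sketch groups the nodes of a DAG into layers whose members can run in parallel. The function name `kahn_layers` and the edge-list input format are hypothetical.

```python
def kahn_layers(nodes, edges):
    """Group nodes into layers; each layer depends only on earlier layers."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1

    # Start with the nodes that have no unmet dependencies.
    ready = sorted(n for n, d in indegree.items() if d == 0)
    layers = []
    visited = 0
    while ready:
        layers.append(ready)
        visited += len(ready)
        next_ready = []
        for node in ready:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_ready.append(child)
        ready = sorted(next_ready)

    # Nodes left unvisited can only be part of a cycle.
    if visited != len(indegree):
        raise ValueError("graph contains a cycle")
    return layers
```

For the edges a→c, b→c, c→d, this yields the layers `[["a", "b"], ["c"], ["d"]]`: a and b are independent and could be dispatched together.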

execute_algorithm_with_ray(run_list)

Method Function:

Executes tasks with the Ray computing engine in three steps: collecting parameters, executing, and pushing results.

Parameters:
  • run_list (list) - List of tasks for a single run.

Returns:

  • result (str) - Execution success result information.
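The three steps named above (collecting parameters, executing, pushing results) can be sketched as follows. To stay runnable without extra dependencies, this sketch swaps Ray for a standard-library thread pool; with Ray itself, the second step would typically submit tasks via `.remote()` and gather them with `ray.get`. The task dictionary layout and the name `run_batch` are assumptions, not the seaflow data model.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(run_list, cache):
    """Run one batch of independent tasks and store their results in `cache`."""
    # Step 1: collect the parameters of each task from the cache.
    jobs = [(task, [cache[name] for name in task["inputs"]]) for task in run_list]

    # Step 2: execute the tasks concurrently (thread pool as a stand-in for Ray).
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(task["func"], *args) for task, args in jobs]
        results = [future.result() for future in futures]

    # Step 3: push each result back into the cache under the task's output name.
    for (task, _), value in zip(jobs, results):
        cache[task["output"]] = value
    return "success"
```

Because every task in one batch reads only values that are already cached, the order in which the tasks finish does not matter.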

class seaflow.runners.dagrunner.DAGRunnerNode(scheduler=<seaflow.schedulers.dag.DAGer object>, catalog=<seaflow.ios.io.LocalDataCatalog object>, hook_manager=<seaflow.hooks.hook.HookManager object>)

Bases: DAGBaseRunner

Class Introduction:

This is a directed acyclic graph (DAG) executor that schedules the execution of node objects with parameter dependencies. Its main functions include algorithm orchestration, automatic cache optimization, and parallel computing. Its main technologies include networkx, Ray, DAGs, and metaprogramming.

Attribute Function:

Initializes the attributes of the DAG runner class, inherited from DAGBaseRunner.

Parameters:

  • scheduler (object) - Process scheduling object.

  • catalog (object) - Data catalog object.

  • hook_manager (object) - Hook manager object.

  • done_nodes (object) - Set of completed nodes.

  • remaining_nodes (object) - Set of remaining nodes.

execute(init_method='kahn', dispath_method='kahn', exec_method='ray', is_release=True)

Method Function:

Executes the algorithm DAG.

Parameters:
  • cuber_controller (object) - The target controller; here, the instance itself (self).

  • init_method (str) - Initialization method.

  • dispath_method (str) - Scheduling method.

  • exec_method (str) - Execution method.

  • is_release (bool) - Whether to clear the cache after executing the node.

Returns:

  • result (str) - Execution success result information.
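The class introduction describes nodes with parameter dependencies, without saying how the dependencies are determined. One common approach, shown here purely as an assumption, is to infer an edge whenever one node's input name matches another node's output name. The `inputs`/`outputs` dictionary layout and the name `build_edges` are hypothetical.

```python
def build_edges(nodes):
    """Return sorted (producer, consumer) pairs inferred from dataset names."""
    # Map every output name to the node that produces it.
    producer_of = {}
    for name, spec in nodes.items():
        for output in spec["outputs"]:
            producer_of[output] = name

    # A node depends on whichever node produces one of its inputs.
    edges = set()
    for name, spec in nodes.items():
        for needed in spec["inputs"]:
            if needed in producer_of:
                edges.add((producer_of[needed], name))
    return sorted(edges)
```

Inputs that no node produces are treated as external data and create no edge.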

execute_algorithm_with_ray(run_list)

Method Function:

Executes tasks with the Ray computing engine in three steps: collecting parameters, executing, and pushing results.

Parameters:
  • run_list (list) - List of tasks for a single run.

Returns:

  • result (str) - Execution success result information.

seaflow.runners.sequencerunner module

Module Introduction

This module provides a process-scheduling runner class. Its main function is to run process pipeline objects using data caching and hook objects. The main technical operation is instantiation.

  • Design pattern:

    None

  • Key points:

    1. Data caching and hook technology

  • Main functions:

    1. Process Scheduling Runner
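The hook technique listed above can be illustrated with a small registry that lets callers attach callbacks to named events. This is a generic sketch, not the seaflow HookManager; the class name `HookRegistry`, its methods, and the event names are all hypothetical.

```python
class HookRegistry:
    """Hypothetical registry that maps event names to lists of callbacks."""

    def __init__(self):
        self._hooks = {}

    def register(self, event, callback):
        """Attach a callback to an event name."""
        self._hooks.setdefault(event, []).append(callback)

    def call(self, event, **kwargs):
        """Invoke every callback registered for the event, in registration order."""
        return [callback(**kwargs) for callback in self._hooks.get(event, [])]


log = []
hooks = HookRegistry()
hooks.register("before_node_run", lambda node: log.append(("before", node)))
hooks.register("after_node_run", lambda node: log.append(("after", node)))

# A runner would call the hooks around each node it executes.
hooks.call("before_node_run", node="clean")
hooks.call("after_node_run", node="clean")
```

Events with no registered callbacks are simply skipped, so a runner can fire hooks unconditionally.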

Usage examples

Class Description

class seaflow.runners.sequencerunner.BaseRunner(scheduler, catalog, hook_manager)

Bases: object

Class Introduction:

This is the abstract base class for process-scheduling runners. It mainly uses data caching and hook objects to run process scheduling objects. The main technical operation is instantiation.

Attribute Function:

Initializes the attributes of the process-scheduling runner class.

Parameters:
  • scheduler (object) - Process scheduling object.

  • catalog (object) - Data catalog object.

  • hook_manager (object) - Hook manager object.

abstract classmethod execute(is_release)

Method Function:

Define an abstract method for executing process scheduling.

Parameters:
  • is_release (bool) - Whether to clear the cache after executing the node.

class seaflow.runners.sequencerunner.SequentialRunner(scheduler, catalog, hook_manager)

Bases: BaseRunner

Class Introduction:

This is a concrete implementation of a process pipeline runner. It mainly uses data caching and hook objects to run process pipeline objects. The main technical operation is instantiation.

Attribute Function:

Initializes the attributes of the process pipeline runner class, inherited from BaseRunner.

Parameters:

  • scheduler (object) - Process scheduling object.

  • catalog (object) - Data catalog object.

  • hook_manager (object) - Hook manager object.

  • done_nodes (object) - Set of completed nodes.

  • remaining_nodes (object) - Set of remaining nodes.

execute(is_release=True)

Method Function:

Concrete implementation of process pipeline execution. It requires the scheduler, catalog, and hook_manager objects.

Parameters:
  • is_release (bool) - Whether to clear the cache after executing the node.

Returns:

  • result (str) - Running result information.
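The source does not describe how is_release clears the cache. One plausible reading, sketched here as an assumption, is that a dataset is dropped as soon as no later step reads it. The step dictionary layout and the name `run_sequence` are hypothetical, and a plain dict stands in for the catalog.

```python
def run_sequence(steps, catalog, is_release=True):
    """Run steps in order; optionally drop inputs that no later step reads."""
    for index, step in enumerate(steps):
        # Load the inputs from the catalog and store the step's result.
        args = [catalog[name] for name in step["inputs"]]
        catalog[step["output"]] = step["func"](*args)

        if is_release:
            # Names that a later step still reads must stay cached.
            still_needed = {
                name for later in steps[index + 1:] for name in later["inputs"]
            }
            for name in step["inputs"]:
                if name not in still_needed:
                    catalog.pop(name, None)
    return "success"
```

With is_release=False every intermediate dataset stays in the catalog, which uses more memory but makes the run easier to inspect.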

Module contents