Workflows

Primitive components

Workflows are directed acyclic graphs (DAGs) of individual MUSES module executions, or “processes”, constructed using composite structures called “components” that consist of three primitive elements:

  • process: the fundamental unit of a workflow, which executes a single MUSES module

  • group: a set of components that execute in parallel

  • chain: a set of components that execute sequentially

Composition

A workflow specification is a data object with two key-value pairs: processes and components. Process names and component names share a namespace in which they must be unique.
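The shared-namespace rule can be checked locally before a workflow is submitted. The following is a minimal sketch in Python, assuming the workflow has already been loaded into a dict; the helper name find_duplicate_names is hypothetical and is not part of the Calculation Engine API.

```python
from collections import Counter


def find_duplicate_names(workflow):
    """Return names used more than once across processes and components."""
    names = [p["name"] for p in workflow.get("processes", [])]
    names += [c["name"] for c in workflow.get("components", [])]
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Illustrative workflow: the component reuses the process name "cmf",
# which violates the shared-namespace rule.
workflow = {
    "processes": [{"name": "cmf", "module": "cmf_solver"}],
    "components": [{"type": "group", "name": "cmf", "group": ["cmf"]}],
}
duplicates = find_duplicate_names(workflow)
print(duplicates)
```

An empty result means every process and component name is unique within the workflow.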

Processes

Definition

The value of processes is an unordered list of process objects defining unique MUSES module configurations.

A process definition consists of:

  • module: (string) Label that identifies which MUSES module to execute. The available values are defined in the Calculation Engine configuration file and can also be viewed on the module list webpage.

  • name: (string) Unique label within the workflow associated with this particular configuration of the specified module.

  • config: (object) An object specified by key-value pairs that defines the configuration of the specified module. See the module-specific documentation for details of the config object schema.

Several processes can be defined that execute the same module with distinct configurations. In the examples below there are workflows in which the Lepton module is invoked several times but configured differently depending on which EoS module precedes it.

Inputs

There are three options for providing external input files to a process (aside from the module config), illustrated in the example workflow snippet below. All inputs are specified under an inputs or pipes mapping, where the keys are named according to the declared label of the target module input. The schema of the input source spec depends on which option is used, as explained below.

The first process shows how to input an uploaded file. The uuid is a random string assigned by the CE upon upload that uniquely identifies the file. If the upload is not owned by the user submitting the workflow, the owner must set the upload to public. The checksum string is the md5sum of the file, which must be known in advance. If the uploaded file does not match the specified checksum, the workflow will fail.
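The checksum value can be computed locally before the workflow is written. The sketch below uses Python's standard hashlib module; the helper name md5_checksum and the throwaway demonstration file are illustrative assumptions, and the result should match what the md5sum command reports for the same file.

```python
import hashlib
import os
import tempfile


def md5_checksum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file; in practice, pass the path of the
# file that was uploaded to the CE.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"density,pressure\n0.032,0.1\n")
    tmp_path = tmp.name
checksum = md5_checksum(tmp_path)
os.remove(tmp_path)
print(checksum)  # 32 hexadecimal characters for the checksum field
```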

The second process shows how to input a file generated by a previously executed workflow. For this to work, the referenced job must generally be saved; otherwise it may be purged by the periodic garbage collection that the CE performs to conserve disk space. Because a job typically generates multiple output files, the path string is required to uniquely identify the desired file.

The third process shows how to input a file generated by a previous process in the current workflow. The module and process values must match the module name and process name of an item in the processes list to uniquely identify the process that generated the desired file. The label string identifies which output of that producing process to pipe; the mapping key (eos in the example below) names the input of the consuming process that receives the file.

processes:
- name: crust_dft_eos
  module: crust_dft
  inputs:
    EOS_table:
      type: upload
      uuid: d1ed1c63-6192-4ac9-9cb1-a7d82dc27b72
      checksum: 164575f9d84c3ac087780e0219ee2e8a
  config:
    output_format: CSV

- name: lepton-crust_dft
  module: lepton
  inputs:
    input_eos:
      type: job
      uuid: 57388fe3-6932-4b45-b1d0-63463cc828ac
      path: /cmf/opt/output/CMF_output_for_Lepton_baryons.csv
  config:
    global:
      use_charge_neutrality: true

- name: qlimr-crust_dft
  module: qlimr
  pipes:
    eos:
      label: eos_beta_equilibrium
      module: lepton
      process: lepton-crust_dft
  config:
    inputs:
      R_start: 0.0004

Components

The value of components is an ordered list of component objects that are specified recursively. This means that the first component in the list may only reference processes, and subsequent component definitions may reference processes and/or components previously defined. Any component used in the definition of another component is called a subcomponent, and because components can only reference previously defined components in the list, there can be no circular references. Thus, the last component defined in this list is actually the entire workflow that is executed; any subcomponent defined but not recursively referenced within this top-level component is ignored.

A component definition consists of:

  • type: (string) Either chain or group.

  • name: (string) Unique label within the workflow referencing this component in subsequent component definitions.

  • sequence/group: (list) A list of process or component names. If the component is type chain, then sequence is the key and the value is a list of subcomponents to be executed sequentially. If the component is type group, then group is the key and the value is a list of subcomponents to be executed in parallel.
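The ordering rule and the type-specific key described above lend themselves to a simple local check. The sketch below is an illustration under stated assumptions, not the CE's own validation: it walks the components list in order, rejects references to names not yet defined, and reports which components the top-level (last) component never reaches. The function name check_components and the small example workflow are hypothetical.

```python
KEY_BY_TYPE = {"chain": "sequence", "group": "group"}


def check_components(workflow):
    """Validate component ordering; return (top_level_name, ignored_names)."""
    defined = {p["name"] for p in workflow["processes"]}
    children = {}
    for comp in workflow["components"]:
        key = KEY_BY_TYPE.get(comp["type"])
        if key is None:
            raise ValueError(f"{comp['name']}: type must be chain or group")
        unknown = [m for m in comp[key] if m not in defined]
        if unknown:
            raise ValueError(f"{comp['name']} references undefined names: {unknown}")
        children[comp["name"]] = comp[key]
        defined.add(comp["name"])

    # Walk down from the last component to find everything it uses.
    top = workflow["components"][-1]["name"]
    reachable, stack = set(), [top]
    while stack:
        name = stack.pop()
        if name not in reachable:
            reachable.add(name)
            stack.extend(children.get(name, []))
    ignored = [c["name"] for c in workflow["components"] if c["name"] not in reachable]
    return top, ignored


workflow = {
    "processes": [
        {"name": "cmf", "module": "cmf_solver"},
        {"name": "lepton1", "module": "lepton"},
        {"name": "crust", "module": "crust_dft"},
    ],
    "components": [
        {"type": "chain", "name": "chain1", "sequence": ["cmf", "lepton1"]},
        {"type": "group", "name": "spare", "group": ["crust"]},
        {"type": "chain", "name": "top", "sequence": ["chain1"]},
    ],
}
top, ignored = check_components(workflow)
print(top, ignored)
```

In this example the component named spare is defined but never referenced by the top-level component, so it is reported as ignored.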

Examples

The examples below use YAML format to define workflow configurations, because this format is easy for humans to read while supporting the rigorous syntax required to unambiguously define a data structure suitable for machines. Ultimately the workflow definition must be rendered in JSON format suitable for the Calculation Engine API, but as demonstrated in the tutorial, this conversion can be done transparently by any number of libraries such as Python requests.
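To make the YAML-to-JSON relationship concrete, the sketch below builds a small workflow as a Python dict (the same structure a YAML parser would return) and serializes it with the standard json module. The workflow content is abbreviated from the chain example, and the API endpoint and authentication are deliberately omitted because they are covered in the tutorial.

```python
import json

# The same structure a YAML parser would produce from a workflow file.
workflow = {
    "processes": [
        {"name": "chiral_eft_eos", "module": "chiral_eft", "config": {}},
        {
            "name": "lepton-module",
            "module": "lepton",
            "config": {"global": {"use_beta_equilibrium": True}},
            "pipes": {
                "input_eos": {
                    "label": "ChEFT_Output_Lepton",
                    "module": "chiral_eft",
                    "process": "chiral_eft_eos",
                }
            },
        },
    ],
    "components": [
        {
            "type": "chain",
            "name": "workflow",
            "sequence": ["chiral_eft_eos", "lepton-module"],
        }
    ],
}

# Serialize to JSON text; YAML true/false become JSON true/false.
payload = json.dumps(workflow, indent=2)
print(payload)
```

When a library such as requests is used, passing the dict through its json= keyword performs this serialization implicitly, so the explicit json.dumps call is shown here only to make the step visible.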

Chain

A chain is a sequence of components that are executed in order, where previous components in the sequence must successfully complete before the next component is processed. Chain components are required when components are causally dependent on one another.

In the example below, the Chiral EFT module pipes its output to the Lepton module, which must run only if the first process completes successfully.

processes:
  - name: chiral_eft_eos
    module: chiral_eft
    config:
      run_name: 'test_chiral_eft_lepton'
      chiraleft_parameters:
        fitted_parameter_set: 'n3lo-450'
      calculation_options:
        use_multithreading: true
        use_quadratic_asymmetry_expansion: true
      eos_grid:
        density_start: 0.032
        density_end: 0.32
        density_step: 0.032
        isospin_asymmetry_start: 0.0
        isospin_asymmetry_end: 1.0
        isospin_asymmetry_step: 0.25
  - name: lepton-module
    module: lepton
    config:
      global:
        use_beta_equilibrium: true
        use_charge_neutrality: false
        verbose: 2
      output:
        output_derivatives: true
        output_hdf5: false
      particles:
        use_electron: true
        use_muon: true
    pipes:
      input_eos:
        label: ChEFT_Output_Lepton
        module: chiral_eft
        process: chiral_eft_eos
components:
  - type: chain
    name: workflow
    sequence:
      - chiral_eft_eos
      - lepton-module

Group

A group is a set of components that are allowed to run in parallel. Concurrent execution is not actually guaranteed, however, because that depends on the Calculation Engine task queue system and dynamic worker load. Parallel here means that the outputs of the components in a group do not causally depend on one another.

In the example below, there are two chains that execute in parallel. One chain outputs the EoS generated by CMF to the Lepton module. The other chain outputs the EoS generated by Chiral EFT to an independent Lepton module process.

Note:

  • The names of the Lepton module processes must be unique for unambiguous reference when defining components.

  • The order of the process definitions in the processes block does not matter.

  • The order of the chain components (chain1 and chain2) in the group1 definition does not matter.

processes:
  - name: chiral_eft_eos
    module: chiral_eft
    config:
      run_name: 'test_chiral_eft_lepton'
      chiraleft_parameters:
        fitted_parameter_set: 'n3lo-450'
      calculation_options:
        use_multithreading: true
        use_quadratic_asymmetry_expansion: true
      eos_grid:
        density_start: 0.032
        density_end: 0.32
        density_step: 0.032
        isospin_asymmetry_start: 0.0
        isospin_asymmetry_end: 1.0
        isospin_asymmetry_step: 0.25
  - name: cmf
    module: cmf_solver
    config:
      variables:
        chemical_optical_potentials:
          muB_begin: 1000.0
          muB_end: 1400.0
          muB_step: 200.0
          muQ_begin: -300.0
          muQ_end: 0.0
          muQ_step: 150.0
      output_options:
        include_output_lepton: true
  - name: lepton1
    module: lepton
    config:
      global:
        use_beta_equilibrium: true
        use_charge_neutrality: false
    pipes:
      input_eos:
        label: ChEFT_Output_Lepton
        module: chiral_eft
        process: chiral_eft_eos
  - name: lepton2
    module: lepton
    config:
      global:
        use_beta_equilibrium: false
    pipes:
      input_eos:
        label: CMF_for_Lepton_baryons_only
        module: cmf_solver
        process: cmf
components:
  - type: chain
    name: chain1
    sequence:
      - chiral_eft_eos
      - lepton1
  - type: chain
    name: chain2
    sequence:
      - cmf
      - lepton2
  - type: group
    name: group1
    group:
      - chain1
      - chain2

Singleton

A so-called “singleton” workflow consists of a single process. This means that the workflow is identical whether a “group” or a “chain” component is defined.

processes:
  - name: cmf
    module: cmf_solver
    config:
      variables:
        chemical_optical_potentials:
          muB_begin: 1000.0
          muB_end: 1400.0
          muB_step: 200.0
          use_hyperons: false
          use_decuplet: false
          use_quarks: false
components:
  - type: group
    name: run_cmf_test
    group:
      - cmf

Complex workflow

An incomplete “sketch” of the config for the complex workflow depicted in the diagram below is provided here. The purpose is to illustrate how to break down the desired structure and construct it logically piece by piece.

It is left as an exercise for the reader to complete the missing “pipes” connecting the output of processes to the consuming processes and to add the missing config: specification for each process.

complex_workflow_example.png

processes:
  - name: chiral
    module: chiral_eft
  - name: cmf
    module: cmf_solver
  - name: crust
    module: crust_dft
  - name: lepton1
    module: lepton
    pipes:
      input_eos:
        label: CMF_for_Lepton_baryons_only
        module: cmf_solver
        process: cmf
  - name: lepton2
    module: lepton
    pipes:
      input_eos:
        label: CMF_for_Lepton_baryons_only
        module: cmf_solver
        process: cmf
  - name: lepton3
    module: lepton
    pipes:
      input_eos:
        label: ChEFT_Output_Lepton
        module: chiral_eft
        process: chiral
  - name: lepton4
    module: lepton
    pipes:
      input_eos:
        label: e4mma w/o lepton
        module: crust_dft
        process: crust
  - name: synthesis1
    module: synthesis
    pipes: {}
  - name: synthesis2
    module: synthesis
    pipes: {}
  - name: synthesis3
    module: synthesis
    pipes: {}
  - name: qlimr
    module: qlimr
    pipes:
      input_eos:
        label: CMF_for_Lepton_baryons_only
        module: cmf_solver
        process: cmf
  - name: flavor
    module: flavor_equilibration
    pipes: {}
components:
  - type: group
    name: group-leptons
    group:
      - lepton1
      - lepton2
  - type: chain
    name: chain-cmf-leptons
    sequence:
      - cmf
      - group-leptons
      - synthesis1
  - type: chain
    name: chain-chiral-lepton
    sequence:
      - chiral
      - lepton3
  - type: chain
    name: chain-crust-lepton
    sequence:
      - crust
      - lepton4
  - type: group
    name: group-chiral-crust
    group:
      - chain-chiral-lepton
      - chain-crust-lepton
  - type: chain
    name: chain-chiral-crust-synthesis
    sequence:
      - group-chiral-crust
      - synthesis2
  - type: group
    name: eos-syntheses
    group:
      - chain-cmf-leptons
      - chain-chiral-crust-synthesis
  - type: chain
    name: final-synthesis
    sequence:
      - eos-syntheses
      - synthesis3
  - type: group
    name: group-observables
    group:
      - qlimr
      - flavor
  - type: chain
    name: final-observables
    sequence:
      - final-synthesis
      - group-observables
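One way to reason about a nested definition like this is to expand it into process-level dependency edges. The sketch below does so under one plausible reading of the semantics described in this section (each member of a chain waits for every process that finishes the previous member; members of a group impose no ordering on each other). It is not the CE's scheduler, and the function name expand is hypothetical. Only the CMF branch of the workflow is used to keep the example short.

```python
def expand(name, components, edges):
    """Return (entry, exit) process sets of `name`, adding edges as a side effect."""
    comp = components.get(name)
    if comp is None:  # not a component, so it is a plain process name
        return {name}, {name}
    if comp["type"] == "group":
        entries, exits = set(), set()
        for member in comp["group"]:
            first, last = expand(member, components, edges)
            entries |= first
            exits |= last
        return entries, exits
    # chain: all exits of one member must finish before the next member starts
    entries, previous_exits = None, set()
    for member in comp["sequence"]:
        first, last = expand(member, components, edges)
        if entries is None:
            entries = first
        edges.update((up, down) for up in previous_exits for down in first)
        previous_exits = last
    return entries or set(), previous_exits


component_list = [
    {"type": "group", "name": "group-leptons", "group": ["lepton1", "lepton2"]},
    {
        "type": "chain",
        "name": "chain-cmf-leptons",
        "sequence": ["cmf", "group-leptons", "synthesis1"],
    },
]
components = {c["name"]: c for c in component_list}
edges = set()
expand("chain-cmf-leptons", components, edges)
for upstream, downstream in sorted(edges):
    print(upstream, "->", downstream)
```

Each printed pair is a "must finish before" relationship between two processes. Comparing these edges against the pipes declared in the processes block is a useful way to spot a consumer that is not actually downstream of its producer.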