Workflows
Primitive components
Workflows are directed acyclic graphs (DAGs) of individual MUSES module executions, or “processes”, constructed using composite structures called “components” that consist of three primitive elements:
process: the fundamental workflow unit, which executes a single MUSES module
group: a set of components that execute in parallel
chain: a set of components that execute sequentially
Composition
A workflow specification is a data object with two key-value pairs: processes and components. Process names and component names share a namespace in which they must be unique.
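The shared-namespace rule can be checked before submitting a workflow. The following is a minimal sketch, assuming the workflow has already been loaded into a Python dict; the helper name duplicate_names is hypothetical and not part of any MUSES library.

```python
from collections import Counter


def duplicate_names(workflow: dict) -> list[str]:
    """Return every name used more than once across processes and components."""
    names = [p["name"] for p in workflow.get("processes", [])]
    names += [c["name"] for c in workflow.get("components", [])]
    return sorted(name for name, count in Counter(names).items() if count > 1)


workflow = {
    "processes": [
        {"name": "cmf", "module": "cmf_solver"},
        {"name": "lepton1", "module": "lepton"},
    ],
    "components": [
        {"type": "chain", "name": "chain1", "sequence": ["cmf", "lepton1"]},
    ],
}
print(duplicate_names(workflow))  # []

# A component that reuses a process name violates the shared namespace.
clash = {
    "processes": workflow["processes"],
    "components": [{"type": "group", "name": "cmf", "group": ["lepton1"]}],
}
print(duplicate_names(clash))  # ['cmf']
```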
Processes
Definition
The value of processes is an unordered list of process objects defining unique MUSES module configurations.
A process definition consists of:
module: (string) Label that identifies which MUSES module to execute. The available values are defined in the Calculation Engine configuration file and can also be viewed on the module list webpage.
name: (string) Unique label within the workflow associated with this particular configuration of the specified module.
config: (object) An object specified by key-value pairs that defines the configuration of the specified module. See the module-specific documentation for details of the config object schema.
Several processes can execute the same module with distinct configurations. In the examples below there are workflows in which the Lepton module is invoked several times but configured differently depending on which EoS module precedes it.
Inputs
There are three options for providing external input files to a process (aside from the module config), illustrated in the example workflow snippet below. All inputs are specified under an inputs or pipes mapping, where the keys are named according to the declared label of the target module input. The schema for the input source spec varies according to the input option used, as explained below.
The first process shows how to input an uploaded file. The uuid is a random string assigned by the CE upon upload that uniquely identifies the file. If the upload is not owned by the user submitting the workflow, the upload must be set to public by the owner. The checksum string is the md5sum of the file, which must be known in advance. If the uploaded file does not match the specified checksum, the workflow will fail.
The second process shows how to input a file generated by a previously executed workflow. For this to work, the referenced job must generally be saved so that it is not purged by the periodic garbage collection the CE performs to conserve disk space. Because a job typically generates multiple output files, the path string is required to uniquely identify the desired file.
The third process shows how to input a file generated by a previous process in the current workflow. The module and process values must match the module name and process name of an item in the processes list to uniquely identify the process that generated the desired file. The label string identifies which output of that producing process is piped; the mapping key (eos in the example) names the consuming process’s input that receives the file.
processes:
  - name: crust_dft_eos
    module: crust_dft
    inputs:
      EOS_table:
        type: upload
        uuid: d1ed1c63-6192-4ac9-9cb1-a7d82dc27b72
        checksum: 164575f9d84c3ac087780e0219ee2e8a
    config:
      output_format: CSV
  - name: lepton-crust_dft
    module: lepton
    inputs:
      input_eos:
        type: job
        uuid: 57388fe3-6932-4b45-b1d0-63463cc828ac
        path: /cmf/opt/output/CMF_output_for_Lepton_baryons.csv
    config:
      global:
        use_charge_neutrality: true
  - name: qlimr-crust_dft
    module: qlimr
    pipes:
      eos:
        label: eos_beta_equilibrium
        module: lepton
        process: lepton-crust_dft
    config:
      inputs:
        R_start: 0.0004
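Because the checksum of an uploaded file must be known in advance, it can be convenient to compute it locally. The sketch below uses Python's standard hashlib module; the helper names md5_hex and upload_input are hypothetical, and the dict layout simply mirrors the upload entry in the snippet above.

```python
import hashlib


def md5_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_input(uuid: str, local_copy: str) -> dict:
    """Build an input source spec of type 'upload' (hypothetical helper).

    The uuid is the one assigned by the CE at upload time; local_copy is a
    local copy of the same file, used only to compute the checksum.
    """
    return {"type": "upload", "uuid": uuid, "checksum": md5_hex(local_copy)}
```

The returned dict can be placed under the relevant key of a process's inputs mapping, for example EOS_table in the first process above.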
Components
The value of components is an ordered list of component objects that are specified recursively. This means that the first component in the list may only reference processes, and subsequent component definitions may reference processes and/or components previously defined. Any component used in the definition of another component is called a subcomponent, and because components can only reference previously defined components in the list, there can be no circular references. Thus, the last component defined in this list is actually the entire workflow that is executed; any subcomponent defined but not recursively referenced within this top-level component is ignored.
A component definition consists of:
type: (string) Either chain or group.
name: (string) Unique label within the workflow, used to reference this component in subsequent component definitions.
sequence / group: (list) A list of process or component names. If the component is of type chain, then sequence is the key and the value is a list of subcomponents to be executed sequentially. If the component is of type group, then group is the key and the value is a list of subcomponents to be executed in parallel.
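The recursive definition can be illustrated with a short sketch that walks the components list in order, rejects references to names that have not been defined yet, and reports which processes the final (top-level) component actually contains. The function name reachable_processes is hypothetical, and this is not the Calculation Engine's own validation logic.

```python
def reachable_processes(workflow: dict) -> set[str]:
    """Return the processes contained in the last (top-level) component."""
    process_names = {p["name"] for p in workflow["processes"]}
    expanded: dict[str, set[str]] = {}  # component name -> processes it contains
    for comp in workflow["components"]:
        key = "sequence" if comp["type"] == "chain" else "group"
        members: set[str] = set()
        for ref in comp[key]:
            if ref in process_names:
                members.add(ref)
            elif ref in expanded:  # only previously defined components are visible
                members |= expanded[ref]
            else:
                raise ValueError(
                    f"{comp['name']!r} references undefined name {ref!r}"
                )
        expanded[comp["name"]] = members
    return expanded[workflow["components"][-1]["name"]]


workflow = {
    "processes": [
        {"name": "chiral_eft_eos", "module": "chiral_eft"},
        {"name": "cmf", "module": "cmf_solver"},
        {"name": "lepton1", "module": "lepton"},
        {"name": "lepton2", "module": "lepton"},
    ],
    "components": [
        {"type": "chain", "name": "chain1", "sequence": ["chiral_eft_eos", "lepton1"]},
        {"type": "chain", "name": "chain2", "sequence": ["cmf", "lepton2"]},
        # The top-level component only references chain1, so chain2 is ignored.
        {"type": "group", "name": "group1", "group": ["chain1"]},
    ],
}
print(sorted(reachable_processes(workflow)))  # ['chiral_eft_eos', 'lepton1']
```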
Examples
The examples below use YAML format to define workflow configurations, because this format is easy for humans to read while supporting the rigorous syntax required to unambiguously define a data structure suitable for machines. Ultimately the workflow definition must be rendered in JSON format suitable for the Calculation Engine API, but as demonstrated in the tutorial, this conversion can be done transparently by any number of libraries such as Python requests.
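As a rough illustration of the JSON rendering step, the sketch below serializes a workflow held as a Python dict using the standard json module. It assumes the YAML has already been loaded into a dict by a YAML parser (not shown); the dict is an abbreviated version of the chain example later in this section.

```python
import json

# Abbreviated workflow: most config keys are omitted for brevity.
workflow = {
    "processes": [
        {
            "name": "chiral_eft_eos",
            "module": "chiral_eft",
            "config": {"run_name": "test_chiral_eft_lepton"},
        },
        {
            "name": "lepton-module",
            "module": "lepton",
            "config": {"global": {"use_beta_equilibrium": True}},
            "pipes": {
                "input_eos": {
                    "label": "ChEFT_Output_Lepton",
                    "module": "chiral_eft",
                    "process": "chiral_eft_eos",
                }
            },
        },
    ],
    "components": [
        {
            "type": "chain",
            "name": "workflow",
            "sequence": ["chiral_eft_eos", "lepton-module"],
        }
    ],
}

payload = json.dumps(workflow, indent=2)  # Python True becomes JSON true
print(payload)
```

When using requests, a dict like this can be passed through the json= keyword argument of requests.post, which performs the same serialization. The endpoint URL and authentication are covered in the tutorial and are not assumed here.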
Chain
A chain is a sequence of components that are executed in order, where previous components in the sequence must successfully complete before the next component is processed. Chain components are required when components are causally dependent on one another.
In the example below, the Chiral EFT module pipes its output to the Lepton module, which must only run if the first process completes successfully.
processes:
  - name: chiral_eft_eos
    module: chiral_eft
    config:
      run_name: 'test_chiral_eft_lepton'
      chiraleft_parameters:
        fitted_parameter_set: 'n3lo-450'
      calculation_options:
        use_multithreading: true
        use_quadratic_asymmetry_expansion: true
      eos_grid:
        density_start: 0.032
        density_end: 0.32
        density_step: 0.032
        isospin_asymmetry_start: 0.0
        isospin_asymmetry_end: 1.0
        isospin_asymmetry_step: 0.25
  - name: lepton-module
    module: lepton
    config:
      global:
        use_beta_equilibrium: true
        use_charge_neutrality: false
        verbose: 2
      output:
        output_derivatives: true
        output_hdf5: false
      particles:
        use_electron: true
        use_muon: true
    pipes:
      input_eos:
        label: ChEFT_Output_Lepton
        module: chiral_eft
        process: chiral_eft_eos
components:
  - type: chain
    name: workflow
    sequence:
      - chiral_eft_eos
      - lepton-module
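Since a piped input only exists once its producer has finished, each consumer in a chain should appear after the process it pipes from. The sketch below checks this for a chain whose sequence lists process names directly; the function name misordered_pipes is hypothetical, and whether the Calculation Engine performs a similar check is not stated here.

```python
def misordered_pipes(processes: list[dict], sequence: list[str]) -> list[tuple]:
    """Return (consumer, input_label, producer) for pipes whose producer
    does not appear earlier in the chain's sequence."""
    position = {name: index for index, name in enumerate(sequence)}
    by_name = {p["name"]: p for p in processes}
    problems = []
    for name in sequence:
        for input_label, source in by_name[name].get("pipes", {}).items():
            producer = source["process"]
            if producer not in position or position[producer] >= position[name]:
                problems.append((name, input_label, producer))
    return problems


processes = [
    {"name": "chiral_eft_eos", "module": "chiral_eft"},
    {
        "name": "lepton-module",
        "module": "lepton",
        "pipes": {
            "input_eos": {
                "label": "ChEFT_Output_Lepton",
                "module": "chiral_eft",
                "process": "chiral_eft_eos",
            }
        },
    },
]

print(misordered_pipes(processes, ["chiral_eft_eos", "lepton-module"]))
# []
print(misordered_pipes(processes, ["lepton-module", "chiral_eft_eos"]))
# [('lepton-module', 'input_eos', 'chiral_eft_eos')]
```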
Group
A group is a set of components that are allowed to run in parallel. Concurrent execution is not actually guaranteed, however, because that depends on the Calculation Engine task queue system and dynamic worker load. Parallel here means that the outputs of the components in a group do not causally depend on one another.
In the example below, there are two chains that execute in parallel. One chain outputs the EoS generated by CMF to the Lepton module. The other chain outputs the EoS generated by Chiral EFT to an independent Lepton module process.
Note:
The name of the Lepton module processes must be unique for unambiguous reference when defining components.
The order of the process definitions in the processes block does not matter.
The order of the chain components (chain1 and chain2) in the group1 definition does not matter.
processes:
  - name: chiral_eft_eos
    module: chiral_eft
    config:
      run_name: 'test_chiral_eft_lepton'
      chiraleft_parameters:
        fitted_parameter_set: 'n3lo-450'
      calculation_options:
        use_multithreading: true
        use_quadratic_asymmetry_expansion: true
      eos_grid:
        density_start: 0.032
        density_end: 0.32
        density_step: 0.032
        isospin_asymmetry_start: 0.0
        isospin_asymmetry_end: 1.0
        isospin_asymmetry_step: 0.25
  - name: cmf
    module: cmf_solver
    config:
      variables:
        chemical_optical_potentials:
          muB_begin: 1000.0
          muB_end: 1400.0
          muB_step: 200.0
          muQ_begin: -300.0
          muQ_end: 0.0
          muQ_step: 150.0
      output_options:
        include_output_lepton: true
  - name: lepton1
    module: lepton
    config:
      global:
        use_beta_equilibrium: true
        use_charge_neutrality: false
    pipes:
      input_eos:
        label: ChEFT_Output_Lepton
        module: chiral_eft
        process: chiral_eft_eos
  - name: lepton2
    module: lepton
    config:
      global:
        use_beta_equilibrium: false
    pipes:
      input_eos:
        label: CMF_for_Lepton_baryons_only
        module: cmf_solver
        process: cmf
components:
  - type: chain
    name: chain1
    sequence:
      - chiral_eft_eos
      - lepton1
  - type: chain
    name: chain2
    sequence:
      - cmf
      - lepton2
  - type: group
    name: group1
    group:
      - chain1
      - chain2
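To make the causal structure of chains and groups concrete, the sketch below expands the components of this example into process-level dependencies and then uses Python's standard graphlib module (Python 3.9+) to list which processes could become ready together. The function names are hypothetical and the expansion is deliberately simplified; actual scheduling is decided by the Calculation Engine task queue, not by this code.

```python
from graphlib import TopologicalSorter


def build_dependencies(workflow: dict) -> dict[str, set[str]]:
    """Map each process in the top-level component to the processes it waits on."""
    deps = {p["name"]: set() for p in workflow["processes"]}
    members: dict[str, set[str]] = {}  # component name -> processes it contains
    for comp in workflow["components"]:
        key = "sequence" if comp["type"] == "chain" else "group"
        # A reference is either an earlier component or a bare process name.
        parts = [members.get(ref, {ref}) for ref in comp[key]]
        if comp["type"] == "chain":
            # Every process in one chain element waits on the preceding element.
            for earlier, later in zip(parts, parts[1:]):
                for proc in later:
                    deps[proc] |= earlier
        # Group elements add no dependencies between one another.
        members[comp["name"]] = set().union(*parts)
    top = members[workflow["components"][-1]["name"]]
    return {name: deps[name] & top for name in top}


def stages(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group processes into successive batches whose dependencies are all met."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


workflow = {
    "processes": [
        {"name": "chiral_eft_eos"}, {"name": "cmf"},
        {"name": "lepton1"}, {"name": "lepton2"},
    ],
    "components": [
        {"type": "chain", "name": "chain1", "sequence": ["chiral_eft_eos", "lepton1"]},
        {"type": "chain", "name": "chain2", "sequence": ["cmf", "lepton2"]},
        {"type": "group", "name": "group1", "group": ["chain1", "chain2"]},
    ],
}
print(stages(build_dependencies(workflow)))
# [['chiral_eft_eos', 'cmf'], ['lepton1', 'lepton2']]
```

Swapping chain1 and chain2 inside group1 yields the same result, which reflects the note above that their order in the group does not matter.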
Singleton
A so-called “singleton” workflow consists of a single process. This means that the workflow is identical whether a “group” or a “chain” component is defined.
processes:
  - name: cmf
    module: cmf_solver
    config:
      variables:
        chemical_optical_potentials:
          muB_begin: 1000.0
          muB_end: 1400.0
          muB_step: 200.0
          use_hyperons: false
          use_decuplet: false
          use_quarks: false
components:
  - type: group
    name: run_cmf_test
    group:
      - cmf
Complex workflow
An incomplete “sketch” of the config for the complex workflow depicted in the diagram below is provided here. The purpose is to illustrate how to break down the desired structure and construct it logically piece by piece.
It is left as an exercise for the reader to complete the missing “pipes” connecting the output of processes to the consuming processes and to add the missing config specification for each process.
processes:
  - name: chiral
    module: chiral_eft
  - name: cmf
    module: cmf_solver
  - name: crust
    module: crust_dft
  - name: lepton1
    module: lepton
    pipes:
      input_eos:
        label: CMF_for_Lepton_baryons_only
        module: cmf_solver
        process: cmf
  - name: lepton2
    module: lepton
    pipes:
      input_eos:
        label: CMF_for_Lepton_baryons_only
        module: cmf_solver
        process: cmf
  - name: lepton3
    module: lepton
    pipes:
      input_eos:
        label: ChEFT_Output_Lepton
        module: chiral_eft
        process: chiral
  - name: lepton4
    module: lepton
    pipes:
      input_eos:
        label: e4mma w/o lepton
        module: crust_dft
        process: crust
  - name: synthesis1
    module: synthesis
    pipes: {}
  - name: synthesis2
    module: synthesis
    pipes: {}
  - name: synthesis3
    module: synthesis
    pipes: {}
  - name: qlimr
    module: qlimr
    pipes:
      input_eos:
        label: CMF_for_Lepton_baryons_only
        module: cmf_solver
        process: cmf
  - name: flavor
    module: flavor_equilibration
    pipes: {}
components:
  - type: group
    name: group-leptons
    group:
      - lepton1
      - lepton2
  - type: chain
    name: chain-cmf-leptons
    sequence:
      - cmf
      - group-leptons
      - synthesis1
  - type: chain
    name: chain-chiral-lepton
    sequence:
      - chiral
      - lepton3
  - type: chain
    name: chain-crust-lepton
    sequence:
      - crust
      - lepton4
  - type: group
    name: group-chiral-crust
    group:
      - chain-chiral-lepton
      - chain-crust-lepton
  - type: chain
    name: chain-chiral-crust-synthesis
    sequence:
      - group-chiral-crust
      - synthesis2
  - type: group
    name: eos-syntheses
    group:
      - chain-cmf-leptons
      - chain-chiral-crust-synthesis
  - type: chain
    name: final-synthesis
    sequence:
      - eos-syntheses
      - synthesis3
  - type: group
    name: group-observables
    group:
      - qlimr
      - flavor
  - type: chain
    name: final-observables
    sequence:
      - final-synthesis
      - group-observables
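When building a nested workflow like this piece by piece, it can help to print the component tree and compare it against the intended structure. The sketch below renders the components of the workflow above as an indented outline, starting from the last (top-level) component; the function name outline is hypothetical and only the components list is needed.

```python
def outline(workflow: dict) -> list[str]:
    """Return an indented outline of the top-level component and its children."""
    components = {c["name"]: c for c in workflow["components"]}
    lines: list[str] = []

    def visit(name: str, depth: int) -> None:
        comp = components.get(name)
        if comp is None:  # not a component, so treat it as a process name
            lines.append("  " * depth + name)
            return
        key = "sequence" if comp["type"] == "chain" else "group"
        lines.append("  " * depth + f"{name} ({comp['type']})")
        for child in comp[key]:
            visit(child, depth + 1)

    visit(workflow["components"][-1]["name"], 0)
    return lines


workflow = {"components": [
    {"type": "group", "name": "group-leptons", "group": ["lepton1", "lepton2"]},
    {"type": "chain", "name": "chain-cmf-leptons",
     "sequence": ["cmf", "group-leptons", "synthesis1"]},
    {"type": "chain", "name": "chain-chiral-lepton", "sequence": ["chiral", "lepton3"]},
    {"type": "chain", "name": "chain-crust-lepton", "sequence": ["crust", "lepton4"]},
    {"type": "group", "name": "group-chiral-crust",
     "group": ["chain-chiral-lepton", "chain-crust-lepton"]},
    {"type": "chain", "name": "chain-chiral-crust-synthesis",
     "sequence": ["group-chiral-crust", "synthesis2"]},
    {"type": "group", "name": "eos-syntheses",
     "group": ["chain-cmf-leptons", "chain-chiral-crust-synthesis"]},
    {"type": "chain", "name": "final-synthesis",
     "sequence": ["eos-syntheses", "synthesis3"]},
    {"type": "group", "name": "group-observables", "group": ["qlimr", "flavor"]},
    {"type": "chain", "name": "final-observables",
     "sequence": ["final-synthesis", "group-observables"]},
]}

print("\n".join(outline(workflow)))
```

The first printed line is final-observables (chain), followed by final-synthesis and group-observables one level deeper, and so on down to the individual process names.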