Modules

A MUSES Module is scientific calculation software packaged according to the MUSES framework for compatibility with the Calculation Engine (CE) such that it can be used in processing workflows.

Note

In this document, “Module” will be capitalized to distinguish it from the generic computer science term and avoid confusion when discussing, for example, Python modules.

Glossary

  • script - a file whose statements are executed line-by-line to take specific actions or generate particular outputs.

  • module - a collection of related functions and classes that are reusable. Multiple scripts may import the same module to reduce redundant code.

  • library - a collection of modules and scripts with a well-defined interface designed to provide a specific set of capabilities.

  • application - similar to a library, but is executed to provide a service or perform a set of tasks.

  • package - a self-contained unit of software that can be installed, typically via a package manager (apt, brew, pip, npm). A package can install a library or an application.

MUSES framework and Module registration

The MUSES framework comprises a set of requirements for packaging a calculation library into a Module such that it can be included in workflows executed by the Calculation Engine. The framework is expressed here as instructions for how to register a Module with the CE.

Registering a Module requires opening a merge request on the CE source code repo to update the /config/config.yaml file with specific information as detailed below.

Create a list item under modules declaring the name of your Module:

modules:
- name: "lepton"

The subsequent explanations will not show the full indentation but will assume key-value pairs in parallel with the Module name.

Container image

When running workflows, the CE runs Module images as containers and does not use the source code directly. This decouples the runtime software environments of the CE and the Modules and improves consistency of operation across computing platforms. In a section called image, specify the container registry, repository, and tag from which your Module image can be downloaded. The image must be publicly accessible.

image:
  registry: "registry.gitlab.com"
  repo: "nsf-muses/module-cmf/lepton-module"
  tag: "v0.9.2"
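As an illustration of how the three fields relate to a pullable image name, the following minimal sketch joins them using the conventional registry/repository:tag form. The helper name image_reference is hypothetical, and whether the CE assembles the reference in exactly this way is an assumption.

```python
def image_reference(registry: str, repo: str, tag: str) -> str:
    """Join the fields of an image section into one image reference."""
    # Conventional container image naming: <registry>/<repository>:<tag>
    return f"{registry}/{repo}:{tag}"


# Values copied from the image example above.
image = {
    "registry": "registry.gitlab.com",
    "repo": "nsf-muses/module-cmf/lepton-module",
    "tag": "v0.9.2",
}
print(image_reference(**image))
```

Pulling the printed reference anonymously with a container client is a quick way to confirm that the image is publicly accessible before opening the merge request.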

Source code repo

Specify the Git repo URL and an immutable tag or commit that identifies the source code of your Module. This must be the source code built into your Module image. You must be able to reference a CI/CD pipeline that produced the image; images built manually outside such a pipeline are not acceptable. If necessary, the CE developers can build the image from source themselves and push it to a public registry.

source:
  url: "https://gitlab.com/nsf-muses/module-cmf/lepton-module"
  targetRevision: "v0.9.2"

Module Documentation

Include a docs section in which you declare the path to your Module documentation, relative to the root of the Git repo declared above. The documentation must be compatible with Sphinx so that it can be compiled for inclusion in this documentation.

docs:
  path: "docs/src"

Execution command

Specify the command to run in the container to execute the calculation.

command:
  - '/bin/sh'
  - '-c'
  - 'python3 validate_config.py && ./lepton'

Inputs

A Module can label specific input files at static filepaths to allow files generated by previous processes in a workflow to “pipe” data into the Module. See Workflows for more information about declaring pipes.

Specify the input files as shown in the example code. Each file spec includes:

  • (required) a label consisting only of lowercase alphanumeric characters and underscores

  • (required) a description for documentation purposes

  • (required) a path within the container to which the file will be mounted prior to execution.

  • (optional) a boolean key named required denoting whether the input file must exist (default is true)

inputs:
  - label: config
    required: true
    description: "Configuration of the module runtime options"
    path: "/opt/input/config.yaml"
  - label: input_eos
    required: false
    description: "Nuclear table input file gridded in temperature and chemical potentials"
    path: "/opt/input/nuclear_grid.csv"

The runtime configuration for a Module must be accepted as a YAML-formatted file at a static filepath. This input must be labeled config; the CE writes the runtime configuration specified by the user to its path.
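The rules for an input file spec can be checked before opening a merge request. The sketch below is one possible pre-submission check, not part of the CE; the function names check_input_spec and has_config_input are hypothetical, and the specs are assumed to have already been loaded from YAML into plain dictionaries.

```python
import re

# Labels may contain only lowercase alphanumeric characters and underscores.
LABEL_PATTERN = re.compile(r"[a-z0-9_]+")


def check_input_spec(spec: dict) -> list[str]:
    """Return a list of problems found in one input file spec (empty if none)."""
    problems = []
    for key in ("label", "description", "path"):
        if not spec.get(key):
            problems.append(f"missing required key: {key}")
    label = spec.get("label")
    if isinstance(label, str) and label and not LABEL_PATTERN.fullmatch(label):
        problems.append(f"invalid label: {label!r}")
    # 'required' is optional and defaults to true when omitted.
    if not isinstance(spec.get("required", True), bool):
        problems.append("'required' must be a boolean")
    return problems


def has_config_input(specs: list[dict]) -> bool:
    """Check that one input is labeled 'config', as the CE expects."""
    return any(spec.get("label") == "config" for spec in specs)


inputs = [
    {"label": "config", "required": True,
     "description": "Configuration of the module runtime options",
     "path": "/opt/input/config.yaml"},
    {"label": "Input-EOS",  # deliberately invalid: uppercase and a hyphen
     "description": "Nuclear table input file",
     "path": "/opt/input/nuclear_grid.csv"},
]
for spec in inputs:
    print(spec["label"], check_input_spec(spec))
print("config input declared:", has_config_input(inputs))
```

The same label check applies to output specs, since they follow the same label rule.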

Outputs

Output data generated by a Module execution must be a set of one or more files.

A Module can label specific output files for use by subsequent processes in a workflow. The syntax is described in the next section. These labeled output files can be referenced in the “piped” input spec of workflow process definitions. See Workflows for more information about declaring pipes.

Specify the output files as shown in the example code. Each file spec includes:

  • (required) a label consisting only of lowercase alphanumeric characters and underscores

  • (required) a description for documentation purposes

  • (required) a path within the container at which the file is expected after execution.

outputs:
  - label: status
    description: "Module execution status"
    path: "/opt/output/status.yaml"
  - label: eos_charge_neutrality
    description: "Grid with charge neutral matter. Full output"
    path: "/opt/output/charge_neutrality_eos.csv"
  - label: lepton_eos
    description: "Leptonic EoS"
    path: "/opt/output/lepton_eos.csv"
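Because each declared output is expected at a fixed path once the container stops, a Module author may want to confirm locally that a run produced every file. The following sketch shows one way to do that; the function name missing_outputs and the root parameter are hypothetical, and this is not a description of how the CE itself collects outputs.

```python
from pathlib import Path


def missing_outputs(specs: list[dict], root: Path = Path("/")) -> list[str]:
    """Return the labels whose declared file is absent under root."""
    missing = []
    for spec in specs:
        # Re-root the absolute container path so the check also works on a
        # directory tree copied out of the container.
        target = root / spec["path"].lstrip("/")
        if not target.is_file():
            missing.append(spec["label"])
    return missing


# Paths copied from the outputs example above.
outputs = [
    {"label": "status", "path": "/opt/output/status.yaml"},
    {"label": "lepton_eos", "path": "/opt/output/lepton_eos.csv"},
]
print(missing_outputs(outputs))
```

Run inside the container, the default root of / checks the declared paths directly; pass a different root to inspect a copy of the output directory on the host.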

One YAML-formatted output file labeled status is required at a static filepath; its OpenAPI schema is given below. The CE parses this status file after the Module container stops in order to fetch information about the process from the Module. Standard HTTP status codes signal success or the failure mode.

openapi: 3.0.0
components:
  schemas:
    Status:
      title: Status
      description: Status of module execution including a code and optionally a message
      type: object
      required:
        - code
      properties:
        message:
          type: string
          default: ""
        code:
          type: integer
          enum: [200, 400, 500]
          description: >
            Status code:
            * `200` - Success/OK
            * `400` - Bad request/client error
            * `500` - Internal error
          default: 200
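A Module written in Python could produce a conforming status file with a small helper such as the one sketched below. The function name write_status is hypothetical. To stay within the standard library, the sketch formats the two fields by hand and relies on a JSON string also being a valid double-quoted YAML scalar for typical messages; a Module that already depends on a YAML library would normally use that instead.

```python
import json
from pathlib import Path

# Allowed values of "code", taken from the enum in the Status schema.
ALLOWED_CODES = (200, 400, 500)


def write_status(path: Path, code: int = 200, message: str = "") -> None:
    """Write a status file containing the fields of the Status schema."""
    if code not in ALLOWED_CODES:
        raise ValueError(f"code must be one of {ALLOWED_CODES}, got {code}")
    path.parent.mkdir(parents=True, exist_ok=True)
    # json.dumps quotes and escapes the message so that colons, quotes, and
    # newlines in it cannot break the YAML structure.
    path.write_text(f"code: {code}\nmessage: {json.dumps(message)}\n")
```

For example, a Module that rejects its configuration might call write_status(Path("/opt/output/status.yaml"), 400, "temperature grid is empty") before exiting, using the status path from the outputs example.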