
[Track] enable e2e process to add new training machine #1910

@sunya-ch

What would you like to be added?

This issue is broken out from #1906 to focus on its first task.
We need automation that integrates a new machine into the Kepler-metal-ci training and validation report.
We also need a dispatch CI workflow that pushes the trained model to kepler-model-db, making it available both there and to kepler.

Previous issue: sustainable-computing-io/kepler-model-server#258

Why is this needed?

Action items

  • clean up unused workflow files
  • generalize the training and validation flow to support different actions for creating a runner
  • add a machine layer to the validation result page
  • add a manual workflow that fetches kepler-model-db, adds the trained model, and pushes a PR signed by the machine-specific account

Next step (future action items)

  • dispatch the PR push workflow on release
  • dispatch the PR to the kepler repo once the newly trained model is merged
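
As a rough illustration of the dispatch steps listed above, a workflow that declares an `on: workflow_dispatch` trigger can be started through the GitHub REST API. The sketch below only builds the request and does not send it. The workflow file name `push-model-pr.yml` and the inputs `machine_id` and `model_version` are hypothetical placeholders, not names taken from this issue or from the existing repositories.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_dispatch_request(owner, repo, workflow_file, ref, inputs, token):
    """Build (but do not send) a workflow_dispatch request for one workflow."""
    url = (
        f"{GITHUB_API}/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # The endpoint expects the git ref to run on plus the workflow's declared inputs.
    body = json.dumps({"ref": ref, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # All values below are assumptions for illustration only.
    req = build_dispatch_request(
        owner="sustainable-computing-io",
        repo="kepler-metal-ci",
        workflow_file="push-model-pr.yml",  # hypothetical workflow file
        ref="main",
        inputs={"machine_id": "example-machine", "model_version": "0.0.0"},
        token="<token>",
    )
    print(req.get_method(), req.full_url)
    # To actually trigger the run: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the payload easy to inspect before a machine-specific token is involved. Whether the real workflows take inputs of this shape is an open design question for the action items above.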

Metadata

Assignees

No one assigned

Labels

  • gh-action: This issue is related to kepler-action
  • kind/feature: New feature or request
  • metal-ci: This issue is related to kepler-metal-ci
  • model-db: This issue is related to kepler-model-db
  • model-server: This issue is related to kepler-model-server
