The wrong loss function is chosen at evaluation

The loss function is always `moe_loss_func`, as can be seen [here](https://github.com/stanford-futuredata/Megatron-LM/blob/3a9e3d8de308e6f6398b59d16a8bd7177374f121/pretrain_gpt.py#L128). However, the loss is only calculated in training mode, as can be seen [here](https://github.com/stanford-futuredata/megablocks/blob/f05609ce69c1e1a7dd008c49cf435ef74df84b69/megablocks/layers/moe.py#L427-L428). We should fall back to the original loss function during evaluation.
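A minimal sketch of the proposed fallback, not the actual Megatron-LM or megablocks code. Everything here except the name `moe_loss_func` is hypothetical: `base_loss_func` stands in for the original loss function, `make_moe_loss_func` and `select_loss_func` are illustrative helpers, and the losses are plain floats rather than tensors.

```python
from typing import Callable

LossFunc = Callable[[float], float]


def base_loss_func(lm_loss: float) -> float:
    """Hypothetical stand-in for the original (non-MoE) loss function."""
    return lm_loss


def make_moe_loss_func(aux_loss: float) -> LossFunc:
    """Build a stand-in for `moe_loss_func`.

    Assumption: the MoE loss function adds an auxiliary term that is
    only available when the model runs in training mode.
    """
    def moe_loss_func(lm_loss: float) -> float:
        return lm_loss + aux_loss

    return moe_loss_func


def select_loss_func(training: bool,
                     moe_loss_func: LossFunc,
                     fallback: LossFunc) -> LossFunc:
    """Use the MoE loss function only in training; otherwise fall back."""
    return moe_loss_func if training else fallback


moe_loss_func = make_moe_loss_func(aux_loss=0.5)

# Training step: the auxiliary term is included.
train_loss = select_loss_func(True, moe_loss_func, base_loss_func)(2.0)

# Evaluation step: the original loss function is used instead.
eval_loss = select_loss_func(False, moe_loss_func, base_loss_func)(2.0)
```

In a real fix the `training` flag would presumably come from the model's mode (for example, a PyTorch module's `training` attribute), but where exactly to make that check in `pretrain_gpt.py` is left open here.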

the wrong loss func was chosen at evaluation #93

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

the wrong loss func was chosen at evaluation #93

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions