
refactoring branch such that ML model related #182


Draft · wants to merge 1 commit into base: dtensor
Conversation

vedhasua

ML model related stuff is moved out of openpmd-streaming-continual-learning.py and put into a separate directory.

vedhasua marked this pull request as draft April 14, 2025 13:17
vedhasua requested a review from jkelling April 14, 2025 13:17
@@ -5,5 +5,5 @@ module load gcc/12.2.0 cuda/12.1 openmpi/4.1.5-cuda121-gdr ucx/1.14.0-gdr \
 # openpmd/0.15.2-cuda121-blosc2-py3122
 # for (re-)instaling openpmd-api
 export openPMD_USE_MPI=ON
-source /home/kelling/checkout/insitumlNp2Torch26Env/bin/activate
+source /home/pandit52/venvs/Ism/bin/activate
Contributor

No edit wars: do not commit local config changes (at least not into the PR).

from inSituML.ks_models import INNModel


class ModelFinal(nn.Module):
Contributor

Having a more generic concept of the model to be trained is great. Note, though, that this class remains a somewhat specific instance: it has an encoder, a decoder and an inner model. Please rename it to something more descriptive to reflect this.
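
A possible shape of such a renamed wrapper, sketched below; the class name EncoderDecoderINNModel and the constructor signature are illustrative and not taken from this PR:

import torch.nn as nn


class EncoderDecoderINNModel(nn.Module):
    """Wraps an encoder, a decoder and an inner (invertible) model."""

    def __init__(self, encoder, decoder, inner_model):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder
        self.inner_model = inner_model

    def forward(self, x):
        # Encode the input, transform it with the inner model, then decode.
        z = self.encoder(x)
        z = self.inner_model(z)
        return self.decoder(z)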

Comment on lines +8 to +9
from inSituML.encoder_decoder import Encoder
from inSituML.encoder_decoder import Conv3DDecoder
Contributor

The Encoder and Decoder classes should be declared in model_config (not fully defined there, just imported from the model library in general).
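
A minimal sketch of how model_config could declare the classes without defining them; the variable names are illustrative, only the imports are taken from the snippet above:

# model_config.py (sketch): declare which classes to use, do not define them here.
from inSituML.encoder_decoder import Encoder, Conv3DDecoder

encoder_class = Encoder
decoder_class = Conv3DDecoder

The model factory would then look up encoder_class and decoder_class from model_config instead of importing them itself.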

Comment on lines +86 to +100
ndim_tot=config["ndim_tot"],
ndim_x=config["ndim_x"],
ndim_y=config["ndim_y"],
ndim_z=config["ndim_z"],
loss_fit=fit,
loss_latent=MMD_multiscale,
loss_backward=MMD_multiscale,
lambd_predict=config["lambd_predict"],
lambd_latent=config["lambd_latent"],
lambd_rev=config["lambd_rev"],
zeros_noise_scale=config["zeros_noise_scale"],
y_noise_scale=config["y_noise_scale"],
hidden_size=config["hidden_size"],
activation=config["activation"],
num_coupling_layers=config["num_coupling_layers"],
Contributor

These parameters should come from a dictionary in model_config. Most of them already do; that dictionary could be renamed to something more descriptive than "config", e.g. "inner_model_config". The details can then be left out here by just passing **model_config.inner_model_config to the inner model constructor.
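
A sketch of the suggested grouping; the keys mirror the arguments above, but the values shown are placeholders, not taken from this PR:

# model_config.py (sketch): collect the inner-model hyperparameters in one dictionary.
inner_model_config = {
    "ndim_tot": 512,
    "ndim_x": 256,
    "ndim_y": 128,
    "ndim_z": 128,
    "lambd_predict": 1.0,
    "lambd_latent": 1.0,
    "lambd_rev": 1.0,
    "zeros_noise_scale": 5e-2,
    "y_noise_scale": 1e-1,
    "hidden_size": 128,
    "activation": "relu",
    "num_coupling_layers": 4,
}

# At the call site, only the loss functions remain explicit:
# inner_model = INNModel(
#     loss_fit=fit,
#     loss_latent=MMD_multiscale,
#     loss_backward=MMD_multiscale,
#     **model_config.inner_model_config,
# )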

Comment on lines +146 to +168
optimizer = optim.Adam(
    [
        {
            "params": model.base_network.parameters(),
            "lr": lr * config["lrAEmult"],
        },
        {"params": model.inner_model.parameters()},
    ],  # model.parameters()
    lr=lr,
    betas=config["betas"],
    eps=config["eps"],
    weight_decay=config["weight_decay"],
)
if ("lr_annealingRate" not in config) or config["lr_annealingRate"] is None:
    scheduler = None
else:
    scheduler = torch.optim.lr_scheduler.StepLR(
        optimizer, step_size=500, gamma=config["lr_annealingRate"]
    )

return optimizer, scheduler, model
Contributor

This goes beyond what should be in a model_factory module, because these are parameters of the training; it is fine for the load_objects functions, though. Rename the file accordingly.

The optimizer and LR scheduler should also be configurable in module_config, but with defaults available.
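
One way the optimizer and scheduler could be made configurable with defaults, sketched under the assumption that the same config dictionary is reused; the keys "optimizer" and "lr_step_size" and the default values are assumptions, not from this PR:

import torch
import torch.optim as optim


def build_optimizer(model, config, lr):
    # Fall back to Adam when the config does not name an optimizer class.
    optimizer_cls = getattr(optim, config.get("optimizer", "Adam"))
    optimizer = optimizer_cls(
        [
            {
                "params": model.base_network.parameters(),
                "lr": lr * config.get("lrAEmult", 1.0),
            },
            {"params": model.inner_model.parameters()},
        ],
        lr=lr,
    )
    # The scheduler stays optional; it is only built when an annealing rate is given.
    if config.get("lr_annealingRate") is None:
        scheduler = None
    else:
        scheduler = torch.optim.lr_scheduler.StepLR(
            optimizer,
            step_size=config.get("lr_step_size", 500),
            gamma=config["lr_annealingRate"],
        )
    return optimizer, scheduler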
