refactoring branch such that ML model related stuff is out of the openpmd-streaming-continual-learning.py and put into a separate directory #182
base: dtensor
Conversation
```diff
@@ -5,5 +5,5 @@ module load gcc/12.2.0 cuda/12.1 openmpi/4.1.5-cuda121-gdr ucx/1.14.0-gdr \
 # openpmd/0.15.2-cuda121-blosc2-py3122
 # for (re-)instaling openpmd-api
 export openPMD_USE_MPI=ON
-source /home/kelling/checkout/insitumlNp2Torch26Env/bin/activate
+source /home/pandit52/venvs/Ism/bin/activate
```
No edit wars: do not commit local config changes (at least not into the PR).
```python
from inSituML.ks_models import INNModel

class ModelFinal(nn.Module):
```
Having a more generic concept of the model to be trained is great. Note, though, that this class remains a somewhat specific instance: it has an encoder, a decoder, and an inner model. Please rename it to something more descriptive, to reflect this.
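A minimal sketch of what such a rename could look like; the name `EncoderDecoderINN` and the constructor signature are illustrative assumptions, not part of the PR:

```python
import torch.nn as nn

# Illustrative rename only: the class name and signature are assumptions.
class EncoderDecoderINN(nn.Module):
    """An autoencoder wrapped around an invertible inner model."""

    def __init__(self, encoder, decoder, inner_model):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder
        self.inner_model = inner_model

    def forward(self, x):
        z = self.encoder(x)       # encode input into latent space
        y = self.inner_model(z)   # invertible mapping in latent space
        return self.decoder(y)    # decode back to the target space
```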
```python
from inSituML.encoder_decoder import Encoder
from inSituML.encoder_decoder import Conv3DDecoder
```
The Encoder and Decoder classes should be declared in model_config (not defined there in full, just imported from the model library).
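For illustration, model_config could expose the classes like this (a sketch; the attribute names are assumptions):

```python
# model_config.py -- sketch only; the attribute names are assumptions.
from inSituML.encoder_decoder import Encoder, Conv3DDecoder

# Declare (not define) the classes here; the factory reads these
# instead of hard-coding its own imports.
encoder_class = Encoder
decoder_class = Conv3DDecoder
```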
```python
ndim_tot=config["ndim_tot"],
ndim_x=config["ndim_x"],
ndim_y=config["ndim_y"],
ndim_z=config["ndim_z"],
loss_fit=fit,
loss_latent=MMD_multiscale,
loss_backward=MMD_multiscale,
lambd_predict=config["lambd_predict"],
lambd_latent=config["lambd_latent"],
lambd_rev=config["lambd_rev"],
zeros_noise_scale=config["zeros_noise_scale"],
y_noise_scale=config["y_noise_scale"],
hidden_size=config["hidden_size"],
activation=config["activation"],
num_coupling_layers=config["num_coupling_layers"],
```
These params should come from a dictionary in model_config. Most of them already do; maybe that dictionary could be renamed to something more descriptive than "config", e.g. "inner_model_config". The details can then also be left out here, by just passing **model_config.inner_model_config to the inner model ctor.
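A sketch of the suggested call site, assuming the dictionary in model_config is renamed to `inner_model_config`:

```python
# Sketch, assuming model_config exposes a dict named inner_model_config;
# the loss functions stay explicit because they are objects, not plain
# config values.
inner_model = INNModel(
    loss_fit=fit,
    loss_latent=MMD_multiscale,
    loss_backward=MMD_multiscale,
    **model_config.inner_model_config,  # ndim_*, lambd_*, noise scales, ...
)
```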
```python
optimizer = optim.Adam(
    [
        {
            "params": model.base_network.parameters(),
            "lr": lr * config["lrAEmult"],
        },
        {"params": model.inner_model.parameters()},
    ],  # model.parameters()
    lr=lr,
    betas=config["betas"],
    eps=config["eps"],
    weight_decay=config["weight_decay"],
)
if ("lr_annealingRate" not in config) or config[
    "lr_annealingRate"
] is None:
    scheduler = None
else:
    scheduler = torch.optim.lr_scheduler.StepLR(
        optimizer, step_size=500, gamma=config["lr_annealingRate"]
    )

return optimizer, scheduler, model
```
This is going beyond what should be in a model_factory module, because these are parameters of the training; it is OK for the load_objects functions, though. Rename the file.
The optimizer and LR scheduler should also be configurable in model_config, but with defaults available.
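One possible shape for this, as a sketch; the model_config attribute names and the default values below are assumptions, not the PR's actual API:

```python
import torch.optim as optim

# Sketch only: the model_config attribute names and defaults are assumptions.
def load_optimizer(model, model_config, lr):
    # Fall back to Adam with conventional defaults when nothing is configured.
    opt_cls = getattr(model_config, "optimizer_class", optim.Adam)
    opt_kwargs = getattr(
        model_config, "optimizer_kwargs",
        {"betas": (0.9, 0.999), "eps": 1e-8, "weight_decay": 0.0},
    )
    optimizer = opt_cls(model.parameters(), lr=lr, **opt_kwargs)

    # No scheduler unless one is configured.
    sched_cls = getattr(model_config, "scheduler_class", None)
    if sched_cls is None:
        scheduler = None
    else:
        sched_kwargs = getattr(
            model_config, "scheduler_kwargs",
            {"step_size": 500, "gamma": 0.95},
        )
        scheduler = sched_cls(optimizer, **sched_kwargs)
    return optimizer, scheduler
```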