install_enroot_pyxis.sh install fails intermittently on HyperPod #678

Open
nghtm opened this issue May 15, 2025 · 2 comments

Comments

@nghtm
Collaborator

nghtm commented May 15, 2025

curl: (6) Could not resolve host: github.com

Traceback (most recent call last):
  File "/tmp/air-voiceforce-hyperpod-lifecycle-2/src/lifecycle_script.py", line 211, in <module>
    main(args)
  File "/tmp/air-voiceforce-hyperpod-lifecycle-2/src/lifecycle_script.py", line 184, in main
    ExecuteBashScript("./utils/install_enroot_pyxis.sh").run(node_type)
  File "/tmp/air-voiceforce-hyperpod-lifecycle-2/src/lifecycle_script.py", line 31, in run
    result.check_returncode()
  File "/usr/lib/python3.10/subprocess.py", line 457, in check_returncode
    raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['sudo', 'bash', './utils/install_enroot_pyxis.sh', <SlurmNodeType.COMPUTE_NODE: 'compute'>]' returned non-zero exit status 6.
@nghtm
Collaborator Author

nghtm commented May 15, 2025

install_enroot_pyxis.sh.txt

I created this script to add exponential backoff to the curl commands. (Uploaded as .txt because GitHub does not allow .sh attachments.)
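
For readers without the attachment, here is a minimal sketch of what exponential backoff around a curl call can look like. It is not the contents of the attached script: the function name `retry_with_backoff`, the `MAX_ATTEMPTS` and `INITIAL_DELAY` variables, their defaults, and the URL in the usage comment are all hypothetical. The wrapper passes the command's own exit code through on final failure, so a DNS failure would still surface as curl's exit code 6, as in the traceback above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not taken from the attached script):
# run a command, and on failure retry it with a delay that doubles each time.

retry_with_backoff() {
  local max_attempts="${MAX_ATTEMPTS:-5}"  # assumed default: 5 tries in total
  local delay="${INITIAL_DELAY:-2}"        # assumed default: 2 seconds before the first retry
  local attempt=1
  local rc=0

  while true; do
    if "$@"; then
      return 0
    else
      rc=$?  # exit code of the command that just failed
    fi

    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after ${attempt} attempts (exit ${rc}): $*" >&2
      return "$rc"
    fi

    echo "Attempt ${attempt} failed (exit ${rc}); retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$(( delay * 2 ))      # double the wait before the next retry
    attempt=$(( attempt + 1 ))
  done
}

# Usage (placeholder URL and output path, for illustration only):
#   retry_with_backoff curl -fsSL -o /tmp/archive.tar.gz "https://example.com/archive.tar.gz"
```

With the assumed defaults, the waits between attempts are 2, 4, 8 and 16 seconds. curl's built-in `--retry` option is an alternative, though as far as I know it does not treat a name-resolution failure as a transient error unless `--retry-all-errors` is also passed, which is one reason to wrap the call in the shell instead.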

@amanshanbhag
Collaborator

Seems similar to #675, where nodes intermittently can't reach the internet during LCS execution. Weird that all the previous packages installed successfully (including apt and pip).

nghtm mentioned this issue May 16, 2025