Document how to use a GCP service account in Airflow with SkyPilot #6291

Merged: 14 commits, Jul 23, 2025

Conversation

kevinmingtarja
Collaborator

@kevinmingtarja kevinmingtarja commented Jul 17, 2025

This PR adds an example of setting up a SkyPilot task in an Airflow pipeline to use a GCP service account stored as an Airflow connection.

Main changes:

  • Switched to @task.virtualenv so that the sky SDK calls run in their own virtual environment. This mainly works around a dependency conflict: Airflow is not compatible with sqlalchemy 2.0 (see apache/airflow#28723, "Fix all deprecations for SQLAlchemy 2.0")
  • Relatedly, removed task_failure_callback, because sky commands only work inside the venv and on_failure_callback functions cannot run inside a venv. I think this is fine: we set down=True when calling launch, so the SkyPilot cluster automatically spins down once the task finishes, regardless of success or failure
  • Fetch the GCP service account JSON key from an Airflow connection and mount it on the SkyPilot cluster
  • Added data_preprocessing_gcp_sa.yaml (also replicated in mock-train-workflow#4, "add example yaml for using dynamically mounted GCP service accounts")
  • Added a new screenshot, and updated all screenshots based on the latest Airflow UI, for consistency
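
For readers skimming the list above, here is a minimal sketch of how these pieces could fit together. It is not the code merged in this PR. The connection ID `skypilot_gcp_task`, the remote path, the `keyfile_dict` extras field names, and the use of `GOOGLE_APPLICATION_CREDENTIALS` on the cluster are all assumptions for illustration, and the sketch assumes a SkyPilot SDK version where `sky.launch` returns a request ID.

```python
import json


def extract_keyfile_json(extras: dict) -> str:
    """Return the service account key as a JSON string.

    Assumption: the Google Cloud connection stores the key inline in its
    extras under `keyfile_dict` (or the older prefixed field name), either
    as a dict or as a JSON string.
    """
    for field in ("keyfile_dict", "extra__google_cloud_platform__keyfile_dict"):
        value = extras.get(field)
        if value:
            return value if isinstance(value, str) else json.dumps(value)
    raise ValueError("connection extras contain no inline service account key")


def run_sky_task(task_yaml: str, keyfile_json: str, remote_key_path: str) -> None:
    """Body intended to be wrapped with Airflow's @task.virtualenv.

    The function runs in a separate interpreter, so it cannot rely on
    module-level imports or helpers; everything is imported inside.
    """
    import json
    import os
    import tempfile

    import sky  # only installed inside the task's virtualenv

    # Fail early on a malformed key, then stage it in a private temp file.
    key = json.loads(keyfile_json)
    fd, local_key_path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(key, f)

    task = sky.Task.from_yaml(task_yaml)
    # Mount the key on the cluster and point Google client libraries at it
    # (hypothetical wiring; the PR's YAML may do this differently).
    task.update_file_mounts({remote_key_path: local_key_path})
    task.update_envs({"GOOGLE_APPLICATION_CREDENTIALS": remote_key_path})

    # down=True tears the cluster down when the task finishes, whether it
    # succeeds or fails, so no failure callback is needed for cleanup.
    request_id = sky.launch(task, down=True)
    sky.stream_and_get(request_id)


# Wiring inside a DAG file (sketch only; import paths vary by Airflow version):
#   from airflow.decorators import task
#   from airflow.hooks.base import BaseHook
#
#   conn = BaseHook.get_connection("skypilot_gcp_task")  # hypothetical ID
#   keyfile_json = extract_keyfile_json(conn.extra_dejson)
#   sky_task = task.virtualenv(
#       requirements=["skypilot[gcp]"], system_site_packages=False
#   )(run_sky_task)
#   sky_task("data_preprocessing_gcp_sa.yaml", keyfile_json,
#            "/tmp/gcp-service-account.json")  # hypothetical remote path
```

The key is read in the Airflow process and passed into the virtualenv task as a plain string, since the task body cannot reach Airflow hooks or module-level helpers once it runs in its own interpreter.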

Manually tested using a remote SkyPilot API server running on GKE, with Airflow running locally.

Tested (run the relevant ones):

  • Code formatting: install pre-commit (auto-check on commit) or bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: /smoke-test (CI) or pytest tests/test_smoke.py (local)
  • Relevant individual tests: /smoke-test -k test_name (CI) or pytest tests/test_smoke.py::test_name (local)
  • Backward compatibility: /quicktest-core (CI) or pytest tests/smoke_tests/test_backward_compat.py (local)

@kevinmingtarja kevinmingtarja changed the title document how to use a GCP service account in Airflow with SkyPilot Document how to use a GCP service account in Airflow with SkyPilot Jul 17, 2025
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Thanks @kevinmingtarja!

@Michaelvll Michaelvll requested a review from SeungjinYang July 21, 2025 21:57
@kevinmingtarja kevinmingtarja merged commit 96d5fc1 into master Jul 23, 2025
15 checks passed
@kevinmingtarja kevinmingtarja deleted the airflow-gcp-service-account-example branch July 23, 2025 18:53