Commit 91e6f82

Merge pull request #6 from zenml-io/feature/OSSK-569-accelerated-template
Accelerated template
2 parents e379d3a + a38ec25 commit 91e6f82

22 files changed: 388 additions, 130 deletions

.github/actions/llm_finetuning_template_test/action.yml

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ runs:
     - name: Run pytests
       shell: bash
       run: |
-        pytest ./local_checkout/tests
+        pytest -s ./local_checkout/tests

     - name: Clean-up
       shell: bash

copier.yaml

Lines changed: 4 additions & 0 deletions
@@ -62,6 +62,10 @@ steps_of_finetuning:
     type: int
     help: The number of steps of finetuning job.
     default: 300
+use_fast_tokenizer:
+    type: bool
+    help: Whether to use the fast tokenization or not, make sure your base model supports that
+    default: false
 cuda_version:
     type: str
     help: The available cuda version. (Only relevant when using a remote orchestrator)

template/README.md

Lines changed: 36 additions & 20 deletions
@@ -34,6 +34,11 @@ pip install -r requirements.txt

 ### 👷 Combined feature engineering and finetuning pipeline

+> [!WARNING]
+> All steps of this pipeline have a `clean_gpu_memory(force=True)` at the beginning. This is used to ensure that the memory is properly cleared after previous steps.
+>
+> This functionality might affect other GPU processes running on the same environment, so if you don't want to clean the GPU memory between the steps, you can delete those utility calls from all steps.
+
 The easiest way to get started with just a single command is to run the finetuning pipeline with the `orchestrator_finetune.yaml` configuration file, which will do data preparation, model finetuning, evaluation with [Rouge](https://huggingface.co/spaces/evaluate-metric/rouge) and promotion:

 ```shell
@@ -50,6 +55,17 @@ When running the pipeline like this, the trained model will be stored in the Zen
 <br/>
 </div>

+### ⚡ Accelerate your finetuning
+
+Do you want to benefit from multi-GPU-training with Distributed Data Parallelism (DDP)? Then you can use other configuration files prepared for this purpose.
+For example, `orchestrator_finetune.yaml` can run a finetuning of the [`{{ model_repository }}`](https://huggingface.co/{{ model_repository }}) powered by [Hugging Face Accelerate](https://huggingface.co/docs/accelerate/en/index) on all GPUs available in the environment. To do so, just call:
+
+```shell
+python run.py --config orchestrator_finetune.yaml --accelerate
+```
+
+Under the hood, the finetuning step will spin up the accelerated job using the step code, which will run on all available GPUs.
+
 ## ☁️ Running with a step operator in the stack

 To finetune an LLM on remote infrastructure, you can either use a remote orchestrator or a remote step operator. Follow these steps to set up a complete remote stack:
@@ -80,26 +96,26 @@ The project loosely follows [the recommended ZenML project structure](https://do

 ```
 .
-├── configs # pipeline configuration files
-│   ├── orchestrator_finetune.yaml # default local or remote orchestrator
-│   └── remote_finetune.yaml # default step operator configuration
+├── configs # pipeline configuration files
+│   ├── orchestrator_finetune.yaml # default local or remote orchestrator configuration
+│   └── remote_finetune.yaml # default step operator configuration
 ├── materializers
-│   └── directory_materializer.py # custom materializer to push whole directories to the artifact store and back
-├── pipelines # `zenml.pipeline` implementations
-│   └── train.py # Finetuning and evaluation pipeline
-├── steps # logically grouped `zenml.steps` implementations
-│   ├── evaluate_model.py # evaluate base and finetuned models using Rouge metrics
-│   ├── finetune.py # finetune the base model
-│   ├── prepare_datasets.py # load and tokenize dataset
-│   └── promote.py # promote good models to target environment
-├── utils # utility functions
-│   ├── callbacks.py # custom callbacks
-│   ├── cuda.py # helpers for CUDA
-│   ├── loaders.py # loaders for models and data
-│   ├── logging.py # logging helpers
-│   └── tokenizer.py # load and tokenize
+│   └── directory_materializer.py # custom materializer to push whole directories to the artifact store and back
+├── pipelines # `zenml.pipeline` implementations
+│   └── train.py # Finetuning and evaluation pipeline
+├── steps # logically grouped `zenml.steps` implementations
+│   ├── evaluate_model.py # evaluate base and finetuned models using Rouge metrics
+│   ├── finetune.py # finetune the base model
+│   ├── log_metadata.py # helper step to ensure that model metadata is always logged
+│   ├── prepare_datasets.py # load and tokenize dataset
+│   └── promote.py # promote good models to target environment
+├── utils # utility functions
+│   ├── callbacks.py # custom callbacks
+│   ├── loaders.py # loaders for models and data
+│   ├── logging.py # logging helpers
+│   └── tokenizer.py # load and tokenize
 ├── .dockerignore
-├── README.md # this file
-├── requirements.txt # extra Python dependencies
-└── run.py # CLI tool to run pipelines on ZenML Stack
+├── README.md # this file
+├── requirements.txt # extra Python dependencies
+└── run.py # CLI tool to run pipelines on ZenML Stack
 ```

template/configs/orchestrator_finetune.yaml

Lines changed: 5 additions & 2 deletions
@@ -14,14 +14,18 @@ settings:
     parent_image: pytorch/pytorch:2.2.2-{{ cuda_version }}-cudnn8-runtime
     requirements: requirements.txt
     python_package_installer: uv
+    python_package_installer_args:
+      system: null
+    apt_packages:
+      - git
     environment:
       PJRT_DEVICE: CUDA
       USE_TORCH_XLA: "false"
       MKL_SERVICE_FORCE_INTEL: "1"

 parameters:
   base_model_id: {{ model_repository }}
-  use_fast: False
+  use_fast: {{ use_fast_tokenizer }}
   load_in_4bit: True
   system_prompt: |
     {{ system_prompt.split("\n") | join("\n ") }}
@@ -32,7 +36,6 @@ steps:
       dataset_name: {{ dataset_name }}

   finetune:
-    enable_step_logs: False
     parameters:
       max_steps: {{ steps_of_finetuning }}
       eval_steps: {{ steps_of_finetuning // 10 }}
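
Review note: the two Jinja expressions in this config can be read as plain Python. The sketch below is a rough equivalent, not the template engine itself. The helper names, the sample prompt, and the four-space indent width are assumptions (the whitespace inside `join(...)` is collapsed on this page); the value `300` is the `steps_of_finetuning` default from `copier.yaml`.

```python
def render_eval_steps(steps_of_finetuning: int) -> int:
    # Python equivalent of `{{ steps_of_finetuning // 10 }}` (integer division).
    return steps_of_finetuning // 10


def indent_prompt(system_prompt: str, indent: str = "    ") -> str:
    # Python equivalent of `{{ system_prompt.split("\n") | join("\n" + indent) }}`.
    # Every line after the first is prefixed with `indent` so that a multi-line
    # prompt stays inside the YAML block scalar. The indent width is assumed.
    return ("\n" + indent).join(system_prompt.split("\n"))


eval_steps = render_eval_steps(300)
rendered = indent_prompt("You are a helpful assistant.\nAnswer briefly.")

print(eval_steps)
print(rendered)
```

One thing worth checking in review: because `//` is integer division, any `steps_of_finetuning` below 10 renders `eval_steps: 0`.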

template/configs/remote_finetune.yaml

Lines changed: 17 additions & 2 deletions
@@ -14,14 +14,18 @@ settings:
     parent_image: pytorch/pytorch:2.2.2-{{ cuda_version }}-cudnn8-runtime
     requirements: requirements.txt
     python_package_installer: uv
+    python_package_installer_args:
+      system: null
+    apt_packages:
+      - git
     environment:
       PJRT_DEVICE: CUDA
       USE_TORCH_XLA: "false"
       MKL_SERVICE_FORCE_INTEL: "1"

 parameters:
   base_model_id: {{ model_repository }}
-  use_fast: False
+  use_fast: {{ use_fast_tokenizer }}
   load_in_4bit: True
   system_prompt: |
     {{ system_prompt.split("\n") | join("\n ") }}
@@ -32,17 +36,28 @@ steps:
       dataset_name: {{ dataset_name }}

   finetune:
-    enable_step_logs: False
+    retry:
+      max_retries: 3
+      delay: 10
+      backoff: 2
     step_operator: {{ step_operator }}
     parameters:
       max_steps: {{ steps_of_finetuning }}
       eval_steps: {{ steps_of_finetuning // 10 }}
       bf16: {{ bf16 }}

   evaluate_finetuned:
+    retry:
+      max_retries: 3
+      delay: 10
+      backoff: 2
     step_operator: {{ step_operator }}

   evaluate_base:
+    retry:
+      max_retries: 3
+      delay: 10
+      backoff: 2
     step_operator: {{ step_operator }}

   promote:
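
Review note: to get a feel for what the new `retry` blocks cost in wall-clock time, here is a small sketch of the wait schedule. It assumes that `delay` is the initial wait in seconds and that the wait is multiplied by `backoff` after each failed attempt; that reading of the fields is an assumption, and the helper name is hypothetical.

```python
def retry_delays(max_retries: int, delay: float, backoff: float) -> list[float]:
    """Return the assumed wait before each retry under exponential backoff."""
    return [delay * backoff**attempt for attempt in range(max_retries)]


# Values taken from the `retry` blocks added in this config.
delays = retry_delays(max_retries=3, delay=10, backoff=2)
total_wait = sum(delays)

print(delays)
print(total_wait)
```

Under that assumption, a step that fails every time would wait 10, 20 and 40 seconds between attempts, about 70 seconds in total on top of the run time of each attempt.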

template/pipelines/train.py

Lines changed: 35 additions & 12 deletions
@@ -1,7 +1,7 @@
 # {% include 'template/license_header' %}


-from steps import evaluate_model, finetune, prepare_data, promote
+from steps import evaluate_model, finetune, prepare_data, promote, log_metadata_from_step_artifact
 from zenml import pipeline


@@ -13,7 +13,7 @@ def {{ product_name.replace("-","_") }}_full_finetune(
     load_in_8bit: bool = False,
     load_in_4bit: bool = False,
 ):
-    """Pipeline for finetuning an LLM with peft.
+    """Pipeline for finetuning an LLM with PEFT.

     It will run the following steps:

@@ -22,36 +22,59 @@ def {{ product_name.replace("-","_") }}_full_finetune(
     - evaluate_model: evaluate the base and finetuned model
     - promote: promote the model to the target stage, if evaluation was successful
     """
+    if not load_in_8bit and not load_in_4bit:
+        raise ValueError(
+            "At least one of `load_in_8bit` and `load_in_4bit` must be True."
+        )
+    if load_in_4bit and load_in_8bit:
+        raise ValueError("Only one of `load_in_8bit` and `load_in_4bit` can be True.")
+
     datasets_dir = prepare_data(
-        base_model_id=base_model_id,
+        base_model_id=base_model_id,
         system_prompt=system_prompt,
         use_fast=use_fast,
     )
-    ft_model_dir = finetune(
+
+    evaluate_model(
         base_model_id,
+        system_prompt,
         datasets_dir,
+        None,
         use_fast=use_fast,
-        load_in_4bit=load_in_4bit,
         load_in_8bit=load_in_8bit,
+        load_in_4bit=load_in_4bit,
+        id="evaluate_base",
     )
-    evaluate_model(
+    log_metadata_from_step_artifact(
+        "evaluate_base",
+        "base_model_rouge_metrics",
+        after=["evaluate_base"],
+        id="log_metadata_evaluation_base"
+    )
+
+    ft_model_dir = finetune(
         base_model_id,
-        system_prompt,
         datasets_dir,
-        ft_model_dir,
         use_fast=use_fast,
         load_in_8bit=load_in_8bit,
         load_in_4bit=load_in_4bit,
-        id="evaluate_finetuned",
     )
+
     evaluate_model(
         base_model_id,
         system_prompt,
         datasets_dir,
-        None,
+        ft_model_dir,
         use_fast=use_fast,
         load_in_8bit=load_in_8bit,
         load_in_4bit=load_in_4bit,
-        id="evaluate_base",
+        id="evaluate_finetuned",
     )
-    promote(after=["evaluate_finetuned", "evaluate_base"])
+    log_metadata_from_step_artifact(
+        "evaluate_finetuned",
+        "finetuned_model_rouge_metrics",
+        after=["evaluate_finetuned"],
+        id="log_metadata_evaluation_finetuned"
+    )
+
+    promote(after=["log_metadata_evaluation_finetuned", "log_metadata_evaluation_base"])
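
Review note: the two new guards at the top of the pipeline amount to an "exactly one of" check on the quantization flags. The standalone sketch below reproduces that logic outside ZenML so its behaviour can be tried in isolation; the function name and the returned label are hypothetical and not part of the commit.

```python
def validate_quantization(load_in_8bit: bool = False, load_in_4bit: bool = False) -> str:
    """Mirror the pipeline guards: exactly one quantization flag must be set."""
    if not load_in_8bit and not load_in_4bit:
        raise ValueError(
            "At least one of `load_in_8bit` and `load_in_4bit` must be True."
        )
    if load_in_4bit and load_in_8bit:
        raise ValueError("Only one of `load_in_8bit` and `load_in_4bit` can be True.")
    # Illustrative return value only; the pipeline itself returns nothing here.
    return "8bit" if load_in_8bit else "4bit"


results = {}
for flags in [(False, False), (True, False), (False, True), (True, True)]:
    try:
        results[flags] = validate_quantization(*flags)
    except ValueError as err:
        results[flags] = f"error: {err}"

for flags, outcome in results.items():
    print(flags, "->", outcome)
```

Since both pipeline parameters default to `False`, calling the pipeline with its defaults would now raise; the shipped configs avoid this by setting `load_in_4bit: True`.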
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
+# {% include 'template/license_header' %}
+
+from steps import (
+    evaluate_model,
+    finetune,
+    prepare_data,
+    promote,
+    log_metadata_from_step_artifact,
+)
+from zenml import pipeline
+from zenml.integrations.huggingface.steps import run_with_accelerate
+
+
+@pipeline
+def {{ product_name.replace("-","_") }}_full_finetune(
+    system_prompt: str,
+    base_model_id: str,
+    use_fast: bool = True,
+    load_in_8bit: bool = False,
+    load_in_4bit: bool = False,
+):
+    """Pipeline for finetuning an LLM with PEFT powered by Accelerate.
+
+    It will run the following steps:
+
+    - prepare_data: prepare the datasets and tokenize them
+    - finetune: finetune the model
+    - evaluate_model: evaluate the base and finetuned model
+    - promote: promote the model to the target stage, if evaluation was successful
+    """
+    if not load_in_8bit and not load_in_4bit:
+        raise ValueError(
+            "At least one of `load_in_8bit` and `load_in_4bit` must be True."
+        )
+    if load_in_4bit and load_in_8bit:
+        raise ValueError("Only one of `load_in_8bit` and `load_in_4bit` can be True.")
+
+    datasets_dir = prepare_data(
+        base_model_id=base_model_id,
+        system_prompt=system_prompt,
+        use_fast=use_fast,
+    )
+
+    evaluate_model(
+        base_model_id,
+        system_prompt,
+        datasets_dir,
+        None,
+        use_fast=use_fast,
+        load_in_8bit=load_in_8bit,
+        load_in_4bit=load_in_4bit,
+        id="evaluate_base",
+    )
+    log_metadata_from_step_artifact(
+        "evaluate_base",
+        "base_model_rouge_metrics",
+        after=["evaluate_base"],
+        id="log_metadata_evaluation_base"
+    )
+
+    ft_model_dir = run_with_accelerate(finetune)(
+        base_model_id=base_model_id,
+        dataset_dir=datasets_dir,
+        use_fast=use_fast,
+        load_in_8bit=load_in_8bit,
+        load_in_4bit=load_in_4bit,
+    )
+
+    evaluate_model(
+        base_model_id,
+        system_prompt,
+        datasets_dir,
+        ft_model_dir,
+        use_fast=use_fast,
+        load_in_8bit=load_in_8bit,
+        load_in_4bit=load_in_4bit,
+        id="evaluate_finetuned",
+    )
+    log_metadata_from_step_artifact(
+        "evaluate_finetuned",
+        "finetuned_model_rouge_metrics",
+        after=["evaluate_finetuned"],
+        id="log_metadata_evaluation_finetuned"
+    )
+
+    promote(after=["log_metadata_evaluation_finetuned", "log_metadata_evaluation_base"])
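
Review note: the reordered pipeline is easier to reason about as a dependency graph. The sketch below encodes the edges implied by the data flow and the `after=[...]` arguments and asks Python's standard `graphlib` for one valid execution order. The ids `prepare_data` and `finetune` are assumed to be the default step ids, and the edge list is inferred from the diff rather than taken from ZenML itself.

```python
from graphlib import TopologicalSorter

# Map each step id to the set of step ids it has to wait for.
dependencies = {
    "prepare_data": set(),
    "evaluate_base": {"prepare_data"},
    "log_metadata_evaluation_base": {"evaluate_base"},
    "finetune": {"prepare_data"},
    "evaluate_finetuned": {"prepare_data", "finetune"},
    "log_metadata_evaluation_finetuned": {"evaluate_finetuned"},
    "promote": {
        "log_metadata_evaluation_finetuned",
        "log_metadata_evaluation_base",
    },
}

# static_order() yields one valid order; several may exist.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

In this graph nothing orders `evaluate_base` relative to `finetune`, so whether they run one after the other or concurrently would depend on the orchestrator. `promote` now waits for both metadata-logging steps instead of waiting for the evaluation steps directly.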
