Is finetune.py incompatible with older GPUs? #156

Hi, while running on a Colab A100 instance I noticed that finetune.py used only about 5 GB of VRAM for starcoderbase-1b, so I tried it on my local machine, which has a GTX 1070 card (8 GB VRAM, Pascal architecture). It failed there, and I got a similar error when trying again with either starcoderbase-1b or starcoderbase-3b on a larger but still older GPU (NVIDIA Quadro P6000, 24 GB VRAM). Here is the error:

RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != float

At first I thought this might be due to a difference in architecture (Pascal vs. Ampere), but that is contradicted by the fact that I have a Kaggle Code notebook which can fine-tune StarCoder on two P100 GPUs, which are also Pascal.

Is there some other explanation for this?

Longer stacktrace attached.
[dump.txt](https://github.com/bigcode-project/starcoder/files/14651127/dump.txt)
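
The error message says that a matrix multiply received one half-precision operand (`c10::Half`, i.e. float16) and one float32 operand (`float`). One way to narrow this down might be to tally the dtypes of the loaded model's parameters and see whether they are mixed. The sketch below is only an illustration: `dtype_summary` is a hypothetical helper, not part of finetune.py, and the parameter names are made up. It assumes each parameter object exposes a `.dtype` attribute, as the pairs yielded by PyTorch's `model.named_parameters()` do; stand-in objects are used here so that it runs without PyTorch or a GPU.

```python
from collections import Counter
from types import SimpleNamespace


def dtype_summary(named_params):
    """Count how many parameters use each dtype.

    `named_params` is any iterable of (name, obj) pairs where obj has a
    `.dtype` attribute, for example the result of model.named_parameters().
    """
    return Counter(str(param.dtype) for _, param in named_params)


# Made-up stand-ins for real parameters (assumption: names and dtypes
# are illustrative only and do not come from the actual model).
fake_params = [
    ("layer0.weight", SimpleNamespace(dtype="torch.float16")),
    ("layer0.bias", SimpleNamespace(dtype="torch.float16")),
    ("adapter0.weight", SimpleNamespace(dtype="torch.float32")),
]

summary = dtype_summary(fake_params)
for dtype, count in summary.items():
    print(f"{dtype}: {count} parameter(s)")

# More than one dtype means the model mixes precisions.
is_mixed = len(summary) > 1
print("mixed precision:", is_mixed)
```

With a real model, the call would be `dtype_summary(model.named_parameters())`. A mixed result would not by itself explain why the same script works on other GPUs, but it would show which side of the multiply is in which precision.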

