Is finetune.py incompatible with older GPUs? #156

@umm-maybe

Hi, while running on a Colab A100 instance I noticed that finetune.py used only about 5 GB of VRAM for starcoderbase-1b, so I tried it on my local machine, which has a GTX 1070 card (8 GB VRAM, Pascal architecture). This didn't work, and I got a similar error when I tried again with either starcoderbase-1b or starcoderbase-3b on a larger but still older GPU (NVIDIA Quadro P6000, 24 GB VRAM). Here is the error:

RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != float
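
For context, this class of error means a matrix multiplication received one float16 operand and one float32 operand. The sketch below reproduces that situation outside of finetune.py. The tensors and function names are hypothetical stand-ins, not code from the script, and the exact wording of the message can differ between PyTorch versions and between CPU and GPU.

```python
try:
    import torch
except ImportError:  # lets the sketch be imported on a machine without PyTorch
    torch = None


def reproduce_dtype_mismatch():
    """Multiply an fp16 matrix by an fp32 matrix and return the error text."""
    if torch is None:
        return None
    a = torch.ones(2, 3, dtype=torch.float16)  # stand-in for half-precision activations
    b = torch.ones(3, 4, dtype=torch.float32)  # stand-in for full-precision weights
    try:
        torch.mm(a, b)  # torch.mm does not promote mixed dtypes
    except RuntimeError as err:
        return str(err)
    return ""  # no error was raised


def matmul_same_dtype():
    """Cast both operands to float32 first, so the product is well defined."""
    if torch is None:
        return None
    a = torch.ones(2, 3, dtype=torch.float16)
    b = torch.ones(3, 4, dtype=torch.float32)
    return tuple(torch.mm(a.float(), b).shape)


if __name__ == "__main__":
    print(reproduce_dtype_mismatch())
    print(matmul_same_dtype())
```

If the mismatch in the real run comes from the same kind of mixed-precision pairing, printing the dtypes of the model parameters and of the layer inputs just before the failing call (see the attached stack trace) should show which side is float16 and which is float32.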

At first I thought this might be due to a difference in architecture (Pascal vs. Ampere), but that is contradicted by the fact that I have a Kaggle Code notebook that can fine-tune StarCoder on two P100 GPUs, which are also Pascal.
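
One way to compare the machines more precisely than "Pascal vs. Ampere" is by CUDA compute capability. The sketch below is only a diagnostic aid under stated assumptions: the table holds NVIDIA's published compute capabilities for the cards mentioned here, the helper names are hypothetical, and nothing in it establishes the cause of the error. It does show that the P100 (6.0) and the GTX 1070 and Quadro P6000 (6.1) are not identical targets, and that none of the Pascal cards has native bfloat16 support, which starts with Ampere (8.0).

```python
try:
    import torch
except ImportError:  # the lookup table still works without PyTorch
    torch = None

# Compute capabilities as published by NVIDIA for the cards in this thread.
KNOWN_CAPABILITIES = {
    "GTX 1070": (6, 1),
    "Quadro P6000": (6, 1),
    "Tesla P100": (6, 0),
    "A100": (8, 0),
}


def has_native_bf16(capability):
    """Native bfloat16 support starts at compute capability 8.0 (Ampere)."""
    return tuple(capability) >= (8, 0)


def describe_local_gpu():
    """Return (name, capability, bf16 support) for GPU 0, or None if unavailable."""
    if torch is None or not torch.cuda.is_available():
        return None
    capability = torch.cuda.get_device_capability(0)
    return torch.cuda.get_device_name(0), capability, has_native_bf16(capability)


if __name__ == "__main__":
    for name, capability in KNOWN_CAPABILITIES.items():
        print(name, capability, "bf16" if has_native_bf16(capability) else "no bf16")
    print(describe_local_gpu())
```

Running `describe_local_gpu()` on both the failing machine and the working Kaggle notebook, and comparing the PyTorch versions and precision flags used in each run, would help separate a hardware difference from a configuration difference.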

Is there some other explanation for this?

Longer stacktrace attached.
dump.txt
