
Better inference based on starcoder2-3b model #154

@HeroSong666

Description


I am new to StarCoder2.

When I run the following demo:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "./starcoder2-3b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", torch_dtype=torch.bfloat16)

inputs = tokenizer.encode("def is_prime(n):", return_tensors="pt").to("cuda")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))

It returns:

def is_prime():
    """
    This function checks if a number is prime or not.
    """

It doesn't finish, so I set max_length=120 in the generate call, roughly like this:
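outputs = model.generate(inputs, max_length=120)  # sketch; max_length caps prompt plus generated tokens together

Then it returns: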

def is_prime():
    """
    This function checks if a number is prime or not.
    """
    num = int(input("Enter a number: "))
    if num > 1:
        for i in range(2, num):
            if (num % i) == 0:
                print(num, "is not a prime number")
                break
        else:
            print(num, "is a prime number")
    else:
        print(num, "is not a prime number")


is_prime()
<file_sep>/README.md
# Python-

The part

is_prime()
<file_sep>/README.md
# Python-

is redundant. My current workaround is:

generated_code = tokenizer.decode(outputs[0])
# Drop everything after the first <file_sep> marker
if "<file_sep>" in generated_code:
    generated_code = generated_code.split("<file_sep>")[0]
print(generated_code)

But I don't think this is a good idea. I want the model to return the result in one go, without generating the redundant part. How can I do that? Could you give me some advice?
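One possible approach (a sketch, not from the original report; it assumes <file_sep> is registered as a single token in the StarCoder2 tokenizer, which the output above suggests) is to pass the <file_sep> token id as an additional end-of-sequence id, so generation stops before it ever produces the redundant part:

# Sketch: reuses `tokenizer`, `model`, and `inputs` from the snippet above.
# Assumes <file_sep> maps to a single token id in the StarCoder2 vocabulary.
file_sep_id = tokenizer.convert_tokens_to_ids("<file_sep>")

outputs = model.generate(
    inputs,
    max_new_tokens=120,
    # Treat <file_sep> as an extra end-of-sequence marker so generation
    # stops as soon as it is produced.
    eos_token_id=[tokenizer.eos_token_id, file_sep_id],
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With skip_special_tokens=True, decode also strips the marker itself (and <|endoftext|>) from the text, so no post-processing split is needed.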
