bug: tokenizer field cannot be extracted from BenchmarkConfig #15

Open
dmartinol opened this issue May 21, 2025 · 0 comments

Comments

@dmartinol

The tokenizer option is extracted from task_config.metadata, but task_config is of type BenchmarkConfig, which does not have a metadata field:
https://github.com/meta-llama/llama-stack/blob/2890243107c74a7a88b82595db49e9540d0a0561/llama_stack/apis/eval/eval.py#L50

As a result, the following code never extracts the tokenizer:

        if (
            hasattr(task_config, "metadata")
            and task_config.metadata
            and "tokenizer" in task_config.metadata
        ):
            tokenizer_value = task_config.metadata.get("tokenizer")
            if isinstance(tokenizer_value, str) and tokenizer_value:
                logger.debug(f"Using custom tokenizer from metadata: {tokenizer_value}")
                model_args.append(ModelArg(name="tokenizer", value=tokenizer_value))

We should instead collect it from the stored_benchmark field, which does have the metadata attribute (see the sketch below).
BTW: for the same reason, the initial extraction logic in _collect_env_vars, _extract_git_source and _extract_pvc_name is probably also broken.
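
A minimal sketch of that approach, assuming the surrounding method has access to the stored benchmark object (referred to here as stored_benchmark, with metadata being a plain dict); the names and scope are illustrative, not the actual implementation:

        # Sketch: read the tokenizer from the stored benchmark's metadata
        # instead of task_config.metadata (BenchmarkConfig has no metadata field).
        metadata = getattr(stored_benchmark, "metadata", None) or {}
        tokenizer_value = metadata.get("tokenizer")
        if isinstance(tokenizer_value, str) and tokenizer_value:
            logger.debug(f"Using custom tokenizer from benchmark metadata: {tokenizer_value}")
            model_args.append(ModelArg(name="tokenizer", value=tokenizer_value))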
