bug: tokenizer field cannot be extracted from BenchmarkConfig #15

Open
dmartinol opened this issue May 21, 2025 · 0 comments

Comments

@dmartinol

The tokenizer option is extracted from task_config.metadata, but task_config is of type BenchmarkConfig, which does not have a metadata field:
https://github.com/meta-llama/llama-stack/blob/2890243107c74a7a88b82595db49e9540d0a0561/llama_stack/apis/eval/eval.py#L50

As a result, the following code never extracts the tokenizer:

        if (
            hasattr(task_config, "metadata")
            and task_config.metadata
            and "tokenizer" in task_config.metadata
        ):
            tokenizer_value = task_config.metadata.get("tokenizer")
            if isinstance(tokenizer_value, str) and tokenizer_value:
                logger.debug(f"Using custom tokenizer from metadata: {tokenizer_value}")
                model_args.append(ModelArg(name="tokenizer", value=tokenizer_value))

We should instead collect it from the stored_benchmark field, which does have the metadata attribute (see the sketch below).
BTW: for the same reason, the initial extraction logic in _collect_env_vars, _extract_git_source and _extract_pvc_name is probably also broken.
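
A minimal sketch of that approach, assuming the surrounding method has access to the stored benchmark object (referred to here as stored_benchmark, with metadata being a plain dict); the names and scope are illustrative, not the actual implementation:

        # Sketch: read the tokenizer from the stored benchmark's metadata
        # instead of task_config.metadata (BenchmarkConfig has no metadata field).
        metadata = getattr(stored_benchmark, "metadata", None) or {}
        tokenizer_value = metadata.get("tokenizer")
        if isinstance(tokenizer_value, str) and tokenizer_value:
            logger.debug(f"Using custom tokenizer from benchmark metadata: {tokenizer_value}")
            model_args.append(ModelArg(name="tokenizer", value=tokenizer_value))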
