Model client streaming from the selector of SelectorGroupChat #6145

Open
yingjiewei opened this issue Mar 29, 2025 · 7 comments · May be fixed by #6286
Labels
help wanted Extra attention is needed

Comments

@yingjiewei commented Mar 29, 2025

Feature Request

We can enable streaming for SelectorGroupChat's built-in selector by introducing an option in SelectorGroupChat, e.g., model_client_stream, so the model client is used in streaming mode: it will call create_stream rather than create.

As the next step, we can enable streaming of orchestration events through run_stream so the streaming output is visible to consumers of run_stream. Issue here: #6161
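
For illustration, the option could mirror AssistantAgent's existing model_client_stream parameter. A minimal sketch of the proposed API, not the current released one (agent_a, agent_b, and model_client stand in for real instances; the parameter on SelectorGroupChat is the proposal, not an existing flag):

# Sketch of the proposed option; model_client_stream on SelectorGroupChat
# does not exist yet in the released API.
team = SelectorGroupChat(
    [agent_a, agent_b],
    model_client=model_client,
    model_client_stream=True,  # selector would call create_stream instead of create
)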

--- Below is the original bug report ---

What happened?

Describe the bug
Some LLM models only support stream=True. The assistant agent supports this well by setting model_client_stream=True, but OpenAIChatCompletionClient does not allow passing stream=True to it. As a result, LLM models that only support stream=True are effectively unusable.
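
For context (not part of the original report): the client does expose a streaming path. ChatCompletionClient.create_stream yields string chunks and finishes with a CreateResult, and this is what AssistantAgent uses when model_client_stream=True. A minimal sketch of calling it directly, assuming autogen-core's model client interface:

from autogen_core.models import UserMessage

async def stream_directly(model_client) -> None:
    # create_stream yields str chunks, then a final CreateResult.
    async for chunk in model_client.create_stream(
        [UserMessage(content="Hello", source="user")]
    ):
        if isinstance(chunk, str):
            print(chunk, end="", flush=True)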

To Reproduce

    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'code': 'invalid_parameter_error', 'param': None, 'message': 'This model only support stream mode, please enable the stream parameter to access the model. ', 'type': 'invalid_request_error'}, 'id': 


Which packages was the bug in?

Python AgentChat (autogen-agentchat>=0.4.0)

AutoGen library version.

Python dev (main branch)


@SongChiYoung (Contributor)

This is an interesting issue.

Would it be possible to provide a more specific reproduction example, including which model/client configuration triggers this error?

I'd love to help investigate further.

@yingjiewei (Author) commented Mar 30, 2025

@SongChiYoung, many thanks for helping.

Here is my code:

import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.ui import Console
from autogen_core.memory import ListMemory, MemoryContent, MemoryMimeType
from autogen_ext.models.openai import OpenAIChatCompletionClient

def get_model_client() -> OpenAIChatCompletionClient:
    return OpenAIChatCompletionClient(
        model="qwq-plus",
        api_key="",
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
        model_info={
            "json_output": False,
            "vision": False,
            "function_calling": True,
            "family": "unknown",
        },
    )

model_client = get_model_client()


async def create_assistant():
    # Example memory initialization; every async operation here is awaited.
    user_memory = ListMemory()
    await user_memory.add(
        MemoryContent(
            content="The user system is linux",
            mime_type=MemoryMimeType.TEXT,
        )
    )

    return AssistantAgent(
        name="assistant_agent",
        model_client=model_client,
        memory=[user_memory],  # Pass the memory instance directly.
        system_message='''...''',  # Original system message omitted.
        model_client_stream=True,
    )

async def create_assistant2():
    # Same setup as create_assistant, for a second agent.
    user_memory = ListMemory()
    await user_memory.add(
        MemoryContent(
            content="The user system is linux",
            mime_type=MemoryMimeType.TEXT,
        )
    )

    return AssistantAgent(
        name="assistant_agent2",
        model_client=model_client,
        memory=[user_memory],
        system_message='''...''',  # Original system message omitted.
        model_client_stream=True,
    )


async def main():
    # Initialize both agents.
    assistant = await create_assistant()
    assistant2 = await create_assistant2()

    # Build the team. The selector uses model_client in non-streaming mode,
    # which triggers the error below.
    team = SelectorGroupChat(
        [assistant, assistant2],
        termination_condition=TextMentionTermination("APPROVE"),
        model_client=model_client,
        selector_prompt="""...""",  # Original prompt template omitted.
    )

    # Run the task and render the stream to the console.
    stream = team.run_stream(task=input("Hello! How can I help?\n"))
    await Console(stream, output_stats=True)

if __name__ == "__main__":
    asyncio.run(main())

You can save this in a file, say test_flow.py, then run python test_flow.py and type something at the prompt.

From Alibaba: their most powerful model, qwq-plus, only supports streaming: https://www.alibabacloud.com/help/zh/model-studio/user-guide/qwq?spm=a3c0i.23458820.2359477120.1.35166e9b4ZlN2I

The error is:

    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'code': 'invalid_parameter_error', 'param': None, 'message': 'This model only support stream mode, please enable the stream parameter to access the model. ', 'type': 'invalid_request_error'}, 'id': 'chatcmpl-fb835624-c232-90e1-81f6-0b3b19c1e633', 'request_id': 'fb835624-c232-90e1-81f6-0b3b19c1e633'}

@SongChiYoung (Contributor)

Just sharing a thought from an architectural perspective:

Rather than adding ad-hoc fixes or modifying SelectorGroupChat specifically for this, I'm considering whether it would make more sense — once PR #6063 is merged — to handle this kind of use case by configuring stream=True per model as needed.

The reason I hesitate to embed special logic for QwQ or similar models into GroupChat (or any group structure) is that future use cases or new types of GroupChats may again require exposing stream or other model-specific flags — which could become hard to maintain.

Curious to hear thoughts from maintainers on this!

@yingjiewei (Author)

By the way, the current restriction around stream=True is that OpenAIChatCompletionClient does not allow setting stream=True.

@SongChiYoung (Contributor) commented Mar 30, 2025

Thanks for the clarification!

Based on the error message "This model only support stream mode, please enable the stream parameter", this seems to be a server-side constraint from QwQ, not a limitation in the OpenAIChatCompletionClient itself.

That’s why I think one possible path forward is to handle this at the model level (e.g. via model config / registry), so the correct stream=True flag is automatically applied based on model requirements.

This is aligned with the goal of PR #6063: to encapsulate model-specific behaviors and avoid leaking model flags like stream into higher-level constructs such as GroupChat.
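
As a rough illustration of that idea (purely hypothetical: no such registry exists in autogen today, and this is not the actual #6063 design):

# Hypothetical sketch only: a per-model table of "stream-only" models
# that a client could consult to route create() through create_stream().
STREAM_ONLY_MODELS = {"qwq-plus"}

def requires_stream(model: str) -> bool:
    # True if the provider rejects non-streaming calls for this model.
    return model in STREAM_ONLY_MODELS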

That said, this is just my opinion — I believe the maintainers' judgment here is the most important.

@ekzhu (Collaborator) commented Mar 31, 2025

Is the constraint on stream only a temporary one for QwQ, or is it permanent?

I think we can enable streaming for SelectorGroupChat's built-in selector by introducing an option in SelectorGroupChat, e.g., model_client_stream, so the model client is used in streaming mode: it will call create_stream rather than create.

As the next step, we can enable streaming of orchestration events through run_stream so the streaming output is visible to consumers of run_stream. Issue here: #5127

@yingjiewei are you interested in submitting a PR for this? Just focus on adding the model_client_stream option to SelectorGroupChat. See the contributing guide for how to set up a local development environment: https://github.com/microsoft/autogen/blob/main/python/README.md

@ekzhu changed the title from "stream = True is required for the SelectorGroupChat" to "Model client streaming from the selector of SelectorGroupChat" on Apr 2, 2025
@ekzhu added the help wanted label and removed the needs-triage label on Apr 2, 2025
@SongChiYoung (Contributor)


Fixed via #6145 — SelectorGroupChat now supports streaming mode for select_speaker.
Please take a look when you have time.
