-
I got an error while evaluating the LoCoMo benchmark through MemOS. However, I can't reproduce it even when I keep the same input and fix the seed of Ollama.
For now, I suggest that someone be assigned to:
To reproduce the error, run the following commands in …
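As an aside on the reproducibility point: a minimal sketch of pinning Ollama's sampling so repeated runs are comparable, assuming a local server at the default `http://localhost:11434` endpoint. The helper names `build_seeded_request` and `ollama_generate` are hypothetical, not part of MemOS or Ollama:

```python
import json
import urllib.request


def build_seeded_request(model: str, prompt: str, seed: int = 42) -> dict:
    """Build an Ollama /api/generate payload with a pinned seed and zero temperature."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Deterministic sampling options (see Ollama's API/Modelfile options).
        "options": {"seed": seed, "temperature": 0},
    }


def ollama_generate(model: str, prompt: str, seed: int = 42) -> str:
    """Send the request to a local Ollama server and return the response text."""
    payload = json.dumps(build_seeded_request(model, prompt, seed)).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Note that even with a fixed seed, determinism can still break across Ollama versions, hardware, or concurrent batching, which may be why the error is hard to reproduce.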
-
This issue is scheduled for the near future but will remain unaddressed for the time being. Below are some potential solutions for future reference:
-
I'll reassign this issue to you because we now have the MemoryItem base model (https://github.com/MemTensor/MemOS/blob/main/src/memos/memories/textual/item.py), which enables constrained decoding. Here's what you need to do:
-
While evaluating:

```python
from pydantic import BaseModel
from outlines import models, generate


class User(BaseModel):
    name: str
    last_name: str
    id: int


model = models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = generate.json(model, User)
result = generator("Create a user profile with the fields name, last_name and id")
print(result)  # ➜ User(name='John', last_name='Doe', id=11)
```

Since …
-
After careful consideration, I decided we will not adopt this approach. For instance, I attempted a vivid example:

```python
import uuid
from datetime import datetime
from typing import Literal

from outlines import generate, models
from pydantic import BaseModel, ConfigDict, Field, field_validator


class TextualMemoryMetadata(BaseModel):
    """Metadata for a memory item.

    This includes information such as the type of memory, when it occurred,
    its source, and other relevant details.
    """

    user_id: str | None = Field(
        default=None,
        description="The ID of the user associated with the memory. Useful for multi-user systems.",
    )
    session_id: str | None = Field(
        default=None,
        description="The ID of the session during which the memory was created. Useful for tracking context in conversations.",
    )
    status: Literal["active", "archived", "deleted"] | None = Field(
        default=None,
        description="The status of the memory, e.g., 'active', 'archived', 'deleted'.",
    )
    type: Literal["procedure", "fact", "event", "opinion"] | None = Field(default=None)
    memory_time: str | None = Field(
        default=None,
        description='The time the memory occurred or refers to. Must be in standard `YYYY-MM-DD` format. Relative expressions such as "yesterday" or "tomorrow" are not allowed.',
    )
    source: Literal["conversation", "retrieved", "web", "file"] | None = Field(
        default=None, description="The origin of the memory"
    )
    confidence: float | None = Field(
        default=None,
        description="A numeric score (float between 0 and 100) indicating how certain you are about the accuracy or reliability of the memory.",
    )
    entities: list[str] | None = Field(
        default=None,
        description='A list of key entities mentioned in the memory, e.g., people, places, organizations, e.g., `["Alice", "Paris", "OpenAI"]`.',
    )
    tags: list[str] | None = Field(
        default=None,
        description='A list of keywords or thematic labels associated with the memory for categorization or retrieval, e.g., `["travel", "health", "project-x"]`.',
    )
    visibility: Literal["private", "public", "session"] | None = Field(
        default=None, description="e.g., 'private', 'public', 'session'"
    )
    updated_at: str | None = Field(
        default_factory=lambda: datetime.now().isoformat(),
        description="The timestamp of the last modification to the memory. Useful for tracking memory freshness or change history. Format: ISO 8601.",
    )

    model_config = ConfigDict(extra="forbid")

    @field_validator("memory_time")
    @classmethod
    def validate_memory_time(cls, v):
        try:
            if v:
                datetime.strptime(v, "%Y-%m-%d")
        except ValueError as e:
            raise ValueError("Invalid date format. Use YYYY-MM-DD.") from e
        return v

    @field_validator("confidence")
    @classmethod
    def validate_confidence(cls, v):
        if v is not None and (v < 0 or v > 100):
            raise ValueError("Confidence must be between 0 and 100.")
        return v


class TextualMemoryItem(BaseModel):
    """Represents a single memory item in the textual memory.

    This serves as a standardized format for memory items across different
    textual memory implementations.
    """

    id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    memory: str
    metadata: TextualMemoryMetadata = Field(default_factory=TextualMemoryMetadata)

    model_config = ConfigDict(extra="forbid")

    @field_validator("id")
    @classmethod
    def validate_id(cls, v):
        try:
            uuid.UUID(v)
        except ValueError as e:
            raise ValueError("Invalid UUID format") from e
        return v

    @classmethod
    def from_dict(cls, data: dict) -> "TextualMemoryItem":
        return cls(**data)

    def to_dict(self) -> dict:
        return self.model_dump(exclude_none=True)

    def __str__(self) -> str:
        """Pretty string representation of the memory item."""
        meta = self.metadata.model_dump(exclude_none=True)
        meta_str = ", ".join(f"{k}={v}" for k, v in meta.items())
        return f"<ID: {self.id} | Memory: {self.memory} | Metadata: {meta_str}>"


EXTRACTION_PROMPT_PART_1 = f"""You are a memory extractor. Your task is to extract memories from the given messages.
* You will receive a list of messages, each with a role (user or assistant) and content.
* Your job is to extract memories related to the user's long-term goals, interests, and emotional states.
* Each memory should be a dictionary with the following keys:
    - "memory": The content of the memory (string). Rephrase the content if necessary.
    - "metadata": A dictionary containing additional information about the memory.
* The metadata dictionary should include:
    - "type": The type of memory (string), e.g., "procedure", "fact", "event", "opinion", etc.
    - "memory_time": The time the memory occurred or refers to (string). Must be in standard `YYYY-MM-DD` format. Relative expressions such as "yesterday" or "tomorrow" are not allowed.
    - "source": The origin of the memory (string), e.g., `"conversation"`, `"retrieved"`, `"web"`, `"file"`.
    - "confidence": A numeric score (float between 0 and 100) indicating how certain you are about the accuracy or reliability of the memory.
    - "entities": A list of key entities (array of strings) mentioned in the memory, e.g., people, places, organizations, e.g., `["Alice", "Paris", "OpenAI"]`.
    - "tags": A list of keywords or thematic labels (array of strings) associated with the memory for categorization or retrieval, e.g., `["travel", "health", "project-x"]`.
    - "visibility": The accessibility scope of the memory (string), e.g., `"private"`, `"public"`, `"session"`, determining who or what contexts can access it.
    - "updated_at": The timestamp of the last modification to the memory (string). Useful for tracking memory freshness or change history. Format: ISO 8601 or natural language.
* Current date and time is {datetime.now().isoformat()}.
* Only return the list of memories in JSON format.
* Do not include any other text or explanation.

## Example
### Input
[
    {{"role": "user", "content": "I plan to visit Paris next week."}},
    {{"role": "assistant", "content": "Paris is a beautiful city with many attractions."}},
    {{"role": "user", "content": "I love the Eiffel Tower."}},
    {{"role": "assistant", "content": "The Eiffel Tower is a must-see landmark in Paris."}}
]
### Output
[
    {{
        "memory": "The user plans to visit Paris on 05-26-2025.",
        "metadata": {{
            "type": "event",
            "memory_time": "2025-05-26",
            "source": "conversation",
            "confidence": 90.0,
            "entities": ["Paris"],
            "tags": ["travel", "plans"],
            "visibility": "private",
            "updated_at": "2025-05-19T00:00:00"
        }}
    }},
    {{
        "memory": "The user loves the Eiffel Tower.",
        "metadata": {{
            "type": "opinion",
            "memory_time": "2025-05-19",
            "source": "conversation",
            "confidence": 100.0,
            "entities": ["Eiffel Tower"],
            "tags": ["opinions", "landmarks"],
            "visibility": "session",
            "updated_at": "2025-05-19T00:00:00"
        }}
    }}
]
"""

EXTRACTION_PROMPT_PART_2 = """
## Query
### Input
{messages}
### Output
"""

messages = [
    {"role": "user", "content": "I love eating bananas."},
    {"role": "assistant", "content": "Bananas are a great source of potassium."},
]

model = models.transformers("Qwen/Qwen2.5-0.5B-Instruct")
generator = generate.json(model, TextualMemoryItem)
result = generator(EXTRACTION_PROMPT_PART_1 + EXTRACTION_PROMPT_PART_2.format(messages=messages))
print(result)
```

It took approximately thirty minutes (nearly all the time was related to …
-
Here's a best practice suggested by @J1awei-Yang to help others when facing issues with JSON data extraction.

**Best Practice: Reliable JSON Extraction from LLM Responses with Retry**

**The Problem**

When calling a language model (LLM) to generate structured JSON output, it's common for the model to occasionally return malformed JSON, such as extra text, missing commas, or unclosed braces. This causes `json.loads()` to raise a `JSONDecodeError`:

```python
response = llm.generate(prompt)
json.loads(response)  # 💥 Raises JSONDecodeError if the response is not valid JSON
```

**The Solution: Wrap Both the LLM Call and JSON Parsing in a Retry**

Use a retry mechanism (e.g., Tenacity) to automatically retry the entire process, regenerating the LLM output and re-parsing it, whenever a `JSONDecodeError` is raised:

```python
import json
import logging

from tenacity import retry, retry_if_exception_type, stop_after_attempt

logger = logging.getLogger(__name__)


@retry(
    stop=stop_after_attempt(3),
    retry=retry_if_exception_type(json.JSONDecodeError),
    before_sleep=lambda state: logger.warning(
        f"Failed to parse JSON from LLM output (attempt {state.attempt_number}): {state.outcome.exception()}"
    ),
)
def extract_structured_output(llm, prompt: str) -> list[dict]:
    """Call the LLM and parse its JSON output, retrying on failures."""
    response = llm.generate([{"role": "user", "content": prompt}])
    return json.loads(response)
```

**Usage Example**

```python
user_prompt = build_extraction_prompt(messages)
structured_data = extract_structured_output(llm, user_prompt)
```

**Bonus Tips**
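One tip in this spirit (my own hedged sketch, not from the thread): models frequently wrap otherwise-valid JSON in markdown fences or surrounding prose, so a small pre-cleaning step before parsing can cut down on retries. The helper name `parse_llm_json` is hypothetical:

```python
import json
import re


def parse_llm_json(text: str):
    """Parse JSON from raw LLM output, stripping common wrappers first."""
    # Prefer the contents of a ```json ... ``` (or plain ```) fence, if present.
    fenced = re.search(r"```(?:json)?\s*(.*?)\s*```", text, flags=re.DOTALL)
    if fenced:
        text = fenced.group(1)
    else:
        # Otherwise, fall back to the outermost JSON-looking span.
        starts = [i for i in (text.find("["), text.find("{")) if i != -1]
        end = max(text.rfind("]"), text.rfind("}")) + 1
        if starts and end > min(starts):
            text = text[min(starts):end]
    return json.loads(text)  # Still raises JSONDecodeError on malformed output
```

Because truly malformed output still raises `JSONDecodeError`, this composes cleanly with the Tenacity retry: call `parse_llm_json` instead of `json.loads` inside the retried function.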