
Inferring Events from Time Series using Language Models

What’s happening behind the time series?

Events Reasoning

Setup 🔧

For our environment configuration and required additional packages, please refer to "environment.yml".

Our pipeline for GRPO training is built upon the Open-r1 framework, for which we express our sincere appreciation. For details regarding the experimental environment setup, please refer to Open-r1.

🚀🚀 Post-training Significantly Improves Events-Times Reasoning

Warm-up Stage: Knowledge Distillation

For models that initially lack any understanding or reasoning ability in the domain (e.g., sports), such as Qwen2.5 1.5B, knowledge distillation is a necessary prerequisite before applying reinforcement learning. The input data is structured as follows:

[
  {
    "key": "ORL_CLE_Apr 20, 2024.csv",
    "instruction": "You are an assistant for NBA basketball task. We will .....",
    "input": "Below is provided win probabilities  .... ",
    "process": "<think>\nAlright, let's try to figure out .... </think>\n**d**",
    "label": "d",
    "pred": "d"
  },
]

(The 'process' field is the solution distilled from DeepSeek-R1-Distill-Qwen-32B, and 'instruction' is our question.)
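To make the format concrete, here is a minimal sketch (not the repo's actual training code; the file path and message layout are assumptions) of turning these records into supervised warm-up pairs:

# Sketch: convert distilled records into (prompt, completion) pairs for supervised fine-tuning.
# The path "warm-up/distill_data.json" is a placeholder; point it at your own data file.
import json

with open("warm-up/distill_data.json") as f:
    records = json.load(f)

sft_samples = []
for r in records:
    prompt = r["instruction"] + "\n" + r["input"]    # question + time-series context
    completion = r["process"]                        # distilled <think>...</think> reasoning + answer
    sft_samples.append({"prompt": prompt, "completion": completion})

print(len(sft_samples), "warm-up samples prepared")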

To start the warm-up stage (Slurm), use the following command:

sbatch warm-up/run.sh

Reinforcement Learning: GRPO

After the warm-up, we apply GRPO (Group Relative Policy Optimization), which enables further gains through self-improvement. The input data consists of only Q&A pairs:

[
  {
    "instruction": "You are an assistant for NBA basketball task. We will …..",
    "input": "Below is provided win probabilities …. ",
    "output": "c",
    "key": "DEN_OKC_Jan 31, 2024.csv7095"
  },
]
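As a rough illustration of how such Q&A pairs can drive GRPO (a hedged sketch, not the reward code used in the training scripts), an accuracy reward can simply check whether the answer letter produced after the model's reasoning matches the 'output' field:

# Sketch of an accuracy reward for GRPO rollouts; the function name and the
# answer-extraction regex are illustrative assumptions, not this repo's implementation.
import re

def accuracy_reward(completion: str, label: str) -> float:
    """Return 1.0 if the final answer letter matches the ground-truth label, else 0.0."""
    answer_part = completion.split("</think>")[-1]          # text after the reasoning block
    match = re.search(r"\b([a-d])\b", answer_part.lower())  # first option letter found
    return 1.0 if match and match.group(1) == label.lower() else 0.0

print(accuracy_reward("<think>... reasoning ...</think>\n**c**", "c"))  # -> 1.0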

To start GRPO training (Slurm), use the following commands:

sbatch scripts-sports-nba/nba-GRPO.sh
sbatch scripts-sports-nfl/nfl-GRPO.sh

Results

Our two-stage post-training enables a 1.5B model without domain-specific reasoning ability to outperform its distillation source (DeepSeek-R1-Distill-Qwen-32B) and, on some tasks, even approach OpenAI's o1.

Post-training Results

(Reward curves over training steps can be found in Figure 6 of the paper.)

We believe most reasoning tasks can be improved through this two-stage post-training: 1. fill knowledge gaps (warm-up distillation), and 2. self-improve (GRPO).

Run Inference 💁🏼

Setup Prompts

Run ./script/build.sh and change the conditions in the shell script for different experiments. The generated prompts will be saved in "./prompt/".

Inferring Events

./batch.sh (or ./slow.sh) is the shell script used to submit the tasks in ./script/. Our main configuration files for LLMs and experiments are located in "./tsllm/config/".

How to set up LLMs

The setup for the different LLMs can be found in ./tsllm/models/.
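The exact interface is defined by the code in that directory; as a hedged sketch of the general pattern (class and method names below are placeholders, not the repo's actual API), each model wrapper only needs to expose a uniform generation call that the experiment scripts can use:

# Illustrative wrapper interface; class and method names are assumptions.
# The real implementations live in ./tsllm/models/.
from abc import ABC, abstractmethod

class BaseLLM(ABC):
    def __init__(self, model_name: str, temperature: float = 0.0):
        self.model_name = model_name
        self.temperature = temperature

    @abstractmethod
    def generate(self, prompt: str, max_tokens: int = 1024) -> str:
        """Return the model's completion for a single prompt."""
        raise NotImplementedError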

Dataset📖

We curate a dataset and propose our benchmark "GAMETime: Generating And Modeling Events from TIME series". This dataset contains a real-valued time series of 1.7 million timestamps along with corresponding event sequences.

If you're interested in our Events-Times data 🛎️

Game Events

Please email the author directly (e.g., [email protected]). To simplify the process, just send the following:

  "Hello, GAMETime."
  [Your Name]

We will provide you with a download link, aiming to complete the process within 1 minute.

Alternatively, you can use the script in the resource file to download the HTML and extract the data yourself.

Citation

If you find our work helpful, please kindly cite as follows:

@article{tan2025inferring,
    title={Inferring Events from Time Series using Language Models},
    author={Tan, Mingtian and Merrill, Mike A and Gottesman, Zack and Althoff, Tim and Evans, David and Hartvigsen, Tom},
    journal={arXiv preprint arXiv:2503.14190},
    year={2025}
}