Commit 5878857

Merge pull request #72 from DavyMorgan/main
add example [online judge programming]
2 parents c779ac9 + de72955 commit 5878857

7 files changed (+887, -0 lines)

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
# Online Judge Programming Example

This example demonstrates how OpenEvolve can solve programming problems and pass all test cases on the [Kattis online judge](https://open.kattis.com/), starting from scratch.

## Problem Description

We take the [Alphabet](https://open.kattis.com/problems/alphabet) problem from [Kattis](https://open.kattis.com/), stated as follows:
```markdown
A string of lowercase letters is called **alphabetical** if some of the letters can be deleted so that the only letters that remain are the letters from 'a' to 'z' in order. Given a string s, determine the minimum number of letters to add anywhere in the string to make it alphabetical.

Input:
Each input will consist of a single test case. Note that your program may be run multiple times on different inputs. The only line of input contains a string s (1 ≤ |s| ≤ 50) which contains only lowercase letters.

Output:
Output a single integer, which is the smallest number of letters needed to add to `s` to make it alphabetical.

Sample Input 1:
xyzabcdefghijklmnopqrstuvw

Sample Output 1:
3

Sample Input 2:
aiemckgobjfndlhp

Sample Output 2:
20
```

## Getting Started

First, download your personal configuration file from [Kattis](https://open.kattis.com/download/kattisrc) (you must be logged in), copy its username and token into `example.kattisrc`, and rename the file to `.kattisrc`.

Then, to run this example:

```bash
cd examples/online_judge_programming
python ../../openevolve-run.py initial_program.py evaluator.py --config config.yaml
```
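
Before launching a run, it can help to confirm that the credentials file parses and that the placeholders were replaced. The following is a minimal sketch and not part of this example: `missing_kattisrc_fields` is a hypothetical helper, and it assumes the file follows the `[user]` / `[kattis]` layout shown in `example.kattisrc`.

```python
import configparser

# Hypothetical helper for illustration; not part of OpenEvolve or the Kattis tooling.
REQUIRED = {
    "user": ["username", "token"],
    "kattis": ["hostname", "loginurl", "submissionurl", "submissionsurl"],
}
PLACEHOLDERS = {"YOUR_USERNAME", "YOUR_TOKEN", ""}


def missing_kattisrc_fields(text):
    """Return 'section.key' names that are absent or still hold a placeholder."""
    # interpolation=None keeps '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)  # configparser accepts "key: value" lines
    missing = []
    for section, keys in REQUIRED.items():
        for key in keys:
            value = parser.get(section, key, fallback="").strip()
            if value in PLACEHOLDERS:
                missing.append(f"{section}.{key}")
    return missing


# Made-up file content: the username is still a placeholder and one URL is absent.
sample = """[user]
username: YOUR_USERNAME
token: abc123

[kattis]
hostname: open.kattis.com
loginurl: https://open.kattis.com/login
submissionurl: https://open.kattis.com/submit
"""
print(missing_kattisrc_fields(sample))  # ['user.username', 'kattis.submissionsurl']
```

An empty list means every expected field is present and no placeholder remains.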

## Algorithm Evolution

### Initial Algorithm (dummy output)

The initial implementation is a placeholder that reads the input and always prints 0.

```python
import sys
for line in sys.stdin:
    s = line.strip()

ans = 0
print(ans)
```

### Evolved Algorithm (Dynamic Programming)

After running OpenEvolve for just 4 iterations, it discovered a dynamic programming algorithm that passes all test cases on Kattis:

```python
import sys

for line in sys.stdin:
    s = line.strip()

n = len(s)
dp = [1] * n

for i in range(1, n):
    for j in range(i):
        if s[i] > s[j]:
            dp[i] = max(dp[i], dp[j] + 1)

longest_alphabetical_subsequence_length = max(dp)
ans = 26 - longest_alphabetical_subsequence_length
print(ans)
```
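
The evolved program works because the letters that are kept must form a strictly increasing subsequence of `s`, and every letter of the alphabet missing from that subsequence has to be added. To check the recurrence locally without submitting to Kattis, the same logic can be wrapped in a function and run against the two samples from the problem statement. This is a sketch for local testing only; the name `min_letters_to_add` is introduced here for illustration and does not appear in the example.

```python
def min_letters_to_add(s):
    """Return 26 minus the length of the longest strictly increasing subsequence of s."""
    n = len(s)
    if n == 0:
        return 26  # nothing to keep, so all 26 letters must be added
    # dp[i] = length of the longest strictly increasing subsequence ending at s[i]
    dp = [1] * n
    for i in range(1, n):
        for j in range(i):
            if s[i] > s[j]:
                dp[i] = max(dp[i], dp[j] + 1)
    return 26 - max(dp)


# The two samples from the problem statement.
print(min_letters_to_add("xyzabcdefghijklmnopqrstuvw"))  # 3
print(min_letters_to_add("aiemckgobjfndlhp"))  # 20
```

With |s| ≤ 50, the quadratic loop performs at most a few thousand comparisons, so no faster method is needed for this problem.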

## Next Steps

Try modifying the `config.yaml` file to:
- Change the programming problem in the system prompt
- Change the LLM model configuration
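
Note that changing the problem in the system prompt alone is not enough: the evaluator in this example submits to the `alphabet` problem through the `-p` flag. One way to keep the two in sync is to build the submit command from a parameter. The sketch below is an assumption about how this could be arranged. `build_submit_cmd` is a hypothetical helper, and it uses only the flags that already appear in the evaluator (`-p`, `-l`, `-f`).

```python
def build_submit_cmd(program_path, problem_id="alphabet", language="Python 3"):
    """Build the submit command with a configurable Kattis problem ID.

    Hypothetical helper: it mirrors the hard-coded command in the evaluator
    and changes nothing else.
    """
    return ["python", "submit.py", program_path, "-p", problem_id, "-l", language, "-f"]


# Default: the same command the evaluator runs today.
print(build_submit_cmd("initial_program.py"))

# Targeting another problem; "hello" is used here only as an example ID.
print(build_submit_cmd("initial_program.py", problem_id="hello"))
```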
Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
# Configuration for the online judge programming example
max_iterations: 20
checkpoint_interval: 1
log_level: "INFO"

# LLM configuration
llm:
  primary_model: "gemini-2.0-flash"
  primary_model_weight: 0.6
  secondary_model: "gemini-2.5-flash-preview-05-20"
  secondary_model_weight: 0.4
  api_base: "https://generativelanguage.googleapis.com/v1beta/openai/"
  api_key: YOUR_API_KEY
  temperature: 0.7
  top_p: 0.95
  max_tokens: 4096

# Prompt configuration
prompt:
  system_message: |
    You are an expert programmer. Your task is to implement an algorithm in Python to pass all the test cases. The problem is as follows:

    A string of lowercase letters is called alphabetical if some of the letters can be deleted so that the only letters that remain are the letters from a to z in order. Given a string s, determine the minimum number of letters to add anywhere in the string to make it alphabetical.

    Input:
    Each input will consist of a single test case. Note that your program may be run multiple times on different inputs. The only line of input contains a string s (1 ≤ |s| ≤ 50) which contains only lowercase letters.
    Output:
    Output a single integer, which is the smallest number of letters needed to add to s to make it alphabetical.

    Sample Input 1:
    xyzabcdefghijklmnopqrstuvw
    Sample Output 1:
    3

    Sample Input 2:
    aiemckgobjfndlhp
    Sample Output 2:
    20

    Your program should always read/write to STDIN/STDOUT. For example, to handle integer input, use the following format:
    ```
    import sys
    for line in sys.stdin:
        data = int(line)
    ```
    Use print() for output. For example:
    ```
    print("Hello, World!")
    ```
  num_top_programs: 3
  use_template_stochasticity: true

# Database configuration
database:
  population_size: 50
  archive_size: 20
  num_islands: 3
  elite_selection_ratio: 0.2
  exploitation_ratio: 0.7

# Evaluator configuration
evaluator:
  timeout: 60
  cascade_evaluation: false
  cascade_thresholds: [1.0]
  parallel_evaluations: 4
  use_llm_feedback: false

# Evolution settings
diff_based_evolution: true
allow_full_rewrites: false
Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
"""
Evaluator for the online judge programming example
"""

import re
import subprocess
import time
import traceback


def run_with_timeout(program_path, timeout_seconds=60):
    """
    Submit a program to Kattis in a subprocess and parse the result summary.

    Args:
        program_path: Path to the program file to submit
        timeout_seconds: Timeout in seconds

    Returns:
        Tuple (score, done, correct, total) parsed from the submission output.
        Raises TimeoutError, RuntimeError, or ValueError on failure.
    """
    cmd = ["python", "submit.py", program_path, "-p", "alphabet", "-l", "Python 3", "-f"]

    try:
        # Run the command and grab its output using subprocess.Popen
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        stdout, stderr = proc.communicate(timeout=timeout_seconds)
        exit_code = proc.returncode
        if exit_code != 0:
            print(stderr)  # Print the error output if the command failed
            raise RuntimeError(f"Process exited with code {exit_code}")
    except subprocess.TimeoutExpired:
        # Kill the process if it times out, then reap it
        proc.kill()
        proc.communicate()
        raise TimeoutError(f"Process timed out after {timeout_seconds} seconds")

    # The four summary lines must appear in this order in the output
    pattern = (
        r"Score:\s*(\d+)\s*"
        r"Test cases done:\s*(\d+)\s*"
        r"Test cases correct:\s*(\d+)\s*"
        r"Test cases total:\s*(\d+)"
    )
    match = re.search(pattern, stdout)
    if not match:
        raise ValueError("Expected summary lines not found")

    score, done, correct, total = map(int, match.groups())
    return score, done, correct, total


def evaluate(program_path):
    """
    Evaluate the program by submitting it to OJ and fetching metrics based on how well it performs.

    Args:
        program_path: Path to the program file

    Returns:
        Dictionary of metrics
    """
    try:
        # Each program is submitted once; the judge's test results
        # serve as the metrics
        start_time = time.time()

        # Use subprocess to run with timeout
        score, done, correct, total = run_with_timeout(
            program_path, timeout_seconds=60  # Single timeout
        )

        end_time = time.time()
        eval_time = end_time - start_time

        # Combined score (fraction of test cases passed) - higher is better
        combined_score = correct / total if total > 0 else 0.0

        print(
            f"Evaluation: Score={score}, Done={done}, Correct={correct}, Total={total}, Combined={combined_score:.2f}"
        )

        return {
            "score": score,
            "done": done,
            "correct": correct,
            "total": total,
            "eval_time": eval_time,
            "combined_score": float(combined_score),
        }

    except Exception as e:
        print(f"Evaluation failed completely: {str(e)}")
        traceback.print_exc()
        return {
            "score": 0,
            "done": 0,
            "correct": 0,
            "total": 0,
            "eval_time": 0.0,
            "combined_score": 0.0,
        }
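
The summary parsing in `run_with_timeout` can be exercised offline, without a Kattis submission. The sketch below reuses the same pattern. The sample text is a made-up stand-in for the submit script's output, shaped only by what the pattern expects, and the real output may contain additional lines. `parse_summary` is a name introduced here for illustration.

```python
import re

# Same pattern as in run_with_timeout above.
SUMMARY_PATTERN = (
    r"Score:\s*(\d+)\s*"
    r"Test cases done:\s*(\d+)\s*"
    r"Test cases correct:\s*(\d+)\s*"
    r"Test cases total:\s*(\d+)"
)


def parse_summary(stdout):
    """Extract (score, done, correct, total) from the submission output."""
    match = re.search(SUMMARY_PATTERN, stdout)
    if not match:
        raise ValueError("Expected summary lines not found")
    return tuple(map(int, match.groups()))


# Hypothetical output, not captured from a real submission.
sample_stdout = (
    "Score: 0\n"
    "Test cases done: 5\n"
    "Test cases correct: 3\n"
    "Test cases total: 5\n"
)
score, done, correct, total = parse_summary(sample_stdout)
combined_score = correct / total if total > 0 else 0.0
print(score, done, correct, total, combined_score)  # 0 5 3 5 0.6
```

Because the pattern requires the four fields back to back and in this order, any change in the submit script's output format raises `ValueError`, which `evaluate` then reports as an all-zero result.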
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
# Please save this file as .kattisrc in your home directory.
# This file includes a secret token that allows you to log in.
# DO NOT SHARE IT WITH ANYONE ELSE.
# If someone gets access to this token, please revoke it by changing your KATTIS password.

[user]
username: YOUR_USERNAME
token: YOUR_TOKEN

[kattis]
hostname: open.kattis.com
loginurl: https://open.kattis.com/login
submissionurl: https://open.kattis.com/submit
submissionsurl: https://open.kattis.com/submissions
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
"""Online judge programming example for OpenEvolve"""

# EVOLVE-BLOCK-START
import sys

for line in sys.stdin:
    s = line.strip()

ans = 0
print(ans)

# EVOLVE-BLOCK-END
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
lxml
requests
