Add Length of Stay Dataset and Prediction Task #526

Open: wants to merge 2 commits into base: master
6 changes: 6 additions & 0 deletions pyhealth/fake_los_data.csv
@@ -0,0 +1,6 @@
patient_id,admission_date,discharge_date
1,2023-01-01,2023-01-05
2,2023-02-10,2023-02-15
3,2023-03-20,2023-03-22
4,2023-04-01,2023-04-10
5,2023-05-05,2023-05-07
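
For reference, the length-of-stay label implied by these rows can be derived directly from the two date columns. The following is a minimal sketch, assuming whole-day stays computed as discharge date minus admission date; `los_days` and `compute_los` are hypothetical helper names for illustration and are not part of PyHealth or of this PR.

```python
import csv
import io
from datetime import date

# Inline copy of the rows from fake_los_data.csv so the sketch is self-contained.
CSV_TEXT = """patient_id,admission_date,discharge_date
1,2023-01-01,2023-01-05
2,2023-02-10,2023-02-15
3,2023-03-20,2023-03-22
4,2023-04-01,2023-04-10
5,2023-05-05,2023-05-07
"""


def los_days(admission_date: str, discharge_date: str) -> int:
    """Return whole days between two ISO dates (hypothetical helper)."""
    delta = date.fromisoformat(discharge_date) - date.fromisoformat(admission_date)
    return delta.days


def compute_los(csv_text: str) -> dict:
    """Map each patient_id to its length of stay in days."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["patient_id"]: los_days(row["admission_date"], row["discharge_date"])
        for row in reader
    }


if __name__ == "__main__":
    for patient_id, days in compute_los(CSV_TEXT).items():
        print(f"patient {patient_id}: {days} days")
```

Whether the task should count whole days, fractional days, or hours is a design choice the PR does not state; the sketch picks whole days only because the CSV carries dates without times.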
25 changes: 25 additions & 0 deletions pyhealth/pyhealth/contrib/tasks/length_of_stay/fakedata.py
@@ -0,0 +1,25 @@
import csv
from datetime import datetime, timedelta


def create_fake_los_data(filename="fake_los_data.csv"):
    # Define headers expected by the LOS task
    headers = ['admission_id', 'patient_id', 'admission_time', 'discharge_time']

    # Prepare some example rows
    base_date = datetime(2024, 1, 1, 10, 0)
    rows = [
        ['1', '1001', base_date.strftime("%Y-%m-%d %H:%M:%S"), (base_date + timedelta(days=4, hours=2)).strftime("%Y-%m-%d %H:%M:%S")],
        ['2', '1002', (base_date + timedelta(days=1)).strftime("%Y-%m-%d %H:%M:%S"), (base_date + timedelta(days=3, hours=5)).strftime("%Y-%m-%d %H:%M:%S")],
        ['3', '1003', (base_date + timedelta(days=2)).strftime("%Y-%m-%d %H:%M:%S"), (base_date + timedelta(days=5)).strftime("%Y-%m-%d %H:%M:%S")],
    ]

    # Write to CSV
    with open(filename, mode='w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(headers)
        writer.writerows(rows)

    print(f"Created fake LOS data CSV file: {filename}")


if __name__ == "__main__":
    create_fake_los_data()
11 changes: 11 additions & 0 deletions pyhealth/pyhealth/contrib/tasks/length_of_stay/lengthofstay.yaml
@@ -0,0 +1,11 @@
version: "1.4"

tables:
  admissions:
    file_path: "fake_los_data.csv"
    patient_id: "patient_id"
    timestamp: "admission_date"
    attributes:
      - "patient_id"
      - "admission_date"
      - "discharge_date"
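
One thing worth checking in review: this config names `admission_date` and `discharge_date`, while `fakedata.py` writes `admission_time` and `discharge_time` to a file with the same default name, `fake_los_data.csv`. The sketch below shows one way to catch that kind of mismatch before loading. The column lists are copied from the files in this PR, and `missing_columns` is a hypothetical helper, not a PyHealth function.

```python
def missing_columns(csv_header, required):
    """Return the required column names that are absent from the CSV header."""
    return [name for name in required if name not in csv_header]


# Attributes listed in lengthofstay.yaml (copied by hand; no YAML parser needed here).
CONFIG_ATTRIBUTES = ["patient_id", "admission_date", "discharge_date"]

# Header of the checked-in pyhealth/fake_los_data.csv.
STATIC_CSV_HEADER = ["patient_id", "admission_date", "discharge_date"]

# Header written by create_fake_los_data() in fakedata.py.
GENERATED_CSV_HEADER = ["admission_id", "patient_id", "admission_time", "discharge_time"]

if __name__ == "__main__":
    print("static CSV missing:", missing_columns(STATIC_CSV_HEADER, CONFIG_ATTRIBUTES))
    print("generated CSV missing:", missing_columns(GENERATED_CSV_HEADER, CONFIG_ATTRIBUTES))
```

If the generated file is meant to feed this config, either the script's headers or the config's attribute names would presumably need to be aligned; which one is the intended schema is not stated in the PR.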
17 changes: 17 additions & 0 deletions pyhealth/pyhealth/contrib/tasks/length_of_stay/lengthofstay1.yaml
@@ -0,0 +1,17 @@
root: ./  # base path for your CSV files

tables:
  - fakedata.csv

columns:
  patient_id: patient_id
  admission_time: admission_time
  discharge_time: discharge_time
features:
  - lab_results
  - diagnoses
target:
  length_of_stay: length_of_stay_days


