
Commit ad656fe

Merge remote-tracking branch 'rael/wecc' into hydro_scenario

2 parents 7f37d05 + 8362e91

12 files changed: +710 −125 lines

docs/Performance.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
# Performance

Memory use and solve time are two important factors that we try to keep to a minimum in our models. There are several things you can do to improve performance.

## Solving methods

By far the biggest factor affecting performance is the solve method Gurobi uses. The fastest method is barrier solve without crossover (use `--recommended-fast`); however, this method often returns a suboptimal solution. The next fastest is barrier solve followed by crossover and simplex (use `--recommended`), which almost always works. In some cases barrier solve encounters numerical issues (see [`Numerical Issues.md`](./Numerical%20Issues.md)), in which case the slower simplex method must be used (`--solver-options-string method=1`).
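To make this concrete, the flags above ultimately control Gurobi's `Method` and `Crossover` parameters. Below is a minimal sketch (not Switch code) of setting them directly through `gurobipy`; the `model.lp` file is a hypothetical placeholder:

```python
import gurobipy as gp

model = gp.read("model.lp")  # hypothetical model file

model.Params.Method = 2     # 2 = barrier, the method behind --recommended
model.Params.Crossover = 0  # 0 = skip crossover, as with --recommended-fast
# model.Params.Method = 1   # 1 = dual simplex, the method=1 fallback

model.optimize()
```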
## Solver interface

Solver interfaces are how Pyomo communicates with Gurobi (or any other solver).

There are two solver interfaces you should know about: `gurobi` and `gurobi_direct`.

- When using `gurobi`, Pyomo writes the entire model to a temporary text file and then starts a *separate Gurobi process* that reads the file, solves the model, and writes the results to another temporary text file. Once Gurobi finishes writing the results, Pyomo reads the results file back into the Python program before running post_solve (e.g. generating csv files, creating graphs, etc.). Note that these temporary text files are stored in `/tmp`, but if you use `--recommended-debug`, Pyomo and Gurobi will instead use a `temp` folder in your model directory.

- `gurobi_direct` uses Gurobi's Python library to create and solve the model directly in Python, without intermediate text files.

In theory `gurobi_direct` should be faster and more efficient, but in practice we find that it is not. As such we recommend using `gurobi`, and all our defaults do so. If someone has the time, they could profile `gurobi_direct` to improve its performance, at which point we could make `gurobi_direct` the default (and enable `--save-warm-start` by default; see below).

The `gurobi` interface has the added advantage of separating Gurobi and Pyomo into separate processes. This means that while Gurobi is solving and Pyomo is idle, the operating system can automatically move Pyomo's memory to [virtual memory](https://serverfault.com/questions/48486/what-is-swap-memory), which frees up more memory for Gurobi.
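For reference, here is a self-contained sketch of selecting each interface through Pyomo. Both solver names are real Pyomo plugins; the tiny model is only there to make the example runnable:

```python
from pyomo.environ import ConcreteModel, Objective, SolverFactory, Var

m = ConcreteModel()
m.x = Var(bounds=(0, 10))
m.obj = Objective(expr=m.x)  # minimize x

# File-based interface: writes the model to a temporary file and spawns
# a separate Gurobi process (our recommended default).
opt = SolverFactory("gurobi")
results = opt.solve(m, keepfiles=True)  # keepfiles retains the temp files

# In-process interface: builds the model through gurobipy, no temp files.
opt_direct = SolverFactory("gurobi_direct")
results_direct = opt_direct.solve(m)
```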
## Warm starting

Warm starting is the act of using the solution from a previous, similar model to start the solver closer to your expected solution. In theory this can improve performance, but in practice there are several limitations. In this section, *previous solution* refers to the results of an already-solved model that you are using to warm start the solver, and *current solution* refers to the solution you are trying to find while using the warm start feature.

- To warm start a model, use `switch solve --warm-start <path_to_previous_solution>`.

- Warm starting only works if the previous solution does not break any constraints of the current model. This is usually only the case if a) the model has the exact same set of variables and b) the previous solution was "harder" (e.g. it had more constraints to satisfy).

- Warm starting always uses the slower simplex method. This means that unless you expect the previous solution and current solution to be very similar, it may be faster to solve without warm starting using the barrier method.

- If your previous solution didn't use crossover (e.g. you used `--recommended-fast`), warm starting will be even slower, since the solver must first run crossover before warm starting.

- Our implementation of warm starting only works if your previous solution has an `outputs/warm_start.pickle` file. This file is only generated when you use `--save-warm-start`. (A sketch of the underlying mechanism follows this list.)

- `--save-warm-start` and `--warm-start` both use an extension of the `gurobi_direct` solver interface, which is generally slower than the `gurobi` solver interface (see the section above).
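As a rough illustration of the mechanism (not Switch's actual implementation, and the pickle layout below is an assumption), warm starting an LP in Gurobi amounts to seeding each variable's start value before optimizing:

```python
import pickle

import gurobipy as gp

model = gp.read("model.lp")  # hypothetical model file

# Assumed layout: a pickled {variable name: value} mapping saved from the
# previous solution; Switch's real warm_start.pickle may differ.
with open("outputs/warm_start.pickle", "rb") as f:
    start = pickle.load(f)

for var in model.getVars():
    if var.VarName in start:
        var.PStart = start[var.VarName]  # primal start; only simplex uses it

model.optimize()  # Gurobi runs simplex to exploit the start
```

This also shows why warm starting implies simplex: Gurobi's LP start attributes (`PStart`/`DStart`) are only consumed by the simplex algorithms.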
## Tools for improving performance

- [Memory profiler](https://pypi.org/project/memory-profiler/) generates plots of memory use over time. Use `mprof run --interval 60 --multiprocess switch solve ...` and, once solving is done, run `mprof plot -o profile.png` to make the plot.

- [Fil Profiler](https://pypi.org/project/filprofiler/) is an excellent tool for seeing which parts of the code are using memory during peak memory usage.

- Use `switch_model.utilities.StepTimer` to measure how long certain code blocks take to run. See examples throughout the code, and the sketch below.
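A minimal sketch of the `StepTimer` pattern as it is used in the code base (`step_time()` returns the seconds elapsed since the timer was created or last queried; the sleeps below just stand in for real work):

```python
import time

from switch_model.utilities import StepTimer

timer = StepTimer()

time.sleep(1.5)  # stand-in for an expensive step, e.g. loading inputs
print(f"Loaded inputs in {timer.step_time():.1f} s")

time.sleep(0.5)  # stand-in for another step, e.g. post_solve
print(f"Ran post_solve in {timer.step_time():.1f} s")
```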

switch_model/balancing/load_zones.py

Lines changed: 44 additions & 2 deletions
@@ -259,10 +259,11 @@ def get_component_per_year(m, z, p, component):
     title="Energy balance duals per period",
     note="Note: Outliers and zero-valued duals are ignored."
 )
-def graph(tools):
+def graph_energy_balance(tools):
     load_balance = tools.get_dataframe('load_balance.csv')
     load_balance = tools.transform.timestamp(load_balance)
-    load_balance["energy_balance_duals"] = tools.pd.to_numeric(load_balance["normalized_energy_balance_duals_dollar_per_mwh"], errors="coerce") / 10
+    load_balance["energy_balance_duals"] = tools.pd.to_numeric(
+        load_balance["normalized_energy_balance_duals_dollar_per_mwh"], errors="coerce") / 10
     load_balance = load_balance[["energy_balance_duals", "time_row"]]
     load_balance = load_balance.pivot(columns="time_row", values="energy_balance_duals")
     # Don't include the zero-valued duals

@@ -274,3 +275,44 @@ def graph(tools):
         ylabel='Energy balance duals (cents/kWh)',
         showfliers=False
     )
+
+
+@graph(
+    "daily_demand",
+    title="Total daily demand",
+    supports_multi_scenario=True
+)
+def demand(tools):
+    df = tools.get_dataframe("loads.csv", from_inputs=True, drop_scenario_info=False)
+    df = df.groupby(["TIMEPOINT", "scenario_name"], as_index=False).sum()
+    df = tools.transform.timestamp(df, key_col="TIMEPOINT", use_timepoint=True)
+    df = df.groupby(["season", "hour", "scenario_name", "time_row"], as_index=False).mean()
+    df["zone_demand_mw"] /= 1e3
+    pn = tools.pn
+
+    plot = pn.ggplot(df) + \
+           pn.geom_line(pn.aes(x="hour", y="zone_demand_mw", color="scenario_name")) + \
+           pn.facet_grid("time_row ~ season") + \
+           pn.labs(x="Hour (PST)", y="Demand (GW)", color="Scenario")
+    tools.save_figure(plot.draw())
+
+
+@graph(
+    "demand",
+    title="Total demand",
+    supports_multi_scenario=True
+)
+def yearly_demand(tools):
+    df = tools.get_dataframe("loads.csv", from_inputs=True, drop_scenario_info=False)
+    df = df.groupby(["TIMEPOINT", "scenario_name"], as_index=False).sum()
+    df = tools.transform.timestamp(df, key_col="TIMEPOINT", use_timepoint=True)
+    df["zone_demand_mw"] *= df["tp_duration"] / 1e3
+    df["day"] = df["datetime"].dt.day_of_year
+    df = df.groupby(["day", "scenario_name", "time_row"], as_index=False)["zone_demand_mw"].sum()
+    pn = tools.pn
+
+    plot = pn.ggplot(df) + \
+           pn.geom_line(pn.aes(x="day", y="zone_demand_mw", color="scenario_name")) + \
+           pn.facet_grid("time_row ~ .") + \
+           pn.labs(x="Day of Year", y="Demand (GW)", color="Scenario")
+    tools.save_figure(plot.draw())

switch_model/generators/core/build.py

Lines changed: 2 additions & 1 deletion
@@ -668,14 +668,15 @@ def graph_capacity(tools):
         color=tools.get_colors(len(capacity_df.index)),
     )

+    tools.bar_label()

 @graph(
     "buildout_gen_per_period",
     title="Built Capacity per Period",
     supports_multi_scenario=True
 )
 def graph_buildout(tools):
-    build_gen = tools.get_dataframe("BuildGen.csv")
+    build_gen = tools.get_dataframe("BuildGen.csv", dtype={"GEN_BLD_YRS_1": str})
     build_gen = build_gen.rename(
         {"GEN_BLD_YRS_1": "GENERATION_PROJECT", "GEN_BLD_YRS_2": "build_year", "BuildGen": "Amount"},
         axis=1

switch_model/generators/core/dispatch.py

Lines changed: 128 additions & 1 deletion
@@ -619,7 +619,6 @@ def graph_hourly_curtailment(tools):
 @graph(
     "total_dispatch",
     title="Total dispatched electricity",
-    is_long=True,
 )
 def graph_total_dispatch(tools):
     # ---------------------------------- #

@@ -659,6 +658,134 @@ def graph_total_dispatch(tools):
         ylabel="Total dispatched electricity (TWh)"
     )

+    tools.bar_label()
+
+
+@graph(
+    "energy_balance",
+    title="Energy Balance For Every Month",
+    supports_multi_scenario=True,
+    is_long=True
+)
+def energy_balance(tools):
+    # Get dispatch dataframe
+    cols = ["timestamp", "gen_tech", "gen_energy_source", "DispatchGen_MW", "scenario_name", "scenario_index",
+            "Curtailment_MW"]
+    df = tools.get_dataframe("dispatch.csv", drop_scenario_info=False)[cols]
+    df = tools.transform.gen_type(df)
+
+    # Rename and add needed columns
+    df["Dispatch Limit"] = df["DispatchGen_MW"] + df["Curtailment_MW"]
+    df = df.drop("Curtailment_MW", axis=1)
+    df = df.rename({"DispatchGen_MW": "Dispatch"}, axis=1)
+    # Sum dispatch across all the projects of the same type and timepoint
+    key_columns = ["timestamp", "gen_type", "scenario_name", "scenario_index"]
+    df = df.groupby(key_columns, as_index=False).sum()
+    df = df.melt(id_vars=key_columns, value_vars=["Dispatch", "Dispatch Limit"], var_name="Type")
+    df = df.rename({"gen_type": "Source"}, axis=1)
+
+    discharge = df[(df["Source"] == "Storage") & (df["Type"] == "Dispatch")].drop(["Source", "Type"], axis=1).rename(
+        {"value": "discharge"}, axis=1)
+
+    # Get load dataframe
+    load = tools.get_dataframe("load_balance.csv", drop_scenario_info=False)
+    load = load.drop("normalized_energy_balance_duals_dollar_per_mwh", axis=1)
+
+    # Sum load across all the load zones
+    key_columns = ["timestamp", "scenario_name", "scenario_index"]
+    load = load.groupby(key_columns, as_index=False).sum()
+
+    # Subtract storage dispatch from generation and add it to the storage charge to get net flow
+    load = load.merge(
+        discharge,
+        how="left",
+        on=key_columns,
+        validate="one_to_one"
+    )
+    load["ZoneTotalCentralDispatch"] -= load["discharge"]
+    load["StorageNetCharge"] += load["discharge"]
+    load = load.drop("discharge", axis=1)
+
+    # Rename and convert from wide to long format
+    load = load.rename({
+        "ZoneTotalCentralDispatch": "Total Generation (excl. storage discharge)",
+        "TXPowerNet": "Transmission Losses",
+        "StorageNetCharge": "Storage Net Flow",
+        "zone_demand_mw": "Demand",
+    }, axis=1).sort_index(axis=1)
+    load = load.melt(id_vars=key_columns, var_name="Source")
+    load["Type"] = "Dispatch"
+
+    # Merge dispatch contributions with load contributions
+    df = pd.concat([load, df])
+
+    # Add the timestamp information and make period a string to ensure it doesn't mess up the graphing
+    df = tools.transform.timestamp(df).astype({"period": str})
+
+    # Convert to TWh (incl. multiplying by timepoint duration)
+    df["value"] *= df["tp_duration"] / 1e6
+
+    FREQUENCY = "1W"
+
+    def groupby_time(df):
+        return df.groupby([
+            "scenario_name",
+            "period",
+            "Source",
+            "Type",
+            tools.pd.Grouper(key="datetime", freq=FREQUENCY, origin="start")
+        ])["value"]
+
+    df = groupby_time(df).sum().reset_index()
+
+    # Get the state of charge data
+    soc = tools.get_dataframe("StateOfCharge.csv", dtype={"STORAGE_GEN_TPS_1": str}, drop_scenario_info=False)
+    soc = soc.rename({"STORAGE_GEN_TPS_2": "timepoint", "StateOfCharge": "value"}, axis=1)
+    # Sum over all the projects that are in the same scenario with the same timepoint
+    soc = soc.groupby(["timepoint", "scenario_name"], as_index=False).sum()
+    soc["Source"] = "State Of Charge"
+    soc["value"] /= 1e6  # Convert to TWh
+
+    # Group by time
+    soc = tools.transform.timestamp(soc, use_timepoint=True, key_col="timepoint").astype({"period": str})
+    soc["Type"] = "Dispatch"
+    soc = groupby_time(soc).mean().reset_index()
+
+    # Add state of charge to dataframe
+    df = pd.concat([df, soc])
+    # Add a column for day since that's what we really care about
+    df["day"] = df["datetime"].dt.dayofyear
+
+    # Get the colors for the lines
+    colors = tools.get_colors()
+    colors.update({
+        "Transmission Losses": "brown",
+        "Storage Net Flow": "cadetblue",
+        "Demand": "black",
+        "Total Generation (excl. storage discharge)": "black",
+        "State Of Charge": "green"
+    })
+
+    # Plot
+    num_periods = df["period"].nunique()
+    pn = tools.pn
+    plot = pn.ggplot(df) + \
+           pn.geom_line(pn.aes(x="day", y="value", color="Source", linetype="Type")) + \
+           pn.facet_grid("period ~ scenario_name") + \
+           pn.labs(y="Contribution to Energy Balance (TWh)") + \
+           pn.scales.scale_color_manual(values=colors, aesthetics="color", na_value=colors["Other"]) + \
+           pn.scales.scale_x_continuous(
+               name="Month",
+               labels=["J", "F", "M", "A", "M", "J", "J", "A", "S", "O", "N", "D"],
+               breaks=(15, 46, 76, 106, 137, 167, 198, 228, 259, 289, 319, 350),
+               limits=(0, 366)) + \
+           pn.scales.scale_linetype_manual(
+               values={"Dispatch Limit": "dotted", "Dispatch": "solid"}
+           ) + \
+           pn.theme(
+               figure_size=(pn.options.figure_size[0] * tools.num_scenarios, pn.options.figure_size[1] * num_periods))
+
+    tools.save_figure(plot.draw())

 @graph(
     "curtailment_per_period",

switch_model/generators/extensions/storage.py

Lines changed: 33 additions & 9 deletions
@@ -451,7 +451,7 @@ def graph_state_of_charge(tools):
     df = tools.get_dataframe("storage_dispatch")
     df = df.groupby(["timepoint", "scenario_name"], as_index=False)["StateOfCharge"].sum()
     # Add datetime information
-    df = tools.transform.timestamp(df, timestamp_col="timepoint")
+    df = tools.transform.timestamp(df, key_col="timepoint")
     # Count num rows
     num_periods = len(df["period"].unique())

@@ -467,16 +467,24 @@ def graph_state_of_charge(tools):
         how="left"
     )

-    # Convert values to GWh
-    df["StateOfCharge"] /= 1e3
-    df["OnlineEnergyCapacityMWh"] /= 1e3
+    # Convert values to TWh
+    df["StateOfCharge"] /= 1e6
+    df["OnlineEnergyCapacityMWh"] /= 1e6
+
+    # Determine information for the label
+    y_axis_lim = df["OnlineEnergyCapacityMWh"].max()
+    offset = y_axis_lim * 0.05
+    df["label_position"] = df["OnlineEnergyCapacityMWh"] + offset
+    df["label"] = df["OnlineEnergyCapacityMWh"].round(decimals=2)
+    label_x_pos = df["datetime"].median()

     # Plot with plotnine
     pn = tools.pn
     plot = pn.ggplot(df, pn.aes(x="datetime", y="StateOfCharge")) \
         + pn.geom_line() \
-        + pn.labs(y="State of Charge (GWh)", x="Time of Year") \
-        + pn.geom_hline(pn.aes(yintercept="OnlineEnergyCapacityMWh"), linetype="dashed", color='blue')
+        + pn.labs(y="State of Charge (TWh)", x="Time of Year") \
+        + pn.geom_hline(pn.aes(yintercept="OnlineEnergyCapacityMWh"), linetype="dashed", color='blue') \
+        + pn.geom_text(pn.aes(label="label", x=label_x_pos, y="label_position"), fontweight="light", size="10")

     tools.save_figure(by_scenario_and_period(tools, plot, num_periods).draw())

@@ -499,7 +507,7 @@ def graph_state_of_charge_per_duration(tools):
     # Get the total state of charge at each timepoint for each project
     df = tools.get_dataframe("storage_dispatch")[
         ["generation_project", "timepoint", "StateOfCharge", "scenario_name"]]
-    df = tools.transform.timestamp(df, timestamp_col="timepoint")
+    df = tools.transform.timestamp(df, key_col="timepoint")

     # Add the capacity information to the state of charge information
     df = df.merge(

@@ -511,7 +519,9 @@ def graph_state_of_charge_per_duration(tools):
     df = df.groupby(["duration", "scenario_name", "datetime", "period"], as_index=False)[
         ["StateOfCharge", "OnlineEnergyCapacityMWh"]].sum()
     # Convert to GWh
-    df["StateOfCharge"] /= 1e3
+    # df["StateOfCharge"] /= 1e3
+    # Convert to percent
+    df["StateOfCharge"] /= df["OnlineEnergyCapacityMWh"]

     # Plot with plotnine
     pn = tools.pn

@@ -532,7 +542,7 @@ def graph_dispatch_cycles(tools):
     # Aggregate by timepoint
     df = df.groupby("timepoint", as_index=False).sum()
     # Add datetime column
-    df = tools.transform.timestamp(df, timestamp_col="timepoint")
+    df = tools.transform.timestamp(df, key_col="timepoint")
     # Find charge in GWh
     df["StateOfCharge"] /= 1e3

@@ -560,7 +570,12 @@ def graph_dispatch_cycles(tools):
                 title="Storage cycle duration based on fourier transform"
                       " of state of charge")
     ax.semilogx(1 / xfreq, yfreq)
+    # Plot some key cycle lengths
+    ax.axvline(24, linestyle="dotted", label="24 hours", color="red")  # A day
+    ax.axvline(24 * 21, linestyle="dotted", label="3 weeks", color="green")  # 3 weeks
+    ax.axvline(24 * 182.5, linestyle="dotted", label="1/2 Year", color="purple")
     ax.set_xlabel("Hours per cycle")
+    ax.legend()
     ax.grid(True, which="both", axis="x")

@@ -578,6 +593,7 @@ def graph_buildout(tools):
     df = df[df["IncrementalPowerCapacityMW"] != 0]
     df["duration"] = df["IncrementalEnergyCapacityMWh"] / df["IncrementalPowerCapacityMW"]
     df["power"] = df["IncrementalPowerCapacityMW"] / 1e3
+    df["energy"] = df["IncrementalEnergyCapacityMWh"] / 1e3
     df = tools.transform.build_year(df)
     pn = tools.pn
     num_regions = len(df["region"].unique())

@@ -598,6 +614,14 @@ def graph_buildout(tools):
     tools.save_figure(by_scenario(tools, plot).draw(), "storage_duration_histogram")
     tools.save_figure(by_scenario_and_region(tools, plot, num_regions).draw(), "storage_duration_histogram_by_region")

+    plot = pn.ggplot(df, pn.aes(x="duration")) \
+        + pn.geom_histogram(pn.aes(weight="energy"), binwidth=5) \
+        + pn.labs(title="Storage Duration Histogram", x="Duration (h)", y="Energy Capacity (GWh)")
+
+    tools.save_figure(by_scenario(tools, plot).draw(), "storage_duration_histogram_by_energy")
+    tools.save_figure(by_scenario_and_region(tools, plot, num_regions).draw(), "storage_duration_histogram_by_region_and_energy")
+

 def by_scenario(tools, plot):
     pn = tools.pn
