Meters data screening
Normalized consumption is the daily energy consumption (kWh) per unit of building area (square meters). You can replicate the plot shown in Figure 1 by following this notebook.
Input data: raw meter data was used, without processing outliers or missing values.
- Calculate daily sum of energy consumption
- Calculate daily normalized consumption: daily consumption divided by the area of the building
- Scale the values with a min-max scaler
- Plot daily normalized consumption on heatmap
Figure 1: Normalized consumption for each kind of meter during the years 2016-2017, scaled by min-max. Buildings are sorted (bottom to top) from lowest to highest scaled daily normalized consumption.
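The steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the notebook's code: the function name, the wide one-column-per-building layout, the per-building min-max scaling, and sorting by the sum of scaled values are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

def daily_normalized_consumption(meter, area):
    """Return a buildings x days frame of min-max-scaled daily kWh per m2.

    meter: hourly readings (kWh), one column per building, DatetimeIndex.
    area:  building area in square meters, indexed by building id.
    Both layouts are assumptions for this sketch.
    """
    # Daily sum; a day with no readings at all stays NaN instead of becoming 0.
    daily = meter.resample("D").sum(min_count=1)
    # Daily consumption per square meter.
    normalized = daily.div(area, axis="columns")
    # Min-max scaling, applied per building here (an assumption).
    lo, hi = normalized.min(), normalized.max()
    span = (hi - lo).replace(0, np.nan)  # constant series would divide by zero
    scaled = (normalized - lo) / span
    # Sort buildings from lowest to highest total scaled consumption (assumed criterion).
    order = scaled.sum().sort_values().index
    return scaled[order].T  # rows = buildings, columns = days

# Tiny synthetic example: two buildings, three days of hourly data.
hours = pd.date_range("2016-01-01", periods=72, freq=pd.Timedelta(hours=1))
meter = pd.DataFrame(
    {"A": np.repeat([1.0, 2.0, 3.0], 24), "B": np.repeat([2.0, 2.0, 4.0], 24)},
    index=hours,
)
area = pd.Series({"A": 100.0, "B": 50.0})
heatmap_data = daily_normalized_consumption(meter, area)
```

The resulting frame has one row per building and one column per day, so it could be passed directly to a heatmap function such as `seaborn.heatmap`.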
The purpose of this screening is to assess the quality of the data (missing values, outliers, zero readings) at a glance. You can replicate the plot shown in Figure 2 by following this notebook.
Input data: raw meter data was used and processed before plotting.
- Detect atypical data. Outliers in the raw meter dataset were detected using Seasonal Hybrid ESD (S-H-ESD), developed by Twitter. This part was implemented in R; the process can be found here.
Note: detecting outliers in all the meter datasets can take a couple of hours, go cook dinner while you wait.
- Label outliers, missing values, zero values and "good data"
- Plot labels on heatmap
Figure 2: Data quality plot of each meter type. Sorted (bottom-to-top) according to increasing number of "good data".
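The labeling step can be sketched as below, assuming the outlier flags have already been produced elsewhere (for example by the S-H-ESD run in R) and loaded as a boolean frame. The numeric label codes, the function name, and the precedence when a point matches several categories (missing over zero over outlier) are hypothetical choices for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical label codes; the notebook may use different ones.
MISSING, ZERO, OUTLIER, GOOD = 0, 1, 2, 3

def label_quality(readings, is_outlier):
    """Label every reading and sort meters by their count of good points.

    readings:   meter readings, one column per building (assumed layout).
    is_outlier: boolean frame of the same shape, True where an outlier
                was flagged by the external detection step.
    """
    # Start by assuming everything is good, then overwrite by category.
    labels = pd.DataFrame(GOOD, index=readings.index, columns=readings.columns)
    labels = labels.mask(is_outlier, OUTLIER)
    labels = labels.mask(readings == 0, ZERO)
    labels = labels.mask(readings.isna(), MISSING)
    # Fewest good points first, so that building ends up at the bottom of the plot.
    order = (labels == GOOD).sum().sort_values().index
    return labels[order].T  # rows = buildings, columns = timestamps

readings = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0], "B": [1.0, 0.0, np.nan, 50.0]})
is_outlier = pd.DataFrame({"A": [False] * 4, "B": [False, False, False, True]})
quality = label_quality(readings, is_outlier)
```

Because the labels are small integers, a discrete colormap with one color per code is enough to draw the heatmap.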
Weather sensitivity is the correlation (Spearman's rank coefficient was used here) between the energy consumption (kWh) and the outside air temperature (ºC). A positive correlation implies higher energy consumption when the outside temperature rises (i.e., cooling energy). Conversely, a negative correlation indicates energy used for heating. Buildings (y axis) are ordered from negative to positive correlation sum. You can replicate the plot shown in Figure 3 by following this notebook.
Input data: raw meter data was used and processed before plotting (outliers and 24-hour zero readings removed).
- Add the site ID (from the metadata dataset) and the outside air temperature (from the weather dataset) as columns to the meter dataset
- Calculate Spearman's rank correlation coefficient between the meter reading and the outside air temperature, for each building and month
- Plot correlation coefficient on heatmap
Figure 3: Weather sensitivity plot of each meter type. Spearman's rank coefficient was calculated between the meter reading (kWh) and the outside air temperature (ºC) for each month and building. Sorted (bottom-to-top) according to increasing sum of coefficients.
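A rough pandas sketch of the three steps follows. The long table layout and the column names (`building_id`, `site_id`, `timestamp`, `meter_reading`, `air_temperature`) are assumptions for illustration and may not match the actual files. Spearman's coefficient is computed here as the Pearson correlation of the ranks, which is its standard definition and avoids an extra dependency.

```python
import pandas as pd

def spearman(pair):
    """Spearman's rho for a two-column frame: Pearson correlation of the ranks."""
    ranks = pair.dropna().rank()
    return ranks.iloc[:, 0].corr(ranks.iloc[:, 1])

def monthly_weather_sensitivity(readings, metadata, weather):
    """Return a buildings x months frame of Spearman coefficients.

    All three inputs are long tables with hypothetical column names.
    """
    # Attach the site ID, then the outside air temperature for that site and time.
    df = readings.merge(metadata[["building_id", "site_id"]], on="building_id")
    df = df.merge(weather, on=["site_id", "timestamp"])
    df["month"] = df["timestamp"].dt.to_period("M")
    # One coefficient per building and month.
    coeffs = (
        df.groupby(["building_id", "month"])[["meter_reading", "air_temperature"]]
        .apply(spearman)
        .unstack("month")
    )
    # Sort buildings by increasing sum of coefficients.
    order = coeffs.sum(axis=1).sort_values().index
    return coeffs.loc[order]

# Synthetic example: b1 rises with temperature (cooling-like), b2 falls (heating-like).
ts = pd.to_datetime(["2016-01-01", "2016-01-02", "2016-01-03",
                     "2016-02-01", "2016-02-02", "2016-02-03"])
temp = [1.0, 5.0, 3.0, 9.0, 2.0, 4.0]
weather = pd.DataFrame({"site_id": 0, "timestamp": ts, "air_temperature": temp})
metadata = pd.DataFrame({"building_id": ["b1", "b2"], "site_id": [0, 0]})
readings = pd.concat(
    [
        pd.DataFrame({"building_id": "b1", "timestamp": ts,
                      "meter_reading": [t ** 2 for t in temp]}),
        pd.DataFrame({"building_id": "b2", "timestamp": ts,
                      "meter_reading": [20.0 - t for t in temp]}),
    ],
    ignore_index=True,
)
sensitivity = monthly_weather_sensitivity(readings, metadata, weather)
```

In the synthetic data `b1` grows with the square of the temperature, so its rank correlation is still perfect even though the relationship is not linear, which is the reason for using a rank-based coefficient.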
A breakout occurs when the values in a time series depart from their established pattern; a breakout is typically characterized by two steady states and an intermediate transition period. Breakouts in the raw meter dataset were detected with the Breakout Detection package developed by Twitter, choosing 168 points (a week) as the minimum to define a gap. A brief introduction to this package can be found here. This was implemented in R; the process can be found here. You can replicate the plot shown in Figure 4 by following this notebook.
Input data: raw meter data was used.
- Detect breakouts.
Note: detecting breakouts in all the meter datasets can take days, go watch the Sopranos while you wait (the whole show). Or you can run it in several R sessions at the same time (how many depends on your number of cores).
- Label each point with the gap it belongs to (gaps are numbered starting from 0)
- Plot labels on heatmap
Figure 4: Breakout detection heatmap sorted (bottom-to-top) according to increasing number of breakouts detected.
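Once the breakout positions have been exported from R, the labeling step can be sketched as below. The function name is hypothetical, and the sketch assumes each breakout position is the 0-based index of the first point of a new gap; the actual export may use a different convention (R indices are 1-based).

```python
import numpy as np

def label_gaps(n_points, breakout_positions):
    """Assign every point the number of the gap it belongs to, starting at 0.

    breakout_positions: 0-based indices where a new gap starts (assumed convention).
    """
    starts = np.zeros(n_points, dtype=int)
    starts[np.asarray(breakout_positions, dtype=int)] = 1  # mark each gap start
    return starts.cumsum()  # running count of gap starts = gap number

labels = label_gaps(10, [4, 7])
# labels -> [0 0 0 0 1 1 1 2 2 2]
n_breakouts = int(labels[-1])  # the last gap number equals the number of breakouts
```

With this numbering, the last label of each meter gives its breakout count, which is the quantity used to sort the rows of the heatmap.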
- Miller, C., 2017. Screening Meter Data: Characterization of Temporal Energy Data from Large Groups of Non-Residential Buildings. ETH Zürich, Zurich, Switzerland.
- Miller, C., Schlueter, A., 2015. Forensically discovering simulation feedback knowledge from a campus energy information system. doi:10.13140/RG.2.1.2286.0964.