Bases: nilmtk.hashable.Hashable
Represents an appliance instance.
Attributes
metadata | (dict) Appliance metadata attributes. See http://nilm-metadata.readthedocs.org/en/latest/dataset_metadata.html#appliance |
Static (AKA class) variable. Maps from appliance_type (string) to a dict describing metadata about each appliance type.
If pretty=False, return the string ‘(<type>, <identifier>)’, e.g. ‘(fridge, 1)’. If pretty=True, return a string like ‘Fridge’ or ‘Fridge 2’. If type == ‘unknown’ then original_name is appended to the end of the label.
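The labelling rules above can be sketched in plain Python. This is an illustration only, not nilmtk's code: the helper name `appliance_label`, the rule that the instance number is shown only when it is greater than 1, and the comma used to append `original_name` are all assumptions.

```python
def appliance_label(appliance_type, instance, pretty=False, original_name=None):
    """Hypothetical helper mirroring the labelling rules described above."""
    if pretty:
        label = appliance_type.capitalize()
        # Assumption: the instance number is only shown when it is greater than 1.
        if instance > 1:
            label += " {}".format(instance)
    else:
        label = "({}, {})".format(appliance_type, instance)
    # Assumption: original_name is appended after a comma for unknown appliances.
    if appliance_type == "unknown" and original_name:
        label += ", {}".format(original_name)
    return label
```

For instance, `appliance_label("fridge", 2, pretty=True)` gives `"Fridge 2"` under these assumptions.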
Bases: nilmtk.hashable.Hashable
Attributes
elec | (MeterGroup) |
metadata | (dict) Metadata just about this building (e.g. geo location etc). See http://nilm-metadata.readthedocs.org/en/latest/dataset_metadata.html#building Has these additional keys: dataset : string |
Bases: object
Attributes
buildings | (OrderedDict) Each key is an integer, starting from 1. Each value is a nilmtk.Building object. |
store | (nilmtk.DataStore) |
metadata | (dict) Metadata describing the dataset name, authors etc. (Metadata about specific buildings, meters, appliances etc. is stored elsewhere.) See http://nilm-metadata.readthedocs.org/en/latest/dataset_metadata.html#dataset |
Returns a DataFrame describing this dataset. Each column is a building. Each row is a feature.
doc_inherit decorator
Usage:
    class Foo(object):
        def foo(self):
            "Frobber"
            pass

    class Bar(Foo):
        @doc_inherit
        def foo(self):
            pass
Now, Bar.foo.__doc__ == Bar().foo.__doc__ == Foo.foo.__doc__ == “Frobber”
from: http://code.activestate.com/recipes/576862-docstring-inheritance-decorator/
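To show the idea of docstring inheritance in isolation, here is a minimal sketch. It is not the ActiveState recipe and not nilmtk's DocInherit: the class name `InheritDoc` is hypothetical, and it relies on `__set_name__`, which needs Python 3.6 or later.

```python
class InheritDoc:
    """Hypothetical stand-in for doc_inherit; not the recipe's implementation."""

    def __init__(self, func):
        self.func = func

    def __set_name__(self, owner, name):
        # Called once when the owning class is created (Python 3.6+).
        if self.func.__doc__ is None:
            # Walk the parent classes and copy the first docstring found.
            for base in owner.__mro__[1:]:
                parent = getattr(base, name, None)
                if parent is not None and parent.__doc__:
                    self.func.__doc__ = parent.__doc__
                    break

    def __get__(self, obj, objtype=None):
        # Delegate to the wrapped function so normal method binding still works.
        return self.func.__get__(obj, objtype)


class Foo:
    def foo(self):
        "Frobber"
        return "Foo.foo"


class Bar(Foo):
    @InheritDoc
    def foo(self):
        return "Bar.foo"
```

With this sketch, `Bar.foo.__doc__`, `Bar().foo.__doc__` and `Foo.foo.__doc__` are all `"Frobber"`, while `Bar().foo()` still runs the overriding method.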
Bases: object
Docstring inheriting method descriptor
The class itself is also used as a decorator
alias of DocInherit
Bases: nilmtk.hashable.Hashable, nilmtk.electric.Electric
Represents a physical electricity meter.
Attributes
appliances | (list of Appliance objects) Appliances connected immediately downstream of this meter. Will be [] if no appliances are connected directly to this meter. |
store | (nilmtk.DataStore) |
key | (string) key into nilmtk.DataStore to access data. |
metadata | (dict) See http://nilm-metadata.readthedocs.org/en/latest/dataset_metadata.html#elecmeter |
Finds available alternating current types for a specific physical quantity.
Parameters: | physical_quantity : str or list of strings |
---|---|
Returns: | list of strings e.g. [‘apparent’, ‘active’] |
See also
_compute_stat, _get_stat_from_cache_or_compute, key_for_cached_stat, get_cached_stat
Tries to find the most dominant appliance on this meter, and then returns that appliance object. Will return None if there are no appliances on this meter.
Parameters: | ignore_gaps : bool, default=True
full_results : bool, default=False
**loader_kwargs : key word arguments for DataStore.load() |
---|---|
Returns: | DropoutRateResults object if full_results is True, else float |
Parameters: | key_for_stat : str |
---|---|
Returns: | pd.DataFrame |
See also
_compute_stat, _get_stat_from_cache_or_compute, key_for_cached_stat, clear_cache
Parameters: | full_results : bool, default=False
**loader_kwargs : key word arguments for DataStore.load() |
---|---|
Returns: | if full_results is True then return nilmtk.stats.GoodSectionsResults object otherwise return list of TimeFrame objects. |
Parameters: | stat_name : str |
---|---|
Returns: | key : str |
See also
clear_cache, _compute_stat, _get_stat_from_cache_or_compute, get_cached_stat
Returns a string describing this meter.
Parameters: | pretty : boolean
|
---|---|
Returns: | string : A label listing all the appliance types. |
Returns a generator of DataFrames loaded from the DataStore.
By default, load will load all available columns from the DataStore. Specific columns can be selected in one of two mutually exclusive ways: by listing them in cols, or by giving a physical_quantity and/or an ac_type.
If resample is True then, by default, gaps shorter than max_sample_period are forward filled.
Parameters: | physical_quantity : string or list of strings
ac_type : string or list of strings, defaults to None
cols : list of tuples, using NILMTK’s vocabulary for measurements.
sample_period : int, defaults to None
resample : boolean, defaults to False
resample_kwargs : dict of key word arguments (other than ‘rule’) to
preprocessing : list of Node subclass instances
**kwargs : any other key word arguments to pass to self.store.load() |
---|---|
Returns: | Always return a generator of DataFrames (even if it only has a single column). |
Raises: | nilmtk.exceptions.MeasurementError if a measurement is specified which is not available. |
Convert all relevant attributes to a dict to be saved as metadata in destination at location specified by key
Bases: object
Common implementations of methods shared by ElecMeter and MeterGroup.
Returns runs of an appliance.
Most appliances spend a lot of their time off. This function finds periods when the appliance is on.
Parameters: | min_off_duration : int
min_on_duration : int
border : int
on_power_threshold : int or float
**kwargs : kwargs for self.power_series() |
---|---|
Returns: | list of pd.Series. Each series contains one activation. |
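The idea of finding "on" periods can be sketched on a plain list of power readings. This is a simplified illustration, not nilmtk's implementation: the function name `find_activations` is hypothetical, durations are counted in samples rather than seconds, a reading counts as "on" when it is at or above the threshold, and the border parameter is left out.

```python
def find_activations(power, on_power_threshold, min_on_duration=0, min_off_duration=0):
    """Return (start, end) index pairs, end exclusive, of runs where the appliance is on."""
    # Step 1: find raw runs of consecutive readings at or above the threshold.
    runs = []
    start = None
    for i, p in enumerate(power):
        if p >= on_power_threshold and start is None:
            start = i
        elif p < on_power_threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(power)))

    # Step 2: merge runs separated by an off period shorter than min_off_duration.
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_off_duration:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)

    # Step 3: drop runs shorter than min_on_duration.
    return [(s, e) for s, e in merged if e - s >= min_on_duration]
```

Merging before filtering means a brief dip below the threshold does not split one activation into two short ones that would then both be discarded.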
Return a histogram vector showing when activity occurs.
e.g. to see when, over the course of an average day, activity occurs then use bin_duration=’H’ and period=’D’.
Parameters: | period : str. Pandas period alias.
bin_duration : str. Pandas period alias e.g. ‘H’ = hourly; ‘D’ = daily. |
---|---|
Returns: | hist : np.ndarray |
Finds available alternating current types from power measurements.
Returns: | list of strings e.g. [‘apparent’, ‘active’]. Note: deprecated in NILMTK v0.3. Use available_ac_types(‘power’) instead of available_power_ac_types. |
---|
Calculate the average energy per period. e.g. the average energy per day.
Parameters: | offset_alias : str
use_uptime : bool |
---|---|
Returns: | pd.Series
|
Finds the correlation between the two ElecMeters. Both ElecMeters should be perfectly aligned. Adapted from: http://www.johndcook.com/blog/2008/11/05/how-to-calculate-pearson-correlation-accurately/
Parameters: | other : an ElecMeter or MeterGroup object |
---|---|
Returns: | float : [-1, 1] |
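As a sketch of a numerically careful Pearson correlation, the version below standardises each series first (subtract the mean, divide by the sample standard deviation) and then averages the products. It works on two aligned lists of numbers; the function name `pearson_correlation` is hypothetical and this is not nilmtk's chunked implementation. It does not handle a constant series, where the standard deviation is zero.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample standard deviations (n - 1 in the denominator).
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    # Sum the products of the standardised values, then normalise.
    total = sum(((a - mean_x) / std_x) * ((b - mean_y) / std_y)
                for a, b in zip(x, y))
    return total / (n - 1)
```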
This implementation is provided courtesy of the NPEET toolbox; the authors kindly allowed us to use their code directly. As a courtesy, you may wish to cite their paper if you use this function. It fails when there is a large number of records; the authors still need to be asked how to handle this. It uses the classic K-L k-nearest neighbor continuous entropy estimator. x should be a list of vectors, e.g. x = [[1.3],[3.7],[5.1],[2.4]] if x is a one-dimensional scalar and we have four samples.
Parameters: | ac_type : str
physical_quantity : str
**kwargs : passed through to load(). |
---|---|
Returns: | generator of pd.Series. If a single ac_type is found for the physical_quantity then the series.name will be a normal tuple. If more than 1 ac_type is found then the ac_type will be a string of the ac_types with ‘+’ in between. e.g. ‘active+apparent’. |
Parameters: | key : dict |
---|---|
Returns: | True if all key:value pairs in key match any appliance in self.appliances. |
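One reading of this rule is "some single appliance satisfies every key:value pair". The sketch below uses plain dicts as hypothetical stand-ins for Appliance objects; the real method may compare against different attributes.

```python
def matches_appliances(appliances, key):
    """True if at least one appliance dict matches every key:value pair in key."""
    return any(
        all(appliance.get(k) == v for k, v in key.items())
        for appliance in appliances
    )
```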
Mutual information of two ElecMeters. x and y should each be a list of vectors, e.g. x = [[1.3],[3.7],[5.1],[2.4]] if x is a one-dimensional scalar and we have four samples.
Parameters: | other : ElecMeter or MeterGroup |
---|
Returns the minimum on_power_threshold across all appliances immediately downstream of this meter. If any appliance does not have an on_power_threshold then default to 10 watts.
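A minimal sketch of this rule, using dicts as hypothetical stand-ins for appliances. It assumes the 10 watt default is applied per appliance before taking the minimum, and that a meter with no appliances also falls back to the default; both points are assumptions rather than confirmed behaviour.

```python
DEFAULT_ON_POWER_THRESHOLD = 10  # watts, the default stated above


def min_on_power_threshold(appliances):
    """Smallest on_power_threshold among the appliances, defaulting to 10 W."""
    thresholds = [
        a.get("on_power_threshold", DEFAULT_ON_POWER_THRESHOLD)
        for a in appliances
    ]
    # Assumption: with no appliances at all, fall back to the default.
    return min(thresholds) if thresholds else DEFAULT_ON_POWER_THRESHOLD
```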
Parameters: | width : int, optional
ax : matplotlib.axes, optional
plot_legend : boolean, optional
unit : {‘W’, ‘kW’}
**kwargs |
---|
Plots autocorrelation of power data Reference: http://www.itl.nist.gov/div898/handbook/eda/section3/autocopl.htm
Returns: | matplotlib.axis |
---|
Plots a lag plot of power data http://www.itl.nist.gov/div898/handbook/eda/section3/lagplot.htm
Returns: | matplotlib.axis |
---|
Parameters: | ax : axes
load_kwargs : dict
plot_kwargs : dict
range : None or tuple
**hist_kwargs |
---|---|
Returns: | ax |
Plots spectral plot of power data http://www.itl.nist.gov/div898/handbook/eda/section3/spectrum.htm
Code borrowed from: http://glowingpython.blogspot.com/2011/08/how-to-plot-frequency-spectrum-with.html
Returns: | matplotlib.axis |
---|
Get power Series.
Parameters: | ac_type : str, defaults to ‘best’
**kwargs :
|
---|---|
Returns: | generator of pd.Series of power measurements. |
Compute the proportion of energy of self compared to other.
By default, only uses other.good_sections(). You may want to set sections=self.good_sections().intersection(other.good_sections())
Parameters: | other : nilmtk.MeterGroup or ElecMeter
|
---|---|
Returns: | float [0,1] or NaN if other.total_energy == 0 |
Returns a value in the range [0,1] specifying the proportion of the upstream meter’s total energy used by this meter.
Returns an array of pd.DateTime when a switch occurs as defined by threshold
Parameters: | threshold : int, difference in watts between successive readings that counts as an appliance state change |
---|
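Switch detection by thresholding the change between successive readings can be sketched as follows. The function name and list-based inputs are hypothetical, and the use of the absolute difference with a strict comparison is an assumption.

```python
def switch_times(timestamps, power, threshold):
    """Timestamps at which power changed by more than threshold watts."""
    switches = []
    for i in range(1, len(power)):
        # Compare each reading with the one immediately before it.
        if abs(power[i] - power[i - 1]) > threshold:
            switches.append(timestamps[i])
    return switches
```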
Are the connected appliances on (True) or off (False)?
Uses self.on_power_threshold() if on_power_threshold not provided.
Parameters: | on_power_threshold : number, optional
**load_kwargs : key word arguments
|
---|---|
Returns: | generator of pd.Series
|
Returns runs of an appliance.
Most appliances spend a lot of their time off. This function finds periods when the appliance is on.
Parameters: | chunk : pd.Series
min_off_duration : int
min_on_duration : int
border : int
on_power_threshold : int or float
|
---|---|
Returns: | list of pd.Series. Each series contains one activation. |
Returns a generator of 2-column pd.DataFrames. The first column is from master, the second from slave.
Takes the sample rate and good_periods of master and applies to slave.
Parameters: | master, slave : ElecMeter or MeterGroup instances |
---|
File defining custom nilmtk exception classes.
Parameters: | column_tuples : list of 2-tuples |
---|---|
Returns: | pd.MultiIndex |
Selects the ‘best’ alternating current measurement type from available_ac_types.
Parameters: | available_ac_types : list of strings
mains_ac_types : list of strings, optional
|
---|---|
Returns: | best_ac_type : string |
Bases: nilmtk.electric.Electric
A group of ElecMeter objects. Can contain nested MeterGroup objects.
Implements many of the same methods as ElecMeter.
Attributes
meters | (list of ElecMeters or nested MeterGroups) |
disabled_meters | (list of ElecMeters or nested MeterGroups) |
name | (only set by functions like ‘groupby’ and ‘select_top_k’) |
Returns set of all available alternating current types for a specific physical quantity.
Parameters: | physical_quantity : str or list of strings |
---|---|
Returns: | list of strings e.g. [‘apparent’, ‘active’] |
Calls method on each element in self.meters.
Parameters: | method : str
|
---|---|
Returns: | pd.Series of result of method called on each element in self.meters. |
Returns True if this MeterGroup contains meters from more than one building.
Parameters: | sample_period : int or float, optional
resample : bool, defaults to True
**kwargs :
ac_type : string, defaults to ‘best’
physical_quantity : string, defaults to ‘power’ |
---|---|
Returns: | DataFrame
|
Returns pd.Series describing this MeterGroup.
Sums together total energy for each meter.
Parameters: | full_results : bool, default=False
**loader_kwargs : key word arguments for DataStore.load() |
---|---|
Returns: | if full_results is True then return a TotalEnergyResults object; otherwise return either a single number or, if there are multiple AC types, a pd.Series with a row for each AC type. |
Returns a pd.DataFrame where each column is a meter.identifier and each value is total energy. Index is AC types.
Does not care about wiring hierarchy. Does not attempt to ensure all channels share the same time sections.
Parameters: | per_period : None or offset alias
ac_type : None or str
use_meter_labels : bool
mains : None or MeterGroup or ElecMeter
|
---|---|
Returns: | pd.DataFrame if mains is None else a pd.Series |
Finds the entropy of each meter in this MeterGroup.
Returns: | pd.Series of entropy |
---|
Fraction of energy per meter.
Return pd.Series. Index is meter.instance. Each value is a float in the range [0,1].
Parameters: | meter_ids : list or tuple
|
---|---|
Returns: | MeterGroup |
Assemble a new meter group using the same meter IDs and nested MeterGroups as other. This is useful for preparing a ground truth metergroup from a meter group of NILM predictions.
Parameters: | other : MeterGroup
dataset : string
|
---|---|
Returns: | MeterGroup |
Create human-readable meter labels.
Parameters: | meter_ids : list of ElecMeterIDs (or 3-tuples in same order as ElecMeterID) |
---|---|
Returns: | list of strings describing the appliances. |
Returns: | nilmtk.TimeFrame representing the timeframe which is the union
|
---|
Returns good sections for just the first meter.
TODO: combine good sections from every meter.
e.g. groupby(‘category’)
Returns: | MeterGroup of nested MeterGroups: one per group |
---|
Parameters: | store : nilmtk.DataStore
elec_meters : dict of dicts
appliances : list of dicts
building_id : BuildingID |
---|
Returns a generator of DataFrames loaded from the DataStore.
By default, load will load all available columns from the DataStore. Specific columns can be selected in one of two mutually exclusive ways: by listing them in cols, or by giving a physical_quantity and/or an ac_type.
Each meter in the MeterGroup will first be resampled before being added. The returned DataFrame will include NaNs at timestamps where no meter had a sample (after resampling the meter).
Parameters: | sample_period : int or float, optional
resample_kwargs : dict of key word arguments (other than ‘rule’) to
chunksize : int, optional
**kwargs :
physical_quantity : string or list of strings
ac_type : string or list of strings, defaults to None
cols : list of tuples, using NILMTK’s vocabulary for measurements.
preprocessing : list of Node subclass instances
|
---|---|
Returns: | Always return a generator of DataFrames (even if it only has a single column). Note: different AC types will be treated separately. |
Calls method on all pairs in self.meters.
Assumes method is symmetrical.
Parameters: | method : str
|
---|---|
Returns: | pd.DataFrame of the result of method called on each pair in self.meters. |
Finds the pairwise correlation among different meters in a MeterGroup.
Returns: | pd.DataFrame of correlation between pair of ElecMeters. |
---|
Finds the pairwise mutual information among different meters in a MeterGroup.
Returns: | pd.DataFrame of mutual information between pair of ElecMeters. |
---|
Parameters: | width : int, optional
ax : matplotlib.axes, optional
plot_legend : boolean, optional
kind : {‘separate lines’, ‘sum’, ‘area’, ‘snakey’, ‘energy bar’}
timeframe : nilmtk.TimeFrame, optional
|
---|
Parameters: | label_func : str or None
include_disabled_meters : bool |
---|
Create multiple subplots.
Parameters: | axes : list of matplotlib axes objects.
meter_keys : list of keys for identifying ElecMeters or MeterGroups.
plot_func : string
kwargs_per_meter : dict
pretty_label : bool
**kwargs : any key word arguments to pass the same values to the
|
---|---|
Returns: | axes (flattened into a 1D list) |
Returns: | float [0,1] or NaN if mains total_energy == 0 |
---|
Select a group of meters based on meter metadata.
e.g.
* select(building=1, sample_period=6)
* select(room=‘bathroom’)
If multiple criteria are supplied then these are ANDed together.
Returns: | new MeterGroup of selected meters. |
---|
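ANDing several criteria together can be illustrated with a small filter over metadata dicts. The dicts and the standalone `select` function are hypothetical stand-ins; the real method works on meters inside a MeterGroup and returns a new MeterGroup.

```python
def select(meters, **criteria):
    """Keep only the meters whose metadata matches every supplied criterion."""
    return [
        meter for meter in meters
        if all(meter.get(key) == value for key, value in criteria.items())
    ]


# Hypothetical metadata for three meters.
METERS = [
    {"building": 1, "sample_period": 6, "room": "kitchen"},
    {"building": 1, "sample_period": 1, "room": "bathroom"},
    {"building": 2, "sample_period": 6, "room": "bathroom"},
]
```

Here `select(METERS, building=1, sample_period=6)` keeps only the first meter, because both conditions must hold at once.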
Only select the top K meters, according to energy.
Functions on the entire MeterGroup. So if you mean to select the top K from only the submeters, please do something like this:
elec.submeters().select_top_k()
Parameters: | k : int, optional, defaults to 5
by : string, optional, defaults to energy
asc: bool, optional, defaults to False
group_remainder : bool, optional, defaults to False
**kwargs : key word arguments to pass to load() |
---|---|
Returns: | MeterGroup |
Select a group of meters based on appliance metadata.
e.g.
* select(category=‘lighting’)
* select(type=‘fridge’)
* select(building=1, category=‘lighting’)
* select(room=‘bathroom’)
If multiple criteria are supplied then these are ANDed together.
Returns: | new MeterGroup of selected meters. |
---|
Parameters: | threshold : number, threshold in Watts |
---|---|
Returns: | sim_switches : pd.Series of type {timestamp: number of simultaneous switches} |
Notes
This function assumes that the submeters in this MeterGroup are all aligned. If they are not then you should align the meters, e.g. by using an Apply node with resample.
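Counting simultaneous switches across aligned meters can be sketched with a counter keyed by timestamp. The input format (one list of switch timestamps per meter) and the choice to report only timestamps shared by at least two meters are assumptions for illustration.

```python
from collections import Counter


def simultaneous_switches(switch_times_per_meter):
    """Map each timestamp to the number of meters that switched at that time."""
    counts = Counter()
    for times in switch_times_per_meter:
        # Use a set so one meter cannot be counted twice for the same timestamp.
        counts.update(set(times))
    # Assumption: only timestamps where two or more meters switched are reported.
    return {t: n for t, n in sorted(counts.items()) if n > 1}
```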
Sums together total meter_energy for each meter.
Note that this function does not return the total aggregate energy for a building. Instead this function adds up the total energy for all the meters contained in this MeterGroup. If you want the total aggregate energy then please use MeterGroup.mains().total_energy().
Parameters: | full_results : bool, default=False
**loader_kwargs : key word arguments for DataStore.load() |
---|---|
Returns: | if full_results is True then return TotalEnergyResults object else return a pd.Series with a row for each AC type. |
Parameters: | train_fraction |
---|---|
Returns: | split_time: pd.Timestamp where split should happen |
Returns: | new MeterGroup where its set of meters is the union of self.meters and other.meters. |
---|
Returns single upstream meter. Raises RuntimeError if more than 1 upstream meter.
Swap present mains meter(s) for mains meter(s) in disabled_meters. This is useful if the dataset has multiple, redundant mains meters (e.g. in UK-DALE buildings 1, 2 and 5).
Combines chunks into a single DataFrame.
Adds or averages columns, depending on whether each column is in PHYSICAL_QUANTITIES_TO_AVERAGE.
Returns: | DataFrame |
---|
Parameters: | master, slave : MeterGroup |
---|---|
Returns: | list of 2-tuples of the form (master_meter, slave_meter) |
Metrics to compare disaggregation performance against ground truth data.
All metrics functions have the same interface. Each function takes predictions and ground_truth parameters, both of which are nilmtk.MeterGroup objects. Each function returns one of two types: either a pd.Series or a single float. Most functions return a pd.Series where each index element is a meter instance int or a tuple of ints for MeterGroups.
Below is the notation used to mathematically define each metric.
\(T\) - number of time slices.
\(t\) - a time slice.
\(N\) - number of appliances.
\(n\) - an appliance.
\(y^{(n)}_t\) - ground truth power of appliance \(n\) in time slice \(t\).
\(\hat{y}^{(n)}_t\) - estimated power of appliance \(n\) in time slice \(t\).
\(x^{(n)}_t\) - ground truth state of appliance \(n\) in time slice \(t\).
\(\hat{x}^{(n)}_t\) - estimated state of appliance \(n\) in time slice \(t\).
Compute error in assigned energy.
Parameters: | predictions, ground_truth : nilmtk.MeterGroup |
---|---|
Returns: | errors : pd.Series
|
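Using the notation above, one plausible reading of this metric is the absolute difference between total ground-truth and total estimated energy for each appliance, \(\left| \sum_t y^{(n)}_t - \sum_t \hat{y}^{(n)}_t \right|\). The sketch below assumes that reading, equal-length time slices, and plain dicts mapping an appliance instance to a list of power values instead of MeterGroup objects.

```python
def error_in_assigned_energy(predictions, ground_truth):
    """Per-appliance absolute difference between total predicted and true values."""
    return {
        n: abs(sum(ground_truth[n]) - sum(predictions[n]))
        for n in ground_truth
    }
```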
Compute F1 scores.
Parameters: | predictions, ground_truth : nilmtk.MeterGroup |
---|---|
Returns: | f1_scores : pd.Series
|
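The F1 score is the harmonic mean of precision and recall. The sketch below computes it for one appliance from two equal-length sequences of on/off states, matching \(x^{(n)}_t\) and \(\hat{x}^{(n)}_t\) in the notation above. How nilmtk derives states from power is not covered here, and returning 0.0 when there are no true positives is an assumption.

```python
def f1_score(predicted_states, true_states):
    """F1 score for one appliance from paired on/off (truthy/falsy) states."""
    pairs = list(zip(predicted_states, true_states))
    tp = sum(1 for p, t in pairs if p and t)        # predicted on, truly on
    fp = sum(1 for p, t in pairs if p and not t)    # predicted on, truly off
    fn = sum(1 for p, t in pairs if not p and t)    # predicted off, truly on
    if tp == 0:
        # Assumption: define F1 as 0 when nothing was correctly predicted on.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```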
Compute fraction of energy assigned correctly
Ignores distinction between different AC types, instead if there are multiple AC types for each meter then we just take the max value across the AC types.
Parameters: | predictions, ground_truth : nilmtk.MeterGroup |
---|---|
Returns: | fraction : float in the range [0,1]
|
Compute mean normalized error in assigned power
Parameters: | predictions, ground_truth : nilmtk.MeterGroup |
---|---|
Returns: | mne : pd.Series
|
Compute RMS error in assigned power
Parameters: | predictions, ground_truth : nilmtk.MeterGroup |
---|---|
Returns: | error : pd.Series
|
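In the notation above, the RMS error for one appliance would conventionally be \(\sqrt{\frac{1}{T}\sum_t (y_t - \hat{y}_t)^2}\). The sketch below assumes that standard definition and operates on two equal-length lists of power values for a single appliance, not on MeterGroup objects.

```python
import math


def rms_error_power(predictions, ground_truth):
    """Root-mean-square error between predicted and true power for one appliance."""
    if len(predictions) != len(ground_truth) or not ground_truth:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(y - y_hat) ** 2 for y_hat, y in zip(predictions, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))
```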
Bases: object
Abstract class defining interface for all Node subclasses, where a ‘node’ is a module which runs pre-processing or statistics (or, later, maybe NILM training or disaggregation).
Checks that self.upstream.dry_run_metadata satisfies self.requirements.
Raises: | UnsatisfiedRequirementsError |
---|
Does a ‘dry run’ so we can validate the full pipeline before loading any data.
Returns: | dict : dry run metadata |
---|
Returns: | Set of measurements that need to be loaded from disk for this node. |
---|
Parameters: | state, requirements : dict
|
---|---|
Returns: | list of strings describing (for human consumption) which conditions are not satisfied. If all conditions are satisfied then returns an empty list. |
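A check of this kind can be sketched as a recursive comparison of two dicts. The function name, the message wording, and the dotted path used for nested keys are hypothetical; nilmtk's own function may treat some keys specially.

```python
def find_unsatisfied(state, requirements, path=""):
    """Return human-readable descriptions of requirements that state does not meet."""
    problems = []
    for key, required in requirements.items():
        name = path + key
        if key not in state:
            problems.append("'{}' is required but missing from state".format(name))
        elif isinstance(required, dict) and isinstance(state[key], dict):
            # Nested requirements are checked against the nested state.
            problems.extend(find_unsatisfied(state[key], required, name + "."))
        elif state[key] != required:
            problems.append(
                "'{}' must be {!r} but is {!r}".format(name, required, state[key])
            )
    return problems
```

An empty list means every requirement is met, matching the return convention described above.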
Set up matplotlib’s RC params for LaTeX plotting. Call this before plotting a figure.
Parameters: | fig_width : float, optional, inches
fig_height : float, optional, inches
columns : {1, 2} |
---|
Plots a heatmap of a ‘square’ DataFrame, i.e. one whose rows and columns share the same labels and whose values are the result of a computation between each row and column. This plot can be used for pairwise_correlation, pairwise_mutual_information, or any method which works similarly.
Plot function for series which is about 5 times faster than pd.Series.plot().
Parameters: | series : pd.Series
ax : matplotlib Axes, optional
fig : matplotlib Figure
date_format : str, optional, default=‘%d/%m/%y %H:%M:%S’
tz_localize : boolean, optional, default is True
Can also use all **kwargs expected by `ax.plot` |
---|
Bases: object
Stats results from each node need to be assigned to a specific class so we know how to combine results from multiple chunks. For example, Energy can be simply summed; while dropout rate should be averaged, and gaps need to be merged across chunk boundaries. Results objects contain a DataFrame, the index of which is the start timestamp for which the results are valid; the first column (‘end’) is the end timestamp for which the results are valid. Other columns are accumulators for the results.
Attributes
_data | (DataFrame) Index is period start. Columns are: end and any columns for internal storage of stats. |
Append a single result.
Parameters: | timeframe : nilmtk.TimeFrame
new_results : dict |
---|
Return all results from each chunk combined. Returns either a single float for all periods or, where necessary, a dict. For example, when calculating Energy for a meter which records both apparent power and active power, get active power with energyresults.combined[‘active’].
Returns: | pd.DataFrame |
---|
Notes
Objects are converted using DataFrame.convert_objects(). The reason for doing this is to strip out the timezone information from data columns. We have to do this otherwise Pandas complains if we try to put a column with multiple timezones (e.g. Europe/London across a daylight saving boundary).
Parameters: | cached_stat : DataFrame of cached data
sections : list of nilmtk.TimeFrame objects
|
---|
Take results from another table of data (another physical meter) and merge those results into self. For example, if we have a dual-split mains supply then we want to merge the results from each physical meter. The two sets of results must be for exactly the same timeframes.
Parameters: | other : Results subclass (same class as self).
|
---|
Bases: object
A TimeFrame is a single time span or period, e.g. from “2013” to “2014”.
Attributes
_start | (pd.Timestamp or None) if None and empty is False then behave as if start is infinitely far into the past |
_end | (pd.Timestamp or None) if None and empty is False then behave as if end is infinitely far into the future |
enabled | (boolean) If False then behave as if both _end and _start are None |
_empty | (boolean) If True then represents an empty time frame |
include_end | (boolean) |
Returns True if self.start == other.end or vice versa.
Parameters: | gap : float or int
|
---|
Notes
Does not yet handle case where self or other is open-ended.
Returns a new TimeFrame of the intersection between this TimeFrame and other TimeFrame. If the intersect is empty then the returned TimeFrame will have empty == True.
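The intersection of two time spans, including open-ended ones, can be sketched with plain tuples. Here a span is a hypothetical `(start, end)` pair where `None` means unbounded, and an empty result is returned as `None` rather than as a TimeFrame with empty == True. Treating spans that merely touch as empty is an assumption.

```python
from datetime import datetime


def intersection(a, b):
    """Intersect two (start, end) spans; None bounds are open-ended."""
    starts = [s for s in (a[0], b[0]) if s is not None]
    ends = [e for e in (a[1], b[1]) if e is not None]
    # The overlap starts at the later start and ends at the earlier end.
    start = max(starts) if starts else None
    end = min(ends) if ends else None
    if start is not None and end is not None and start >= end:
        return None  # no overlap
    return (start, end)
```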
Slices frame using self.start and self.end.
Parameters: | frame : pd.DataFrame or pd.Series to slice |
---|---|
Returns: | frame : sliced frame |
Parameters: | timeframes : list of TimeFrame objects |
---|---|
Returns: | list of dicts |
Bases: list
A collection of nilmtk.TimeFrame objects.
Returns a new TimeFrameGroup of self masked by other.
Illustrated example:
self.good_sections(): |######----#####-----######|
Parameters: | t : str or pd.Timestamp or datetime or None |
---|---|
Returns: | pd.Timestamp or None |
Find closest value in known_array for each element in test_array.
Parameters: | known_array : numpy array
test_array : numpy array
|
---|---|
Returns: | indices : numpy array; shape: (n, 1)
residuals : numpy array; shape: (n, 1)
|
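A nearest-value lookup can be sketched with binary search over a sorted list. This pure-Python version is an illustration and not the NumPy implementation: it assumes `known` is sorted ascending, returns flat lists instead of arrays of shape (n, 1), and defines the residual as the test value minus the matched known value, which is an assumed sign convention.

```python
from bisect import bisect_left


def find_nearest(known, test):
    """For each value in test, find the index of the closest value in sorted known."""
    indices, residuals = [], []
    for value in test:
        pos = bisect_left(known, value)
        # Only the neighbours on either side of the insertion point can be closest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(known)]
        best = min(candidates, key=lambda i: abs(known[i] - value))
        indices.append(best)
        residuals.append(value - known[best])
    return indices, residuals
```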
Parameters: | filename : string
format : ‘CSV’ or ‘HDF’
mode : ‘a’ (append) or ‘w’ (write), optional |
---|---|
Returns: | metadata : dict |
Parameters: | data : pandas.DataFrame or Series or DatetimeIndex |
---|---|
Returns: | index : the index for the DataFrame or Series |
Returns the nearest Timestamp to timestamp which would be in the set of timestamps returned by pd.DataFrame.resample(freq=freq)
Convert timedelta to seconds.
Parameters: | timedelta : np.timedelta64 |
---|---|
Returns: | float : seconds |
Parameters: | timestamp : pd.Timestamp or datetime.datetime |
---|---|
Returns: | True if timestamp is naive (i.e. if it does not have a timezone associated with it). See: https://docs.python.org/2/library/datetime.html#available-types |
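The naive/aware distinction follows the definition in the Python datetime documentation linked above: an object is naive when it has no tzinfo, or when its tzinfo returns no UTC offset. A sketch using only the standard library, with a hypothetical function name, looks like this:

```python
from datetime import datetime, timezone


def timestamp_is_naive(timestamp):
    """True if timestamp carries no usable timezone information."""
    if timestamp.tzinfo is None:
        return True
    # A tzinfo that cannot supply a UTC offset also counts as naive.
    return timestamp.tzinfo.utcoffset(timestamp) is None
```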
Nosetests package teardown function (run when tests are done). See http://nose.readthedocs.org/en/latest/writing_tests.html#test-packages
Uses git to reset data_dir after tests have run.