nilmtk.stats package

Submodules

nilmtk.stats.dropoutrate module

class nilmtk.stats.dropoutrate.DropoutRate(upstream=None, generator=None)[source]

Bases: nilmtk.node.Node

postconditions = {'statistics': {'dropout_rate': None}}
process()[source]
requirements = {'device': {'sample_period': 'ANY VALUE'}}
results_class

alias of DropoutRateResults

nilmtk.stats.dropoutrate.get_dropout_rate(data, sample_period)[source]
Parameters:

data : pd.DataFrame or pd.Series

sample_period : number, seconds

Returns:

dropout_rate : float [0,1]

The proportion of samples that have been lost: 1 means that all samples have been lost and 0 means that no samples have been lost. NaN means there were too few samples to estimate the rate.
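The computation described above can be sketched as follows. This is a minimal illustration of the dropout-rate idea (expected sample count derived from the index span and sample period), not nilmtk's implementation; the function name is hypothetical.

```python
import numpy as np
import pandas as pd

def dropout_rate_sketch(index: pd.DatetimeIndex, sample_period: float) -> float:
    """Proportion of expected samples that are missing.

    0 means no samples were lost, 1 means all were lost;
    returns NaN when there are too few samples to estimate.
    """
    if len(index) < 2:
        return np.nan  # too few samples
    duration = (index[-1] - index[0]).total_seconds()
    n_expected = (duration / sample_period) + 1
    rate = 1.0 - (len(index) / n_expected)
    # Clamp to [0, 1] to guard against jittered timestamps.
    return float(np.clip(rate, 0.0, 1.0))
```

For example, an index sampled every 6 seconds with 2 of 10 samples removed yields a dropout rate of 0.2.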

nilmtk.stats.dropoutrateresults module

class nilmtk.stats.dropoutrateresults.DropoutRateResults[source]

Bases: nilmtk.results.Results

Attributes

_data (pd.DataFrame)
index is the start date for the whole chunk;
end is the end date for the whole chunk;
dropout_rate is a float in [0, 1];
n_samples is an int, used for calculating the weighted mean.
combined()[source]

Calculates the weighted average of the per-chunk dropout rates, weighted by n_samples.

Returns:

dropout_rate : float, [0, 1]
name = 'dropout_rate'
plot(ax=None)[source]
to_dict()[source]
unify(other)[source]
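The weighted average performed by combined() can be sketched as below. The function name is hypothetical; it simply shows how per-chunk rates and sample counts combine into one rate.

```python
def weighted_mean_sketch(rates, n_samples):
    """Combine per-chunk dropout rates into a single rate,
    weighting each chunk by its number of samples."""
    total = sum(n_samples)
    return sum(r * n for r, n in zip(rates, n_samples)) / total
```

A chunk with more samples therefore pulls the combined rate further toward its own value than a small chunk does.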

nilmtk.stats.goodsections module

class nilmtk.stats.goodsections.GoodSections(upstream=None, generator=None)[source]

Bases: nilmtk.node.Node

Locate sections of data where the sample period is <= max_sample_period.

Attributes

previous_chunk_ended_with_open_ended_good_section (bool)
postconditions = {'statistics': {'good_sections': []}}
process()[source]
requirements = {'device': {'max_sample_period': 'ANY VALUE'}}
reset()[source]
results_class

alias of GoodSectionsResults

nilmtk.stats.goodsections.get_good_sections(df, max_sample_period, look_ahead=None, previous_chunk_ended_with_open_ended_good_section=False)[source]
Parameters:

df : pd.DataFrame

look_ahead : pd.DataFrame

max_sample_period : number

Returns:

sections : list of TimeFrame objects

Each good section in df is marked with a TimeFrame. If this df ends with an open-ended good section (assessed by examining look_ahead) then the last TimeFrame will have end=None. If this df starts with an open-ended good section then the first TimeFrame will have start=None.
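The core of this scan can be sketched as follows: walk the timestamps and start a new section whenever the gap to the previous sample exceeds max_sample_period. This is a simplified illustration with a hypothetical function name; it returns plain (start, end) tuples and omits nilmtk's look_ahead and open-ended-section handling.

```python
import pandas as pd

def good_sections_sketch(index: pd.DatetimeIndex, max_sample_period: float):
    """Split a DatetimeIndex into contiguous sections where successive
    samples are no more than max_sample_period seconds apart."""
    if len(index) == 0:
        return []
    max_gap = pd.Timedelta(seconds=max_sample_period)
    sections = []
    start = prev = index[0]
    for ts in index[1:]:
        if ts - prev > max_gap:
            # Gap too large: close the current section, open a new one.
            sections.append((start, prev))
            start = ts
        prev = ts
    sections.append((start, prev))
    return sections
```

With a 10-minute hole in 6-second data, this yields two sections, one on each side of the hole.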

nilmtk.stats.goodsectionsresults module

class nilmtk.stats.goodsectionsresults.GoodSectionsResults(max_sample_period)[source]

Bases: nilmtk.results.Results

Attributes

max_sample_period_td (timedelta)
_data (pd.DataFrame)
index is the start date for the whole chunk;
end is the end date for the whole chunk;
sections is a TimeFrameGroup object (a list of nilmtk.TimeFrame objects).
append(timeframe, new_results)[source]

Append a single result.

Parameters:

timeframe : nilmtk.TimeFrame

new_results : {‘sections’: list of TimeFrame objects}

combined()[source]

Merges together any good sections which span multiple segments, as long as those segments are adjacent (previous.end - max_sample_period <= next.start <= previous.end).

Returns:

sections : TimeFrameGroup (a subclass of Python’s list class)
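The merging rule stated above can be sketched with plain numeric (start, end) pairs. This is an illustration of the adjacency test only (the function name is hypothetical), not nilmtk's TimeFrameGroup logic.

```python
def merge_adjacent_sketch(sections, max_sample_period):
    """Merge (start, end) pairs that span a segment boundary, using the
    rule: previous.end - max_sample_period <= next.start <= previous.end."""
    merged = []
    for start, end in sections:
        if merged and merged[-1][1] - max_sample_period <= start <= merged[-1][1]:
            # Adjacent: extend the previous section instead of appending.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```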
export_to_cache()[source]
Returns:

DataFrame with three columns: ‘end’, ‘section_end’, ‘section_start’.

Instead of storing a list of TimeFrames on each row, we store one TimeFrame per row. This is because pd.HDFStore cannot save a DataFrame where one column is a list when using ‘table’ format. We also need to strip the timezone information from the data columns. When we import from cache, we assume the timezone for the data columns is the same as the timezone for the index.

import_from_cache(cached_stat, sections)[source]
name = 'good_sections'
plot(**kwargs)[source]
to_dict()[source]
unify(other)[source]

nilmtk.stats.histogram module

nilmtk.stats.histogram.histogram_from_generator(generator, bins=None, range=None, **kwargs)[source]

Apart from ‘generator’, takes the same keyword arguments as numpy.histogram, and returns the same objects as np.histogram.

Parameters:

range : None or (min, max)

range differs from np.histogram’s interpretation of ‘range’ in that either element can be None, in which case the min or max of the first chunk is used.

bins : None or int

if None then int(range[1] - range[0]) bins are used
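The chunk-wise accumulation described above can be sketched as follows: fix the bin edges from the first chunk (or the supplied range), then sum per-chunk histograms. This is a simplified stand-in with a hypothetical name, not nilmtk's implementation.

```python
import numpy as np

def histogram_from_chunks_sketch(chunks, bins=None, range=None):
    """Accumulate one histogram over an iterable of 1-D arrays.

    If either end of `range` is None, it is taken from the first chunk;
    if `bins` is None, int(range[1] - range[0]) bins are used.
    """
    chunks = iter(chunks)
    first = np.asarray(next(chunks))
    lo, hi = range if range is not None else (None, None)
    lo = first.min() if lo is None else lo
    hi = first.max() if hi is None else hi
    if bins is None:
        bins = int(hi - lo)
    # Fixed edges make per-chunk histograms directly summable.
    hist, edges = np.histogram(first, bins=bins, range=(lo, hi))
    for chunk in chunks:
        h, _ = np.histogram(chunk, bins=bins, range=(lo, hi))
        hist += h
    return hist, edges
```

Because every chunk is binned against the same edges, the result equals the histogram of the concatenated data.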

nilmtk.stats.totalenergy module

class nilmtk.stats.totalenergy.TotalEnergy(upstream=None, generator=None)[source]

Bases: nilmtk.node.Node

postconditions = {'statistics': {'energy': {}}}
process()[source]

Preference: Cumulative energy > Energy > Power

required_measurements(state)[source]

TotalEnergy needs all power and energy measurements.

requirements = {'device': {'max_sample_period': 'ANY VALUE'}, 'preprocessing_applied': {'clip': 'ANY VALUE'}}
results_class

alias of TotalEnergyResults

nilmtk.stats.totalenergy.get_total_energy(df, max_sample_period)[source]

Calculate total energy for energy / power data in a dataframe.

Parameters:

df : pd.DataFrame

max_sample_period : float or int

Returns:

energy : dict

With a key for each AC type (reactive, apparent, active) present in df. Values are energy in kWh (or kVARh for reactive and kVAh for apparent power).
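The power-to-energy part of this calculation can be sketched as below: integrate power over the gap to the next sample, clipping each gap at max_sample_period so long outages do not inflate the total. This is an illustrative sketch with a hypothetical function name, covering only the power case (not cumulative-energy or energy columns).

```python
import numpy as np
import pandas as pd

JOULES_PER_KWH = 3600 * 1000

def energy_from_power_sketch(power: pd.Series, max_sample_period: float) -> float:
    """Integrate a power series (watts, DatetimeIndex) into energy (kWh)."""
    # Seconds between successive samples, clipped so gaps longer than
    # max_sample_period contribute at most max_sample_period seconds.
    dt = np.diff(power.index.values) / np.timedelta64(1, "s")
    dt = np.clip(dt, 0, max_sample_period)
    joules = (power.values[:-1] * dt).sum()
    return joules / JOULES_PER_KWH
```

For example, a constant 1000 W sampled every 60 s for one hour integrates to 1 kWh.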

nilmtk.stats.totalenergyresults module

class nilmtk.stats.totalenergyresults.TotalEnergyResults[source]

Bases: nilmtk.results.Results

Attributes

_data (pd.DataFrame)
index is the start date;
end is the end date;
active is (optional) energy in kWh;
reactive is (optional) energy in kVARh;
apparent is (optional) energy in kVAh.
append(timeframe, new_results)[source]

Append a single result. e.g. append(TimeFrame(start, end), {‘apparent’: 34, ‘active’: 43})

export_to_cache()[source]
name = 'total_energy'
simple()[source]
to_dict()[source]
unify(other)[source]

Module contents