nilmtk.stats package

Submodules

nilmtk.stats.dropoutrate module

class nilmtk.stats.dropoutrate.DropoutRate(upstream=None, generator=None)[source]

Bases: nilmtk.node.Node

postconditions = {'statistics': {'dropout_rate': None}}
process()[source]
requirements = {'device': {'sample_period': 'ANY VALUE'}}
results_class

alias of DropoutRateResults

nilmtk.stats.dropoutrate.get_dropout_rate(data, sample_period)[source]
Parameters:

data : pd.DataFrame or pd.Series

sample_period : number, seconds

Returns:

dropout_rate : float [0,1]

The proportion of samples that have been lost: 1 means that all samples have been lost and 0 means that no samples have been lost. NaN means there were too few samples to estimate the rate.
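The computation described above can be sketched as follows. This is a minimal illustration of the dropout-rate idea (expected sample count derived from the index span and sample period), not nilmtk's implementation; the function name is hypothetical.

```python
import numpy as np
import pandas as pd

def dropout_rate_sketch(index: pd.DatetimeIndex, sample_period: float) -> float:
    """Proportion of expected samples that are missing.

    0 means no samples were lost, 1 means all were lost;
    returns NaN when there are too few samples to estimate.
    """
    if len(index) < 2:
        return np.nan  # too few samples
    duration = (index[-1] - index[0]).total_seconds()
    n_expected = (duration / sample_period) + 1
    rate = 1.0 - (len(index) / n_expected)
    # Clamp to [0, 1] to guard against jittered timestamps.
    return float(np.clip(rate, 0.0, 1.0))
```

For example, an index sampled every 6 seconds with 2 of 10 samples removed yields a dropout rate of 0.2.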

nilmtk.stats.dropoutrateresults module

class nilmtk.stats.dropoutrateresults.DropoutRateResults[source]

Bases: nilmtk.results.Results

Attributes

_data (pd.DataFrame)
index is the start date for the whole chunk;
end is the end date for the whole chunk;
dropout_rate is a float in [0, 1];
n_samples is an int, used for calculating the weighted mean.
combined()[source]

Calculates the weighted average of the per-chunk dropout rates, weighted by n_samples.

Returns:

dropout_rate : float, [0, 1]
name = 'dropout_rate'
plot(ax=None)[source]
to_dict()[source]
unify(other)[source]
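The weighted average performed by combined() can be sketched as below. The function name is hypothetical; it simply shows how per-chunk rates and sample counts combine into one rate.

```python
def weighted_mean_sketch(rates, n_samples):
    """Combine per-chunk dropout rates into a single rate,
    weighting each chunk by its number of samples."""
    total = sum(n_samples)
    return sum(r * n for r, n in zip(rates, n_samples)) / total
```

A chunk with more samples therefore pulls the combined rate further toward its own value than a small chunk does.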

nilmtk.stats.goodsections module

class nilmtk.stats.goodsections.GoodSections(upstream=None, generator=None)[source]

Bases: nilmtk.node.Node

Locate sections of data where the sample period is <= max_sample_period.

Attributes

previous_chunk_ended_with_open_ended_good_section (bool)
postconditions = {'statistics': {'good_sections': []}}
process()[source]
requirements = {'device': {'max_sample_period': 'ANY VALUE'}}
reset()[source]
results_class

alias of GoodSectionsResults

nilmtk.stats.goodsections.get_good_sections(df, max_sample_period, look_ahead=None, previous_chunk_ended_with_open_ended_good_section=False)[source]
Parameters:

df : pd.DataFrame

look_ahead : pd.DataFrame

max_sample_period : number

Returns:

sections : list of TimeFrame objects

Each good section in df is marked with a TimeFrame. If this df ends with an open-ended good section (assessed by examining look_ahead) then the last TimeFrame will have end=None. If this df starts with an open-ended good section then the first TimeFrame will have start=None.
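The core of this scan can be sketched as follows: walk the timestamps and start a new section whenever the gap to the previous sample exceeds max_sample_period. This is a simplified illustration with a hypothetical function name; it returns plain (start, end) tuples and omits nilmtk's look_ahead and open-ended-section handling.

```python
import pandas as pd

def good_sections_sketch(index: pd.DatetimeIndex, max_sample_period: float):
    """Split a DatetimeIndex into contiguous sections where successive
    samples are no more than max_sample_period seconds apart."""
    if len(index) == 0:
        return []
    max_gap = pd.Timedelta(seconds=max_sample_period)
    sections = []
    start = prev = index[0]
    for ts in index[1:]:
        if ts - prev > max_gap:
            # Gap too large: close the current section, open a new one.
            sections.append((start, prev))
            start = ts
        prev = ts
    sections.append((start, prev))
    return sections
```

With a 10-minute hole in 6-second data, this yields two sections, one on each side of the hole.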

nilmtk.stats.goodsectionsresults module

class nilmtk.stats.goodsectionsresults.GoodSectionsResults(max_sample_period)[source]

Bases: nilmtk.results.Results

Attributes

max_sample_period_td (timedelta)
_data (pd.DataFrame)
index is the start date for the whole chunk;
end is the end date for the whole chunk;
sections is a TimeFrameGroup object (a list of nilmtk.TimeFrame objects).
append(timeframe, new_results)[source]

Append a single result.

Parameters:

timeframe : nilmtk.TimeFrame

new_results : {‘sections’: list of TimeFrame objects}

combined()[source]

Merges together any good sections which span multiple segments, as long as those segments are adjacent (previous.end - max_sample_period <= next.start <= previous.end).

Returns:

sections : TimeFrameGroup (a subclass of Python’s list class)
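The merging rule stated above can be sketched with plain numeric (start, end) pairs. This is an illustration of the adjacency test only (the function name is hypothetical), not nilmtk's TimeFrameGroup logic.

```python
def merge_adjacent_sketch(sections, max_sample_period):
    """Merge (start, end) pairs that span a segment boundary, using the
    rule: previous.end - max_sample_period <= next.start <= previous.end."""
    merged = []
    for start, end in sections:
        if merged and merged[-1][1] - max_sample_period <= start <= merged[-1][1]:
            # Adjacent: extend the previous section instead of appending.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```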
export_to_cache()[source]
Returns:

DataFrame with three columns: ‘end’, ‘section_end’, ‘section_start’.

Instead of storing a list of TimeFrames on each row, we store one TimeFrame per row. This is because pd.HDFStore cannot save a DataFrame where one column is a list when using ‘table’ format. We also need to strip the timezone information from the data columns. When we import from cache, we assume the timezone for the data columns is the same as the timezone for the index.

import_from_cache(cached_stat, sections)[source]
name = 'good_sections'
plot(**kwargs)[source]
to_dict()[source]
unify(other)[source]

nilmtk.stats.histogram module

nilmtk.stats.histogram.histogram_from_generator(generator, bins=None, range=None, **kwargs)[source]

Apart from ‘generator’, takes the same keyword arguments as numpy.histogram, and returns the same objects as np.histogram.

Parameters:

range : None or (min, max)

range differs from np.histogram’s interpretation of ‘range’ in that either element can be None, in which case the min or max of the first chunk is used.

bins : None or int

if None then int(range[1] - range[0]) bins are used
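The chunk-wise accumulation described above can be sketched as follows: fix the bin edges from the first chunk (or the supplied range), then sum per-chunk histograms. This is a simplified stand-in with a hypothetical name, not nilmtk's implementation.

```python
import numpy as np

def histogram_from_chunks_sketch(chunks, bins=None, range=None):
    """Accumulate one histogram over an iterable of 1-D arrays.

    If either end of `range` is None, it is taken from the first chunk;
    if `bins` is None, int(range[1] - range[0]) bins are used.
    """
    chunks = iter(chunks)
    first = np.asarray(next(chunks))
    lo, hi = range if range is not None else (None, None)
    lo = first.min() if lo is None else lo
    hi = first.max() if hi is None else hi
    if bins is None:
        bins = int(hi - lo)
    # Fixed edges make per-chunk histograms directly summable.
    hist, edges = np.histogram(first, bins=bins, range=(lo, hi))
    for chunk in chunks:
        h, _ = np.histogram(chunk, bins=bins, range=(lo, hi))
        hist += h
    return hist, edges
```

Because every chunk is binned against the same edges, the result equals the histogram of the concatenated data.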

nilmtk.stats.totalenergy module

class nilmtk.stats.totalenergy.TotalEnergy(upstream=None, generator=None)[source]

Bases: nilmtk.node.Node

postconditions = {'statistics': {'energy': {}}}
process()[source]

Preference: Cumulative energy > Energy > Power

required_measurements(state)[source]

TotalEnergy needs all power and energy measurements.

requirements = {'device': {'max_sample_period': 'ANY VALUE'}, 'preprocessing_applied': {'clip': 'ANY VALUE'}}
results_class

alias of TotalEnergyResults

nilmtk.stats.totalenergy.get_total_energy(df, max_sample_period)[source]

Calculate total energy for energy / power data in a dataframe.

Parameters:

df : pd.DataFrame

max_sample_period : float or int

Returns:

energy : dict

With a key for each AC type (reactive, apparent, active) present in df. Values are energy in kWh (or kVARh for reactive and kVAh for apparent power).
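The power-to-energy part of this calculation can be sketched as below: integrate power over the gap to the next sample, clipping each gap at max_sample_period so long outages do not inflate the total. This is an illustrative sketch with a hypothetical function name, covering only the power case (not cumulative-energy or energy columns).

```python
import numpy as np
import pandas as pd

JOULES_PER_KWH = 3600 * 1000

def energy_from_power_sketch(power: pd.Series, max_sample_period: float) -> float:
    """Integrate a power series (watts, DatetimeIndex) into energy (kWh)."""
    # Seconds between successive samples, clipped so gaps longer than
    # max_sample_period contribute at most max_sample_period seconds.
    dt = np.diff(power.index.values) / np.timedelta64(1, "s")
    dt = np.clip(dt, 0, max_sample_period)
    joules = (power.values[:-1] * dt).sum()
    return joules / JOULES_PER_KWH
```

For example, a constant 1000 W sampled every 60 s for one hour integrates to 1 kWh.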

nilmtk.stats.totalenergyresults module

class nilmtk.stats.totalenergyresults.TotalEnergyResults[source]

Bases: nilmtk.results.Results

Attributes

_data (pd.DataFrame)
index is the start date;
end is the end date;
active is (optional) energy in kWh;
reactive is (optional) energy in kVARh;
apparent is (optional) energy in kVAh.
append(timeframe, new_results)[source]

Append a single result. e.g. append(TimeFrame(start, end), {‘apparent’: 34, ‘active’: 43})

export_to_cache()[source]
name = 'total_energy'
simple()[source]
to_dict()[source]
unify(other)[source]

Module contents