Dataset Preprocessing

As a demonstration, let us first load the iAWE dataset (which has already been converted to HDF5 format):

Finding the range of voltage for air conditioner 1

We observe a minimum voltage of 0 and a maximum of 5140. These values are clearly the result of faults in data collection, so the affected readings should be removed.

Removing readings in the dataset where voltage > 260 or voltage < 160
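As a minimal sketch of this filter in plain pandas (not the toolkit's own preprocessing call), the snippet below builds a small synthetic voltage series, prints its range, and keeps only readings between 160 and 260. The series, its timestamps, and the names `voltage` and `clean_voltage` are assumptions made for illustration.

```python
import pandas as pd

# Synthetic stand-in for the air conditioner's voltage readings.
index = pd.date_range("2013-07-13 00:00", periods=6, freq="1min")
voltage = pd.Series(
    [0.0, 231.5, 5140.0, 228.9, 159.9, 260.0], index=index, name="voltage"
)

# Inspect the range before filtering.
print(voltage.min(), voltage.max())

# Keep readings inside the band [160, 260]; both bounds are inclusive,
# so only values > 260 or < 160 are dropped.
clean_voltage = voltage[voltage.between(160, 260)]

# Inspect the range after filtering.
print(clean_voltage.min(), clean_voltage.max())
```

Readings of exactly 160 or 260 are kept, which matches the rule of removing only values strictly outside the band.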

Now we observe the voltage variation in the same air conditioner as before.

Filtering data from 13 July to 4 August
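A date filter of this kind can be sketched with label-based slicing on a pandas `DatetimeIndex`. The year (2013), the daily synthetic series, and the names `power` and `window` are assumptions for illustration, not values taken from the dataset.

```python
import pandas as pd

# Synthetic daily series that extends beyond the window of interest.
index = pd.date_range("2013-07-10", "2013-08-08", freq="D")
power = pd.Series(range(len(index)), index=index, name="power", dtype="float64")

# Slicing a sorted DatetimeIndex by label includes both endpoints,
# and a date-only end label covers that entire day.
window = power.loc["2013-07-13":"2013-08-04"]

print(window.index[0], window.index[-1], len(window))
```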

Downsampling the dataset to 1 minute
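Downsampling to one-minute resolution can be sketched with pandas `resample`. Taking the mean of each one-minute bin is an assumption here; the source does not say which aggregation is used. The 20-second synthetic data is likewise illustrative.

```python
import pandas as pd

# Synthetic readings every 20 seconds over two minutes.
index = pd.date_range(
    "2013-07-13 00:00:00", periods=6, freq=pd.Timedelta(seconds=20)
)
power = pd.Series([100.0, 110.0, 120.0, 200.0, 210.0, 220.0], index=index)

# Group samples into 1-minute bins and average each bin (assumed aggregation).
power_1min = power.resample("1min").mean()

print(power_1min)
```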

Fill large gaps in appliances with zeros and forward-fill small gaps
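One way to sketch this gap-filling rule in pandas is to measure the length of every run of missing samples, forward-fill only the short runs, and set the remaining (long) runs to zero. The threshold `max_small_gap` and the synthetic series are hypothetical; the source does not state where the boundary between small and large gaps lies.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute appliance series with one short and one long gap.
index = pd.date_range("2013-07-13 00:00", periods=10, freq="1min")
appliance = pd.Series(
    [50.0, np.nan, np.nan, 60.0, np.nan, np.nan, np.nan, np.nan, np.nan, 70.0],
    index=index,
)

max_small_gap = 2  # hypothetical threshold, in samples

is_missing = appliance.isna()
# Give each run of consecutive equal flags its own id, then count the
# missing samples in that run to get the gap length.
run_id = (is_missing != is_missing.shift()).cumsum()
gap_length = is_missing.groupby(run_id).transform("sum")
small_gap = is_missing & (gap_length <= max_small_gap)

# Forward-fill only where the gap is small; everything still missing
# afterwards belongs to a large gap and becomes zero.
filled = appliance.where(~small_gap, appliance.ffill()).fillna(0.0)

print(filled.tolist())
```

Measuring gap lengths first avoids a side effect of `ffill(limit=...)`, which would also fill the first few samples of a long gap.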

Prepend and append zeros
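One plausible reading of this step is that each appliance series is extended to cover the whole analysis window, with zeros added before its first reading and after its last. The sketch below follows that interpretation using `reindex`; the window boundaries and the names `full_index` and `padded` are assumptions.

```python
import pandas as pd

# Full analysis window at 1-minute resolution (illustrative boundaries).
full_index = pd.date_range("2013-07-13 00:00", "2013-07-13 00:05", freq="1min")

# Appliance data that only covers the middle of the window.
appliance = pd.Series(
    [40.0, 45.0],
    index=pd.date_range("2013-07-13 00:02", periods=2, freq="1min"),
)

# Extend to the full window, then zero the samples that fall before the
# first or after the last original reading.
padded = appliance.reindex(full_index)
before = padded.index < appliance.index[0]
after = padded.index > appliance.index[-1]
padded[before | after] = 0.0

print(padded.tolist())
```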

Drop missing samples from mains
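Dropping missing mains samples can be sketched with `DataFrame.dropna`. The column names and values below are synthetic, and dropping a row when any column is missing is an assumption about the intended behaviour.

```python
import numpy as np
import pandas as pd

# Synthetic mains measurements with two incomplete rows.
index = pd.date_range("2013-07-13 00:00", periods=4, freq="1min")
mains = pd.DataFrame(
    {
        "power": [300.0, np.nan, 320.0, 310.0],
        "voltage": [230.0, 231.0, np.nan, 229.0],
    },
    index=index,
)

# Remove every timestamp where at least one mains column is missing.
mains_clean = mains.dropna()

print(mains_clean)
```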

Find the intersection of the mains and appliance datetime indices
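Aligning mains and appliance data on their shared timestamps can be sketched with `Index.intersection`. The two synthetic series and the names `common`, `mains_aligned`, and `appliance_aligned` are illustrative assumptions.

```python
import pandas as pd

# Synthetic mains and appliance series with partially overlapping timestamps.
mains = pd.Series(
    [300.0, 310.0, 320.0, 330.0, 340.0],
    index=pd.date_range("2013-07-13 00:00", periods=5, freq="1min"),
)
appliance = pd.Series(
    [40.0, 45.0, 50.0, 55.0, 60.0],
    index=pd.date_range("2013-07-13 00:02", periods=5, freq="1min"),
)

# Timestamps present in both series.
common = mains.index.intersection(appliance.index)

# Restrict both series to the shared timestamps so they line up sample by sample.
mains_aligned = mains.loc[common]
appliance_aligned = appliance.loc[common]

print(len(common))
```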