Monday, August 3, 2020

SynD - A Synthetic Energy Dataset

As with related machine learning problems, applications such as Non-Intrusive Load Monitoring (NILM) require a sufficient amount of data to train and validate new approaches. With SynD, we present a synthetic energy dataset that emulates the power consumption of residential buildings. The dataset is freely available and contains 180 days of synthetic power data, both at the aggregate level and for individual appliances.

The SynD dataset is based upon measurements of real devices
SynD is the result of a custom simulation process that relies on power traces of real household appliances. During a measurement campaign in two Austrian households, we monitored 21 electrical household appliances. The main goal of the campaign was to record representative power consumption patterns of those 21 appliances, where each pattern is the shape of the power consumption over time for a single operation.
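
To illustrate the general idea of building a synthetic aggregate from recorded single-operation patterns, here is a minimal sketch. It is not the SynD simulation model: the pattern values, the function names `simulate_appliance` and `aggregate`, and the random placement rule are all assumptions made for this example.

```python
import random

def simulate_appliance(pattern, n_samples, n_runs, rng):
    """Place up to n_runs copies of a pattern at random, non-overlapping
    positions in an otherwise idle (0 W) trace. Hypothetical helper."""
    trace = [0.0] * n_samples
    placed = 0
    attempts = 0
    while placed < n_runs and attempts < 1000:
        attempts += 1
        start = rng.randrange(0, n_samples - len(pattern) + 1)
        end = start + len(pattern)
        # Skip this start time if the appliance is already running there.
        if any(value != 0.0 for value in trace[start:end]):
            continue
        trace[start:end] = pattern
        placed += 1
    return trace

def aggregate(traces):
    """Sum the appliance traces sample by sample to get the household total."""
    return [sum(samples) for samples in zip(*traces)]

# Made-up patterns in watts, one value per time step (not taken from SynD).
patterns = {
    "kettle": [1800.0, 1800.0, 1800.0],
    "fridge": [90.0, 85.0, 80.0, 80.0],
}

rng = random.Random(0)  # fixed seed so the example is reproducible
traces = {
    name: simulate_appliance(pattern, n_samples=60, n_runs=2, rng=rng)
    for name, pattern in patterns.items()
}
total = aggregate(traces.values())
```

Because the aggregate is the sum of the per-appliance traces, the ground truth for each appliance is known exactly, which is what makes this kind of data useful for training and evaluating NILM approaches.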

Technical validation of SynD by comparison with other datasets
In contrast to datasets based entirely on measurement campaigns, such as our dataset GREEND, the SynD dataset is constructed from a simulation model that uses the measured devices. This way, a synthetic but realistic power consumption dataset can be obtained. In a technical validation, we compared SynD with a number of measured datasets and found that SynD lies well within the variation observed among the measured datasets themselves.

Wilfried Elmenreich states: “Usually I rely on measured data, but with the SYND dataset, we are among the first who created a convincing synthetic dataset.”

The full paper describing the dataset is available under an open access policy here:
Christoph Klemenjak, Christoph Kovatsch, Manuel Herold, and Wilfried Elmenreich. A synthetic energy dataset for non-intrusive load monitoring in households. Scientific Data, 7(1):1–17, 2020. (doi:10.6084/m9.figshare.11940324)
The SynD dataset can be obtained freely from the SynD GitHub repository.