This repository contains the 1st place solution for the NeurIPS 2023 CityLearn Challenge - Forecasting Track.
The CityLearn Challenge 2023 - Forecasting Track focuses on designing models to predict the 48-hour-ahead end-use load profiles for each building in a synthetic single-family neighborhood, as well as the neighborhood-level 48-hour-ahead solar generation and carbon intensity profiles. The training data consists of hourly data from 3 buildings spanning a 1-month period. The models are evaluated on a different set of buildings and a different 3-month period. During evaluation, the 48-hour-ahead forecasts are rerun and evaluated every hour. The evaluation metric is RMSE.
The main challenges of this competition were:
- Working in the small data regime --- Participants only have 1 month of training data. Equivalently, this means 720 observations total, 30 observations for each hour of the day, and 4 observations for each hour of the week. Thus, if a model uses the hour of the week as an input, then special care needs to be taken to avoid overfitting.
- Handling the cold start problem --- The models were evaluated on new buildings with no opportunity to warm-start the model. Thus, the models needed to make reasonable predictions without any prior data.
Simple models are well suited to addressing both of these challenges. In particular, the seasonal average, together with a few refinements, worked very well on this dataset.
The batch formula for calculating the average of $n$ observations, $\bar{x}_n = \frac{1}{n} \sum_{i=1}^{n} x_i$, can be rewritten as a recursive update:
$$
\begin{align*}
\bar{x}_n
= \underbrace{ \frac{n-1}{n} \cdot \bar{x}_{n-1} + \frac{1}{n} \cdot x_n }_{\text{convex combination}}
= \bar{x}_{n-1} + \frac{1}{n} \cdot \underbrace{ (x_n - \bar{x}_{n-1}) }_{\text{update}}
\end{align*}
$$
Thus, to calculate the average of the first $n$ observations, we only need to store the previous average $\bar{x}_{n-1}$ and the count $n$; the full history of observations is not required.
The seasonal average simply keeps track of 24 averages, one for each hour of the day (or 168 averages, one for each hour of the week).
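As a concrete illustration, here is a minimal Python sketch of a seasonal running average built on the incremental update above (the class and variable names are illustrative, not the actual competition code):

```python
import numpy as np

class SeasonalAverage:
    """Running mean per season: 24 hour-of-day bins or 168 hour-of-week bins."""

    def __init__(self, n_seasons: int = 24):
        self.mean = np.zeros(n_seasons)   # running average per season
        self.count = np.zeros(n_seasons)  # number of observations per season

    def update(self, season: int, x: float) -> None:
        # Incremental mean: x_bar_n = x_bar_{n-1} + (x_n - x_bar_{n-1}) / n
        self.count[season] += 1
        self.mean[season] += (x - self.mean[season]) / self.count[season]

    def forecast(self, season: int) -> float:
        return float(self.mean[season])

# With hourly data indexed by t, the hour-of-day season is t % 24
# (or t % 168 for hour-of-week).
model = SeasonalAverage(n_seasons=24)
for t, load in enumerate([0.4, 0.6, 0.5, 0.7]):
    model.update(t % 24, load)
```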
There are a few simple improvements that can be made:
- better initialization
- filtering out large values (not clipping)
- blending with the most recent observation
The sample average has high variance when $n$ is small, which is exactly the situation at the start of the evaluation period. A simple remedy is to shrink the average toward an initial value $x_0$:
$$
\begin{align*}
\tilde{x}_n
&= \underbrace{ \frac{n + \tau - 1}{n + \tau} \cdot \tilde{x}_{n-1} + \frac{1}{n + \tau} \cdot x_n }_{\text{convex combination}}
= \tilde{x}_{n-1} + \frac{1}{n + \tau} \cdot \underbrace{ (x_n - \tilde{x}_{n-1}) }_{\text{update}} \\
&= \frac{n}{n + \tau} \cdot \bar{x}_{n} + \frac{\tau}{n + \tau} \cdot x_0
\end{align*}
$$
where $x_0$ is the initial value and $\tau \ge 0$ is a pseudo-count controlling the strength of the initialization: the larger $\tau$, the longer the estimate stays close to $x_0$. As $n$ grows, $\tilde{x}_n$ converges to the plain sample average $\bar{x}_n$.
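A minimal sketch of this shrinkage estimator; choosing $x_0$ from the training data (for example, the training buildings' seasonal averages) and tuning $\tau$ are assumptions here, not details taken from the solution:

```python
class ShrunkAverage:
    """Running mean that starts at x0, with tau acting as a pseudo-count."""

    def __init__(self, x0: float, tau: float):
        self.mean = x0    # prior mean x_0 (assumed estimated from training data)
        self.count = tau  # tau "virtual" observations anchor the early estimates

    def update(self, x: float) -> None:
        # Same incremental update, but the denominator is n + tau,
        # so the estimate moves away from x0 slowly when tau is large.
        self.count += 1
        self.mean += (x - self.mean) / self.count
```

This directly targets the cold-start problem: before any observations arrive for a new building, the forecast is $x_0$ rather than an arbitrary zero.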
The domestic hot water heating (DHW) load and the electrical equipment (EEP) load had large spikes that were difficult to predict. Including these values in the average calculation resulted in bad forecasts; filtering out the large values improved the forecasts. The following figures show the effects of the large spikes on the forecasts.
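One way to implement such filtering is sketched below; the specific rule (skip the update when an observation exceeds a multiple of the current average) is an illustrative assumption, not necessarily the rule used in the solution:

```python
def filtered_update(mean: float, count: float, x: float,
                    spike_factor: float = 3.0) -> tuple[float, float]:
    """Update a running mean, skipping (rather than clipping) apparent spikes."""
    if count > 0 and x > spike_factor * mean:
        return mean, count  # treat x as a spike and drop it entirely
    count += 1
    return mean + (x - mean) / count, count
```

Because the spike is dropped rather than clipped to a threshold, a run of spikes never drags the average upward.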
Some of the load types exhibited a high degree of autocorrelation. A simple way to improve the averages is to correct the level of the forecast using the following formula:
$$ \hat{x}_{n+h} = \bar{x}_{n+h} + \alpha^h (x_n - \bar{x}_{n}) $$
where $h$ is the forecast horizon, $\bar{x}_{n+h}$ is the seasonal-average forecast for time $n+h$, $x_n - \bar{x}_n$ is the most recent residual, and $\alpha \in [0, 1)$ controls how quickly the correction decays.
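A sketch of a full 48-hour forecast with this level correction, reusing the illustrative `SeasonalAverage` class from above ($\alpha = 0.9$ and hour-of-day seasonality are example choices):

```python
def corrected_forecast(seasonal: SeasonalAverage, t_now: int, x_now: float,
                       horizon: int = 48, alpha: float = 0.9,
                       n_seasons: int = 24) -> list[float]:
    """Seasonal-average forecast plus a geometrically decaying level correction."""
    residual = x_now - seasonal.forecast(t_now % n_seasons)  # x_n - x_bar_n
    return [
        seasonal.forecast((t_now + h) % n_seasons) + alpha**h * residual
        for h in range(1, horizon + 1)
    ]
```

Since $\alpha^h \to 0$ as $h$ grows, the correction fades and the forecast reverts to the plain seasonal average; $\alpha = 0$ recovers the uncorrected average.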