Skip to content

Latest commit

 

History

History
116 lines (74 loc) · 11.7 KB

File metadata and controls

116 lines (74 loc) · 11.7 KB

Conditional Autoencoders for Asset Pricing and GANs

This chapter presents two unsupervised learning techniques that leverage deep learning: autoencoders, which have been around for decades, and Generative Adversarial Networks (GANs), which were introduced by Ian Goodfellow in 2014 and which Yann LeCun has called the most exciting idea in AI in the last ten years.

  • An autoencoder is a neural network trained to reproduce the input while learning a new representation of the data, encoded by the parameters of a hidden layer. Autoencoders have long been used for nonlinear dimensionality reduction and manifold learning. More recently, autoencoders have been designed as generative models that learn probability distributions over observed and latent variables. A variety of designs leverage the feedforward network, Convolutional Neural Network (CNN), and recurrent neural network (RNN) architectures we covered in the last three chapters.
  • GANs are a recent innovation that train two neural nets—a generator and a discriminator—in a competitive setting. The generator aims to produce samples that the discriminator is unable to distinguish from a given class of training data. The result is a generative model capable of producing new (fake) samples that are representative of a certain target distribution. GANs have produced a wave of research and can be successfully applied in many domains. An example from the medical domain that could potentially be highly relevant for trading is the generation of time-series data that simulates alternative trajectories and can be used to train supervised or reinforcement algorithms.

More specifically, this chapter covers:

  • Which types of autoencoders are of practical use and how they work

  • How to build and train autoencoders using Python

  • How GANs work, why they're useful, and how they could be applied to trading

  • How to build GANs using Python

  • Unsupervised Learning, Yann LeCun, 2016

How Autoencoders work

An autoencoder, in contrast, is a neural network designed exclusively to learn a new representation, that is, an encoding of the input. To this end, the training forces the network to faithfully reproduce the input. Since autoencoders typically use the same data as input and output, they are also considered an instance of self-supervised learning. In the process, the parameters of a hidden layer become the code that represents the input.

  • Autoencoders, Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning Book, Chapter 14, MIT Press 2016

Nonlinear dimensionality reduction

A traditional use case includes dimensionality reduction, achieved by limiting the size of the hidden layer so that it performs lossy compression. Such an autoencoder is called undercomplete and the purpose is to force it to learn the most salient properties of the data by minimizing a loss function. In addition to feedforward architectures, autoencoders can also use convolutional layers to learn hierarchical feature representations.

The powerful capabilities of neural networks to represent complex functions require tight limitations of the capacity of the encoder and decoder to force the extraction of a useful signal rather than noise. In other words, when it is too easy for the network to recreate the input, it fails to learn only the most interesting aspects of the data. This challenge is similar to the overfitting phenomenon that frequently occurs when using models with a high capacity for supervised learning. Just as in these settings, regularization can help by adding constraints to the autoencoder that facilitate the learning of a useful representation.

Sequence-to-Sequence Autoencoders

Sequence-to-sequence autoencoders are based on RNN components, such as Long Short-Term Memory (LSTM) or Gated Recurrent Units (GRUs). They learn a compressed representation of sequential data and have been applied to video, text, audio, and time-series data.

Variational Autoencoders

Variational Autoencoders (VAE) are more recent developments focused on generative modeling. More specifically, VAEs are designed to learn a latent variable model for the input data. Note that we encountered latent variables in Chapter 14, Topic Modeling.

Hence, VAEs do not let the network learn arbitrary functions as long as it faithfully reproduces the input. Instead, they aim to learn the parameters of a probability distribution that generates the input data. In other words, VAEs are generative models because, if successful, you can generate new data points by sampling from the distribution learned by the VAE.

How to build autoencoders using Python

The Keras library makes it fairly straightforward to build various types of autoencoders and the following examples are adapted from Keras' tutorials.

Feedforward Autoencoders with Sparsity Constraints

The notebook deep_autoencoders illustrates how to implement several of the autoencoder models introduced in the preceding section using Keras. This includes autoencoders using deep feedforward nets and sparsity constraints.

Convolutional & Denoising Autoencoders

The notebook convolutional_denoising_autoencoders goes on to demonstrate how to implement convolutionals and denoising autencoders to recover corrupted image inputs.

Sequence-to-sequence autoencoders

Sequence-to-sequence autoencoders are based on RNN components like long short-term memory (LSTM) or gated recurrent units (GRUs). They learn a compressed representation of sequential data and have been applied to video, text, audio, and time-series data.

Variational Autoencoders

The notebook variational_autoencoder shows how to build a Variational Autoencoder using Keras.

Generative Adversarial Networks

The supervised learning algorithms that we focused on for most of this book receive input data that's typically complex and predicts a numerical or categorical label that we can compare to the ground truth to evaluate its performance. These algorithms are also called discriminative models because they learn to differentiate between different output classes.

The goal of generative models is to produce complex output, such as realistic images, given simple input, which can even be random numbers. They achieve this by modeling a probability distribution over the possible output. This probability distribution can have many dimensions, for example, one for each pixel in an image or its character or token in a document. As a result, the model can generate output that are very likely representative of the class of output.

How GANs work

Evolution of GAN Architectures

Applications of GANs

How to build GANs using Python

The notebook deep_convolutional_generative_adversarial_network illustrates the implementation of a GAN using Python. It uses the Deep Convolutional GAN (DCGAN) example to synthesize images from the fashion MNIST dataset