%pylab inline
import pandas as pd
from ipypublish import nb_setup
An autoencoder is a neural net that maps features $X$ into labels $Y$, where $Y=X$. Such a network usually has an odd number of layers: the middle layer has the fewest nodes, and the number of nodes per layer grows as the layers move away from the middle. Essentially, the autoencoder transforms the input data $X$ into its smallest representation in the middle layer and then expands this representation as we move towards the output layer. Since the output is the same as the input, the original data is compressed and then decompressed, so the middle layer contains the compressed, i.e., reduced-dimension, version of $X$. The neural net encodes $X$ into a smaller representation $X'$ and then decodes it back into the original $X$. This is called "encoding-decoding" and the neural net is called an "encoder-decoder" network; the more general term is "autoencoder".
This note from the Keras blog provides good detail. Some of the material below is based on this blog.
The MNIST data set is a good test bed for generating compressed images using an autoencoder. The input feature vector is of size 784, compressed (encoded) down to a size of 32 and then decompressed (decoded) back to 784.
#Single fully-connected neural layer as encoder and as decoder
from keras.layers import Input, Dense
from keras.models import Model
#This is the size of our encoded representations
encoding_dim = 32 # 32 floats -> compression of factor 24.5, assuming the input is 784 floats
# this is our input placeholder
input_img = Input(shape=(784,))
# "encoded" is the encoded representation of the input
encoded = Dense(encoding_dim, activation='relu')(input_img)
# "decoded" is the lossy reconstruction of the input
decoded = Dense(784, activation='sigmoid')(encoded)
# this model maps an input to its reconstruction
autoencoder = Model(input_img, decoded)
Here the model encodes the higher dimension (784) feature set into a lower dimension (32).
#Separate encoder model
# this model maps an input to its encoded representation
encoder = Model(input_img, encoded)
The compressed image is decoded back to its full dimension.
#Separate decoder model
# create a placeholder for an encoded (32-dimensional) input
encoded_input = Input(shape=(encoding_dim,))
# retrieve the last layer of the autoencoder model
decoder_layer = autoencoder.layers[-1]
# create the decoder model
decoder = Model(encoded_input, decoder_layer(encoded_input))
#Compile the model
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
from keras.datasets import mnist
import numpy as np
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)
#Train the model
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))
# encode and decode some digits
# note that we take them from the *test* set
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)
print(type(encoded_imgs))
print(encoded_imgs.shape)
print(decoded_imgs.shape)
# use Matplotlib (don't ask)
n = 25 # how many digits we will display
figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = subplot(2, n, i + 1)
    imshow(x_test[i].reshape(28, 28))
    gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # display reconstruction
    ax = subplot(2, n, i + 1 + n)
    imshow(decoded_imgs[i].reshape(28, 28))
    gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
show()
We see that the reconstructed (decoded) images are recognizable. If the encoding dimension were higher (say 64), the reconstructions would of course be clearer.
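To test that claim, one could widen the bottleneck and retrain. A minimal sketch, assuming the same shallow architecture as above (the `_64` variable names are our own, and the model still needs to be fit before comparing reconstructions):
# Hedged sketch: a 64-dimensional bottleneck (compression factor 784/64 = 12.25)
# should preserve more detail than the 32-dimensional one used above.
encoding_dim_64 = 64
input_img_64 = Input(shape=(784,))
encoded_64 = Dense(encoding_dim_64, activation='relu')(input_img_64)
decoded_64 = Dense(784, activation='sigmoid')(encoded_64)
autoencoder_64 = Model(input_img_64, decoded_64)
autoencoder_64.compile(optimizer='adadelta', loss='binary_crossentropy')
# autoencoder_64.fit(x_train, x_train, epochs=10, batch_size=256,
#                    shuffle=True, validation_data=(x_test, x_test))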
We now add more layers, 5 hidden layers in total (the previous example had only one hidden layer). Note the tapering structure of the hidden layers, where the compressed middle layer is reached through progressively smaller layers and then mirrored on the way back out. Because of the additional layers, the nomenclature "deep" autoencoder is apt.
input_img = Input(shape=(784,))
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
encoded = Dense(32, activation='relu')(encoded)
decoded = Dense(64, activation='relu')(encoded)
decoded = Dense(128, activation='relu')(decoded)
decoded = Dense(784, activation='sigmoid')(decoded)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))
# this model maps an input to its encoded representation
encoder = Model(input_img, encoded)
encoded_input = Input(shape=(encoding_dim,))
# retrieve the 3rd-to-last layer of the autoencoder model
decoded = autoencoder.layers[-3](encoded_input)
# retrieve the 2nd-to-last layer of the autoencoder model
decoded = autoencoder.layers[-2](decoded)
# retrieve the last layer of the autoencoder model
decoded = autoencoder.layers[-1](decoded)
# create the decoder model
decoder = Model(encoded_input, decoded)
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)
print(encoded_imgs.shape)
decoded_imgs.shape
n = 25 # how many digits we will display
figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = subplot(2, n, i + 1)
    imshow(x_test[i].reshape(28, 28))
    gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # display reconstruction
    ax = subplot(2, n, i + 1 + n)
    imshow(decoded_imgs[i].reshape(28, 28))
    gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
show()
The additional encoding and decoding layers in the deep autoencoder result in better-quality reconstructions than those from the shallow autoencoder.
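One rough way to quantify this is to compare the test-set reconstruction loss of the two models. A sketch, assuming the shallow model was kept in memory under its own (hypothetical) name; in the code above the variable `autoencoder` was reused, so only the deep model's loss is computed directly:
# Test-set reconstruction loss of the current (deep) autoencoder
test_loss = autoencoder.evaluate(x_test, x_test, verbose=0)
print(test_loss)
# Hypothetical comparison, assuming the shallow model was saved separately:
# loss_shallow = autoencoder_shallow.evaluate(x_test, x_test, verbose=0)
# print(loss_shallow)   # a lower loss indicates better reconstructions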
The Treasury yield curve is the relationship between government interest rates and maturity. The curve moves over time as interest rates change. Rates for different maturities do not always change by the same amounts, so the shape of the yield curve evolves. An animated visualization of yield curve dynamics, shown alongside the stock market, is available (click the "animate" button to watch the curve evolve).
Treasury interest rates are assumed to be driven by a few factors, typically three, named for the kinds of movement they produce in the yield curve: level, slope, and curvature. Most of the variation tends to be a level effect, i.e., rates are driven by a single force that moves them all in the same direction.
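For intuition, these three factors are often proxied directly from the curve itself: the level as the average yield, the slope as the long rate minus the short rate, and the curvature as the belly of the curve relative to its ends. A minimal sketch on a purely hypothetical yield vector (the values below are made up for illustration):
import numpy as np
# Hypothetical yield curve across 8 maturities (values in %, purely illustrative)
yields = np.array([4.5, 4.7, 4.9, 5.0, 5.2, 5.4, 5.5, 5.6])
level = yields.mean()                                  # overall level of rates
slope = yields[-1] - yields[0]                         # long rate minus short rate
curvature = yields[4] - 0.5*(yields[0] + yields[-1])   # belly of the curve vs. its ends
print(level, slope, curvature)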
In the following example we take a long time series of yields for 8 maturities (i.e., 8 features) and use an autoencoder to extract the time series of a smaller feature set.
%pylab inline
import pandas as pd
from keras.layers import Input, Dense
from keras.models import Model
We read in the interest rates and then examine the correlation of the 8 series. As you can see, the correlation among the rates is very high, signifying that there may be one major underlying feature driving the entire system of rates over time.
rates = pd.read_csv("DL_data/tryrates.txt", sep="\t")
print(rates.shape)
rates.head()
rates.tail()
rates.corr()
rates = rates.drop("DATE", axis=1)
rates = array(rates)
print(rates.shape)
print(type(rates))
We will attempt to compress the feature set down from dimension 8 to dimension 2.
encoding_dim = 2 #No of factors
x_train = rates
x_test = rates
Multiple layers: we again use 5 hidden layers.
input_img = Input(shape=(8,))
encoded = Dense(6, activation='relu')(input_img)
encoded = Dense(4, activation='relu')(encoded)
encoded = Dense(encoding_dim, activation='relu')(encoded) #Middle layer
decoded = Dense(4, activation='relu')(encoded)
decoded = Dense(6, activation='relu')(decoded)
decoded = Dense(8, activation='sigmoid')(decoded)
autoencoder = Model(input_img, decoded)
#autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.compile(optimizer='adadelta', loss='mean_squared_error')
autoencoder.fit(x_train, x_train,
                epochs=15,
                batch_size=32,
                shuffle=True,
                validation_data=(x_test, x_test))
#Needed only for generating output of encoder and decoder
encoder = Model(input_img, encoded)
encoded_input = Input(shape=(encoding_dim,))
decoded = autoencoder.layers[-3](encoded_input)
decoded = autoencoder.layers[-2](decoded)
decoded = autoencoder.layers[-1](decoded)
decoder = Model(encoded_input, decoded)
encoded_imgs = encoder.predict(x_train)
decoded_imgs = decoder.predict(encoded_imgs)
print(encoded_imgs.shape)
print(decoded_imgs.shape)
encoded_imgs[:10,:]
#First factor
plot(encoded_imgs[:,0])
grid()
#Second factor
plot(encoded_imgs[:,1])
grid()
As we see, the autoencoder finds that it can compress the entire 8 dimensions down to a single dimension. We call this a "single factor" model of all interest rates; this factor is the underlying driving force. This main factor appears to be highly correlated with the US inflation rate.
nb_setup.images_vconcat(["DL_images/US_inflation.png"], width=600)
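As a hedged sanity check on the level interpretation, the extracted factor can be compared with the cross-maturity average of the rates. This uses whichever encoded column is non-degenerate (column 0 is assumed here), and sign and scale are arbitrary because of the hidden-layer scaling:
# Correlation of the extracted factor with the average rate across maturities
level = rates.mean(axis=1)            # cross-maturity average at each date
print(corrcoef(level, encoded_imgs[:, 0]))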
We do a similar analysis for the equity markets. We download time series of stock prices for several tickers (21 in total) and then convert the data into daily returns. This becomes the feature set that we aim to compress down to a smaller feature set (of size 5).
%pylab inline
import pandas as pd
from keras.layers import Input, Dense
from keras.models import Model
We download the data from the web and construct a dataframe of stock prices.
# IMPORTING STOCK DATA USING PANDAS
# Remember to "pip install pandas-datareader"
import pandas_datareader.data as web
from datetime import datetime
tickers = ["GOOG","MSFT","AMZN","AAPL","AMAT","ORCL","CSCO","HPQ","INFY","IBM","JNPR","LOGI",
"QCOM","SAP","VMW","WIT","XRX","C","BAC","PG","PEP"]
stkp = web.DataReader(tickers,"yahoo",datetime(2010,1,1),datetime(2018,12,31))
stkp = stkp["Adj Close"]
stkp.head()
stkp.to_csv("DL_data/equity_prices.csv")
Convert stock prices into returns.
#Read in data and prepare for Autoencoder
stkp = pd.read_csv("DL_data/equity_prices.csv")
stkp = stkp.drop("Date", axis=1)
rets = stkp.pct_change()
rets = rets.iloc[1:]
print(rets.shape)
rets.head()
rets = rets.dropna()
rets = array(rets)
rets = rets*100.0
print(rets.shape)
print(type(rets))
We select the number of dimensions we want in the reduced feature set.
encoding_dim = 5 #No of factors
x_train = rets
x_test = rets
Set up the autoencoder with 5 hidden layers.
input_img = Input(shape=(21,))
encoded = Dense(15, activation='tanh')(input_img)
encoded = Dense(9, activation='tanh')(encoded)
encoded = Dense(encoding_dim, activation='tanh')(encoded) #Middle layer
decoded = Dense(9, activation='tanh')(encoded)
decoded = Dense(15, activation='tanh')(decoded)
decoded = Dense(21, activation='linear')(decoded)
Compile and fit the autoencoder.
autoencoder = Model(input_img, decoded)
#autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.compile(optimizer='adadelta', loss='mean_squared_error')
autoencoder.fit(x_train, x_train,
                epochs=100,
                batch_size=32,
                shuffle=True,
                validation_data=(x_test, x_test))
# Needed only for generating output of encoder and decoder
encoder = Model(input_img, encoded)
encoded_input = Input(shape=(encoding_dim,))
decoded = autoencoder.layers[-3](encoded_input)
decoded = autoencoder.layers[-2](decoded)
decoded = autoencoder.layers[-1](decoded)
decoder = Model(encoded_input, decoded)
encoded_imgs = encoder.predict(x_train)
decoded_imgs = decoder.predict(encoded_imgs)
print(encoded_imgs.shape)
print(decoded_imgs.shape)
Show the first few values of the reduced, 5-dimensional data set.
encoded_imgs[:10,:]
The correlation between the factors is not high, as we would expect: they capture different attributes of the data.
corrcoef(encoded_imgs.T)
We plot the first and fourth factors just to see what they look like.
plot(encoded_imgs[:,0])
grid()
plot(encoded_imgs[:,3])
grid()
We also plot the third ticker, AMZN, against its decoded value, and see that they are strongly correlated (75%), as they should be.
plot(rets[:,2], decoded_imgs[:,2], 'bo')
grid()
plot(rets[:,2])
plot(decoded_imgs[:,2])
corrcoef(rets[:,2],decoded_imgs[:,2])
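The same check can be run for every ticker. A minimal sketch computing the correlation between each stock's actual returns and its decoded (reconstructed) returns, assuming the column order of `rets` matches `stkp.columns`, as the code above does:
# Reconstruction quality per ticker: correlation of actual vs. decoded returns
recon_corr = [corrcoef(rets[:, i], decoded_imgs[:, i])[0, 1] for i in range(rets.shape[1])]
for tkr, c in zip(stkp.columns, recon_corr):
    print(tkr, round(c, 3))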