"Everything that comes together falls apart." -- John Green
%pylab inline
from ipypublish import nb_setup
Assume you are comfortable with (1) linear transformations, (2) determinants, (3) span, and (4) change of basis.
Definition: An eigenvector $v$ of a linear transformation $M$ is a (nonzero) vector that remains on its own span after the transformation is applied.
So, this means the vector remains on the span and is only stretched or shrunk. In 2-D, this means that a vector $v=[v_1,v_2]^\top$ will simply be multiplied by a constant, i.e., its orientation (span) will not change.
A transformation of an $n$-dimensional space has at most $n$ linearly independent eigenvectors, and exactly $n$ when the transformation is diagonalizable, as in the examples below.
Example:
M = array([[2,3],[0,-1]])   #the linear transformation
v = array([1,0])            #candidate eigenvector
v_new = M.dot(v)            #apply the transformation
print(v_new)
And so, we see that $v$ gets stretched into $v_{new}$. Its span remains the same, therefore, $v$ is an eigenvector of the linear transformation $M$.
Since we are in 2-D, there will be two eigenvectors, and we may solve for them with the equation
$$ \lambda v = M \cdot v $$
M = array([[2,3],[0,-1]])
res = eig(M)
print(res)
print(res[0]) #eigenvalues
print(res[1]) #eigenvectors
We get two eigenvectors:
$$ v_1 = \left[\begin{array}{c} 1\\0 \end{array} \right], \quad \quad v_2 = \left[\begin{array}{c} -0.71\\0.71 \end{array} \right] $$
We can also see that the eigenvalues $(2, -1)$ are the stretching factors for the corresponding eigenvectors.
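As a quick check (a minimal sketch reusing `M` and `res` from above), each eigenvector should only be scaled by its eigenvalue under the transformation:
#Verify that M only scales each eigenvector by its eigenvalue
for lam, vec in zip(res[0], res[1].T):
    print(lam, allclose(M.dot(vec), lam*vec))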
Understanding eigenvalues and eigenvectors is best done visually. An excellent simple exposition is available at: http://setosa.io/ev/eigenvectors-and-eigenvalues/
An $M \times M$ matrix $A$ has attendant $M$ eigenvectors $V$ and eigenvalues $\lambda$ if we can write
\begin{equation} \lambda V = A \cdot V \end{equation}
Starting with matrix $A$, the eigenvalue decomposition gives both $V$ and $\lambda$. It turns out we can find up to $M$ such eigenvalue/eigenvector pairs, as there is no unique solution to this equation. We also require that the eigenvector $V \neq {\bf 0}$.
We may implement this in code as follows.
A = array([[5,1],[2,4]])
E = eig(A)          #E[0] holds the eigenvalues, E[1] the eigenvectors (as columns)
print(E)
v1 = E[1][:,0]      #first eigenvector
v2 = E[1][:,1]      #second eigenvector
e1 = E[0][0]        #first eigenvalue
e2 = E[0][1]        #second eigenvalue
print(e1*v1)
print(A.dot(v1))    #matches e1*v1
print(e2*v2)
print(A.dot(v2))    #matches e2*v2
The math is as follows:
$$ M \cdot v = \lambda v $$
$$ M \cdot v = (\lambda I) v $$
$$ (M - \lambda I) \cdot v = {\bf 0} $$
which means $v$ is being flattened down to ${\bf 0}$, which can only happen if the linear transformation $(M - \lambda I)$ is singular, i.e., its determinant must be 0.
$$ \det(M - \lambda I) = 0 $$
In our example, $M = \left[\begin{array}{cc} 2 & 3\\0 & -1 \end{array} \right]$. So
$$ M - \lambda I = \left[\begin{array}{cc} 2-\lambda & 3\\0 & -1-\lambda \end{array} \right] $$
Then
$$ \det(M-\lambda I) = (2-\lambda)(-1-\lambda) - (0)(3) = 0 $$
which is a quadratic equation
$$ \lambda^2 - \lambda - 2 = 0 $$
with two roots, $\lambda=2$ and $\lambda = -1$, which you can see are the eigenvalues above. Using these, we can extract (solve for) the eigenvectors.
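For instance, to recover the eigenvector for $\lambda = 2$, we can compute a basis for the null space of $(M - \lambda I)$. The sketch below assumes `scipy.linalg.null_space` is available (SciPy 1.1+):
from scipy.linalg import null_space
#Eigenvector for lambda = 2: a basis for the null space of (M - lambda*I)
vec1 = null_space(M - 2*eye(2))
print(vec1)   #proportional to [1, 0], matching the first eigenvector above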
It is possible for a transformation to have no real eigenvectors. Consider the matrix
$$ M = \left[\begin{array}{cc} 0 & -1\\1 & 0 \end{array} \right] $$
This is a 90-degree rotation, and the equation $\det(M - \lambda I) = \lambda^2 + 1 = 0$ has no real roots.
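Numerically, `eig` still returns eigenvalues for this matrix, but they are complex (a quick sketch):
R = array([[0,-1],[1,0]])   #90-degree rotation
print(eig(R)[0])            #the eigenvalues are the complex pair +1j, -1j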
A shear has just one independent eigenvector. (Check this.)
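A quick check, using the standard horizontal shear matrix as one example (a sketch):
Sh = array([[1,1],[0,1]])   #a horizontal shear
vals, vecs = eig(Sh)
print(vals)                 #the eigenvalue 1 is repeated
print(vecs.round(3))        #both columns lie on the span of [1, 0]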
Neutrinos Lead to Unexpected Discovery in Basic Math.
The paper with Terence Tao. What the heck is a neutrino?
A basis made up of eigenvectors is an eigenbasis.
EB = eig(M)[1]
EB #eigenbasis
#This has an interesting property
inv(EB).dot(M).dot(EB)
For a diagonalizable matrix, this always gives a diagonal matrix with the eigenvalues on the diagonal. So it may be used to speed up computations involving matrix powers.
M = array([[1,4],[3,-2]])
# x = array([2,5])
y = M.dot(M).dot(M).dot(M) #Fourth power of M
print('M4 =',y)
#The eigenbasis can be used to quickly calculate powers of a matrix
EB = eig(M)[1]
print(EB)
A = (inv(EB).dot(M).dot(EB))   #diagonal matrix of eigenvalues
print(A.round(4))
A = A**4                       #element-wise power works here because A is diagonal
print(A.round(4))
M4 = EB.dot(A).dot(inv(EB))
print('M4 check =')
print(M4)
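As a cross-check (a minimal sketch), `numpy.linalg.matrix_power` computes the same fourth power directly:
from numpy.linalg import matrix_power
print(matrix_power(M, 4))   #direct computation; should match M4 above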
We will read in data on US interest rates to uncover their underlying drivers.
import pandas as pd
rates = pd.read_csv("DL_data/tryrates.txt", sep="\t")
print(rates.columns)
rates.head()
rates = rates.drop("DATE", axis=1)
cv = rates.cov().values
cv.round(3)
eigen_decomp = eig(cv)
eigen_decomp
bar(arange(len(eigen_decomp[0])),eigen_decomp[0])
grid()
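To quantify how dominant the first factor is, here is a quick sketch of each eigenvalue's share of total variance, reusing `eigen_decomp` from above:
#Each eigenvalue's share of total variance
print((eigen_decomp[0]/eigen_decomp[0].sum()).round(4))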
How do we interpret the matrix below (the first three eigenvectors, shown as columns)?
eigen_decomp[1][:,:3]
An important application of eigensystems is in social networks. We often want to rank nodes in a network by influence. This is known as eigenvector centrality.
nb_setup.images_vconcat(["DL_images/Centrality.png"], width=600)
A = array([[0,1,1],[1,0,0],[1,0,0]])   #star network: node 0 is linked to nodes 1 and 2
res = eig(A)
print(res)
print('Centrality vector =',res[1][:,argmax(res[0])])
A = array([[0,1,1],[1,0,1],[1,1,0]])   #complete (triangle) network: every node is linked to every other
res = eig(A)
print(res)
print('Centrality vector =',res[1][:,argmax(res[0])])
A = array([[0,2,1],[2,0,0],[1,0,0]])   #weighted star: node 0's link to node 1 has weight 2
res = eig(A)
print(res)
print('Centrality vector =',res[1][:,argmax(res[0])])
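If the `networkx` package is available, we can cross-check the last example against its eigenvector centrality routine (a sketch under that assumption; the normalization differs, but the ranking of nodes should agree):
import networkx as nx
G = nx.from_numpy_array(A)   #A is the weighted adjacency matrix above
print(nx.eigenvector_centrality_numpy(G, weight='weight'))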
In finance, we often need to check whether a covariance matrix is positive definite. Positive definiteness is the matrix analogue of a variance being positive, since a covariance matrix is the higher-dimensional version of univariate variance.
A matrix is positive definite if all its eigenvalues are positive.
A matrix is positive semi-definite if all its eigenvalues are non-negative.
Let's check if the covariance matrix of interest rates is positive definite.
A = rates.cov().values #get the covariance matrix
ev = eig(A)[0]
print(ev)
print('Positive definite? ',all(eig(A)[0]>0))
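For contrast, here is a minimal example of a matrix that is positive semi-definite but not positive definite:
B = array([[1,1],[1,1]])   #two perfectly correlated variables: a degenerate covariance matrix
print(eig(B)[0])           #eigenvalues 2 and 0: positive semi-definite, but not positive definite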
SVD (singular value decomposition) is a generalization of the eigenvalue decomposition: it also applies to non-square matrices. Consider the decomposition of a non-square matrix $M$ of size $m \times n$. The canonical decomposition is as follows:
$$ M = T \cdot S \cdot D^\top $$
where $T$ is $m \times m$, $S$ is $m \times n$, and $D^\top$ is $n \times n$. $T$ and $D$ are orthogonal matrices (their columns are orthonormal). $S$ is the "singular values" matrix, i.e., a diagonal matrix with the singular values on the principal diagonal. These values denote the relative importance of the components, e.g., of the terms in a term-document matrix (TDM), used in the example below.
Let's create a $50 \times 10$ matrix and then undertake SVD. We often do this with term-document matrices.
#Create a sparse random term-document matrix
tdm = zeros((50,10))
for i in range(50):
    for j in range(10):
        if rand()<0.10:
            tdm[i,j] = randint(20)
tdm
#Using SciPy
from scipy.linalg import svd
T,S,Dt = svd(tdm)
print(T.shape, S.shape, Dt.shape)
print(S)
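As a quick reconstruction check (a sketch), we can embed the vector `S` in an $m \times n$ diagonal matrix and multiply the three factors back together; the product should recover `tdm`:
#Rebuild the full 50 x 10 singular value matrix and reconstruct tdm
S_full = zeros(tdm.shape)
S_full[:len(S),:len(S)] = diag(S)
print(allclose(T.dot(S_full).dot(Dt), tdm))   #True if the decomposition is exact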
#Using SkLearn
from sklearn.decomposition import TruncatedSVD
svd = TruncatedSVD(n_components=5, n_iter=100, random_state=42)
svd.fit(tdm)
print(svd.explained_variance_ratio_)
print(svd.explained_variance_ratio_.sum())
print(svd.singular_values_)
from sklearn.utils.extmath import randomized_svd
T, S, Dt = randomized_svd(tdm, n_components=5, n_iter=100, random_state=42)
print(T.shape, S.shape, Dt.shape)
print(S)
rates.shape
from sklearn.utils.extmath import randomized_svd
T, S, Dt = randomized_svd(rates.values, n_components=2, n_iter=100, random_state=42)
print(T.shape, S.shape, Dt.shape)
print(S)
figure(figsize=(15,5))
subplot(1,2,1)
plot(T[:,0])
grid()
subplot(1,2,2)
plot(T[:,1])
grid()
We see that the results are essentially the same as those we get with PCA (up to sign and scale).
What are these components? The first one may be GDP growth and the second one may be inflation.
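To see the correspondence with PCA, here is a hedged sketch using `sklearn`'s PCA on the same data (PCA demeans the columns first, so the components may differ in sign and scale, but their shapes should broadly line up with the plots above):
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
scores = pca.fit_transform(rates.values)   #time series of the first two principal components
figure(figsize=(15,5))
subplot(1,2,1)
plot(scores[:,0])
grid()
subplot(1,2,2)
plot(scores[:,1])
grid()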
Here we will take a very small $M$ matrix and decompose it. Recall that
$$ M = T \cdot S \cdot D^\top $$
We will show that $T$ is the eigenvector matrix of $M \cdot M^\top$, and $D$ is the eigenvector matrix of $M^\top \cdot M$. The singular values are the square roots of the corresponding eigenvalues. When $M$ is square, symmetric, and positive semi-definite (e.g., a covariance matrix), this coincides with the usual eigen decomposition.
M = array([[1,1,0,0],[1,0,1,0],[1,0,0,0],[0,1,1,1],[0,0,1,1]])
M
from scipy.linalg import svd
T,S,Dt = svd(M)
print(T.shape, S.shape, Dt.shape)
print("T =",T.round(3))
print("S =",S.round(4))
print("Dt =",Dt.round(3))
#Check T
res = eig(M.dot(M.T))[1]
print(res.shape)
print("T check =",res.round(3))
#Check Dt
res = eig(M.T.dot(M))[1]
print(res.shape)
print("Dt check =",res.T.round(3))
#Check S, noting that the eigenvalues of the co-occurrence matrix equal the squares of the singular values
res = eig(M.T.dot(M))[0]
print("S check =",res.round(3), S**2)
res = eig(M.dot(M.T))[0]
print("S check =",res.round(3))
Suppose you have a covariance matrix $A$. This is a square matrix, of course. You may want to factor it into the product of two triangular matrices, a lower triangular $L$ and an upper triangular $U$. This is known as "LU decomposition."
$$ A = L \cdot U $$
A = rates.cov().values
from scipy.linalg import lu
LU = lu(A)   #returns the triple (P, L, U), where P is a permutation matrix
print('L =',LU[1].round(1))
print('U =',LU[2].round(1))
A special case arises when we require $U = L^\top$ (the Cholesky decomposition), so that we want
$$ A = L \cdot L^\top $$
Note that this is especially meaningful when the matrix $A$ is a covariance matrix. Then we may think of $L$ as a "square root" of $A$. Therefore, it is a "kind of" standard deviation matrix. We will show the code and then examine an application.
print(A.round(2))
ch = cholesky(A)
print(ch.round(2))
#Check the decomposition
res = ch.dot(ch.T)
print(res.round(2))
Suppose we want to generate correlated random numbers that match a given covariance matrix. We find its Cholesky decomposition $L$, and then apply $L$ to a set of uncorrelated random numbers to transform them into correlated numbers.
Let's take the Treasury rates covariance matrix and generate 10,000 sets of 8 uncorrelated standard normal random numbers. We will then convert them into correlated numbers.
z = randn(80000).reshape((8,10000))   #8 rows of uncorrelated standard normals, 10,000 draws each
z.shape
#Correlated normal values
x = ch.dot(z).T
print(x.shape)
df = pd.DataFrame(x)
df.cov()
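As a final check (a sketch), the sample covariance of the simulated numbers should be close to the original covariance matrix $A$, with small differences due to sampling error:
#The simulated covariance should be close to the original covariance matrix A (up to sampling error)
print(abs(df.cov().values - A).round(2))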