REFERENCE BOOK: http://srdas.github.io/DLBook2
“You are my creator, but I am your master; Obey!”
― Mary Shelley, Frankenstein
!pip install ipypublish
%pylab inline
import pandas as pd
from IPython.external import mathjax
from ipypublish import nb_setup
Populating the interactive namespace from numpy and matplotlib
from google.colab import drive
drive.mount('/content/drive') # Add My Drive/<>
import os
os.chdir('drive/My Drive')
os.chdir('Books_Writings/ML_Book/')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
nb_setup.images_hconcat(["DSTMAA_images/ML_AI.png"], width=600)
Interesting short book: https://medium.com/machine-learning-for-humans/why-machine-learning-matters-6164faf1df12
The Universal Approximation Theorem: https://medium.com/analytics-vidhya/you-dont-understand-neural-networks-until-you-understand-the-universal-approximation-theorem-85b3e7677126
nb_setup.images_hconcat(["DSTMAA_images/DL_PatternRecognition.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/NN_diagram.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/NN_subset.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/Activation_functions.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/Softmax.png"], width=700)
#The Softmax function
#Assume 10 output nodes with randomly generated values
z = randn(32) #inputs from last hidden layer of 32 nodes to the output layer
w = rand(32*10).reshape((10,32)) #weights for the output layer
b = rand(10) #bias terms at output layer
a = w.dot(z) + b #Net input at output layer
e = exp(a)
softmax_output = e/sum(e)
print(softmax_output.round(3))
print('final tag =',where(softmax_output==softmax_output.max())[0][0])
[0.198 0.033 0.267 0.144 0.043 0.155 0.023 0.039 0.035 0.063] final tag = 2
nb_setup.images_hconcat(["DSTMAA_images/Loss_function.png"], width=600)
https://rdipietro.github.io/friendly-intro-to-cross-entropy-loss/
Notation from the previous slides:
Log Loss is the special case of Cross Entropy for binary outcomes.
y = [0.33, 0.33, 0.34]
bits = log2(y)
print(bits)
entropy = -sum(y*bits)
print(entropy)
[-1.59946207 -1.59946207 -1.55639335] 1.58481870497303
y = [0.2, 0.3, 0.5]
entropy = -sum(y*log2(y))
print(entropy)
1.4854752972273344
y = [0.1, 0.1, 0.8]
print(log2(y))
entropy = -sum(y*log2(y))
print(entropy)
[-3.32192809 -3.32192809 -0.32192809] 0.9219280948873623
Cross-entropy:
$$ C = - \frac{1}{n} \sum_i [y_i \ln a_i] $$
where $a_i = {\hat y_i}$.
Note that $C \ge E$ always, with equality only when the predicted distribution equals the true one.
#Correct prediction
y = [0, 0, 1]
yhat = [0.1, 0.1, 0.8]
crossentropy = -sum(y*log2(yhat))
print(crossentropy)
0.3219280948873623
#Wrong prediction
yhat = [0.1, 0.6, 0.3]
crossentropy = -sum(y*log2(yhat))
print(crossentropy)
1.736965594166206
The Kullback-Leibler (KL) divergence measures the extra bits required when the wrong distribution is used.
#Correct prediction
y = [0, 0, 1.0]
yhat = [0.1, 0.1, 0.8]
KL = sum(y[2]*log2(y[2]/yhat[2]))
print(KL)
#Wrong prediction
yhat = [0.1, 0.6, 0.3]
KL = sum(y[2]*log2(y[2]/yhat[2]))
print(KL)
0.32192809488736235 1.736965594166206
nb_setup.images_hconcat(["DSTMAA_images/Gradient_descent.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/Chain_rule.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/Delta_values.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/Output_layer.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/Feedforward_Backprop.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/Recap.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/Backprop_one_slide.png"], width=600)
Given that $\frac{\partial D}{\partial a_j} = e^{a_j}$,
$$ \frac{\partial \mathcal L}{\partial a_j} = h_j \Big(\sum_i T_i\Big) - T_j = h_j - T_j $$
nb_setup.images_hconcat(["DSTMAA_images/batch_stochastic_gradient.png"], width=700)
1. Initialize all the weight and bias parameters $(w_{ij}^{(r)},b_i^{(r)})$ (this is a critical step).
2. For $q = 0,...,{M\over B}-1$, repeat the following steps (2a) - (2f):
a. For the training inputs $X_i(m), qB\le m\le (q+1)B$, compute the model predictions $y(m)$ given by
$$ a_i^{(r)}(m) = \sum_{j=1}^{P^{r-1}} w_{ij}^{(r)}z_j^{(r-1)}(m)+b_i^{(r)} \quad \mbox{and} \quad z_i^{(r)}(m) = f(a_i^{(r)}(m)), \quad 2 \leq r \leq R, 1 \leq i \leq P^r $$
and for $r=1$,
$$ a_i^{(1)}(m) = \sum_{j=1}^{N} w_{ij}^{(1)}x_j(m)+b_i^{(1)} \quad \mbox{and} \quad z_i^{(1)}(m) = f(a_i^{(1)}(m)), \quad 1 \leq i \leq P^1 $$
The logits and classification probabilities are computed using
$$ a_i^{(R+1)}(m) = \sum_{j=1}^{P^R} w_{ij}^{(R+1)}z_j^{(R)}(m)+b_i^{(R+1)} $$
and
$$ y_i(m) = \frac{\exp(a_i^{(R+1)}(m))}{\sum_{k=1}^K \exp(a_k^{(R+1)}(m))}, \quad 1 \leq i \leq K $$
This step constitutes the forward pass of the algorithm.
b. Evaluate the gradients $\delta_k^{(R+1)}(m)$ for the logit layer nodes using
$$ \delta_k^{(R+1)}(m) = y_k(m) - t_k(m),\ \ 1\le k\le K $$
This step and the following one constitute the start of the backward pass of the algorithm, in which we compute the gradients $\delta_k^{(r)}, 1 \leq k \leq P^r, 1\le r\le R$ for all the hidden nodes.
c. Back-propagate the $\delta$s using the following equation to obtain $\delta_j^{(r)}(m), 1 \leq r \leq R, 1 \leq j \leq P^r$ for each hidden node in the network:
$$ \delta_j^{(r)}(m) = f'(a_j^{(r)}(m)) \sum_k w_{kj}^{(r+1)} \delta_k^{(r+1)}(m), \quad 1 \leq r \leq R $$
d. Compute the gradients of the Cross Entropy Function $\mathcal L(m)$ for the $m$-th training vector $(X{(m)}, T{(m)})$ with respect to all the weight and bias parameters using
$$ \frac{\partial\mathcal L(m)}{\partial w_{ij}^{(r+1)}} = \delta_i^{(r+1)}(m) z_j^{(r)}(m) $$
and
$$ \frac{\partial \mathcal L(m)}{\partial b_i^{(r+1)}} = \delta_i^{(r+1)}(m), \quad 0 \leq r \leq R $$
e. Change the model weights according to
$$ w_{ij}^{(r)} \leftarrow w_{ij}^{(r)} - \frac{\eta}{B}\sum_{m=qB}^{(q+1)B} \frac{\partial\mathcal L(m)}{\partial w_{ij}^{(r)}} $$
$$ b_i^{(r)} \leftarrow b_i^{(r)} - \frac{\eta}{B}\sum_{m=qB}^{(q+1)B} \frac{\partial{\mathcal L}(m)}{\partial b_i^{(r)}} $$
f. Increment $q\leftarrow (q+1)\mod {M\over B}$ and go back to step (a).
3. Compute the Loss Function $L$ over the Validation Dataset, given by
$$ L = -{1\over V}\sum_{m=1}^V\sum_{k=1}^K t_k{(m)} \log y_k{(m)} $$
If $L$ has dropped below some threshold, then stop. Otherwise go back to Step 2.
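The algorithm above maps directly onto code. The following is a minimal NumPy sketch (not the book's implementation) of one pass over the training set for a network with a single sigmoid hidden layer and a softmax output layer; the layer sizes, the learning rate, and the synthetic arrays `X` and `T` are illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0/(1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))    # numerically stabilized softmax
    return e/e.sum(axis=1, keepdims=True)

# Illustrative sizes: M samples, N inputs, one hidden layer of P nodes, K classes, batch size B
M, N, P, K, B, eta = 600, 20, 32, 3, 50, 0.1
rng = np.random.default_rng(0)
X = rng.normal(size=(M, N))
T = np.eye(K)[rng.integers(0, K, size=M)]           # one-hot targets

# Step 1: initialize the weight and bias parameters
W1, b1 = rng.normal(scale=0.1, size=(N, P)), np.zeros(P)
W2, b2 = rng.normal(scale=0.1, size=(P, K)), np.zeros(K)

# Step 2: loop over the mini-batches q = 0,...,M/B - 1
for q in range(M // B):
    Xb, Tb = X[q*B:(q+1)*B], T[q*B:(q+1)*B]
    # (a) forward pass
    A1 = Xb @ W1 + b1               # hidden-layer pre-activations
    Z1 = sigmoid(A1)                # hidden-layer activations
    Y = softmax(Z1 @ W2 + b2)       # classification probabilities
    # (b) logit-layer deltas
    D2 = Y - Tb
    # (c) back-propagate through the hidden layer; for the sigmoid, f'(a) = f(a)(1 - f(a))
    D1 = (D2 @ W2.T) * Z1 * (1 - Z1)
    # (d)+(e) batch-averaged gradients and the weight/bias updates
    W2 -= eta * (Z1.T @ D2) / B
    b2 -= eta * D2.mean(axis=0)
    W1 -= eta * (Xb.T @ D1) / B
    b1 -= eta * D1.mean(axis=0)
```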
nb_setup.images_hconcat(["DSTMAA_images/Gradient_Descent_Scheme.png"], width=600)
def f(x):
    return 3*x**2 - 5*x + 10

x = linspace(-4,4,100)
plot(x,f(x))
grid()

dx = 0.001
eta = 0.05    #learning rate
x = -3
for j in range(20):
    df_dx = (f(x+dx)-f(x))/dx    #finite-difference approximation to the gradient
    x = x - eta*df_dx
    print(x,f(x))
-1.8501500000001698 29.519915067502733
-1.0452550000002532 18.503949045077853
-0.4818285000003115 13.105618610239208
-0.08742995000019249 10.460081738472072
0.18864903499989083 9.163520200219716
0.3819043244999847 8.528031116715445
0.5171830271499616 8.216519714966186
0.6118781190049631 8.06379390252634
0.6781646833034642 7.988898596522943
0.7245652783124417 7.95215813604575
0.7570456948186948 7.934126078037087
0.7797819863730542 7.925269906950447
0.7956973904611502 7.920916059254301
0.8068381733227685 7.918772647178622
0.8146367213259147 7.917715356568335
0.8200957049281836 7.917192371084045
0.8239169934497053 7.916932669037079
0.8265918954147882 7.9168030076222955
0.8284643267903373 7.916737788340814
0.8297750287532573 7.916704651261121
In large problems, gradients can vanish before training has reached an acceptable level of accuracy.
There are several issues with gradient descent that need to be handled, and several fixes that can be applied; these are described below.
Learning rate $\eta$ may be too large or too small.
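As a quick illustration, re-running the finite-difference descent loop from above on the same quadratic (a hypothetical experiment, with the learning rates chosen arbitrarily) shows all three regimes:

```python
def f(x):
    return 3*x**2 - 5*x + 10        # same convex function as above, minimum at x = 5/6

def descend(eta, steps=20, x=-3.0, dx=1e-3):
    for _ in range(steps):
        x = x - eta * (f(x + dx) - f(x)) / dx   # finite-difference gradient step
    return x

print(descend(eta=0.05))    # well-chosen rate: approaches 0.833
print(descend(eta=0.001))   # too small: after 20 steps x has barely moved from -3
print(descend(eta=0.40))    # too large: the iterates overshoot and diverge
```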
nb_setup.images_hconcat(["DSTMAA_images/LearningRate_Matters.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/GD_MultipleDimensions.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/GD_saddle.png"], width=400)
The idea is to start with a high learning rate and then adaptively reduce it as we get closer to the minimum of the loss function.
nb_setup.images_hconcat(["DSTMAA_images/LearningRateAnnealing.png"], width=500)
Adjust the learning rate as a step function when the reduction in the loss function begins to plateau.
nb_setup.images_hconcat(["DSTMAA_images/LearningRateAnnealing2.png"], width=500)
Improve the speed of convergence (for example the Momentum, Nesterov Momentum, and Adam algorithms).
Adapt the effective Learning Rate as the training progresses (for example the ADAGRAD, RMSPROP and Adam algorithms).
Momentum addresses the problem that arises when the gradient is multidimensional, with steep components along some axes and shallow components along others.
At the end of the $n^{th}$ iteration of the Backprop algorithm, define a sequence $v(n)$ by
$$ v(n) = \rho\; v(n-1) - \eta \; g(n), \quad \quad v(0)= -\eta g(0) $$
where $\rho$ is a new hyper-parameter called the "momentum" parameter, and $g(n)$ is the gradient evaluated at the parameter value $w(n)$.
$g(n)$ is defined by
$$ g(n) = \frac{\partial {\mathcal L(n)}}{\partial w} $$
for Stochastic Gradient Descent and
$$ g(n) = {1\over B}\sum_{m=nB}^{(n+1)B}\frac{\partial {\mathcal L(m)}}{\partial w} $$
for Batch Stochastic Gradient Descent (note that in this case $n$ indexes the mini-batch).
The change in parameter values on each iteration is now defined as
$$ w(n+1) = w(n) + v(n) $$
It can be shown from these equations that $v(n)$ can be written as
$$ v(n) = - \eta\sum_{i=0}^{n} \rho^{n-i} g(i) $$
so that
$$ w(n+1) = w(n) - \eta\sum_{i=0}^{n} \rho^{n-i} g(i) $$
When the momentum parameter $\rho = 0$, this equation reduces to the usual Stochastic Gradient Descent iteration. On the other hand, when $\rho > 0$, we get some interesting behaviors:
Note that
$$ \sum_{i=0}^{n} \rho^{n-i}g(i) \le {g_{max}\over 1-\rho} $$
$\rho$ is usually set in the neighborhood of $0.9$, and from the above equation it follows that $\sum_{i=0}^n \rho^{n-i}g(i)\approx 10g$, assuming all the $g(i)$ are approximately equal to $g$. Hence the effective gradient is ten times the value of the actual gradient. This results in an "overshoot", in which the parameter value shoots past the minimum point to the other side of the bowl and then reverses itself. This is desirable behavior, since the momentum prevents the algorithm from getting stuck at a saddle point or a local minimum by carrying it out of these areas.
This appears circular: to take the step we would ideally evaluate the gradient at the new point $w(n+1)$, which is not yet known. Nesterov Momentum uses the lookahead approximation
$$ w(n+1)\approx w(n) + \rho v(n-1) $$
and evaluates the gradient at this point instead.
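A minimal sketch of the Momentum update, and of the Nesterov lookahead variant, applied to the same one-dimensional quadratic used in the gradient descent example above (the hyper-parameter values are illustrative):

```python
def grad(w):
    return 6*w - 5          # gradient of f(w) = 3w^2 - 5w + 10, minimum at w = 5/6

eta, rho, steps = 0.05, 0.9, 200

# Plain Momentum: v(n) = rho*v(n-1) - eta*g(n),  w(n+1) = w(n) + v(n)
w, v = -3.0, 0.0
for _ in range(steps):
    v = rho*v - eta*grad(w)
    w = w + v
print("momentum:", w)       # approaches 5/6, overshooting and reversing along the way

# Nesterov Momentum: evaluate the gradient at the lookahead point w + rho*v
w, v = -3.0, 0.0
for _ in range(steps):
    v = rho*v - eta*grad(w + rho*v)
    w = w + v
print("nesterov:", w)
```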
Parameter update rule:
$$ w(n+1) = w(n) - \frac{\eta}{\sqrt{\sum_{i=1}^n g(i)^2+\epsilon}}\; g(n) $$
The constant $\epsilon$ has been added to better condition the denominator and is usually set to a small number such as $10^{-7}$.
Each parameter gets its own adaptive Learning Rate, such that large gradients have smaller learning rates and small gradients have larger learning rates ($\eta$ is usually defaulted to $0.01$). As a result the progress along each dimension evens out over time, which helps the training process.
The change in rates happens automatically as part of the parameter update equation.
Downside: the accumulation of squared gradients in the denominator continuously decreases the Learning Rates, which can eventually halt training in large networks that require many iterations.
Note that
$$ E[g^2]_n = (1-\rho)\sum_{i=0}^n \rho^{n-i} g(i)^2 \le g_{max}^2 $$
which shows that the parameter $\rho$ prevents the sum from blowing up, and a large value of $\rho$ is equivalent to using a larger window of previous gradients in computing the sum.
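A minimal NumPy sketch of the ADAGRAD and RMSPROP per-parameter updates on a toy two-dimensional quadratic whose gradient is steep along one axis and shallow along the other (the toy function and the constants $\eta$, $\rho$, $\epsilon$ are illustrative assumptions):

```python
import numpy as np

def grad(w):
    # gradient of f(w) = 5*w[0]**2 + 0.05*w[1]**2: steep in w[0], shallow in w[1]
    return np.array([10.0*w[0], 0.1*w[1]])

eta, rho, eps, steps = 0.01, 0.9, 1e-7, 200

# ADAGRAD: accumulate the sum of squared gradients; each step is divided by its square root,
# so progress along the steep and shallow directions evens out over time
w, G = np.array([1.0, 1.0]), np.zeros(2)
for _ in range(steps):
    g = grad(w)
    G += g**2
    w -= eta * g / np.sqrt(G + eps)
print("adagrad:", w)

# RMSPROP: replace the running sum with an exponentially weighted moving average E[g^2]
w, Eg2 = np.array([1.0, 1.0]), np.zeros(2)
for _ in range(steps):
    g = grad(w)
    Eg2 = rho*Eg2 + (1 - rho)*g**2
    w -= eta * g / np.sqrt(Eg2 + eps)
print("rmsprop:", w)
```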
nb_setup.images_hconcat(["DSTMAA_images/sigmoid_activation.png"], width=700)
nb_setup.images_hconcat(["DSTMAA_images/tanh_activation.png"], width=500)
Unless the input is in the neighborhood of zero, the function enters its saturated regime.
It is superior to the sigmoid in one respect: its output is zero-centered, which speeds up the training process.
The $\tanh$ function is rarely used in modern DLNs, the exception being a type of DLN called the LSTM.
nb_setup.images_hconcat(["DSTMAA_images/relu_activation.png"], width=400)
No saturation problem.
Gradients $\frac{\partial L}{\partial w}$ propagate undiminished through the network, provided all the nodes are active.
nb_setup.images_hconcat(["DSTMAA_images/dead_relu.png"], width=400)
The dotted line in this figure shows a case in which the weight parameters $w_i$ are such that the hyperplane $\sum w_i z_i = 0$ does not intersect the "data cloud" of possible input activations. No possible input values can lead to $\sum w_i z_i > 0$, so the neuron's output activation will always be zero, and it will kill all gradients backpropagating down from higher layers.
Vary initialization to correct this.
nb_setup.images_hconcat(["DSTMAA_images/leaky_relu.png"], width=600)
nb_setup.images_hconcat(["DSTMAA_images/prelu.png"], width=500)
Note that each neuron $i$ now has its own parameter $\beta_i, 1\le i\le S$, where $S$ is the number of nodes in the network. These parameters are iteratively estimated using Backprop.
$$ \frac{\partial\mathcal L}{\partial\beta_i} = \frac{\partial\mathcal L}{\partial z_i}\frac{\partial z_i}{\partial\beta_i},\ \ 1\le i\le S $$
Substituting the value for $\frac{\partial z_i}{\partial\beta_i}$ we obtain
$$ \frac{\partial\mathcal L}{\partial\beta_i} = a_i\frac{\partial\mathcal L}{\partial z_i}\ \ \mbox{if}\ a_i \le 0,\ \ \mbox{and} \ \ 0 \ \ \mbox{otherwise} $$
which is then used to update $\beta_i$ using $\beta_i\rightarrow\beta_i - \eta\frac{\partial\mathcal L}{\partial\beta_i}$.
Once training is complete, the PreLU based DLN network ends up with a different value of $\beta_i$ at each neuron, which increases the flexibility of the network at the cost of an increase in the number of parameters.
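A minimal NumPy sketch of the PReLU forward pass and of the two gradients used by Backprop, following the equations above (the array shapes and the upstream gradient `dL_dz` are illustrative):

```python
import numpy as np

def prelu_forward(a, beta):
    # z = a if a > 0, else beta * a  (one beta per neuron, broadcast across the batch)
    return np.where(a > 0, a, beta * a)

def prelu_backward(a, beta, dL_dz):
    dL_da = dL_dz * np.where(a > 0, 1.0, beta)                   # gradient w.r.t. the pre-activation a
    dL_dbeta = np.sum(dL_dz * np.where(a <= 0, a, 0.0), axis=0)  # dz/dbeta = a when a <= 0, else 0
    return dL_da, dL_dbeta

a = np.array([[-1.0, 2.0], [0.5, -3.0]])   # pre-activations: 2 samples, 2 neurons
beta = np.array([0.1, 0.2])                # one learnable beta per neuron
dL_dz = np.ones_like(a)                    # upstream gradients (illustrative)
z = prelu_forward(a, beta)
dL_da, dL_dbeta = prelu_backward(a, beta, dL_dz)
print(z)
print(dL_dbeta)                            # used in the update beta <- beta - eta * dL_dbeta
```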
nb_setup.images_hconcat(["DSTMAA_images/maxout.png"], width=600)
Generalizes Leaky ReLU.
$$ z'_i = \max\left(c\big[\sum_j w_{ij}z_j +b_i\big],\ \sum_j w_{ij}z_j +b_i\right) $$
We may allow the two hyperplanes to be independent with their own set of parameters, as shown in the Figure above.
In practice, the DLN weight parameters are initialized with random values drawn from Gaussian or Uniform distributions and the following rules are used:
Gaussian Initialization: If the weight is between layers with $n_{in}$ input neurons and $n_{out}$ output neurons, then it is initialized using a Gaussian random distribution with mean zero and standard deviation $\sqrt{2\over n_{in}+n_{out}}$.
Uniform Initialization: In the same configuration as above, the weights should be initialized using a Uniform distribution between $-r$ and $r$, where $r = \sqrt{6\over n_{in}+n_{out}}$.
When using the ReLU or its variants, these rules have to be modified slightly:
Gaussian Initialization: If the weight is between layers with $n_{in}$ input neurons and $n_{out}$ output neurons, then it is initialized using a Gaussian random distribution with mean zero and standard deviation $\sqrt{4\over n_{in}+n_{out}}$.
Uniform Initialization: In the same configuration as above, the weights should be initialized using a Uniform distribution between $-r$ and $r$, where $r = \sqrt{12\over n_{in}+n_{out}}$.
The reasoning behind scaling down the initialization values as the number of incident weights increases is to prevent saturation of the node activations during the forward pass of the Backprop algorithm, as well as large values of the gradients during backward pass.
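A minimal NumPy sketch of these initialization rules; `n_in` and `n_out` denote the fan-in and fan-out of a layer, and the weight-matrix shape follows the $w_{ij}$ convention used above (rows index the layer's own nodes):

```python
import numpy as np
rng = np.random.default_rng(0)

def gaussian_init(n_in, n_out, relu=False):
    # std = sqrt(2/(n_in+n_out)) for sigmoid/tanh layers, sqrt(4/(n_in+n_out)) for ReLU layers
    c = 4.0 if relu else 2.0
    return rng.normal(0.0, np.sqrt(c/(n_in + n_out)), size=(n_out, n_in))

def uniform_init(n_in, n_out, relu=False):
    # r = sqrt(6/(n_in+n_out)) for sigmoid/tanh layers, sqrt(12/(n_in+n_out)) for ReLU layers
    r = np.sqrt((12.0 if relu else 6.0)/(n_in + n_out))
    return rng.uniform(-r, r, size=(n_out, n_in))

W = gaussian_init(784, 100)     # e.g. the first layer of the MNIST network used later
print(W.std())                  # close to sqrt(2/884), roughly 0.048
```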
nb_setup.images_hconcat(["DSTMAA_images/data_preprocessing.png"], width=600)
Centering: This is also sometimes called Mean Subtraction, and is the most common form of preprocessing. Given an input dataset consisting of $M$ vectors $X(m) = (x_1(m),...,x_N(m)), m = 1,...,M$, it consists of subtracting the mean across each individual input component $x_i, 1\leq i\leq N$ such that $$ x_i(m) \leftarrow x_i(m) - \frac{\sum_{s=1}^{M}x_i(s)}{M},\ \ 1\leq i\leq N, 1\le m\le M $$
Scaling: After the data has been centered, it can be scaled in one of two ways:
By Normalizing each dimension so that the min and max along each axis are -1 and +1 respectively.
In general Scaling helps optimization because it balances out the rate at which the weights connected to the input nodes learn.
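A minimal sketch of centering and scaling an input matrix `X` of shape (M, N). The min/max normalization to [-1, +1] follows the bullet above; dividing each centered dimension by its standard deviation is shown as an assumed alternative scaling option:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(5, 10, size=(100, 3))     # illustrative raw inputs, all positive

# Centering (mean subtraction) along each input dimension
Xc = X - X.mean(axis=0)

# Scaling option 1: normalize each dimension so its min and max are -1 and +1
Xn = 2*(X - X.min(axis=0))/(X.max(axis=0) - X.min(axis=0)) - 1

# Scaling option 2 (assumed): divide each centered dimension by its standard deviation
Xs = Xc / Xc.std(axis=0)

print(Xn.min(axis=0), Xn.max(axis=0))
print(Xs.mean(axis=0).round(3), Xs.std(axis=0).round(3))
```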
Recall that for a K-ary Linear Classifier, the parameter update equation is given by:
$$ w_{kj} \leftarrow w_{kj} - \eta x_j(y_k-t_k),\ \ 1\le k\le K,\ \ 1\le j\le N $$
If the training sample is such that $t_q = 1$ and $t_k = 0, k\ne q$, then the update becomes:
$$ w_{qj} \leftarrow w_{qj} - \eta x_j(y_q-1) $$
and
$$ w_{kj} \leftarrow w_{kj} - \eta x_j y_k,\ \ k\ne q $$
Let's assume that the input data is not centered, so that $x_j\ge 0, j=1,...,N$. Since $0\le y_k\le 1$ it follows that
$$ \Delta w_{kj} = -\eta x_jy_k <0, \quad k\ne q $$
and
$$ \Delta w_{qj} = -\eta x_j(y_q - 1) > 0 $$
i.e., the update results in all the weights moving in the same direction, except for one. This is shown graphically in the Figure above, in which the system is trying to move in the direction of the blue arrow, which is the quickest path to the minimum. However, if the input data is not centered, it is forced to move in a zig-zag fashion, as shown by the red curve. The zig-zag motion arises because all the parameters move in the same direction at each step, due to the lack of zero-centering in the input data.
nb_setup.images_hconcat(["DSTMAA_images/zero_centering_helps.png"], width=400)
Normalization applied to the hidden layers.
nb_setup.images_hconcat(["DSTMAA_images/batch_normalization.png"], width=600)
Higher learning rates: In a non-normalized network, a large learning rate can lead to oscillations and cause the loss function to increase rather than decrease.
Better Gradient Propagation through the network, enabling DLNs with more hidden layers.
Reduces strong dependencies on the parameter initialization values.
Helps to regularize the model.
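A minimal NumPy sketch of the Batch Normalization forward pass during training; the learnable scale `gamma` and shift `beta` and the small constant `eps` follow the standard formulation, and the mini-batch `A` is illustrative:

```python
import numpy as np

def batchnorm_forward(A, gamma, beta, eps=1e-5):
    # A: pre-activations of one hidden layer for a mini-batch, shape (B, P)
    mu = A.mean(axis=0)                     # per-node batch mean
    var = A.var(axis=0)                     # per-node batch variance
    A_hat = (A - mu) / np.sqrt(var + eps)   # normalize to zero mean and unit variance
    return gamma * A_hat + beta             # learnable scale and shift

rng = np.random.default_rng(0)
A = rng.normal(3.0, 5.0, size=(32, 100))    # a mini-batch of 32 vectors, layer width 100
out = batchnorm_forward(A, gamma=np.ones(100), beta=np.zeros(100))
print(out.mean(axis=0)[:3].round(3), out.std(axis=0)[:3].round(3))
```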
## Under and Over-fitting
nb_setup.images_hconcat(["DSTMAA_images/underoverfitting.png"], width=600)
Early Stopping
L1 Regularization
L2 Regularization
Dropout Regularization
Training Data Augmentation
Batch Normalization
nb_setup.images_hconcat(["DSTMAA_images/early_stopping.png"], width=600)
L2 Regularization is a commonly used technique in ML systems and is also sometimes referred to as "Weight Decay". It works by adding a quadratic term to the Cross Entropy Loss Function $\mathcal L$, called the Regularization Term, which results in a new Loss Function $\mathcal L_R$ given by:
\begin{equation} \mathcal L_R = {\mathcal L} + \frac{\lambda}{2} \sum_{r=1}^{R+1} \sum_{j=1}^{P^{r-1}} \sum_{i=1}^{P^r} (w_{ij}^{(r)})^2 \end{equation}
L2 Regularization also leads to more "diffuse" weight parameters; in other words, it encourages the network to use all of its inputs a little rather than some of its inputs a lot.
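To see why this is called Weight Decay, take the gradient of $\mathcal L_R$ with respect to a single weight (a standard one-line derivation, using the notation above): the penalty term shrinks every weight by a multiplicative factor on each update,
$$ \frac{\partial \mathcal L_R}{\partial w_{ij}^{(r)}} = \frac{\partial \mathcal L}{\partial w_{ij}^{(r)}} + \lambda w_{ij}^{(r)} \quad\Longrightarrow\quad w_{ij}^{(r)} \leftarrow (1-\eta\lambda)\,w_{ij}^{(r)} - \eta\,\frac{\partial \mathcal L}{\partial w_{ij}^{(r)}} $$
which is the multiplicative, weight-proportional reduction discussed below.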
L1 Regularization uses a Regularization Function which is the sum of the absolute value of all the weights in DLN, resulting in the following loss function ($\mathcal L$ is the usual Cross Entropy loss):
$$ \mathcal L_R = \mathcal L + {\lambda} \sum_{r=1}^{R+1} \sum_{j=1}^{P^{r-1}} \sum_{i=1}^{P^r} |w_{ij}^{(r)}| $$
At a high level L1 Regularization is similar to L2 Regularization, since it also leads to smaller weights.
Both L1 and L2 Regularizations lead to a reduction in the weights with each iteration. However the way the weights drop is different:
In L2 Regularization the weight reduction is multiplicative and proportional to the value of the weight, so it is faster for large weights and decelerates as the weights get smaller.
In L1 Regularization on the other hand, the weights are reduced by a fixed amount in every iteration, irrespective of the value of the weight. Hence for larger weights L2 Regularization is faster than L1, while for smaller weights the reverse is true.
As a result, L1 Regularization leads to DLNs in which the weights of most of the connections tend towards zero, with a few larger weights left over. The DLN that results after the application of L1 Regularization is said to be "sparse".
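In Keras (the library used in the examples below), both penalties can be attached to a layer through `keras.regularizers`; a minimal sketch, mirroring the breast-cancer network further down, with the penalty strengths chosen arbitrarily:

```python
from keras.models import Sequential
from keras.layers import Dense
from keras import regularizers

model = Sequential()
# L2 (Weight Decay) penalty on the first hidden layer's weights
model.add(Dense(32, activation='relu', input_dim=9,
                kernel_regularizer=regularizers.l2(0.01)))
# L1 penalty on the second hidden layer's weights -- encourages sparse weights
model.add(Dense(32, activation='relu',
                kernel_regularizer=regularizers.l1(0.01)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
```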
nb_setup.images_hconcat(["DSTMAA_images/dropout.png"], width=600)
The basic idea behind Dropout is to run each iteration of the Backprop algorithm on randomly modified versions of the original DLN. The random modifications are carried out to the topology of the DLN using the following rule: on each iteration, every hidden node (together with its incoming and outgoing connections) is retained with some probability $p$ and dropped otherwise.
After the Backprop is complete, we have effectively trained a collection of up to $2^s$ thinned DLNs all of which share the same weights, where $s$ is the total number of hidden nodes in the DLN.
In order to test the network, strictly speaking we should average the results from all of these thinned models; however, a simple approximate averaging method works quite well.
The main idea is to use the complete DLN as the test network.
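A minimal NumPy sketch of the "inverted" dropout variant for one hidden layer during training; scaling the surviving activations by $1/p$ is what allows the complete DLN to be used unchanged as the test network, and the keep probability `p` and the activations `Z` are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(Z, p=0.5, train=True):
    if not train:
        return Z                            # test time: the complete network is used as-is
    mask = (rng.random(Z.shape) < p) / p    # keep each node with probability p, scale survivors by 1/p
    return Z * mask

Z = np.ones((4, 8))                         # illustrative hidden-layer activations
print(dropout_forward(Z))                   # roughly half the entries zeroed, the rest doubled
```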
nb_setup.images_hconcat(["DSTMAA_images/bagging.png"], width=500)
nb_setup.images_hconcat(["DSTMAA_images/TensorFlow_playground.png"], width=600)
import pandas as pd
## Read in the data set
data = pd.read_csv("DSTMAA_data/BreastCancer.csv")
data.head()
| | Id | Cl.thickness | Cell.size | Cell.shape | Marg.adhesion | Epith.c.size | Bare.nuclei | Bl.cromatin | Normal.nucleoli | Mitoses | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1000025 | 5 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | benign |
| 1 | 1002945 | 5 | 4 | 4 | 5 | 7 | 10 | 3 | 2 | 1 | benign |
| 2 | 1015425 | 3 | 1 | 1 | 1 | 2 | 2 | 3 | 1 | 1 | benign |
| 3 | 1016277 | 6 | 8 | 8 | 1 | 3 | 4 | 3 | 7 | 1 | benign |
| 4 | 1017023 | 4 | 1 | 1 | 3 | 2 | 1 | 3 | 1 | 1 | benign |
x = data.loc[:,'Cl.thickness':'Mitoses']
print(x.head())
y = data.loc[:,'Class']
print(y.head())
Cl.thickness Cell.size Cell.shape ... Bl.cromatin Normal.nucleoli Mitoses 0 5 1 1 ... 3 1 1 1 5 4 4 ... 3 2 1 2 3 1 1 ... 3 1 1 3 6 8 8 ... 3 7 1 4 4 1 1 ... 3 1 1 [5 rows x 9 columns] 0 benign 1 benign 2 benign 3 benign 4 benign Name: Class, dtype: object
## Convert the class variable into binary numeric
ynum = zeros((len(x),1))
for j in arange(len(y)):
    if y[j]=="malignant":
        ynum[j]=1
ynum[:10]
array([[0.], [0.], [0.], [0.], [0.], [1.], [0.], [0.], [0.], [0.]])
## Make label data one-hot (1 = malignant)
from keras import utils
y.labels = utils.to_categorical(ynum, num_classes=2)
#x = x.as_matrix()
print(y.labels[:10])
print(shape(x))
print(shape(y.labels))
print(shape(ynum))
Using TensorFlow backend.
[[1. 0.] [1. 0.] [1. 0.] [1. 0.] [1. 0.] [0. 1.] [1. 0.] [1. 0.] [1. 0.] [1. 0.]] (683, 9) (683, 2) (683, 1)
## Define the neural net and compile it
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=9))
model.add(Dense(32, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
## Fit/train the model (x,y need to be matrices)
model.fit(x, ynum, epochs=25, batch_size=32, verbose=2, validation_split=0.3)
Train on 478 samples, validate on 205 samples Epoch 1/25 - 2s - loss: 0.5969 - accuracy: 0.5439 - val_loss: 0.5223 - val_accuracy: 0.9073 Epoch 2/25 - 0s - loss: 0.4930 - accuracy: 0.8473 - val_loss: 0.4246 - val_accuracy: 0.9610 Epoch 3/25 - 0s - loss: 0.4308 - accuracy: 0.8912 - val_loss: 0.3669 - val_accuracy: 0.9610 Epoch 4/25 - 0s - loss: 0.3815 - accuracy: 0.9121 - val_loss: 0.3117 - val_accuracy: 0.9659 Epoch 5/25 - 0s - loss: 0.3443 - accuracy: 0.9121 - val_loss: 0.2514 - val_accuracy: 0.9854 Epoch 6/25 - 0s - loss: 0.3159 - accuracy: 0.9121 - val_loss: 0.2656 - val_accuracy: 0.9561 Epoch 7/25 - 0s - loss: 0.2954 - accuracy: 0.9121 - val_loss: 0.2374 - val_accuracy: 0.9610 Epoch 8/25 - 0s - loss: 0.2755 - accuracy: 0.9310 - val_loss: 0.1803 - val_accuracy: 0.9951 Epoch 9/25 - 0s - loss: 0.2575 - accuracy: 0.9226 - val_loss: 0.1933 - val_accuracy: 0.9854 Epoch 10/25 - 0s - loss: 0.2369 - accuracy: 0.9351 - val_loss: 0.1463 - val_accuracy: 0.9902 Epoch 11/25 - 0s - loss: 0.2223 - accuracy: 0.9414 - val_loss: 0.1606 - val_accuracy: 0.9902 Epoch 12/25 - 0s - loss: 0.2072 - accuracy: 0.9414 - val_loss: 0.1613 - val_accuracy: 0.9854 Epoch 13/25 - 0s - loss: 0.2144 - accuracy: 0.9519 - val_loss: 0.1073 - val_accuracy: 0.9951 Epoch 14/25 - 0s - loss: 0.1904 - accuracy: 0.9519 - val_loss: 0.1032 - val_accuracy: 0.9951 Epoch 15/25 - 0s - loss: 0.1771 - accuracy: 0.9540 - val_loss: 0.1048 - val_accuracy: 0.9951 Epoch 16/25 - 0s - loss: 0.1711 - accuracy: 0.9519 - val_loss: 0.0820 - val_accuracy: 0.9951 Epoch 17/25 - 0s - loss: 0.1614 - accuracy: 0.9603 - val_loss: 0.0964 - val_accuracy: 0.9902 Epoch 18/25 - 0s - loss: 0.1603 - accuracy: 0.9477 - val_loss: 0.0855 - val_accuracy: 0.9902 Epoch 19/25 - 0s - loss: 0.1542 - accuracy: 0.9582 - val_loss: 0.0762 - val_accuracy: 0.9951 Epoch 20/25 - 0s - loss: 0.1462 - accuracy: 0.9603 - val_loss: 0.0636 - val_accuracy: 0.9902 Epoch 21/25 - 0s - loss: 0.1321 - accuracy: 0.9707 - val_loss: 0.0679 - val_accuracy: 0.9902 Epoch 22/25 - 0s - loss: 0.1379 - accuracy: 0.9623 - val_loss: 0.0612 - val_accuracy: 0.9902 Epoch 23/25 - 0s - loss: 0.1258 - accuracy: 0.9644 - val_loss: 0.0631 - val_accuracy: 0.9902 Epoch 24/25 - 0s - loss: 0.1250 - accuracy: 0.9665 - val_loss: 0.0523 - val_accuracy: 0.9902 Epoch 25/25 - 0s - loss: 0.1210 - accuracy: 0.9644 - val_loss: 0.0564 - val_accuracy: 0.9902
<keras.callbacks.callbacks.History at 0x7f24f71ea7f0>
## Accuracy
yhat = model.predict_classes(x, batch_size=32)
acc = sum(yhat==ynum)
print("Accuracy = ",acc/len(ynum))
## Confusion matrix
from sklearn.metrics import confusion_matrix
confusion_matrix(yhat,ynum)
Accuracy = 0.9765739385065886
array([[431, 3], [ 13, 236]])
nb_setup.images_hconcat(["DSTMAA_images/MNIST.png"], width=800)
## Read in the data set
train = pd.read_csv("DSTMAA_data/train.csv", header=None)
test = pd.read_csv("DSTMAA_data/test.csv", header=None)
print(shape(train))
print(shape(test))
(60000, 785) (10000, 785)
train.shape
(60000, 785)
## Reformat the data
X_train = train.loc[:,:783]
Y_train = train.loc[:,784]
print(shape(X_train))
print(shape(Y_train))
X_test = test.loc[:,:783]
Y_test = test.loc[:,784]
print(shape(X_test))
print(shape(Y_test))
y.labels = utils.to_categorical(Y_train, num_classes=10)
print(shape(y.labels))
print(y.labels[1:5,:])
print(Y_train[1:5])
(60000, 784) (60000,) (10000, 784) (10000,) (60000, 10) [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.] [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]] 1 3 2 0 3 0 4 2 Name: 784, dtype: int64
## Define the neural net and compile it
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import SGD
from tensorflow.keras.utils import plot_model
data_dim = shape(X_train)[1]
model = Sequential([
    Dense(100, input_shape=(784,)),
    Activation('sigmoid'),
    Dense(100),
    Activation('sigmoid'),
    Dense(100),
    Activation('sigmoid'),
    Dense(100),
    Activation('sigmoid'),
    Dense(10),
    Activation('softmax'),
])
#model = Sequential()
#model.add(Dense(100, activation='sigmoid', input_dim=data_dim))
#model.add(Dropout(0.25))
#model.add(Dense(100, activation='sigmoid'))
#model.add(Dropout(0.25))
#model.add(Dense(100, activation='sigmoid'))
#model.add(Dropout(0.25))
#model.add(Dense(100, activation='sigmoid'))
#model.add(Dropout(0.25))
#model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
Model: "sequential_5" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= dense_20 (Dense) (None, 100) 78500 _________________________________________________________________ activation_16 (Activation) (None, 100) 0 _________________________________________________________________ dense_21 (Dense) (None, 100) 10100 _________________________________________________________________ activation_17 (Activation) (None, 100) 0 _________________________________________________________________ dense_22 (Dense) (None, 100) 10100 _________________________________________________________________ activation_18 (Activation) (None, 100) 0 _________________________________________________________________ dense_23 (Dense) (None, 100) 10100 _________________________________________________________________ activation_19 (Activation) (None, 100) 0 _________________________________________________________________ dense_24 (Dense) (None, 10) 1010 _________________________________________________________________ activation_20 (Activation) (None, 10) 0 ================================================================= Total params: 109,810 Trainable params: 109,810 Non-trainable params: 0 _________________________________________________________________
plot_model(model)
## Fit/train the model (x,y need to be matrices)
model.fit(X_train, y.labels, epochs=10, batch_size=32, verbose=2, validation_split=0.2)
Train on 48000 samples, validate on 12000 samples Epoch 1/10 - 6s - loss: 0.2277 - accuracy: 0.9320 - val_loss: 0.2233 - val_accuracy: 0.9314 Epoch 2/10 - 5s - loss: 0.2188 - accuracy: 0.9334 - val_loss: 0.2216 - val_accuracy: 0.9336 Epoch 3/10 - 6s - loss: 0.2135 - accuracy: 0.9353 - val_loss: 0.2231 - val_accuracy: 0.9333 Epoch 4/10 - 6s - loss: 0.2106 - accuracy: 0.9360 - val_loss: 0.2071 - val_accuracy: 0.9379 Epoch 5/10 - 5s - loss: 0.2034 - accuracy: 0.9382 - val_loss: 0.2023 - val_accuracy: 0.9375 Epoch 6/10 - 5s - loss: 0.1954 - accuracy: 0.9406 - val_loss: 0.2177 - val_accuracy: 0.9348 Epoch 7/10 - 5s - loss: 0.1915 - accuracy: 0.9423 - val_loss: 0.1890 - val_accuracy: 0.9423 Epoch 8/10 - 6s - loss: 0.1856 - accuracy: 0.9434 - val_loss: 0.1951 - val_accuracy: 0.9427 Epoch 9/10 - 5s - loss: 0.1794 - accuracy: 0.9460 - val_loss: 0.1944 - val_accuracy: 0.9414 Epoch 10/10 - 6s - loss: 0.1786 - accuracy: 0.9464 - val_loss: 0.1918 - val_accuracy: 0.9427
<keras.callbacks.callbacks.History at 0x7f23dc93af98>
## In Sample
yhat = model.predict_classes(X_train, batch_size=32)
## Confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(yhat,Y_train)
print(" ")
print(cm)
##
acc = sum(diag(cm))/len(Y_train)
print("Accuracy = ",acc)
[[5758    1   23    4   13   20   28   16   18   25]
 [   1 6608   35   25   18    9   10   30  118   17]
 [  24   33 5660  103   40   28   24   41   63   11]
 [  10   24   35 5656    0  130    2   12   93   50]
 [   4    9   27    2 5557   11   12   35   31  253]
 [  28   15   17  143    2 5017   38    6   70   30]
 [  61    4   54   15   81   86 5784    1   55    3]
 [   1   16   51   72    7    7    0 6016   12   63]
 [  27   23   54   70    8   76   20    9 5301   40]
 [   9    9    2   41  116   37    0   99   90 5457]]
Accuracy =  0.9469
## Out of Sample
yhat = model.predict_classes(X_test, batch_size=32)
## Confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(yhat,Y_test)
print(" ")
print(cm)
##
acc = sum(diag(cm))/len(Y_test)
print("Accuracy = ",acc)
[[ 960    0    9    0    1    7    9    1    3    4]
 [   0 1119    3    1    1    4    3   10   11    8]
 [   1    3  968   19    5    1    3   18    5    1]
 [   2    4   14  941    0   31    2    3   13    7]
 [   0    1    3    0  940    3    1    1    7   44]
 [   2    0    4   21    0  811    6    0   16    8]
 [  13    3   10    1   17   14  928    0    8    0]
 [   1    2    9   10    2    3    0  978    7    6]
 [   1    2   11   13    2   12    5    1  894   10]
 [   0    1    1    4   14    6    1   16   10  921]]
Accuracy =  0.946
Slides on Image Processing using Deep Learning by Subir Varma: https://drive.google.com/file/d/19xPCf2M66Dws06XxXLgEhuqJFiKmUEJ1/view?usp=sharing
Image processing with transfer learning: https://drive.google.com/file/d/1D3Cg288wVY-e5BHuDNHec3wd-LStflMr/view?usp=sharing. From: Practical Deep Learning for Cloud and Mobile (O'Reilly) by Anirudh Koul, Siddha Ganju & Meher Kasam.
Recognizing Images using NNs: https://drive.google.com/file/d/1OMQOZuEmnw0Kxvo5O1C9kf6mbQnPXYvj/view?usp=sharing. (Build your first Convolutional Neural Network to recognize images: A step-by-step guide to building your own image recognition software with Convolutional Neural Networks using Keras on CIFAR-10 images! by Joseph Lee Wei En.): https://medium.com/intuitive-deep-learning/build-your-first-convolutional-neural-network-to-recognize-images-84b9c78fe0ce
See: Hutchinson, Lo, and Poggio (1994).
import math
from scipy.stats import norm

def BSM(S,K,T,sig,rf,dv,cp):    #cp = {+1.0 (calls), -1.0 (puts)}
    d1 = (math.log(S/K)+(rf-dv+0.5*sig**2)*T)/(sig*math.sqrt(T))
    d2 = d1 - sig*math.sqrt(T)
    return cp*S*math.exp(-dv*T)*norm.cdf(d1*cp) - cp*K*math.exp(-rf*T)*norm.cdf(d2*cp)
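As a quick check of the function just defined, here is an illustrative at-the-money call (the parameter values are arbitrary, not taken from the dataset):

```python
# S=100, K=100, T=1 year, sigma=20%, r=5%, no dividends, cp=+1 for a call
print(BSM(100, 100, 1.0, 0.20, 0.05, 0.0, 1.0))   # approximately 10.45
```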
df = pd.read_csv('DSTMAA_data/BS_training.csv')
$C$ is homogeneous of degree one, so $$ aC(S,K) = C(aS,aK) $$ This means we can normalize spot and call prices and remove a variable by dividing by $K$. $$ \frac{C(S,K)}{K} = C(S/K,1) $$
df['Stock Price'] = df['Stock Price']/df['Strike Price']
df['Call Price'] = df['Call Price'] /df['Strike Price']
n = 300000
n_train = (int)(0.8 * n)
train = df[0:n_train]
X_train = train[['Stock Price', 'Maturity', 'Dividends', 'Volatility', 'Risk-free']].values
y_train = train['Call Price'].values
test = df[n_train+1:n]
X_test = test[['Stock Price', 'Maturity', 'Dividends', 'Volatility', 'Risk-free']].values
y_test = test['Call Price'].values
#Import libraries
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, LeakyReLU
from keras import backend
def custom_activation(x):
    return backend.exp(x)
nodes = 120
model = Sequential()
model.add(Dense(nodes, input_dim=X_train.shape[1]))
#model.add("relu")
model.add(Dropout(0.25))
model.add(Dense(nodes, activation='elu'))
model.add(Dropout(0.25))
model.add(Dense(nodes, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(nodes, activation='elu'))
model.add(Dropout(0.25))
model.add(Dense(1))
model.add(Activation(custom_activation))
model.compile(loss='mse',optimizer='rmsprop')
model.fit(X_train, y_train, batch_size=64, epochs=10, validation_split=0.1, verbose=2)
Train on 216000 samples, validate on 24000 samples Epoch 1/10 - 10s - loss: 0.0053 - val_loss: 0.0022 Epoch 2/10 - 10s - loss: 0.0015 - val_loss: 0.0012 Epoch 3/10 - 10s - loss: 0.0011 - val_loss: 1.5032e-04 Epoch 4/10 - 10s - loss: 9.4730e-04 - val_loss: 1.1840e-04 Epoch 5/10 - 10s - loss: 8.2619e-04 - val_loss: 3.4301e-04 Epoch 6/10 - 10s - loss: 7.5072e-04 - val_loss: 5.3885e-04 Epoch 7/10 - 10s - loss: 6.9686e-04 - val_loss: 6.3596e-04 Epoch 8/10 - 10s - loss: 6.6734e-04 - val_loss: 3.9628e-04 Epoch 9/10 - 10s - loss: 6.4379e-04 - val_loss: 3.7819e-04 Epoch 10/10 - 10s - loss: 6.2233e-04 - val_loss: 2.2964e-04
<keras.callbacks.callbacks.History at 0x7f23db54ff98>
def CheckAccuracy(y,y_hat):
    stats = dict()
    stats['diff'] = y - y_hat
    stats['mse'] = mean(stats['diff']**2)
    print("Mean Squared Error: ", stats['mse'])
    stats['rmse'] = sqrt(stats['mse'])
    print("Root Mean Squared Error: ", stats['rmse'])
    stats['mae'] = mean(abs(stats['diff']))
    print("Mean Absolute Error: ", stats['mae'])
    stats['mpe'] = sqrt(stats['mse'])/mean(y)
    print("Mean Percent Error: ", stats['mpe'])

    #plots
    mpl.rcParams['agg.path.chunksize'] = 100000
    figure(figsize=(10,3))
    plt.scatter(y, y_hat, color='black', linewidth=0.3, alpha=0.4, s=0.5)
    plt.xlabel('Actual Price', fontsize=20, fontname='Times New Roman')
    plt.ylabel('Predicted Price', fontsize=20, fontname='Times New Roman')
    plt.show()

    figure(figsize=(10,3))
    plt.hist(stats['diff'], bins=50, edgecolor='black', color='white')
    plt.xlabel('Diff')
    plt.ylabel('Density')
    plt.show()

    return stats
y_train_hat = model.predict(X_train)
#reduce dim (240000,1) -> (240000,) to match y_train's dim
y_train_hat = squeeze(y_train_hat)
CheckAccuracy(y_train, y_train_hat)
Mean Squared Error:  0.00023008994372701063
Root Mean Squared Error:  0.015168715955116657
Mean Absolute Error:  0.011827044126932673
Mean Percent Error:  0.05670397821662379
{'diff': array([0.01986485, 0.0093275 , 0.0173008 , ..., 0.01663176, 0.0257388 , 0.00174695]), 'mae': 0.011827044126932673, 'mpe': 0.05670397821662379, 'mse': 0.00023008994372701063, 'rmse': 0.015168715955116657}
y_test_hat = model.predict(X_test)
y_test_hat = squeeze(y_test_hat)
test_stats = CheckAccuracy(y_test, y_test_hat)
Mean Squared Error:  0.00023010178562453136
Root Mean Squared Error:  0.015169106289578545
Mean Absolute Error:  0.011837113037171105
Mean Percent Error:  0.0567868417818699
A Random Forest uses several decision trees to make hypotheses about regions within subsamples of the data, then makes predictions based on the majority vote of these trees. This safeguards against overfitting/memorization of the training data.
n = 300000
n_train = (int)(0.8 * n)
train = df[0:n_train]
X_train = train[['Stock Price', 'Maturity', 'Dividends', 'Volatility', 'Risk-free']].values
y_train = train['Call Price'].values
test = df[n_train+1:n]
X_test = test[['Stock Price', 'Maturity', 'Dividends', 'Volatility', 'Risk-free']].values
y_test = test['Call Price'].values
def CheckAccuracy(y,y_hat):
    stats = dict()
    stats['diff'] = y - y_hat
    stats['mse'] = mean(stats['diff']**2)
    print("Mean Squared Error: ", stats['mse'])
    stats['rmse'] = sqrt(stats['mse'])
    print("Root Mean Squared Error: ", stats['rmse'])
    stats['mae'] = mean(abs(stats['diff']))
    print("Mean Absolute Error: ", stats['mae'])
    stats['mpe'] = sqrt(stats['mse'])/mean(y)
    print("Mean Percent Error: ", stats['mpe'])

    #plots
    mpl.rcParams['agg.path.chunksize'] = 100000
    #figure(figsize=(14,10))
    plt.scatter(y, y_hat, color='black', linewidth=0.3, alpha=0.4, s=0.5)
    plt.xlabel('Actual Price', fontsize=20, fontname='Times New Roman')
    plt.ylabel('Predicted Price', fontsize=20, fontname='Times New Roman')
    plt.show()

    #figure(figsize=(14,10))
    plt.hist(stats['diff'], bins=50, edgecolor='black', color='white')
    plt.xlabel('Diff')
    plt.ylabel('Density')
    plt.show()

    return stats
from sklearn.ensemble import RandomForestRegressor
forest = RandomForestRegressor()
forest = forest.fit(X_train, y_train)
y_test_hat = forest.predict(X_test)
stats = CheckAccuracy(y_test, y_test_hat)
Mean Squared Error:  3.432145217739371e-05
Root Mean Squared Error:  0.00585845134633665
Mean Absolute Error:  0.004268844808400679
Mean Percent Error:  0.021931611746946578
Deep Learning specifically, and machine learning more generally, have been criticized by econometricians as being weaker than causal models; that is, correlation is not causality. Here is an article about a recent development in taking NNs in the direction of causal models: https://medium.com/mit-technology-review/deep-learning-could-reveal-why-the-world-works-the-way-it-does-9be8b5fbfe4f; https://drive.google.com/file/d/1r4UPFQQv-vutQXdlmpmCyB9_nO14FZMe/view?usp=sharing