
Convolutional Neural Networks (CNN) - Object Recognition¶

Imports¶

In [1]:
from numpy.random import seed
seed(888)

#from tensorflow import set_random_seed
#set_random_seed(4112)
import tensorflow
tensorflow.random.set_seed(112)
In [2]:
import os
import numpy as np
import itertools

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import cifar10  # importing the dataset

from tensorflow.keras.models import Sequential  # to define the model and its layers
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten

from sklearn.metrics import confusion_matrix

# To explore the images
from IPython.display import display
from tensorflow.keras.preprocessing.image import array_to_img

from tensorflow.keras.utils import to_categorical

import matplotlib.pyplot as plt
%matplotlib inline
In [3]:
import pandas as pd

We are using TensorFlow as the backend for Keras.

Get the Dataset¶

CIFAR-10 is an established computer-vision dataset used for object recognition. It is a subset of the 80 million tiny images dataset and consists of 60,000 32x32 color images, each containing one of 10 object classes, with 6,000 images per class. The data is split into 50,000 training images and 10,000 test images. It was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton, and is widely used to train image classification models.


In [4]:
# Getting the dataset as a Tuple

(x_train_all, y_train_all), (x_test, y_test) = cifar10.load_data()

Constants¶

In [5]:
LABEL_NAMES = ['airplane', 'automobile','bird','cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

Exploring the Data¶

Let's look at the first image in the dataset.

In [6]:
x_train_all.shape
Out[6]:
(50000, 32, 32, 3)
In [7]:
x_train_all[0]
Out[7]:
array([[[ 59,  62,  63],
        [ 43,  46,  45],
        [ 50,  48,  43],
        ...,
        [158, 132, 108],
        [152, 125, 102],
        [148, 124, 103]],

       [[ 16,  20,  20],
        [  0,   0,   0],
        [ 18,   8,   0],
        ...,
        [123,  88,  55],
        [119,  83,  50],
        [122,  87,  57]],

       [[ 25,  24,  21],
        [ 16,   7,   0],
        [ 49,  27,   8],
        ...,
        [118,  84,  50],
        [120,  84,  50],
        [109,  73,  42]],

       ...,

       [[208, 170,  96],
        [201, 153,  34],
        [198, 161,  26],
        ...,
        [160, 133,  70],
        [ 56,  31,   7],
        [ 53,  34,  20]],

       [[180, 139,  96],
        [173, 123,  42],
        [186, 144,  30],
        ...,
        [184, 148,  94],
        [ 97,  62,  34],
        [ 83,  53,  34]],

       [[177, 144, 116],
        [168, 129,  94],
        [179, 142,  87],
        ...,
        [216, 184, 140],
        [151, 118,  84],
        [123,  92,  72]]], dtype=uint8)
In [8]:
x_train_all[0].shape
Out[8]:
(32, 32, 3)

Using ipython to display the image¶

In [9]:
# To use the ipython display to view an image

pic = array_to_img(x_train_all[0])
display(pic)

Using Matplotlib to view the image¶

In [10]:
plt.imshow(x_train_all[0])
Out[10]:
<matplotlib.image.AxesImage at 0x15007a908>
In [11]:
# To check the label 
y_train_all.shape
Out[11]:
(50000, 1)
In [12]:
# Labels are integer class indices, e.g. index 1 corresponds to "automobile".
# y_train_all is a 2-dimensional NumPy array of shape (50000, 1); that is why we also include "[0]"

y_train_all[0][0]
Out[12]:
6
In [13]:
# Using the label names to get the actual names of the classes

LABEL_NAMES[y_train_all[0][0]]
Out[13]:
'frog'

The shape of the image¶

* 32, 32 are the height and the width of each image in pixels (both are 32 here)
* 3 is the number of color channels: Red, Green & Blue (RGB)

  • x_train_all.shape >>> (50000, 32, 32, 3)
    • this means we have 50,000 entries | each 32x32 pixels (height and width) | 3 color channels (RGB)
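To make the four axes concrete, here is a minimal sketch that uses a small synthetic array in place of x_train_all (the array `batch` and the painted pixel are illustrative assumptions, not part of the dataset). It shows how each axis is indexed: image, row, column, channel.

```python
import numpy as np

# Synthetic stand-in for x_train_all: 5 blank RGB images of 32 x 32 pixels.
batch = np.zeros((5, 32, 32, 3), dtype=np.uint8)

# Paint the pixel at row 10, column 20 of the first image pure red.
batch[0, 10, 20] = [255, 0, 0]

num_images, height, width, channels = batch.shape
first_image = batch[0]            # one image: shape (32, 32, 3)
red_channel = batch[0, :, :, 0]   # red channel of that image: shape (32, 32)

print(num_images, height, width, channels)
print(first_image.shape, red_channel.shape)
print(red_channel[10, 20])
```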
In [14]:
x_train_all.shape
Out[14]:
(50000, 32, 32, 3)
In [15]:
number_of_images, x, y, c = x_train_all.shape
print(f'Number of images = {number_of_images} \t| width = {x} \t| height = {y} \t| channels = {c}')
Number of images = 50000 	| width = 32 	| height = 32 	| channels = 3
In [16]:
x_test.shape
Out[16]:
(10000, 32, 32, 3)

Preprocess Data¶

* We need to preprocess our data so that it is easier to feed it to our neural network.¶

Scaling both x_train and x_test¶

In [17]:
x_train_all =x_train_all / 255.0
In [18]:
x_test =  x_test / 255.0
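As a small illustration of what this division does, the sketch below scales three hand-picked pixel values (the array `pixels` is an assumption chosen for the example). Dividing a uint8 array by 255.0 maps the range 0 to 255 onto 0 to 1 and yields a float64 array.

```python
import numpy as np

# Three example pixel intensities: black, a dark grey, and full brightness.
pixels = np.array([0, 51, 255], dtype=np.uint8)

# Dividing by a Python float promotes the result to float64.
scaled = pixels / 255.0

print(scaled.dtype)
print(scaled)
```

If memory matters, casting the result with `.astype(np.float32)` is a common option, since float32 takes half the space of float64.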
In [19]:
y_test
Out[19]:
array([[3],
       [8],
       [8],
       ...,
       [5],
       [1],
       [7]], dtype=uint8)

Creating categorical encoding for the "y" data¶

In [20]:
# 10 >>> simply means we have 10 classes like we already know (creating the encoding for 10 classes)
y_cat_train_all = to_categorical(y_train_all,10)
In [21]:
# 10 >>> simply means we have 10 classes like we already know (creating the encoding for 10 classes)
y_cat_test = to_categorical(y_test,10)
In [22]:
y_cat_train_all
Out[22]:
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 1.],
       [0., 0., 0., ..., 0., 0., 1.],
       ...,
       [0., 0., 0., ..., 0., 0., 1.],
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.]], dtype=float32)
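The effect of one-hot encoding can be reproduced in plain NumPy. This is a sketch of the idea only, not the Keras implementation of to_categorical, and the three labels are illustrative: each integer label selects one row of a 10 x 10 identity matrix.

```python
import numpy as np

# Illustrative integer labels with the same (n, 1) layout as y_train_all.
labels = np.array([[6], [9], [9]])
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype=np.float32)[labels.ravel()]

# argmax along the class axis recovers the original integer labels.
recovered = one_hot.argmax(axis=1)

print(one_hot.shape)
print(one_hot[0])
print(recovered)
```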

Creating the Validation dataset¶


For small datasets, a common split is:

* 60% for training
* 20% for validation
* 20% for testing

Only the final selected model gets to see the testing data. Because the model has never been tuned against these images, the test score gives a realistic impression of how it will perform on real-world data once deployed.

However, if the dataset is enormous:

* 1% is used for validation
* 1% is used for testing
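For comparison, this notebook ends up with 40,000 training, 10,000 validation, and 10,000 test images. The sketch below converts those counts into fractions; the helper `split_fractions` is a hypothetical name introduced for this example.

```python
def split_fractions(train, val, test):
    """Return each subset's share of the full dataset."""
    total = train + val + test
    counts = (("train", train), ("validation", val), ("test", test))
    return {name: count / total for name, count in counts}

# Subset sizes used in this notebook.
fractions = split_fractions(40_000, 10_000, 10_000)

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
```

The result is roughly a 67/17/17 split, slightly heavier on training data than the 60/20/20 rule of thumb.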
In [23]:
VALIDATION_SIZE = 10000
In [24]:
# VALIDATION_SIZE = 10,000 as defined above 

x_val = x_train_all[:VALIDATION_SIZE]
y_val_cat = y_cat_train_all[:VALIDATION_SIZE]
x_val.shape
Out[24]:
(10000, 32, 32, 3)
In [25]:
y_val_cat
Out[25]:
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 1.],
       [0., 0., 0., ..., 0., 0., 1.],
       ...,
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)

NEXT:

  • We create two NumPy arrays, x_train and y_cat_train, with shapes (40000, 32, 32, 3) and (40000, 10) respectively.
  • They contain the last 40,000 entries of x_train_all and y_cat_train_all respectively.
In [26]:
x_train = x_train_all[VALIDATION_SIZE:]
y_cat_train= y_cat_train_all[VALIDATION_SIZE:]
In [27]:
x_train.shape
Out[27]:
(40000, 32, 32, 3)
In [28]:
y_cat_train
Out[28]:
array([[0., 1., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 1.],
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.]], dtype=float32)


NOTE:¶

* FILTERS: The number of filters is usually chosen according to the dataset's complexity. The larger the images, the more varied they are, and the more classes you are trying to classify, the more filters you should use.

* The number of filters is typically a power of 2, for example 32. For more complex data, such as road signs, start with a higher filter count.

The default STRIDE value is 1 x 1 pixel.
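Stride and kernel size together determine how much a layer shrinks its input. The sketch below applies the standard output-size formula, assuming no padding (the Keras default, 'valid'); the helper `conv_output_size` is a hypothetical name, not a Keras function.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window along one axis."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# A 4x4 kernel with the default stride of 1 on a 32-pixel-wide image.
after_conv = conv_output_size(32, 4)

# A 2x2 max-pooling window moves in steps of 2 by default.
after_pool = conv_output_size(after_conv, 2, stride=2)

# The same 4x4 kernel with a stride of 2 shrinks the image much faster.
after_strided_conv = conv_output_size(32, 4, stride=2)

print(after_conv, after_pool, after_strided_conv)
```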

BUILDING THE MODEL¶

In [29]:
model = Sequential()

## ************* FIRST SET OF LAYERS *************************

# CONVOLUTIONAL LAYER
model.add(Conv2D(filters=32, kernel_size=(4,4),input_shape=(32, 32, 3), activation='relu',))
# POOLING LAYER
model.add(MaxPool2D(pool_size=(2, 2)))

## *************** SECOND SET OF LAYERS ***********************
# Each image has 32 x 32 x 3 = 3072 input values,
# so we add a second convolutional layer to handle this more complex structure.

# CONVOLUTIONAL LAYER (input_shape is only needed on the first layer)
model.add(Conv2D(filters=32, kernel_size=(4,4), activation='relu'))
# POOLING LAYER
model.add(MaxPool2D(pool_size=(2, 2)))

# FLATTEN THE 5 x 5 x 32 FEATURE MAPS INTO A VECTOR OF 800 VALUES BEFORE THE DENSE LAYERS
model.add(Flatten())

# 256 NEURONS IN DENSE HIDDEN LAYER (YOU CAN CHANGE THIS NUMBER OF NEURONS)
model.add(Dense(256, activation='relu'))

# LAST LAYER IS THE CLASSIFIER, THUS 10 POSSIBLE CLASSES
model.add(Dense(10, activation='softmax'))


model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
In [30]:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 29, 29, 32)        1568      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 32)        16416     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 32)          0         
_________________________________________________________________
flatten (Flatten)            (None, 800)               0         
_________________________________________________________________
dense (Dense)                (None, 256)               205056    
_________________________________________________________________
dense_1 (Dense)              (None, 10)                2570      
=================================================================
Total params: 225,610
Trainable params: 225,610
Non-trainable params: 0
_________________________________________________________________
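The parameter counts in the summary can be checked by hand. The sketch below assumes the usual formulas, one weight per kernel element per input channel plus one bias per filter for a convolution, and one weight per input plus one bias per unit for a dense layer. The helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

counts = [
    conv2d_params(4, 4, 3, 32),     # conv2d: RGB input, 32 filters
    conv2d_params(4, 4, 32, 32),    # conv2d_1: 32 input channels, 32 filters
    dense_params(5 * 5 * 32, 256),  # dense: 800 flattened values in
    dense_params(256, 10),          # dense_1: one unit per class
]

print(counts, sum(counts))
```

Most of the parameters sit in the first dense layer, which is typical for small CNNs that flatten into a wide hidden layer.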

Adding Early stopping¶

In [31]:
from tensorflow.keras.callbacks import EarlyStopping
In [32]:
early_stop = EarlyStopping(monitor='val_loss',patience=2)
In [33]:
history = model.fit(x_train,y_cat_train,epochs=25,validation_data=(x_val,y_val_cat),callbacks=[early_stop])
Epoch 1/25
1250/1250 [==============================] - 41s 32ms/step - loss: 1.7493 - accuracy: 0.3581 - val_loss: 1.2798 - val_accuracy: 0.5376
Epoch 2/25
1250/1250 [==============================] - 38s 30ms/step - loss: 1.2367 - accuracy: 0.5585 - val_loss: 1.1360 - val_accuracy: 0.6018
Epoch 3/25
1250/1250 [==============================] - 39s 31ms/step - loss: 1.0621 - accuracy: 0.6307 - val_loss: 1.0527 - val_accuracy: 0.6311
Epoch 4/25
1250/1250 [==============================] - 39s 32ms/step - loss: 0.9311 - accuracy: 0.6739 - val_loss: 0.9894 - val_accuracy: 0.6599
Epoch 5/25
1250/1250 [==============================] - 47s 38ms/step - loss: 0.8240 - accuracy: 0.7102 - val_loss: 0.9883 - val_accuracy: 0.6654
Epoch 6/25
1250/1250 [==============================] - 69s 56ms/step - loss: 0.7418 - accuracy: 0.7413 - val_loss: 1.0176 - val_accuracy: 0.6583
Epoch 7/25
1250/1250 [==============================] - 73s 58ms/step - loss: 0.6539 - accuracy: 0.7739 - val_loss: 0.9540 - val_accuracy: 0.6797
Epoch 8/25
1250/1250 [==============================] - 70s 56ms/step - loss: 0.5754 - accuracy: 0.8014 - val_loss: 1.0089 - val_accuracy: 0.6770
Epoch 9/25
1250/1250 [==============================] - 39s 32ms/step - loss: 0.5003 - accuracy: 0.8269 - val_loss: 1.0495 - val_accuracy: 0.6785
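Training stopped after epoch 9 even though 25 epochs were requested. The sketch below replays the patience rule on the val_loss values from the log above. It is a simplified model of early stopping (no minimum delta, lower is better), not the Keras source, and `early_stopping_epoch` is a hypothetical helper.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss   # new best value: reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return None

# val_loss per epoch, copied from the training log.
val_losses = [1.2798, 1.1360, 1.0527, 0.9894, 0.9883, 1.0176, 0.9540, 1.0089, 1.0495]

stopped_at = early_stopping_epoch(val_losses, patience=2)
best_epoch = val_losses.index(min(val_losses)) + 1

print(stopped_at, best_epoch)
```

The best validation loss was reached at epoch 7, two epochs before the stop. By default the model keeps the weights from the last epoch; passing `restore_best_weights=True` to EarlyStopping restores the best ones instead.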
In [34]:
model.history.history.keys()
Out[34]:
dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
In [35]:
metrics = pd.DataFrame(model.history.history)
In [36]:
metrics
Out[36]:
   accuracy      loss  val_accuracy  val_loss
0  0.443975  1.536754        0.5376  1.279800
1  0.575075  1.199655        0.6018  1.135980
2  0.634150  1.051031        0.6311  1.052739
3  0.673350  0.935059        0.6599  0.989369
4  0.710600  0.828836        0.6654  0.988293
5  0.739175  0.747640        0.6583  1.017581
6  0.767025  0.668431        0.6797  0.953991
7  0.794350  0.592010        0.6770  1.008880
8  0.817150  0.520959        0.6785  1.049524
In [37]:
metrics[['loss', 'val_loss']].plot()
plt.title('Training Loss Vs Validation Loss', fontsize=16)
plt.show()
In [38]:
metrics[['accuracy', 'val_accuracy']].plot()
plt.title('Training Accuracy Vs Validation Accuracy', fontsize=16)
plt.show()

Evaluating on Test Data¶

In [39]:
model.evaluate(x_test,y_cat_test)
313/313 [==============================] - 2s 7ms/step - loss: 1.0768 - accuracy: 0.6703
Out[39]:
[1.0768086910247803, 0.6703000068664551]

Classification Report and Confusion Matrix¶

In [40]:
from sklearn.metrics import classification_report, confusion_matrix
In [41]:
# model.predict_classes(x_test) is deprecated; take the argmax of the predicted probabilities instead
predictions = np.argmax(model.predict(x_test), axis=-1)
In [42]:
print(classification_report(y_test,predictions))
              precision    recall  f1-score   support

           0       0.68      0.73      0.71      1000
           1       0.79      0.78      0.79      1000
           2       0.53      0.61      0.57      1000
           3       0.47      0.53      0.49      1000
           4       0.64      0.58      0.61      1000
           5       0.63      0.48      0.54      1000
           6       0.77      0.73      0.75      1000
           7       0.67      0.78      0.72      1000
           8       0.75      0.80      0.77      1000
           9       0.84      0.68      0.75      1000

    accuracy                           0.67     10000
   macro avg       0.68      0.67      0.67     10000
weighted avg       0.68      0.67      0.67     10000

In [43]:
confusion_matrix(y_test,predictions)
Out[43]:
array([[731,  22,  81,  12,  23,   5,   7,  13,  89,  17],
       [ 44, 781,  23,  17,   5,   4,  11,   9,  44,  62],
       [ 64,   6, 608,  82,  70,  47,  56,  48,  16,   3],
       [ 27,  15,  91, 528,  70, 119,  58,  57,  19,  16],
       [ 23,   6, 107,  72, 583,  34,  43, 115,  16,   1],
       [ 18,   7, 102, 239,  38, 479,  21,  79,  12,   5],
       [  5,   9,  62,  88,  52,  16, 735,  17,  12,   4],
       [ 20,   3,  33,  51,  52,  34,   8, 784,   6,   9],
       [ 84,  34,  19,  16,  13,   7,   9,   7, 796,  15],
       [ 55, 100,  22,  29,   4,  17,  10,  37,  48, 678]])
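The figures in the classification report can be derived directly from this matrix, where rows are true classes and columns are predicted classes. The sketch below copies the matrix from the output above and recomputes accuracy, per-class recall and precision, and the most frequent confusion.

```python
import numpy as np

LABEL_NAMES = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Confusion matrix copied from the output above (rows: true, columns: predicted).
cm = np.array([
    [731,  22,  81,  12,  23,   5,   7,  13,  89,  17],
    [ 44, 781,  23,  17,   5,   4,  11,   9,  44,  62],
    [ 64,   6, 608,  82,  70,  47,  56,  48,  16,   3],
    [ 27,  15,  91, 528,  70, 119,  58,  57,  19,  16],
    [ 23,   6, 107,  72, 583,  34,  43, 115,  16,   1],
    [ 18,   7, 102, 239,  38, 479,  21,  79,  12,   5],
    [  5,   9,  62,  88,  52,  16, 735,  17,  12,   4],
    [ 20,   3,  33,  51,  52,  34,   8, 784,   6,   9],
    [ 84,  34,  19,  16,  13,   7,   9,   7, 796,  15],
    [ 55, 100,  22,  29,   4,  17,  10,  37,  48, 678],
])

accuracy = np.trace(cm) / cm.sum()         # correct predictions / all predictions
recall = np.diag(cm) / cm.sum(axis=1)      # per true class
precision = np.diag(cm) / cm.sum(axis=0)   # per predicted class

# Largest off-diagonal entry: the most common mistake.
off_diag = cm.copy()
np.fill_diagonal(off_diag, 0)
true_idx, pred_idx = np.unravel_index(off_diag.argmax(), off_diag.shape)

print(f"accuracy = {accuracy:.4f}")
print(f"{LABEL_NAMES[true_idx]} mistaken for {LABEL_NAMES[pred_idx]}: "
      f"{off_diag[true_idx, pred_idx]} times")
```

The most common error is a dog classified as a cat, which matches the low recall for class 5 in the report.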

Predicting on single image¶

In [44]:
plt.imshow(x_test[16])
Out[44]:
<matplotlib.image.AxesImage at 0x14d274c18>
In [45]:
my_image = x_test[16]
In [46]:
# SHAPE --> (num_images, height, width, color_channels)
# predict_classes is deprecated, so take the argmax of the predicted probabilities
np.argmax(model.predict(my_image.reshape(1,32,32,3)), axis=-1)
Out[46]:
array([5])
In [47]:
LABEL_NAMES[y_test[16][0]]
Out[47]:
'dog'
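The predicted index 5 matches the true label, 'dog'. As a sketch of how a softmax output turns into a class name, the example below uses a made-up probability vector (the values in `probabilities` are invented for illustration and are not the model's actual output).

```python
import numpy as np

LABEL_NAMES = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Hypothetical softmax output for a batch containing one image.
probabilities = np.array([[0.01, 0.01, 0.05, 0.20, 0.03,
                           0.60, 0.04, 0.04, 0.01, 0.01]])

# The predicted class is the index with the highest probability.
class_index = int(np.argmax(probabilities, axis=-1)[0])
predicted_label = LABEL_NAMES[class_index]

print(class_index, predicted_label, probabilities[0, class_index])
```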