Beginner Computer Vision: Episode 2

This episode references this notebook. It shows how to build a Convolutional Neural Network (CNN) model with Keras to predict the classes of MNIST images. If you are already familiar with MNIST and Keras fundamentals, you may want to skip to Episode 3 where Metaflow enters the tutorial.

After following the setup instructions, start the notebook with this command:

jupyter lab cv-intro-2.ipynb

Now it’s time to build a model to compare against the baseline. The goal is to define a CNN model that outperforms the baseline model from the previous notebook.

1. Load the Data

We start by loading the data in the same way as the previous episode:

from tensorflow import keras
import numpy as np

num_classes = 10
((x_train, y_train), (x_test, y_test)) = keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
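As a quick sanity check (not part of the original notebook), you can print the resulting shapes; the numbers below assume the standard MNIST split of 60,000 training and 10,000 test images:

# After expand_dims and to_categorical, expect these shapes:
print(x_train.shape, y_train.shape)  # (60000, 28, 28, 1) (60000, 10)
print(x_test.shape, y_test.shape)    # (10000, 28, 28, 1) (10000, 10)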

2. Configure Hyperparameters

The model has several hidden layers defined by the hidden_conv_layer_sizes hyperparameter. If you are new to machine learning, you don't need to worry about tuning these values for now. For more experienced users, notice how these values reappear in the Metaflow code you will write starting in the next episode. We will use Metaflow as an experiment tracker, recording hyperparameter values and the corresponding metric scores of the models they define.

hidden_conv_layer_sizes = [32, 64]
input_shape = (28, 28, 1)
kernel_size = (3, 3)
pool_size = (2, 2)
p_dropout = 0.5

num_classes = y_test.shape[1]

epochs = 5
batch_size = 32
verbose = 2

optimizer = 'adam'
loss = 'categorical_crossentropy'
metrics = ['accuracy']

3. Build a Model

In this section, you will build a neural network using the keras.Sequential API. The loop constructs a list of convolutional and pooling layers, one pair per entry in hidden_conv_layer_sizes. The list is then extended with a fully connected output head. Finally, the Keras model is compiled so it is ready to train.

from tensorflow.keras import layers, Sequential, Input

_layers = [Input(shape=input_shape)]

# one Conv2D + MaxPooling2D pair per entry in hidden_conv_layer_sizes
for conv_layer_size in hidden_conv_layer_sizes:
    _layers.append(layers.Conv2D(
        conv_layer_size,
        kernel_size=kernel_size,
        activation="relu"
    ))
    _layers.append(layers.MaxPooling2D(pool_size=pool_size))

# flatten the feature maps, apply dropout, and output class probabilities
_layers.extend([
    layers.Flatten(),
    layers.Dropout(p_dropout),
    layers.Dense(num_classes, activation="softmax")
])

model = Sequential(_layers)
model.compile(loss=loss, optimizer=optimizer, metrics=metrics)
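Before training, it can help to verify the architecture. Calling model.summary() prints a layer-by-layer view; with the hyperparameter values above, you should see two Conv2D/MaxPooling2D pairs followed by Flatten, Dropout, and a 10-unit softmax Dense layer:

# Print a layer-by-layer view of the compiled model,
# including output shapes and parameter counts.
model.summary()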

4. Train Your Model

Keras models like the one you made in the previous step have a .fit function, which follows a convention similar to the scikit-learn Estimator API. One benefit of this API is that you can pass NumPy arrays like x_train and y_train directly to model.fit.

history = model.fit(
    x_train, y_train,
    validation_data=(x_test, y_test),
    epochs=epochs,
    batch_size=batch_size,
    verbose=verbose
)
Epoch 1/5
1875/1875 - 15s - loss: 0.2047 - accuracy: 0.9368 - val_loss: 0.0522 - val_accuracy: 0.9831 - 15s/epoch - 8ms/step
Epoch 2/5
1875/1875 - 15s - loss: 0.0789 - accuracy: 0.9753 - val_loss: 0.0412 - val_accuracy: 0.9865 - 15s/epoch - 8ms/step
Epoch 3/5
1875/1875 - 13s - loss: 0.0616 - accuracy: 0.9803 - val_loss: 0.0360 - val_accuracy: 0.9878 - 13s/epoch - 7ms/step
Epoch 4/5
1875/1875 - 13s - loss: 0.0523 - accuracy: 0.9838 - val_loss: 0.0346 - val_accuracy: 0.9882 - 13s/epoch - 7ms/step
Epoch 5/5
1875/1875 - 14s - loss: 0.0469 - accuracy: 0.9857 - val_loss: 0.0278 - val_accuracy: 0.9902 - 14s/epoch - 7ms/step
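The History object returned by model.fit records per-epoch metrics under keys like 'loss', 'accuracy', 'val_loss', and 'val_accuracy'. This becomes handy once you start tracking experiments in the next episode. A minimal sketch of inspecting it:

# Find the epoch with the best validation accuracy.
best_epoch = max(
    range(epochs),
    key=lambda i: history.history["val_accuracy"][i]
)
print("Best validation accuracy: {:.4f} (epoch {})".format(
    history.history["val_accuracy"][best_epoch], best_epoch + 1
))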

5. Evaluate Performance

Keras models also have a .evaluate function you can use to retrieve accuracy and loss scores.

scores = model.evaluate(x_test, y_test, verbose=0)
categorical_cross_entropy = scores[0]
accuracy = scores[1]
msg = "The CNN model predicted correctly {}% of the time on the test set."
print(msg.format(round(100*accuracy, 3)))
    The CNN model predicted correctly 99.02% of the time on the test set.

In the last two episodes, you have developed and evaluated two different models. Comparing models against each other is an important part of machine learning workflows. For example, you may want to compare how changes in hyperparameters affect performance metrics. You can also compare a more complex model, like the CNN from this episode, to a simpler one, like the baseline from Episode 1, as the sketch below illustrates. Among other benefits, this comparison can help you avoid promoting unnecessarily complex models to a production environment.
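For instance, a minimal comparison might look like the following. Here, baseline_accuracy is a hypothetical placeholder for the score you measured in the Episode 1 notebook; substitute your own value:

# Hypothetical: the accuracy you recorded for the baseline in Episode 1.
baseline_accuracy = 0.90  # placeholder, replace with your measured value

# `accuracy` is the CNN test accuracy computed in the previous step.
if accuracy > baseline_accuracy:
    print("The CNN outperforms the baseline.")
else:
    print("The baseline holds up; the extra complexity may not pay off.")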

In any case, it is likely that at some point you will want to train models in parallel processes or on cloud machines. In the next episode, you will learn how to package the baseline and CNN models into a flow that trains each model in a parallel process on your laptop. Then one of Metaflow's primary benefits kicks in: you can run the same flows on your cloud infrastructure seamlessly.