Natural Language Processing - Episode 2

At the end of this episode, you will be able to:

  • Build a simple NLP model using TensorFlow and scikit-learn to classify fashion reviews.
  • Set up your model so that it can be easily integrated into a Metaflow flow.

1. Description of a Custom Model

Now it's time to build our ML model. We will define it in a separate file, model.py, as a custom class called NbowModel. The model has two subcomponents: a count vectorizer for preprocessing and a neural network for classification. The NbowModel class combines these two components so that we don't have to handle them separately.
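
To make the preprocessing step concrete, here is a minimal standard-library sketch of what a count vectorizer does: it builds a fixed vocabulary from the training text and turns each document into a vector of token counts. The function names (tokenize, build_vocab, count_vector) and the toy reviews are assumptions made for illustration. scikit-learn's CountVectorizer does considerably more, such as stop-word removal, accent stripping, and document-frequency filtering.

```python
from collections import Counter

def tokenize(text):
    # Simplified tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()

def build_vocab(docs, max_features):
    # Keep the max_features most frequent tokens across all documents.
    totals = Counter(tok for doc in docs for tok in tokenize(doc))
    kept = [tok for tok, _ in totals.most_common(max_features)]
    # Map each kept token to a column index, in alphabetical order.
    return {tok: i for i, tok in enumerate(sorted(kept))}

def count_vector(doc, vocab):
    # One count per vocabulary entry; out-of-vocabulary tokens are ignored.
    vec = [0] * len(vocab)
    for tok in tokenize(doc):
        if tok in vocab:
            vec[vocab[tok]] += 1
    return vec

# Toy reviews, invented for this sketch.
reviews = ["love this dress", "this dress runs small", "love love the fit"]
vocab = build_vocab(reviews, max_features=3)
vectors = [count_vector(r, vocab) for r in reviews]

print(vocab)    # {'dress': 0, 'love': 1, 'this': 2}
print(vectors)  # [[1, 1, 1], [1, 0, 1], [0, 2, 0]]
```

Every document maps to a vector of the same length, which is why the neural network's input size can be tied to the vocabulary size.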

Here is an explanation of the various methods in this model:

  1. __init__: Initialize the count vectorizer, a preprocessor that counts the tokens in the text, and a neural network to do the modeling.
  2. fit: Fit the count vectorizer, followed by the model.
  3. predict: Transform the data with the count vectorizer before making predictions.
  4. eval_acc: Calculate model accuracy given a dataset and labels.
  5. eval_rocauc: Calculate the area under the ROC curve given a dataset and labels.
  6. model_dict: Expose a dictionary holding the two components that form this model, the count vectorizer and the neural network. We will use this to serialize the model's data into Metaflow.
  7. from_dict: Instantiate an NbowModel from a model_dict, which is useful for de-serializing data in Metaflow.
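
The two evaluation methods can be illustrated without any libraries. The sketch below computes thresholded accuracy and ROC AUC on made-up scores, with AUC taken as the fraction of positive/negative pairs that are ranked correctly (ties count as half). The function names and data are assumptions for illustration only; the class itself relies on scikit-learn's accuracy_score and roc_auc_score.

```python
def accuracy_at_threshold(labels, scores, threshold=0.5):
    # Predict 1 when the score exceeds the threshold, then count matches.
    preds = [1 if s > threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def roc_auc(labels, scores):
    # AUC equals the probability that a randomly chosen positive example
    # receives a higher score than a randomly chosen negative one.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels and predicted probabilities.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.45, 0.8, 0.6]

acc = accuracy_at_threshold(labels, scores)
auc = roc_auc(labels, scores)
print(acc)            # 0.6
print(round(auc, 3))  # 0.833
```

Accuracy changes with the chosen threshold, while AUC depends only on how the scores are ranked. That is why eval_acc takes a threshold argument and eval_rocauc does not.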

2. How to Serialize Data

Whenever you create your own model library or define models in custom classes, we recommend explicitly defining how the model is serialized and loaded. This reduces the chance that saved models break as your model code changes: with an explicit serialization format, you can keep new versions of your code backward compatible with previously saved models, or handle format differences deliberately instead of discovering them at load time. This is the purpose of the from_dict method and the model_dict property in this example.

Metaflow saves data by pickling it, so it is convenient to have an interface that exposes the model's state as a pickleable data structure. model_dict and from_dict provide that interface: one returns the data to save, and the other rebuilds the model from it.

import tensorflow as tf
from tensorflow.keras import layers, optimizers, regularizers
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.feature_extraction.text import CountVectorizer

class NbowModel():
    def __init__(self, vocab_sz):
        self.vocab_sz = vocab_sz
        # Instantiate the CountVectorizer (stored here as self.cv)
        self.cv = CountVectorizer(
            min_df=.005, max_df=.75, stop_words='english',
            strip_accents='ascii', max_features=self.vocab_sz
        )

        # Define the keras model
        inputs = tf.keras.Input(shape=(self.vocab_sz,))
        x = layers.Dropout(0.10)(inputs)
        x = layers.Dense(
            15, activation="relu",
            kernel_regularizer=regularizers.L1L2(l1=1e-5, l2=1e-4)
        )(x)
        predictions = layers.Dense(1, activation="sigmoid",)(x)
        self.model = tf.keras.Model(inputs, predictions)
        opt = optimizers.Adam(learning_rate=0.002)
        # Binary cross-entropy is assumed as the loss, matching the
        # single sigmoid output unit.
        self.model.compile(loss='binary_crossentropy',
                           optimizer=opt, metrics=["accuracy"])

    def fit(self, X, y):
        # Fit the vectorizer, then train on the dense count matrix
        res = self.cv.fit_transform(X).toarray()
        self.model.fit(x=res, y=y, batch_size=32,
                       epochs=10, validation_split=.2)

    def predict(self, X):
        # Reuse the fitted vocabulary; do not refit on new data
        res = self.cv.transform(X).toarray()
        return self.model.predict(res)

    def eval_acc(self, X, labels, threshold=.5):
        return accuracy_score(labels,
                              self.predict(X) > threshold)

    def eval_rocauc(self, X, labels):
        return roc_auc_score(labels, self.predict(X))

    @property
    def model_dict(self):
        return {'vectorizer': self.cv, 'model': self.model}

    @classmethod
    def from_dict(cls, model_dict):
        "Get Model from dictionary"
        # Size the network input from the fitted vocabulary
        nbow_model = cls(len(
            model_dict['vectorizer'].vocabulary_
        ))
        nbow_model.model = model_dict['model']
        nbow_model.cv = model_dict['vectorizer']
        return nbow_model
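
The serialization pattern above can be tried in isolation with a toy class. This sketch uses only the standard library. ToyModel, its fields, the format_version key, and the legacy "w" key are all hypothetical; they exist only to show how an explicit dictionary interface survives a pickle round trip and lets newer code accept dictionaries saved by older code.

```python
import pickle

class ToyModel:
    """Hypothetical stand-in for a custom model with two components."""

    def __init__(self, vocab, weights=None):
        self.vocab = vocab  # plays the role of the vectorizer
        # Plays the role of the neural network.
        self.weights = weights or [0.0] * len(vocab)

    @property
    def model_dict(self):
        # Expose plain, pickleable data plus an explicit format version.
        return {"format_version": 2,
                "vocab": self.vocab,
                "weights": self.weights}

    @classmethod
    def from_dict(cls, model_dict):
        # Dictionaries saved before versioning was added have no version key.
        version = model_dict.get("format_version", 1)
        if version > 2:
            raise ValueError(f"Unsupported format_version: {version}")
        # Assumed history: version 1 stored the weights under the key "w".
        key = "w" if version == 1 else "weights"
        return cls(model_dict["vocab"], model_dict[key])

original = ToyModel(vocab={"dress": 0, "love": 1}, weights=[0.3, -1.2])
blob = pickle.dumps(original.model_dict)  # what a pickle-based store keeps
restored = ToyModel.from_dict(pickle.loads(blob))

# An older dictionary still loads because from_dict handles it explicitly.
legacy = ToyModel.from_dict({"vocab": {"fit": 0}, "w": [0.7]})
```

Because only the dictionary is pickled, the class can be renamed or restructured later without invalidating saved data, as long as from_dict keeps understanding the older dictionary layouts.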

3. Fit the Custom Model on the Dataset

Next, let's import the NbowModel and train it on this dataset. The purpose of doing this is to make sure the code works as we expect before using Metaflow. For this example, we will set our vocab_sz = 750.

from model import NbowModel
import pandas as pd
model = NbowModel(vocab_sz=750)
df = pd.read_parquet('train.parquet')
model.fit(X=df['review'], y=df['labels'])

Epoch 1/10
1/510 [..............................] - ETA: 1:17 - loss: 0.7379 - accuracy: 0.3438

2023-03-31 17:05:22.277608: W tensorflow/core/platform/profile_utils/] Failed to get CPU frequency: 0 Hz

510/510 [==============================] - 1s 836us/step - loss: 0.3559 - accuracy: 0.8495 - val_loss: 0.2983 - val_accuracy: 0.8756
Epoch 2/10
510/510 [==============================] - 0s 628us/step - loss: 0.2934 - accuracy: 0.8823 - val_loss: 0.2939 - val_accuracy: 0.8724
Epoch 3/10
510/510 [==============================] - 0s 651us/step - loss: 0.2825 - accuracy: 0.8876 - val_loss: 0.2934 - val_accuracy: 0.8754
Epoch 4/10
510/510 [==============================] - 0s 624us/step - loss: 0.2735 - accuracy: 0.8922 - val_loss: 0.2986 - val_accuracy: 0.8781
Epoch 5/10
510/510 [==============================] - 0s 626us/step - loss: 0.2654 - accuracy: 0.8963 - val_loss: 0.2967 - val_accuracy: 0.8759
Epoch 6/10
510/510 [==============================] - 0s 636us/step - loss: 0.2548 - accuracy: 0.9036 - val_loss: 0.3070 - val_accuracy: 0.8756
Epoch 7/10
510/510 [==============================] - 0s 620us/step - loss: 0.2476 - accuracy: 0.9056 - val_loss: 0.3051 - val_accuracy: 0.8783
Epoch 8/10
510/510 [==============================] - 0s 627us/step - loss: 0.2400 - accuracy: 0.9106 - val_loss: 0.3136 - val_accuracy: 0.8714
Epoch 9/10
510/510 [==============================] - 0s 632us/step - loss: 0.2303 - accuracy: 0.9180 - val_loss: 0.3269 - val_accuracy: 0.8741
Epoch 10/10
510/510 [==============================] - 0s 626us/step - loss: 0.2233 - accuracy: 0.9214 - val_loss: 0.3310 - val_accuracy: 0.8719
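
Reading the log above, training loss falls every epoch while validation loss stops improving early on. The short sketch below copies the val_loss values from the log and finds the epoch where it is lowest. The variable names are illustrative, and whether to stop training earlier is a judgment this tutorial does not make.

```python
# val_loss per epoch, copied from the training log above.
val_loss = [0.2983, 0.2939, 0.2934, 0.2986, 0.2967,
            0.3070, 0.3051, 0.3136, 0.3269, 0.3310]

# Epochs are 1-indexed in the Keras log, so add 1 to the list index.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(f"Lowest val_loss {val_loss[best_epoch - 1]:.4f} at epoch {best_epoch}")
# Lowest val_loss 0.2934 at epoch 3
```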

4. Evaluate the Model Performance

Next, we can evaluate our model on the validation set as well, using the built-in evaluation methods we created:

valdf = pd.read_parquet('valid.parquet')
model_acc = model.eval_acc(
    valdf['review'], valdf['labels'])
model_rocauc = model.eval_rocauc(
    valdf['review'], valdf['labels'])

msg = 'Baseline Accuracy: {}\nBaseline AUC: {}'
print(msg.format(
    round(model_acc, 3), round(model_rocauc, 3)
))

Baseline Accuracy: 0.875
Baseline AUC: 0.912

Great! This is an improvement upon our baseline! Now we have set up what we need to start using Metaflow. In the next lesson, we are going to operationalize the steps we manually performed here by refactoring them as a Metaflow flow.