Happy Sunday!

Today we dive into step 4 of the Captcha Project! Always been keen on implementing a convolutional neural network? Then stay tuned!

## Neural Networks

A confession is in order: I am not really good with Neural Networks.

In January I started to read the Deep Learning Book, which is available online for free. A month later school started again, and during the first few weeks I barely found time to keep reading.

So be kind if my neural network is really not the best. Whenever I reread my posts a while later I cringe, because what I did a few months or weeks ago now looks so “newbie”. But hey, you have to start somewhere, and that is also something I want to show on this blog. No master ever fell from the sky. And it is okay to make mistakes, as long as you learn from them.

### Why a convolutional neural network?

Why did I go for a convolutional neural network (CNN) and not a recurrent one? The data we are working with are pictures, which we translate into matrices: each entry stores one pixel. The positions of the pixels matter. Just think about it: if we shuffled all the pixel positions in a picture, we as humans would no longer be able to read the letters. A CNN is built to exploit exactly this spatial structure.
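Here is a quick numpy sketch of that idea with a toy 3×3 “image” (just illustrative, not part of the captcha pipeline):

```python
import numpy as np

# A toy 3x3 "image": the values are the pixels, the grid is the structure.
image = np.arange(9).reshape(3, 3)

# Shuffle the pixel positions.
flat = image.flatten()
np.random.seed(0)
np.random.shuffle(flat)
shuffled = flat.reshape(3, 3)

# All pixel values are still there, only their arrangement is lost -
# exactly the spatial information a CNN is built to exploit.
print(sorted(image.flatten()) == sorted(shuffled.flatten()))  # True
```

The shuffled picture contains the same pixel values as the original, yet it is unreadable. That is why a model which looks at local neighbourhoods of pixels is a natural fit here.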

## Setting a Baseline

But before implementing a CNN it is advisable to set a baseline to which we can compare the CNN we are creating.

We already did that in the last post. Click here to check it out if you haven’t yet.

## On to the CNN

### More data prep

I think I mentioned in one of my earlier posts that 80% of the work is data preparation. Although we have already done a lot in this department, a few steps are still missing before we can hand the data over to Keras.

When implementing the baseline we saw that the data is not balanced. This will most likely also be an issue for the CNN.

Further, we need to make sure that we pass the correct shape for each picture to Keras. Otherwise, Keras will throw an error when building the model.

We can reuse the logic for resampling and balancing the data which we used for the baseline.

```python
import os
import cv2
import numpy as np
from imutils import paths
from sklearn.utils import resample

# Path to where the single letter images are
einfach_letters = r'...\Einfach_letters'

# initialize the data and labels
data = []
labels = []

files = os.listdir(einfach_letters)

# loop through the letters
for f in files:
    data_sub = []
    labels_sub = []
    # loop over the input images
    for image_file in paths.list_images(einfach_letters + '\\' + f):
        # Load the image
        image = cv2.imread(image_file, cv2.IMREAD_GRAYSCALE)
        # Reshape to later use in Random Forest
        # image = np.asarray(image).reshape(-1)
        # Add a dimension to the image to process it in Keras later
        image = np.expand_dims(image, axis=2)
        # Get the name of the letter based on the folder it was in
        label = image_file.split(os.path.sep)[-2]
        # add to the sub lists
        data_sub.append(image)
        labels_sub.append(label)
    # upsample letters with fewer than 110 images by bootstrapping
    if len(data_sub) < 110:
        boot = resample(data_sub, replace=True, n_samples=110, random_state=1)
        boot_label = labels_sub[0]
        for i in boot:
            data.append(i)
            labels.append(boot_label)
    else:
        for i in data_sub:
            data.append(i)
            labels.append(labels_sub[0])
```

Then we should make a quick check on the balance. We can do this the same way we did before.

```python
# check whether the dataset is balanced
import pandas as pd

label_bal = pd.DataFrame(labels)
label_bal[0].value_counts()
```

At the moment both data and labels are lists. We need to transform them into an n-dimensional array. Numpy can help us with that. If you have not imported numpy yet, do it now.

```python
import numpy as np

data = np.array(data)
data.shape
```

The shape should now be (2090, 60, 60, 1): 2090 because we now have 19 × 110 pictures, each picture has 60 × 60 pixels, and we added a third dimension to each image so Keras can process it later.

### Labels

Last but not least we need to tend to the labels. Just as with the random forest, Keras wants to get numeric values for labels. We can again copy the label conversion from before.

```python
alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
            'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# map each letter to its position in the alphabet (and back)
dic = dict(zip(alphabet, list(range(1, len(alphabet) + 1))))
dicRev = dict(zip(list(range(1, len(alphabet) + 1)), alphabet))

new_labels = [dic[v] for v in labels]
new_labels = np.asarray(new_labels)
new_labels = new_labels.reshape(-1, 1)
```

After this preparation, new_labels is already an array and not a list anymore.

## Keras specific pre-processing

We just did the labels a few seconds ago. Actually, Keras does not only want numerical values, it also wants the labels to come in a matrix. Or better: in OneHotEncoding.

### OneHotEncoding

This is really just a format for the labels. At the moment, each letter is represented by its number in the alphabet: “a” is shown as 1, “b” is shown as 2, and so on.

Keras now wants us to pass something like this to it:

| observation nr | a | b | c | d | … |
|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | 0 |
| 2 | 0 | 1 | 0 | 0 | 0 |
| 3 | 0 | 0 | 1 | 0 | 0 |

(Yay, please admire my html/css skills I used for the table above ;))
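To make the format concrete, here is a hand-rolled numpy version of the same encoding for three example labels (just an illustration; below we will let sklearn do the work):

```python
import numpy as np

# the integer labels from above: "a" -> 1, "b" -> 2, "c" -> 3
labels_int = np.array([1, 2, 3])
n_classes = 3

# one row per observation, with a single 1 in the column of its class
one_hot = np.zeros((len(labels_int), n_classes), dtype=int)
one_hot[np.arange(len(labels_int)), labels_int - 1] = 1
print(one_hot)
# [[1 0 0]
#  [0 1 0]
#  [0 0 1]]
```

Every row sums to exactly 1, which is also what the softmax output layer of the network will try to reproduce.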

```python
from sklearn import preprocessing

## OneHotEncoder from sklearn.preprocessing transforms an array-like of
## integers or strings: it creates a binary column for each category and
## returns a sparse matrix or dense array

# then we can apply the OneHotEncoding
ohe = preprocessing.OneHotEncoder(sparse=False)
labels_ohe = ohe.fit_transform(new_labels)
print(labels_ohe)
```

Actually, now that I look at it, the sklearn version I am using expects numeric values for the OneHotEncoding too (newer sklearn versions also accept strings) *imagine the monkey covering its eyes emoji here*. So the last conversion did actually make sense 🙂

### Shuffling the data

This is not Keras specific, but I only want to do it now: we need to shuffle the data again. We can copy-paste this from the baseline and adjust it a little.

```python
## because all the letters are sorted right now we need to shuffle the dataset a bit
## numpy has a shuffle method
permutation = np.arange(len(data))  # n = number of pictures / length of data
np.random.shuffle(permutation)
print(permutation)

data_shuffled = [data[i] for i in permutation]
labels_ohe_shuffled = [labels_ohe[i] for i in permutation]
```

### Training, test and validation sets

For the training we need to split the data into a training, a validation and a test set.

```python
## now we can define a training, test and validation set
n = len(data)
train_size = 2/3   # the remaining 1/3 becomes the test set
val_size = 1/5

n_train = round(n * train_size)
n_val = round(n * val_size)

# carve the validation set out of the end of the training portion,
# so that training, validation and test data do not overlap
X_train = data_shuffled[0:n_train - n_val]
Y_train = labels_ohe_shuffled[0:n_train - n_val]
X_val = data_shuffled[n_train - n_val:n_train]
Y_val = labels_ohe_shuffled[n_train - n_val:n_train]
X_test = data_shuffled[n_train:n]
Y_test = labels_ohe_shuffled[n_train:n]
```

### Finally putting the network together

After all this preparation, we can finally start with the real deal. First, we need some modules out of the Keras package. Also, we will set the random seed in TensorFlow so the results are reproducible.

```python
import matplotlib.pyplot as plt
import matplotlib.image as imgplot
import tensorflow as tf

tf.set_random_seed(1)

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, BatchNormalization
from keras.layers import Conv1D, Conv2D, Convolution2D, MaxPooling2D, Flatten
import keras
import sys

print("Keras {} TF {} Python {}".format(keras.__version__, tf.__version__, sys.version_info))
```

I also like to check the versions of Tensorflow and Keras. Some combinations of versions work better than others. I am using Keras 2.2.4 and Tensorflow 1.12.0.

```python
# Build the neural network!
model = Sequential()

# First convolutional layer with max pooling
model.add(Conv2D(20, (6, 6), padding="same", input_shape=(60, 60, 1), activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(3, 3), strides=(3, 3)))
model.add(Dropout(0.3))

# Second convolutional layer with max pooling
model.add(Conv2D(15, (3, 3), padding="same", activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(3, 3), strides=(3, 3)))
model.add(Dropout(0.3))

# Hidden layer with 300 nodes
model.add(Flatten())
model.add(Dense(300, activation="relu"))

# Output layer with 19 nodes (one for each possible letter we predict)
model.add(Dense(19, activation="softmax"))
```

For the first convolutional layer I use 20 filters of size 6 × 6 with “same” padding, so the output keeps the 60 × 60 size; the second layer uses 15 smaller 3 × 3 filters. After each convolution, batch normalization and max pooling shrink and stabilize the feature maps, and dropout fights overfitting.

The different layers were found by trial and error. As of now, that is all I can say on how to find a good neural network. Whenever I have a little more experience, I might attempt to write a guideline on how to find a suitable network. Alternatively, you can also load pre-trained networks and use them. However, this is neither the time nor the place. For now, I was happy with this network.
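As a quick sanity check that the layers fit together, we can trace the feature-map sizes by hand. This is just arithmetic, no Keras call; it assumes the default “valid” padding of MaxPooling2D, which floors the division:

```python
# MaxPooling2D with pool size 3 and stride 3 (default "valid" padding)
# shrinks a side of length n to (n - 3) // 3 + 1.
def pooled(n, pool=3, stride=3):
    return (n - pool) // stride + 1

side = 60                      # input images are 60x60
# Conv2D(20, (6, 6), padding="same") keeps 60x60, gives 20 channels
side = pooled(side)            # first pooling: 60 -> 20
# Conv2D(15, (3, 3), padding="same") keeps 20x20, gives 15 channels
side = pooled(side)            # second pooling: 20 -> 6
flat = side * side * 15        # Flatten feeds 540 values into Dense(300)
print(side, flat)              # 6 540
```

So the Flatten layer hands 540 values to the 300-node dense layer, which in turn feeds the 19-way softmax output.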

### Compile and evaluate the network

Last but not least we need to compile the network.

```python
# Compile the network
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
```

Further, we also want to know how well our CNN is doing. This is why we earlier created the validation set, so we can use it when fitting the model.

```python
# train the network and keep track of the validation metrics
history = model.fit(np.array(X_train), np.array(Y_train),
                    validation_data=(np.array(X_val), np.array(Y_val)),
                    batch_size=32, epochs=10, verbose=1)
```

It is easier to see this in a chart…

```python
# visualize the training history
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'valid'], loc='lower right')
plt.show()
```

```python
model.evaluate(np.array(X_test), np.array(Y_test))
```

The line above will return the loss and the metric, in this case the accuracy, on the test set.

Wow! We did a lot! I think a break is in order. In the next post we will wrap up the project and implement the function which will “predict” what the captcha stands for.

Hope you enjoyed reading!

Best,

Blondie