Tung Thanh Le
ttungl at gmail dot com
Traffic Sign Recognition
- Use deep neural networks (DNN) and convolutional neural networks (CNN) to build a traffic sign recognizer. Specifically, we train a deep neural network model to classify traffic signs from the German Traffic Sign Dataset and traffic signs downloaded from the internet.
- My implementation can be downloaded from the following links: [ipynb] or [html].
Build a Traffic Sign Recognition Project
The goals / steps of this project are the following:
- Load the data set (see below for links to the project data set)
- Explore, summarize and visualize the data set
- Design, train and test a model architecture
- Use the model to make predictions on new images
- Analyze the softmax probabilities of the new images
- Summarize the results with a written report
This implementation addresses each of the rubric points, as described below.
- The training set contains `27839` images, the validation set contains `4410` images, and the test set contains `12630` images. I use the numpy library to get the shape of the images, `(32, 32, 3)`. The number of classes is `43`.
- First, I plot `43` images from the training dataset, as in Figure 1. The plot shows that the images appear in the set in random order with respect to their classes.
Figure 1: Input training dataset.
Then, I compute the number of occurrences of each class in the dataset, as in Figure 2.
Figure 2: Number of occurrences for each class.
The plot shows that the number of examples per class is imbalanced. The largest classes are `1` and `2`, with around `1600` examples each.
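The summary figures and per-class counts above can be obtained with a few numpy calls. The following is a minimal sketch, not the notebook's actual code: the function names `summarize` and `class_counts` are hypothetical, and tiny placeholder arrays stand in for the real dataset.

```python
import numpy as np

def summarize(X_train, y_train, X_valid, X_test):
    """Return basic dataset statistics read from the array shapes."""
    return {
        "n_train": X_train.shape[0],
        "n_valid": X_valid.shape[0],
        "n_test": X_test.shape[0],
        "image_shape": X_train.shape[1:],
        "n_classes": len(np.unique(y_train)),
    }

def class_counts(labels, n_classes):
    """Count how many examples carry each class id (0 .. n_classes - 1)."""
    return np.bincount(labels, minlength=n_classes)

# Placeholder data (NOT the real dataset): 86 blank images, 2 per class.
X_train = np.zeros((86, 32, 32, 3), dtype=np.uint8)
y_train = np.arange(86) % 43
X_valid = np.zeros((10, 32, 32, 3), dtype=np.uint8)
X_test = np.zeros((20, 32, 32, 3), dtype=np.uint8)

summary = summarize(X_train, y_train, X_valid, X_test)
counts = class_counts(y_train, summary["n_classes"])
print(summary)
print("max / min examples per class:", counts.max(), counts.min())
```

On the real data, a large gap between `counts.max()` and `counts.min()` is what the imbalance in Figure 2 corresponds to.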
- As recommended, I use a quick way to approximately normalize the data, `(pixel-128.)/128.`. Then, I convert each RGB image to grayscale using the OpenCV call `cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)`. After that, I reshape the images to the size `(32,32)`. The result of this process is as follows.
Figure 3: Grayscale images processed.
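As a rough illustration of the preprocessing described above, the sketch below normalizes a batch of RGB images and reduces them to one channel. To stay dependency-free it replaces `cv2.cvtColor` with a plain numpy weighted sum using the standard luminance weights (0.299, 0.587, 0.114); that substitution, and the function name `preprocess`, are assumptions made for this example rather than the code used in the notebook.

```python
import numpy as np

# Standard RGB -> gray luminance weights (stand-in for cv2.COLOR_RGB2GRAY).
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)

def preprocess(images):
    """Normalize uint8 RGB images of shape (N, 32, 32, 3) and grayscale them.

    Returns a float32 array of shape (N, 32, 32) with values in roughly [-1, 1).
    """
    normalized = (images.astype(np.float32) - 128.0) / 128.0
    # Weighted sum over the channel axis collapses (N, 32, 32, 3) to (N, 32, 32).
    return normalized @ LUMA_WEIGHTS

# Placeholder batch: two black images.
batch = np.zeros((2, 32, 32, 3), dtype=np.uint8)
gray = preprocess(batch)
print(gray.shape, gray.min(), gray.max())
```

Because the weights sum to one, normalizing before or after the grayscale conversion gives the same result in this sketch.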
There is a technique called the spatial transformer, which allows spatial manipulation of an image within the network. This technique helps eliminate the white noise of the input images. I think it can be used later to improve the quality of the input images in my implementation.
- Update: Brightness augmentation is also a robust alternative for improving classification performance.
- I modified the LeNet architecture by adding dropout between the fully connected layers, where `(x)` marks an activation that is dropped.
  One layer -> dropout -> another layer
  `1.0 -> 1.0` `0.2 -> 0.2` `0.4 -> 0.4 (x)` `-0.3 -> -0.3 (x)`
- For every example the network trains on, dropout randomly picks half of the activations flowing through the network and sets them to zero, then picks a new random half for the next example.
  `1.0 -> 1.0 (x)` `0.2 -> 0.2` `0.4 -> 0.4` `-0.3 -> -0.3 (x)`
- The purpose of dropout is that the network can never rely on any given activation being present, because it may be squashed at any moment. The network is forced to learn a redundant representation of everything, which ensures that at least some of the information remains. In practice, this makes the network more robust and prevents overfitting.
- Note that if dropout does not work, you probably need to use a bigger network.
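The dropout behaviour described above can be sketched in a few lines of numpy. This is an illustration under stated assumptions, not the notebook's code: the function name `dropout` is hypothetical, and rescaling the surviving activations by `1 / keep_prob` (the "inverted dropout" convention) is an assumption the text does not spell out.

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Zero each activation with probability 1 - keep_prob.

    Surviving activations are divided by keep_prob so that the expected
    value of the output matches the input (inverted dropout, ASSUMED).
    """
    mask = rng.random(activations.shape) < keep_prob
    return np.where(mask, activations / keep_prob, 0.0)

rng = np.random.default_rng(0)
activations = np.array([1.0, 0.2, 0.4, -0.3])

# A new random mask is drawn on every call, i.e. for every training example.
dropped = dropout(activations, keep_prob=0.5, rng=rng)
print(dropped)

# At evaluation time keep_prob is 1.0, so nothing is dropped.
print(dropout(activations, keep_prob=1.0, rng=rng))
```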
- I propose two network architectures: the first, `LeNetUpgrade`, uses ReLU activations, and the second, `LeNetUpgrade_tanh`, uses tanh activations.
**LeNetUpgrade Architecture**
Layer | Description |
---|---|
Input | 32x32x1 grayscale image |
Convolution 2d | 1x1 stride, valid padding, outputs 28x28x6 |
RELU | activation function |
Max pooling | 2x2 stride, valid padding, outputs 14x14x6 |
Convolution 2d | 1x1 stride, valid padding, outputs 10x10x16 |
RELU | activation function |
Max pooling | 2x2 stride, valid padding, outputs 5x5x16 |
Flatten | 5x5x16 -> 400x1 |
Fully connected | 400x1 -> 120x1 |
RELU | activation function |
Dropout | keep_prob=0.5 |
Fully connected | 120x1 -> 84x1 |
RELU | activation function |
Dropout | keep_prob=0.5 |
Softmax | 84x1 -> 43x1 |
(I used this model)
**LeNetUpgrade_tanh Architecture**
Layer | Description |
---|---|
Input | 32x32x1 grayscale image |
Convolution 2d | 1x1 stride, valid padding, outputs 28x28x6 |
Bias added | add bias to convolution output |
tanh | activation function |
Max pooling | 2x2 stride, valid padding, outputs 14x14x6 |
Convolution 2d | 1x1 stride, valid padding, outputs 10x10x16 |
Bias added | add bias to convolution output |
tanh | activation function |
Max pooling | 2x2 stride, valid padding, outputs 5x5x16 |
Flatten | 5x5x16 -> 400x1 |
Fully connected | 400x1 -> 120x1 |
tanh | activation function |
Dropout | keep_prob=0.5 |
Fully connected | 120x1 -> 84x1 |
tanh | activation function |
Dropout | keep_prob=0.5 |
Softmax | 84x1 -> 43x1 |
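The output sizes in both tables follow from the usual valid-padding rule, `output = (input - window) // stride + 1`. The sketch below checks that arithmetic. The 5x5 convolution kernel and the 2x2 pooling window are assumptions: the tables list only strides and output sizes, and 5x5 is the kernel size that takes 32 to 28 at stride 1.

```python
def valid_output_size(input_size, window, stride):
    """Spatial output size of a conv or pooling layer with valid padding."""
    return (input_size - window) // stride + 1

def lenet_upgrade_shapes(input_size=32, kernel=5, pool=2):
    """Trace the feature-map shapes through the two conv/pool stages.

    kernel=5 and pool=2 are ASSUMED values, not stated in the tables.
    """
    conv1 = valid_output_size(input_size, kernel, stride=1)
    pool1 = valid_output_size(conv1, pool, stride=2)
    conv2 = valid_output_size(pool1, kernel, stride=1)
    pool2 = valid_output_size(conv2, pool, stride=2)
    return {
        "conv1": (conv1, conv1, 6),
        "pool1": (pool1, pool1, 6),
        "conv2": (conv2, conv2, 16),
        "pool2": (pool2, pool2, 16),
        "flatten": pool2 * pool2 * 16,
    }

shapes = lenet_upgrade_shapes()
for name, shape in shapes.items():
    print(name, shape)
```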
- The model was trained by using the following parameters:
Parameter | Setting |
---|---|
EPOCHS | 50 |
BATCH_SIZE | 128 |
LEARNING_RATE | 0.001 |
KEEP_PROB | 0.5 |
beta (REGULARIZATION) | 0.001 |
- To minimize the loss, I use one-hot encoded labels and softmax cross entropy on the logits. Then, I add L2 regularization to prevent overfitting, using `newloss = loss + beta*regularization`, where `beta` is set to `0.001`. At first, I intended to use SGD with learning rate decay, but the ADAM optimizer seems to be a good option, as it is simple and performs well without tuning additional hyperparameters. So I used the ADAM optimizer for the training process.
- The `BATCH_SIZE` is `128`, and the number of `EPOCHS` is `50`. `LEARNING_RATE` is set to `0.001` and `keep_prob` is `0.5`.
- Note: An epoch is one full pass over all the training samples. Each epoch consists of a number of iterations equal to the number of training samples divided by the batch size, and the weights are updated once per iteration, after each batch has passed through the learning algorithm.
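The regularized loss described above, `newloss = loss + beta*regularization`, can be written out in numpy as follows. This is a hedged sketch rather than the training code: the function names are hypothetical, the choice of which weight matrices to penalize is left to the caller, and the `sum(w**2) / 2` form of the penalty is an assumption.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot_labels):
    """Mean softmax cross entropy over a batch, computed stably."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(one_hot_labels * log_probs).sum(axis=1).mean()

def l2_penalty(weights):
    """Sum of squared weights, halved (ASSUMED convention)."""
    return sum(0.5 * np.sum(w ** 2) for w in weights)

def regularized_loss(logits, one_hot_labels, weights, beta=0.001):
    """newloss = loss + beta * regularization."""
    return softmax_cross_entropy(logits, one_hot_labels) + beta * l2_penalty(weights)

# Placeholder batch: 2 examples, 43 classes, uniform logits.
logits = np.zeros((2, 43))
labels = np.eye(43)[[3, 14]]      # one-hot rows for class ids 3 and 14
weights = [np.ones((2, 2))]       # stand-in weight matrix
loss_value = regularized_loss(logits, labels, weights)
print(loss_value)
```

With uniform logits the cross entropy equals `log(43)`, so the printed value is `log(43)` plus the small penalty term.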
EPOCH 50 ...
Train Accuracy = 0.990
Validation Accuracy = 0.937
Test Accuracy = 0.921
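To make the relationship between epochs and iterations concrete, the short calculation below uses the training set size and hyperparameters reported in this write-up. It assumes the final, smaller batch of each epoch is kept rather than dropped; the write-up does not say which was done.

```python
import math

def iterations_per_epoch(n_samples, batch_size):
    """Number of weight updates in one epoch, keeping the last partial batch."""
    return math.ceil(n_samples / batch_size)

N_TRAIN = 27839     # training set size reported above
BATCH_SIZE = 128
EPOCHS = 50

per_epoch = iterations_per_epoch(N_TRAIN, BATCH_SIZE)
last_batch = N_TRAIN - (per_epoch - 1) * BATCH_SIZE
total_updates = per_epoch * EPOCHS

print("iterations per epoch:", per_epoch)
print("size of last batch:", last_batch)
print("total weight updates:", total_updates)
```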
- After `50` epochs, my validation accuracy is `93.7%`, my training accuracy is `99.0%`, and my test accuracy is `92.1%`.
- I observed no significant difference in accuracy between `LeNetUpgrade` and `LeNetUpgrade_tanh`. Keeping `KEEP_PROB` at `0.5`, which sets half of the activations to zero for every example, is fairly reasonable.
- The new images are downloaded from the internet and then preprocessed as in Figure 4. This time, instead of using OpenCV, I use different techniques to grayscale, normalize, and standardize the new images, as follows.
- For grayscale conversion, I import `rgb2gray` from `skimage.color`.
- For normalization, I use the min-max scaling method.
- For standardization, I use the preprocessing approach from this link to zero-center the data and then normalize it.
- After this process, the new images appear as below.
Figure 4: Grayscale new images processed.
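The min-max scaling and standardization steps listed above might look like the following in numpy. This is only a sketch: the function names, the target range of `[0, 1]`, and the use of per-image statistics are assumptions, since the linked preprocessing approach is not reproduced here.

```python
import numpy as np

def min_max_scale(image, low=0.0, high=1.0):
    """Linearly rescale pixel values into [low, high] (range ASSUMED)."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # A constant image has no contrast; map every pixel to `low`.
        return np.full_like(image, low)
    return low + (image - lo) * (high - low) / (hi - lo)

def standardize(image):
    """Zero-center the image, then divide by its standard deviation."""
    centered = image - image.mean()
    std = centered.std()
    return centered / std if std > 0 else centered

# Placeholder 2x2 grayscale "image".
image = np.array([[0.0, 5.0], [10.0, 20.0]])
scaled = min_max_scale(image)
standardized = standardize(scaled)
print(scaled)
print(standardized.mean(), standardized.std())
```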
I use the proposed neural network model to classify the new images downloaded from the internet. The result is shown in Figure 5.
Figure 5: Predicted signs from 10 new images of the model. Note, A# is the actual sign; P# is the predicted sign.
- The model classifies 7 of the 10 new images correctly, for an accuracy of 70%; it predicted three signs incorrectly. The top-left sign, `be aware of ice/snow`, is mostly covered by snow, so it is hard to recognize; the model predicted it as `roundabout mandatory` (`40`). The second image from the top left is `children crossing`; however, this image was distorted by preprocessing. The original image was too wide, so reshaping it distorted the sign, and the model had difficulty recognizing it. It predicted this sign as `speed limit 80km/h`. The second image from the bottom left is `speed limit 100 km/h`. It is blurred and there are some obstacles in front of it, so the model failed to recognize it. All the other new images are clear enough to be recognized correctly by the model.
The prediction result of my neural network model for new images:
New Image | Prediction |
---|---|
Be Aware of Ice/Snow | General caution |
Children crossing | Right-of-way at the next intersection |
No entry | No entry |
Roundabout mandatory | Roundabout mandatory |
Slippery road | Slippery road |
Speed limit 70km/h | Speed limit 70km/h |
Speed limit 100km/h | No passing for vehicles over 3.5 metric tons |
Speed limit 60km/h | Speed limit 60km/h |
Stop | Stop |
Turn left ahead | Turn left ahead |
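The 70% figure can be recomputed directly from the table above. The snippet below simply transcribes the table rows as (actual, predicted) pairs and counts the matches; the variable names are chosen for this example.

```python
# (actual sign, predicted sign) pairs transcribed from the table above.
results = [
    ("Be Aware of Ice/Snow", "General caution"),
    ("Children crossing", "Right-of-way at the next intersection"),
    ("No entry", "No entry"),
    ("Roundabout mandatory", "Roundabout mandatory"),
    ("Slippery road", "Slippery road"),
    ("Speed limit 70km/h", "Speed limit 70km/h"),
    ("Speed limit 100km/h", "No passing for vehicles over 3.5 metric tons"),
    ("Speed limit 60km/h", "Speed limit 60km/h"),
    ("Stop", "Stop"),
    ("Turn left ahead", "Turn left ahead"),
]

correct = sum(actual == predicted for actual, predicted in results)
accuracy = correct / len(results)
misclassified = [actual for actual, predicted in results if actual != predicted]

print(f"accuracy: {accuracy:.0%} ({correct}/{len(results)})")
print("misclassified:", misclassified)
```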
- Note: As mentioned, using the spatial transformer to preprocess the images properly should improve the classification accuracy.
The softmax probabilities are visualized below.
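For reference, softmax probabilities and the most likely classes for an image can be derived from the model's logits roughly as follows. This is a generic numpy sketch, not the notebook's code: the function names, the choice of `k=5`, and the placeholder logits are all assumptions.

```python
import numpy as np

def softmax(logits):
    """Convert logits to probabilities along the last axis, stably."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def top_k(probs, k=5):
    """Return the k largest probabilities per row and their class ids."""
    order = np.argsort(probs, axis=-1)[..., ::-1][..., :k]
    return np.take_along_axis(probs, order, axis=-1), order

# Placeholder logits for one image over 43 classes.
logits = np.zeros((1, 43))
logits[0, 14] = 10.0   # strongest class id in this made-up example
logits[0, 17] = 5.0    # runner-up

probs = softmax(logits)
top_probs, top_ids = top_k(probs, k=5)
print(top_ids[0], top_probs[0])
```

Plotting `top_probs` against `top_ids` for each new image gives a bar chart of the kind this section refers to.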
I export the images from each layer as follows.
Be Aware of Ice/Snow
Children crossing
No entry
Roundabout mandatory
Slippery road
Speed limit 70km/h
Speed limit 100km/h
Speed limit 60km/h
Stop
Turn left ahead