The API documentation also gives a good introduction; what follows is basically a paraphrase from it:
There are two training-method implementations in OpenCV:
- classical random-sequential back-propagation
- batch RPROP (the default; see the sketch after these lists)

Activation functions:
- Identity - simply the weighted sum
- Sigmoid (the default) - the output is squashed into the range -1 to 1, with the zero-crossing at x = 0
- Gaussian (not completely supported?)

Network size:
- too big a network causes over-fitting
- a bigger network takes longer to train
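To make the choices above concrete, here is a minimal sketch using the OpenCV 2.x CvANN_MLP class (the one behind the letter_recog.cpp sample). The layer sizes and the bp_dw_scale/rp_dw0 values come from later in this post; the termination criteria and the alpha/beta values of 1 are placeholders I picked, not something the sample prescribes.

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

int main()
{
    // Topology used later in the post: 16 inputs, two hidden layers of
    // 100 nodes each, 26 outputs (one per letter A..Z).
    int layer_sz[] = { 16, 100, 100, 26 };
    cv::Mat layer_sizes(1, 4, CV_32S, layer_sz);

    CvANN_MLP mlp;
    // Symmetric sigmoid is the default activation; the last two arguments
    // are the free parameters (alpha, beta) that shape it.
    mlp.create(layer_sizes, CvANN_MLP::SIGMOID_SYM, 1, 1);

    CvANN_MLP_TrainParams params;
    // Placeholder termination criteria: at most 300 iterations or an
    // error change below 0.01, whichever comes first.
    params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER + CV_TERMCRIT_EPS,
                                      300, 0.01);

    // Batch RPROP (the default method); rp_dw0 is the initial update step.
    params.train_method = CvANN_MLP_TrainParams::RPROP;
    params.rp_dw0 = 0.05;

    // Alternatively, classical random-sequential back-propagation:
    // params.train_method    = CvANN_MLP_TrainParams::BACKPROP;
    // params.bp_dw_scale     = 0.001;  // learning rate
    // params.bp_moment_scale = 0.0;    // momentum term

    // Training would then be something like:
    //   mlp.train(train_inputs, unrolled_responses,
    //             cv::Mat(), cv::Mat(), params);
    return 0;
}
```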
The ANN is designed for numerical data; the API documentation suggests a work-around for handling categorical data (the response unrolling described, and sketched, further below).
Parameters
- Training Method
- Activation function
- Network topology: Number of nodes at each layer and number of layers
- Free parameters of the activation function (alpha, beta) - these shape the Sigmoid
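As a point of reference, the OpenCV documentation gives the symmetric sigmoid with free parameters alpha and beta as

f(x) = beta * (1 - e^(-alpha*x)) / (1 + e^(-alpha*x))

so alpha controls how steep the curve is and beta scales the output range.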
- 80% of the sample data is used for training, the rest for testing
- Topology: input layer 16 nodes, 2 hidden layers @ 100 nodes each, output layer 26 nodes
- Unroll the categorical response data (A...Z) into numerical form. Each response becomes a 26-element vector, conceptually a bit-vector: if the response is 'C', the 3rd element is set to 1 and the others are kept at 0. A 26-element vector is used because it corresponds to the 26 output-layer nodes (see the sketch after this list).
- Is there any easy way in OpenCV to create a new matrix out of an arbitrary set of columns from an existing matrix (without copying)?
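As a concrete illustration of the response unrolling described above, here is a minimal sketch. The function name and the assumption that the responses arrive as an Nx1 CV_32S column of class indices (0..25) are mine, not the sample's.

```cpp
#include <opencv2/core/core.hpp>

// Turn an Nx1 column of class indices (0..25, i.e. the letter minus 'A')
// into Nx26 rows of zeros with a single 1 at the class position.
cv::Mat unroll_responses(const cv::Mat& responses)
{
    const int class_count = 26;
    cv::Mat unrolled = cv::Mat::zeros(responses.rows, class_count, CV_32F);
    for (int i = 0; i < responses.rows; i++)
    {
        int cls = responses.at<int>(i, 0);   // e.g. 'C' gives index 2
        unrolled.at<float>(i, cls) = 1.f;    // 3rd element set to 1, rest 0
    }
    return unrolled;
}
```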
- Classical random-sequential BP (bp_dw_scale=0.001): train 95.2%, test 93.5% (build time 3367.66 secs!)
- RPROP (rp_dw0=0.05): train 75.6%, test 74.3% (build time 1350.67 secs)
- RPROP (rp_dw0=0.1): train 78.9%, test 78% (build time 1333.09 secs)
- RPROP (rp_dw0=1.0): train 15.2%, test 14.3% (build time 1329.59 secs)
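For context on how the train/test percentages above can be measured: the network produces 26 activations per sample, and the predicted letter is the index of the strongest one. A rough sketch (the helper name and details are mine; the sample's own evaluation loop works in a similar spirit):

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

// Classify one 1x16 CV_32F feature row and map the strongest of the
// 26 output activations back to a letter.
char predict_letter(const CvANN_MLP& mlp, const cv::Mat& sample)
{
    cv::Mat output(1, 26, CV_32F);          // one activation per class
    mlp.predict(sample, output);
    cv::Point max_loc;
    cv::minMaxLoc(output, 0, 0, 0, &max_loc);
    return static_cast<char>('A' + max_loc.x);
}
```

Accuracy is then just the fraction of samples whose predicted letter matches the label.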
Code
The RPROP parameter rp_dw0 is initialized to different values (0.1, 1.0) on various occasions. Why?
Readings
- http://en.wikipedia.org/wiki/Backpropagation - Wikipedia article on the back-propagation algorithm.
- Y. LeCun, L. Bottou, G. B. Orr and K.-R. Müller, "Efficient BackProp", in Neural Networks: Tricks of the Trade, Springer Lecture Notes in Computer Science 1524, pp. 5-50, 1998.
- M. Riedmiller and H. Braun, "A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm", Proc. ICNN, San Francisco, 1993.
Great - but would you mind sharing some code?
@Oleg I am not sure which code you are referring to. I was using the 'letter_recog.cpp' sample from OpenCV 2.2.
Hi,
I have 2 questions:
1. Where do your samples come from, and what do they look like?
2. Could you tell me more about the principal component analysis you performed on your data?
I am working on a guide-robot project doing computer vision, and I will probably use a NN to recognize classroom numbers.
Thx!!
I don't have real-life samples for this. I simply used the feature vectors and responses from letter-recognition.data, which is located at OpenCV/samples/cpp/.
I have not performed PCA either, because I don't have the samples or a feature-extraction method.
I just looked it up in the OpenCV book, and I think you can find more information here:
http://yann.lecun.com/exdb/mnist/
And the paper by LeCun et al:
Gradient-Based Learning Applied to Document Recognition
Hope this helps with your robot project!