perClass Documentation
version 5.1 (11-May-2017)

kb23: Example on image classification

Keywords: image data, PCA, classifier, outliers, classifier execution

23.1. Introduction ↩

This example shows how to approach a typical image classification problem. Suppose, we wish to classify handwritten digit from pre-segmented images.

23.2. Data set ↩

We consider a handwritten digits data set containing 2000 images of ten digit classes scaled to 16x16 matrix. Because pixels are directly used as features, the data set contains 256 dimensions.

>> load digits 
>> a
'Digits' 2000 by 256 sddata, 10 classes: [200  200  200  200  200  200  200  200  200  200]

Each image is labeled into one of the digit classes.

>> a.lab
sdlab with 2000 entries, 10 groups
>> a.lab'
 ind name size percentage
   1 0     200 (10.0%)
   2 1     200 (10.0%)
   3 2     200 (10.0%)
   4 3     200 (10.0%)
   5 4     200 (10.0%)
   6 5     200 (10.0%)
   7 6     200 (10.0%)
   8 7     200 (10.0%)
   9 8     200 (10.0%)
  10 9     200 (10.0%)

We will split the data set into training and test subsets. In this way, we will be able to test classifier performance on images unused when training the models.

>> [tr,ts]=randsubset(a,0.5)
'Digits' 1000 by 256 sddata, 10 classes: [100  100  100  100  100  100  100  100  100  100]
'Digits' 1000 by 256 sddata, 10 classes: [100  100  100  100  100  100  100  100  100  100]

23.3. Visualizing the data set ↩

Because the features represent directly the pixels the resulting feature space is very sparse. We will use the principal component analysis (PCA) to extract a lower-dimensional subspace where the data may be easier visualized.

We will use sdpca command and request it to find projection to 40D subspace of the original 256D space.

>> p=sdpca(tr,40)
PCA pipeline            256x40  73% of variance (sdp_affine)

The result of sdpca is a pipeline object p. We may see, that projecting into the 40D subspace, we retain 73% of the total variance in the data.

We may now project the training data set into this 40D subspace by multiplying the original set tr by the pipeline p:

>> tr2=tr*p
'Digits' 1000 by 40 sddata, 10 classes: [100  100  100  100  100  100  100  100  100  100]

In order to understand our data set, we use the sdscatter command:

>> sdscatter(tr2)

It shows all training samples for the two first principal components. Using the cursor keys, we may change features on the horizontal and vertical axes. By pressing 'l' (legend), we show the class names.

interactive scatter plot

By moving the mouse over the figure, we focus the closes sample with the black round cursor. We may see both sample index and its class in the Figure title. The sdscatter also allows us to visualize the original images under the sample cursor. Use Scatter menu/Show sample inspector command and fill in the name of the data set containing original images (tr in our example). A small image window will open showing the data sample under cursor.

scatter plot showing image corresponding to a sample

Sample inspector allows us to better understand our data. Lets, for example, consider the digit "1". We position mouse over the examples of class "1" and using the right-mouse-click open the context menu. By selecting "Show only this class" we let the scatter to hide other classes.

scatter plot with inspector focused on digit 1

We can see that the outliers of class "1" are either shifted with respect to the center of the bounding box or contain segmentation artifacts.

If we wish to remove such cases from training, we may label the outliers using "Paint class/Create new class" command, hide them and save the clean data set back in Matlab workspace. You may see the example of such outlier removal in this video.

23.4. Building a digit classifier ↩

We will now build a digit classifier. It is a good practice to start by designing simple models as they often exhibit high robustness and may serve as a reference baseline when designing more sophisticated classifiers. One such classical classifier is Fisher linear discriminant (sdfisher).

It derives a linear subspace maximizing separation between the classes and trains a linear Gaussian classifier in it.

>> p2=sdfisher(tr2)
sequential pipeline     40x10 'Fisher linear discriminant'
 1  LDA                    40x9  (sdp_affine)
 2  Gauss eq.cov.           9x10 10 classes, 10 components (sdp_normal)
 3  Output normalization   10x10 (sdp_norm)

We may now put together the complete classifier, composed of PCA projection and the Fisher discriminant:

>> pd=sddecide(p*p2)
sequential pipeline     256x1 'PCA+LDA+Gauss eq.cov.+Output normalization+Decision'
 1  PCA                   256x40 73%% of variance (sdp_affine)
 2  LDA                    40x9  (sdp_affine)
 3  Gauss eq.cov.           9x10 10 classes, 10 components (sdp_normal)
 4  Output normalization   10x10 (sdp_norm)
 5  Decision               10x1  weighting, 10 classes, 1 ops at op 1 (sdp_decide)

Note, that we explicitly state that our classifier pd should return decisions using the sddecide command. If we omitted sddecide, the pipeline would return soft output (confidences), not crisp decisions.

Our classifier is now ready to be executed on any data with 256 features. Here, we apply it to the test set and obtain decisions

>> dec=ts*pd
sdlab with 1000 entries, 10 groups

We will use sdtest function to estimate mean class error:

>> sdtest(ts.lab,dec)

ans =

    0.1510

The Fisher classifier yields 15% error on the test set.

We may quickly test a different classifier type, for example, the k-nearest neighbor.

>> p2=sdknn(tr2,5)
5-NN classifier (dist) pipeline 40x10   (sdp_knn)

>> pd2=p*p2*sddecide
sequential pipeline     256x1 'PCA+5-NN classifier (dist)+Decision'
 1  PCA                   256x40 73%% of variance (sdp_affine)
 2  5-NN classifier (dist)   40x10 (sdp_knn)
 3  Decision               10x1  weighting, 10 classes, 1 ops at op 1 (sdp_decide)

>> sdtest(ts,pd2)

ans =

0.1240

Our 5-NN yields better performance than the Fisher classifier above. Note that we may pass to the sdtest the test set and the trained pipeline instead of the labels and decisions.

23.5. Understanding the classifier results ↩

Training a classifier is but one step in overall system design. It is important, that we understand in more detail how our classifier works and, especially, where it fails.

To get a more detailed view, we may display a confusion matrix:

>> sdconfmat(ts.lab,ts*pd2)

ans =

 True      | Decisions
 Labels    |       0       1       2       3       4       5       6       7       8       9  | Totals
-------------------------------------------------------------------------------------------------------
 0         |     94       2       3       0       0       0       1       0       0       0   |    100
 1         |      0      95       1       0       2       0       2       0       0       0   |    100
 2         |      2       1      92       0       0       0       0       2       2       1   |    100
 3         |      0       1       2      88       0       3       0       2       3       1   |    100
 4         |      0       8       2       0      71       0       0       2       1      16   |    100
 5         |      1       0       1       1       0      92       2       0       0       3   |    100
 6         |      0       4       1       1       0       0      94       0       0       0   |    100
 7         |      1       1       0       0       1       0       0      87       0      10   |    100
 8         |      0       8       1       9       2       1       0       4      70       5   |    100
 9         |      0       1       1       0       2       0       0       3       0      93   |    100
-------------------------------------------------------------------------------------------------------
 Totals    |     98     121     104      99      78      96      99     100      76     129   |   1000

We may notice, that our k-NN rule misclassified higher number of 4's into 9's. Can we see, what examples go wrong?

We will use the sdconfmatind command to find indices of digit '4' labeled as '9':

>> ind=sdconfmatind(ts.lab,ts*pd2,'4','9');

Now, we will create a new set of labels called 'err' copying the class labels and mark the misclassified examples so we may easily inspect them.

>> ts.err=ts.lab
'Digits' 1000 by 256 sddata, 10 classes: [100  100  100  100  100  100  100  100  100  100]

>> ts.err(ind)='4-err'
'Digits' 1000 by 256 sddata, 10 classes: [100  100  100  100  100  100  100  100  100  100]

The ts.err labels now contain 11 groups:

>> ts.err
sdlab with 1000 entries, 11 groups

>> ts.err'
 ind name     size percentage
   1 0         100 (10.0%)
   2 1         100 (10.0%)
   3 2         100 (10.0%)
   4 3         100 (10.0%)
   5 4          84 ( 8.4%)
   6 5         100 (10.0%)
   7 6         100 (10.0%)
   8 7         100 (10.0%)
   9 8         100 (10.0%)
  10 9         100 (10.0%)
  11 4-err      16 ( 1.6%)

Let us now open the scatter plot and inspect the errors more closely:

>> sdscatter(ts*p)

Go to 'Scatter' menu and choose 'Use property'. You will see that two sets of labels are present, namely 'lab' and 'err'. Select 'err' and switch on the legend by pressing 'l' key.

Now we need to select only the relevant subset of classes, i.e. '4','9', and '4-err'. You may do it by choosing them in the 'Scatter'/'Sample filter' menu. Faster way to specify a subset of classes is to press the '/' key and entering the substrings we are interested in. In our case, we may enter '4|9' meaning "all classes that contain 4 OR 9 in their names".

We may now open the sample inspector on the ts data set and investigate in detail what shapes of 4's get mixed with 9's.

scatter plot inspecting misclassified samples

Often, such an analysis of errors helps us to develop more powerful features addressing the specific problem.

23.6. Using the classifier in custom application ↩

When we are happy with our digit classifier, we wish to run it in our custom application. To do that, we use perClass Runtime which allows us to call any classifier we train from arbitrary application.

In our example, we implemented a simple GUI application in RealBasic which allows us to interactively draw digits in 16x16 image and run a classifier. When implementing the GUI, we do not need to decide for the classifier type. We only fix the dimensionality our classifier expects to the 256 input features representing the pixels.

On the side of perClass Toolbox, running in Matlab, we will export our classifier into a pipeline file using sdexport command:

>> sdexport(pd2,'digit_classifier.ppl')
Exporting pipeline..ok
This pipeline requires perClass runtime version 3.0 (18-jun-2011) or higher.

Outside Matlab, we can see the directory with all GUI application files. The DigitsDemo and DigitsDemo Libs contains the application code, perclass.dll is the perClass Runtime library and digits_classifier.ppl is the exported classifier pipeline.

Classifier execution outside Matlab.

We will now execute the DigitsDemo.exe:

Digit classification demo.

We will load a pipeline and may start drawing digits on the 16x16 grid.

Digit classification demo running perClass classifiers.

You may now return back to Matlab, update your classifier, re-export it and immediately experience the changes in the live demo.

This concludes our example.