Simplest way of chaining classifier outputs. 
Posted: 08 July 2009 05:37 PM
Newbie
Total Posts:  13
Joined  2009-07-08

This is probably astoundingly simple to achieve, but I’m rather braindead right now!

I have two separate feature sets, let’s call them X and Y

I want to take these two through a classifier, for example LDC.

Then I want to take the output of each through a third classifier for class assignment.

ie:

X - 30 classes LDC --\
                      >-- BPXNC - 30 classes (final assignment)
Y - 30 classes LDC --/

What would be the simplest way to achieve this?

Thanks in advance.

Posted: 08 July 2009 06:22 PM   [ # 1 ]
Administrator
Total Posts:  236
Joined  2008-04-26

Hi Sproik,

here is a classifier combining example:

>> a=setlablist(gendatd(100,10),{'apple','banana'})
Difficult Dataset, 100 by 10 dataset with 2 classes: [51  49]

>> wf1=featsel(10,1:6)
Feature Selection, 10 to 6 trained  mapping   --> featsel
>> wf2=featsel(10,7:10)
Feature Selection, 10 to 4 trained  mapping   --> featsel

>> w1=a*wf1*ldc
Bayes-Normal-1, 6 to 2 trained  mapping   --> normal_map
>> w2=a*wf2*ldc
Bayes-Normal-1, 4 to 2 trained  mapping   --> normal_map

we create a stacked mapping
>> W=[wf1*w1 wf2*w2]
10 to 4 trained  mapping   --> stacked

soft outputs of the stacked two base classifiers
>> out=a*W
Difficult Dataset, 100 by 4 dataset with 2 classes: [51  49]

the feature labels refer to the class outputs for the first and second base classifier
>> getfeatlab(out)
ans =
apple 
banana
apple 
banana

here we train a combiner on the base classifier soft outputs
>> wcomb=out*fisherc
Fisher, 4 to 2 trained classifier --> affine

and here we construct the complete mapping:
>> Wfinal=W*wcomb
10 to 2 trained  mapping   --> sequential

>> b=setlablist(gendatd(100,10),{'apple','banana'})
Difficult Dataset, 100 by 10 dataset with 2 classes: [53  47]

>> confmat(getlab(b),b*Wfinal*labeld)

  True   | Estimated Labels
  Labels | apple  banana| Totals
 --------|--------------|-------
  apple  |   50      3  |   53
  banana |   14     33  |   47
 --------|--------------|-------
  Totals |   64     36  |  100

Note that we use the same data for training the base classifiers and for training the combiner. Using data twice results in a bias, so this strategy is safe only for very small or very large sample sizes. A simple alternative is to use one subset of the training data to train the base classifiers and another subset to train the combiner. A more sophisticated alternative that leverages all available data is stacked generalization (see sdstackgen in PRSD Studio).
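
The simple alternative (separate subsets for the base classifiers and the combiner) could look roughly like the sketch below. It assumes PRTools is on the MATLAB path and reuses the artificial gendatd data from the example above; the 50/50 split and the names abase and acomb are arbitrary choices made for illustration only.

```matlab
% Sketch: disjoint training subsets for the base classifiers and the combiner.
a = setlablist(gendatd(200,10),{'apple','banana'});
[abase,acomb] = gendat(a,0.5);    % random split, about half of each class per part

wf1 = featsel(10,1:6);            % first base classifier sees features 1-6
wf2 = featsel(10,7:10);           % second base classifier sees features 7-10

% base classifiers are trained on abase only
w1 = abase*wf1*ldc;
w2 = abase*wf2*ldc;
W  = [wf1*w1 wf2*w2];             % stacked mapping, 10 to 4

% the combiner is trained on the base classifier outputs for acomb only
wcomb  = acomb*W*fisherc;
Wfinal = W*wcomb;

% error estimate on an independent test set
b = setlablist(gendatd(100,10),{'apple','banana'});
e = b*Wfinal*testc
```

The price is that each layer sees only part of the training data, which matters when the sample size is small.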

Hope it helps,

Pavel

Posted: 09 July 2009 10:44 AM   [ # 2 ]
Newbie

Hi Pavel,

Thanks for the prompt reply.  I have read the example but am still struggling to follow.

What I have is a 50/50 split between my training and testing features.

I have two independent feature sets.  Let’s call these set1 and set2.

I label up each set (both have the same labelling):

labs=genlab(perclass*ones(1,numberoffeatures),[1:numberoffeatures]');
training_set1=dataset(features_train_set1,labs);
testing_set1=dataset(features_test_set1,labs);
training_set2=dataset(features_train_set2,labs);
testing_set2=dataset(features_test_set2,labs);

I then train my classifiers:

Wldc_set1=ldc(training_set1);
Wldc_set2=ldc(training_set2);

When I test them using:

ldc_results_1=(testing_set1*Wldc_set1);
ldc_results_2=(testing_set2*Wldc_set2);

I get the intended results for each classifier, but I do not see how to create a third classifier that takes the outputs of each of these and performs a joint classification.

[ Edited: 09 July 2009 12:55 PM by Sproik]

Posted: 09 July 2009 12:49 PM   [ # 3 ]
Moderator
Total Posts:  250
Joined  2008-11-08

Hi Sproik,

We are a little bit confused about your terminology (‘train features’ and ‘test features’). Probably you mean objects. The following example may show what you want:

>> train_set1 = gendath; % set1 has 2 features
>> test_set1 = gendath;
>> train_set2 = gendatd([50 50],3); % set2 has 3 features
>> test_set2 = gendatd([50 50],3);
>> w = [train_set1 train_set2] * (parallel([ldc*classc; ldc*classc],[2,3])*ldc);
% this informs the parallel combiner about the number of features per set
% classc takes care that we handle normalized outputs on [0,1]
% the 4 outputs (2 per classifier) are used by the combining classifier (again ldc)
>> [test_set1 test_set2]*w*testc
0.030

It might be better to use an independent training set for the combiner, e.g.

>> train2_set1 = gendath; % set1 has 2 features
>> train2_set2 = gendatd([50 50],3); % set2 has 3 features
>> w_base = [train_set1 train_set2] * parallel([ldc*classc; ldc*classc],[2,3]);
>> w = [train2_set1 train2_set2]*(w_base*ldc);

Hope this helps,

Bob Duin

Posted: 09 July 2009 01:41 PM   [ # 4 ]
Newbie

Hi Bob,

Apologies, I had a few typos in my code.  I have corrected them.

It’s always hard to try and explain in writing, but let me try and explain section by section:

labs=genlab(perclass*ones(1,numberoffeatures),[1:numberoffeatures]');

The above generates labels for the dataset.  Basically these take the form:
1;1;1;1;2;2;2;2;3;3;3;3.....78;78;78;78;79;79;79;79;

Next I take my 632 features.  These are made up of 79 unique classes with 8 24-dimensional feature vectors for each.  I split this into two giving me a training feature set of 316 x 24 dimensional feature vectors split into 79 unique classes (4 sets of vectors per class for training).  I do the same again for testing as shown below:

training_set1=dataset(features_train_set1,labs); %Apply labels to data
testing_set1=dataset(features_test_set1,labs);

Now I have another completely different set of features for the SAME objects.  If these were features extracted from photographs of bolts, then we could say set_1 was from a side perspective and set_2 was from a birds eye perspective.  Each vector within set 2 is of the same object as set 1 and is handled in exactly the same way:

training_set2=dataset(features_train_set2,labs); %Apply labels to data
testing_set2=dataset(features_test_set2,labs);

At this point I now have four distinct sets of objects.  set_1 and set_2 training are used to train two different LDC classifiers as shown below.

Wldc_set1=ldc(training_set1);
Wldc_set2=ldc(training_set2);

Once trained, I then test these classifiers using the test sets of data.

ldc_results_1=(testing_set1*Wldc_set1);
ldc_results_2=(testing_set2*Wldc_set2);

By performing:

testc(ldc_results_1)
testc(ldc_results_2)

I can see my error rates readily.

What I want to do, however, is take Wldc_set1 and Wldc_set2 (my trained classifiers) and feed them into a third classifier that looks at both outputs and forms a decision based on the training data I will supply it.

Then I want to test the system using the test data and see if the overall result improves.

I have no need to create artificial data, as the object of these tests is to analyse performance on real data.

Hope that explains what I am trying to achieve a bit better, sorry for the confusion.

Posted: 09 July 2009 04:06 PM   [ # 5 ]
Newbie

Hmm, well I think I’ve made progress.  I’ve added the following to my code:

combiner=[training_set1 training_set2] * (parallel([ldc*classc; ldc*classc],[24,24])*ldc);
[testing_set1 testing_set2]*combiner*testc

However, I’m getting a huge error!  I was hitting 0.076 and 0.092 errors for the individual classifier outputs.  For the combiner, I’m now seeing an error of 0.671!  That’s almost a factor of 10 increase…

Posted: 09 July 2009 04:29 PM   [ # 6 ]
Moderator

Hi Sproik,

Please change your terminology: you have 632 OBJECTS and 2 times 24 FEATURES. Otherwise we will
be confused forever. Your original two problems are 24-dimensional. You use 316 objects for training
a classifier, that is fine, even for a 79 class problem.

In the second layer you have 2*79 = 158 classifier outputs that you use as inputs (features) for
the combining classifier. Still, you use the same 316 objects for training (which is biased and
thereby bad). 158 features on 316 objects is too much for ldc. It should be regularised, or you
should use a simpler classifier (nmc?) or even a fixed combiner (maxc?).
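
A rough sketch of these three second-layer options, placed on top of the same parallel base layer, is given here. It uses small artificial sets (2 and 3 features) so that it runs on its own; for the 2 x 24 feature problem in this thread the sizes would be [24,24]. The regularisation values 0.5 passed to ldc are arbitrary placeholders, not recommendations.

```matlab
% Sketch: three second-layer options on top of the same parallel base layer.
train = [gendath gendatd([50 50],3)];   % set1: 2 features, set2: 3 features
test  = [gendath gendatd([50 50],3)];

base = parallel([ldc*classc; ldc*classc],[2,3]);   % untrained base layer

w_reg = train*(base*ldc([],0.5,0.5));   % regularised ldc; 0.5 is an arbitrary value
w_nmc = train*(base*nmc);               % simpler trained combiner
w_max = train*(base*maxc);              % fixed combiner, nothing to train in layer 2

% test errors of the three combined systems
e = [test*w_reg*testc, test*w_nmc*testc, test*w_max*testc]
```

Note that the trained combiners here still reuse the base layer training objects, so the bias mentioned above remains; only the complexity of the second layer is reduced.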

Your code is now fine, so you are back in science.

Good luck!

Bob Duin

Posted: 09 July 2009 04:40 PM   [ # 7 ]
Newbie

Thanks Bob - appreciate the clarification on the terminology.  Definitely helps when we all speak the same language!

Unfortunately I am severely limited by the available data as I only have 8 samples of each class to work from.  Tomorrow I will attempt to split the data and use 3 for layer 1 training, 3 new for layer 2 and the final 2 samples for testing.
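
A 3/3/2 split of this kind could be set up with plain index vectors, as in the sketch below. It assumes that all 632 objects are stored class by class (8 consecutive rows per class); that layout is an assumption, not something stated in this thread.

```matlab
% Sketch: 3/3/2 split per class, assuming objects are stored class by class.
nclass   = 79;                            % number of classes
perclass = 8;                             % objects per class
pos = repmat((1:perclass)',nclass,1);     % position of each object within its class

idx_base = find(pos <= 3);                % 3 per class for the layer 1 classifiers
idx_comb = find(pos >= 4 & pos <= 6);     % 3 new per class for the layer 2 combiner
idx_test = find(pos >= 7);                % last 2 per class for testing
```

Applying the same index vectors to the rows of both feature matrices (for example a hypothetical features_set1(idx_base,:) and features_set2(idx_base,:)) keeps the two views of each object aligned across the three subsets.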

In the meantime, it all just clicked!  I have now tried it with nmc and maxc and received 0.15 error (NMC) and 0.0474 (MAXC).  I will play around some more.  Many thanks for your patience and explanations!

A final question, is there some documentation that is a bit more explanatory on the “parallel” command?  I have the PRTools PDF and the help command in matlab, but could do with some other material, specifically about the use of parallel when compared with stacked, sequential etc…

Posted: 17 July 2009 11:02 PM   [ # 8 ]
Newbie

Made massive progress this week, thanks hugely to the support here.

Would anyone be able to indicate how the following code could be run using an x-node neural network instead of the default 5?  (Normally I would use bpxnc(training_set, x).)

combiner=[training_set1 training_set2] * (parallel([bpxnc*classc; bpxnc*classc],[24,24])*ldc);
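
One possible approach, sketched here under an assumption: PRTools classifiers called with an empty dataset normally return an untrained mapping that stores the remaining parameters, so bpxnc([],x) should yield an untrained network with x hidden units. This is not confirmed anywhere in this thread, and the artificial data and the value of x are placeholders; with the real data the sizes would be [24,24].

```matlab
% Sketch: parallel base layer of neural networks with x hidden units each.
x  = 10;                                 % number of hidden units, arbitrary here
nn = bpxnc([],x);                        % untrained mapping (assumed PRTools convention)

train = [gendath gendatd([50 50],3)];    % set1: 2 features, set2: 3 features
combiner = train*(parallel([nn*classc; nn*classc],[2,3])*ldc);
```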
