Classification costs and feature selection
Posted: 27 September 2010 07:12 PM
Newbie
Total Posts:  20
Joined  2009-10-19

Hi,

I have a question regarding feature selection and classification cost.

Suppose I have a dataset A with equal classification costs and I found the most important features using a feature selection mechanism.

If I decide to change the classification costs, should I just alter the trained classifier, as is done in prex_cost:

cost = [0.0 1.0 1.0;
9.0 0.0 1.0;
1.0 1.0 0.0];
wc = w*classc*costm([],cost,class_labels);

or should I go back and alter the classification costs IN THE DATASET, e.g.,

A1=A*costm([],cost_matrix)

and do the feature selection again?

Thanks,

Jorge

Posted: 13 October 2010 09:29 AM   [ # 1 ]
Moderator
Total Posts:  253
Joined  2008-11-08

Costs are a property of the classification problem, like prior probabilities, and should therefore be stored in the dataset. In the prex_cost example the costs are changed afterwards, in the classification matrix. Depending on the classification procedure this might yield the same result. In general, and formally, however, it is better to assign them to the dataset.
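
To make the "changed afterwards in the classification matrix" case concrete, here is a minimal sketch in plain MATLAB (no PRTools calls) of a minimum-expected-cost decision applied to classifier posteriors. The posteriors are invented for illustration, and the convention that cost(i,j) is the cost of assigning an object of true class i to class j is an assumption; the cost matrix is the one from the question.

```matlab
% Hypothetical posteriors for 3 objects (rows) over 3 classes (columns).
% These numbers are made up for illustration only.
post = [0.5 0.4 0.1;
        0.2 0.7 0.1;
        0.3 0.3 0.4];

% Cost matrix from the question. Assumed convention: cost(i,j) is the
% cost of assigning an object of true class i to class j.
cost = [0.0 1.0 1.0;
        9.0 0.0 1.0;
        1.0 1.0 0.0];

% Expected cost of deciding class j for each object:
% expcost(n,j) = sum_i post(n,i) * cost(i,j)
expcost = post * cost;

[~, lab_map]  = max(post, [], 2);     % maximum-posterior labels (costs ignored)
[~, lab_cost] = min(expcost, [], 2);  % minimum-expected-cost labels

disp([lab_map lab_cost]);
```

With these numbers only the first object changes label (from class 1 to class 2), because misassigning a class 2 object to class 1 is penalised with cost 9.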

Bob Duin

Posted: 13 October 2010 02:08 PM   [ # 2 ]
Newbie
Total Posts:  20
Joined  2009-10-19

Hi Bob,

Thanks for your answer.

I set the cost in the dataset as below:

E=setcost(E,[0 1;2 1],[1 2]')

Then I ran a feature selection:

[W,R]=featselp(E,'NN')
Floating FeatSel, 7 to 4 trained mapping --> featsel

R =

1.0000 0.8667 6.0000
2.0000 0.8733 3.0000
3.0000 0.9200 7.0000
2.0000 0.9133 -6.0000
3.0000 0.9333 1.0000
4.0000 0.9467 2.0000
5.0000 0.9467 6.0000
6.0000 0.9400 5.0000
7.0000 0.8400 4.0000

>> +W

ans =

3 7 1 2

Then I changed the cost matrix:

EN=setcost(E,[0 1;1 0],[1 2]')
150 by 7 dataset with 2 classes: [75 75]

getcost(EN)

ans =

0 1
1 0

Then I ran the feature selection again:

[W,R]=featselp(EN,'NN')

and I found the same selected features as before.

I thought it would give me different features, since the leave-one-out performance would be altered by the cost.

Why did that not happen? Could it be a coincidence?

Thanks,

Jorge

Posted: 14 October 2010 12:15 PM   [ # 3 ]
Moderator
Total Posts:  253
Joined  2008-11-08

The 'NN' criterion just counts errors in NN assignments. There are no costs there. You may try replacing the criterion with a classifier like parzenc*classc. In this case the posteriors are weighted by the costs. While experimenting myself I found a bug in costm. It may be better to replace it (attached).
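
As a rough illustration of why an error-counting criterion cannot react to the cost matrix, here is a sketch in plain MATLAB (not PRTools code). The true labels and assignments are hypothetical; the two cost matrices are the ones used earlier in the thread, again assuming cost(i,j) is the cost of assigning true class i to class j.

```matlab
% Hypothetical true labels and nearest-neighbour assignments for 6 objects.
true_lab = [1 1 1 2 2 2]';
pred_lab = [1 1 2 2 2 1]';

% Error counting: the fraction of wrong assignments. No cost matrix enters.
err = mean(true_lab ~= pred_lab);

% The two cost matrices from the thread.
costA = [0 1; 2 1];
costB = [0 1; 1 0];

% Average cost: look up cost(true, assigned) for every object.
idx  = sub2ind([2 2], true_lab, pred_lab);
avgA = mean(costA(idx));
avgB = mean(costB(idx));

fprintf('error = %.4f, avg cost A = %.4f, avg cost B = %.4f\n', err, avgA, avgB);
```

The error rate is 2/6 whichever cost matrix is stored with the data, so a selection driven by it ranks feature sets identically. The average cost is 5/6 under the first matrix and 2/6 under the second, so only a criterion that uses the costs can prefer different features.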

Bob Duin

File Attachments
costm.zip  (File Size: 2KB - Downloads: 67)