using sdp_affine in a pipeline
Posted: 22 September 2010 04:57 PM
Novice

Dear PRSD Studio Team,

I use sdp_affine for scaling feature vectors and get different classification results depending on the grouping of sdp_affine in the call. Here is an example:

a=mysddata; %some dataset
w=myclassifier; %some classifier with decision
s=sdp_affine(scaleFact,[],sdlab()); %scaleFact...vector of scaling parameters

out1=a*s*w;
out2=(a*s)*w;
out3=a*(s*w);
out4=a*[s w]; %create a pipeline which scales and classifies

In my opinion, out1 to out4 should all give the same result. However, out1 equals out2 and out3 equals out4, but the two pairs differ from each other.
This matters to me because I want to export a classifier that includes the scaling stage.

Best Regards,

Bernhard

Posted: 23 September 2010 06:59 AM   [ # 1 ]
Administrator

Dear Bernhard,

In out1 and out2, the data set a is first projected by the scaling pipeline s, creating a temporary data set to which w is then applied. In out3 and out4, the two-stage pipeline is created first and then applied to the data set in one step. The concatenation with brackets in out4 is deprecated; use the functionally identical concatenation by multiplication instead (as in out3).
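
To make the two groupings concrete, here is a minimal plain-matrix sketch. It does not call any PRSD Studio function. It assumes the affine stage maps row-vector samples as x*M + b, and it uses a hypothetical stand-in (sum of squares) in place of a real second stage. With well-defined stages, applying them one after the other and applying the combined map perform the same arithmetic:

```matlab
% Plain-matrix sketch; no PRSD Studio functions are called here.
X = [0.1591 0.4966; 0.3710 0.0149; 0.9005 0.4225];  % 3 samples, 2 features
M = [2.3 4; -7 5];                 % projection matrix
b = [0 0];                         % explicit zero offset

% Assumption: the affine stage maps row-vector samples as x*M + b.
stage1 = @(x) x*M + repmat(b, size(x,1), 1);
% Hypothetical stand-in for a second stage (this is not a Parzen model).
stage2 = @(y) sum(y.^2, 2);

% Grouping as in out1/out2: materialize the intermediate result first.
tmp = stage1(X);
outSeq = stage2(tmp);

% Grouping as in out3: build the combined two-stage map, then apply it once.
combined = @(x) stage2(stage1(x));
outComb = combined(X);

% Largest absolute difference between the two groupings.
disp(max(abs(outSeq - outComb)));
```

If the two groupings disagree in practice, that points to one stage being defined differently in the two code paths rather than to the grouping itself.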

Now to the differences. Redoing your example, I found that the cause is the empty offset you are passing to sdp_affine (converting an empty offset to a zero vector is not documented behaviour).

>> rand('state',42); a=sddata(rand(3,2))
3 by 2 sddata, class: 'unknown'
>> +a

ans =

    0.1591    0.4966
    0.3710    0.0149
    0.9005    0.4225

Let's define two sdp_affine pipelines, one with the offset specified and one with it left empty.

>> p1=sdp_affine([2.3 4; -7 5],[0 0],sdlab('out'))
Affine pipeline         2x2   (sdp_affine)
>> p2=sdp_affine([2.3 4; -7 5],[],sdlab('out'))
Affine pipeline         2x2   (sdp_affine)

>> p=a*sdparzen % just to have some 2nd stage
Parzen pipeline         2x1  one class, 3 prototypes (sdp_parzen)

Applying scaling with offset set to zero:

>> out1=a*p1*p
3 by 1 sddata, class: 'unknown'
>> out2=(a*p1)*p
3 by 1 sddata, class: 'unknown'
>> out3=a*(p1*p)
3 by 1 sddata, class: 'unknown'
>> out4=a*[p1 p]
3 by 1 sddata, class: 'unknown'

>> [+out1 +out2 +out3 +out4]

ans =

    0.0006    0.0006    0.0006    0.0006
    0.2175    0.2175    0.2175    0.2175
    0.0004    0.0004    0.0004    0.0004

>> [+out1 - +out3]

ans =

     0
     0
     0

Applying the scaling with empty offset:

>> out1=a*p2*p
3 by 1 sddata, class: 'unknown'
>> out2=(a*p2)*p
3 by 1 sddata, class: 'unknown'
>> out3=a*(p2*p)
3 by 1 sddata, class: 'unknown'
>> out4=a*[p2 p]
3 by 1 sddata, class: 'unknown'
>> [+out1 +out2 +out3 +out4]

ans =

    0.0006    0.0006         0         0
    0.2175    0.2175         0         0
    0.0004    0.0004         0         0

We will add the conversion of empty offset to zero in the next release.
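
Until that release, one possible workaround is to pass an explicit zero offset, as p1 does in the example above. The sketch below uses only calls already shown in this thread (sdp_affine, sdlab). It assumes the offset needs one entry per output dimension and that the columns of the matrix correspond to outputs, which matches the 2x2 example but has not been checked for other shapes:

```matlab
% Workaround sketch: pass an explicit zero offset instead of [].
% Requires PRSD Studio; uses only calls that appear in this thread.
M = [2.3 4; -7 5];                        % projection matrix from the example
offset = zeros(1, size(M, 2));            % assumption: one offset entry per output dimension
s = sdp_affine(M, offset, sdlab('out'));  % same form as p1 in the example
```

A pipeline built this way can then be concatenated by multiplication, as in out3.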

For some commonly used scaling approaches, see sdscale (http://perclass.com/doc/ref/sdscale.html).

Kind Regards,

Pavel

Posted: 23 September 2010 09:31 AM   [ # 2 ]
Novice

Thanks a lot for the quick answer!

I thought that using an empty offset would skip the offset calculation. I made this assumption because it worked in an older version of PRSD Studio, and in my current implementation the error was barely noticeable in the results. After changing the structure of the classification stages I saw some strange misbehaviour, so I looked deeper into the data and isolated the problem as described in my post.

Now everything works fine :-).

Best Regards,

Bernhard
