datalength
Posted: 22 April 2011 08:57 AM
Master
Total Posts:  69
Joined  2010-04-27

Dear PRSD team,

When evaluating the number of samples in a dataset, I always type length(data) or numel(data) before realizing that I should instead type length(+data) or size(+data,1).

In my opinion, since we can address the dataset samples using data(a:b) (i.e., data can be treated as an array of 1-sample datasets), it would make sense for length(data) to return the number of samples contained in the dataset. Does that make sense?

I’m still using the December 2010 version, so please disregard my question if this functionality has already been implemented since then.

Best regards,

JM

Posted: 02 May 2011 09:21 AM   [ # 1 ]
Administrator
Total Posts:  371
Joined  2008-04-26

Dear Jean-Michel,

Thanks for your valuable feedback! We had some discussions on this one. So far, we have been using size(data,1) to request the number of samples. The sddata object acts as a multi-dimensional array (samples vs features vs classes). The Matlab convention for the length of a multi-dimensional array is to return the size of its largest dimension, which would be confusing for sddata objects. That’s why, so far, length was not defined.
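To illustrate the convention mentioned above, here is a minimal sketch using a plain numeric array rather than an sddata object. The array A and its 5 x 20 x 3 shape are made up for the example. It only shows how the built-in Matlab functions behave on ordinary arrays, not how the toolbox implements anything.

```matlab
% Plain numeric array standing in for a data set (not an sddata object):
% 5 samples x 20 features x 3 classes. The sizes are arbitrary.
A = zeros(5, 20, 3);

size(A, 1)   % 5:   size of the first dimension, i.e. samples in this layout
length(A)    % 20:  size of the largest dimension, here the feature count
numel(A)     % 300: total number of elements (5*20*3)
```

With more features than samples, length(A) reports the feature count, which is why the built-in convention is a poor fit for asking "how many samples?".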

However, you are right to point out the existing data(a:b) construct, which acts one-dimensionally. We do find it very convenient for common operations on the data.

We have decided that this convenience when working with pattern recognition problems is more important than clinging to the Matlab convention in this case.
We will define length(data) to always return the number of samples. The next release will contain this update.

With Kind Regards,

Pavel
