Interpreting figures
Posted: 11 April 2010 01:08 AM

Hi,
I will get straight to the point. I am training a mixture of Gaussians to estimate a person’s age from their speech. I have 168 objects with 13 features each, in 4 classes. I trained the Gaussians using gaussm and mogc, and plotted the result with plotm. I know that gaussm trains one Gaussian per class of data object while mogc computes one overall Gaussian density.
I provided prior probabilities when I created my dataset object and set the labtype to ‘soft’ in order to use the expectation maximisation algorithm. The following is my code:
A = dataset(inputs,labels');
A = setlabtype(A,'soft');
A = setprior(A, [0.16666666666667 0.22619047619048 0.33333333333333 0.27380952380952]);
W1 = mogc(A,4,0,0.8);
figure(2);
scatterd(A,[10,5]);
plotm(W1,6,10);

The resulting plot from plotm is attached.
Can anyone please explain to me how to interpret the plot?
What do “feature 1” and “feature 2” mean?

Kind regards,
Faiyo.

Image Attachments
mog.JPG
Posted: 14 April 2010 08:57 AM   [ # 1 ]

Hi Faiyo,

According to the scatterd help, the second parameter is the number of dimensions of the plot: a scalar that may only be 1, 2, or 3. It sets the dimensionality of the scatter plot, not which features of your data are shown.

I don’t know what happens when you pass a vector as DIM, as in scatterd(A,[10,5]). When I try this on a dataset with more features, I get an error.
I can only use scatterd without the additional parameter, in which case it plots the first two features.
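If the intent of scatterd(A,[10,5]) was to look at features 10 and 5, one possible approach is to select those two columns first and then call scatterd on the reduced dataset. This is only a sketch: it assumes A is the dataset built in the question and relies on PRTools dataset subscripting; the name B is introduced here for illustration.

```matlab
% Sketch, assuming A is the 13-feature PRTools dataset from the question.
% B is a hypothetical name for the two-feature subset.
B = A(:,[10 5]);   % keep all objects, only features 10 and 5
figure(1);
scatterd(B);       % B has just two features, so these are the ones plotted
```

In this plot the axes refer to the features of B, so its first axis corresponds to original feature 10 and its second to original feature 5.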

plotm draws a grid over the 2D scatter plot and renders the output of mogc (the conditional density) on it. Looking at the code of plotm, it takes the grid points constructed from the 2D scatter axes, pads them with zeros to form 13D data, and then executes your mixture mapping on the result.
This means that the visualization shows the mixture density in a 2D plane through the origin, with features 3 to 13 fixed at zero. That works well if your data lies around zero in those features but may show nothing if it does not. Note also that plotm assumes your scatter plot renders features 1 and 2. If you draw the scatter plot for a different pair of features, the plotm output will not be valid.
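One way to avoid both the zero padding and the feature 1 and 2 assumption is to train the mixture on the same two features that are plotted, so that the mapping and the scatter plot share a 2D space. The sketch below assumes A is the dataset from the question and reuses the mogc and plotm arguments from the original post; B and W2 are names introduced here. Note that this shows a mixture fitted to two features only, not a slice through the 13D model W1.

```matlab
% Sketch, assuming A is the 13-feature PRTools dataset from the question.
% B and W2 are hypothetical names, not part of the original post.
B  = A(:,[10 5]);        % two-feature subset (original features 10 and 5)
W2 = mogc(B,4,0,0.8);    % same mogc settings as in the question, now in 2D

figure(2);
scatterd(B);             % scatter plot of the two selected features
plotm(W2,6,10);          % mapping and plot axes now refer to the same features
```

Because W2 expects exactly two inputs, plotm has nothing to pad, and the contours can be read directly against the plotted points.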

If your goal is to understand what is happening in your data, you may consider the interactive scatter plot sdscatter from PRSD Studio. You can flip through the feature combinations with the cursor keys and get a feeling for the complexity and overlap in your data. A second method I use is to visualize the distribution (histogram) of each feature in your data set using sdfeatplot. Here too, you can switch between features with the cursor keys.

Hope it helps,

Pavel
