04.05.2010  pavel

Live feature distributions in scatter plots

The 2.2.1 release brings a new interactive tool to sdscatter: the feature distribution plot. It shows a per-class histogram of the feature currently selected on the horizontal axis and of the feature selected on the vertical axis of the sdscatter figure. This gives a better understanding of the true nature of class overlap, especially in large data sets where a traditional scatter plot is very cluttered.
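
As a rough illustration of what a per-class feature histogram computes, here is a minimal sketch in base MATLAB (it needs histcounts, available from R2014b). It does not use sdscatter or any other toolbox function and is not how the tool is implemented. The synthetic data, the class names and the choice of 30 bins are assumptions made only for this example.

```matlab
% Synthetic stand-in data (assumption): 500 samples, 2 features, 2 classes.
rng(0);                                    % reproducible random numbers
X = [randn(200,2); randn(300,2) + 1.5];    % class 1 rows first, then class 2
labels = [ones(200,1); 2*ones(300,1)];     % class index of each sample
names = {'class A', 'class B'};            % hypothetical class names

feat = 1;                                  % feature on the horizontal axis

% Use the same bin edges for every class so the histograms are comparable.
edges = linspace(min(X(:,feat)), max(X(:,feat)), 31);   % 30 bins
centers = (edges(1:end-1) + edges(2:end)) / 2;

counts = zeros(numel(names), numel(centers));
figure; hold on;
for c = 1:numel(names)
    % Count the samples of class c that fall into each bin.
    counts(c,:) = histcounts(X(labels == c, feat), edges);
    plot(centers, counts(c,:));
end
hold off;
legend(names);
xlabel(sprintf('feature %d', feat));
ylabel('samples per bin');
```

Where the two curves overlap, the classes cannot be separated by this feature alone, even if the corresponding region of a dense scatter plot is too cluttered to judge by eye.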

The feature plots are updated live with scatter operations (showing class subsets, hiding classes, painting labels).


Using the live feature distributions is simple. Just visualize your data set with the sdscatter command:
>> a
'medical D/ND' 6400 by 11 sddata, 3 classes: 'disease'(1495) 'no-disease'(4267) 'noise'(638) 
>> sdscatter(a)

Now, select "Show feature distributions" from the Scatter menu or press 'd' (as in distributions). The scatter figure will be extended with a horizontal and a vertical feature distribution plot.

You may now use all the functionality of the scatter plot, such as changing features with the cursor keys, viewing different sets of labels and constraining the view to sample subsets. The feature plots will always show the up-to-date distributions of the classes present in the scatter plot.
