- 1.1. This manual
- 1.2. Introduction to perClass
- 1.2.1. Versions
- 1.2.2. System requirements
- 1.2.3. Useful general commands
- 1.2.3.1. Displaying perClass version and license information
- 1.2.3.2. Demo examples
- 1.2.3.3. Provide direct feedback to PR Sys Design
- 1.2.3.4. Control messages displayed by perClass
- 1.3. Release notes
1.1. This manual
This manual assumes basic knowledge of pattern recognition and the Matlab environment. To embed trained classifiers into custom applications, basic familiarity with the C language is also assumed.
The manual is structured in four parts:
- User's guide - explains software functionality
- Reference manuals for the perClass Toolbox and for the perClass runtime library describe the programming interface
- Knowledge base - collects a number of step-by-step usage examples and "howtos"
- Glossary - explains basic pattern recognition terminology
1.2. Introduction to perClass
perClass is a software package for quick development of custom machine learning solutions. It is composed of two parts: perClass Toolbox for quick design of classifiers in Matlab, and perClass Runtime for classifier deployment in production.

perClass provides tools for:
- Construction of data sets
- Handling of multiple sets of labels and arbitrary meta-data
- Interactive visualization of data and meta-data
- Training statistical detectors and discriminants
- Quick evaluation of classifiers
- Optimizing classifier decisions according to performance requirements using two-class and multi-class ROC analysis
- Building hierarchies of classifiers
- Deploying trained classifiers in custom applications outside Matlab
1.2.1. Versions
perClass comes in the following versions for commercial use:
perClass Toolbox: development of machine learning algorithms. The permanent license is bound to a hardware dongle.
perClass Pro: the complete solution for design of algorithms with perClass Toolbox and embedding them in custom applications with perClass Runtime using a hardware dongle.
perClass Enterprise: the complete solution for design and deployment in products. The Enterprise version offers an OEM deployment license without dongles.
These versions are available for academic research and teaching:
Lite: free limited version for non-commercial use, intended for people who are learning about pattern recognition. It contains only perClass Toolbox and is limited to data sets with a maximum of 300 samples and three classes.
perClass Toolbox Academic: perClass Matlab Toolbox discounted for use by university students and researchers for non-commercial projects. The license is permanent and bound to a hardware dongle, which allows researchers to move between different machines.
perClass Pro Academic: perClass Pro discounted for use by university students and researchers for non-commercial projects only. The license is permanent, bound to a hardware dongle, and includes both the perClass Toolbox and the perClass runtime library for execution of trained classifiers outside Matlab.
For Academic and Commercial versions, group licensing is also available using floating licenses provided by a license server.
1.2.2. System requirements
perClass is supported on the following platforms:
- MS Windows 32-bit
- MS Windows 64-bit
- Linux 32-bit (x86)
- Linux 64-bit (x86)
- Apple Mac OS X 32-bit (x86)
- Apple Mac OS X 64-bit (x86)
perClass requires Matlab 7.5 or later.
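The Matlab requirement can be checked before installation with a few lines of base Matlab. This is a minimal sketch: it calls no perClass function, the message texts are illustrative, and matching the printed platform identifier against the list above is left to the reader.

```matlab
% Check the running Matlab release against the minimum stated above (7.5).
if verLessThan('matlab', '7.5')
    error('Matlab 7.5 or later is required; this is version %s.', version);
end
% Print the platform identifier reported by Matlab (for example win64 or
% glnxa64) so it can be compared with the supported platforms.
fprintf('Matlab %s on %s\n', version, computer('arch'));
```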
1.2.3. Useful general commands
1.2.3.1. Displaying perClass version and license information
The perClass version may be displayed using sdversion. It consists of a
numerical part (e.g. 2.0.9) and a build date (08-Mar-2010).
sdversion also provides several license-related details such as license
type (Commercial, Academic or Lite), licensee name and the license expiration
date.
>> sdversion
perClass Toolbox 3.0 (01-May-2011), Copyright (C) 2007-2011, PR Sys Design,
All rights reserved
Commercial license for perClass. The license will expire on 26-apr-2011.
1.2.3.2. Demo examples
sddemo lists several basic examples to get started:
>> sddemo
run perclass_demo(num) where num is the index of the desired example
1 : Working with data sets
2 : Training a classifier and visualizing decisions
3 : Tuning a classifier using ROC analysis
4 : Multi-class ROC analysis
5 : Building detectors
6 : Building a detector-classifier cascade
1.2.3.3. Provide direct feedback to PR Sys Design
The sdfeedback command allows users to submit feedback, such as error
messages, to PR Sys Design directly from within Matlab. Running
sdfeedback without arguments opens an edit dialog where the user may
paste or type the desired message. Alternatively, the message may be passed
to sdfeedback as a string.
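As a sketch of the string form, the error text of a failing call can be captured with try/catch and handed to sdfeedback. The variable data and the sdmixture call are placeholders for whatever perClass command fails. How sdfeedback handles the string beyond submitting it (for instance, whether it asks for confirmation) is not described in this section.

```matlab
% Assumes the perClass Toolbox is on the Matlab path and that 'data'
% is an existing data set; both are placeholders for this sketch.
try
    p = sdmixture(data);              % stands in for any perClass call
catch err
    % Submit the captured error text to PR Sys Design as a string
    sdfeedback(['sdmixture failed: ' err.message]);
end
```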
1.2.3.4. Control messages displayed by perClass
The sddisplay command provides global verbosity control in perClass.
Running sddisplay without arguments prints the current display state
(on/off). To switch off messages printed by perClass, use:
>> sddisplay off
The default sddisplay state is on. When the perclass_mex library is reloaded
into memory, this default state is restored.
Alternatively, you may use the 'nodisplay' option in the functions that
support it: sdrelab, sdroc, sddetector and sdcrossval.
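A batch script might silence perClass only around a noisy step. In the sketch below, the form sddisplay on is an assumption inferred from the on/off states described above; this section shows only the off form explicitly.

```matlab
% Assumes the perClass Toolbox is on the Matlab path.
sddisplay        % print the current display state (on/off)
sddisplay off    % switch off messages printed by perClass
% ... run the perClass commands that should stay silent here ...
sddisplay on     % assumption: 'on' mirrors the documented 'off' form
```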
1.3. Release notes
Version 3.4 (9-Oct-2012)
- local image feature extraction with custom callbacks in sdextract (out=sdextract(data,'block',16,'feat',@my_extractor))
  - user-defined feature extractors may use additional parameters: out=sdextract(data,'block',16,'feat',@my_extractor,{'levels',8,'range',[0 256]})
- sdimage improvements
  - creating an RGB label image by painting labels/decisions into image regions: LI=sdimage(im,'labim')
  - label images may be blended with another RGB image with LI=sdimage(im,'labim','blend',origim)
  - shrink an image data set to a grid by a command in the Image menu or using sdimage(im,'grid'). This allows us to easily inspect feature images.
- new sdsegment command to define connected components based on labels of an image data set
- support for regularization in sdgauss, sdlinear and sdquadratic. This is an alternative to dimensionality reduction. Regularization allows for training good models in problems with a limited amount of training data.
- sdcrossval support for precomputed proximity matrices. With the 'prox' option, care is taken that both training and test sets are represented only by training prototypes.
- sdpca can optimize dimensionality based on the error of a given classifier. Example: p=sdpca(data,sdlinear) returns PCA minimizing the sdlinear error
- sdexport got a 'no header' option and support for export of raw data matrices
- added support for direct assignments into feature and data properties: a.featlab(3)='moment'
- added sdlab support for concatenation with a string [lab; 'aaa']
- fixed handling of features with constant values in sdscatter plots
- fixed a problem under Windows where the scatter plot became unresponsive when changing dimensions
- fix of sdconfmat display output for normalized matrices
Version 3.3.1 (7-Aug-2012)
- sdlab object supports == and ~= operators for quick comparisons (example: get the number of errors with sum(a.lab~=dec))
- sdscatter toolbar legend button shows the current data set legend
- fix for a possible crash when creating nested stacked combiners
- fix for sdrandforest repeatability problem (example: rand('state',1); p=sdrandforest(data); gives identical results)
- fix for sdrelab display with multiple rounds of renaming
Version 3.3 (21-May-2012)
- sdversion now displays the installation directory
- improvements of sdsvc
  - reporting a clear error message when the libsvm optimizer does not find any solution
  - improving grid-search performance
  - removing output normalization for multi-class
- sdscatter improvements:
  - sample inspector is positioned next to the scatter figure
  - user may pass a parameter to a callback function which is executed when the user clicks on a data sample
- sdtree and sdrandforest show the number of thresholds and base classifiers in the pipeline display string
- fix in sdlda improving performance for badly-conditioned data sets, e.g. with binary values
- sdrelab now shows an informative error message if the relabeling map is not composed of input/output pairs.
- fixes of sddata class subset allowing logical or cell array: sub=a(:,:,{'banana'}) or sub=a(:,:,a.lab.list~='stone')
- sddecide fix to properly handle the sddecide(p*r) call
- sdfeatplot adds 'absolute' option to visualize absolute frequencies instead of default relative ones
- sdrun MEX library can now return decision names with the L=sdrun(pind,'list') call.
Version 3.2 (14-Mar-2012)
- improvements of interactive image view sdimage
  - interactive crop function
  - definition of connected objects. Small objects are by default tagged for easy removal.
  - custom level of label transparency may be set using the 'alpha' option (sdimage(im,'alpha',0.4))
- hand-drawn polygon classifier improvements in sdscatter
  - polygon classifier directly returns decisions
  - inside/outside decisions may be changed from sdscatter
  - confusion matrix may be shown for the current subset of data
- feature handling improvements
  - sdfeatsel can find/remove features with zero variance
    - example: sdfeatsel(data,'var>0') returns features with non-zero variance
  - features may be selected in sddata and sdfeatsel using a cell string of names. Groups of features may be easily selected by substring with support for regular expressions. For example:
    - data(:,'/Moment') selects all features containing substring 'Moment'
    - sdfeatsel(data,{'Skew','~/Energy'}) selects 'Skew' and all features that do not contain 'Energy'
  - features can be removed from sddata given a logical or cell array with names (data(:,data.featlab=='/Moment')=[])
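The name-matching convention in the feature selection items above (a plain string is an exact name, a leading '/' marks a regular expression, a leading '~/' negates it) can be mimicked in base Matlab. The sketch below only illustrates the convention and is not the perClass implementation. The feature names are made up, and combining the patterns as a union is an assumption based on the 'Skew' and '~/Energy' example.

```matlab
% Illustration only: feature names and matching logic are hypothetical.
names    = {'Moment 1', 'Moment 2', 'Skew', 'Energy low', 'Energy high'};
patterns = {'Skew', '~/Energy'};

selected = false(size(names));
for i = 1:numel(patterns)
    pat = patterns{i};
    if strncmp(pat, '~/', 2)
        % '~/expr': names that do NOT match the regular expression
        hit = cellfun(@isempty, regexp(names, pat(3:end), 'once'));
    elseif strncmp(pat, '/', 1)
        % '/expr': names that match the regular expression
        hit = ~cellfun(@isempty, regexp(names, pat(2:end), 'once'));
    else
        % plain string: exact feature name
        hit = strcmp(names, pat);
    end
    selected = selected | hit;   % union over all patterns (assumption)
end
disp(names(selected))
```

With these made-up names, the selection keeps 'Moment 1', 'Moment 2' and 'Skew' and drops both 'Energy' features.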
Version 3.1.2 (22-Dec-2011)
- sdexport now supports export to C45 data format
- sdscatter visualization of sdtree decisions now allows interactively changing the number of thresholds with a slider
- fixing the problem with sdrun MEX which was returning an int, not a double pipeline index (issue pC-1267)
- fixing the issue with setting the number of decision tree nodes
Version 3.1.1 (24-Nov-2011)
- sdscatter now shows the value of the soft output under the mouse cursor in the figure title
- polygon drawn in an sdscatter figure may also be saved by pressing the 's' key
- fix of sdscatter problem when showing soft outputs
- sddecide now gives an informative error message when given an empty ROC object (result of constraining when no operating point is available)
- fix of erroneous sdp_combine product combiner output
Version 3.1 (14-Nov-2011)
- fast and highly scalable decision tree (sdtree) and random forest (sdrandforest) classifiers
- polygon classifier may be drawn interactively in sdscatter figures
- classifier acceleration with 2D lookup table (sdlut) classifier
- sdfeatsel now supports feature selection based on a trained decision tree
- sdimport supports reading of sdlab labels separately
- randsubset method supports bootstrap sampling
- sdexport displays the minimum version of perClass runtime required to execute the exported pipeline. This means that deployed runtimes may be updated only when needed.
- sdfeatplot adds interactive selection of feature threshold.
- sdcascade now supports classifier cascades as inputs
- Improvements of sdscatter displaying classifier decisions:
  - Decision under cursor is now shown in the Figure title
  - The color of the decision under cursor may be changed using the 'c' key
  - Pipeline drawing decisions may be saved back into the Matlab workspace with the 's' key. Pipeline contains changed decision colors.
- sdimage improvements:
  - show decisions of an arbitrary pipeline in the Matlab workspace
  - execute k-means clustering on image data using the 'Cluster with k-means' menu command ('c' keystroke)
  - switching to a different class by pressing a digit
- fix for the sdscatter with backdrop problem with flipped fonts (bug in Windows ATI drivers). Added Shift-A keystroke to switch off the alpha level (see http://perclass.com/index.php/forums/viewthread/260)
- fix for crash due to a memory leak occurring when computing local histogram features for specific data ranges
- fix for internal use of tic/toc functions inside sdexe and classifier execution.
- fix for possible crash when using rejection-based operating point
- fix for the error raised when switching between multiple scatter/ROC plots
Version 3.0.0 (6-Jun-2011)
PRSD Studio is renamed into perClass (How to transition to 3.0)
new functionality for handling image data
- support for storing image data in data sets through the sdimage command. Arbitrarily-shaped regions from multiple images may be stored in an sddata object. Single or multi-band images are supported.
- visualization of arbitrarily-shaped pixel subsets
- texture and appearance features may be computed in local image regions using the new sdextract function
  - support for user-defined grid (region size and step)
  - local histograms, features of local histograms, co-occurrence matrices
  - high-speed feature extraction (extracting 86000 co-occurrence matrices from a 1024x1300 image takes 230 ms on a laptop)
new functionality for execution runtime
- significantly faster execution runtime
- labels and decisions are represented by integers, not doubles
- C API for precision timers (sd_Tic and sd_Toc) available to custom applications outside Matlab on all platforms
- new deployment tool sdrun for easy execution of trained classifiers using Matlab compiler. sdrun is implemented as a single statically-linked mex. To bring perClass classifier execution into a custom Matlab application you only need to copy one mex binary and include the pipeline and license files.
- new ASCII-based pipeline file format allows embedding of classifiers directly in source code (see the ex_buffer.c SDK example)
new toolbox functionality
- new sdcluster function for direct clustering of a data set with a user-defined model. sdcluster returns a data set with cluster labels. Clustering is performed per class and original labels are preserved. See the new chapter on clustering.
- .* operator. Applying a pipeline returning decisions to a data set with the .* operator returns a data set with decisions set as new labels. This is useful to get clustering results or image labels in one step. Example: b=a.*p is equivalent to dec=a*p; b=a; b.lab=dec;
- sdscatter improvements
  - 'show all' menu command for each property
  - save filter to workspace. Filter is stored as a structure which may be easily edited by hand and loaded back into sdscatter
- sdfeatplot improvements
  - left/right cursor keys move to first/last feature, respectively
  - 's' keystroke switches to stem-plot highlighting individual histogram bins
  - 'u' keystroke uses only unique values, instead of default histogram binning
  - 'a' keystroke sets automatic binning
  - 'x' keystroke allows specifying the name of a variable defining x-axis bins (e.g. logarithmic)
  - 'lab' option specifies the label set used (default: 'lab', example: sdfeatplot(data,'lab','tissue'))
  - 'bins' option allows bin specification from command-line
- sdrelab includes new 'all' option that sets all samples to a specific class. This works both for labels and data sets.
- sdrelab may rename pipeline decisions by providing a new list. Example: pd=sddetector(a,'target',sdgauss); pd2=sdrelab(pd,sdlist('accept','reject'));
- generate more data from a Gaussian model using sdgenerate. Example: p=sdmixture(data); b=sdgenerate(p,1000);
- sdconfmat header lines may be suppressed with the 'no header' option. This is useful when concatenating multiple confusion matrices, e.g. for each patient, into one larger table.
- sdknn accepts k also directly after the data set as the second argument. Example: p=sdknn(a,10) instead of p=sdknn(a,'k',10)
- sdsvc accepts the type (linear, RBF or polynomial) as a direct parameter. Example: p=sdsvc(a,'linear')
- operating point marker and color in sddrawroc may be changed by the second parameter
new core-level functionality
- length of sddata returns number of samples (see discussion at: http://prsdstudio.com/index.php/forums/viewthread/301)
- subset and sdrelab preserve the user-specified order of classes when processing a single set of labels.
- sorting label list using the sdlist sort method or the sdlab sortlist method. Only order of classes is changed, not the sample labeling.
- sddata find supports regular expressions to return sample indices. Example: ind=find(a,'/substring') returns indices of all samples with class name containing substring.
- direct assignments into sddata property. Example: a(1:10).lab='orange' or a.lab(1:10)='orange'
- label assignment supports also class indices. Example: lab(1:10)=2 assigns first ten samples to second class in the lab.list.
fixes:
- sdp_affine fix for scaling, labels optional
- sdlab does not include extra space in sdlab('Feature',1:10) constructor
- sdpca allows 1D output
- sdscatter window will not jump out of the screen when switching on the distribution plots
- sddata/randsubset and sddata/subset return subset indices in column order
Version 2.4.0 (7-Feb-2011)
- new execution utilities (commercial and academic full versions only)
  - execution of classifiers from Microsoft Excel worksheets (Windows only)
  - command-line utility sdrun for direct execution of classifiers outside Matlab (all platforms)
  - GUI execution demo (Windows only)
  - LabView interface example
- toolbox improvements
  - improvements in horizontal label concatenation (omitting internal spaces, and scalability to very large data sets: one million samples under half a second)
  - fixing sdmixture problem where training set priors were not used by default
  - sdtree classifier adds 'levels' option that may significantly speed up training
  - PRTools AdaBoost classifiers with decision tree or stump base learners may be converted into pipelines using the sdconvert command.
Version 2.3.0 (13-Dec-2010)
- sdcrossval now provides per-fold measurements of execution speed
- fixing a bug in sdkmeans that could cause a crash for multi-class data sets with high overlap
- fixing the problem in sdscatter where a regular expression could not be applied to a subset of samples
- fix for the call subset(data,'lab',{}) which was not throwing an error
- randsubset now works for PRTools datasets
Version 2.2.5 (24-Nov-2010)
- regular expressions allow simple definition of data subsets and relabeling with sdrelab. Strings starting with the slash (/) character are interpreted as regular expressions. For example: subset(data,'/good') returns all classes containing the word 'good'.
- sdscatter enhancements:
  - undo the last label painting operation (u key or Undo painting command in the scatter right-click menu)
  - cycle through all classes showing one at a time (< and > keys)
  - select class subset by regular expression (/ key)
  - class to top (t key)
- new dissimilarity measures in sdprox (Spectral Angle Mapper, Kolmogorov distance, Match distance)
- sdmindist classifier directly applicable to dissimilarity representations
- sdfeatplot enhancements:
  - allows selection of a label set used for plotting the per-group distributions. sdfeatplot(data,'lab','patient') will show a per-patient histogram for each feature.
  - change of default behaviour: sdfeatplot now uses all data to construct histograms; use the 'maxsamples' option to limit the sample count used for large data sets.
  - fixing the problem in sdfeatplot related to constant-value features; sdfeatplot now also shows the constant feature value if present.
- sddrawroc supports interactive zooming
- fix in sdscatter sample inspector showing correct labels when focusing on a sample subset
Version 2.2.4 (5-Oct-2010)
- support for Mac OS X 64-bit platform
- new sdimport command for loading sddata objects from text files. User may specify what columns correspond to data matrix, labels and additional sample properties.
- sdexport command can store sddata in a comma-separated file
- sdsvc support for linear and polynomial kernels including automatic grid search
- support for incremental Support Vector Data Description (incsvdd) from DD_Tools.
- adding support for creating sdlab labels using a vector and class names
- sdfeatplot allows user definition of line styles used for plotting class-feature distributions
- sdp_affine turns empty offsets into zero vectors
- sddetector now supports test sets also in one-class mode ('reject' and 'test' options used together)
- fix of a bug related to scaling proximity data with sdscale
- fix of a bug in sdprox where prototypes were unnecessarily sorted
Version 2.2.3 (29-July-2010)
- sdsvc allows identifying training samples that became support vectors (using the original property of the support vector set p{1}.proto)
- sddetector support for externally defined test set using the test option
- sdfeatsel floating search provides history of feature subsets selected by individual steps.
- sdfeatsel adds a test option which may be used to supply an external data set used for evaluation of the 1-NN error criterion
- sddecide allows construction of an operating point manually. Support for both weighting-based discriminants and thresholding-based detectors.
- sdsvc support for setting an external data set used for error estimation in parameter grid search with the test option
- sddata supports cell array properties
- sdscatter user callbacks are now accessible using the 'callback' option
- untrained classifier pipelines now return names using getname
Version 2.2.2 (22-June-2010)
- fast feature selection sdfeatsel scalable to large data sets (forward search with 1000 samples, 50 features under five seconds). Individual, random selection, forward, backward and floating searches are supported using 1-NN error on a validation set as a criterion. Feature subset size is selected automatically.
- sdscatter callback called when clicking on a data sample. This allows custom visualization such as loading an image corresponding to a sample from disk and showing it in a separate figure.
- support for untrained high-level operations on data (subset, randsubset, sdrelab, sdroc). This allows one to easily express complex sequences of training operations.
- extended sdscale supporting also robust domain scaling (robust in presence of outliers)
- cascades may now be trimmed to return output after a specific stage using sdconvert, e.g. pc2=sdconvert(pc,'until',2). This helps us to understand how the later stages of hierarchical classifiers improve performance.
- experimental support for Mac OS X 64-bit platform
- sdscatter fix for decision colormap when showing classifier decisions
Version 2.2.1 (3-May-2010)
- interactive visualization of feature distributions in sdscatter for both axes (use 'Show feature distribution' in 'Scatter' menu or press 'd'). This greatly simplifies understanding of overlap in very large data sets where the scatter plot is not too informative.
- sdkmeans classifier and clustering scalable to very large data sets (1 million samples, 10 clusters in 3.3 sec).
- sdkmeans provides a fast prototype selection method for k-NN classifiers. Classification performance is further improved by prototype pruning (similar effect to editing the training set).
- sdkcentres classifier and clustering
- randsubset allows limiting the maximum number of samples using the 'atmax' option. This is useful to limit sample size while tolerating that some classes have fewer samples.
- find and subset now allow that some of the class names do not exist and return what is present (and not empty [] as before)
Version 2.1.0 (21-Apr-2010)
- fixing a bug in sddecide related to adding an operating point in an ROC object
- fixing an error message in sdlab constructor
- adding RBF support vector machine training using the sdsvc command. sdsvc is based on libSVM and offers automatic grid search for sigma and C parameters and one-against-all multi-class support.
- adding a reject option to a trained discriminant using the sdreject function (also for multi-class classifiers; both outlier rejection and rejection close to the decision boundary)
- sdcrossval support for estimating ROC with variances using operating point averaging (cross-validate a pipeline returning soft outputs and provide fixed operating points using the 'ops' option)
- adding sdcrossval support for custom sdalg algorithms that are not convertible into a pipeline (algorithm needs to return the list of all possible decisions)
- sddrawroc now saves complete sdroc objects back in the workspace, not only operating points (by pressing 's' key)
- sddecide support for default op.point based on thresholding (e.g. for sdsvc on two-class problems)
- support for clustering using sdmixture with 'cluster' option
- sdscatter adding the "show only this class" command (press 'o' key)
- default mean-error performance measure in sdcrossval is no longer included if the user requests a specific set of measures
- sdneural may switch off the default use of validation for teaching purposes (to illustrate overfitting of the network). Use 'valfrac',[] to suppress the use of the validation set.
- fixing the problem with sdroc using 'confmat' and 'reject' options together
- fixing the bug in sdlab constructor for single label per class
- improving compatibility with PRTools (sdimage, sddetector, sdreject, sdcrossval, sdstackgen, sdscatter visualizing images using sample inspector)
Version 2.0.9 (8-Mar-2010)
- adding support for subset by logical array for sddata and sdlab objects (example: a( a.lab=='banana' ))
- sdtest raises a warning if some of the true classes are not matched to classifier decisions (all samples from these classes are considered misclassified)
- fixed sdscatter problem with the order of classes in "class on top" and "change markers"
- usability improvements in sdfeatplot (click to change figure title; legend properly displaying special characters)
- 'mean-error' performance measure may specify optional class priors used for weighting the class errors
- global display verbosity may be handled using the prsd_display command (use prsd_display off to switch off display output of PRSD Studio functions).
- 'nodisplay' options added in sdmixture, sdparzen, sdcrossval
- randsubset supports random selection of objects from some classes only (example: [tr,ts]=randsubset(a,[0.5 0]) returns 50% of the first class for the training)
- sdcrossval outputs string with the result summary, result struct and the evaluation object.
Version 2.0.8 (19-Feb-2010)
- fix in sdimage for multi-dimensional images (image cubes)
- pipelines now provide operating points via the p.ops field
- API interface simplification and cleanup
- low-level output of pipelines on matrices and using C API returns indices to decision list as decisions, not the internal codes
- sdlist and sdlab internal numerical representation is not exposed to the user anymore
- feature selection pipeline sdp_fsel now may get the feature labels directly from the data set: pf=sdp_fsel(data,[3 4])
- sddetector handles output polarity automatically (k-NN output is distance, mixture output is similarity)
- adding easy display of sdlab object details (class sizes, fractions) using the transpose operator (lab')
Version 2.0.5 (22-Dec-2009)
- classifier output visualization using sdscatter can now switch between different soft outputs interactively using cursor keys
- added constrain method for easy application of ROC performance constraints
- enhanced setcurop method to choose operating point minimizing or maximizing specific performance measure or setting op.point based on costs
- new performance measure nconfmat - the entry in normalized confusion matrix
- 'target' and 'non-target' options in sddetector and sdroc setting the desired target/non-target names
- setstate method in sdalg algorithm allows calling the algorithm function directly (instead of using the multiplication operator)
Version 2.0.4 (14-Dec-2009)
- sdrelab now allows adding a string prefix to all classes in all labels present using the 'add to all' option. This makes it easy to compare two data sets with multiple labelings (classes, patients, tissues).
- adding sdscale command for data scaling
Version 2.0.3 (9-Dec-2009)
- isclass method for quick check if certain classes are present (useful for custom algorithms)
- sdnorm function adding normalization step to a trained pipeline (this constructs a general discriminant)
- sdlab fix for incorrect class size when initialized with a list and indices
- adding initial version of auto-conversion for older-format @sdppl/sdppl and @sdops/sdops objects
- fix for the inMathOverflow warning/error in sdtree training
Version 2.0.2 (4-Dec-2009)
- new sdlab object simplifies handling of labels, decisions and indexed meta-data
- new sddata object brings easy handling of sample meta-data
- multiple sets of labels or meta-data in a dataset, unified access to sample properties
- simple queries using multiple criteria (give me all samples labeled as "Cancer" from patient 1,2 and 5 using subset(a,'class','Cancer','patient',[1 2 5]))
- access to classes is greatly simplified
- sdroc handles classifier output polarity automatically (sdexe stores the output type in the output_type data property)
- user may change class markers. Data set remembers class markers. Scatter markers are stored in the 'marker' property inside the class list.
- dissimilarity representation contains as feature properties all prototype sample properties
- labels and decisions may be easily concatenated. This allows us to add new labels with a break-down of errors (confusion-matrix entries) in one command.
- writing custom sdalg algorithms is significantly simplified
1.x Compatibility changes
- sdppl objects use new internal format.
- sderror replaced by sdtest
Version 1.3 (30-Nov-2009)
- fix in sdnmean classifier: now computing pooled diagonal covariance using class priors
- adding missing parse_measures.p file
- fixing p-code compatibility problem with Matlab 7.4
Version 1.2.5 (12-Oct-2009)
- fixes in findprop for numerical properties
- adding 'all' and 'nodisplay' options to sdrelab
Version 1.2.4 (13-Aug-2009)
- sdtree implements training of decision tree classifier scalable to large number of samples
- fix in prsd_feedback correcting the problem with PRTools not on Matlab path
Version 1.2.3 (15-Jul-2009)
- visualization: sdscatter provides more detailed information in sample inspector including all sample meta-data
- sdrelab: adding prefix or suffix to all class names.
- sdrelab: renaming a single class by relative index
- simpler installation: PRSD Studio Lite installation no longer requires software activation
- sdroc: support for reject option on classifiers with distance soft output (sdknn)
- selprop, findprop support for set of property values defined by cell array
Version 1.2.2 (16-Jun-2009)
- libPRSD: support for AdaBoost execution using decision tree as base classifiers
- visualization: sdscatter allows interactive change of classifier parameters using slider (k in k-NN, smoothing in Parzen, number of base classifiers in AdaBoost)
- visualization: sdimage may be connected to ROC plot and visualize decisions at different operating points in real-time
- sdneural provides target option that allows one to approximate trained classifiers
- sdroc: fraction of all objects may be rejected by specifying fraction after reject option
Version 1.2.1 (27-May-2009)
- sdnbayes implementing Naive Bayes classifier with automatic selection of number of histogram bins
- sdroc now supports cost-based selection of operating point for two-class scenario (in addition to the existing multi-class cost-based optimization)
- sddecide may be used in pipelines to define default operating point
- sdp_affine can construct simple feature scaling pipelines
Version 1.2 (19-May-2009)
- sdmixture supports automatic estimation of number of components
- sdneural implementing feed-forward neural network training
- sdcrossval now supports untrained pipelines
Version 1.1.6 (9-May-2009)
- sdparzen Parzen classifier implementing scalar and vector smoothing
- sdknn k-th nearest neighbor classifier with prototype selection and support for both detection and multi-class classification
Version 1.1.5 (1-May-2009)
- libPRSD now supports loading pipelines also from a buffer using prsd_LoadPipelineFromBuffer (pipelines may now be stored in application resources or sent over network).
- sdroc supports rejection both far away and close to the decision boundary using the reject option.
- sdscatter: the figure title may be selected interactively by clicking on the title area
- simplified selection of performance measures in sdroc
Version 1.1.4 (19-Mar-2009)
- adding support for group licenses via license server
- support for construction of arbitrary hierarchical classifiers using decision-level fusion and their execution through libPRSD
- sddetector brings one-command construction of detectors based on an arbitrary model (both in a one-class setting, specifying a threshold using the fraction of rejected samples, and in a two-class setup, using ROC analysis to fix the threshold minimizing mean error)
- sddrawroc allows saving the current operating point into any relevant object (sdroc, sdops, pipelines, sddecide mappings, custom sdalg algorithms)
- introducing sdmixture for training Gaussian mixture models (one- or multi-class, variable number of components per class, different stopping criteria (iterations or likelihood delta))
- sdrelab allows defining classes by the ~ (tilde) negation operator (e.g. turn everything that is not apple into "non-apple")
- sdscatter allows the user to flip through the order of classes (z-order) using the + and - keys
- a number of usability improvements in construction of pipelines and interaction with PRTools (sdroc and sdops objects can now be directly concatenated into pipelines; sdmap wraps pipelines for use in PRTools)
- many improvements in confusion matrix estimation: sdconfmat
- setprop now allows quickly setting a property to a constant value. This makes it very easy to tag a group of samples with a specific label.
- sdconfmat can now add new labels with all confusion matrix combinations as a property. This can be used to quickly visualize different types of error directly in the feature space
- sdconfmat cosmetic fix: string confusion matrix scales nicely with long class names
- new function selprop returning a subset of a dataset with given property values
- significant improvements in scalability of sdroc to large datasets in speed and memory usage. Practical even for datasets with 100 000 samples and tens of thousands of operating points.
- improved ROC optimizer brings better quality sets of operating points
- sdconfmat can now estimate confusion matrices for sets of operating points from the soft outputs
- sdexe can return numerical decision codes ('code' option). This is useful for low-level work with classifier outputs.
- pipelines can return numerical decision codes using the .* operator (e.g. dec=data.*p)
- sdeaclust clustering can now be executed on new data. Scalable to very large datasets (images).
Version 1.1.3 (26-Jan-2009)
- fix in sdscatter allowing labels to be painted with the legend switched on
- fix in sdscatter retaining the type of numerical properties in a dataset saved back to workspace
- sdscatter can now switch visibility of classes or groups on/off. That's helpful when inspecting large datasets with many overlapping sample groups (patients). See context menu in sdscatter Figure windows. Painting now applies only to visible samples.
- initial support for hierarchical systems composed of multiple classifiers returning decisions (sdp_cascade)
- support for meta-classes and different features at each classifier node.
- ROC analysis for hierarchical systems
- sdconfmat added
- the order of labels and decisions (lablists) can be fixed by the user
- sdconfmat can correctly handle situations where only some classes/decisions are present in the test set (given the full lablists)
- sdconfmat can return the string with a table
- support for normalization of confusion matrices
- lablists may be supplied as cell arrays of strings or string arrays
- support for weight-based operating points with reject option (rejection both close to the boundary and distance-based)
- sdroc automatically shows rejected fraction and all per-class TPrs
- support for similarity-based and distance-based classifier outputs
- adding reject fraction estimate to sdroc
- support for leave-one-out over a property (object, person, patient...)
- fix for the bug where sdscatter raised an error when the mouse pointer was moved too quickly over the new window
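The item on leave-one-out over a property (object, person, patient...) describes an evaluation scheme in which each fold holds out all samples sharing one property value. The Python sketch below shows only the splitting idea under that reading; it is not perClass code, and the function name leave_one_group_out and the patient labels are hypothetical.

```python
def leave_one_group_out(groups):
    """Yield (held_out_value, train_indices, test_indices) per distinct value.

    groups holds one property value (e.g. a patient id) per sample.
    Hypothetical helper for illustration only (not part of perClass).
    """
    # Distinct values in order of first appearance.
    for value in dict.fromkeys(groups):
        test = [i for i, g in enumerate(groups) if g == value]
        train = [i for i, g in enumerate(groups) if g != value]
        yield value, train, test


patients = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(leave_one_group_out(patients))
for value, train, test in folds:
    print(value, train, test)
```

Because all samples of a patient land in the same fold, no patient contributes to both the training and the test part of a split.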
Version 1.1.2 (18-Nov-2008)
- adding fast approximated k-NN (see example in our blog)
- adding a k-centres classifier capable of both one-class classification and multi-class discrimination
- feature selection algorithm sda_featsel now also supports backward feature selection
Version 1.1.1 (09-Nov-2008)
- adding leave-one-out evaluation to sdcrossval
- adding sdfeatsel: robust feature selection using an internal cross-validation loop. It supports custom-made feature selection algorithms
- two example algorithms added illustrating the use of feature selection during training (sda_featsel_example1) and in an inner cross-validation loop based on sdfeatsel (sda_featsel_example2)
Version 1.1 (04-Nov-2008)
- fixing critical bug in 1.1 26-Oct-2008 related to problem with dongles
- fixing the issues with one-sample test sets in ROC
Version 1.1 (26-Oct-2008)
- sdscatter gets full support for GUI menus and class renaming
- new sdimage command visualizing an image stored in a dataset. Support for label painting, class renaming, multiple sample groupings, connection to sdscatter
- sdscatter support for interactive sample inspector (datasets with 1D data using bar plot or 2D images)
- sddrawroc can now show confusion matrices at the cursor and at the selected operating point (if present, i.e. if the 'confmat' flag was specified in the sdroc command)
- sdexe now automatically converts sdalg algorithms into pipelines
- sdstackgen now also returns a robust base classifier (mean fusion of per-fold trained base classifiers) as a second output
- improved support for PRTools classifiers with output conversion
- fix: sdroc now stores confusion matrices in multi-class situations using the 'confmat' flag
- fix to scaling using affine projection. scalem is now supported for all affine scaling types
Version 1.0 (15-Sep-2008)
- added randomization cross-validation scheme sdcrossval(nmc,data,'method','random')
- ROC object may be queried using short names of measurements r(:,'err(Cancer)')
- activation support for commercial demos
Version 1.0 (02-Sep-2008)
- fix: included missing sda_prtools wrapper
- new feature: sdscatter now allows for user-defined titles (sample details moved to the figure title bar)
