perClass Documentation
development version 3.2 (14-Mar-2012)

Chapter 16: Classifier deployment using perClass runtime library

16.1. Introduction ↩

perClass provides tools for easy deployment of pattern recognition algorithms in products. The cornerstone of the deployment framework is the pipeline object, discussed in Chapter 7. Pipelines always execute through the perClass runtime library written in C. Under Matlab, the execution is routed through the MEX interface. To execute a pipeline in a custom application outside Matlab, we need to export it using the sdexport function.

Let us illustrate classifier deployment on a simple example. We build a simple Gaussian classifier and add to it the default operating point:

>> load fruit; a
'Fruit set' 260 by 2 sddata, 3 classes: 'apple'(100) 'banana'(100) 'stone'(60) 
>> pd=sdgauss(a)*sddecide
sequential pipeline     2x1 'Gaussian model+Decision'
 1  Gaussian model          2x3  3 classes, 3 components (sdp_normal)
 2  Decision                3x1  weighting, 3 classes, 1 ops at op 1 (sdp_decide)

We may execute the classifier on a new sample:

>> sddata([1 2])*pd
sdlab with one entry: 'apple'

Now we export the pipeline pd to the external pipeline file myclassifier.ppl:

>> sdexport(pd,'myclassifier.ppl')

The classifier may now be executed directly outside Matlab using the command-line sdrun utility (in the interfaces/sdrun directory):

> sdrun.exe myclassifier.ppl -d " 1 2 "
apple

We have provided the sdrun command with the pipeline file and the data. It outputs the classifier decision. sdrun is the simplest way to execute classifiers outside Matlab.

In addition to the sdrun utility, perClass offers tools for easy embedding of classifiers in C/C++ programs or their execution from any environment capable of calling a DLL.

16.2. Execution of classifiers with command-line sdrun utility ↩

The sdrun utility allows execution of perClass classifiers from the operating system command line (on Windows, press Windows key+R to open the Run dialog and enter cmd). It is available for each supported platform under the interfaces\sdrun directory. Because it is statically linked with the perClass runtime, the sdrun utility has no external dependencies. It requires only a license file and a pipeline file. The license file is assumed to be located in the same directory as the sdrun executable.

16.2.1. Displaying pipeline information ↩

When providing sdrun only with a pipeline file, it displays the basic pipeline info:

> ./sdrun.exe myclassifier.ppl 
perClass Pro 3.0.0 (01-Jun-2011), Copyright (C) 2007-2011, PR Sys Design, All rights reserved
Commercial license. This license will expire on 1-aug-2011 (PR Sys Design)

pipeline name: 'Gaussian model+Decision'
input type: double, dimensionality: 2
output type: int, dimensionality: 1, decisions
possible decisions: apple,banana,stone

sdrun lists the pipeline name, input and output dimensionality and data type and the type of output (soft outputs or decisions). For the decision-returning pipelines, it also provides the list of possible decisions.

The pipeline name may be set by the user in Matlab using the setname command:

>> pd2=setname(pd,'myclassifier')
sequential pipeline     2x1 'myclassifier'
 1  Gaussian model          2x3  3 classes, 3 components (sdp_normal)
 2  Decision                3x1  weighting, 3 classes, 1 ops at op 1 (sdp_decide)

>> sdexport(pd2,'../src/perclass/myclassifier.ppl')
Exporting pipeline for deployment using perClass runtime

Note that in a final product using the perClass DLL, the pipelines do not need to be stored as separate files visible to the end-user. Instead, they may be stored in an internal application resource or buffer. An example of loading pipelines from a buffer using the C API is given in the ex_buffer.c file in the SDK directory.

16.2.2. Executing classifier on a data file ↩

The sdrun utility may execute the pipeline on a set of observations stored in a comma-separated text file. This option is useful for quick batch processing. The data file should store individual samples (feature vectors) as comma-separated lists, one row per sample.

> sdrun.exe myclassifier.ppl data.txt 
apple
banana
banana
apple
apple
...
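
The data file layout described above can be produced from C. The following is a minimal sketch, not part of the perClass SDK: the helper name write_samples_csv is hypothetical, and it assumes that sdrun accepts plain decimal numbers as written by printf.

```c
#include <stdio.h>

/* Hypothetical helper (not a perClass function).
   Write sc samples with fc features each (row-major in data) to out,
   one comma-separated row per sample.
   Returns 0 on success, -1 on a write error. */
int write_samples_csv(FILE *out, const double *data, int sc, int fc)
{
    int i, j;

    for (i = 0; i < sc; i++) {
        for (j = 0; j < fc; j++) {
            /* %.17g keeps enough digits to round-trip a double */
            if (fprintf(out, "%s%.17g", j > 0 ? "," : "",
                        data[i * fc + j]) < 0) {
                return -1;
            }
        }
        if (fputc('\n', out) == EOF) return -1;
    }
    return 0;
}
```

Calling write_samples_csv(f, x, 2, 2) on a file opened for writing, with x holding the values 1, 2, -5, 10, writes the two rows "1,2" and "-5,10". The resulting file can then be passed to sdrun in place of data.txt.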

16.2.3. Executing classifier on samples provided in a string ↩

The sdrun utility can be executed on one or a few data samples provided directly on the command line. Using the -d option, we may specify a string with space-separated feature values. Multiple feature vectors may be separated by semicolons.

> sdrun.exe myclassifier.ppl -d "1 2; -4 10; 0 4.55"
apple
stone
stone
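
When sdrun is launched from another program, the string for the -d option has to be assembled from numeric data. The sketch below shows one way to do that in C. The helper format_samples is hypothetical (it is not part of perClass), and the use of %g, which keeps about six significant digits, is an assumption that may need adjusting for high-precision features.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not a perClass function).
   Format sc samples with fc features each (row-major in data) as
   "f f; f f; ..." for the sdrun -d option.
   Returns 0 on success, -1 if buf is too small. */
int format_samples(char *buf, size_t buflen,
                   const double *data, int sc, int fc)
{
    size_t used = 0;
    int i, j, n;

    if (buflen == 0) return -1;
    buf[0] = '\0';

    for (i = 0; i < sc; i++) {
        for (j = 0; j < fc; j++) {
            /* features are separated by a space, samples by "; " */
            const char *sep = (j > 0) ? " " : (i > 0) ? "; " : "";

            n = snprintf(buf + used, buflen - used, "%s%g",
                         sep, data[i * fc + j]);
            if (n < 0 || (size_t)n >= buflen - used) return -1;
            used += (size_t)n;
        }
    }
    return 0;
}
```

For the three samples used above, format_samples produces the string 1 2; -4 10; 0 4.55, which can be quoted and appended after -d on the sdrun command line.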

16.3. Classifier execution in Microsoft Excel worksheets ↩

perClass classifiers may be executed directly in an MS Excel worksheet. An example for Excel 2010 is provided in the interfaces\Excel\perclass_excel_example directory.

Note: It is important to open the Excel example worksheet perClass_Excel.xlsm from Excel File/Open menu, not by double-clicking on the file icon. The reason is that only then Excel correctly assigns the "default file location" and will find the perclass.dll runtime library.

When we open the example worksheet, we can see the three parameters needed for classifier execution on the left and the green execution button on the right.

Classifier execution in Excel

We will first need to specify the path to the example directory in the B1 cell. This helps Excel to locate the perClass.dll library. Second, we need to provide the path to the pipeline file in B2. We may use our own classifier exported using sdexport command or try one of the three pipelines included with the example (fisher.ppl, parzen.ppl and parzen_dec.ppl).

Now, we may prepare the input data. We fill the data matrix directly in the worksheet. We must only specify the range of input data in the cell B3.

Classifier execution in Excel

When we click the "Execute classifier" button, the pipeline outputs are written to the right of the input data:

Classifier decisions in Excel

16.3.1. Execution on data stored in a different workbook ↩

Input data may reside in any other sheet or even in an entirely different workbook. We only need to provide the correct reference in the B3 cell of the perClass_example.xlsm file. This allows us to execute classifiers directly in our spreadsheets, without including any specific code or libraries.

Classifier execution in other Excel workbook

16.4. Executing classifiers from LabView ↩

perClass classifiers may be executed from LabView environment using the example interface perClass_example1.vi in interfaces\LabView directory.

Classifier execution in LabView

16.5. Executing classifiers from Matlab/Matlab compiler ↩

perClass 3.0 introduces a new deployment tool for executing classifiers from Matlab or Matlab compiler. The MEX library sdrun uses the perClass deployment license instead of the perClass toolbox license. Therefore, you may use it to distribute classifiers to third parties.

The sdrun MEX is fully self-contained. To embed classifier execution in a Matlab/Matlab compiler application, we only need to provide the sdrun binary, a license file and a classifier pipeline.

To use the sdrun MEX in Matlab, simply add the interfaces/MatlabCompiler/PLATFORM directory to your Matlab path:

>> addpath /home/pavel/perClass_Demo/interfaces/MatlabCompiler/mac64/

Typing sdrun, we will receive basic information and usage example:

>> sdrun
perClass Demo 3.0.0 (18-Jun-2011), Copyright (C) 2007-2011, PR Sys Design, All rights reserved
Demo license. Only for evaluation purposes. This license will expire on 03-aug-2011 ()

sdrun mex allows execution of classifiers trained in perClass Toolbox
in deployment mode (e.g. in custom applications made with Matlab compiler).

Usage: pind=sdrun('pipeline.ppl')   Load pipeline from file
   out=sdrun(pind,data)         Execute pipeline pind on the data

No pipelines loaded

16.5.1. Loading a classifier pipeline ↩

To load a classifier pipeline file, such as the myclassifier.ppl created above, we provide the file name to the sdrun MEX:

>> i=sdrun('myclassifier.ppl')
Pipeline 1 loaded: 'myclassifier.ppl' ('Gaussian model+Decision')

i =

       1

sdrun loads the pipeline and returns the pipeline index.

Typing sdrun again, we will see the list of loaded pipelines:

>> sdrun
perClass Demo 3.0.0 (18-Jun-2011), Copyright (C) 2007-2011, PR Sys Design, All rights reserved
Demo license. Only for evaluation purposes. This license will expire on 03-aug-2011 ()

sdrun mex allows execution of classifiers trained in perClass Toolbox
in deployment mode (e.g. in custom applications made with Matlab compiler).

Usage: pind=sdrun('pipeline.ppl')   Load pipeline from file
   out=sdrun(pind,data)         Execute pipeline pind on the data

One pipeline loaded: 
pind  name
 1 : 'Gaussian model+Decision'  input dim: 2, output dim: 1 (decisions: apple,banana,stone)

16.5.2. Executing a classifier on new data ↩

To execute the classifier on new data, simply provide pipeline index and data matrix. The sdrun MEX will return decisions as numerical indices.

>> out=sdrun(1,[0 0; 0 1; -10 10])

out =

       1
       1
       3

16.5.3. Working with multiple pipelines ↩

sdrun allows us to work with multiple pipelines. Here we load another pipeline:

>> sdexport(pd(1),'myclassifier2.ppl')
Exporting pipeline..ok
>> i=sdrun('myclassifier2.ppl')
Pipeline 2 loaded: 'myclassifier2.ppl' ('Gaussian model')

i =

       2

>> sdrun
perClass Demo 3.0.0 (18-Jun-2011), Copyright (C) 2007-2011, PR Sys Design, All rights reserved
Demo license. Only for evaluation purposes. This license will expire on 03-aug-2011 ()

sdrun mex allows execution of classifiers trained in perClass Toolbox
in deployment mode (e.g. in custom applications made with Matlab compiler).

Usage: pind=sdrun('pipeline.ppl')   Load pipeline from file
   out=sdrun(pind,data)         Execute pipeline pind on the data

2 pipelines loaded: 
pind  name
 1 : 'Gaussian model+Decision'  input dim: 2, output dim: 1 (decisions: apple,banana,stone)
 2 : 'Gaussian model'  input dim: 2, output dim: 3 (soft outputs)

Executing pipeline 2 on the same data returns soft outputs (probability densities):

>> out=sdrun(2,[0 0; 0 1; -10 10])

out =

0.0038    0.0012    0.0001
0.0026    0.0012    0.0003
0.0000    0.0000    0.0004
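
The relation between the soft outputs of pipeline 2 and the decisions of pipeline 1 can be illustrated with a plain arg-max rule. This is a sketch under an assumption: the Decision step applies the weighting of its operating point, which is not shown here, so the hypothetical helper argmax_decision below is a simplified stand-in and not perClass code.

```c
#include <stddef.h>

/* Hypothetical helper (not a perClass function).
   Return the 1-based index of the largest of the n soft outputs.
   Ties resolve to the lowest index; returns 0 if n is 0. */
int argmax_decision(const double *soft, size_t n)
{
    size_t i, best = 0;

    if (n == 0) return 0;

    for (i = 1; i < n; i++) {
        if (soft[i] > soft[best]) best = i;
    }
    return (int)best + 1;
}
```

Applied to the three rows of soft outputs displayed above, this rule yields 1, 1 and 3, the same codes that pipeline 1 returned for these samples. With a non-default operating point the weighted decisions may differ from the plain arg-max.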

16.5.4. Removing pipelines from memory ↩

To remove pipelines from memory, use clear mex:

>> clear mex
>> sdrun
perClass Demo 3.0.0 (18-Jun-2011), Copyright (C) 2007-2011, PR Sys Design, All rights reserved
Demo license. Only for evaluation purposes. This license will expire on 03-aug-2011 ()

sdrun mex allows execution of classifiers trained in perClass Toolbox
in deployment mode (e.g. in custom applications made with Matlab compiler).

Usage: pind=sdrun('pipeline.ppl')   Load pipeline from file
   out=sdrun(pind,data)         Execute pipeline pind on the data

No pipelines loaded

16.6. Classifier embedding using C/C++ language API ↩

The complete functionality of perClass execution is available through the C/C++ interface in the SDK directory. The runtime API changed slightly with the 3.0 release; see this article for details.

Steps that need to be taken to execute a classifier from a custom application:

  • initialize the perClass runtime library
  • load the pipeline
  • prepare input data buffer and attach it to the pipeline
  • prepare output data buffer and attach it to the pipeline

After that, the pipeline is ready for execution. To process more data, you can write them directly into the input buffer and execute the pipeline again.

16.6.1. Complete C application example ↩

This example (ex_basic.c) shows a complete application loading a classifier, processing some data and writing out the results. The example assumes that the input data are 2D feature vectors.

  1. /*  
  2.    ex_basic.c: Example of calling perClass runtime from C code and getting 
  3.    decisions for new data samples. 
  4.  
  5. */  
  6. #include <stdio.h>  
  7. #include <stdlib.h>  
  8. #include "perclass.h"  
  9.   
  10. #define SD_ABORT(pk)   \  
  11.   printf("%d: %s\n",sd_GetErrorCode(pk),sd_GetErrorMsg(pk));  \  
  12.   if( pk!=NULL ) sd_ReleaseKernel(pk);  \  
  13.   return(SD_ERROR);   
  14.   
  15. int main(void)  
  16. {  
  17.   prkernel* pk=NULL;  
  18.   int res,pind,sc,fc1,fc2,i;  
  19.   prbuf* pbin, *pbout;  
  20.   FILE* File;  
  21.   
  22.   /* initialize the PRSD library: pass NULL as we provide the license 
  23.      file in the same directory as the library binary 
  24.    */  
  25.   pk=sd_InitKernel(NULL);  
  26.   if( pk == NULL ) { SD_ABORT(pk); };  
  27.   
  28.   /* initially, the message buffer contains library version information */  
  29.   printf("%s\n",sd_GetErrorMsg(pk));  
  30.   
  31.   /* load the pipeline: returns the pipeline index */  
  32.   pind=sd_LoadPipeline(pk,"fisher_dec.ppl");  
  33.   if( pind==SD_ERROR ) { SD_ABORT(pk); };  
  34.   
  35.   /* print the pipeline name */  
  36.   printf("pipeline name='%s'\n",sd_GetPipelineName(pk,pind));  
  37.   
  38.   /* make sure the pipeline returns decisions */  
  39.   if( sd_GetDecCount(pk,pind)==0 ) {  
  40.     printf("Error: This example assumes that pipeline returns decisions.\n");  
  41.     sd_ReleaseKernel(pk);  
  42.     return(SD_ERROR);  
  43.   }  
  44.   
  45.   /* pipeline input dimensionality */  
  46.   fc1=sd_GetInputFc(pk,pind);  
  47.   
  48.   /* allocate the input buffer for two samples */  
  49.   sc=2;  
  50.   pbin=sd_BufNew(SD_DOUBLE,sc,fc1);  
  51.   
  52.   /* fill-in two samples  
  53.      IMPORTANT: we assume in this example that fc1==2 
  54.    */  
  55.   sd_BufSetValueDouble(pbin, 0, 0, 1.0); /* first sample, first feature */  
  56.   sd_BufSetValueDouble(pbin, 0, 1, 2.0); /* first sample, second feature */  
  57.   sd_BufSetValueDouble(pbin, 1, 0, -5.0); /* second sample, first feature */  
  58.   sd_BufSetValueDouble(pbin, 1, 1, 10.0); /* second sample, second feature */  
  59.   
  60.   /* attach input buffer to the pipeline */  
  61.   res=sd_BufAttachToInput(pk,pind,pbin);  
  62.   if( res==SD_ERROR ) {  
  63.     sd_BufFree(pbin);  
  64.     SD_ABORT(pk);  
  65.   }  
  66.   
  67.   /* pipeline output dimensionality */  
  68.   fc2=1; /* 1D because the pipeline returns decisions */  
  69.   
  70.   /* allocate output buffer */  
  71.   pbout=sd_BufNew(SD_INT,sc,fc2);  
  72.   
  73.   /* attach output buffer to the pipeline */  
  74.   res=sd_BufAttachToOutput(pk,pind,pbout);  
  75.   if( res==SD_ERROR ) {  
  76.     sd_BufFree(pbout);  
  77.     SD_ABORT(pk);  
  78.   }  
  79.   
  80.   /* Execute pipeline */  
  81.   res=sd_Execute(pk,pind);  
  82.   
  83.   /* print out the outputs (we assume 1D output) */  
  84.   for(i=0; i<sc; i++) {  
  85.     res=sd_BufGetValueInt(pbout,i,0); /* decision as integer */  
  86.     printf("out(%d)=%d, '%s'\n",i, res, sd_GetDecName(pk,pind,res) );  
  87.   }  
  88.   
  89.   /* releasing all buffers we allocated ourselves is our responsibility */  
  90.   sd_BufFree(pbin);  
  91.   sd_BufFree(pbout);  
  92.   
  93.   /* release the PRSD library */  
  94.   sd_ReleaseKernel(pk);  
  95.     
  96.   return(0);  
  97. }  

Notes:

  • line 8: The only include needed to use perClass runtime in a custom project is perclass.h header

  • lines 10-13 define an abort mechanism for error handling. It prints the error code and message, releases the main perClass runtime structure pk and returns SD_ERROR.

  • line 25: To use the perClass runtime, the library needs to be initialized using the sd_InitKernel function. It checks for the license file. If NULL is passed to it, it looks for the license file in the directory where the library binary (DLL) is located. To pass the license in a string, use the sd_InitKernelLicString function instead.

  • line 32 loads the pipeline from file and returns pipeline index pind or SD_ERROR.

  • line 36 prints pipeline name. Note that pind pipeline index is used. All the pipeline-handling functions accept pind and thus allow use of multiple pipelines in one session.

  • line 39 asserts that the pipeline returns decisions

  • lines 45-50 prepare the input data buffer pbin. The buffer of SD_DOUBLE type is allocated on line 50 for 2 samples and the number of features read out from the pipeline.

  • lines 55-58 fill the values into the input buffer. Note that this example uses prbuf object and user-friendly functions to manipulate the data. It is also possible to handle input/output buffers using standard mechanisms such as malloc/free or custom allocators and handle memory through low-level pointer access.

  • line 61: input buffer is attached to the pipeline.

  • lines 68-71 prepare the output buffer pbout. Note that the output buffer is of SD_INT type as our pipeline produces integer decisions. Prior to perClass 3.0, outputs were always doubles.

  • line 74 attaches the output buffer to the pipeline

  • line 81: pipeline is executed on data from input buffer, results are written to output buffer

  • lines 84-87 print out the decisions. The sd_BufGetValueInt function reads the integer decision code from the output buffer. Note the conversion from the integer decision code to the decision name using sd_GetDecName.

  • lines 90-94 release the structures. All buffers allocated by us need to be also freed by us explicitly (even if attached to the pipeline).

16.6.2. Using multiple pipelines ↩

libPRSD allows switching between multiple pipelines in one session. A call to sd_LoadPipeline returns a pipeline index pind. All functions operating on a pipeline specify the working pipeline using this index.

16.6.3. Handling decisions ↩

A pipeline may return soft outputs (e.g. probabilities) or decisions. To test if a pipeline returns decisions, use the sd_GetDecCount function. If it returns 0, the pipeline returns soft outputs.

For the complete example, see ex_decisions.c file.

ex_decisions.c example of returning decisions
  if( sd_GetDecCount(pk,pind)==0 ) {
    printf("Expected pipeline returning decisions. Aborting.\n");
    SD_ABORT(pk);
  }

Pipeline decisions are provided as numerical decision codes that need to be translated into decision names if needed. Decision codes preserve the same values that were used when creating the classifier in Matlab.

Let us consider a simple example building a three-class linear classifier:

>> load fruit
>> a
'Fruit set' 260 by 2 sddata, 3 classes: 'apple'(100) 'banana'(100) 'stone'(60) 
>> p=sdlinear(a)*sddecide
sequential pipeline     2x1 'Gauss eq.cov.+Output normalization+Decision'
 1  Gauss eq.cov.           2x3  3 classes, 3 components (sdp_normal)
 2  Output normalization    3x3  (sdp_norm)
 3  Decision                3x1  weighting, 3 classes, 1 ops at op 1 (sdp_decide)

>> sdexport(p,'linear_dec.ppl')
Exporting pipeline for deployment using libPRSD

>> p.list
sdlist (3 entries)
 ind name
   1 apple 
   2 banana
   3 stone 

>> [1 2; -5 10]*p

ans =

 1
 3

The classifier returns one of the three decisions, numerically 1=apple, 2=banana, 3=stone.

To display the pipeline decisions in C as strings, we use the following code:

ex_decisions.c example of returning decisions
  1. /* print out the outputs (we assume 1D output) */  
  2. for(i=0; i<sc; i++) {  
  3.   res=sd_BufGetValueInt(pbout,i,0);  
  4.   
  5.   printf("out(%d): %d='%s'\n",i, res,sd_GetDecName(pk,pind,res) );  
  6. }  

On line 3, we read the integer pipeline output into res. On line 5, we print both the numerical decision code res and the corresponding decision name.

The output of the C code for the two samples also used in Listing 1 ([1 2; -5 10]) is:

pipeline name='Gauss eq.cov.+Output normalization+Decision'
out(0): 1='apple'
out(1): 3='stone'