- 7.1. Introduction
- 7.2. Working with image grid
- 7.3. Types of local image features
- 7.3.1. Local mean and standard deviation
- 7.3.2. Local histograms
- 7.3.3. Features computed from local histograms
- 7.3.4. Co-occurrence matrices
- 7.4. Propagating image labels
- 7.5. Computing features for image regions
This chapter describes extraction of local image features.
7.1. Introduction
perClass provides a simple-to-use framework for working with local image
information. This significantly reduces the time needed to build classifiers
for problems such as medical diagnostics or machine inspection. Typically, we
need to detect a concept such as "disease" in patients' scans based on
local textural and appearance patterns. We start from images with disease
regions annotated by an expert. First, we must extract features in local
image neighborhoods (blocks). Second, we need to collect such labeled
information from multiple images, because only then can our classifiers
reach the necessary generalization capability. The perClass sdextract
command simplifies the extraction of local image features.
It takes image data as input and returns a data set with feature vectors computed from image blocks on a regular grid.
Let us take a gray-level microscopic image of a detergent particle:
>> im
16384 by 1 sddata, class: 'unknown'
>> sdimage(im)

By default, sdextract computes local mean and standard deviation features
in 8x8 pixel blocks with a step of one pixel:
>> a=sdextract(im)
14641 by 2 sddata, class: 'unknown'
We may visualize any data set holding image data using sdimage:
>> sdimage(a)
Pressing the space bar hides the blue label layer. Using the up and down
cursor keys, we may flip between the two features (bands in image a):

Note that the data set a contains fewer samples than the original image
im. This is because each sample in a represents an image block centered
around it. Features cannot be computed for pixels near the image border,
where part of the block would fall outside the image.
7.2. Working with image grid
7.2.1. Computing features on image grid
sdextract computes local image features on a grid defined by two
parameters: block and step. While block specifies the size of the local
neighborhood, step denotes the spacing between two neighboring blocks. We
may change these parameters using the 'block' and 'step' options:
>> b=sdextract(im,'block',4,'step',2)
3969 by 2 sddata, class: 'unknown'
The block size controls the level of local detail captured by the features. Setting a larger step is useful to reduce the number of samples extracted from high-resolution imagery.
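The sample counts reported above can be checked with a little arithmetic. The following is a minimal sketch in plain Python, not perClass code; the function names are hypothetical, and the formula (blocks must fit fully inside the image) is an assumption that happens to agree with the sizes shown in this chapter.

```python
def blocks_along_axis(size, block, step):
    """Number of block positions that fit fully inside one image axis."""
    if block > size:
        return 0
    return (size - block) // step + 1


def grid_size(imsize, block, step):
    """Grid size (rows, cols) for a square block moved with the given step."""
    rows, cols = imsize
    return (blocks_along_axis(rows, block, step),
            blocks_along_axis(cols, block, step))


if __name__ == "__main__":
    # Settings used in this chapter on the 128x128 image
    for block, step in [(8, 1), (4, 2), (16, 1)]:
        rows, cols = grid_size((128, 128), block, step)
        print(block, step, (rows, cols), rows * cols)
```

For the 128x128 image, block 8 with step 1 gives a 121x121 grid (14641 samples), and block 4 with step 2 gives the 63x63 grid (3969 samples) reported by getiminfo.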
Any data set holding an image carries information on the original image
size and on the grid settings used to compute the features. This
information is returned by the getiminfo command. Let us display the image
info for the original image im and for the data set b extracted with
sdextract:
>> getiminfo(im)
ans =
imsize: [128 128]
>> getiminfo(b)
ans =
imsize: [128 128]
grid: 1
block: 4
step: 2
gridsize: [63 63]
A specific field may be returned directly from the image info structure by
passing its name to getiminfo:
>> s=getiminfo(im,'imsize')
s =
128 128
7.2.2. Visualization of feature images computed on a grid
Data sets with feature vectors computed on the grid may still be visualized
using the sdimage command.
>> sdimage(b)

Note the gaps between data points. To remove the grid pattern for visualization, we may use the 'grid' option of sdimage:
>> b2=sdimage(b,'grid')
3969 by 2 sddata, class: 'unknown'
>> getiminfo(b2)
ans =
imsize: [63 63]
>> sdimage(b2)

7.3. Types of local image features
sdextract provides several types of local image features. The feature
type is selected using the 'feat' option.
7.3.1. Local mean and standard deviation
This set of two features is the default. It may also be selected explicitly using 'feat','moments'.
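To make the two features concrete, here is a minimal sketch in plain Python, not perClass code. The function name, the row-by-row scan order and the use of the population standard deviation (dividing by n) are assumptions of this illustration; sdextract may differ in these details.

```python
import math


def local_moments(image, block=8, step=1):
    """Return [mean, std] for every block that fits fully inside the image."""
    rows, cols = len(image), len(image[0])
    features = []
    for r in range(0, rows - block + 1, step):
        for c in range(0, cols - block + 1, step):
            values = [image[i][j]
                      for i in range(r, r + block)
                      for j in range(c, c + block)]
            mean = sum(values) / len(values)
            # Population variance (divide by n): an assumption of this sketch
            var = sum((v - mean) ** 2 for v in values) / len(values)
            features.append([mean, math.sqrt(var)])
    return features


# Tiny 3x4 image with a vertical edge between columns 2 and 3
image = [[0, 0, 4, 4],
         [0, 0, 4, 4],
         [0, 0, 4, 4]]
feats = local_moments(image, block=2, step=1)
```

Blocks lying entirely on one side of the edge have zero standard deviation, while blocks straddling the edge have a high one, which is why this feature responds to local texture and edges.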
7.3.2. Local histograms
A local histogram with hist_bins bins is estimated in each image
block. The histogram is normalized to sum to one. By default, 8 bins are
spread between the minimum and maximum value of the input data set. The
data range may be adjusted using the data_range option.
>> c=sdextract(im,'feat','hist')
14641 by 8 sddata, class: 'unknown'
>> c=sdextract(im,'feat','hist','hist_bins',16)
14641 by 16 sddata, class: 'unknown'
>> [min(+im) max(+im)]
ans =
0 197
>> c=sdextract(im,'feat','hist','data_range',[0 255])
14641 by 8 sddata, class: 'unknown'
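The effect of hist_bins and data_range can be illustrated with a minimal sketch in plain Python, not perClass code. The function name and the binning convention (equal-width bins, the top edge falling into the last bin) are assumptions of this illustration.

```python
def block_histogram(values, data_range, bins=8):
    """Normalized histogram of pixel values with equal-width bins."""
    lo, hi = data_range
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / width) if width > 0 else 0
        # Clamp so that v == hi (or out-of-range values) land in a valid bin
        idx = min(max(idx, 0), bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


pixels = [0, 10, 100, 197]
h_tight = block_histogram(pixels, (0, 197))  # range taken from the data
h_wide = block_histogram(pixels, (0, 255))   # fixed 8-bit range
```

With the range taken from the data, the brightest pixel falls into the last bin. With the fixed range [0 255], the same pixel falls into an earlier bin and the last bin stays empty. Fixing data_range keeps histograms comparable across images with different intensity ranges.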
7.3.3. Features computed from local histograms
The 'histfeat' feature set provides five features computed from the local histogram, namely histogram mean, 2nd moment, skewness, kurtosis and entropy.
As with the 'hist' feature set, we may specify hist_bins and
data_range.
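The following minimal sketch in plain Python, not perClass code, shows one common way to derive these five quantities from a normalized histogram. The exact definitions used by sdextract are not given here, so everything below is an assumption: moments are taken about the mean over bin centers, skewness and kurtosis are the standardized 3rd and 4th moments, and entropy is measured in bits.

```python
import math


def histogram_features(p, data_range):
    """Mean, 2nd central moment, skewness, kurtosis and entropy of histogram p."""
    lo, hi = data_range
    width = (hi - lo) / len(p)
    centers = [lo + (k + 0.5) * width for k in range(len(p))]

    mean = sum(pk * x for pk, x in zip(p, centers))

    def central_moment(order):
        return sum(pk * (x - mean) ** order for pk, x in zip(p, centers))

    m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)
    # Guard against flat blocks where all mass sits in a single bin
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    entropy = -sum(pk * math.log2(pk) for pk in p if pk > 0)
    return [mean, m2, skewness, kurtosis, entropy]


# Histogram with half of the pixels in the first bin and half in the last
feats = histogram_features([0.5, 0.0, 0.0, 0.5], (0, 4))
```

A block with all pixels in one bin yields zero entropy, while a block spread evenly over many bins yields high entropy, so the five numbers summarize the shape of the histogram in far fewer features than the histogram itself.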
7.3.4. Co-occurrence matrices
A co-occurrence matrix is a two-dimensional histogram estimating the probability that a pixel has a specific gray level while a displaced pixel exhibits another gray level. It encodes structural information that is useful for deriving informative data representations in texture classification problems.
This feature extractor has three parameters. The first, 'cmbins', is the
number of gray-level bins. The output data set contains the square of
'cmbins' features; the default value of 8 leads to 64 features.
It is often useful to reduce the number of bins so that the co-occurrence
matrices are better filled with values. The second parameter is the
displacement distance 'cmdispl', the number of pixels between the two
pixels of each pair used to fill the 2D histogram. By default, 'cmdispl'
is 1. The value cannot exceed the block size. Increasing the displacement
reduces the number of pixel pairs and thus the amount of information in the
co-occurrence matrix. Finally, as for any histogram, the data_range option
sets the range of values covered by the bins. As with the histogram
features above, the default is the minimum and maximum of the image data.
>> c=sdextract(im,'feat','cm')
14641 by 64 sddata, class: 'unknown'
>> c=sdextract(im,'feat','cm','cm_bins',4,'block',16)
12769 by 16 sddata, class: 'unknown'
>> +c(1)
ans =
Columns 1 through 7
0.5417 0.1708 0.0063 0 0.1708 0.0833 0.0104
Columns 8 through 14
0 0.0063 0.0104 0 0 0 0
Columns 15 through 16
0 0
We may reshape the values to see the 2D co-occurrence histogram:
>> reshape(+c(1),[4 4])
ans =
0.5417 0.1708 0.0063 0
0.1708 0.0833 0.0104 0
0.0063 0.0104 0 0
0 0 0 0
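The matrix above is symmetric and its entries sum to one. A minimal sketch in plain Python, not perClass code, reproduces these two properties. The function name, the horizontal displacement direction and the counting of every pair in both orders are assumptions of this illustration; the directions actually used by sdextract are not stated in this chapter.

```python
def cooccurrence(block, bins, data_range, displ=1):
    """Symmetric, normalized co-occurrence matrix for horizontal pixel pairs."""
    rows, cols = len(block), len(block[0])
    if displ >= cols:
        raise ValueError("displacement must be smaller than the block size")
    lo, hi = data_range
    width = (hi - lo) / bins

    def level(v):
        # Quantize a pixel value into one of the gray-level bins
        idx = int((v - lo) / width) if width > 0 else 0
        return min(max(idx, 0), bins - 1)

    counts = [[0] * bins for _ in range(bins)]
    for i in range(rows):
        for j in range(cols - displ):
            a = level(block[i][j])
            b = level(block[i][j + displ])
            counts[a][b] += 1
            counts[b][a] += 1  # count the pair in both orders (symmetric)
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]


block = [[0, 0, 3],
         [0, 3, 3]]
cm = cooccurrence(block, bins=2, data_range=(0, 4))
features = [v for row in cm for v in row]  # flattened: bins * bins features
```

The number of pairs per row is the block width minus the displacement, which is why a larger displacement leaves fewer pairs to fill the matrix.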
7.4. Propagating image labels
When applied to an sddata data set with image data, sdextract
propagates image labels and all sample properties to the output data.
We may, for example, define labels by hand, painting directly in the sdimage
figure. In this example, we painted background, particle and "interesting
texture" regions. We save the data set in the Matlab workspace using the
Image menu:
>> sdimage(im)

>> Creating data set data2 in the workspace.
16384 by 1 sddata, 3 classes: 'background'(7629) 'particle'(7760) 'interesting texture'(995)
We may now compute the features on data set data2. Each output sample
receives the class label of the pixel that serves as the center of its
block:
>> a=sdextract(data2,'block',4,'feat','histfeat')
15625 by 5 sddata, 3 classes: 'background'(7517) 'particle'(7113) 'interesting texture'(995)
>> sdimage(a)
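The propagation rule can be sketched in a few lines of plain Python, not perClass code. The function name is hypothetical, and the choice of the center pixel as offset block // 2 is an assumption; for even block sizes such as 4 the toolbox may pick a different one of the central pixels.

```python
def propagate_labels(labels, block=4, step=1):
    """Label of each block, taken from the pixel at the block center."""
    rows, cols = len(labels), len(labels[0])
    half = block // 2  # center offset; convention for even sizes is assumed
    out = []
    for r in range(0, rows - block + 1, step):
        for c in range(0, cols - block + 1, step):
            out.append(labels[r + half][c + half])
    return out


# 'b' = background, 'p' = particle
labels = [['b', 'b', 'p', 'p'],
          ['b', 'b', 'p', 'p'],
          ['b', 'p', 'p', 'p']]
block_labels = propagate_labels(labels, block=3, step=1)
```

Because labels near the image border never become block centers, the class counts in the output are lower than in the input, as seen for 'background' and 'particle' above.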

7.5. Computing features for image regions
sdextract supports computation of features in user-defined image
regions. This is typically used in situations where the image covers a
larger area than the object of interest. Using an external classifier or
image processing techniques, we may often easily distinguish foreground
from background, for example, separate an object from the conveyor belt.
We may then compute the features only on the object and not waste time on
the background area.
7.5.1. Defining a mask matrix
One way to limit the feature extraction is to provide a mask matrix with the same size as the original image. Let us create a mask matrix and fill the region of interest with ones:
>> m=zeros(getiminfo(im,'imsize'));
>> m(20:100,30:100)=1;
>> d=sdextract(im,'mask',m)
4736 by 2 sddata, class: 'unknown'
>> sdimage(d)

The masking operation has a user-defined parameter called 'mask_frac'. It
specifies the minimum fraction of an image block that must lie inside the
mask for the block to be included in the sdextract output. By default,
'mask_frac' is 1, which means that only blocks fully inside the mask
region are included.
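The sample count in the masked example can be reproduced with a minimal sketch in plain Python, not perClass code. The function name is hypothetical, and accepting a block when its masked fraction is greater than or equal to mask_frac is an assumption; only the default behaviour (blocks fully inside the mask) is confirmed by the text above.

```python
def count_masked_blocks(mask, block=8, step=1, mask_frac=1.0):
    """Count blocks whose fraction of masked pixels reaches mask_frac."""
    rows, cols = len(mask), len(mask[0])
    area = block * block
    accepted = 0
    for r in range(0, rows - block + 1, step):
        for c in range(0, cols - block + 1, step):
            inside = sum(mask[i][j]
                         for i in range(r, r + block)
                         for j in range(c, c + block))
            if inside / area >= mask_frac:
                accepted += 1
    return accepted


# Same region as m(20:100,30:100)=1 in the Matlab example (0-based here)
mask = [[0] * 128 for _ in range(128)]
for i in range(19, 100):
    for j in range(29, 100):
        mask[i][j] = 1

print(count_masked_blocks(mask))  # 4736, matching the sdextract output above
```

The mask region spans 81 by 71 pixels, so 74 by 64 blocks of size 8 fit fully inside it. Under the assumed rule, lowering mask_frac admits blocks that only partly overlap the region and therefore increases the count.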
7.5.2. Computing features on a data set subset
Because any subset of a data set is itself an image, sdextract may be
applied to it. This includes, for example, class-specific regions or
regions defined by classifier decisions.
What happens on the region boundaries? perClass takes a simple approach: features are only computed in local blocks that do not contain holes. This is important because features extracted from blocks with holes may be entirely uninformative and introduce unnecessary noise into classifier training.
Let us take an example where we compute features on the "interesting
texture" regions in data set data2:
>> data2
16384 by 1 sddata, 3 classes: 'background'(7629) 'particle'(7760) 'interesting texture'(995)
Let us first extract features from the entire image and then visualize the
feature image for the "interesting texture" class only. Note that we use a
regular expression /int to retrieve this class by substring:
>> out1=sdextract(data2)
14641 by 2 sddata, 3 classes: 'background'(7330) 'particle'(6316) 'interesting texture'(995)
>> sdimage(out1(:,:,'/int'))

Alternatively, we may first extract the class and then run sdextract on this image subset:
>> input=data2(:,:,'/int')
995 by 1 sddata, class: 'interesting texture'
>> out2=sdextract(input)
372 by 2 sddata, class: 'interesting texture'
>> sdimage(out2)

Note that in the second case we get fewer output samples, because only 372 image blocks are entirely contained in the class region.
