Improvement Algorithm for CNN Image Classification. Part 1: Image Feature Extraction

Alexander Laskorunsky
Jul 25, 2019

I want to show some effective ways to extract features from images using the power of NumPy vectorization.

As the subject of my experiments, I chose the Fashion MNIST dataset.

The dataset consists of:

  • 60,000 training and 10,000 test images;
  • each image is 28×28 pixels;
  • the images represent 10 different types of clothing.

As a picture, the training dataset can be imagined as 60,000 square sheets of paper (28×28) in 10 different colors, stacked one on top of the other:

A vectorized approach to data manipulation lets us bypass 60,000 "for" loop iterations and extract the same number of values in a single operation.
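As a quick illustration (using a small random array as a stand-in for the real dataset), a vectorized reduction over the image axes gives exactly the same values as a per-image loop, in one call:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((28, 28, 100))  # small stand-in for the (28, 28, 60000) dataset

# loop version: one pass per image
means_loop = np.array([X[:, :, i].mean() for i in range(X.shape[2])])

# vectorized version: a single call reducing over the first two axes
means_vec = X.mean(axis=(0, 1))

print(np.allclose(means_loop, means_vec))  # True
```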

But before we start, we are going to:

  • reshape our dataset into a (28, 28, 60000) shape;
  • initialize some variables: "m" as the number of images, "h" and "w" as the height and width of an image;
  • create a NumPy array to store the new features;
  • divide all input pixel values by 255.
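A minimal sketch of that setup (the variable names and the random stand-in data are my own; with the real dataset, `train_images` would be the (60000, 28, 28) array returned by a Fashion MNIST loader):

```python
import numpy as np

# Stand-in for the (60000, 28, 28) uint8 array a Fashion MNIST loader returns.
rng = np.random.default_rng(0)
train_images = rng.integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)

m = train_images.shape[0]   # number of images
h, w = 28, 28               # height and width of one image

# move the image index to the last axis -> shape (28, 28, m)
X = np.moveaxis(train_images, 0, -1).astype(np.float64)

X /= 255.0                          # scale pixel values to [0, 1]
features = np.zeros((m, 0))         # empty; new feature columns get stacked on later
```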

The first two features will be the coefficients of variation (CV) of the diagonal and anti-diagonal pixels of each image:

This scheme should help you understand exactly which features are obtained:

  • take the 28 diagonal (and anti-diagonal) pixel values of each image, giving a (28, 60000) array;
  • calculate the standard deviation (std) of these 28 diagonal values;
  • calculate the mean of these 28 diagonal values;
  • divide the std by the mean to get one value (the CV) per image, i.e. (1, 60000);
  • store these 60,000 values as one column (feature).
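The steps above can be sketched in a few vectorized lines (random stand-in data; with a `(28, 28, m)` dataset, `np.diagonal` pulls all 60,000 diagonals at once):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((28, 28, 100))  # stand-in for the (28, 28, 60000) dataset

diag = np.diagonal(X, axis1=0, axis2=1).T        # (28, m): main diagonal of every image
anti = np.diagonal(X[::-1], axis1=0, axis2=1).T  # (28, m): anti-diagonal (rows flipped)

cv_diag = diag.std(axis=0) / diag.mean(axis=0)   # one CV per image
cv_anti = anti.std(axis=0) / anti.mean(axis=0)

cv_features = np.column_stack([cv_diag, cv_anti])  # (m, 2): two feature columns
```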

To see the physical meaning of these features, we can do the following:

From the graph we can see which pixels we are going to take to calculate the first feature (the CV of the diagonal of all images).

But it is not yet clear how this looks from a machine learning point of view, so let's plot another graph:

As we can see, just these two columns allow us to separate "ankle boots" and almost all "sandals" from the other classes.

Let’s add some other features.

The idea of this feature is to capture the "shape" of a picture: "trousers" have a vertically oriented rectangular shape, while sneakers have a horizontal one.

Because the pictures are well prepared (not rotated, not skewed, etc.), we get a clear separation.

Plotting the full calculation for these two classes shows the power of this feature for pictures of this type.

Now let's calculate this feature for the whole dataset:
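The original snippet is not shown, so here is one plausible implementation of such a "shape" feature: the ratio of the vertical to the horizontal extent of the above-threshold pixels (tall for trousers, wide for sneakers). The function name, the threshold, and the exact definition are my own assumptions and may differ from the article's code:

```python
import numpy as np

def shape_ratio(X, thresh=0.1):
    """Height/width ratio of the lit region, per image; X has shape (h, w, m)."""
    mask = X > thresh                    # (h, w, m) boolean
    rows = mask.any(axis=1)              # (h, m): rows containing lit pixels
    cols = mask.any(axis=0)              # (w, m): columns containing lit pixels
    height = rows.sum(axis=0)            # occupied rows per image
    width = cols.sum(axis=0)             # occupied columns per image
    return np.nan_to_num(height / width) # tall shapes > 1, wide shapes < 1

rng = np.random.default_rng(0)
X = rng.random((28, 28, 100))            # stand-in for the real dataset
ratios = shape_ratio(X)                  # (m,) feature column
```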

The next 4 features are calculated over the 4 quadrants ([:14, :14], [:14, 14:], [14:, 14:], [14:, :14]) of each of the 60,000 images.

The calculations are the following:

  • split our dataset into 4 sub-datasets of shape (14, 14, 60000);
  • for each quadrant, calculate the standard deviation across all 60,000 images at once;
  • in the same way, get the mean of the quadrant values;
  • finally, divide the std by the mean to get the CV.

Looking at the picture above, we can write the code for the vectorized calculations:
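A sketch of those quadrant CVs (random stand-in data; the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((28, 28, 100))  # stand-in for the (28, 28, 60000) dataset

# the four (14, 14, m) quadrants: top-left, top-right, bottom-right, bottom-left
quadrants = [X[:14, :14], X[:14, 14:], X[14:, 14:], X[14:, :14]]

# one CV per quadrant per image -> (m, 4)
quad_cv = np.column_stack(
    [q.std(axis=(0, 1)) / q.mean(axis=(0, 1)) for q in quadrants]
)

quad_cv = np.nan_to_num(quad_cv)  # empty quadrants have zero mean -> NaN -> 0
```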

In the same way we can calculate the CV for the diagonal and anti-diagonal of each of these quadrants, as well as the CV of their sum:

  • split our dataset into 4 sub-datasets of shape (14, 14, 60000) (done above);
  • take the two diagonals of each image from each quadrant;
  • calculate the standard deviation of each diagonal;
  • get the means of the two diagonals;
  • divide the stds by the means;
  • then sum the stds of the diagonal and anti-diagonal, sum the means, and divide to get the CV.

Thus we get 12 new features.

To make it clear exactly what has to be calculated, I created a toy example with two 4×4 "images".

To get these features for the whole dataset, we need a few lines of code:
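Those lines could look like this: for each quadrant, three columns (CV of the diagonal, CV of the anti-diagonal, and the combined CV from the summed stds and means), 3 × 4 = 12 features in total. Random stand-in data and my own variable names:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((28, 28, 100))  # stand-in for the (28, 28, 60000) dataset

quadrants = [X[:14, :14], X[:14, 14:], X[14:, 14:], X[14:, :14]]

cols = []
for q in quadrants:
    d = np.diagonal(q, axis1=0, axis2=1).T        # (14, m) main diagonal
    a = np.diagonal(q[::-1], axis1=0, axis2=1).T  # (14, m) anti-diagonal
    sd, sa = d.std(axis=0), a.std(axis=0)
    md, ma = d.mean(axis=0), a.mean(axis=0)
    cols += [sd / md,                  # CV of the diagonal
             sa / sa.mean() * 0 + sa / ma,  # CV of the anti-diagonal
             (sd + sa) / (md + ma)]    # combined CV: summed stds over summed means

diag_features = np.nan_to_num(np.column_stack(cols))  # (m, 12)
```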

Here and later we will occasionally get NaN values because of division by zero: in these calculations, some quadrants of some images are empty, so the means of their pixels equal zero. To cover this, we use "np.nan_to_num" to convert the NaNs to zeros after all calculations.
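A tiny demonstration of that fix (the toy numbers are mine):

```python
import numpy as np

std = np.array([0.0, 0.2])
mean = np.array([0.0, 0.5])   # first "quadrant" is empty: mean == 0

with np.errstate(invalid="ignore"):
    cv = std / mean           # 0/0 produces NaN

cv = np.nan_to_num(cv)        # NaN -> 0.0, valid values untouched
print(cv)                     # [0.  0.4]
```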

To view the features in 3D we can use the "mayavi" library. A video would go here, but feel free to read my other article with links to a ready-made mayavi 3D visualization tool.

In the video we can see that, step by step, each newly added feature pulls the classes apart from each other: some are well separated, some are not yet. To explore data in high-dimensional space you can use Google's open-source tool: https://experiments.withgoogle.com/visualizing-high-dimensional-space.

In fact, we can extract tons of features from images: sometimes this can be more useful than deploying a CNN, because you can extract specific features that will speed up your neural network's convergence. For example, to classify objects by shape (circle, square, rectangle, etc.) you may need just a few specific features to achieve good results.

Let’s move on. These 28 features are very simple to get:

  • calculate the std of each row and column and sum them;
  • calculate the mean of each row and column and sum them;
  • divide the std by the mean to get the CV.

To calculate this, we need one line of code:
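Under my reading of the steps above (one CV per row/column index, pairing row i with column i), that line could look like this; the original code is not shown, so treat this as an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((28, 28, 100))  # stand-in for the (28, 28, 60000) dataset

# (column-i std + row-i std) / (column-i mean + row-i mean), per index -> (m, 28)
row_col_cv = np.nan_to_num(
    (X.std(axis=0) + X.std(axis=1)) / (X.mean(axis=0) + X.mean(axis=1))
).T
```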

The next 150 features are not fully vectorized because of the specifics of their extraction. So I'll give you the code to get them, and you can study it to understand exactly what is calculated. I'd also appreciate any suggestions on how to make them fully vectorized.

Anyway, their individual meaning may not be that important, because the main goal is to increase the prediction accuracy of the classification model that will use them. How useful these features are for that task, we'll see in the second part of this article.

Here you can download the full code to extract the features. Run your own experiments and find your own powerful features that can improve classification tasks. Don't forget to share them in the comments!
