
Dog Breed Classifier

A dog identification app using machine learning

Photo by Alice Castro from Pexels

Project Definition

Project Overview

In this project I develop the ideas behind a dog identification app using deep learning. The software is intended to accept any user-supplied image as input. If a dog is detected in the image, it provides an estimate of the dog’s breed. If a human is detected, it provides an estimate of the dog breed that the person most resembles.

Problem Statement

As described above, the goal of this project is to take the first steps towards an algorithm that could be used as part of a mobile or web app for dog breed classification. At the end of the project, the code accepts any user-supplied image as input and returns an estimate of the dog’s breed if a dog is detected, or the most closely resembling dog breed if a human is detected. The work is split into the following steps:

  • Step 1: Detect Humans
  • Step 2: Detect Dogs
  • Step 3: Create a CNN to Classify Dog Breeds (from Scratch)
  • Step 4: Use a CNN to Classify Dog Breeds (using Transfer Learning)
  • Step 5: Create a CNN to Classify Dog Breeds (using Transfer Learning)
  • Step 6: Write your Algorithm
  • Step 7: Test Your Algorithm

Metrics

The model is expected to reach an accuracy of at least 80%. The loss function is the cross-entropy between the labels and the predictions. Accuracy measures how often the predicted breed (the prediction) matches the true breed (the label).
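To make the two metrics concrete, here is a minimal NumPy sketch of cross-entropy loss and accuracy on a toy batch. The function names and the three-class toy data are my own illustration, not the code used in the project, where the framework computes both values during training.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    # Clip to avoid log(0) when a class is predicted with zero probability.
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))


def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the label."""
    return float(np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1)))


# Toy batch: 3 images and 3 breeds (the real dataset has 133 breeds).
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.5, 0.3, 0.2]])

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(f"loss={loss:.4f}, accuracy={acc:.4f}")
```

In this toy batch the first two images are classified correctly and the third is not, so the accuracy is 2/3, while the loss is dominated by the low probability assigned to the third image's true class.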

Analysis

Data Exploration

Here, the dataset of dog images is imported. A few variables are created to hold the data:

  • train_targets, valid_targets, test_targets — numpy arrays containing onehot-encoded classification labels
  • dog_names — list of string-valued dog breed names for translating labels
  • human_files — numpy array containing file paths of human images
There are 133 total dog categories.
There are 8351 total dog images.

There are 6680 training dog images.
There are 835 validation dog images.
There are 836 test dog images.
There are 13233 total human images.
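The one-hot encoded targets and the dog_names list work together: each row of a targets array has a single 1 in the column of the true breed, and dog_names maps that column index back to a name. The sketch below illustrates this with a three-breed subset and made-up labels. The one_hot helper is hypothetical and is not the loading code used in the project.

```python
import numpy as np


def one_hot(class_indices, num_classes):
    """Encode integer class indices as rows with a single 1.0 per row."""
    encoded = np.zeros((len(class_indices), num_classes), dtype=np.float32)
    encoded[np.arange(len(class_indices)), class_indices] = 1.0
    return encoded


# Illustrative subset only; the real dog_names list has 133 entries.
dog_names = ["Affenpinscher", "Afghan_hound", "Airedale_terrier"]

# Made-up integer labels for four images.
labels = np.array([2, 0, 1, 2])
targets = one_hot(labels, num_classes=len(dog_names))

# Translate one-hot rows back into breed names.
decoded = [dog_names[i] for i in np.argmax(targets, axis=1)]
print(targets)
print(decoded)
```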

Data Visualization

A sample visualization of the data set is shown below:

Sample human images
Sample dog images
Number of images by breed in the training set
Image height and width distribution
Height/Width distribution of the images

Methodology

Data Preprocessing

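When using TensorFlow as the backend, Keras CNNs require a 4D array (which we’ll also refer to as a 4D tensor) as input, with shape (nb_samples, rows, columns, channels), where nb_samples is the number of images and rows, columns, and channels describe the height, width, and color channels of each image.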
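As a rough illustration of the 4D input, the sketch below stacks several images into a single tensor. It assumes the channels-last layout (samples, rows, columns, channels) that Keras uses by default with TensorFlow, and it assumes 224×224 RGB images. Random arrays stand in for real photos, and the helper names are hypothetical rather than the project's own functions.

```python
import numpy as np


def image_to_tensor(image):
    """Turn one (rows, columns, channels) image into a (1, rows, columns, channels) tensor."""
    return np.expand_dims(image.astype(np.float32), axis=0)


def images_to_tensor(images):
    """Stack several images into one (nb_samples, rows, columns, channels) tensor."""
    return np.vstack([image_to_tensor(img) for img in images])


# Random pixel data standing in for five loaded and resized photos (assumed 224x224 RGB).
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8) for _ in range(5)]

# Rescale pixel values from [0, 255] to [0, 1] before feeding them to a network.
batch = images_to_tensor(fake_images) / 255.0
print(batch.shape)
```

A single image becomes a tensor with a leading dimension of 1, so the same network can be used for one image at prediction time and for a whole batch during training.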

Implementation

The implementation follows the steps laid out in the Problem Statement section. Each part is described in more detail below:

Accuracy history of training and validation data

Refinement

Use a CNN to Classify Dog Breeds (using transfer learning)

Accuracy history of training and validation data using VGG-16
Accuracy history of training and validation data using ResNet-50

Results

Model evaluation and validation

In this step, we write an algorithm that takes an image of a dog or a human and outputs a dog breed that closely resembles the dog or human. The algorithm is as follows:
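The control flow can be sketched as follows. This is only an outline under stated assumptions: the three callables (is_dog, is_human, predict_breed) are hypothetical stand-ins for the dog detector, the human face detector, and the trained breed classifier, and the fallback message for images containing neither is my own assumption.

```python
def identify_breed(image_path, is_dog, is_human, predict_breed):
    """Return a message for the image, checking for a dog first and a human second."""
    if is_dog(image_path):
        return f"Dog detected. Estimated breed: {predict_breed(image_path)}"
    if is_human(image_path):
        return f"Human detected. Most resembling breed: {predict_breed(image_path)}"
    # Assumption: handling for images with neither a dog nor a human.
    return "Neither a dog nor a human was detected."


# Hypothetical stand-ins so the sketch runs without any trained model.
def fake_is_dog(path):
    return "dog" in path


def fake_is_human(path):
    return "human" in path


def fake_predict_breed(path):
    return "Labrador_retriever"


for path in ["dog_01.jpg", "human_01.jpg", "landscape.jpg"]:
    print(path, "->", identify_breed(path, fake_is_dog, fake_is_human, fake_predict_breed))
```

Passing the detectors and the classifier in as arguments keeps the decision logic separate from the models, so a VGG-16 or ResNet-50 based classifier can be swapped in without changing this function.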

Justification

The dogs were correctly identified as dogs. The predicted breeds look a little off; including more training data or augmenting the dataset might raise the test accuracy and therefore improve the predictions. In general, though, the predictions are pretty close. The human faces were all detected as human faces, and the dog breeds predicted for them are also pretty close. There is still room to improve the model, which is discussed below.

Conclusion

Reflection

In this project, several approaches were developed for identifying dog breeds in an app. The best results came from a transfer learning model, which achieved an accuracy of 84.21% in our tests.

Lesson Learned

In this project, we learned how to build convolutional networks from scratch, which was a very educational undertaking. The project is designed so that each step is easy to follow. The tools and algorithms used in it (Keras ResNet, VGG-16, etc.) gave me great insight into how to use a broader range of built-in machine learning models.

Challenges

The main challenge I faced in this project was not having enough background knowledge of the newly developed machine learning algorithms and tools. As a result, building the model from scratch took me some time.

Improvement

Despite the decent test accuracy, there are several options to further improve the model/algorithm:

  • add more layers to make our model more complex and more powerful.
  • build a simple web app to leverage the model and predict breeds through user-input images.
