Image Classification with a Neural Network
Image classification has numerous real-world applications, ranging from object detection in
self-driving cars to medical image analysis and even identifying galaxies in astronomical images.
With neural networks, we can build models that learn to recognize patterns and features directly
from labeled example images, and that improve as they see more training data.
Throughout this project, we will guide you through the entire process of building, training, and
evaluating a neural network for image classification. You will gain hands-on experience working with
popular deep learning frameworks, such as TensorFlow and Keras, to create a powerful image
classification model. We will also explore techniques for data preprocessing, model optimization,
and performance evaluation, ensuring that you develop a comprehensive understanding of the various
aspects involved in image classification.
The code builds a convolutional neural network (CNN) for image classification using TensorFlow
and Keras.
It uses an ImageDataGenerator to preprocess and augment images, which helps the model generalize.
The CNN architecture consists of convolutional, pooling, and dense layers. The model is trained on a
dataset containing the classes 'dog', 'strawberry', and 'something else', with a generator that
loads the images in batches. After training for 30 epochs, the code plots the accuracy and loss
curves and concludes by making a prediction on a new image with the trained model.
Importing Libraries
Import necessary libraries for building and training the neural network, data augmentation, and
visualization.
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing import image
import numpy as np
ImageDataGenerator for Data Augmentation
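Use ImageDataGenerator to perform data augmentation and preprocessing: rescaling pixel values to
the range [0, 1], shearing, zooming, horizontal flipping, rotation, horizontal and vertical shifts,
and brightness changes. Setting validation_split=0.2 reserves 20% of the images for validation.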
validation_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.2,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    brightness_range=[0.8, 1.2],
    fill_mode='nearest'
)
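To make two of these options concrete, the following is a minimal NumPy sketch of what rescaling by 1/255 and a horizontal flip do to an image array. It does not call Keras at all; the tiny 2x3 image and the variable names are made up for illustration, and the real generator applies its transformations randomly per image.

```python
import numpy as np

# Hypothetical 2x3 RGB image with 8-bit pixel values (illustrative data only).
img = np.array(
    [[[0, 0, 0], [128, 128, 128], [255, 255, 255]],
     [[255, 0, 0], [0, 255, 0], [0, 0, 255]]],
    dtype=np.float32,
)

# rescale=1./255 multiplies every pixel by 1/255, mapping [0, 255] to [0, 1].
rescaled = img * (1.0 / 255)

# A horizontal flip reverses the width axis (axis 1 in height x width x channels).
flipped = rescaled[:, ::-1, :]

print(rescaled.min(), rescaled.max())
print(flipped[0, 0])  # the pixel that was at the right edge of the first row
```

The image shape is unchanged by both operations; only the pixel values and their left-to-right order differ.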
Creating Data Generators
Use the data generators to create batches of augmented images for training and validation. Specify
the target size, batch size, class mode, and training/validation subset.
train_generator = validation_datagen.flow_from_directory(
    train_dir,
    target_size=(28, 28),
    batch_size=32,
    class_mode='categorical',
    subset='training'
)
validation_generator = validation_datagen.flow_from_directory(
    train_dir,
    target_size=(28, 28),
    batch_size=32,
    class_mode='categorical',
    subset='validation'
)
Building the Neural Network Model
Create a sequential model with convolutional and pooling layers followed by dense layers. The output
layer has three nodes corresponding to the classes: 'Dog', 'Strawberry', and 'Something Else'.
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(3, activation='softmax'))
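It can help to trace how the 28x28 input shrinks as it passes through this stack. The sketch below is plain Python arithmetic, not Keras code. It assumes the Keras defaults that the model above relies on (no padding and stride 1 for Conv2D, and a pooling stride equal to the pool size); the helper names conv_out and pool_out are made up for this example.

```python
def conv_out(size, kernel=3):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width/height of max pooling with stride equal to the pool size."""
    return size // pool


size, channels = 28, 3   # input_shape=(28, 28, 3)
total_params = 0
trace = []

for filters in (32, 64, 128):
    # Each filter has 3*3*channels weights plus one bias.
    total_params += (3 * 3 * channels + 1) * filters
    size = pool_out(conv_out(size))
    channels = filters
    trace.append((size, size, channels))

flat = size * size * channels            # length of the Flatten output
total_params += (flat + 1) * 128         # Dense(128)
total_params += (128 + 1) * 3            # Dense(3)

print(trace)         # shape after each conv + pool pair
print(flat)          # number of features fed to the first dense layer
print(total_params)  # should agree with the total reported by model.summary()
```

Under these assumptions the feature map is 13x13, then 5x5, then 1x1, so the third pooling layer leaves a single spatial position. A larger target_size would give the later convolutional layers more spatial detail to work with.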
Compiling the Model
Compile the model using the Adam optimizer, categorical cross-entropy loss (suitable for multi-class
classification), and accuracy as the metric.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
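As a rough illustration of what the categorical cross-entropy loss measures, here is a small NumPy sketch for a single one-hot label. The function name, the eps clipping value, and the example probabilities are assumptions for this sketch; this is not the Keras implementation itself.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot labels and predicted probabilities."""
    # Clip to avoid log(0); eps is an arbitrary small constant chosen for this sketch.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)


# Hypothetical example: the true class is index 1, predicted with probability 0.8.
y_true = np.array([[0.0, 1.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1]])

loss = categorical_crossentropy(y_true, y_pred)
print(loss)  # equals -ln(0.8) for this sample
```

Because the label is one-hot, only the probability assigned to the true class contributes, so the loss falls toward zero as that probability approaches 1.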
Training the Model
Train the model using the data generator for a specified number of epochs. The training progress
is stored in the history object.
history = model.fit(train_generator, epochs=30, validation_data=validation_generator)
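The history object exposes a history attribute, a dictionary that maps each metric name to a list with one value per epoch. The sketch below shows one way to read such a dictionary, here to find the epoch with the lowest validation loss. The helper best_epoch and the numbers in example_history are invented for illustration and are not results from this project.

```python
def best_epoch(history_dict, metric="val_loss"):
    """Return (1-based epoch number, value) of the lowest value of `metric`."""
    values = history_dict[metric]
    best_index = min(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]


# Made-up values in the same layout as history.history (not real results).
example_history = {
    "loss": [1.05, 0.71, 0.52, 0.40],
    "val_loss": [0.98, 0.75, 0.69, 0.73],
}

epoch, value = best_epoch(example_history)
print(epoch, value)
```

With a real run, the same call would be best_epoch(history.history). A validation loss that rises while the training loss keeps falling, as in the made-up numbers here, is a common sign of overfitting.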
Using the Trained Model for Prediction
Load an image, convert it to an array, normalize, and use the trained model to predict its class.
Display the predicted class.
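guess_image_path = "[image location]"  # placeholder: replace with the path to your image
img = image.load_img(guess_image_path, target_size=(28, 28))
img_array = image.img_to_array(img)
img_array = np.expand_dims(img_array, axis=0)  # add a batch dimension
img_array /= 255.0  # same rescaling as during training
prediction = model.predict(img_array)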
Printing the Predicted Class
Print the final predicted class based on the model's output and the class names.
class_names = ['Dog', 'Strawberry', 'Something Else']
predicted_class_index = np.argmax(prediction)
predicted_class = class_names[predicted_class_index]
print(f"The model predicts: {predicted_class}")
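One thing to check here: by default, flow_from_directory assigns label indices in alphanumeric order of the class subfolder names, and the resulting mapping is available as train_generator.class_indices. A hard-coded class_names list only works if it follows that same order. The sketch below builds the lookup from such a mapping instead. The folder names and the prediction values are hypothetical, since the actual folder names are not given here.

```python
import numpy as np

# Hypothetical mapping, in the form returned by train_generator.class_indices.
# The folder names are assumptions; use the real attribute in practice.
class_indices = {"dog": 0, "something_else": 1, "strawberry": 2}

# Invert the mapping so that an output index can be turned back into a class name.
index_to_class = {index: name for name, index in class_indices.items()}

# Made-up softmax output for one image (batch of size 1).
prediction = np.array([[0.05, 0.15, 0.80]])

predicted = index_to_class[int(np.argmax(prediction))]
print(f"The model predicts: {predicted}")
```

Deriving the names from the generator keeps the lookup consistent with the labels the model was trained on, even if folders are renamed or added later.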
Data set
The testing dataset comprised two image sets: one containing pictures with a dog (~2500 files) and
the other featuring strawberries (~2100 files).
Target image
The target dataset, consisting of approximately 4500 files, was categorized into 'dog',
'strawberry', and 'something else'. The model achieved a 97% accuracy rate in correctly classifying
the images across multiple runs.
Visualize Training Results
Plot training and validation accuracy and loss to visualize how the model is learning over epochs.
Below is the full code with additional comments embedded.
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing import image
import numpy as np

# Define the path to the dataset folder
train_dir = "[folder location]"  # placeholder: replace with the path to your dataset folder

# Use ImageDataGenerator for data augmentation and preprocessing.
# Note: this single generator feeds both subsets, so validation images are augmented too.
validation_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.2,
    rotation_range=40,            # Rotation
    width_shift_range=0.2,        # Horizontal shift
    height_shift_range=0.2,       # Vertical shift
    brightness_range=[0.8, 1.2],  # Brightness changes
    fill_mode='nearest'           # Fill in missing pixels with the nearest value
)

# Create a data generator for training
train_generator = validation_datagen.flow_from_directory(
    train_dir,
    target_size=(28, 28),  # Adjust the target size
    batch_size=32,
    class_mode='categorical',
    subset='training'      # Use training subset
)

# Create a data generator for validation
validation_generator = validation_datagen.flow_from_directory(
    train_dir,
    target_size=(28, 28),
    batch_size=32,
    class_mode='categorical',
    subset='validation'    # Use validation subset
)

# Build the neural network model
model = models.Sequential()

# Convolutional layers
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

# Flatten the output for dense layers
model.add(layers.Flatten())

# Dense layers
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(3, activation='softmax'))  # Three classes: dog, strawberry, something else

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Print the summary of the model
model.summary()

# Train the model using the data generator
history = model.fit(train_generator, epochs=30, validation_data=validation_generator)

# Visualize training results
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)

plt.figure(figsize=(12, 4))

plt.subplot(1, 2, 1)
plt.plot(epochs, acc, 'b', label='Training accuracy')
plt.plot(epochs, val_acc, 'g', label='Validation accuracy')
plt.title('Training and Validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(epochs, loss, 'r', label='Training loss')
plt.plot(epochs, val_loss, 'm', label='Validation loss')
plt.title('Training and Validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()
plt.show()

# Use the trained model to make predictions on a new image
guess_image_path = "[image location]"  # placeholder: replace with the path to your image
img = image.load_img(guess_image_path, target_size=(28, 28))
img_array = image.img_to_array(img)
img_array = np.expand_dims(img_array, axis=0)  # Add a batch dimension
img_array /= 255.0                             # Same rescaling as during training
prediction = model.predict(img_array)

class_names = ['Dog', 'Strawberry', 'Something Else']
predicted_class_index = np.argmax(prediction)
predicted_class = class_names[predicted_class_index]
print(f"The model predicts: {predicted_class}")