Cat vs. Dog Classifier

My Role

AI Engineer – Computer Vision Pipeline Specialist

  • API Automation: Setting up Kaggle API authentication in the Colab environment
  • Directory Mapping: Designing resilient path-finder logic for nested structures
  • Data Augmentation: Implementing real-time image transformations (zoom, shear, flip)
  • Preprocessing Pipeline: Creating ImageDataGenerator workflows for pixel normalization
  • Dataset Management: Processing thousands of labeled images for neural network training

Key Features & Code Logic

  • Resilient Directory Detection: Dynamic path-finder logic for nested folder structures
  • Real-Time Data Augmentation: Image transformations to prevent overfitting
  • Batch Processing: BATCH_SIZE = 32, so images are loaded in batches of 32 rather than all at once, keeping memory use bounded
  • Binary Classification: Optimized for cat vs. dog distinction with class_mode='binary'
  • Pixel Normalization: Rescaling by 1./255 maps pixel values from 0-255 to the 0-1 range, which generally helps the network converge faster
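
The features above could be wired together roughly as follows. This is a minimal sketch, not the notebook's actual code: only the rescale factor, BATCH_SIZE = 32, and class_mode='binary' come from the list above. The 0.2 shear and zoom ranges, the 150x150 target size, and the names AUGMENTATION_KWARGS and build_train_generator are assumptions made for illustration.

```python
BATCH_SIZE = 32            # from the project description
TARGET_SIZE = (150, 150)   # assumption: the source does not state an image size

# Augmentation settings; the 0.2 ranges are illustrative assumptions.
AUGMENTATION_KWARGS = {
    "rescale": 1.0 / 255,      # map pixel values from 0-255 to 0-1
    "shear_range": 0.2,
    "zoom_range": 0.2,
    "horizontal_flip": True,
}


def build_train_generator(train_dir):
    """Return a Keras iterator yielding augmented (images, labels) batches.

    train_dir must contain one subfolder per class (for example cats/ and dogs/).
    """
    # Imported here so the constants above can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(**AUGMENTATION_KWARGS)
    return datagen.flow_from_directory(
        train_dir,
        target_size=TARGET_SIZE,
        batch_size=BATCH_SIZE,
        class_mode="binary",   # a single 0/1 label per image
    )
```

Augmentation is applied on the fly each time a batch is drawn, so the network rarely sees exactly the same pixels twice. A validation generator would normally keep only the rescale setting.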

This project aims to build a deep learning model for binary image classification. By training on thousands of labeled images, the model is meant to learn the visual features (ears, whiskers, snout shapes) that distinguish a cat from a dog.

The current focus of the notebook is establishing a robust data pipeline: automating the retrieval of images from Kaggle and preparing them for the neural network through preprocessing and augmentation.
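
For the retrieval step, the Kaggle command-line client looks for credentials in ~/.kaggle/kaggle.json and warns when the file is readable by other users. A minimal sketch of that setup, assuming kaggle.json has already been uploaded to the working directory. The function name install_kaggle_credentials and its parameters are hypothetical, and no dataset identifier is shown because the source does not name one.

```python
import os
import shutil
import stat


def install_kaggle_credentials(src="kaggle.json", home=None):
    """Copy kaggle.json into <home>/.kaggle and restrict it to the owner."""
    home = home or os.path.expanduser("~")
    if not os.path.isfile(src):
        raise FileNotFoundError(f"{src} not found; upload it first")

    kaggle_dir = os.path.join(home, ".kaggle")
    os.makedirs(kaggle_dir, exist_ok=True)

    dest = os.path.join(kaggle_dir, "kaggle.json")
    shutil.copyfile(src, dest)
    # Owner read/write only (mode 0o600), so the token is not world-readable.
    os.chmod(dest, stat.S_IRUSR | stat.S_IWUSR)
    return dest
```

Once the credentials are in place, the dataset archive can be fetched with the Kaggle CLI and unpacked before the directory search runs.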

The project implements a comprehensive computer vision pipeline:

  1. Data Ingestion: Automated Kaggle API integration for dataset retrieval
  2. Directory Management: Resilient path detection for nested folder structures
  3. Image Augmentation: Real-time transformations (shear, zoom, flip) for dataset variety
  4. Preprocessing: Pixel normalization and batch processing for neural networks
  5. Feature Learning (planned): The network will learn visual patterns (ears, whiskers, snout shapes) once the CNN layers are defined
  6. Memory Optimization: Batch processing with size 32 for efficient training
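
Step 2 matters because Kaggle archives often unpack into folders nested one or more levels deeper than expected. One way to sketch the path detection is a walk that returns the first directory containing every class folder. The function name find_class_root and the folder names "cats" and "dogs" are assumptions; the actual dataset layout is not given in the source.

```python
import os


def find_class_root(base_dir, class_names=("cats", "dogs")):
    """Return the first directory under base_dir that holds every class folder.

    Matching is case-insensitive, so "Cats" and "cats" are treated alike.
    """
    wanted = {name.lower() for name in class_names}
    for root, dirs, _files in os.walk(base_dir):
        present = {d.lower() for d in dirs}
        if wanted <= present:      # every class folder exists at this level
            return root
    raise FileNotFoundError(
        f"No directory under {base_dir!r} contains all of {sorted(wanted)}"
    )
```

The returned path can be handed directly to a directory-based loader such as flow_from_directory, which expects one subfolder per class.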

Technologies Used

  • TensorFlow & Keras – Deep Learning framework
  • ImageDataGenerator – Real-time image augmentation
  • Kaggle API – Automated dataset download
  • OS Library – Directory traversal and file management
  • Python 3 – Core workflow orchestration
  • Google Colab – Cloud development environment
  • NumPy – Numerical computations for image processing
  • Computer Vision – Image classification techniques

Technical Features

  • Resilient Path Detection: Handles nested directory structures
  • Automated Dataset Retrieval: Kaggle API integration
  • Advanced Augmentation: Shear, zoom, flip transformations
  • Pixel Normalization: Rescaling to 0-1 range for faster training
  • Batch Processing: Memory-efficient training with batches of 32
  • Binary Classification: Optimized for cat vs. dog distinction
  • Overfitting Prevention: Augmentation for better generalization
  • End-to-End Pipeline: Data processing workflow from dataset download to training-ready batches
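
To make the normalization, flip, and batching items concrete, here is what they amount to on raw arrays. This is an illustrative NumPy sketch rather than the notebook's code (which relies on ImageDataGenerator); the helper names are hypothetical.

```python
import math

import numpy as np

BATCH_SIZE = 32  # from the project description


def normalize(image_uint8):
    """Rescale 8-bit pixel values to floats in the 0-1 range."""
    return image_uint8.astype(np.float32) * (1.0 / 255)


def horizontal_flip(image):
    """Mirror a (height, width, channels) image left to right."""
    return image[:, ::-1, :]


def batches_per_epoch(num_images, batch_size=BATCH_SIZE):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)
```

For example, a hypothetical set of 1,000 images at a batch size of 32 takes 32 batches per epoch: 31 full batches plus a final batch of 8.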

Project Status & Next Steps

  • Current Status: Data pipeline established - ready for neural network training
  • Action Required: Manual upload of kaggle.json authentication file
  • Next Phase: Define CNN layers (Convolution, Pooling, Dense layers)
  • Future Development: Model training, evaluation, and deployment
  • Application Potential: Foundation for various computer vision applications