Cat vs. Dog Classifier

My Role

AI Engineer – Computer Vision Pipeline Specialist

API Automation: Engineering Kaggle API handshake in Colab environment
Directory Mapping: Designing resilient path-finder logic for nested structures
Data Augmentation: Implementing real-time image transformations (zoom, shear, flip)
Preprocessing Pipeline: Creating ImageDataGenerator workflows for pixel normalization
Dataset Management: Processing thousands of labeled images for neural network training
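
The Kaggle API handshake listed above usually comes down to placing the uploaded kaggle.json where the Kaggle client looks for it. Below is a minimal sketch, assuming the token file has already been uploaded to the working directory. The helper name install_kaggle_token and its default paths are illustrative and not taken from the notebook.

```python
import os
import shutil
import stat
from pathlib import Path


def install_kaggle_token(token_path="kaggle.json", config_dir=None):
    """Copy an uploaded kaggle.json into the Kaggle config directory.

    Hypothetical helper: by default the Kaggle client reads
    ~/.kaggle/kaggle.json and warns if other users can read the file.
    """
    token = Path(token_path)
    if not token.is_file():
        raise FileNotFoundError(
            f"{token} not found; upload kaggle.json before running this step"
        )

    target_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / "kaggle.json"
    shutil.copyfile(token, target)
    # Owner read/write only (equivalent to chmod 600).
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target
```

Once the token is in place, the dataset is typically fetched with the kaggle command-line tool, for example `kaggle datasets download -d <owner>/<dataset> --unzip`. The dataset slug is not stated here, so it is left as a placeholder.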

Key Features & Code Logic

Resilient Directory Detection: Dynamic path-finder logic for nested folder structures
Real-Time Data Augmentation: Image transformations to prevent overfitting
Batch Processing: Efficient learning with BATCH_SIZE = 32 for memory management
Binary Classification: Optimized for cat vs. dog distinction with class_mode='binary'
Pixel Normalization: Rescaling (1./255) for faster neural network convergence
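
The features above map fairly directly onto Keras's ImageDataGenerator. The sketch below shows one way they could fit together. BATCH_SIZE = 32, class_mode='binary' and the 1./255 rescale come from the list; the 0.2 augmentation magnitudes, the 150x150 target size and the function name make_train_generator are assumptions for illustration.

```python
BATCH_SIZE = 32
IMG_SIZE = (150, 150)  # assumed input resolution; not stated in the source

# Augmentation settings named in the feature list. The 0.2 magnitudes are
# illustrative assumptions, not values from the notebook.
AUGMENTATION = {
    "rescale": 1.0 / 255,     # map pixel values from [0, 255] to [0, 1]
    "shear_range": 0.2,
    "zoom_range": 0.2,
    "horizontal_flip": True,
}


def make_train_generator(train_dir):
    """Return an iterator of (images, labels) batches read from train_dir.

    Expects one subfolder per class, e.g. train_dir/cats and train_dir/dogs.
    """
    # Imported here so the settings above can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(**AUGMENTATION)
    return datagen.flow_from_directory(
        train_dir,
        target_size=IMG_SIZE,
        batch_size=BATCH_SIZE,
        class_mode="binary",  # one 0/1 label per image
    )
```

A validation generator would normally keep only the rescale step, since augmenting validation images distorts the evaluation. Note that recent TensorFlow releases mark ImageDataGenerator as deprecated in favor of Keras preprocessing layers.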

GitHub Repository

This project aims to build a deep learning model for binary image classification. By training on thousands of labeled images, the model learns the visual features (ears, whiskers, snout shapes) that distinguish a cat from a dog.

The notebook currently focuses on establishing a robust data pipeline: automating the retrieval of images from Kaggle and preparing them for the neural network through preprocessing and augmentation.

The project implements a comprehensive computer vision pipeline:

Data Ingestion: Automated Kaggle API integration for dataset retrieval
Directory Management: Resilient path detection for nested folder structures
Image Augmentation: Real-time transformations (shear, zoom, flip) for dataset variety
Preprocessing: Pixel normalization and batch processing for neural networks
Feature Learning: Focus on visual patterns (ears, whiskers, snout shapes)
Memory Optimization: Batch processing with size 32 for efficient training
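
Kaggle archives often unpack into an extra layer or two of folders, which is the situation the resilient path detection step addresses. A minimal sketch of that idea follows, assuming the class folders are named cats and dogs. Both the folder names and the helper find_class_root are hypothetical.

```python
import os


def find_class_root(base_dir, class_names=("cats", "dogs")):
    """Return the first directory under base_dir whose immediate
    subfolders include every name in class_names (case-insensitive).
    """
    wanted = {name.lower() for name in class_names}
    # os.walk visits base_dir itself first, then descends into subfolders.
    for current, subdirs, _files in os.walk(base_dir):
        present = {d.lower() for d in subdirs}
        if wanted <= present:
            return current
    raise FileNotFoundError(
        f"No folder under {base_dir!r} contains subfolders {sorted(wanted)}"
    )
```

If the archive holds several matching folders (for example separate train and validation splits), this returns whichever one the walk reaches first, so a real pipeline would likely filter on the split name as well.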

Technologies Used

TensorFlow & Keras – Deep Learning framework
ImageDataGenerator – Real-time image augmentation
Kaggle API – Automated dataset download
OS Library – Directory traversal and file management
Python 3 – Core workflow orchestration
Google Colab – Cloud development environment
NumPy – Numerical computations for image processing
Computer Vision – Image classification techniques

Technical Features

Resilient Path Detection: Handles nested directory structures
Automated Dataset Retrieval: Kaggle API integration
Advanced Augmentation: Shear, zoom, flip transformations
Pixel Normalization: Rescaling to 0-1 range for faster training
Batch Processing: Memory-efficient training with batches of 32
Binary Classification: Optimized for cat vs. dog distinction
Overfitting Prevention: Augmentation for better generalization
Production Pipeline: End-to-end data processing workflow
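
To make the normalization, flip and batching items concrete, the NumPy sketch below applies them to a small synthetic batch. The array shapes, the random data and the helper steps_per_epoch are illustrative assumptions. In the actual pipeline ImageDataGenerator performs these steps internally.

```python
import math

import numpy as np

BATCH_SIZE = 32

# A synthetic batch of two 4x4 RGB images with 8-bit pixel values.
rng = np.random.default_rng(seed=0)
batch = rng.integers(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)

# Pixel normalization: the same arithmetic as rescale=1./255.
normalized = batch.astype(np.float32) / 255.0

# Horizontal flip: reverse the width axis (axis 2 in batch-height-width-channel layout).
flipped = normalized[:, :, ::-1, :]


def steps_per_epoch(num_images, batch_size=BATCH_SIZE):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)
```

For a hypothetical training set of 2,000 images, steps_per_epoch(2000) gives 63: 62 full batches of 32 plus one partial batch of 16.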

Project Status & Next Steps

Current Status: Data pipeline established - ready for neural network training
Action Required: Manual upload of kaggle.json authentication file
Next Phase: Define CNN layers (Convolution, Pooling, Dense layers)
Future Development: Model training, evaluation, and deployment
Application Potential: Foundation for various computer vision applications
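
For the next phase, a small convolutional network of the kind described (convolution, pooling, dense layers) might look like the sketch below. This is not the project's model, which has not been defined yet. The layer counts, filter sizes, input shape and the name build_cnn are all assumptions chosen for illustration.

```python
def build_cnn(input_shape=(150, 150, 3)):
    """Build and compile a small binary-classification CNN (illustrative only)."""
    # Imported here so this module loads even where TensorFlow is absent.
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Convolution + pooling blocks extract local visual features.
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        # Dense layers combine the features into a single cat/dog score.
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

The single sigmoid output pairs with class_mode='binary' in the data pipeline, where each image carries one 0/1 label.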
