Convolutional Neural Networks (CNNs) have become the standard architecture for image recognition tasks. This section covers five projects that progress from a simple two-class classifier up to a full multi-class agricultural identification system built with an Artificial Neural Network (ANN). Each project demonstrates a distinct dataset, framework choice (TensorFlow/Keras or PyTorch), and classification scenario, giving a practical tour of the computer-vision landscape in the repository.

## Documentation Index
Fetch the complete documentation index at: https://mintlify.com/dronabopche/100-ML-AI-Project/llms.txt
Use this file to discover all available pages before exploring further.
Deep learning training is computationally intensive. A GPU is strongly recommended for all five projects. Install the appropriate CUDA toolkit alongside your framework of choice:
- TensorFlow – `pip install tensorflow` (GPU support via `tensorflow[and-cuda]` on Linux)
- PyTorch – follow the official PyTorch installation guide and select the CUDA version matching your driver
## Project comparison
| Project | Architecture | Classes | Dataset | Accuracy |
|---|---|---|---|---|
| Binary Image Classification (30) | Custom CNN | 2 | Binary image dataset (Kaggle) | — |
| Food Image Classification (31) | Custom CNN / Transfer Learning | Multi-class (food categories) | Food image dataset (Kaggle) | — |
| CIFAR-10 Classification (32) | Custom CNN (PyTorch) | 10 | CIFAR-10 (60 000 images) | — |
| MNIST Digit Classification (33) | CNN / Dense ANN | 10 | MNIST (70 000 grayscale images) | — |
| Date Fruit Classification (12) | ANN (PyTorch) | 7 | UCI Date Fruit dataset (898 rows, 34 features) | — |
## Loading and running a trained model
The pattern below applies to any Keras/TensorFlow model saved in the repository's `Models/` directory. For PyTorch projects, see the PyTorch variant beneath it.
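As a runnable sketch of both patterns (the stand-in models and filenames below are illustrative assumptions — substitute the actual files from `Models/`, which vary per project):

```python
import numpy as np
import torch
import torch.nn as nn
from tensorflow import keras

# --- Keras/TensorFlow pattern ------------------------------------------------
# Stand-in model: in the repository you would skip this step and load an
# existing file from Models/ (exact filenames vary per project).
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),
])
model.save("example_model.keras")

loaded = keras.models.load_model("example_model.keras")
probs = loaded.predict(np.zeros((1, 28, 28, 1)), verbose=0)
print(probs.shape)  # (1, 10)

# --- PyTorch pattern ---------------------------------------------------------
# Save only the weights, then re-create the architecture and load them back.
net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
torch.save(net.state_dict(), "example_model.pt")

net2 = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
net2.load_state_dict(torch.load("example_model.pt"))
net2.eval()  # disable dropout/batch-norm training behaviour before inference
```

Note that the PyTorch `state_dict` route requires re-creating the architecture before loading, whereas a saved `.keras` file carries the architecture with it.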
## 30 – Binary Image Classification
### What the project does

Trains a CNN to distinguish between exactly two image classes (e.g. cats vs. dogs, or any binary domain). This is the entry-level vision project in the repository and provides a clean, well-commented baseline for understanding CNN construction, binary cross-entropy loss, and sigmoid activation at the output layer.

### Algorithm used

A custom CNN with stacked `Conv2D → MaxPooling2D → Dropout` blocks, a `Flatten` layer, and a single-neuron sigmoid output. The architecture is deliberately lightweight to keep training time short on CPU.

### Dataset / domain

A binary image dataset sourced from Kaggle. Images are organised into two class folders under `dataset/train/` and `dataset/test/`, following the Keras `ImageDataGenerator` directory convention.

### Key techniques
- Data augmentation – random horizontal flips, zoom, and rotation via `ImageDataGenerator` to reduce overfitting on small datasets.
- Binary cross-entropy loss – paired with `sigmoid` output activation.
- Callbacks – `EarlyStopping` and `ModelCheckpoint` to save the best epoch.
- Evaluation – accuracy, precision, recall, and a confusion matrix on the test set.
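The techniques above can be sketched as follows. Layer sizes, augmentation parameters, and the checkpoint filename are illustrative assumptions, not the repository's exact values:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Lightweight binary CNN in the style described above.
model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),  # single-neuron sigmoid output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Augmentation: flips, zoom, and rotation. With the real dataset you would call
# datagen.flow_from_directory("dataset/train/", ...) instead of .flow().
datagen = ImageDataGenerator(horizontal_flip=True, zoom_range=0.2, rotation_range=20)
x_batch, y_batch = next(datagen.flow(np.random.rand(8, 64, 64, 3), np.zeros(8), batch_size=8))

# Callbacks: stop early and keep the best epoch's weights on disk.
callbacks = [
    keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True),
    keras.callbacks.ModelCheckpoint("best_model.keras", save_best_only=True),
]
# model.fit(train_generator, validation_data=val_generator, callbacks=callbacks)
```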
### How to run
## 31 – Food Image Classification
### What the project does

Classifies food images into multiple cuisine or dish categories. The project explores both a custom CNN trained from scratch and an optional transfer-learning variant built on a pre-trained backbone (e.g. MobileNetV2 or VGG16), demonstrating how feature reuse from ImageNet accelerates convergence on small domain-specific datasets.

### Algorithm used

A custom CNN (multi-class with `softmax` output) and, optionally, a transfer-learning variant using a frozen pre-trained base model with a custom classification head.

### Dataset / domain

A food image dataset sourced from Kaggle with multiple dish classes. Images are stored in class-labelled subdirectories under `dataset/`.

### Key techniques
- Transfer learning – freeze the convolutional base of a pre-trained model; fine-tune the top layers on food images.
- Categorical cross-entropy loss with `softmax` output.
- Class imbalance handling – the `class_weight` argument in `model.fit()` up-weights underrepresented food categories.
- Top-5 accuracy – an additional metric alongside top-1 accuracy for multi-class evaluation.
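A minimal transfer-learning sketch in the spirit of the list above, assuming a MobileNetV2 backbone. The class count and input size are illustrative, and `weights=None` only keeps the sketch download-free — use `weights="imagenet"` in practice:

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 8  # illustrative – set to the actual number of dish categories

# Frozen pre-trained base plus a custom classification head. In practice pass
# weights="imagenet"; weights=None here avoids the weight download.
base = keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights=None, pooling="avg"
)
base.trainable = False  # freeze the convolutional base

model = keras.Sequential([
    base,
    layers.Dropout(0.2),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy", keras.metrics.TopKCategoricalAccuracy(k=5, name="top5")],
)

# Class-imbalance handling: pass per-class weights when fitting, e.g.
# model.fit(train_ds, class_weight={0: 1.0, 1: 2.5}, ...)
```

Freezing the base means only the small head is trained at first; unfreezing the top few base layers afterwards with a low learning rate is the usual fine-tuning step.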
### How to run
## 32 – CIFAR-10 Image Classification
### What the project does

Implements a CNN on the canonical CIFAR-10 benchmark: 60 000 colour images (32 × 32 px) spread evenly across 10 object classes — airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. This is the only image-pixel project in the section that uses PyTorch rather than TensorFlow/Keras (the Date Fruit project also uses PyTorch, but on tabular features).

### Algorithm used

A custom CNN built with `torch.nn` modules: `Conv2d → BatchNorm2d → ReLU → MaxPool2d` blocks followed by fully connected layers and a 10-class softmax. Optimisation uses `torch.optim` (SGD or Adam).

### Dataset / domain

CIFAR-10 is downloaded automatically via `torchvision.datasets.CIFAR10`. The dataset is ~170 MB and is cached under `dataset/` after the first download.

### Key techniques
- Normalisation – pixel values scaled to `[-1, 1]` using `transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))`.
- DataLoader – batched loading with shuffling for training, deterministic order for evaluation.
- Learning-rate scheduling – `StepLR` or `CosineAnnealingLR` to decay the learning rate during training.
- GPU support – move tensors to `device = torch.device("cuda" if torch.cuda.is_available() else "cpu")`.
### How to run
## 33 – MNIST Digit Classification
### What the project does

Recognises handwritten digits (0–9) from the MNIST dataset, the “Hello World” benchmark of deep learning. Despite its simplicity, this project provides a rigorous demonstration of CNN design, dropout regularisation, and evaluation via a confusion matrix across 10 digit classes.

### Algorithm used

A CNN (or, optionally, a dense ANN as a baseline comparison) with `Conv2D → MaxPooling2D → Dropout` blocks, trained on grayscale 28 × 28 images. The output layer uses softmax over the 10 digit classes.

### Dataset / domain

MNIST – 70 000 grayscale images (60 000 train / 10 000 test), loaded directly via `keras.datasets.mnist` or `torchvision.datasets.MNIST`. No external download is required.

### Key techniques
- Grayscale normalisation – pixel values divided by 255 to map to `[0, 1]`.
- Dropout regularisation – reduces overfitting on the relatively small MNIST images.
- Batch normalisation – accelerates convergence and improves generalisation.
- Confusion matrix – per-class breakdown of correct and misclassified digits.
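All four techniques appear in the following sketch. Random data stands in for `keras.datasets.mnist.load_data()` to keep it offline, and the layer sizes are illustrative:

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic stand-in for keras.datasets.mnist.load_data().
x = np.random.randint(0, 256, (64, 28, 28, 1)).astype("float32") / 255.0  # -> [0, 1]
y = np.random.randint(0, 10, 64)

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, 3, activation="relu"),
    layers.BatchNormalization(),  # faster convergence, better generalisation
    layers.MaxPooling2D(),
    layers.Dropout(0.25),         # regularisation against overfitting
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=1, verbose=0)

# Per-class breakdown of correct and misclassified digits.
preds = model.predict(x, verbose=0).argmax(axis=1)
cm = confusion_matrix(y, preds, labels=list(range(10)))
print(cm.shape)  # (10, 10)
```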
### How to run
## 12 – Date Fruit Classification
### What the project does

Classifies seven varieties of date fruit (BERHI, DOKOL, SAFAVI, ROTANA, DEGLET, SOGAY, IRAQI) from 34 morphological and colour features extracted from fruit images. Unlike the other projects in this section, classification here operates on pre-extracted tabular features (area, perimeter, colour statistics, wavelet coefficients) rather than raw pixel data, making it a bridge between classical ML and deep learning.

### Algorithm used

An Artificial Neural Network (ANN) built in PyTorch: two hidden layers of 64 neurons each with ReLU activation, trained with `CrossEntropyLoss` and the Adam optimiser. Input dimensionality is 34 features; the output is a 7-class probability distribution.

### Dataset / domain

`dataset/datefruit_dataset.csv` – 898 rows × 35 columns (34 numeric features plus a `Class` label). The seven classes are distributed as: DOKOL (204), SAFAVI (199), ROTANA (166), DEGLET (98), SOGAY (94), IRAQI (72), BERHI (65).

### Key techniques
- Label encoding – `sklearn.preprocessing.LabelEncoder` maps string class names to integer indices.
- Train/test split – 80/20 stratified split via `train_test_split`.
- Feature scaling – `StandardScaler` fitted on the training set and applied to both splits.
- TensorDataset + DataLoader – wraps NumPy arrays as PyTorch tensors for mini-batch training (batch size 32).
- Training loop – manual epoch loop with loss logging; validation accuracy evaluated after each epoch.
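The whole pipeline above can be sketched end to end. Random tabular data stands in for the CSV (in the notebook it comes from `pandas.read_csv`), so the logged accuracy is meaningless here — only the structure matters:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for dataset/datefruit_dataset.csv (898 rows x 34 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(898, 34)).astype("float32")
names = rng.choice(["BERHI", "DOKOL", "SAFAVI", "ROTANA", "DEGLET", "SOGAY", "IRAQI"], 898)

y = LabelEncoder().fit_transform(names)  # string class names -> integer indices
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

scaler = StandardScaler().fit(X_tr)  # fit on the training split only
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

train_dl = DataLoader(
    TensorDataset(torch.tensor(X_tr, dtype=torch.float32), torch.tensor(y_tr)),
    batch_size=32, shuffle=True,
)

# ANN as described: 34 -> 64 -> 64 -> 7 with ReLU activations.
model = nn.Sequential(
    nn.Linear(34, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 7),
)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(2):  # manual epoch loop with loss logging
    for xb, yb in train_dl:
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
    with torch.no_grad():  # validation accuracy after each epoch
        acc = (model(torch.tensor(X_te, dtype=torch.float32)).argmax(1).numpy() == y_te).mean()
    print(f"epoch {epoch}: loss {loss.item():.3f}, val acc {acc:.2f}")
```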
### How to run

Open the project notebook and run all cells; it expects the dataset at `dataset/datefruit_dataset.csv` relative to the notebook.