Model architecture
The model implements a sequential architecture with 21 layers organized into convolutional blocks, pooling layers, and fully connected layers.
Input layer
The model expects preprocessed images with specific dimensions to ensure accurate predictions.
- Input shape: [75, 100, 3] (height × width × channels)
- Data type: float32
- Color format: RGB (channels-last format)
- Preprocessing: Images are resized using nearest neighbor interpolation and converted to float tensors
Convolutional blocks
The architecture consists of two main convolutional blocks followed by dense layers.
Block 1: Feature extraction (6 Conv2D layers)
Layers 1-6: Initial feature extraction with 64 filters each
- Filters: 64 per layer
- Kernel size: 3×3
- Strides: (1, 1)
- Padding:
  - Layers 1-5: same (maintains spatial dimensions)
  - Layer 6: valid (reduces dimensions)
- Activation: ReLU (applied via separate Activation layers)
- Kernel initializer: Glorot Uniform (Xavier initialization)
- Bias initializer: Zeros
Block 2: Deep feature extraction (2 Conv2D layers)
Layers 7-8: Deeper feature extraction with increased filter depth
- Filters: 128 per layer
- Kernel size: 3×3
- Strides: (1, 1)
- Padding:
  - Layer 7: same
  - Layer 8: valid
- Activation: ReLU (via separate Activation layers)
- Kernel initializer: Glorot Uniform
- Bias initializer: Zeros
Pooling and regularization layers
MaxPooling2D layers (2 instances):
- Pool size: 2×2
- Strides: (2, 2)
- Position: After each convolutional block
- Purpose: Spatial dimension reduction and translation invariance
Dropout layers (3 instances):
- After Block 1: 25% dropout rate
- After Block 2: 25% dropout rate
- Before output layer: 50% dropout rate
- Purpose: Prevent overfitting during training
Dropout layers are typically disabled during inference, so they don’t affect prediction performance in production.
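As a quick illustration of this behavior, the TensorFlow.js Layers API treats dropout as a pass-through unless a layer is explicitly applied in training mode. The sketch below uses a standalone dropout layer, not the loaded model:

```ts
import * as tf from '@tensorflow/tfjs';

const drop = tf.layers.dropout({rate: 0.5});
const x = tf.ones([1, 4]);

// Inference mode (the default): dropout passes values through unchanged.
(drop.apply(x) as tf.Tensor).print(); // [[1, 1, 1, 1]]

// Training mode: units are randomly zeroed and survivors scaled by 1 / (1 - rate).
(drop.apply(x, {training: true}) as tf.Tensor).print();
```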
Fully connected layers
Flatten layer:
- Converts 3D feature maps to 1D feature vector
- Positioned between convolutional blocks and dense layers
Dense layer (hidden):
- Units: 512 neurons
- Activation: ReLU (via separate Activation layer)
- Purpose: High-level feature combination and representation learning
Dense layer (output):
- Units: 7 neurons (one per class)
- Activation: Softmax (via separate Activation layer)
- Output: Probability distribution across 7 skin lesion types (the full stack is sketched below)
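Taken together, the stack above can be reconstructed with the TensorFlow.js Layers API. The sketch below is assembled from the specifications in this section, not taken from the original Keras code, so the exact layer grouping and ordering are assumptions:

```ts
import * as tf from '@tensorflow/tfjs';

// Approximate reconstruction of the architecture described above.
export function buildModel(): tf.Sequential {
  const model = tf.sequential();

  const addConv = (filters: number, padding: 'same' | 'valid') => {
    model.add(tf.layers.conv2d({
      filters,
      kernelSize: 3,
      strides: [1, 1],
      padding,
      kernelInitializer: 'glorotUniform',
      biasInitializer: 'zeros',
    }));
    model.add(tf.layers.activation({activation: 'relu'})); // separate Activation layer
  };

  // Input layer: [75, 100, 3] images, channels-last.
  model.add(tf.layers.inputLayer({inputShape: [75, 100, 3]}));

  // Block 1: six 64-filter convolutions (layers 1-5 'same', layer 6 'valid').
  for (let i = 0; i < 5; i++) addConv(64, 'same');
  addConv(64, 'valid');
  model.add(tf.layers.maxPooling2d({poolSize: [2, 2], strides: [2, 2]}));
  model.add(tf.layers.dropout({rate: 0.25}));

  // Block 2: two 128-filter convolutions (layer 7 'same', layer 8 'valid').
  addConv(128, 'same');
  addConv(128, 'valid');
  model.add(tf.layers.maxPooling2d({poolSize: [2, 2], strides: [2, 2]}));
  model.add(tf.layers.dropout({rate: 0.25}));

  // Fully connected head: flatten -> 512-unit hidden layer -> 7-way softmax.
  model.add(tf.layers.flatten());
  model.add(tf.layers.dense({units: 512}));
  model.add(tf.layers.activation({activation: 'relu'}));
  model.add(tf.layers.dropout({rate: 0.5}));
  model.add(tf.layers.dense({units: 7}));
  model.add(tf.layers.activation({activation: 'softmax'}));

  return model;
}
```

When verifying a reconstruction like this, comparing `model.summary()` parameter counts against the reported weight size is a useful sanity check.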
Output classes
The model performs 7-class classification for the following skin lesion types:
- Actinic Keratoses
- Basal Cell Carcinoma
- Benign Keratoses
- Dermatofibroma
- Melanoma
- Melanocytic Nevus
- Vascular Lesion
Model specifications
| Property | Value |
|---|---|
| Model format | TensorFlow.js Layers Model |
| Total size | 99.3 MB (25 weight shards) |
| Framework | Keras v2.8.0 / TensorFlow backend |
| Converter | TensorFlow.js Converter v3.19.0 |
Weight distribution
- Format: Binary shards for efficient loading
- Shards: 25 files (group1-shard1of25.bin through group1-shard25of25.bin)
- Shard size: ~4 MB per shard
- Loading: Sequential download managed by TensorFlow.js
Hyperparameters
Key hyperparameters extracted from the model configuration:
| Parameter | Value | Description |
|---|---|---|
| Input dimensions | 75×100×3 | Height, width, and RGB channels |
| Conv filters (early) | 64 | First convolutional block |
| Conv filters (deep) | 128 | Second convolutional block |
| Kernel size | 3×3 | All convolutional layers |
| Pool size | 2×2 | MaxPooling layers |
| Hidden units | 512 | Fully connected layer |
| Dropout rates | 0.25, 0.25, 0.5 | After each block and before output |
| Output units | 7 | Number of diagnostic classes |
Initialization strategies
Glorot Uniform (Xavier) for convolutional and dense layer kernels:
- Draws weights from a uniform distribution over [-limit, limit], where limit = sqrt(6 / (fan_in + fan_out))
- Maintains variance across layers during forward and backward passes
Zeros for bias values:
- All bias values initialized to 0
- Does not bias early training epochs; symmetry between units is broken by the randomly initialized kernels
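As a worked example (values computed from the formula above, not extracted from the model), the first 3×3 convolution has fan_in = 3 · 3 · 3 = 27 and fan_out = 3 · 3 · 64 = 576:

```ts
// Worked example of the Glorot Uniform limit for the first conv layer
// (3x3 kernel, 3 input channels, 64 filters). Hypothetical helper,
// not part of the model code.
function glorotLimit(fanIn: number, fanOut: number): number {
  return Math.sqrt(6 / (fanIn + fanOut));
}

const fanIn = 3 * 3 * 3;   // kernel height * width * input channels = 27
const fanOut = 3 * 3 * 64; // kernel height * width * filters = 576
console.log(glorotLimit(fanIn, fanOut)); // ~= 0.0998
```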
Model loading
The model is loaded asynchronously in the browser using TensorFlow.js:
Configuration files
- model.json: Contains architecture definition and weight manifest
- Weight shards: Binary files containing trained parameters
- Format: Layers model (supports full Keras Sequential API)
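A minimal loading sketch, assuming the files are served from a `model/` directory (the URL is a placeholder, not from the source):

```ts
import * as tf from '@tensorflow/tfjs';

// Load the Layers model: TensorFlow.js fetches model.json first, then
// downloads the 25 weight shards listed in its weight manifest.
async function loadModel(): Promise<tf.LayersModel> {
  return tf.loadLayersModel('model/model.json');
}
```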
Technical considerations
Why VGG-16 architecture?
VGG-16 is well-suited for medical image classification due to:
- Small receptive fields: 3×3 kernels capture fine-grained details in skin lesions
- Deep architecture: Multiple layers learn hierarchical features from textures to patterns
- Proven performance: VGG-style networks excel at visual recognition tasks
- Browser compatibility: Architecture runs efficiently with TensorFlow.js
Data format and preprocessing
Input images undergo standardized preprocessing:
- Resize to 75×100 pixels using nearest neighbor interpolation
- Convert to float32 tensor
- Expand dimensions to create a batch of size 1: [1, 75, 100, 3]
- Feed to model for inference (see the sketch below)
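The steps above map directly onto chained TensorFlow.js tensor ops. A minimal sketch, assuming `model` is an already-loaded `tf.LayersModel` and the image comes from an `HTMLImageElement`:

```ts
import * as tf from '@tensorflow/tfjs';

// Preprocess an image and run inference, following the pipeline above.
function predict(model: tf.LayersModel, img: HTMLImageElement): tf.Tensor {
  return tf.tidy(() => {
    const input = tf.browser.fromPixels(img) // RGB uint8 tensor [h, w, 3]
      .resizeNearestNeighbor([75, 100])      // resize to model input size
      .toFloat()                             // convert to float32
      .expandDims(0);                        // add batch dim -> [1, 75, 100, 3]
    return model.predict(input) as tf.Tensor;
  });
}
```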
Activation function choices
The model uses two activation functions strategically:
- ReLU (Rectified Linear Unit): Applied after convolutional and hidden dense layers
  - Introduces non-linearity: f(x) = max(0, x)
  - Prevents vanishing gradients in deep networks
  - Computationally efficient
- Softmax: Applied to output layer
  - Converts logits to a probability distribution
  - Ensures outputs sum to 1.0
  - Enables confidence-based predictions (see the sketch below)
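To turn the softmax output into a diagnosis, the highest-probability index can be mapped back to the label list in the "Output classes" section. A minimal sketch; the `CLASSES` ordering is an assumption and must match the label encoding used during training:

```ts
import * as tf from '@tensorflow/tfjs';

// Class labels from the "Output classes" section. Index order is assumed.
const CLASSES = [
  'Actinic Keratoses',
  'Basal Cell Carcinoma',
  'Benign Keratoses',
  'Dermatofibroma',
  'Melanoma',
  'Melanocytic Nevus',
  'Vascular Lesion',
];

// Report the most likely class and its softmax confidence.
async function interpret(prediction: tf.Tensor): Promise<void> {
  const probs = await prediction.data(); // softmax probabilities, sum to 1.0
  let top = 0;
  for (let i = 1; i < probs.length; i++) {
    if (probs[i] > probs[top]) top = i;
  }
  console.log(`${CLASSES[top]} (${(probs[top] * 100).toFixed(1)}% confidence)`);
}
```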