The skin cancer detection model is built on a VGG-16 inspired Convolutional Neural Network (CNN) architecture, trained using Keras v2.8.0 and converted to TensorFlow.js format v3.19.0 for browser-based inference.

Model architecture

The model implements a sequential architecture with 21 layers organized into convolutional blocks, pooling layers, and fully connected layers.

Input layer

The model expects preprocessed images with specific dimensions to ensure accurate predictions.
  • Input shape: [75, 100, 3] (height × width × channels)
  • Data type: float32
  • Color format: RGB channels (channels-last format)
  • Preprocessing: Images are resized using nearest neighbor interpolation and converted to float tensors
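Nearest neighbor resizing simply copies, for each output pixel, the closest source pixel. The helper below is a hypothetical sketch of that index mapping for illustration only; the actual resize is performed by TensorFlow.js, whose rounding details may differ slightly:

```javascript
// Illustrative sketch: which source row/column each output pixel samples
// under nearest-neighbor resizing (hypothetical helper, not a tf.js API).
function nearestNeighborIndices(srcSize, dstSize) {
  const indices = [];
  for (let i = 0; i < dstSize; i++) {
    // Each output pixel copies the nearest source pixel.
    indices.push(Math.min(srcSize - 1, Math.floor((i * srcSize) / dstSize)));
  }
  return indices;
}

// Mapping a 600-pixel-tall photo down to the model's 75-pixel input height:
const rowMap = nearestNeighborIndices(600, 75);
```

Because no pixel values are averaged, the result can look blocky, but the mapping is cheap and deterministic.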

Convolutional blocks

The architecture consists of two main convolutional blocks followed by dense layers.
Layers 1-6: Initial feature extraction with 64 filters each
  • Filters: 64 per layer
  • Kernel size: 3×3
  • Strides: (1, 1)
  • Padding:
    • Layers 1-5: same (maintains spatial dimensions)
    • Layer 6: valid (reduces dimensions)
  • Activation: ReLU (applied via separate Activation layers)
  • Kernel initializer: Glorot Uniform (Xavier initialization)
  • Bias initializer: Zeros
Each convolutional layer is followed by a dedicated ReLU activation layer for non-linear transformation.
Layers 7-8: Deeper feature extraction with increased filter depth
  • Filters: 128 per layer
  • Kernel size: 3×3
  • Strides: (1, 1)
  • Padding:
    • Layer 7: same
    • Layer 8: valid
  • Activation: ReLU (via separate Activation layers)
  • Kernel initializer: Glorot Uniform
  • Bias initializer: Zeros
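The effect of the two padding modes can be sketched with the standard output-size formula (a hypothetical helper, shown here for stride 1 and 3×3 kernels as used throughout this model):

```javascript
// Output spatial size of a conv layer under the two padding modes used here.
function convOutputSize(inputSize, kernelSize, stride, padding) {
  if (padding === 'same') {
    // Zero-padding keeps the output the same size (for stride 1).
    return Math.ceil(inputSize / stride);
  }
  // 'valid': no padding; the kernel must fit entirely inside the input.
  return Math.floor((inputSize - kernelSize) / stride) + 1;
}

// A 'same' layer preserves the 75-pixel input height;
// a 'valid' layer with a 3x3 kernel trims one pixel from each side:
convOutputSize(75, 3, 1, 'same');  // 75
convOutputSize(75, 3, 1, 'valid'); // 73
```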

Pooling and regularization layers

MaxPooling2D layers (2 instances):
  • Pool size: 2×2
  • Strides: (2, 2)
  • Position: After each convolutional block
  • Purpose: Spatial dimension reduction and translation invariance
Dropout layers (3 instances):
  • After Block 1: 25% dropout rate
  • After Block 2: 25% dropout rate
  • Before output layer: 50% dropout rate
  • Purpose: Prevent overfitting during training
Dropout layers are typically disabled during inference, so they don’t affect prediction performance in production.
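Assuming exactly the layer ordering described above (five 'same' convs, one 'valid' conv, and 2×2 pooling in block 1; one 'same' and one 'valid' conv plus pooling in block 2), the feature-map size can be traced with a short sketch. This is illustrative only, not the model itself:

```javascript
// Trace the feature-map spatial shape through the network, assuming the
// layer ordering described above. 3x3 kernels, stride 1; 'same' padding
// leaves the size unchanged, 'valid' trims 2 pixels in each dimension.
function conv(size, padding) {
  return padding === 'same' ? size : size - 2;
}
function pool(size) {
  // 2x2 max pooling with stride 2
  return Math.floor(size / 2);
}

let [h, w] = [75, 100];
// Block 1: the five 'same' convs are no-ops on shape; one 'valid' conv, then pooling.
[h, w] = [conv(h, 'valid'), conv(w, 'valid')]; // 73 x 98
[h, w] = [pool(h), pool(w)];                   // 36 x 49
// Block 2: one 'same' conv (no-op on shape), one 'valid' conv, then pooling.
[h, w] = [conv(h, 'valid'), conv(w, 'valid')]; // 34 x 47
[h, w] = [pool(h), pool(w)];                   // 17 x 23
```

Under these assumptions the final feature maps entering the Flatten layer are 17×23 with 128 channels.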

Fully connected layers

Flatten layer:
  • Converts 3D feature maps to 1D feature vector
  • Positioned between convolutional blocks and dense layers
Dense layer 1 (Hidden layer):
  • Units: 512 neurons
  • Activation: ReLU (via separate Activation layer)
  • Purpose: High-level feature combination and representation learning
Dense layer 2 (Output layer):
  • Units: 7 neurons (one per class)
  • Activation: Softmax (via separate Activation layer)
  • Output: Probability distribution across 7 skin lesion types
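A dense layer's parameter count follows directly from its stated sizes: one weight per input-output pair plus one bias per unit. For example, the 7-class output layer sitting on top of the 512-unit hidden layer:

```javascript
// Parameter count of a fully connected layer:
// (inputs x units) weights + (units) biases.
function denseParams(inputs, units) {
  return inputs * units + units;
}

// Output layer: 512 inputs, 7 units.
denseParams(512, 7); // 3591 parameters
```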

Output classes

The model performs 7-class classification for the following skin lesion types:
  1. Actinic Keratoses
  2. Basal Cell Carcinoma
  3. Benign Keratoses
  4. Dermatofibroma
  5. Melanoma
  6. Melanocytic Nevus
  7. Vascular Lesion
Each prediction returns a probability distribution across all seven classes, with the highest probability indicating the most likely diagnosis.
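Turning the probability vector into a diagnosis is a simple argmax over the seven outputs. The label order below is an assumption taken from the list above; verify it against the class indices used in the training pipeline:

```javascript
// Assumed label order -- confirm against the training pipeline's class indices.
const CLASS_NAMES = [
  'Actinic Keratoses', 'Basal Cell Carcinoma', 'Benign Keratoses',
  'Dermatofibroma', 'Melanoma', 'Melanocytic Nevus', 'Vascular Lesion',
];

// Return the most likely class and its confidence from a softmax output.
function topPrediction(probabilities) {
  let best = 0;
  for (let i = 1; i < probabilities.length; i++) {
    if (probabilities[i] > probabilities[best]) best = i;
  }
  return { label: CLASS_NAMES[best], probability: probabilities[best] };
}
```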

Model specifications

  • Model format: TensorFlow.js Layers Model
  • Total size: 99.3 MB (25 weight shards)
  • Framework: Keras v2.8.0 / TensorFlow backend
  • Converter: TensorFlow.js Converter v3.19.0

Weight distribution

  • Format: Binary shards for efficient loading
  • Shards: 25 files (group1-shard1of25.bin through group1-shard25of25.bin)
  • Shard size: ~4 MB per shard
  • Loading: Sequential download managed by TensorFlow.js
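The shard filenames follow the fixed pattern listed above, so the full set can be enumerated from the shard count. This sketch is illustrative only; in practice tf.loadLayersModel reads the weight manifest in model.json and fetches the shards automatically:

```javascript
// Enumerate shard filenames following the naming pattern above
// (illustrative; tf.loadLayersModel handles this via the weight manifest).
function shardNames(total) {
  const names = [];
  for (let i = 1; i <= total; i++) {
    names.push(`group1-shard${i}of${total}.bin`);
  }
  return names;
}

shardNames(25)[0];  // 'group1-shard1of25.bin'
shardNames(25)[24]; // 'group1-shard25of25.bin'
```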

Hyperparameters

Key hyperparameters extracted from the model configuration:
Parameter               Value             Description
Input dimensions        75×100×3          Height, width, and RGB channels
Conv filters (early)    64                First convolutional block
Conv filters (deep)     128               Second convolutional block
Kernel size             3×3               All convolutional layers
Pool size               2×2               MaxPooling layers
Hidden units            512               Fully connected layer
Dropout rates           0.25, 0.25, 0.5   After each block and before output
Output units            7                 Number of diagnostic classes

Initialization strategies

Glorot Uniform (Xavier) for convolutional and dense layer kernels:
  • Draws weights from uniform distribution: [-limit, limit]
  • limit = sqrt(6 / (fan_in + fan_out))
  • Maintains variance across layers during forward and backward passes
Zeros initialization for all bias terms:
  • All bias values initialized to 0
  • Safe in practice because the randomly initialized kernels already break the symmetry between neurons
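The Glorot limit can be computed directly from the formula above. For a conv layer, fan_in and fan_out are kernel height × kernel width × channels on the respective side; the first layer's values below follow from the 3×3 kernels, 3 input channels, and 64 filters stated earlier:

```javascript
// Glorot (Xavier) uniform bound: weights are drawn from [-limit, limit].
function glorotLimit(fanIn, fanOut) {
  return Math.sqrt(6 / (fanIn + fanOut));
}

// First conv layer: 3x3 kernel, 3 input channels, 64 output filters.
const fanIn = 3 * 3 * 3;   // 27
const fanOut = 3 * 3 * 64; // 576
const limit = glorotLimit(fanIn, fanOut); // ~0.0997
```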

Model loading

The model is loaded asynchronously in the browser using TensorFlow.js:
// Module-level handle; populated once loading completes.
let model = null;

async function loadModel() {
    console.log("Loading Model");
    model = await tf.loadLayersModel('cnn_model/model.json');
    console.log("Loaded Model");
}
The model must be fully loaded before making predictions; always await loadModel() before calling model.predict().

Configuration files

  • model.json: Contains architecture definition and weight manifest
  • Weight shards: Binary files containing trained parameters
  • Format: Layers model (supports full Keras Sequential API)

Technical considerations

The VGG-style design is well-suited for medical image classification:
  • Small receptive fields: 3×3 kernels capture fine-grained details in skin lesions
  • Deep architecture: Multiple layers learn hierarchical features from textures to patterns
  • Proven performance: VGG-style networks excel at visual recognition tasks
  • Browser compatibility: Architecture runs efficiently with TensorFlow.js
Input images undergo standardized preprocessing:
  1. Resize to 75×100 pixels using nearest neighbor interpolation
  2. Convert to float32 tensor
  3. Expand dimensions to create batch of size 1: [1, 75, 100, 3]
  4. Feed to model for inference
The preprocessing is handled by TensorFlow.js browser utilities:
// imgtag is the HTMLImageElement holding the lesion photo
let tensorImg = tf.browser.fromPixels(imgtag)
                .resizeNearestNeighbor([75, 100])
                .toFloat()
                .expandDims(); // shape: [1, 75, 100, 3]
The model uses two activation functions strategically:
  • ReLU (Rectified Linear Unit): Applied after convolutional and hidden dense layers
    • Introduces non-linearity: f(x) = max(0, x)
    • Prevents vanishing gradients in deep networks
    • Computationally efficient
  • Softmax: Applied to output layer
    • Converts logits to probability distribution
    • Ensures outputs sum to 1.0
    • Enables confidence-based predictions
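Both activations are simple to express in plain JavaScript (in the model itself, TensorFlow.js applies them as dedicated layers):

```javascript
// ReLU: zero out negative values.
function relu(x) {
  return Math.max(0, x);
}

// Softmax: exponentiate and normalize so the outputs sum to 1.
function softmax(logits) {
  // Subtract the max logit for numerical stability before exponentiating.
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

const probs = softmax([2.0, 1.0, 0.1, -1.0, 3.0, 0.5, 0.0]);
// probs sums to 1.0, and the largest logit (index 4) gets the highest probability
```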
