Detect Kermit Quickstart: install, load model, and predict

The fastest path to a working prediction is to use the pre-trained model that ships with the repository. Because the weights file (kermit_finder.h5) is stored via Git LFS, no retraining is necessary — clone the repo, install the dependencies, and you are one command away from classifying any image or video as kermit or no-kermit. This guide walks through exactly that path.

Prerequisites

Make sure you have the following before you begin:

Python 3.5 or 3.6: the evaluation script uses asyncio and type annotations that require Python 3.5+, and the pinned TensorFlow 1.12.0 release predates Python 3.7 support.
pip: used to install all Python dependencies.
Git LFS: required to download the pre-trained .h5 model that is tracked as a large file in the repository.

Install Git LFS for your platform:

macOS (Homebrew)

brew install git-lfs

Ubuntu / Debian

sudo apt-get install git-lfs

After installation, initialise Git LFS for your user account (only needed once):

git lfs install

Then clone the repository so that LFS objects are downloaded automatically:

git clone https://github.com/ilirosmanaj/detect_kermit.git
cd detect_kermit

Install dependencies

Install all required Python packages from the pinned requirements.txt:

pip install -r requirements.txt

The file pins the following versions for reproducibility; only google_images_download is left unpinned:

Package	Version / Source
`tensorflow`	`1.12.0`
`tensorflow-gpu`	`1.12.0`
`keras`	`2.2.4`
`opencv-python`	`4.0.0.21`
`imageai`	`2.0.2` (GitHub wheel URL)
`pillow`	`5.4.1`
`scipy`	`1.2.0`
`matplotlib`	`3.0.2`
`h5py`	`2.9.0`
`pandas`	`0.23.4`
`google_images_download`	latest

ImageAI 2.0.2 is installed directly from GitHub because it is not published to PyPI:

https://github.com/OlafenwaMoses/ImageAI/releases/download/2.0.2/imageai-2.0.2-py3-none-any.whl

On macOS, the standard pip distribution of TensorFlow does not work. Install it from the official Google storage URL instead:

python3 -m pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.12.0-py3-none-any.whl

Run this command before pip install -r requirements.txt, or replace the tensorflow==1.12.0 line in requirements.txt with the URL above.

Verify the pre-trained model exists

After cloning with Git LFS enabled, confirm the two files the inference script depends on are present:

ls -lh data/images/models/kermit_finder.h5
ls -lh data/images/json/model_class.json

data/images/models/kermit_finder.h5 — the serialised ResNet weights produced by imageai_build_model.py. This file is tracked by Git LFS and should be several hundred megabytes.
data/images/json/model_class.json — the class index map used at inference time. Its contents are:

{
    "0": "kermit",
    "1": "no-kermit"
}
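
To illustrate how an index map like this is typically consumed, here is a minimal sketch that loads the JSON and pairs each class index with a score. The `label_scores` helper and the score values are hypothetical and exist only for illustration; the real lookup presumably happens inside ImageAI and may differ.

```python
import json

# Same contents as data/images/json/model_class.json, inlined so the sketch runs anywhere.
CLASS_MAP_JSON = '{"0": "kermit", "1": "no-kermit"}'


def label_scores(class_map, scores):
    """Pair each class label with the score at its index (hypothetical helper)."""
    return {class_map[str(index)]: score for index, score in enumerate(scores)}


class_map = json.loads(CLASS_MAP_JSON)
# Made-up scores for illustration, not real model output.
scores = label_scores(class_map, [99.87, 0.13])
print(scores)
```

Note that JSON object keys are always strings, which is why the index is converted with `str(index)` before the lookup.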

If kermit_finder.h5 is only a few hundred bytes, Git LFS did not download the real file. Run git lfs pull to fetch it:

git lfs pull
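
If you prefer a programmatic check over eyeballing the file size, the sketch below tests whether a file is still a Git LFS pointer. It relies on the fact that pointer files are small text files whose first line starts with `version https://git-lfs.github.com/spec/v1`. The `is_lfs_pointer` helper is a hypothetical name, and the demonstration runs against a temporary stand-in file rather than the real weights.

```python
import os
import tempfile

# First line of a Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts like an un-fetched Git LFS pointer."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


# Demonstration on a temporary stand-in file. In a real checkout, pass
# data/images/models/kermit_finder.h5 instead.
with tempfile.NamedTemporaryFile(delete=False, suffix=".h5") as tmp:
    tmp.write(LFS_POINTER_PREFIX + b"\n")
    demo_path = tmp.name

demo_is_pointer = is_lfs_pointer(demo_path)
os.remove(demo_path)

print("pointer file, run git lfs pull" if demo_is_pointer else "real weights present")
```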

Run a prediction on an image

Use kermit_model_evaluation.py with -t image to classify a single image. Pass the file path with -f:

python kermit_model_evaluation.py -t image -f kermit.jpeg

Expected output:

Predicting the kermit.jpeg image
 kermit: 99.87 no-kermit: 0.13

The script prints the probability for each class. A kermit score near 100 indicates that the model is highly confident Kermit the Frog appears in the image.

You can also classify multiple images in a single run by passing a comma-separated list of file paths:

python kermit_model_evaluation.py -t image -f kermit.jpeg,frog.jpeg,scene.jpeg

Each image is predicted independently and its result is printed in turn.
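
As a rough illustration of how a comma-separated `-f` value can be turned into a list of paths, here is a minimal argparse sketch. It is not the parser from kermit_model_evaluation.py: the `parse_args` function, the `dest` names, and the whitespace handling are assumptions made for this example.

```python
import argparse


def parse_args(argv):
    """Hypothetical parser mirroring the -t / -f flags used in this guide."""
    parser = argparse.ArgumentParser(description="Sketch of a prediction CLI")
    parser.add_argument("-t", dest="input_type", choices=["image", "video"], required=True)
    parser.add_argument("-f", dest="files", required=True)
    args = parser.parse_args(argv)
    # Split on commas, trimming whitespace and dropping empty entries.
    paths = [part.strip() for part in args.files.split(",") if part.strip()]
    return args.input_type, paths


input_type, paths = parse_args(["-t", "image", "-f", "kermit.jpeg,frog.jpeg,scene.jpeg"])
for path in paths:
    # Each path would be handed to the model independently at this point.
    print("Predicting the {} image".format(path))
```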

If TensorFlow throws an error about no available GPU devices or a CUDA_VISIBLE_DEVICES conflict, force CPU-only execution by exporting:

export CUDA_VISIBLE_DEVICES=''

Run this in the same shell session before executing the prediction script.
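
The same setting can also be applied from Python instead of the shell. The sketch below assumes you control the entry point of your own script; the variable generally needs to be set before TensorFlow is imported, because it is read when CUDA is initialised.

```python
import os

# An empty value hides all CUDA devices from the process.
# Set it before TensorFlow is imported so it takes effect.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# import tensorflow as tf  # import TensorFlow only after the variable is set

print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))
```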

Run a prediction on a video

To classify one frame per second of a video file, use -t video:

python kermit_model_evaluation.py -t video -f MuppetsEpisode3.avi

The script performs the following steps automatically:

Frame extraction — OpenCV seeks to each 1-second mark (CAP_PROP_POS_MSEC) and captures one frame per second for the full duration of the video.
Async batch inference — each extracted frame is submitted as an asyncio coroutine so all frames are predicted concurrently rather than sequentially.
Annotation — once all predictions are collected, the confidence scores for both classes are rendered as a red text banner at the top of each frame image using cv2.putText.
Output — annotated frames are saved as JPEGs in the episode3_results/ directory, named sequentially (ep3_frame0.jpg, ep3_frame1.jpg, …).
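
The steps above can be sketched with the standard library alone. This is not the script's actual implementation: `fake_predict` is a stand-in for the real model call, the timestamps are assumed to start at 0 ms, and the sketch uses `run_in_executor` (a swapped-in technique) to keep the blocking call off the event loop. A real Keras model may need extra care when called from worker threads.

```python
import asyncio


def frame_timestamps_ms(duration_seconds):
    """Millisecond positions of the 1-second marks, one per whole second."""
    return [second * 1000 for second in range(int(duration_seconds))]


def fake_predict(timestamp_ms):
    """Stand-in for the blocking model call; returns hypothetical scores."""
    return {"kermit": 50.0, "no-kermit": 50.0}


async def predict_frame(loop, index, timestamp_ms):
    # Run the blocking call in the default thread pool so the loop stays responsive.
    scores = await loop.run_in_executor(None, fake_predict, timestamp_ms)
    return "ep3_frame{}.jpg".format(index), scores


async def predict_all(loop, timestamps):
    tasks = [predict_frame(loop, i, ts) for i, ts in enumerate(timestamps)]
    # gather returns results in the same order as the submitted tasks.
    return await asyncio.gather(*tasks)


loop = asyncio.new_event_loop()
try:
    results = loop.run_until_complete(predict_all(loop, frame_timestamps_ms(3.4)))
finally:
    loop.close()

for name, scores in results:
    print(name, scores)
```

In the real pipeline, each timestamp would be passed to OpenCV via CAP_PROP_POS_MSEC to grab the frame, and the returned scores would be drawn onto that frame before it is saved under the generated file name.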

You can flip through the output directory to review which seconds of the episode contain Kermit, or stitch the annotated frames back into a video for a quick visual audit.

Next Steps

Prediction Guide

Deep-dive into image and video prediction: batch processing, understanding confidence scores, and interpreting edge cases like Kermit-lookalike characters.

Training Guide

Learn how to build your own labelled dataset, configure training epochs and batch size, and train the ResNet model from scratch with ImageAI.
