kermit_model_evaluation.py loads the trained ResNet model and runs inference via ImageAI’s CustomImagePrediction class. It accepts a single image, a comma-separated list of images, or a video file as input, and returns class probabilities for both kermit and no-kermit for each image or frame. For video input, the script also burns the prediction text onto each extracted frame and saves the annotated images to disk.
How the evaluation script works
On startup, the script:
- Instantiates a CustomImagePrediction model and sets it to ResNet architecture.
- Loads the trained weights from data/images/models/kermit_finder.h5.
- Loads the class label map from data/images/json/model_class.json.
- Accepts the -t flag (image or video) to select the input type.
- Accepts the -f flag with the path (or comma-separated paths) to the target file(s).
- Returns probabilities for both kermit and no-kermit classes for every input.
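The startup steps above can be sketched roughly as follows. This is a minimal illustration, not the script's actual source: the function names load_model and build_parser are hypothetical, the import path assumes an ImageAI 2.x release that ships CustomImagePrediction, and marking both flags as required is an assumption.

```python
import argparse

# Paths taken from the startup description above.
WEIGHTS_PATH = "data/images/models/kermit_finder.h5"
CLASS_MAP_PATH = "data/images/json/model_class.json"


def load_model():
    """Build a ResNet CustomImagePrediction model (hypothetical helper)."""
    # Imported lazily so the argument parser can be tried without ImageAI.
    from imageai.Prediction.Custom import CustomImagePrediction

    model = CustomImagePrediction()
    model.setModelTypeAsResNet()
    model.setModelPath(WEIGHTS_PATH)
    model.setJsonPath(CLASS_MAP_PATH)
    model.loadModel(num_objects=2)  # two classes: kermit and no-kermit
    return model


def build_parser():
    """Declare the -t and -f flags (required=True is an assumption)."""
    parser = argparse.ArgumentParser(description="Evaluate the Kermit model")
    parser.add_argument("-t", "--file_type", choices=["image", "video"],
                        required=True, help="type of input to evaluate")
    parser.add_argument("-f", "--files", required=True,
                        help="path, or comma-separated paths, to the input")
    return parser


# Parse an explicit argument list so the sketch runs without a real CLI call.
args = build_parser().parse_args(["-t", "image", "-f", "path/to/image.jpg"])
print(args.file_type, args.files)
```

Keeping the ImageAI import inside load_model means the flag handling can be exercised on a machine that has no model weights or ImageAI installation.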
Predicting a single image
Pass -t image and the path to your image file:
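As a hedged illustration, the call can also be assembled and launched from Python. The helper names build_command and run_evaluation are hypothetical, the image path is a placeholder, and the sketch assumes kermit_model_evaluation.py sits in the current working directory.

```python
import subprocess
import sys


def build_command(file_type, files):
    """Return the argument list for one run of the evaluation script."""
    return [sys.executable, "kermit_model_evaluation.py",
            "-t", file_type, "-f", files]


def run_evaluation(file_type, files):
    """Launch the script; needs the trained model and ImageAI to be present."""
    return subprocess.run(build_command(file_type, files), check=True)


# Show the arguments that would be passed for a single image.
command = build_command("image", "path/to/image.jpg")  # placeholder path
print(" ".join(command[1:]))
```

Only build_command runs in this sketch; run_evaluation is left uncalled because it depends on the trained weights being on disk.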
Predicting multiple images
Supply a comma-separated list of file paths to the -f flag, with no spaces around the commas:
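The no-spaces rule is easier to see with a small sketch of how such a value might be split. The helper split_paths is hypothetical and assumes the script splits the -f value on commas without trimming whitespace; the file names are placeholders.

```python
def split_paths(files_arg):
    """Split the -f value on commas, dropping empty entries."""
    return [path for path in files_arg.split(",") if path]


# Well-formed value: three clean paths.
print(split_paths("img/a.jpg,img/b.jpg,img/c.jpg"))

# A space after the comma survives the split and ends up inside the path.
print(split_paths("img/a.jpg, img/b.jpg"))
```

An unquoted space has a second effect on the command line: the shell ends the -f value at the space, so everything after it arrives as a separate argument.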
Predicting a video
Pass -t video and the path to an .avi file:
- Frame extraction: OpenCV reads one frame per second (at 1,000 ms intervals) for the entire duration of the video.
- Frame storage: Each extracted frame is saved as a JPEG to episode3_results/ep3_frameN.jpg, where N is the frame index.
- Async batch prediction: All frames are dispatched concurrently using Python’s asyncio event loop via a gather_dict utility, making full use of available compute without waiting for one frame to finish before starting the next.
- Annotation: After all predictions return, OpenCV writes a text banner onto each saved frame image showing the kermit and no-kermit probabilities (e.g. kermit 99.87% no-kermit 0.13%), and overwrites the JPEG on disk.
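The four stages above could look roughly like the sketch below. It is an approximation under stated assumptions, not the project's code: the bodies of extract_frames, gather_dict, and annotate_frame are guesses at the described behaviour, the frame index is assumed to start at 0, and the font, position, and colour of the banner are arbitrary choices.

```python
import asyncio
import os

FRAME_INTERVAL_MS = 1000         # one frame per second, as described above
OUTPUT_DIR = "episode3_results"  # flat output directory named above


def extract_frames(video_path):
    """Save one JPEG per second of video and return the saved paths."""
    import cv2  # imported lazily so the asyncio part runs without OpenCV

    os.makedirs(OUTPUT_DIR, exist_ok=True)
    capture = cv2.VideoCapture(video_path)
    saved, index = [], 0
    while True:
        # Seek to the next one-second mark before reading a frame.
        capture.set(cv2.CAP_PROP_POS_MSEC, index * FRAME_INTERVAL_MS)
        ok, frame = capture.read()
        if not ok:
            break
        path = os.path.join(OUTPUT_DIR, f"ep3_frame{index}.jpg")
        cv2.imwrite(path, frame)
        saved.append(path)
        index += 1
    capture.release()
    return saved


async def gather_dict(tasks):
    """Await every awaitable in a {key: awaitable} dict, keeping the keys."""
    results = await asyncio.gather(*tasks.values())
    return dict(zip(tasks.keys(), results))


def annotate_frame(path, prediction):
    """Draw the prediction text on a saved frame and overwrite the file."""
    import cv2

    banner = " ".join(f"{name} {value}" for name, value in prediction.items())
    image = cv2.imread(path)
    cv2.putText(image, banner, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                0.8, (255, 255, 255), 2)
    cv2.imwrite(path, image)


async def fake_predict(path):
    """Stand-in for the real prediction coroutine (illustrative values)."""
    await asyncio.sleep(0)
    return {"kermit": "99.87%", "no-kermit": "0.13%"}


# Demonstrate the keyed gather with placeholder frame paths.
frames = ["ep3_frame0.jpg", "ep3_frame1.jpg"]
predictions = asyncio.run(gather_dict({f: fake_predict(f) for f in frames}))
print(predictions)
```

Keying the results by frame path means the annotation step can look up the prediction for each file directly instead of relying on list order.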
The predict_image async function
The core prediction primitive is an async function that wraps ImageAI’s synchronous predictImage call:
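A minimal sketch of such a wrapper is shown here. The parameter order, the result_count value, and the exact string formatting are assumptions; StubModel is a hypothetical stand-in that mimics the two-list return shape of predictImage so the example runs without ImageAI.

```python
import asyncio


async def predict_image(model, image_path):
    """Classify one image and return {class_name: two-decimal string}."""
    # predictImage is synchronous; result_count=2 requests both classes.
    names, probabilities = model.predictImage(image_path, result_count=2)
    return {name: f"{float(prob):.2f}"
            for name, prob in zip(names, probabilities)}


class StubModel:
    """Hypothetical stand-in for a loaded CustomImagePrediction model."""

    def predictImage(self, image_path, result_count=2):
        return ["kermit", "no-kermit"], [99.8712, 0.1288]


result = asyncio.run(predict_image(StubModel(), "path/to/image.jpg"))
print(result)
```

Because predictImage blocks, an async function that calls it directly still handles one image at a time on the event loop. Offloading the call, for example with asyncio.to_thread on Python 3.9 or later, is one way to let several predictions overlap; whether the project does so is not stated here.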
The function operates on a CustomImagePrediction model instance. The return value is a dict mapping each class name to its probability formatted as a two-decimal percentage string, for example:
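The shape described above might look like the dict below. The numbers are illustrative only, and whether the strings carry a trailing percent sign is not known, so the hypothetical helpers to_floats and top_class accept both forms.

```python
# Illustrative return value; real numbers depend on the image.
example = {"kermit": "99.87", "no-kermit": "0.13"}


def to_floats(prediction):
    """Convert percentage strings such as '99.87' or '99.87%' to floats."""
    return {name: float(text.rstrip("%")) for name, text in prediction.items()}


def top_class(prediction):
    """Return the class name with the highest probability."""
    scores = to_floats(prediction)
    return max(scores, key=scores.get)


print(to_floats(example))
print(top_class(example))
```

Converting the strings back to floats is only needed when the values are compared or thresholded; for display they can be used as returned.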
Known limitations
- Kermit-like false positives: The model may occasionally misclassify visually similar characters (such as green frogs or other amphibian-like Muppets) as Kermit. This is noted in the project README as a known behaviour, and arises because the decision boundary between “green frog” and “Kermit” is inherently subtle. Adding more diverse no-kermit examples, especially of green frogs, during training can reduce this error.
- Video format: The video prediction path is designed around .avi files and uses cv2.VideoCapture directly. Other container formats may work depending on your OpenCV build, but only .avi has been tested.
- Output directory: All annotated video frames are written to a flat episode3_results/ directory relative to wherever you run the script. There is currently no option to change this output path via CLI flags.
For a full list of available command-line flags (including --file_type / -t and --files / -f), see the CLI Reference.