The project exposes several importable Python functions across kermit_model_evaluation.py and helpers/utils.py. All prediction functions are async coroutines and must be awaited (or run via asyncio.get_event_loop().run_until_complete()). The helper utilities in helpers/utils.py can be imported independently of the model to support concurrent task coordination and progress reporting.
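As a minimal sketch of the two calling styles mentioned above, the snippet below awaits a coroutine from async code and also drives one from synchronous code. fake_predict is a hypothetical stand-in for a prediction coroutine and is not part of the project. asyncio.run is shown alongside the explicit run_until_complete style.

```python
import asyncio


async def fake_predict(name):
    """Hypothetical stand-in for one of the project's prediction coroutines."""
    await asyncio.sleep(0)  # yield to the event loop, as real async work would
    return {"input": name}


async def caller():
    # Inside async code, a coroutine is simply awaited.
    return await fake_predict("frame.jpg")


# From synchronous code, an event loop has to drive the coroutine.
# asyncio.run creates a loop, runs the coroutine to completion, and closes the loop.
result = asyncio.run(caller())

# Equivalent explicit-loop form, matching the run_until_complete style:
loop = asyncio.new_event_loop()
try:
    same_result = loop.run_until_complete(caller())
finally:
    loop.close()
```

Both forms return the same value; which one fits depends on whether the calling code already runs inside an event loop.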
kermit_model_evaluation module
predict_image
Image argument: either a file path string (relative to EXECUTION_PATH, i.e. the repository root) or a NumPy array representing an already-loaded image.
Model argument: a fully loaded CustomImagePrediction instance (model path, JSON path, and loadModel already called). See main for the standard initialisation pattern.
Returns: a dictionary mapping class name strings to probability strings (one entry per class, here kermit and no-kermit).
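The shape of the return value can be illustrated with a hypothetical result. The values below are made up for illustration (they reuse the class names and percentages that appear elsewhere on this page), and the exact string formatting of the probabilities is an assumption.

```python
# Hypothetical predict_image result: class names mapped to probability strings.
example_result = {"kermit": "99.87", "no-kermit": "0.13"}

# Because the probabilities are strings, convert them before comparing.
as_floats = {name: float(prob) for name, prob in example_result.items()}

# Pick the class with the highest probability.
best_class = max(as_floats, key=as_floats.get)
```

Converting to float first avoids comparing the probabilities as text, where "9.5" would sort above "10.0".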
predict_video
Processes an .avi video file: samples one frame per second, runs predict_image concurrently on every frame via gather_dict, then writes prediction text onto each saved JPEG.
Video argument: path to the .avi video file to process (e.g. MuppetsEpisode3.avi).
Model argument: a fully loaded CustomImagePrediction instance. The same model object used for image prediction is reused here.
The function works in three steps:
- Frame extraction: uses cv2.CAP_PROP_POS_MSEC to seek to each whole-second boundary (counter * 1000 ms) and writes each frame to episode3_results/ep3_frameN.jpg.
- Concurrent prediction: all per-frame predict_image coroutines are collected into a tasks dict and dispatched together with gather_dict.
- Annotation: after all predictions resolve, each JPEG is re-opened with OpenCV and a results string (e.g. kermit 99.87% no-kermit 0.13%) is drawn at pixel position (130, 25) using cv2.FONT_HERSHEY_PLAIN at size 1.5 in red (BGR (0, 0, 255)).
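The seek positions, output paths, and overlay text described above can be sketched in plain Python. The helper names here are hypothetical, and starting the counter at 0 is an assumption. The OpenCV calls that would consume these values are shown only as comments.

```python
def seek_position_ms(counter):
    """Millisecond offset of the counter-th whole-second boundary."""
    return counter * 1000


def frame_path(counter, out_dir="episode3_results", prefix="ep3_frame"):
    """Output path for a frame, following the ep3_frameN.jpg naming."""
    return f"{out_dir}/{prefix}{counter}.jpg"


def overlay_text(prediction):
    """Join class/probability pairs into a single line of text."""
    return " ".join(f"{name} {prob}%" for name, prob in prediction.items())


# Hypothetical values for a three-second clip.
positions = [seek_position_ms(c) for c in range(3)]
paths = [frame_path(c) for c in range(3)]
text = overlay_text({"kermit": "99.87", "no-kermit": "0.13"})

# With OpenCV these values would be passed to calls such as:
#   capture.set(cv2.CAP_PROP_POS_MSEC, positions[i])
#   cv2.imwrite(paths[i], frame)
#   cv2.putText(image, text, (130, 25), cv2.FONT_HERSHEY_PLAIN, 1.5, (0, 0, 255))
```

Keeping the arithmetic and string formatting in small pure functions makes them easy to check without a video file or an OpenCV installation.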
main
Initialises a CustomImagePrediction model, then dispatches to either predict_image (for each image in a comma-separated list) or predict_video, depending on file_type.
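The dispatch described above might look roughly like the following sketch. The two stub coroutines are hypothetical stand-ins for predict_image and predict_video, and model initialisation is omitted. Stripping whitespace around each path and raising on an unknown file_type are assumptions, not confirmed behaviour of main.

```python
import asyncio


async def predict_image_stub(path):
    """Hypothetical stand-in for predict_image."""
    return f"image:{path}"


async def predict_video_stub(path):
    """Hypothetical stand-in for predict_video."""
    return f"video:{path}"


async def dispatch(file_type, files):
    """Route to the image or video path based on file_type."""
    if file_type == "image":
        # A comma-separated string becomes one prediction per path.
        paths = [p.strip() for p in files.split(",") if p.strip()]
        return [await predict_image_stub(p) for p in paths]
    if file_type == "video":
        return await predict_video_stub(files)
    raise ValueError(f"unsupported file_type: {file_type!r}")


image_results = asyncio.run(dispatch("image", "a.jpg, b.jpg"))
video_result = asyncio.run(dispatch("video", "clip.avi"))
```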
file_type: either "image" or "video". Determines which prediction path is taken.
files: for "image", a comma-separated string of image file paths; for "video", a single .avi file path.
main is invoked from the __main__ block via loop.run_until_complete(main(...)). You can also call it directly from your own async code with await main(file_type, files).
helpers.utils module
print_progress
Prints the current progress on a single line that is rewritten in place with a carriage return (\r). Useful for displaying frame-extraction progress without flooding the terminal with newlines.
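A minimal sketch of the carriage-return technique follows. The function names are hypothetical and this is not the project's implementation; it only assumes that progress is passed as a fraction and shown as a whole-number percentage.

```python
import sys


def format_progress(fraction):
    """Render a completion fraction in [0.0, 1.0] as an integer percentage."""
    return f"{fraction:.0%}"


def print_progress_sketch(fraction, stream=sys.stdout):
    """Overwrite the current terminal line instead of printing a new one."""
    # '\r' moves the cursor back to the start of the line; no newline is written.
    stream.write("\r" + format_progress(fraction))
    stream.flush()


for done in range(0, 101, 25):
    print_progress_sketch(done / 100)
print()  # finish with a newline once the loop is complete
```

Flushing after each write matters because stdout is usually line-buffered, and without a newline the text might otherwise not appear until later.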
Progress argument: a value between 0.0 and 1.0 representing completion. Rendered as an integer percentage, e.g. 0.42 → 42%.
gather_dict
Takes a dictionary of {key: coroutine} pairs, runs all coroutines concurrently using asyncio.gather, and returns a dictionary of {key: result} preserving the original keys.
tasks: a dictionary mapping arbitrary keys (e.g. image file paths) to awaitables (coroutines or futures). All values are scheduled concurrently.
Returns: a dictionary with the same keys as tasks, where each value is the resolved result of the corresponding coroutine.
gather_dict is used by predict_video to fan out predict_image calls across all extracted frames in a single await, maximising I/O concurrency.
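One common way to write such a helper is sketched below. It is named gather_dict_sketch to make clear that it is an illustration rather than the project's code, and slow_square is a hypothetical workload standing in for a per-frame prediction.

```python
import asyncio


async def gather_dict_sketch(tasks):
    """Await every value in tasks concurrently and return {key: result}."""
    keys = list(tasks)
    # asyncio.gather returns results in the order the awaitables were passed,
    # so zipping with keys restores the original mapping.
    results = await asyncio.gather(*(tasks[key] for key in keys))
    return dict(zip(keys, results))


async def slow_square(n):
    """Hypothetical workload standing in for a per-frame prediction."""
    await asyncio.sleep(0.01)
    return n * n


async def demo():
    return await gather_dict_sketch({"a": slow_square(2), "b": slow_square(3)})


squares = asyncio.run(demo())
```

Note that asyncio.gather only interleaves coroutines at their await points, so the benefit depends on how much of each task is spent waiting rather than computing.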
imageai_build_model module
main
Creates a ModelTraining instance, configures it to use the ResNet architecture, points it at the data directory, and starts training.
trainModel parameters used:
num_objects: number of output classes the model is trained to predict. Here there are two: kermit and no-kermit.
num_experiments: number of training epochs (full passes over the training dataset).
enhance_data: enables ImageAI’s built-in data augmentation (random flips, rotations, shifts) during training to improve generalisation.
batch_size: number of training samples processed per gradient update step.
show_network_summary: when True, prints a Keras-style layer summary of the ResNet architecture to stdout before training begins.
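The training setup described above might be sketched as follows, assuming the ImageAI 2.x API (the imageai.Prediction.Custom import path and the trainModel keyword names come from that release line and are an assumption about what the project uses). Only num_objects=2 follows from this page; every other value, and the directory name in the commented call, is a hypothetical placeholder.

```python
# Keyword arguments for trainModel. num_objects=2 follows from the two classes
# (kermit and no-kermit); every other value is a hypothetical placeholder.
TRAIN_KWARGS = {
    "num_objects": 2,
    "num_experiments": 100,
    "enhance_data": True,
    "batch_size": 32,
    "show_network_summary": True,
}


def train(data_directory):
    """Train a ResNet classifier on data_directory (requires ImageAI 2.x)."""
    # Imported lazily so this module can be loaded without ImageAI installed.
    from imageai.Prediction.Custom import ModelTraining

    trainer = ModelTraining()
    trainer.setModelTypeAsResNet()
    trainer.setDataDirectory(data_directory)
    trainer.trainModel(**TRAIN_KWARGS)


# Example call, with a hypothetical directory name:
# train("data")
```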