fix-images.mjs is a standalone validation utility that checks the consistency of a generated paired dataset. After a generation run, especially one that was interrupted and resumed, the clean/ and degraded/ folders can fall out of sync: a file may exist in one folder but not the other, or the pixel dimensions of a clean image may not equal the dimensions of its degraded counterpart multiplied by the scale factor. The script scans both folders, identifies every such mismatch, and can either report the problems or permanently delete the offending files.

Invocation

node fix-images.mjs --dataset <name> [--scale <n>] [--print] [--delete]

Flags

--dataset (string, required)
The name of the dataset folder to validate. The script resolves the full path as ./datasets/<name>/ relative to the repository root and expects both a clean/ and a degraded/ subdirectory inside that folder.
Example: --dataset opencomic-ai-upscale-2x

--scale (number)
The integer scale factor used to verify dimension relationships. For each file pair, the script checks that clean.width === degraded.width × scale and clean.height === degraded.height × scale. If this flag is omitted, the scale is auto-detected from the dataset name (see the tip below). If no scale can be detected from the name, or the detected value is NaN, the scale defaults to 1.

--print (flag)
Print the filename and dimensions of each dimension-mismatched file pair to stdout without modifying any files. Unpaired files (where one side is missing entirely) are counted in the mismatch total but are not printed individually. Run with this flag first to review dimension mismatches before running with --delete.

--delete (flag)
Delete both the clean/ and degraded/ copies of every mismatched file. If only one side of a pair exists, that single file is deleted. Deletion is performed with fs.unlinkSync and cannot be undone.

Warning: --delete is irreversible. Files removed by this flag cannot be recovered. Always run with --print first to confirm the list of dimension-mismatched files before passing --delete, and keep in mind that --print does not list unpaired files, which --delete also removes.

Tip: If --scale is not provided, the script parses the dataset name for a pattern matching <number>x or x<number> (regex /([0-9]+)x|x([0-9]+)/). For example, a dataset named opencomic-ai-upscale-2x is assigned a scale of 2 automatically, and opencomic-ai-upscale-3x uses 3. For datasets where no scale pattern appears in the name (such as opencomic-ai-artifact-removal or opencomic-ai-descreen-hard), the scale falls back to 1, which means only paired-file existence and equal dimensions are checked.
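To make the auto-detection rule concrete, here is a minimal sketch of how a scale could be derived from a dataset name using the regex quoted above. The function name detectScaleFromName is hypothetical and the body is an illustration of the documented behavior, not code taken from fix-images.mjs.

```javascript
// Hypothetical helper (not from fix-images.mjs): derive the scale factor
// from a dataset name using the pattern quoted in the documentation.
function detectScaleFromName(datasetName) {
  // Matches "<number>x" (group 1) or "x<number>" (group 2).
  const match = datasetName.match(/([0-9]+)x|x([0-9]+)/);
  if (!match) return 1; // no scale pattern in the name

  const detected = Number.parseInt(match[1] ?? match[2], 10);
  // Fall back to 1 if the captured value is not a usable number.
  return Number.isNaN(detected) ? 1 : detected;
}

console.log(detectScaleFromName('opencomic-ai-upscale-2x'));       // 2
console.log(detectScaleFromName('opencomic-ai-artifact-removal')); // 1
```

Because the pattern matches the first digit-and-x pair anywhere in the name, passing --scale explicitly is the safer choice for dataset names that contain such a pair for unrelated reasons.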

What constitutes a mismatch

The script considers a file to be mismatched under either of the following conditions:
  1. Unpaired file: The filename exists in clean/ but not in degraded/, or vice versa. This can happen when a generation run is interrupted mid-image.
  2. Dimension mismatch: Both files exist, but at least one clean image dimension does not equal the corresponding degraded image dimension multiplied by the scale factor:
    clean.width  ≠ degraded.width  × scale
    clean.height ≠ degraded.height × scale
    
    This guards against partially-written images or files from a different generation run being mixed into the dataset folder.
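The two conditions above can be sketched as a single pass over the union of filenames. This is an illustrative sketch only: findMismatches is a hypothetical name, and the Maps of filename to { width, height } stand in for whatever directory listing and image-metadata reader the real script uses.

```javascript
// Hypothetical helper (not from fix-images.mjs). `clean` and `degraded` are
// Maps of filename -> { width, height }, assumed to be built elsewhere.
function findMismatches(clean, degraded, scale = 1) {
  // Every unique filename seen on either side of the dataset.
  const names = new Set([...clean.keys(), ...degraded.keys()]);
  const mismatches = [];

  for (const name of names) {
    const c = clean.get(name);
    const d = degraded.get(name);

    if (!c || !d) {
      // Condition 1: the file exists on one side only.
      mismatches.push({ name, reason: 'unpaired' });
    } else if (c.width !== d.width * scale || c.height !== d.height * scale) {
      // Condition 2: either dimension fails the scale relationship.
      mismatches.push({ name, reason: 'size', clean: c, degraded: d });
    }
  }

  return { mismatches, total: names.size };
}
```

Keeping the reason on each entry makes it straightforward to print only the size mismatches while still counting unpaired files in the total, which matches the behavior described for --print.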

Output format

When the script runs, it first prints a configuration summary, then (if --print is set) each individual dimension mismatch, and finally a totals line:
Dataset : opencomic-ai-upscale-2x
Scale   : 2
Print   : true
Delete  : false

Mismatched size: 000123.png (clean: 512x512, degraded: 128x256)
Mismatched size: 000456.png (clean: 512x512, degraded: 200x256)

Total mismatched files: 2 / 10000 (0.02%)
When --delete is active instead, each deleted filename is logged as it is removed:
Dataset : opencomic-ai-upscale-2x
Scale   : 2
Print   : false
Delete  : true

Deleted: 000123.png
Deleted: 000456.png

Total mismatched files: 2 / 10000 (0.02%)
The summary line always shows the count of mismatched files, the total number of unique filenames seen across both folders, and the mismatch rate as a percentage.
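As a rough illustration of how the lines shown above could be produced, the following sketch formats a dimension-mismatch line and the totals line. The function names are hypothetical, and rounding the rate to two decimal places is an assumption inferred from the sample output rather than confirmed behavior of fix-images.mjs.

```javascript
// Hypothetical formatters (not from fix-images.mjs). Only the output shapes
// come from the examples above; names and rounding are assumptions.
function formatMismatch(name, clean, degraded) {
  return `Mismatched size: ${name} (clean: ${clean.width}x${clean.height}, ` +
    `degraded: ${degraded.width}x${degraded.height})`;
}

function formatSummary(mismatched, total) {
  // Guard against an empty dataset so the rate is 0 rather than NaN.
  const rate = total === 0 ? 0 : (mismatched / total) * 100;
  return `Total mismatched files: ${mismatched} / ${total} (${rate.toFixed(2)}%)`;
}

console.log(formatSummary(2, 10000));
// Total mismatched files: 2 / 10000 (0.02%)
```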

Example commands

Inspect dimension mismatches in the upscale-2x dataset without deleting anything:
node fix-images.mjs --dataset opencomic-ai-upscale-2x --print
Verify and remove all mismatched pairs from the upscale-2x dataset with an explicit scale:
node fix-images.mjs --dataset opencomic-ai-upscale-2x --scale 2 --delete
Check the artifact-removal dataset (scale 1, so clean and degraded dimensions must match exactly):
node fix-images.mjs --dataset opencomic-ai-artifact-removal --scale 1 --print
