OpenComic AI models are trained using traiNNer-redux, an open-source image restoration and super-resolution training framework. The repository includes ready-to-use traiNNer-redux configuration files under options/train/ that match the architecture and data paths used for the official OpenComic AI model releases. You can use these configs directly or adapt them to your own dataset paths and hyperparameters.
Full documentation for traiNNer-redux configuration options, architecture references, and loss functions is available at trainner-redux.readthedocs.io.

Model Architectures

OpenComic AI ships three architecture tiers across all task families:
| Architecture | Config folder | Description |
| --- | --- | --- |
| ESRGAN | options/train/ESRGAN/ | Full-size ESRGAN generator. Highest quality; highest VRAM and inference time. |
| ESRGAN Lite | options/train/ESRGAN/ | Lighter ESRGAN variant (*-lite configs). Balanced quality and speed. |
| Compact | options/train/Compact/ | Smallest architecture (*-compact configs). Fastest inference; suitable for real-time or CPU use. |
All three architectures are trained on the same generated datasets. Only the network_g.type field and the pretrained model path differ between configs.

Pre-Training Model Chain

The OpenComic AI release models follow a deliberate pre-training chain that transfers general restoration knowledge before specializing for a harder task:
  • Artifact removal — trained from scratch (no pretrained base).
  • Descreen — initialized from the artifact-removal weights of the matching architecture tier, then fine-tuned on descreen data. Artifact removal features are a prerequisite because halftone patterns are a class of compression artifact.
  • Upscale 2x — initialized from artifact-removal weights. Real-world scanned comics almost always carry compression and halftone artifacts before upscaling, so artifact-removal priors improve upscale quality on degraded inputs.
  • Upscale 3x / 4x — initialized from upscale-2x weights of the matching tier, not from artifact-removal.
This means you should generate and train in this order: artifact-removal → descreen, and artifact-removal → upscale-2x → upscale-3x/4x. The pretrained model path is set in the path.pretrain_network_g field of each training config.
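The ordering above can be sketched as a small dependency graph. This is only an illustration: the task labels are shorthand for the model families described here, not config file names, and the helper `training_order` is hypothetical rather than part of the repository or traiNNer-redux.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks whose weights it is initialized from.
# The labels are illustrative shorthand, not actual config file names.
PRETRAIN_SOURCE = {
    "artifact-removal": set(),            # trained from scratch
    "descreen": {"artifact-removal"},
    "upscale-2x": {"artifact-removal"},
    "upscale-3x": {"upscale-2x"},
    "upscale-4x": {"upscale-2x"},
}


def training_order(chain):
    """Return one valid order in which every task follows its pretraining source."""
    return list(TopologicalSorter(chain).static_order())


order = training_order(PRETRAIN_SOURCE)
print(" -> ".join(order))
```

Any order the sorter returns starts with artifact-removal and places upscale-2x before upscale-3x and upscale-4x; the relative position of descreen and the upscale models does not matter because they do not depend on each other.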

Training Workflow

1. Generate the paired dataset

Follow the steps in Generate Paired Training Datasets to produce a clean/ and degraded/ folder. For example, to generate the upscale-2x dataset:
npm run prepare && npm run generate -- \
  --options ./options/opencomic-ai-upscale-2x.yml \
  --krita ./krita-5.3.1-x86_64.AppImage
This writes image pairs to datasets/opencomic-ai-upscale-2x/clean/ and datasets/opencomic-ai-upscale-2x/degraded/.

2. Validate the dataset (recommended)

Before training, run fix-images.mjs to remove any incomplete or dimension-mismatched pairs that could cause data loader errors. See Validate and Fix Paired Dataset Consistency for details.
node fix-images.mjs --dataset opencomic-ai-upscale-2x --print
node fix-images.mjs --dataset opencomic-ai-upscale-2x --scale 2 --delete
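
To make the idea of an incomplete pair concrete, here is a minimal sketch that lists files present in only one of the two folders. It is not how fix-images.mjs works internally; it assumes that paired images share the same file name in clean/ and degraded/, and it does not check image dimensions. The function name `unpaired_files` is hypothetical.

```python
from pathlib import Path


def unpaired_files(dataset_dir):
    """Return (only_in_clean, only_in_degraded) as sorted lists of file names.

    Assumes a pair is two files with the same name, one in clean/ and one
    in degraded/. Image dimensions are not inspected.
    """
    root = Path(dataset_dir)
    clean = {p.name for p in (root / "clean").iterdir() if p.is_file()}
    degraded = {p.name for p in (root / "degraded").iterdir() if p.is_file()}
    return sorted(clean - degraded), sorted(degraded - clean)
```

Calling `unpaired_files("datasets/opencomic-ai-upscale-2x")` on a complete dataset should return two empty lists under that naming assumption.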

3. Point the training config to your dataset

Open the relevant config in options/train/ESRGAN/. The upscale-2x config at options/train/ESRGAN/opencomic-ai-upscale-2x.yml contains the following dataset section:
datasets:
  train:
    name: Train Dataset
    type: pairedimagedataset
    # Path to the HR (high res / clean) images
    dataroot_gt:
      - datasets/train/opencomic-ai-upscale-2x/clean
    # Path to the LR (low res / degraded) images
    dataroot_lq:
      - datasets/train/opencomic-ai-upscale-2x/degraded

    lq_size: 64       # Crop size for LR images during training
    use_hflip: true   # Random horizontal flip augmentation
    use_rot: true     # Random rotation augmentation

    num_worker_per_gpu: 8
    batch_size_per_gpu: 8
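Update dataroot_gt and dataroot_lq to point at your generated dataset folders if they differ from the defaults. Note that the generate command in step 1 writes to datasets/opencomic-ai-upscale-2x/, whereas this config reads from datasets/train/opencomic-ai-upscale-2x/, so either move the generated folders or edit the two paths. The lq_size of 64 corresponds to a 128×128 ground-truth crop at 2× scale (lq_size = gt_size / scale).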
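
The relation lq_size = gt_size / scale can be checked with a few lines of arithmetic. The helper names below are hypothetical, and keeping lq_size at 64 for the other scales is an assumption made only for illustration; the source states the value for the 2× config alone.

```python
def gt_crop_size(lq_size, scale):
    """Ground-truth crop edge length that corresponds to an LR crop."""
    return lq_size * scale


def lq_crop_size(gt_size, scale):
    """LR crop edge length; the ground-truth size must divide evenly by scale."""
    if gt_size % scale != 0:
        raise ValueError(f"gt_size {gt_size} is not divisible by scale {scale}")
    return gt_size // scale


print(gt_crop_size(64, 2))   # 128, the 2x case described above
print(lq_crop_size(128, 2))  # 64

# Assumption for illustration only: the same lq_size at other scales.
for scale in (1, 3, 4):
    print(scale, gt_crop_size(64, scale))
```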

4. Set the pretrained model path

For upscale-2x and descreen models, set path.pretrain_network_g to the artifact-removal checkpoint of the matching architecture:
path:
  pretrain_network_g: experiments/pretrained_models/opencomic-ai-artifact-removal-1000000.safetensors
  param_key_g: ~
  strict_load_g: false   # Allows loading a pretrain model with a different scale
  resume_state: ~
For artifact-removal training from scratch, set pretrain_network_g to ~ (null).

5. Configure network and scale

The top of each config sets the scale and generator architecture:
name: opencomic-ai-upscale-2x
scale: 2        # 1 for artifact-removal/descreen, 2/3/4 for upscale

network_g:
  type: esrgan
  use_pixel_unshuffle: false
For artifact-removal and descreen models, scale is 1, so the clean and degraded images have the same spatial size.

6. Run traiNNer-redux training

With traiNNer-redux installed, start training by pointing it at your config file:
python train.py -opt options/train/ESRGAN/opencomic-ai-upscale-2x.yml
Checkpoints are saved every save_checkpoint_freq iterations (10,000 in this config) to experiments/<name>/models/ in safetensors format. Training runs for total_iter iterations (1,000,000 in this config).
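
As a rough guide to how many checkpoint files to expect, the sketch below lists the iterations at which a save would occur. It assumes a checkpoint is written at every exact multiple of save_checkpoint_freq, including the final iteration; the helper name is hypothetical and traiNNer-redux may save additional files.

```python
def checkpoint_iterations(total_iter, save_checkpoint_freq):
    """Iterations at which a checkpoint is assumed to be written."""
    return range(save_checkpoint_freq, total_iter + 1, save_checkpoint_freq)


saves = checkpoint_iterations(1_000_000, 10_000)
print(len(saves), saves[0], saves[-1])  # 100 10000 1000000
```

With the values from this config, that is 100 checkpoints per run, which is worth keeping in mind when budgeting disk space.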

Training Config Reference (upscale-2x)

The following excerpt covers the key fields from options/train/ESRGAN/opencomic-ai-upscale-2x.yml:
name: opencomic-ai-upscale-2x
scale: 2
use_amp: false
use_channels_last: true

datasets:
  train:
    type: pairedimagedataset
    dataroot_gt:
      - datasets/train/opencomic-ai-upscale-2x/clean
    dataroot_lq:
      - datasets/train/opencomic-ai-upscale-2x/degraded
    lq_size: 64
    use_hflip: true
    use_rot: true
    batch_size_per_gpu: 8

network_g:
  type: esrgan
  use_pixel_unshuffle: false

path:
  pretrain_network_g: experiments/pretrained_models/opencomic-ai-artifact-removal-1000000.safetensors
  strict_load_g: false

train:
  total_iter: 1000000
  optim_g:
    type: AdamW
    lr: !!float 2e-4
  scheduler:
    type: MultiStepLR
    milestones: [200000, 400000, 600000, 800000]
    gamma: 0.5
  losses:
    - type: charbonnierloss
      loss_weight: 1.0
    - type: mssimloss
      loss_weight: 0.1
    - type: perceptualloss
      criterion: charbonnier
      loss_weight: 0.01
    - type: hsluvloss
      criterion: charbonnier
      loss_weight: 0.2
    - type: cosimloss
      loss_weight: 0.2

logger:
  save_checkpoint_freq: 10000
  save_checkpoint_format: safetensors
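
To see what the scheduler settings above imply, the following sketch computes the generator learning rate at a given iteration. It assumes the scheduler follows the step semantics of PyTorch's MultiStepLR (the rate is multiplied by gamma once each milestone is reached) and that milestones are counted in iterations. The function `multistep_lr` is a hypothetical helper written in plain Python, not a traiNNer-redux API.

```python
from bisect import bisect_right


def multistep_lr(iteration,
                 base_lr=2e-4,
                 milestones=(200_000, 400_000, 600_000, 800_000),
                 gamma=0.5):
    """Learning rate after `iteration` steps, halved at each milestone reached."""
    passed = bisect_right(milestones, iteration)  # milestones <= iteration
    return base_lr * gamma ** passed


for it in (0, 200_000, 400_000, 600_000, 800_000):
    print(it, multistep_lr(it))
```

Under these assumptions the rate starts at 2e-4 and is halved four times, ending at 1.25e-5 for the last 200,000 iterations.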
