Documentation Index
Fetch the complete documentation index at: https://mintlify.com/QwenLM/Qwen3-VL/llms.txt
Use this file to discover all available pages before exploring further.
Function Signature
Description
Loads an image from various sources (local file, URL, base64 string, or PIL.Image object) and applies smart resizing based on the model’s requirements. The function automatically handles different image formats and converts them to RGB.Parameters
Dictionary containing image information and optional resize parameters.Required keys:
imageorimage_url: The image source (file path, URL, base64 string, or PIL.Image)
resized_height: Target height for resizingresized_width: Target width for resizingmin_pixels: Minimum number of pixels (default: 4 * patch_factor²)max_pixels: Maximum number of pixels (default: 16384 * patch_factor²)
The patch size used by the vision encoder. Affects the resizing factor calculation.Common values:
14for Qwen2VL and Qwen2.5VL16for Qwen3VL
image_patch_size * 2 (spatial merge size).Returns
Processed RGB PIL Image resized to dimensions divisible by the patch factor.The image dimensions are calculated using smart_resize to maintain aspect ratio while staying within min/max pixel constraints.
Supported Image Sources
Local File Path
HTTP/HTTPS URL
Base64 Encoded
PIL.Image Object
Custom Resize Parameters
Specify Exact Dimensions
Control Pixel Range
Image Processing Steps
- Load Image: Detects source type (file, URL, base64, PIL) and loads the image
- Convert to RGB: Handles RGBA images by compositing on white background
- Calculate Resize Dimensions: Uses smart_resize to determine optimal dimensions
- Resize: Applies resize maintaining aspect ratio within constraints
Error Handling
Usage with process_vision_info
While you can usefetch_image directly, it’s typically called internally by process_vision_info:
RGBA Image Handling
Images with alpha channels (RGBA) are automatically composited onto a white background:See Also
- process_vision_info - Process all vision content from conversations
- smart_resize - Understanding the resize algorithm
- fetch_video - Load video files