Prowl.Quill is designed to be fast out of the box: the entire rendering core is under 1 000 lines of executable code, and the API eliminates unnecessary allocations wherever possible. Even so, understanding how the library batches draw calls, how tessellation tolerances trade quality for speed, and when to reuse pre-built text layouts will help you sustain high frame rates even when drawing tens of thousands of shapes per frame.
Draw call batching
The canvas automatically merges consecutive shapes into a single GPU draw call whenever they share the same scissor region, brush (colour, gradient, texture), and shader + uniforms. No manual grouping is required: draw similar shapes back-to-back and the renderer handles the rest.
What breaks a batch
State changes
Setting a different fill colour, stroke colour, brush (gradient/texture), scissor rectangle, or custom shader always opens a new draw call on the next shape.
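The effect of state changes on draw-call count can be modelled with a few lines of C#. This is a minimal sketch, not Quill's renderer: `DrawState`, `BatchModel`, and the integer ids are hypothetical stand-ins for the scissor, brush, and shader state described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the state that decides batch membership.
// These fields are illustrative and are not Quill types.
public readonly record struct DrawState(int ScissorId, int BrushId, int ShaderId);

public static class BatchModel
{
    // Counts draw calls for a given submission order: a new call starts
    // whenever a shape's state differs from the previous shape's state.
    public static int CountDrawCalls(IReadOnlyList<DrawState> shapes)
    {
        int calls = 0;
        for (int i = 0; i < shapes.Count; i++)
        {
            if (i == 0 || shapes[i] != shapes[i - 1])
                calls++;
        }
        return calls;
    }
}
```

Under this model, submitting shapes in the order red, blue, red, blue costs four draw calls, while red, red, blue, blue costs two, which is why drawing similar shapes consecutively matters.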
RequestNewDrawCall()
Call RequestNewDrawCall() when you need to guarantee a rendering-order boundary, for example when a transparent shape must composite over an opaque one drawn in the same frame.
Prefer hardware-accelerated primitives
The *Filled family of methods (RectFilled, RoundedRectFilled, CircleFilled, PieFilled) uses shader-based antialiasing and bypasses the CPU path-tessellation pipeline entirely. These methods are significantly faster than the equivalent path API for simple shapes:
- Fast (shader-based AA): the *Filled methods
- Slower (path tessellation): the same shapes built with BeginPath and Fill
Reserve BeginPath / Fill / FillComplex for genuinely irregular shapes (concave polygons, shapes with holes, and SVG-style paths) where no built-in primitive applies.
Tessellation tolerance
The default tessellation tolerance is 0.5 (half a logical pixel), an error that is barely visible. Raising it with SetTessellationTolerance reduces the triangle count for curved paths at the cost of slight faceting.
The tolerance affects BezierCurveTo, QuadraticCurveTo, and path-based Arc. Shader-based primitives (CircleFilled, etc.) are unaffected.
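To get a feel for how tolerance trades segments for accuracy, the sketch below applies the standard chord-error (sagitta) bound to a circular arc. It is an illustration of the general relationship only; Quill's own subdivision code is not shown in this guide and may differ, and `Flatten.ArcSegments` is a hypothetical helper.

```csharp
using System;

public static class Flatten
{
    // Smallest number of straight segments such that no chord of a circular
    // arc deviates from the true curve by more than `tolerance`.
    // A chord spanning angle a has error r * (1 - cos(a / 2)).
    public static int ArcSegments(float radius, float sweepRadians, float tolerance)
    {
        if (radius <= 0f || tolerance <= 0f)
            throw new ArgumentOutOfRangeException(nameof(tolerance), "radius and tolerance must be positive");

        // Clamp so that a tolerance larger than the diameter stays in Acos's domain.
        double cosHalf = Math.Max(-1.0, 1.0 - (double)tolerance / radius);
        double maxStep = 2.0 * Math.Acos(cosHalf);

        int segments = (int)Math.Ceiling(Math.Abs(sweepRadians) / maxStep);
        return Math.Max(1, segments);
    }
}
```

Because the maximum step grows roughly with the square root of the tolerance, quadrupling the tolerance about halves the segment count for the same arc.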
Arc segment density
SetRoundingMinDistance controls the segment density of Arc, Circle, RoundedRect, and their filled equivalents. A higher value means fewer segments per arc: fewer CPU triangles but potentially more visible faceting on very large circles.
For shader-based filled primitives (CircleFilled, RoundedRectFilled), segment count also affects CPU cost, but the GPU AA fringe still looks smooth regardless. Raising RoundingMinDistance to 5–8 is a safe optimisation for scenes with many small circles.
Reuse text layouts across frames
Pre-building a text layout once is one of the most impactful optimisations available for text-heavy UIs.
Static text: CreateLayout + DrawLayout
CreateLayout runs the font engine’s glyph shaping and line-breaking pipeline and returns an object whose geometry is frozen. DrawLayout simply submits that geometry, with no shaping, no line-breaking, and no per-character measurement.
Rich text: reuse QuillRichText
Call QuillRichText.Reset() to replay animations without recreating the object. Only call CreateRichText again when the source text or layout settings change.
Markdown: reuse QuillMarkdown
Recreate the QuillMarkdown object only when the source or the available width changes.
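The "build once, rebuild only on change" rule shared by the three cases above can be wrapped in a small cache. This is a sketch under assumptions: `LayoutCache` is a hypothetical helper, and the factory delegate is where a call such as CreateLayout would go (its real parameters are not shown in this guide, so the delegate takes only text and width).

```csharp
#nullable enable
using System;

// Rebuilds a layout object only when the text or the wrap width changes.
// TLayout is whatever type the layout-building call returns.
public sealed class LayoutCache<TLayout> where TLayout : class
{
    private readonly Func<string, float, TLayout> _build;
    private string? _text;
    private float _width;
    private TLayout? _layout;

    // Number of times the expensive build step actually ran.
    public int BuildCount { get; private set; }

    public LayoutCache(Func<string, float, TLayout> build) => _build = build;

    public TLayout Get(string text, float width)
    {
        if (_layout is null || text != _text || width != _width)
        {
            _layout = _build(text, width);   // expensive: shaping and line-breaking
            _text = text;
            _width = width;
            BuildCount++;
        }
        return _layout;                      // cheap: reuse the frozen result
    }
}
```

Calling `Get` every frame with unchanged arguments returns the same object and leaves `BuildCount` at 1; a new width or new text triggers exactly one rebuild.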
HiDPI and framebuffer scale
The framebufferScale parameter tells the canvas how many physical pixels correspond to one logical unit. On a Retina or HiDPI display, this is typically 2.0. Passing the correct value is critical for both visual quality and performance:
Correct scale
At framebufferScale = 2.0, glyph atlases are rasterised at 2× density, AA fringe widths shrink to sub-pixel size, and everything looks crisp without any changes to your drawing code.
Wrong scale
At framebufferScale = 1.0 on a HiDPI display, glyphs are rasterised at 1× and upscaled by the OS, producing blurry text and thick AA fringes. Conversely, using 2.0 on a 1× display over-rasterises and wastes GPU memory.
Use canvas.PixelToLogical(rawMousePos) to convert physical-pixel mouse coordinates back to logical units for hit-testing.
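The conversion itself is simple arithmetic. The sketch below assumes that a logical position is the physical position divided by the framebuffer scale, which follows from the definition above; `HiDpi.PhysicalToLogical` and `HiDpi.HitTest` are hypothetical helpers, not the canvas method, whose implementation may differ.

```csharp
using System.Numerics;

public static class HiDpi
{
    // Converts a physical-pixel position to logical units.
    // Assumption: logical = physical / framebufferScale.
    public static Vector2 PhysicalToLogical(Vector2 pixel, float framebufferScale)
        => pixel / framebufferScale;

    // Tests a logical-space point against an axis-aligned rectangle
    // given by its top-left corner and size, both in logical units.
    public static bool HitTest(Vector2 point, Vector2 rectMin, Vector2 rectSize)
    {
        Vector2 rectMax = rectMin + rectSize;
        return point.X >= rectMin.X && point.X < rectMax.X
            && point.Y >= rectMin.Y && point.Y < rectMax.Y;
    }
}
```

Hit-testing with the raw mouse position on a 2× display would test a point twice as far from the origin as the one the user clicked, so the conversion has to happen before any comparison with layout rectangles.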
Benchmark insights
The BenchmarkScene sample draws 79 000 rectangles and 1 000 circles every frame, each with an independent transform (translate + rotate + scale) and an animated colour. Key techniques it uses that you can apply in production:
Deterministic random numbers with a custom LCG
Creating a System.Random inside the hot loop is expensive. The benchmark uses a simple LCG (linear congruential generator) to produce pseudo-random floats at near-zero cost.
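A minimal LCG of this kind might look as follows. The multiplier and increment are the widely used Numerical Recipes constants, chosen here as an assumption; the constants used by the benchmark itself are not stated in this guide, and the `Lcg` type is hypothetical.

```csharp
// Allocation-free pseudo-random generator with a 32-bit state.
// Mutable struct: keep it in a local variable or a non-readonly field.
public struct Lcg
{
    private uint _state;

    public Lcg(uint seed) => _state = seed;

    // state = state * a + c (mod 2^32); the modulus comes from uint wraparound.
    public uint NextUInt()
    {
        unchecked { _state = _state * 1664525u + 1013904223u; }
        return _state;
    }

    // Uses the top 24 bits, which a float can represent exactly,
    // to produce a value in [0, 1).
    public float NextFloat() => (NextUInt() >> 8) * (1.0f / 16777216f);
}
```

Seeding with a fixed value makes every frame (and every benchmark run) reproduce the same sequence, which keeps timing comparisons fair.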
Precomputed time values outside the loop
Time-dependent colour calculations use Maths.Sin, which is not free. Computing the time argument multipliers once before the loop avoids redundant floating-point work.
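The hoisting pattern can be sketched as below. The frequencies, the per-shape phase offset, and the `ColourAnim` type are hypothetical, and `MathF.Sin` from the .NET base library stands in for Maths.Sin so the example is self-contained.

```csharp
using System;

public static class ColourAnim
{
    // Hypothetical animation speeds for two colour channels.
    private const float RedSpeed = 1.0f;
    private const float GreenSpeed = 1.7f;
    private const float PhasePerShape = 0.01f;

    // Writes an animated value in [0, 1] per shape into each channel array.
    public static void Fill(float[] red, float[] green, float time)
    {
        // Loop-invariant: these products depend only on time,
        // so they are computed once per frame instead of once per shape.
        float tRed = time * RedSpeed;
        float tGreen = time * GreenSpeed;

        int count = Math.Min(red.Length, green.Length);
        for (int i = 0; i < count; i++)
        {
            float offset = i * PhasePerShape;
            red[i] = 0.5f + 0.5f * MathF.Sin(tRed + offset);
            green[i] = 0.5f + 0.5f * MathF.Sin(tGreen + offset);
        }
    }
}
```

The saving per shape is small, but with tens of thousands of shapes per frame every multiplication removed from the loop body adds up.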
Transform caching
Rather than calling GetTransform inside the loop, the benchmark caches the root transform once and restores it with CurrentTransform for each shape, avoiding multiple matrix multiplications.
Quick reference
| Practice | Impact |
|---|---|
| Use RectFilled, CircleFilled, RoundedRectFilled for simple shapes | High: skips CPU tessellation |
| Draw shapes with the same shader/brush consecutively | High: maximises batching |
| Use CreateLayout + DrawLayout for static text | High: eliminates per-frame font shaping |
| Reuse QuillRichText / QuillMarkdown across frames | High: layout computed once |
| Set framebufferScale correctly for HiDPI | High: crispness and atlas efficiency |
| Increase SetRoundingMinDistance for many small arcs | Medium: fewer CPU triangles |
| Increase SetTessellationTolerance for curved paths | Medium: fewer Bézier subdivisions |
| Call RequestNewDrawCall() only when necessary | Medium: unnecessary calls fragment batches |