Overview
At the heart of iSH is a high-performance interpreter that uses threaded code - a technique where instructions are represented as an array of function pointers, with each function (called a “gadget”) ending in a tailcall to the next. According to the README, this yields a speedup of roughly 3-5x over traditional switch-based dispatch.
Threaded Code Technique
Traditional Interpretation (Switch Dispatch)
A typical emulator uses a loop with a switch statement. Each emulated instruction then pays for:
- Branch misprediction on the switch
- Loop condition check
- Instruction pointer update
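As a rough illustration of the switch-dispatch pattern and the three costs listed above, here is a minimal C sketch. The toy opcodes (`OP_INC`, `OP_DOUBLE`, `OP_DEC`, `OP_HALT`) and the function `run_switch` are hypothetical names invented for this example; they are not taken from iSH or from any real emulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode, not the x86 instruction set iSH emulates. */
enum toy_op { OP_INC, OP_DOUBLE, OP_DEC, OP_HALT };

/* Classic switch dispatch: one shared loop fetches and executes each op. */
int32_t run_switch(const uint8_t *code, int32_t acc) {
    size_t ip = 0;                    /* index of the next opcode in code[] */
    for (;;) {                        /* the dispatch loop itself */
        switch (code[ip++]) {         /* ip update, then one shared indirect branch */
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_DEC:    acc -= 1; break;
        case OP_HALT:   return acc;
        default:        assert(!"unknown opcode"); return acc;
        }
    }
}
```

Every opcode funnels through the same `switch`, so the branch predictor has a single site from which to guess the next handler.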
Threaded Code (How iSH Works)
iSH generates an array of function pointers, where each function tailcalls the next. Compared with switch dispatch, this gives:
- No dispatch loop overhead
- Better branch prediction (direct calls)
- CPU’s return stack buffer optimizes tailcalls
- 3-5x faster than switch dispatch
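The same idea can be sketched in portable C, with the caveat that this is only an approximation of what iSH does in assembly. The names `struct slot`, `gadget_fn`, `g_inc`, `g_double`, `g_dec`, `g_halt`, and `run_threaded` are assumptions made for this example. Each gadget ends by calling the function stored in the next slot; the call is in tail position, so an optimizing compiler may turn it into a plain jump, but standard C does not guarantee that.

```c
#include <assert.h>
#include <stdint.h>

/* One slot in the gadget stream: a pointer to the gadget to run. */
struct slot;
typedef int32_t (*gadget_fn)(const struct slot *ip, int32_t acc);
struct slot { gadget_fn fn; };

/* Each gadget does its work, then calls the gadget in the following slot.
 * ip always points at the slot of the gadget currently running. */
int32_t g_inc(const struct slot *ip, int32_t acc)    { acc += 1; return ip[1].fn(ip + 1, acc); }
int32_t g_double(const struct slot *ip, int32_t acc) { acc *= 2; return ip[1].fn(ip + 1, acc); }
int32_t g_dec(const struct slot *ip, int32_t acc)    { acc -= 1; return ip[1].fn(ip + 1, acc); }

/* Terminator: returns the accumulator instead of continuing the chain. */
int32_t g_halt(const struct slot *ip, int32_t acc)   { (void)ip; return acc; }

/* No loop and no switch: just jump into the first gadget. */
int32_t run_threaded(const struct slot *code, int32_t acc) {
    assert(code != 0 && code[0].fn != 0);
    return code[0].fn(code, acc);
}
```

Because the compiler is free to emit a real call here, a long stream could grow the host stack in this C version; writing the gadgets in assembly avoids depending on the compiler for the tailcall.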
Gadget System
What is a Gadget?
A gadget is a small assembly function that:
- Executes a single operation (or part of an x86 instruction)
- Reads its parameters from the code stream
- Tailcalls the next gadget
asbestos/gadgets-aarch64/entry.S:
Gadget Types
Gadgets are organized by function in separate assembly files, listed under Gadget Categories in Detail.
Example: Memory Gadgets
From asbestos/gadgets-aarch64/memory.S:
- gret is a macro that performs the tailcall
- _addr, _tmp, esp are register aliases
- write_prep and read_prep are macros for TLB lookup
Gadget Parameters
Gadgets read parameters from the code stream pointed to by _ip (the instruction pointer in the gadget stream, not the x86 EIP).
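To make the idea of inline parameters concrete, here is a small C sketch in which operands sit in the stream directly after the gadget pointer that consumes them. Everything here is hypothetical: `union word`, `g_load_imm`, `g_add_imm`, `g_exit`, and `run_stream` are illustration names, and the two-operand layout is an assumption, not the layout iSH uses.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_REGS = 8 };

union word;
typedef void (*gadget_fn)(const union word *ip, uint32_t *regs);

/* A stream slot holds either a gadget pointer or an inline operand. */
union word {
    gadget_fn fn;
    uint32_t  imm;
};

/* regs[reg] = value; both operands follow the gadget pointer in the stream. */
void g_load_imm(const union word *ip, uint32_t *regs) {
    uint32_t reg = ip[1].imm;
    uint32_t value = ip[2].imm;
    assert(reg < NUM_REGS);
    regs[reg] = value;
    ip[3].fn(ip + 3, regs);           /* skip gadget + 2 operands, run the next gadget */
}

/* regs[reg] += value; same operand layout as g_load_imm. */
void g_add_imm(const union word *ip, uint32_t *regs) {
    uint32_t reg = ip[1].imm;
    assert(reg < NUM_REGS);
    regs[reg] += ip[2].imm;
    ip[3].fn(ip + 3, regs);
}

/* Terminator: stops the chain by simply returning. */
void g_exit(const union word *ip, uint32_t *regs) { (void)ip; (void)regs; }

void run_stream(const union word *code, uint32_t *regs) {
    code[0].fn(code, regs);
}
```

Each gadget knows how many operand slots it owns, so it can advance the stream pointer past them before handing control to the next gadget.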
asbestos/gen.c:
Code Generation
gen.c - 32-bit Code Generator
The code generator translates x86 instructions to gadget sequences. From asbestos/gen.c:
Fiber Blocks
Generated code is stored in “fiber blocks”:
Translation Example
Translating mov eax, [ebx+8]:
- Decode x86 instruction
- Generate gadget sequence:
- This produces an array:
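The three steps above can be approximated with the following C sketch. It skips real decoding and hand-writes the "translation" of a register load from `[base + disp]`. The gadget names (`g_addr`, `g_load32`, `g_exit`), the `struct vm` with flat toy memory, and the emit helpers are all assumptions for illustration; iSH's actual gadgets, memory model (TLB), and generator API differ.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_EAX = 0, REG_EBX = 3, NUM_REGS = 8, MEM_SIZE = 64, MAX_CELLS = 16 };

struct vm {
    uint32_t regs[NUM_REGS];
    uint32_t addr;                    /* scratch: last computed effective address */
    uint8_t  mem[MEM_SIZE];           /* toy flat guest memory, no TLB */
};

union cell;
typedef void (*gadget_fn)(const union cell *ip, struct vm *vm);
union cell { gadget_fn fn; uint32_t imm; };

/* addr = regs[base] + disp; operands: base register index, displacement. */
void g_addr(const union cell *ip, struct vm *vm) {
    vm->addr = vm->regs[ip[1].imm] + ip[2].imm;
    ip[3].fn(ip + 3, vm);
}

/* regs[dst] = little-endian 32-bit load from mem[addr]; operand: dst index. */
void g_load32(const union cell *ip, struct vm *vm) {
    assert(vm->addr <= MEM_SIZE - 4);
    const uint8_t *p = &vm->mem[vm->addr];
    vm->regs[ip[1].imm] = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
                          (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    ip[2].fn(ip + 2, vm);
}

void g_exit(const union cell *ip, struct vm *vm) { (void)ip; (void)vm; }

/* A fixed-size stand-in for a block of generated gadget code. */
struct block { union cell cells[MAX_CELLS]; int count; };

void emit_fn(struct block *b, gadget_fn fn) {
    assert(b->count < MAX_CELLS);
    b->cells[b->count++].fn = fn;
}

void emit_imm(struct block *b, uint32_t imm) {
    assert(b->count < MAX_CELLS);
    b->cells[b->count++].imm = imm;
}

/* Hand-written translation of "mov dst, [base+disp]" into three gadgets. */
void gen_mov_reg_mem(struct block *b, uint32_t dst, uint32_t base, uint32_t disp) {
    emit_fn(b, g_addr);   emit_imm(b, base); emit_imm(b, disp);
    emit_fn(b, g_load32); emit_imm(b, dst);
    emit_fn(b, g_exit);
}

void run_block(const struct block *b, struct vm *vm) {
    b->cells[0].fn(b->cells, vm);
}
```

In this sketch, `mov eax, [ebx+8]` becomes the six-entry array `g_addr, 3, 8, g_load32, 0, g_exit`: one instruction is split into an address-calculation gadget and a load gadget, each followed by its own operands.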
Assembly Implementation
Why Assembly?
From the README:
Unfortunately, I made the decision to write nearly all of the gadgets in assembly language. This was probably a good decision with regards to performance (though I’ll never know for sure), but a horrible decision with regards to readability, maintainability, and my sanity.
Reasons for assembly:
- Precise control over tailcall generation
- Direct register allocation
- Avoid compiler optimization interference
- Consistent calling convention across gadgets
The Cost
The amount of bullshit I’ve had to put up with from the compiler/assembler/linker is insane. […] I’ve had to ignore best practices in code structure and naming. You’ll find macros and variables with such descriptive names as ss and s and a. Assembler macros nested beyond belief. And to top it off, there are almost no comments.
Entry Point
From asbestos/gadgets-aarch64/entry.S:
- Save host (ARM64) registers
- Set up interpreter state (_ip, _cpu, _tlb)
- Load x86 registers into ARM64 registers
- Jump to first gadget
- Gadgets execute until fiber_ret
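The entry sequence above is assembly-specific, but its shape can be loosely mimicked in C. In this sketch a local `struct frame` stands in for the host registers that hold guest state, and the C calling convention does the saving and restoring of host registers. The names `cpu_state`, `frame`, `toy_enter`, `g_add_eax_ebx`, and `g_toy_ret` are hypothetical and do not correspond to symbols in entry.S.

```c
#include <assert.h>
#include <stdint.h>

/* Guest register state as it lives in memory between runs. */
struct cpu_state { uint32_t eax, ebx; };

/* Stand-in for host registers: guest values cached while gadgets execute. */
struct frame {
    uint32_t eax, ebx;
    struct cpu_state *cpu;            /* where to write the values back */
};

struct cell;
typedef void (*gadget_fn)(const struct cell *ip, struct frame *f);
struct cell { gadget_fn fn; };

/* Works only on the cached copies, never on cpu_state directly. */
void g_add_eax_ebx(const struct cell *ip, struct frame *f) {
    f->eax += f->ebx;
    ip[1].fn(ip + 1, f);
}

/* Block terminator: write the cached registers back and end the chain. */
void g_toy_ret(const struct cell *ip, struct frame *f) {
    (void)ip;
    f->cpu->eax = f->eax;
    f->cpu->ebx = f->ebx;
}

/* Entry: load guest registers into the frame, then jump to the first gadget. */
void toy_enter(const struct cell *code, struct cpu_state *cpu) {
    assert(code != 0 && cpu != 0);
    struct frame f = { cpu->eax, cpu->ebx, cpu };
    code[0].fn(code, &f);
}
```

The point of the pattern is that guest registers are loaded once on entry and stored once on exit, rather than being read from memory by every gadget.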
Register Allocation
On ARM64, each x86 register is mapped to a dedicated ARM64 register.
Performance Characteristics
Speedup Metrics
From the README:
The result is a speedup of roughly 3-5x compared to emulation using a simpler switch dispatch.
Factors contributing to speedup:
- No dispatch overhead: Each gadget jumps directly to the next
- Branch prediction: CPUs predict direct calls better than switch statements
- Register allocation: x86 registers stay in host registers
- Reduced memory access: Instruction decoding happens once during translation
Bottlenecks
- Memory operations: TLB lookups for guest memory access
- Block transitions: Jumping between fiber blocks has overhead
- Code generation: First execution of a block requires translation
- Cache: Large working sets can evict fiber blocks
Challenges and Trade-offs
Maintainability
Problems:
- Assembly code is hard to read
- Macros are heavily nested
- Variable names are cryptic (ss, s, a)
- Few comments
- Platform-specific (separate gadgets for ARM64, x86_64)
Benefits:
- Maximum performance
- Fine-grained control
- Consistent behavior
Debugging Difficulty
Debugging gadget code is challenging:
- Stack traces are confusing (no returns, only jumps)
- Register state is split between x86 and host
- Errors in gadgets can crash the entire emulator
Portability
Each host architecture needs its own gadget implementation:
- gadgets-aarch64/ - ARM64 (iOS, Apple Silicon)
- gadgets-x86_64/ - x86_64 (Linux, older Macs)
- gadgets-aarch64-64/ - ARM64 for 64-bit guest
Code Structure
Asbestos Module
The interpreter is called “asbestos” (a play on “fiber”).
Block Lifecycle
- Creation: x86 instruction decoded, gadget sequence generated (gen.c)
- Execution: fiber_enter jumps to gadget array, gadgets execute
- Caching: Block stored in hash table by x86 address
- Invalidation: When guest memory changes, affected blocks marked as jetsam
- Freeing: Jetsam blocks freed at next safe point
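The caching, invalidation, and freeing stages of this lifecycle can be sketched as a small hash table plus a deferred-free list. This is an illustrative model only: `struct cache`, `cache_insert`, `cache_lookup`, `cache_invalidate`, and `cache_free_jetsam` are hypothetical names, the bucket count is arbitrary, and the real bookkeeping in asbestos is likely more involved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { CACHE_BUCKETS = 64 };

struct block {
    uint32_t addr;                    /* guest address of the first instruction */
    uint32_t end_addr;                /* one past the last guest byte covered */
    struct block *next;               /* bucket chain link, or jetsam list link */
};

struct cache {
    struct block *buckets[CACHE_BUCKETS];
    struct block *jetsam;             /* invalidated blocks awaiting a safe point */
};

static unsigned bucket_of(uint32_t addr) { return addr % CACHE_BUCKETS; }

/* Caching: find a translated block by its guest start address. */
struct block *cache_lookup(struct cache *c, uint32_t addr) {
    for (struct block *b = c->buckets[bucket_of(addr)]; b != NULL; b = b->next)
        if (b->addr == addr)
            return b;
    return NULL;
}

/* Creation: register a new block covering [addr, end_addr). */
struct block *cache_insert(struct cache *c, uint32_t addr, uint32_t end_addr) {
    struct block *b = malloc(sizeof *b);
    assert(b != NULL);
    b->addr = addr;
    b->end_addr = end_addr;
    b->next = c->buckets[bucket_of(addr)];
    c->buckets[bucket_of(addr)] = b;
    return b;
}

/* Invalidation: unlink blocks overlapping [start, end) and park them as jetsam.
 * They are not freed yet, because one of them might still be executing. */
void cache_invalidate(struct cache *c, uint32_t start, uint32_t end) {
    for (unsigned i = 0; i < CACHE_BUCKETS; i++) {
        struct block **link = &c->buckets[i];
        while (*link != NULL) {
            struct block *b = *link;
            if (b->addr < end && start < b->end_addr) {
                *link = b->next;      /* remove from the bucket chain */
                b->next = c->jetsam;  /* push onto the jetsam list */
                c->jetsam = b;
            } else {
                link = &b->next;
            }
        }
    }
}

/* Freeing: call at a safe point; returns how many blocks were released. */
int cache_free_jetsam(struct cache *c) {
    int freed = 0;
    while (c->jetsam != NULL) {
        struct block *b = c->jetsam;
        c->jetsam = b->next;
        free(b);
        freed++;
    }
    return freed;
}
```

Splitting invalidation from freeing is the key design choice: a block is made unreachable immediately, but its memory is reclaimed only once no gadget can still be running inside it.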
Example: Complete Instruction Flow
Let’s trace mov eax, 42 (opcode: B8 2A 00 00 00):
1. First Execution (Translation)
2. Execution
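As a hedged approximation of the translation and execution steps for this instruction, the C sketch below decodes the `B8+r imm32` encoding, emits a load-immediate gadget with its operands, and runs the result. `translate_mov_imm`, `g_load_imm`, `g_end`, and `union word` are hypothetical names for this example; the real gadget and generator names in iSH are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_REGS = 8, MAX_WORDS = 8 };

union word;
typedef void (*gadget_fn)(const union word *ip, uint32_t *regs);
union word { gadget_fn fn; uint32_t imm; };

/* regs[reg] = value; operands: register index, then the 32-bit immediate. */
void g_load_imm(const union word *ip, uint32_t *regs) {
    regs[ip[1].imm] = ip[2].imm;
    ip[3].fn(ip + 3, regs);
}

/* Terminator for the toy block. */
void g_end(const union word *ip, uint32_t *regs) { (void)ip; (void)regs; }

/* Translate one "mov r32, imm32" (opcodes B8..BF, little-endian immediate).
 * Returns the number of words written to out, or 0 if the opcode is not handled. */
size_t translate_mov_imm(const uint8_t *x86, union word *out) {
    uint8_t opcode = x86[0];
    if (opcode < 0xB8 || opcode > 0xBF)
        return 0;
    uint32_t reg = opcode - 0xB8u;    /* low 3 bits select the register; 0 is EAX */
    uint32_t imm = (uint32_t)x86[1] | (uint32_t)x86[2] << 8 |
                   (uint32_t)x86[3] << 16 | (uint32_t)x86[4] << 24;
    assert(reg < NUM_REGS);
    out[0].fn = g_load_imm;
    out[1].imm = reg;
    out[2].imm = imm;
    out[3].fn = g_end;
    return 4;
}
```

Decoding happens once, in `translate_mov_imm`; every later execution of the block only runs `g_load_imm` with the operands already extracted.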
3. Result
EAX register (w0) now contains 0x2A, and execution continues with the next gadget.
Gadget Categories in Detail
Entry/Exit (entry.S)
- fiber_enter - Enter gadget execution
- fiber_exit - Return to native code
- fiber_ret - Normal block termination
- interrupt - Handle exceptions
Memory (memory.S)
- push/pop - Stack operations
- addr_* - Address calculation (base, index, scale, displacement)
- seg_gs - Segment override
- TLB miss handlers
Control Flow (control.S)
- jmp, call, ret - Control transfer
- Conditional jumps (Jcc)
- Loops
Math (math.S)
- add, sub, mul, div
- Logical operations (and, or, xor)
- Comparisons (cmp, test)
- Flag computation
Bits (bits.S)
- Shifts (shl, shr, sar)
- Rotates (rol, ror)
- Bit tests (bt, bts, btr)
String (string.S)
- movs, stos, lods, scas, cmps
- rep prefix handling
Misc (misc.S)
- cpuid
- rdtsc
- System calls
- Everything else
Advanced Topics
Block Chaining
When one block ends with a jump to another block, they can be “chained” - the jump gadget is patched to jump directly to the target block instead of exiting and looking up the target.
Invalidation
When guest code modifies its own memory (self-modifying code), affected blocks must be invalidated.
Interrupts and Exceptions
Gadgets can trigger interrupts.
See Also
- Debugging - Debug gadget execution
- 64-bit Port - 64-bit gadget implementation
- Logging - Trace gadget execution with the instr channel