When an ANR surfaces in Android Vitals, a bug report, or via ApplicationExitInfo, a structured analysis workflow helps you quickly determine whether the root cause is in your app code or in the system (OEM freeze policy, Binder thread pool pressure, memory overload). This page walks through the complete triage process.

Six-step ANR triage process

1. Locate the exact ANR time

Search the EventLog for am_anr — this timestamp is the most accurate marker of when the ANR was triggered.
am_anr: [0, 2169, com.example.app, 820526660, Input dispatching timed out ...]
The am_anr timestamp is more accurate than the "ANR in" log line, which can appear slightly later as the system writes diagnostics. Always anchor your investigation to am_anr.
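
As a rough illustration, the bracketed am_anr payload can be split into fields with a small parser. This is a sketch only: the AmAnrEvent type and parseAmAnr function are hypothetical names, and the field order (user, pid, package, flags, reason) is an assumption based on the sample line above.

```kotlin
// Hypothetical holder for the fields of one am_anr EventLog entry.
data class AmAnrEvent(
    val userId: Int,
    val pid: Int,
    val packageName: String,
    val flags: Long,
    val reason: String,
)

// Assumed payload layout: [user, pid, package, flags, reason]
val AM_ANR = Regex("""am_anr\s*:\s*\[(\d+),\s*(\d+),\s*([^,\]]+),\s*(-?\d+),\s*(.*)\]""")

// Returns null when the line is not an am_anr entry.
fun parseAmAnr(line: String): AmAnrEvent? {
    val match = AM_ANR.find(line) ?: return null
    val (user, pid, pkg, flags, reason) = match.destructured
    return AmAnrEvent(user.toInt(), pid.toInt(), pkg.trim(), flags.toLong(), reason.trim())
}
```

The pid and package name from this entry are what you search for in the "ANR in" block and in the trace file in the following steps.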

2. Read the ANR reason from MainLog / SystemLog

Search for "ANR in" to find:
  • Time of ANR
  • Process name and PID
  • Reason string (e.g., Input dispatching timed out, Broadcast of Intent)
  • CPU load averages (1/5/15 min)
  • CPU usage per process, before and during the ANR
  • Memory pressure (PSI)

3. Assess system state at ANR time

Before reading thread stacks, rule out system-wide causes:
  • Was CPU load more than 3× the device core count? (system-wide overload)
  • Was system_server or surfaceflinger consuming >50%?
  • Were there lowmemorykiller events? Was kswapd0 running continuously?
  • Was iowait high? (check major page fault counts in the CPU log)
  • Were other unrelated apps being killed (am_kill, am_proc_died)?
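
The checklist above can be encoded as a simple function once the values have been read from the logs. The sketch below is illustrative: SystemSnapshot and systemWarnings are hypothetical names, the 3× and 50% thresholds come from the list above, and the remaining checks are reduced to "any event seen".

```kotlin
// Hypothetical snapshot of the values gathered from the logs in this step.
data class SystemSnapshot(
    val loadAvg1Min: Double,
    val cpuCores: Int,
    val systemServerCpuPercent: Double,
    val surfaceFlingerCpuPercent: Double,
    val lowMemoryKillerEvents: Int,
    val unrelatedAppsKilled: Int,
)

// Returns the system-wide warning signs present in the snapshot.
fun systemWarnings(s: SystemSnapshot): List<String> = buildList {
    if (s.loadAvg1Min > 3.0 * s.cpuCores) add("load above 3x core count")
    if (s.systemServerCpuPercent > 50.0) add("system_server above 50% CPU")
    if (s.surfaceFlingerCpuPercent > 50.0) add("surfaceflinger above 50% CPU")
    if (s.lowMemoryKillerEvents > 0) add("lowmemorykiller active")
    if (s.unrelatedAppsKilled > 0) add("unrelated apps killed")
}
```

An empty result does not prove the ANR is an app bug; it only means none of these specific system-wide signals were found.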

4. Read the thread stacks in the ANR trace file

Open the ANR trace file and find the "main" thread entry:
  • Is the main thread Blocked, TimedWaiting, or Native (normal idle)?
  • If Blocked: which lock object? Which thread holds it?
  • If Native: is it a Binder call? Which system service?
Cross-reference with worker threads — find the thread that holds the lock the main thread is waiting for and read its stack.
If the main thread shows Native state with nativePollOnce in the stack (idle Looper), the app was not executing any work when the ANR fired. This is a strong signal of a system-induced ANR, not an app bug.
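
A first pass over the trace can be automated. The following is a minimal sketch, assuming the thread header ends with the state name and that thread blocks are separated by blank lines; MainThreadVerdict and classifyMainThread are hypothetical names.

```kotlin
enum class MainThreadVerdict { IDLE_LOOPER, BLOCKED_ON_LOCK, BUSY_OR_OTHER, NOT_FOUND }

// Classifies the "main" thread entry of an ANR trace dump.
fun classifyMainThread(trace: String): MainThreadVerdict {
    val lines = trace.lines()
    val start = lines.indexOfFirst { it.startsWith("\"main\" ") }
    if (start < 0) return MainThreadVerdict.NOT_FOUND
    // Assumption: a thread block ends at the next blank line (or end of input).
    val block = lines.drop(start).takeWhile { it.isNotBlank() }
    // Assumption: the last token of the header line is the thread state.
    val state = block.first().trim().substringAfterLast(' ')
    return when {
        state == "Blocked" -> MainThreadVerdict.BLOCKED_ON_LOCK
        state == "Native" && block.any { "nativePollOnce" in it } -> MainThreadVerdict.IDLE_LOOPER
        else -> MainThreadVerdict.BUSY_OR_OTHER
    }
}
```

BUSY_OR_OTHER covers everything that still needs a manual read, including a Native main thread that is inside a Binder call rather than idle.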

5. Collect the context window (5 seconds before am_anr)

Combine EventLog, MainLog, and SystemLog into one file. Search the 5-second window before the am_anr timestamp for:
  • Slow operation, Slow dispatch, Slow delivery
  • dvm_lock_sample
  • App launches, screen on/off transitions
  • OEM freeze/unfreeze log lines (HansManager, RefrigeratorManager)
  • Lock release events that would explain why the main thread was finally unblocked
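
Extracting the window from a merged log can be sketched as below. This assumes each line starts with a logcat "threadtime" style timestamp (MM-dd HH:mm:ss.SSS), which carries no year; the function names and the fixed placeholder year are illustrative only, and logs that cross a year boundary are not handled.

```kotlin
import java.time.Duration
import java.time.LocalDateTime
import java.time.format.DateTimeFormatterBuilder
import java.time.temporal.ChronoField

// Assumed timestamp prefix: "MM-dd HH:mm:ss.SSS" (18 characters, no year).
// A fixed leap year is supplied only so the value can be parsed.
val LOGCAT_TIME = DateTimeFormatterBuilder()
    .appendPattern("MM-dd HH:mm:ss.SSS")
    .parseDefaulting(ChronoField.YEAR, 2000)
    .toFormatter()

// Returns the timestamp at the start of a log line, or null if there is none.
fun logTime(line: String): LocalDateTime? =
    runCatching { LocalDateTime.parse(line.take(18), LOGCAT_TIME) }.getOrNull()

// Search terms taken from the list above.
val CONTEXT_MARKERS = listOf(
    "Slow operation", "Slow dispatch", "Slow delivery",
    "dvm_lock_sample", "HansManager", "RefrigeratorManager",
)

// Lines from `window` before the first am_anr entry up to and including it.
fun contextWindow(lines: List<String>, window: Duration = Duration.ofSeconds(5)): List<String> {
    val anrTime = lines.firstOrNull { "am_anr" in it }?.let(::logTime) ?: return emptyList()
    val from = anrTime.minus(window)
    return lines.filter { line ->
        val t = logTime(line)
        t != null && !t.isBefore(from) && !t.isAfter(anrTime)
    }
}

// Narrows the window to lines containing one of the known markers.
fun markedLines(windowLines: List<String>): List<String> =
    windowLines.filter { line -> CONTEXT_MARKERS.any { it in line } }
```

Reading the whole window first and the marked lines second tends to work better than grepping for markers alone, because app launches and screen transitions have no single keyword.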

6. If inconclusive: add tracing or reproduce locally

When the thread stacks and CPU log do not point to a clear root cause:
  • Enable StrictMode with penaltyDeath() in a debug build to hard-fail on the first violation
  • Capture a Perfetto trace to correlate timeline events with thread activity
  • Use ApplicationExitInfo.getTraceInputStream() (traceInputStream in Kotlin, API 30+) to read the ANR trace of a REASON_ANR exit and upload it to your observability backend

Reading thread state in traces.txt / anr_* files

Pulling the trace files

# Pull all ANR traces from device (requires root or userdebug build)
adb pull /data/anr/ ./anr_traces/

# On older devices, single aggregated file:
#   /data/anr/traces.txt
# On newer devices (Android 11+), per-event files:
#   /data/anr/anr_YYYY-MM-DD-HH-MM-SS-mmm_<pid>

# Full bugreport (captures traces, EventLog, MainLog, CPU info — preferred for production):
adb bugreport bugreport.zip

Normal idle main thread

A healthy main thread waiting for work looks like this. The nativePollOnce call means the Looper is blocked in epoll_wait, which is the correct idle state — not a problem:
"main" prio=5 tid=1 Native
  | state=S  ...
  native: #00 ... libc.so (__epoll_pwait+8)
  native: #01 ... libutils.so (android::Looper::pollInner+184)
  at android.os.MessageQueue.nativePollOnce(Native method)
  at android.os.Looper.loop(Looper.java:198)
  at android.app.ActivityThread.main(ActivityThread.java:8142)

Blocked main thread (app bug)

When the main thread is waiting for a monitor lock held by another thread, the trace shows Blocked state with an explicit lock reference:
"main" prio=5 tid=1 Blocked
  | state=S  ...
  at com.example.SomeClass.doWork(SomeClass.java:98)
  - waiting to lock <0x0e57c91f> (a java.lang.Object) held by thread 34
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loop(Looper.java:254)
  at android.app.ActivityThread.main(ActivityThread.java:8142)
Note the held by thread 34 reference. Find thread 34 in the same trace file to read its stack and determine what it is doing while holding the lock.
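
Following the "held by thread" reference can be scripted. The sketch below assumes the line format shown above and blank-line-separated thread blocks; LockWait, findLockWait, and threadBlockByTid are hypothetical helper names.

```kotlin
// The lock a thread is waiting for and the tid of the thread holding it.
data class LockWait(val lockId: String, val holderTid: Int)

val WAITING_TO_LOCK = Regex("""waiting to lock <(0x[0-9a-f]+)>.*held by thread (\d+)""")

// Finds the first "waiting to lock ... held by thread N" line in a thread block.
fun findLockWait(threadBlock: List<String>): LockWait? =
    threadBlock.firstNotNullOfOrNull { line ->
        WAITING_TO_LOCK.find(line)?.let { LockWait(it.groupValues[1], it.groupValues[2].toInt()) }
    }

// Returns the lines of the thread block whose header contains "tid=<tid>".
fun threadBlockByTid(trace: String, tid: Int): List<String> {
    val lines = trace.lines()
    val header = Regex("""^".*" .*\btid=${tid}\b""")
    val start = lines.indexOfFirst { header.containsMatchIn(it) }
    return if (start < 0) emptyList() else lines.drop(start).takeWhile { it.isNotBlank() }
}
```

Calling findLockWait on the main thread block and then threadBlockByTid with the returned holderTid yields the stack to read next. Repeating the lookup on that block shows whether the holder is itself waiting on another lock.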

Thread state reference

| State in trace | Java Thread.State | Meaning |
| --- | --- | --- |
| Native | RUNNABLE | Executing JNI or waiting in Looper epoll (normal idle) |
| Runnable | RUNNABLE | Actively executing Java code |
| Blocked | BLOCKED | Waiting for a monitor lock held by another thread |
| Waiting | WAITING | In Object.wait() with no timeout |
| TimedWaiting | TIMED_WAITING | In Object.wait(timeout) or Thread.sleep() |
| kWaitingForGcToComplete | WAITING | GC pause, indicates a GC-pressure ANR |

CPU usage log interpretation

The "ANR in" log section prints CPU data over two time windows. Both windows are needed to understand whether the ANR was sudden or building over time:
// Window 1: ~13s window covering the ANR
CPU usage from 0ms to 13135ms later:
  191%  1948/system_server: 72% user + 119% kernel / faults: 78816 minor 9 major
  30%   5991/com.example.app: 23% user + 6.4% kernel / faults: 118172 minor 2 major

// Window 2: ~1s window (snapshot closer to ANR)
CPU usage from 246ms to 1271ms later:
  290%  1948/system_server: 114% user + 176% kernel / faults: 9353 minor
  44%   5991/com.example.app: 37% user + 7.4% kernel
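
To compare the two windows programmatically, each per-process line can be parsed into a record. This is a sketch based only on the line shapes shown above; ProcessCpu and parseCpuLine are hypothetical names, and missing fault counts are treated as zero.

```kotlin
// One per-process entry from the "CPU usage from ..." section.
data class ProcessCpu(
    val totalPercent: Double,
    val pid: Int,
    val name: String,
    val userPercent: Double,
    val kernelPercent: Double,
    val minorFaults: Long,
    val majorFaults: Long,
)

// "<total>% <pid>/<name>: <u>% user + <k>% kernel [/ faults: [N minor] [M major]]"
val CPU_LINE = Regex(
    """^\s*([\d.]+)%\s+(\d+)/([^:]+):\s*([\d.]+)% user \+ ([\d.]+)% kernel""" +
        """(?: / faults:(?: (\d+) minor)?(?: (\d+) major)?)?"""
)

// Returns null for lines that are not per-process CPU entries.
fun parseCpuLine(line: String): ProcessCpu? {
    val g = CPU_LINE.find(line)?.groupValues ?: return null
    return ProcessCpu(
        totalPercent = g[1].toDouble(),
        pid = g[2].toInt(),
        name = g[3],
        userPercent = g[4].toDouble(),
        kernelPercent = g[5].toDouble(),
        // Unmatched optional groups are empty strings, which map to 0 here.
        minorFaults = g[6].toLongOrNull() ?: 0L,
        majorFaults = g[7].toLongOrNull() ?: 0L,
    )
}
```

With both windows parsed, a process whose totalPercent rises sharply in the shorter window is a candidate for what was running as the ANR fired.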

Load average line

Load: 15.29 / 5.19 / 1.87   ← 1-min / 5-min / 15-min average load
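Values significantly above the device’s CPU core count indicate system-wide overload. On an 8-core device, a 1-minute load of 15.29 is about 1.9× the core count. That is below the 3× threshold used in step 3, but the climb from a 15-minute average of 1.87 shows the load built up shortly before the ANR, so a system-induced ANR is plausible. Low load averages with a high-CPU app process point to an app-side root cause.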
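
The comparison against the core count is simple arithmetic. The sketch below assumes the "Load: a / b / c" format shown above; LoadAverage, parseLoad, and loadPerCore are hypothetical names, and the core count must be supplied by you (for example from the device spec).

```kotlin
data class LoadAverage(val oneMin: Double, val fiveMin: Double, val fifteenMin: Double)

val LOAD_LINE = Regex("""Load:\s*([\d.]+)\s*/\s*([\d.]+)\s*/\s*([\d.]+)""")

// Parses the 1/5/15-minute averages; returns null if the line has no Load entry.
fun parseLoad(line: String): LoadAverage? {
    val (one, five, fifteen) = LOAD_LINE.find(line)?.destructured ?: return null
    return LoadAverage(one.toDouble(), five.toDouble(), fifteen.toDouble())
}

// Ratio of the 1-minute load to the number of CPU cores.
fun loadPerCore(load: LoadAverage, cpuCores: Int): Double = load.oneMin / cpuCores
```

For the sample line and 8 cores the ratio is 15.29 / 8, roughly 1.9. Comparing oneMin against fifteenMin also shows whether the load is a recent spike or a sustained condition.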

Page faults

Each process entry reports minor and major page fault counts:
  • Minor faults — the page was already in memory cache. This is normal memory access activity and not a concern.
  • Major faults — the page had to be read from disk. A high major count means the process is triggering disk I/O and is a potential ANR contributor. High major counts on system_server or kswapd0 indicate memory pressure driving the ANR.

Distinguishing app bug vs system-induced ANR

Strong signals the ANR is an application bug

  • Main thread Blocked on a lock your own code created
  • Main thread inside a synchronized {} block that is doing I/O or waiting for a sub-thread result
  • Main thread stack shows your code in a Binder call to a service you host
  • CPU usage of your process is very high before the ANR (tight loop, heavy computation)
  • No system-wide CPU overload or memory pressure is visible in the log

Strong signals the ANR is system-induced

  • Main thread Native (idle Looper) — a message dispatch simply never arrived from the system
  • OEM freeze/unfreeze log lines appear around the ANR time (e.g., HansManager, RefrigeratorManager)
  • system_server or surfaceflinger CPU usage is extremely high
  • High major page faults on system_server, or kswapd0 continuously active
  • Multiple unrelated apps are ANR-ing in the same time window
  • Load average is well above the device core count for a sustained period
If the ANR is system-induced, the fix belongs on the OEM or system side. Your mitigation is to keep all lifecycle callbacks as lightweight as possible so the app can recover quickly once system load subsides.

Real-world pattern: lock inversion deadlock

One of the most common production ANR patterns is a lock inversion deadlock between the main thread and a background worker. The main thread calls into a utility (logging, image cache, analytics) that holds Lock A and tries to acquire Lock B, while a background thread holds Lock B and tries to acquire Lock A. The ANR trace signature for this pattern:
"main" prio=5 tid=1 Blocked
  at com.example.EventLogger.logEvent(EventLogger.java:98)
  - waiting to lock <0x0d3fbd00> held by thread 34

"EventLogger" prio=5 tid=34 TimedWaiting
  at java.lang.Object.wait(...)
  - waiting on <0x00fc7065> (AtomicInteger)
  - locked <0x0d3fbd00> (a EventLogger)  ← holds the lock main thread wants
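Thread 34 holds the EventLogger monitor that the main thread needs, and is itself parked in Object.wait() on an AtomicInteger inside DatabaseHelper. When the notification it is waiting for depends on work that the main thread has to complete, the dependency is circular and cannot resolve on its own.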
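// ✅ Prevention: replace nested synchronized blocks with a single-threaded dispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext

// At most one task runs on this dispatcher view at a time. Older kotlinx.coroutines
// versions may require @OptIn(ExperimentalCoroutinesApi::class) for limitedParallelism.
val logDispatcher = Dispatchers.IO.limitedParallelism(1)
// Event and db stand in for your own event type and database wrapper.
suspend fun logEvent(event: Event) = withContext(logDispatcher) {
    // All log writes are serialized here, so no explicit lock is needed
    db.insert(event)
}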
The general rules for avoiding lock inversion:
  1. Never hold a lock while calling into a system or third-party API
  2. Use a limitedParallelism(1) dispatcher instead of nested synchronized {} blocks
  3. In crash and log handlers, avoid any synchronized calls that may contend with worker threads
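
Rule 1 is often implemented by copying shared state while holding the lock and calling out only after releasing it. The following is a minimal sketch of that pattern using a hypothetical ListenerRegistry class; it is not taken from any specific library.

```kotlin
// Hypothetical registry that never invokes foreign code while holding its lock.
class ListenerRegistry {
    private val lock = Any()
    private val listeners = mutableListOf<(String) -> Unit>()

    fun add(listener: (String) -> Unit) {
        synchronized(lock) { listeners.add(listener) }
    }

    fun dispatch(event: String) {
        // Copy under the lock, then release it before calling the listeners.
        val snapshot = synchronized(lock) { listeners.toList() }
        // Listeners run without the lock held, so they may block or take
        // other locks without stalling threads that call add() or dispatch().
        snapshot.forEach { it(event) }
    }
}
```

The trade-off is that a listener added during a dispatch only sees later events, which is usually acceptable for logging and analytics callbacks.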
