Overview
This guide provides step-by-step emergency procedures for handling critical situations in the UMA CTF Adapter. These procedures should be followed by admins when automatic resolution mechanisms fail or when immediate intervention is required.
Emergency Response Framework
Assessment Phase
- Identify the issue
  - Question resolution failure
  - Optimistic Oracle manipulation
  - Technical malfunction
  - External event impact
- Determine severity
  - Critical: Immediate action required (pause)
  - High: Manual resolution likely needed (flag)
  - Medium: Investigation required (pause temporarily)
  - Low: Monitor and document
- Document the situation
  - Record questionID
  - Note current state and timestamp
  - Document evidence and reasoning
  - Communicate with stakeholders
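The severity levels above map onto a simple lookup. The following is a minimal sketch, not part of the adapter: the `Severity` enum and the `recommended_action` helper are hypothetical names, and the returned strings are labels for the operator rather than contract calls.

```python
from enum import Enum


class Severity(Enum):
    """Severity levels as listed in the assessment phase."""
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Mirrors the severity list in this guide. These are operator-facing labels,
# not function names on the adapter contract.
RECOMMENDED_ACTION = {
    Severity.CRITICAL: "pause",
    Severity.HIGH: "flag",
    Severity.MEDIUM: "pause temporarily",
    Severity.LOW: "monitor and document",
}


def recommended_action(severity: Severity) -> str:
    """Return the first response this guide suggests for a severity level."""
    return RECOMMENDED_ACTION[severity]
```

For example, `recommended_action(Severity.HIGH)` returns `"flag"`, matching the "High" entry above.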
Procedure 1: Emergency Pause
When to Use
- Suspected Optimistic Oracle manipulation
- Smart contract vulnerability discovered
- Critical bug in resolution logic
- Need time for investigation without risk of incorrect resolution
Steps
1. Immediate Pause
- Review question data: adapter.getQuestion(questionID)
- Check if initialized: adapter.isInitialized(questionID)
- Check if flagged: adapter.isFlagged(questionID)
- Review Optimistic Oracle state
- Consult with technical team
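The read-only checks named above can be gathered into a single snapshot for the incident record. This is a sketch under stated assumptions: `adapter` is any object exposing `getQuestion`, `isInitialized`, and `isFlagged` (how it is bound to the deployed contract is out of scope here), and `StubAdapter` with its return values is a hypothetical stand-in used only to make the example runnable.

```python
from typing import Any, Dict


def snapshot_question(adapter: Any, question_id: bytes) -> Dict[str, Any]:
    """Collect the read-only views named in the investigation checks."""
    return {
        "question": adapter.getQuestion(question_id),
        "initialized": adapter.isInitialized(question_id),
        "flagged": adapter.isFlagged(question_id),
    }


class StubAdapter:
    """Hypothetical stand-in for a contract binding (demo only)."""

    def getQuestion(self, question_id: bytes) -> dict:
        # Invented fields for illustration; the real struct may differ.
        return {"paused": True, "resolved": False}

    def isInitialized(self, question_id: bytes) -> bool:
        return True

    def isFlagged(self, question_id: bytes) -> bool:
        return False


# Placeholder 32-byte question ID, not a real question.
snapshot = snapshot_question(StubAdapter(), b"\x01" * 32)
```

Storing the snapshot alongside a timestamp gives the audit trail a consistent starting point for the investigation.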
Timing Considerations
- Pause immediately upon detection
- Complete investigation within 24 hours
- Communicate timeline to stakeholders
- Unpause or flag based on findings
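The 24-hour investigation window can be tracked with a small helper. This is an illustrative sketch only: the function names are hypothetical, and the window is taken from the timing guidance above rather than from anything the contract enforces.

```python
from datetime import datetime, timedelta, timezone

# From the timing guidance in this guide; not enforced on-chain.
INVESTIGATION_WINDOW = timedelta(hours=24)


def investigation_deadline(paused_at: datetime) -> datetime:
    """Latest time by which the investigation should conclude."""
    return paused_at + INVESTIGATION_WINDOW


def is_overdue(paused_at: datetime, now: datetime) -> bool:
    """True once the investigation has run past the 24-hour window."""
    return now > investigation_deadline(paused_at)


# Example with a fixed, made-up pause time.
paused_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
deadline = investigation_deadline(paused_at)
```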
Procedure 2: Manual Resolution Workflow
When to Use
- Optimistic Oracle returns incorrect or disputed result
- Ambiguous question wording requiring interpretation
- External events make automatic resolution impossible
- Community governance decision overrides automatic resolution
Prerequisites
- Admin access verified
- Issue investigation completed
- Community/stakeholders consulted (if applicable)
- Correct payout determined
Steps
1. Flag the Question
- Announce flagging to community
- Review evidence and reasoning
- Consider alternative perspectives
- Monitor for new information
- Check if unflagging is appropriate
- Document the resolution and rationale
- Announce outcome to community
- Review process for improvements
- Update procedures if needed
Manual Resolution Timeline
Procedure 3: Reset Recovery
When to Use
- priceDisputed callback reverts and fails to reset automatically
- Optimistic Oracle request enters invalid state
- Need to resubmit price request after technical failure
Symptoms
- Question shows reset = true and refund = true but no new price request
- Question stuck after dispute
- Optimistic Oracle shows failed transaction for callback
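The first symptom above can be expressed as a simple predicate for monitoring scripts. This is a minimal sketch: `QuestionView` is a hypothetical, trimmed-down record, and `has_open_price_request` is assumed to be determined separately by inspecting the Optimistic Oracle.

```python
from dataclasses import dataclass


@dataclass
class QuestionView:
    """Hypothetical subset of a question's state used for this check."""
    reset: bool
    refund: bool
    has_open_price_request: bool  # assumed to come from an Optimistic Oracle lookup


def looks_stuck_after_dispute(q: QuestionView) -> bool:
    """Match the symptom: reset and refund are set, but no new price request exists."""
    return q.reset and q.refund and not q.has_open_price_request
```

A positive result is only a hint that Procedure 3 may apply; the failed callback transaction should still be confirmed by hand.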
Steps
1. Verify Reset is Needed
- Check Optimistic Oracle for new price request
- Monitor for proposal
- Verify normal resolution can proceed
- Admin who called reset() paid for new price request
- Original reward was refunded to question creator
- Consider reimbursement process if applicable
Procedure 4: Pause/Unpause Operations
Pause Use Cases
Temporary Investigation
Unpause Considerations
Before unpausing, verify:
- Issue Resolved: Problem that caused pause is fixed
- Price Available: Check if OO has price ready
- No New Issues: No additional concerns discovered
- Stakeholder Agreement: Consensus that automatic resolution is safe
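The unpause checklist above can be encoded so that any unmet item is reported by name. This is a sketch with hypothetical names (`UnpauseChecklist`, `blocking_items`); treating every item as a hard requirement is an assumption of the example, not something the contract enforces.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class UnpauseChecklist:
    """One boolean per item in the unpause considerations."""
    issue_resolved: bool
    price_available: bool
    no_new_issues: bool
    stakeholder_agreement: bool


def blocking_items(checklist: UnpauseChecklist) -> List[str]:
    """Return the names of checklist items that are not yet satisfied."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


def safe_to_unpause(checklist: UnpauseChecklist) -> bool:
    """True only when every checklist item is satisfied."""
    return not blocking_items(checklist)
```

Reporting the blocking items by name, rather than returning a bare boolean, makes it easier to communicate to stakeholders why a question remains paused.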
Decision Tree
Emergency Contact List
Internal
- Admin Multi-Sig Signers: Coordinate for urgent admin actions
- Technical Team: Smart contract engineers for investigation
- Operations Team: Coordinate response and communication
External
- UMA Protocol Team: For Optimistic Oracle issues
- Polymarket/CTF Team: For CTF integration issues (if applicable)
- Community Moderators: For stakeholder communication
Monitoring and Detection
Key Metrics to Monitor
- Question State
  - Paused questions
  - Flagged questions
  - Questions past expected resolution time
- Events
  - QuestionPaused
  - QuestionFlagged
  - QuestionReset
  - QuestionManuallyResolved
- Optimistic Oracle
  - Failed callbacks
  - Disputed proposals
  - Unusual proposal/dispute patterns
Alert Thresholds
- Critical: Callback reverts, price manipulation detected
- High: Question flagged, multiple disputes
- Medium: Unusual delay in resolution
- Low: Price proposal made
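The alert thresholds above can be sketched as a lookup from observed signals to the highest applicable level. The signal names below (`callback_revert`, `price_proposed`, and so on) are hypothetical labels chosen for this example; a real monitoring stack would define its own.

```python
from typing import Iterable, Optional

# Hypothetical signal names mapped to the alert levels in this guide.
ALERT_LEVELS = {
    "callback_revert": "critical",
    "price_manipulation_detected": "critical",
    "question_flagged": "high",
    "multiple_disputes": "high",
    "resolution_delay": "medium",
    "price_proposed": "low",
}

# Ordered from least to most severe so list position can be used for comparison.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def highest_alert(signals: Iterable[str]) -> Optional[str]:
    """Return the most severe alert level among known signals, or None."""
    levels = [ALERT_LEVELS[s] for s in signals if s in ALERT_LEVELS]
    if not levels:
        return None
    return max(levels, key=SEVERITY_ORDER.index)
```

Unknown signals are ignored here for simplicity; a stricter design might raise on them so that unmapped events are not silently dropped.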
Post-Incident Review
After any emergency procedure:
- Document Incident
  - What happened and why
  - Timeline of events
  - Actions taken
  - Outcome
- Root Cause Analysis
  - Identify underlying cause
  - Assess if preventable
  - Review detection methods
- Improve Procedures
  - Update this documentation
  - Enhance monitoring
  - Improve response time
  - Train admin team
- Communicate Results
  - Share findings with stakeholders
  - Publish transparency report (if appropriate)
  - Update community on preventive measures
Safety Considerations
The 1-Hour Safety Period
Purpose:
- Prevents hasty manual resolution
- Allows time for community review
- Enables unflagging if issue resolves
- Reduces admin error risk
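The wait imposed by the safety period can be computed off-chain before attempting manual resolution. This is a sketch under assumptions: the period is taken to start at the flagging timestamp, timestamps are Unix seconds, and resolution is treated as allowed from the exact moment the period ends. Confirm the actual boundary against the deployed contract.

```python
SAFETY_PERIOD_SECONDS = 60 * 60  # the 1-hour safety period described in this guide


def seconds_until_manual_resolution(flagged_at: int, now: int) -> int:
    """Seconds left before manual resolution is allowed (0 if already allowed).

    Assumes the safety period starts when the question is flagged.
    """
    return max(0, flagged_at + SAFETY_PERIOD_SECONDS - now)
```

For a question flagged at timestamp 1000, the helper returns 3600 at time 1000, 1800 at time 2800, and 0 from time 4600 onward.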
Admin Responsibilities
- Act deliberately: Emergency procedures are powerful; use them carefully
- Document everything: Maintain audit trail of all actions
- Communicate proactively: Keep stakeholders informed
- Follow governance: Adhere to established procedures and community decisions
- Verify carefully: Double-check addresses, payouts, and states before executing
Common Mistakes to Avoid
- Flagging without investigation: Flag only when manual resolution is truly needed
- Missing safety period: Cannot manually resolve before safety period expires
- Wrong payouts: Verify payout array carefully before resolveManually()
- Removing last admin: Always maintain multiple admins
- Unpausing prematurely: Ensure issue is fully resolved before unpause
- Not documenting: Always record reasoning and evidence
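A pre-flight check on the payout array can catch the "wrong payouts" mistake before a transaction is signed. The rule encoded below is an assumption of this sketch (a two-outcome question where each entry is 0 or 1 and at least one entry is non-zero). Verify it against the deployed contract before relying on it; `is_plausible_payout` is a hypothetical helper name.

```python
from typing import Sequence


def is_plausible_payout(payouts: Sequence[int]) -> bool:
    """Sanity-check a payout array for a binary question.

    Assumed rule: exactly two entries, each 0 or 1, and not both zero.
    """
    if len(payouts) != 2:
        return False
    if any(p not in (0, 1) for p in payouts):
        return False
    return sum(payouts) > 0
```

Under this assumed rule, `[1, 0]`, `[0, 1]`, and `[1, 1]` pass, while `[0, 0]`, `[2, 0]`, and arrays of any other length are rejected. Passing the check only means the array is well-formed; whether it is the correct outcome still requires human review.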
Testing Emergency Procedures
Testnet Practice
Regularly practice emergency procedures on testnet.
Simulation Exercises
- Quarterly drills: Run through emergency scenarios
- Response time metrics: Measure how quickly team can respond
- Documentation review: Ensure procedures are up to date
- Role clarity: Confirm each admin knows their responsibilities