Object Detection

Object detection identifies and localizes multiple objects within images, providing bounding boxes, class labels, and confidence scores for each detection. React Native ExecuTorch supports real-time detection with VisionCamera integration.

Quick Start

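import { Button } from 'react-native';
import { useObjectDetection, SSDLITE_320_MOBILENET_V3_LARGE } from 'react-native-executorch';

function ObjectDetector({ imageUri }: { imageUri: string }) {
  const { isReady, forward } = useObjectDetection({
    model: SSDLITE_320_MOBILENET_V3_LARGE,
  });

  const detectObjects = async (uri: string) => {
    // Second argument is the detection threshold (minimum confidence score)
    const detections = await forward(uri, 0.7);
    // [{ bbox: { x1, y1, x2, y2 }, label: 'PERSON', score: 0.95 }, ...]
    console.log(detections);
  };

  // Keep the button disabled until the model has loaded
  return <Button title="Detect" disabled={!isReady} onPress={() => detectObjects(imageUri)} />;
}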

Hook API

`useObjectDetection<C>(props)`

Manages an object detection model instance with type-safe labels.

Type Parameters

C extends ObjectDetectionModelSources

Model configuration type that determines available labels

Parameters

model

required

Model configuration object

Properties of model:

modelName

'ssdlite-320-mobilenet-v3-large' | 'rf-detr-nano'

required

Name of the built-in model to use

modelSource

ResourceSource

required

Source of the model binary

preventLoad

boolean

default: false

Prevent automatic model loading

Returns

error

RnExecutorchError | null

Error object if loading or inference fails

isReady

boolean

Whether the model is loaded and ready

isGenerating

boolean

Whether the model is currently processing

downloadProgress

number

Download progress (0-1)

forward

(input: string | PixelData, detectionThreshold?: number) => Promise<Detection[]>

Detect objects in an image. Returns an array of detections with bounding boxes, labels, and scores.

Parameters:

input: Image URI or PixelData object
detectionThreshold: Minimum confidence score (0-1), default 0.7

runOnFrame

((frame: Frame, detectionThreshold: number) => Detection[]) | null

Synchronous worklet function for VisionCamera frame processing. Available after model loads.

Available Models

SSDLITE_320_MOBILENET_V3_LARGE

Lightweight SSD detector optimized for mobile.

import { SSDLITE_320_MOBILENET_V3_LARGE } from 'react-native-executorch';

const detector = useObjectDetection({
  model: SSDLITE_320_MOBILENET_V3_LARGE,
});

Specifications:

Architecture: SSDLite with MobileNetV3-Large backbone
Classes: 80 COCO classes (person, car, dog, etc.)
Input Size: 320x320
Inference Time: ~100-150ms
mAP: ~22%

RF_DETR_NANO

Real-time DETR-based detector with improved accuracy.

import { RF_DETR_NANO } from 'react-native-executorch';

const detector = useObjectDetection({
  model: RF_DETR_NANO,
});

Specifications:

Architecture: Real-time DETR Nano
Classes: 80 COCO classes
Input Size: 640x640
Inference Time: ~150-200ms
mAP: ~35%

Detection Types

Detection Interface

interface Detection<L = CocoLabel> {
  bbox: Bbox;
  label: keyof L;
  score: number;
}

interface Bbox {
  x1: number; // Bottom-left x
  y1: number; // Bottom-left y
  x2: number; // Top-right x
  y2: number; // Top-right y
}
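
A few small geometry helpers make the Bbox corners easier to work with. The sketch below is illustrative only: the helper names (bboxWidth, bboxHeight, bboxArea, bboxCenter) are hypothetical and not part of react-native-executorch, and the Bbox interface is redeclared locally so the snippet is self-contained. Absolute differences are used so the results do not depend on which corner is listed first.

```typescript
// Local mirror of the Bbox interface shown above, so the sketch is self-contained.
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// Hypothetical helpers (not part of react-native-executorch).
const bboxWidth = (b: Bbox): number => Math.abs(b.x2 - b.x1);
const bboxHeight = (b: Bbox): number => Math.abs(b.y2 - b.y1);
const bboxArea = (b: Bbox): number => bboxWidth(b) * bboxHeight(b);
const bboxCenter = (b: Bbox): { x: number; y: number } => ({
  x: (b.x1 + b.x2) / 2,
  y: (b.y1 + b.y2) / 2,
});

// Made-up box for illustration.
const sampleBox: Bbox = { x1: 10, y1: 20, x2: 110, y2: 70 };
console.log(bboxWidth(sampleBox), bboxHeight(sampleBox), bboxArea(sampleBox), bboxCenter(sampleBox));
```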

COCO Labels

All built-in models detect 80 COCO classes:

import { CocoLabel } from 'react-native-executorch';

// Available labels:
CocoLabel.PERSON
CocoLabel.CAR
CocoLabel.DOG
CocoLabel.CAT
CocoLabel.BICYCLE
CocoLabel.BOTTLE
// ... and 74 more

Complete Example

import React, { useState } from 'react';
import { View, Image, StyleSheet, TouchableOpacity, Text } from 'react-native';
import { useObjectDetection, SSDLITE_320_MOBILENET_V3_LARGE, Detection } from 'react-native-executorch';
import { launchImageLibrary } from 'react-native-image-picker';
import Svg, { Rect, Text as SvgText } from 'react-native-svg';

function ObjectDetectionDemo() {
  const [imageUri, setImageUri] = useState<string | null>(null);
  const [imageDimensions, setImageDimensions] = useState({ width: 0, height: 0 });
  const [detections, setDetections] = useState<Detection[]>([]);

  const { isReady, isGenerating, error, forward } = useObjectDetection({
    model: SSDLITE_320_MOBILENET_V3_LARGE,
  });

  const selectAndDetect = async () => {
    const result = await launchImageLibrary({ mediaType: 'photo' });
    
    if (result.assets && result.assets[0]) {
      const asset = result.assets[0];
      setImageUri(asset.uri!);
      setImageDimensions({
        width: asset.width || 300,
        height: asset.height || 300,
      });

      try {
        const dets = await forward(asset.uri!, 0.5);
        setDetections(dets);
      } catch (err) {
        console.error('Detection failed:', err);
      }
    }
  };

  const renderBoundingBoxes = () => {
    if (!imageUri || detections.length === 0) return null;

    return (
      <Svg
        style={StyleSheet.absoluteFill}
        viewBox={`0 0 ${imageDimensions.width} ${imageDimensions.height}`}
      >
        {detections.map((detection, index) => {
          const { bbox, label, score } = detection;
          const width = bbox.x2 - bbox.x1;
          const height = bbox.y2 - bbox.y1;

          return (
            <React.Fragment key={index}>
              <Rect
                x={bbox.x1}
                y={bbox.y1}
                width={width}
                height={height}
                stroke="#00FF00"
                strokeWidth="3"
                fill="none"
              />
              <SvgText
                x={bbox.x1 + 5}
                y={bbox.y1 + 20}
                fill="#00FF00"
                fontSize="16"
                fontWeight="bold"
              >
                {label} {(score * 100).toFixed(0)}%
              </SvgText>
            </React.Fragment>
          );
        })}
      </Svg>
    );
  };

  if (error) {
    return <Text>Error: {error.message}</Text>;
  }

  if (!isReady) {
    return <Text>Loading model...</Text>;
  }

  return (
    <View style={styles.container}>
      <TouchableOpacity
        style={styles.button}
        onPress={selectAndDetect}
        disabled={isGenerating}
      >
        <Text style={styles.buttonText}>
          {isGenerating ? 'Detecting...' : 'Select & Detect Objects'}
        </Text>
      </TouchableOpacity>

      {imageUri && (
        <View style={styles.imageContainer}>
          <Image
            source={{ uri: imageUri }}
            style={[styles.image, { aspectRatio: imageDimensions.width / imageDimensions.height }]}
          />
          {renderBoundingBoxes()}
        </View>
      )}

      {detections.length > 0 && (
        <View style={styles.results}>
          <Text style={styles.title}>Detected {detections.length} objects:</Text>
          {detections.map((det, idx) => (
            <Text key={idx} style={styles.detection}>
              {det.label}: {(det.score * 100).toFixed(1)}%
            </Text>
          ))}
        </View>
      )}
    </View>
  );
}

const styles = StyleSheet.create({
  container: { flex: 1, padding: 20 },
  button: { backgroundColor: '#007AFF', padding: 15, borderRadius: 8 },
  buttonText: { color: 'white', fontSize: 16, textAlign: 'center' },
  imageContainer: { marginVertical: 20, position: 'relative' },
  image: { width: '100%', borderRadius: 8 },
  results: { padding: 15, backgroundColor: '#f5f5f5', borderRadius: 8 },
  title: { fontSize: 16, fontWeight: 'bold', marginBottom: 10 },
  detection: { fontSize: 14, paddingVertical: 4 },
});

export default ObjectDetectionDemo;
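
The example above relies on the SVG viewBox to map boxes from image coordinates onto the rendered image. If the overlay is drawn with plain views instead, the same mapping can be done by hand. This is a minimal sketch under the same assumption the example makes, namely that bbox values are expressed in pixels of the original image; scaleBbox and Size are hypothetical names, not library APIs.

```typescript
// Local mirror of the Bbox interface from this page.
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: map a box from image pixels to on-screen pixels.
// Assumes the image is stretched to exactly fill `display` (same aspect ratio).
function scaleBbox(bbox: Bbox, image: Size, display: Size): Bbox {
  const sx = display.width / image.width;
  const sy = display.height / image.height;
  return {
    x1: bbox.x1 * sx,
    y1: bbox.y1 * sy,
    x2: bbox.x2 * sx,
    y2: bbox.y2 * sy,
  };
}

// Made-up sizes: a 1000x500 photo rendered at 500x250.
const scaled = scaleBbox(
  { x1: 100, y1: 50, x2: 300, y2: 150 },
  { width: 1000, height: 500 },
  { width: 500, height: 250 }
);
console.log(scaled);
```

If the image is letterboxed or cropped (for example with a "cover" resize mode), an additional offset would be needed; that case is not covered here.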

Real-Time Camera Detection

Integrate with VisionCamera for live object detection:

import { StyleSheet, Text } from 'react-native';
import { Camera, useCameraDevice, useFrameProcessor } from 'react-native-vision-camera';
import { useObjectDetection, SSDLITE_320_MOBILENET_V3_LARGE, Detection } from 'react-native-executorch';
import { useSharedValue } from 'react-native-reanimated';

function LiveObjectDetection() {
  const device = useCameraDevice('back');
  const detections = useSharedValue<Detection[]>([]);

  const { runOnFrame, isReady } = useObjectDetection({
    model: SSDLITE_320_MOBILENET_V3_LARGE,
  });

  const frameProcessor = useFrameProcessor(
    (frame) => {
      'worklet';
      // runOnFrame is null until the model has loaded
      if (!runOnFrame) return;

      const detected = runOnFrame(frame, 0.6);
      detections.value = detected;
    },
    [runOnFrame]
  );

  if (!isReady) return <Text>Loading...</Text>;
  if (!device) return <Text>No camera available</Text>;

  return (
    <>
      <Camera
        style={StyleSheet.absoluteFill}
        device={device}
        isActive={true}
        frameProcessor={frameProcessor}
      />
      {/* BoundingBoxOverlay is an app-defined component (not shown) that draws the shared detections */}
      <BoundingBoxOverlay detections={detections} />
    </>
  );
}

Use Cases

Object Counting

const countObjects = async (imageUri: string, targetLabel: string) => {
  const detections = await forward(imageUri, 0.6);
  return detections.filter(d => d.label === targetLabel).length;
};

// Count people in an image
const peopleCount = await countObjects(imageUri, 'PERSON');
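
When more than one class matters, a single pass can tally every label at once. The sketch below assumes the Detection shape documented on this page, with label simplified to a plain string; countByLabel is a hypothetical helper, and the sample detections are made up.

```typescript
// Simplified local mirror of the Detection shape (label typed as string for brevity).
interface Detection {
  bbox: { x1: number; y1: number; x2: number; y2: number };
  label: string;
  score: number;
}

// Hypothetical helper: build a label -> count map from one forward() result.
function countByLabel(detections: Detection[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const d of detections) {
    counts[d.label] = (counts[d.label] ?? 0) + 1;
  }
  return counts;
}

// Made-up detections for illustration.
const sampleDetections: Detection[] = [
  { bbox: { x1: 0, y1: 0, x2: 10, y2: 10 }, label: 'PERSON', score: 0.9 },
  { bbox: { x1: 20, y1: 0, x2: 30, y2: 10 }, label: 'DOG', score: 0.8 },
  { bbox: { x1: 40, y1: 0, x2: 50, y2: 10 }, label: 'PERSON', score: 0.7 },
];
const labelCounts = countByLabel(sampleDetections);
console.log(labelCounts);
```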

Region of Interest

Detect objects only in specific image regions:

const detectInRegion = async (
  imageUri: string,
  region: { x: number; y: number; width: number; height: number }
) => {
  const allDetections = await forward(imageUri, 0.5);
  
  return allDetections.filter(det => {
    const centerX = (det.bbox.x1 + det.bbox.x2) / 2;
    const centerY = (det.bbox.y1 + det.bbox.y2) / 2;
    
    return (
      centerX >= region.x &&
      centerX <= region.x + region.width &&
      centerY >= region.y &&
      centerY <= region.y + region.height
    );
  });
};
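
The center-point test above excludes a large object whose center falls just outside the region. One alternative, sketched here as an assumption rather than a library feature, keeps a detection when a minimum fraction of its box area lies inside the region. overlapFraction and filterByOverlap are hypothetical names, and the code assumes x1 < x2 and y1 < y2.

```typescript
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

interface Region {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Fraction of the box area that lies inside the region (0 to 1).
function overlapFraction(bbox: Bbox, region: Region): number {
  const ix1 = Math.max(bbox.x1, region.x);
  const iy1 = Math.max(bbox.y1, region.y);
  const ix2 = Math.min(bbox.x2, region.x + region.width);
  const iy2 = Math.min(bbox.y2, region.y + region.height);
  const intersection = Math.max(0, ix2 - ix1) * Math.max(0, iy2 - iy1);
  const area = (bbox.x2 - bbox.x1) * (bbox.y2 - bbox.y1);
  return area > 0 ? intersection / area : 0;
}

// Keep detections with at least `minFraction` of their area inside the region.
function filterByOverlap<T extends { bbox: Bbox }>(
  detections: T[],
  region: Region,
  minFraction: number = 0.5
): T[] {
  return detections.filter((d) => overlapFraction(d.bbox, region) >= minFraction);
}

// Made-up region and boxes for illustration.
const roi: Region = { x: 50, y: 0, width: 100, height: 100 };
const halfInside = { bbox: { x1: 0, y1: 0, x2: 100, y2: 100 } };
const outside = { bbox: { x1: 200, y1: 200, x2: 300, y2: 300 } };
console.log(filterByOverlap([halfInside, outside], roi).length);
```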

Object Tracking

Track objects across frames using IoU:

const calculateIoU = (box1: Bbox, box2: Bbox): number => {
  const x1 = Math.max(box1.x1, box2.x1);
  const y1 = Math.max(box1.y1, box2.y1);
  const x2 = Math.min(box1.x2, box2.x2);
  const y2 = Math.min(box1.y2, box2.y2);
  
  const intersection = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
  const area1 = (box1.x2 - box1.x1) * (box1.y2 - box1.y1);
  const area2 = (box2.x2 - box2.x1) * (box2.y2 - box2.y1);
  const union = area1 + area2 - intersection;
  
  // Guard against division by zero when both boxes have no area
  return union > 0 ? intersection / union : 0;
};

const trackObjects = (
  prevDetections: Detection[],
  currDetections: Detection[],
  iouThreshold: number = 0.5
) => {
  return currDetections.map(curr => {
    const match = prevDetections.find(prev => 
      prev.label === curr.label && 
      calculateIoU(prev.bbox, curr.bbox) > iouThreshold
    );
    
    return { ...curr, tracked: !!match };
  });
};
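
The function above only flags whether a detection had a match in the previous frame. To follow individual objects, each detection needs a persistent identifier. The following is a minimal greedy-matching sketch, not a library feature: assignTrackIds, Track, and the id counter are hypothetical, and production trackers usually add motion prediction and handling for briefly occluded objects.

```typescript
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// Simplified mirror of the Detection shape (label typed as string for brevity).
interface Detection {
  bbox: Bbox;
  label: string;
  score: number;
}

interface Track extends Detection {
  id: number;
}

const iou = (a: Bbox, b: Bbox): number => {
  const w = Math.max(0, Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1));
  const h = Math.max(0, Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1));
  const inter = w * h;
  const union =
    (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
  return union > 0 ? inter / union : 0;
};

// Greedy matching: each current detection takes the best unused previous track
// with the same label and IoU above the threshold; otherwise it gets a new id.
function assignTrackIds(
  prev: Track[],
  curr: Detection[],
  nextId: number,
  iouThreshold: number = 0.5
): { tracks: Track[]; nextId: number } {
  const used = new Set<number>();
  const tracks: Track[] = [];
  for (const det of curr) {
    let best: Track | undefined;
    let bestIou = iouThreshold;
    for (const p of prev) {
      if (used.has(p.id) || p.label !== det.label) continue;
      const overlap = iou(p.bbox, det.bbox);
      if (overlap > bestIou) {
        best = p;
        bestIou = overlap;
      }
    }
    if (best) {
      used.add(best.id);
      tracks.push({ ...det, id: best.id });
    } else {
      tracks.push({ ...det, id: nextId++ });
    }
  }
  return { tracks, nextId };
}

// Two made-up frames: the person moves slightly, the car leaves, a dog appears.
const frame1 = assignTrackIds([], [
  { bbox: { x1: 0, y1: 0, x2: 100, y2: 100 }, label: 'PERSON', score: 0.9 },
  { bbox: { x1: 200, y1: 200, x2: 300, y2: 300 }, label: 'CAR', score: 0.8 },
], 1);
const frame2 = assignTrackIds(frame1.tracks, [
  { bbox: { x1: 10, y1: 0, x2: 110, y2: 100 }, label: 'PERSON', score: 0.9 },
  { bbox: { x1: 400, y1: 400, x2: 450, y2: 450 }, label: 'DOG', score: 0.7 },
], frame1.nextId);
console.log(frame2.tracks.map((t) => `${t.label}#${t.id}`));
```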

Performance Tips

Threshold Tuning

Adjust detection threshold based on use case:

// High precision (fewer false positives)
const highPrecision = await forward(imageUri, 0.8);

// High recall (catch more objects)
const highRecall = await forward(imageUri, 0.3);

// Balanced
const balanced = await forward(imageUri, 0.5);
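
While tuning, it can be cheaper to run inference once with a low threshold and then compare stricter cut-offs in JavaScript, since every detection carries its score. This is a sketch under that assumption; filterByScore is a hypothetical helper, the scores are made up, and whether the library compares scores inclusively at the threshold is not stated here.

```typescript
// Hypothetical helper: keep detections at or above a minimum score.
function filterByScore<T extends { score: number }>(detections: T[], minScore: number): T[] {
  return detections.filter((d) => d.score >= minScore);
}

// Made-up scores standing in for one forward(imageUri, 0.3) result.
const lowThresholdResult = [
  { label: 'PERSON', score: 0.95 },
  { label: 'DOG', score: 0.62 },
  { label: 'CAR', score: 0.41 },
  { label: 'BOTTLE', score: 0.33 },
];

// Compare how many detections survive at each candidate threshold.
const survivors = [0.3, 0.5, 0.8].map(
  (threshold) => filterByScore(lowThresholdResult, threshold).length
);
console.log(survivors);
```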

Model Selection

SSDLite: Faster (~100-150ms per inference), lower accuracy (~22% mAP); good for real-time use
RF-DETR Nano: Better accuracy (~35% mAP), slightly slower (~150-200ms per inference)

Frame Skipping

For camera processing, skip frames to reduce CPU load:

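// frame.timestamp is a time value, not a frame index, so frames are counted explicitly.
const frameCount = useSharedValue(0); // useSharedValue from 'react-native-reanimated'

const frameProcessor = useFrameProcessor(
  (frame) => {
    'worklet';
    if (!runOnFrame) return;

    frameCount.value += 1;
    if (frameCount.value % 3 !== 0) return; // Process every 3rd frame

    const detections = runOnFrame(frame, 0.7);
    // ...
  },
  [runOnFrame]
);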
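
Skipping by frame count ties the inference rate to the camera frame rate. An alternative is to throttle by elapsed time. The sketch below shows only the decision logic as a pure function: shouldProcess and simulate are hypothetical names, the timestamps are made-up milliseconds, and the unit of frame.timestamp on a given platform is an assumption that should be checked before reusing this. Inside a frame processor, the last processed timestamp would need to be stored in a shared value.

```typescript
// Hypothetical helper: has enough time passed since the last processed frame?
function shouldProcess(
  lastProcessedMs: number | null,
  nowMs: number,
  minIntervalMs: number
): boolean {
  return lastProcessedMs === null || nowMs - lastProcessedMs >= minIntervalMs;
}

// Simulate a roughly 30 fps stream and collect the timestamps that get processed.
function simulate(timestampsMs: number[], minIntervalMs: number): number[] {
  const processed: number[] = [];
  let last: number | null = null;
  for (const ts of timestampsMs) {
    if (shouldProcess(last, ts, minIntervalMs)) {
      processed.push(ts);
      last = ts;
    }
  }
  return processed;
}

// With a 100 ms minimum interval, at most about 10 inferences run per second.
const processedTimestamps = simulate([0, 33, 66, 100, 133, 166, 200], 100);
console.log(processedTimestamps);
```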

Type Reference

import { ResourceSource, PixelData, Frame, LabelEnum } from 'react-native-executorch';

type ObjectDetectionModelSources =
  | { modelName: 'ssdlite-320-mobilenet-v3-large'; modelSource: ResourceSource }
  | { modelName: 'rf-detr-nano'; modelSource: ResourceSource };

interface ObjectDetectionProps<C extends ObjectDetectionModelSources> {
  model: C;
  preventLoad?: boolean;
}

interface ObjectDetectionType<L extends LabelEnum> {
  error: RnExecutorchError | null;
  isReady: boolean;
  isGenerating: boolean;
  downloadProgress: number;
  forward: (input: string | PixelData, detectionThreshold?: number) => Promise<Detection<L>[]>;
  runOnFrame: ((frame: Frame, detectionThreshold: number) => Detection<L>[]) | null;
}

Related

Classification - Image categorization
Semantic Segmentation - Pixel-level detection
VisionCamera Integration - Real-time processing
