Moondream3 Detect - Object Detection

Moondream3 Detect is a specialized vision-language model for finding objects in images and returning their bounding box coordinates.

Features

Object Detection: Find specific objects in images
Bounding Boxes: Get precise coordinate locations
Multiple Objects: Detect multiple instances of the same object
Natural Language: Specify objects using plain English

Example inputs

{
  "image": "https://example.com/photo.jpg",
  "prompt": "car"
}

{
  "image": "https://example.com/photo.jpg",
  "prompt": "person"
}

{
  "image": "https://example.com/photo.jpg",
  "prompt": "bicycle"
}
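
The three inputs above differ only in the prompt, so they can be generated rather than written by hand. The following is a minimal Python sketch that assumes only the two fields shown (`image` and `prompt`). The helper name `build_detect_payload` is hypothetical, and the sketch deliberately does not cover the endpoint or authentication, which are not described here.

```python
import json


def build_detect_payload(image_url: str, prompt: str) -> str:
    """Serialize a detection input using the two fields shown above.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    if not image_url or not prompt:
        raise ValueError("both image_url and prompt are required")
    return json.dumps({"image": image_url, "prompt": prompt}, indent=2)


# One payload per object type, all targeting the same image.
payloads = [
    build_detect_payload("https://example.com/photo.jpg", prompt)
    for prompt in ("car", "person", "bicycle")
]

for payload in payloads:
    print(payload)
```

Keeping each prompt to a single object name, as in the inputs above, yields one payload per object type.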

Each detected object is returned as a bounding box in the format [x1, y1, x2, y2], where (x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner.
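
To make the [x1, y1, x2, y2] format concrete, here is a small Python sketch that turns one box into width, height, area, and center. The names `Box` and `parse_box` are hypothetical, and the sample coordinates are made up for illustration. Whether the model reports pixel or normalized coordinates is not stated here, so the arithmetic is written to work with either.

```python
from typing import NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """A bounding box with top-left (x1, y1) and bottom-right (x2, y2)."""

    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def width(self) -> float:
        return self.x2 - self.x1

    @property
    def height(self) -> float:
        return self.y2 - self.y1

    @property
    def area(self) -> float:
        return self.width * self.height

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


def parse_box(coords: Sequence[float]) -> Box:
    """Validate a 4-element [x1, y1, x2, y2] sequence and wrap it in a Box."""
    if len(coords) != 4:
        raise ValueError("expected exactly four values: [x1, y1, x2, y2]")
    box = Box(*(float(c) for c in coords))
    if box.x2 < box.x1 or box.y2 < box.y1:
        raise ValueError("bottom-right corner must not precede top-left corner")
    return box


# Illustrative values only; not real model output.
box = parse_box([10, 20, 110, 70])
print(box.width, box.height, box.area, box.center)
```

Because several instances of the same object may be detected, the same helper can be applied to each box in turn.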

Fixed price per request. Contact WaveSpeed for volume discounts.