
vision-language

wavespeed-ai/moondream3-preview/detect

Detect objects in images and get precise bounding box coordinates. Moondream3 Detect identifies objects and returns their locations for computer vision applications.




Each run costs $0.001, so $1 covers approximately 1,000 runs.


Moondream3 Detect - Object Detection

Moondream3 Detect is a specialized vision-language model for finding objects in images and returning their bounding box coordinates.

Features

  • Object Detection: Find specific objects in images
  • Bounding Boxes: Get precise coordinate locations
  • Multiple Objects: Detect multiple instances of the same object
  • Natural Language: Specify objects using plain English

Example Usage

Detect Cars

{
  "image": "https://example.com/photo.jpg",
  "prompt": "car"
}

Detect People

{
  "image": "https://example.com/photo.jpg",
  "prompt": "person"
}

Detect Any Object

{
  "image": "https://example.com/photo.jpg",
  "prompt": "bicycle"
}
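The three request bodies above differ only in the prompt, so they can be generated from one small helper. The following is a minimal sketch in Python: the `image` and `prompt` field names come from the examples above, while the helper name `build_detect_request` is hypothetical. The endpoint URL and authentication are not given in this README, so the sketch stops at producing the JSON body.

```python
import json


def build_detect_request(image_url, prompt):
    """Build a request body with the two fields shown in the examples above.

    build_detect_request is a hypothetical helper name, not part of any SDK.
    """
    if not image_url or not prompt:
        raise ValueError("both image_url and prompt are required")
    return {"image": image_url, "prompt": prompt}


# One body per object type, all for the same image.
bodies = [
    build_detect_request("https://example.com/photo.jpg", name)
    for name in ("car", "person", "bicycle")
]

# Serialize the first body the way it would be sent as JSON.
print(json.dumps(bodies[0], indent=2))
```

Keeping the prompt as a plain parameter makes it easy to run several detections against one image without duplicating the payload by hand.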

Output Format

Returns each detected object's bounding box as [x1, y1, x2, y2], where (x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner.
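As an illustration of working with this corner format, the sketch below derives a box's width, height, area, and center from [x1, y1, x2, y2]. The function names are hypothetical, and the README does not say whether coordinates are pixels or normalized values, so the results are in whatever units the response uses.

```python
def box_size(box):
    """Return (width, height, area) for a box given as [x1, y1, x2, y2]."""
    x1, y1, x2, y2 = box
    width = x2 - x1
    height = y2 - y1
    return width, height, width * height


def box_center(box):
    """Return the (x, y) midpoint of a box given as [x1, y1, x2, y2]."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2, (y1 + y2) / 2


# Example box with made-up coordinates, purely for illustration.
example_box = [10, 20, 110, 70]
print(box_size(example_box))
print(box_center(example_box))
```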

Best Practices

  • Use specific object names for better accuracy
  • Supported formats: JPEG, PNG, WebP
  • Max image size: 10MB
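The format and size limits above can be checked locally before a request is made. This is a minimal sketch, not an official validator: the function names are hypothetical, the check relies on the file extension rather than the file contents, and treating "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import os

# Extensions for the formats listed above (JPEG, PNG, WebP).
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
# Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def check_image(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'no extension'}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_IMAGE_BYTES}")
    return problems


def check_image_file(path):
    """Run the same checks against a file on disk."""
    return check_image(path, os.path.getsize(path))
```

For example, `check_image("photo.JPG", 1024)` returns an empty list, while an 11 MB `.gif` file produces two problems, one for the format and one for the size.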

Pricing

Fixed price per request. Contact WaveSpeed for volume discounts.