Grounding DINO is a zero-shot, open-vocabulary object detection model that combines the Transformer-based DINO detector with grounded pre-training to detect objects from natural-language prompts. Because it needs no labeled dataset to identify objects, it is well suited to auto-labeling images and rapid prototyping.
One POST request returns the result directly as an image, video, or JSON, depending on the tool.
curl -X POST https://apiai.me/api/workflow/grounding-dino-auto-detect \
-H "X-API-Key: YOUR_API_KEY" \
-F "image=@input.jpg" \
-F "box_threshold=0.25" \
-F "query=VALUE" \
-F "show_visualisation=true" \
-F "text_threshold=0.25" \
--output result.png
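For scripted use, the same request can be wrapped in a small shell function. The sketch below is illustrative and not part of the service: `join_query`, `detect_objects`, and the `API_KEY`, `BOX_THRESHOLD`, and `TEXT_THRESHOLD` environment variables are hypothetical names chosen here. Only the URL, the header, and the form fields come from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the curl call above; all function and
# variable names are assumptions made for this example.

API_URL="https://apiai.me/api/workflow/grounding-dino-auto-detect"

# Join object names into the comma-separated form the `query` field expects.
join_query() {
  local IFS=','
  printf '%s\n' "$*"
}

# Send one image for detection and save the response to a file.
# Usage: detect_objects INPUT_IMAGE OUTPUT_FILE NAME [NAME...]
detect_objects() {
  local image="$1" output="$2"
  shift 2
  local query
  query="$(join_query "$@")"
  # --fail makes curl exit non-zero on HTTP error responses.
  # --form-string sends the query literally, even if it starts with @ or <.
  curl --fail --silent --show-error -X POST "$API_URL" \
    -H "X-API-Key: ${API_KEY:?set API_KEY first}" \
    -F "image=@${image}" \
    --form-string "query=${query}" \
    -F "box_threshold=${BOX_THRESHOLD:-0.25}" \
    -F "text_threshold=${TEXT_THRESHOLD:-0.25}" \
    -F "show_visualisation=true" \
    --output "$output"
}
```

After `export API_KEY=YOUR_API_KEY`, a call such as `detect_objects input.jpg result.png cat dog "traffic light"` sends the query `cat,dog,traffic light` and writes the response to `result.png`.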
Test this tool visually before writing any code. Upload an image, set parameters, and see the result live. When it looks right, copy the auto-generated curl command and paste it into your app.
Process hundreds of images at once without writing a loop. Upload a CSV with one row per item, set your parameters, and download the results as a ZIP when they're done.
| Name | Required | Description | Default / Options |
|---|---|---|---|
| box_threshold | optional | Confidence level for object detection | 0.25 |
| query | required | Comma-separated names of the objects to be detected in the image | none |
| show_visualisation | optional | Draw and visualize bounding boxes on the image | true, false |
| text_threshold | optional | Confidence level for object detection | 0.25 |
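Checking parameters locally before sending a request can catch mistakes early. The following is a minimal sketch under stated assumptions: `is_valid_threshold` and `build_fields` are hypothetical helpers, the 0 to 1 range for both thresholds is assumed (the table only gives the default of 0.25), and `show_visualisation` defaults to `true` here only to mirror the curl example.

```shell
#!/usr/bin/env bash
# Hypothetical local validation of the parameters listed in the table.

# Succeed if the value is a plain decimal between 0 and 1 inclusive.
# The 0..1 range is an assumption; the table only states the default 0.25.
is_valid_threshold() {
  awk -v v="$1" 'BEGIN {
    # Accept plain decimals such as 0, 0.25, .5 or 1.0
    if (v !~ /^([0-9]+\.?[0-9]*|\.[0-9]+)$/) exit 1
    exit !(v + 0 >= 0 && v + 0 <= 1)
  }'
}

# Print one "name=value" form field per line.
# Usage: build_fields QUERY [BOX_THRESHOLD] [TEXT_THRESHOLD] [SHOW_VISUALISATION]
build_fields() {
  local query="$1" box="${2:-0.25}" text="${3:-0.25}" show="${4:-true}"
  [ -n "$query" ] || { echo "query is required" >&2; return 1; }
  is_valid_threshold "$box"  || { echo "bad box_threshold: $box" >&2; return 1; }
  is_valid_threshold "$text" || { echo "bad text_threshold: $text" >&2; return 1; }
  case "$show" in
    true|false) ;;
    *) echo "show_visualisation must be true or false" >&2; return 1 ;;
  esac
  printf '%s\n' "query=$query" "box_threshold=$box" \
    "text_threshold=$text" "show_visualisation=$show"
}
```

Each printed line can be passed to curl as a separate `-F` argument alongside the `image=@...` field.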
Get an API key, test Grounding Dino Auto Detect in the dashboard, and copy the curl command.