Visual Intelligence

Grounding Dino Auto Detect API

A zero-shot, open-vocabulary object detection model that combines the Transformer-based DINO detector with grounded pre-training to detect objects from natural language prompts. Because it needs no labeled dataset to identify objects, it works well for auto-labeling images and for rapid prototyping.


Three ways to use this tool

REST API

One POST request. Get the result back directly — as an image, video, or JSON depending on the tool.

terminal

curl -X POST https://apiai.me/api/workflow/grounding-dino-auto-detect \
     -H "X-API-Key: YOUR_API_KEY" \
     -F "image=@input.jpg" \
     -F "box_threshold=0.25" \
     -F "query=VALUE" \
     -F "show_visualisation=true" \
     -F "text_threshold=0.25" \
     --output result.png
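
For repeated calls from a script, the request above could be wrapped in a small shell function. This is a minimal sketch, not an official client: the function name `detect_objects` and the environment variable `APIAI_API_KEY` are hypothetical, and it assumes the response is an image whenever `show_visualisation=true`, as the `--output result.png` in the example suggests.

```shell
#!/usr/bin/env bash
# Sketch: wrap the documented POST request in a reusable function.
# detect_objects and APIAI_API_KEY are hypothetical names, not part of the API.

detect_objects() {
  local image="$1" query="$2" output="${3:-result.png}"

  # Fail early if the key or the input file is missing.
  if [ -z "${APIAI_API_KEY:-}" ]; then
    echo "APIAI_API_KEY is not set" >&2
    return 2
  fi
  if [ ! -f "$image" ]; then
    echo "input file not found: $image" >&2
    return 2
  fi

  # --fail makes curl exit non-zero on HTTP errors (4xx/5xx),
  # so an error response is not mistaken for a result.
  curl --fail --silent --show-error -X POST \
       https://apiai.me/api/workflow/grounding-dino-auto-detect \
       -H "X-API-Key: ${APIAI_API_KEY}" \
       -F "image=@${image}" \
       -F "query=${query}" \
       -F "show_visualisation=true" \
       --output "$output"
}
```

After sourcing the file and exporting the key, a call such as `detect_objects input.jpg "cat,dog" annotated.png` sends the same request as the curl command above, with the thresholds left at their defaults.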

Dashboard Playground

Test this tool visually before writing any code. Upload an image, set parameters, and see the result live. When it looks right, copy the auto-generated curl command and paste it into your app.

Open the dashboard → API Toolbox
Find Grounding Dino Auto Detect and click it
Upload your input and adjust parameters
Copy the curl command and ship


Batch Processing

Process hundreds of images at once without writing a loop. Upload a CSV with one row per item, set your parameters, and download the results as a ZIP when they're done.

Open the dashboard → Batches
Select Grounding Dino Auto Detect
Add your content and start the batch
Download results ZIP when complete


Parameters

Name	Required	Description	Default / Options
`box_threshold`	optional	Minimum confidence score for a detected bounding box to be kept	`0.25`
`query`	required	Comma-separated names of the objects to detect in the image	—
`show_visualisation`	optional	Draw bounding boxes on the returned image	`true`, `false`
`text_threshold`	optional	Minimum confidence score for matching a query term to a detected box	`0.25`
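
Parameter values can be prepared in the shell before they are passed to curl. The sketch below shows one way to build the comma-separated `query` value and to sanity-check a threshold. Both helper names are hypothetical, and the 0 to 1 range is an assumption: the table gives only the `0.25` default.

```shell
#!/usr/bin/env bash
# Sketch: hypothetical helpers for preparing parameter values.

# join_query: join object names into the comma-separated form `query` expects.
join_query() {
  local IFS=','
  printf '%s\n' "$*"
}

# valid_threshold: succeed only for a decimal number between 0 and 1.
# The 0-1 range is an assumption, not something the parameter table states.
valid_threshold() {
  local re='^(0(\.[0-9]+)?|1(\.0+)?)$'
  [[ "$1" =~ $re ]]
}
```

For example, `join_query cat dog "traffic light"` prints `cat,dog,traffic light`, which can be passed as `-F "query=..."`, and `valid_threshold 0.25` succeeds while `valid_threshold 1.5` fails.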