Beyond deep learning: Agentic AI framework for object detection
Author Identifier (ORCID)
Syed Afaq Ali Shah: https://orcid.org/0000-0003-2181-8445
Abstract
Object detection remains a fundamental yet challenging problem in machine vision. Over the past decade, numerous state-of-the-art solutions have been developed, predominantly based on deep learning. While effective, these models typically require large-scale annotated datasets and substantial computational resources, limiting their scalability and adaptability. To address these constraints, zero-shot and few-shot learning approaches have been introduced. However, they often struggle with generalization and task-specific performance. Agentic AI has recently emerged as a promising paradigm, enabling autonomous task execution by leveraging powerful vision-language models without the need for task-specific training. In this paper, we propose an agentic AI framework for object detection and investigate its feasibility in the context of assistive robotics. Our experimental results demonstrate the framework’s potential for real-world deployment, highlighting its ability to perform zero-shot detection and reasoning in indoor environments. The source code is available at https://sites.google.com/view/afaqshah/code.
Keywords
Agentic AI, large language models, object detection
Document Type
Conference Proceeding
Date of Publication
1-1-2025
Publication Title
2025 40th International Conference on Image and Vision Computing New Zealand (IVCNZ)
Publisher
IEEE
School
Centre for Artificial Intelligence and Machine Learning (CAIML) / School of Science
RAS ID
84460
Funders
Edith Cowan University
Copyright
Subscription content
Comments
Shah, S. A. A. (2025, November 19–21). Beyond deep learning: Agentic AI framework for object detection [Conference presentation]. 2025 International Conference on Image and Vision Computing New Zealand (IVCNZ), Wellington, New Zealand. https://doi.org/10.1109/IVCNZ67716.2025.11281644