Journal of Agricultural Science and Technology, 2025, Vol. 27, Issue (8): 100-109. DOI: 10.13304/j.nykjdb.2024.0136

• INTELLIGENT AGRICULTURE & AGRICULTURAL MACHINERY •

Dragon Fruit Object Detection and Counting Method in Wide Field of View

Chunfan OUYANG, Jiazheng GAO, Qiao CHEN, Chunlin ZENG, Wentao LI, Mingwei XIAO, Chendi LUO, Xuecheng ZHOU

  1. Guangdong Provincial Key Laboratory of Agricultural Artificial Intelligence, Key Laboratory of Key Technology on Agricultural Machine and Equipment of Ministry of Education, College of Engineering, South China Agricultural University, Guangzhou 510642, China
  • Received: 2024-02-27 Accepted: 2024-06-08 Online: 2025-08-15 Published: 2025-08-26
  • Contact: Xuecheng ZHOU

大视场下火龙果目标检测与计数方法

欧阳春凡, 高嘉正, 陈桥, 曾春林, 李文涛, 肖明玮, 罗陈迪, 周学成

  1. 华南农业大学工程学院,广东省农业人工智能重点实验室,南方农业机械与装备关键技术教育部重点实验室,广州 510642
  • 通讯作者: 周学成
  • About author: Chunfan OUYANG E-mail: oycf0804@163.com
  • Supported by:
    National Key Research and Development Program of China (2017YFD0700602)

Abstract:

To address the low recognition accuracy for small-target pitaya, the poor real-time performance, and the difficulty of fruit counting under wide field-of-view conditions, a pitaya detection and counting method for wide fields of view was proposed. The method aims at accurate recognition and counting of small-target pitaya, supporting the pre-harvest guidance work of pitaya harvesting robots. In the feature extraction stage, the dynamic deformable convolution module C2F_DCNV2_Dynamic replaced the C2F module of the YOLOv8 backbone network, and Conv_offer_mask was introduced to obtain the deformable offsets and masks of the input feature maps, enabling the network to adapt better to target shapes and improving feature extraction against complex backgrounds. The multipath coordinate attention (MPCA) mechanism module was used to perform multi-path processing on the input, so that the model attended to both the spatial and the channel information of the input tensor. This improved the network's feature perception across scales and contexts and thereby the recognition accuracy for small targets. In the target prediction stage, the Decoder Head of RT-DETR, an end-to-end Transformer-based detector, replaced the YOLO Head. Targets were predicted and associated directly through set prediction, which removed the traditional non-maximum suppression (NMS) step, increased inference speed, and further improved the real-time performance of the network. In the target counting stage, the Deep Sort algorithm was combined with the detector to count fruits by region.
The results showed that the improved object detection network achieved an average precision of 99.0% for dragon fruit detection and transmitted 32 frames per second in the real-time test, with a model size of 11.8 MB. The fruit counting accuracy reached 82.96%, at an average detection speed of 17 frames·s⁻¹. The method can accurately identify and count small-target dragon fruits under wide field-of-view conditions, and its real-time performance meets the requirements of actual orchard production.
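The abstract does not spell out how tracker output is turned into a fruit count, so the following is only a minimal Python sketch of one plausible reading of region-based counting. It assumes that a tracker such as Deep Sort has already assigned a persistent integer ID to each fruit in every frame, and that a fruit is counted once when its box center first falls inside a rectangular counting region. The names `count_fruits`, `box_center`, and `in_region`, as well as the data layout, are hypothetical and not taken from the paper.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical data layout: a box or region is (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]
Region = Tuple[float, float, float, float]


def box_center(box: Box) -> Tuple[float, float]:
    """Return the center point of a bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def in_region(point: Tuple[float, float], region: Region) -> bool:
    """Check whether a point lies inside the rectangular counting region."""
    x, y = point
    rx1, ry1, rx2, ry2 = region
    return rx1 <= x <= rx2 and ry1 <= y <= ry2


def count_fruits(frames: Iterable[Dict[int, Box]], region: Region) -> int:
    """Count distinct track IDs whose box center enters the region.

    Each element of `frames` maps a tracker-assigned ID to that fruit's
    box in one frame (assumed to come from a tracker such as Deep Sort).
    """
    counted: Set[int] = set()
    for tracks in frames:
        for track_id, box in tracks.items():
            if in_region(box_center(box), region):
                counted.add(track_id)  # a set keeps each fruit counted once
    return len(counted)


# Toy input: fruit 1 appears in both frames, fruit 2 stays outside the region.
example_frames = [
    {1: (0.0, 0.0, 10.0, 10.0), 2: (100.0, 100.0, 120.0, 120.0)},
    {1: (2.0, 0.0, 12.0, 10.0), 3: (4.0, 4.0, 8.0, 8.0)},
]
example_region = (0.0, 0.0, 50.0, 50.0)
example_count = count_fruits(example_frames, example_region)
```

Counting by unique track ID rather than by per-frame detections is what keeps a fruit visible in many consecutive frames from being counted repeatedly. ID switches in the tracker would still inflate or deflate such a count, which is one possible reason a counting accuracy can sit below the detection precision.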

Key words: small target pitaya identification, fruit counting, YOLOv8, C2F_DCNV2_Dynamic, MPCA, Decoder Head

摘要:

为解决大视场条件下小目标火龙果识别精度低、实时性差、果实计数困难的问题,提出一种大视场下火龙果目标检测与计数方法,实现火龙果小目标精准识别与计数,完善火龙果机器人采前指导性工作。在目标特征提取阶段,采用动态可变形卷积C2F_DCNV2_Dynamic替换YOLOv8主干网络的C2F模块,引入Conv_offer_mask获取输入特征图的可变形偏移和掩码,使网络能够更好地适应目标形状的特征,提升复杂背景的目标特征提取能力;以多路协调注意力(multipath coordinate attention,MPCA)机制模块对输入进行多路径处理,使模型可以同时关注输入张量的空间信息与通道信息,提高网络对不同尺度和语境的特征感知能力,进而提升小目标识别精度;在目标预测阶段,使用基于端到端Transformer的检测器RT-DETR的Decoder Head替换YOLO Head,通过集合预测方法直接对目标进行预测和关联,去除传统非极大抑制步骤,提高推理速度,进一步提升网络实时性能;在目标计数阶段,结合Deep Sort算法实现果实区域计数。结果表明,改进的目标检测网络对火龙果果实检测的平均精度可达99.0%,在实时性测试中每秒传输32帧,模型大小为11.8 MB,果实计数精度达82.96%,平均检测速度为17帧·s-1。该方法能够精准识别与计数大视场条件下的小目标火龙果,且实时性满足果园实际生产环境。

关键词: 小目标火龙果识别, 果实计数, YOLOv8, C2F_DCNV2_Dynamic, MPCA, Decoder Head
