This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Optimizing Autonomous Navigation: Advances in LiDAR-based Object Recognition with Modified Voxel-RCNN
Corresponding Author(s): Firman
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control,
Vol. 10, No. 2, May 2025
Abstract
This study aimed to enhance the object recognition capabilities of autonomous vehicles in constrained and dynamic environments. By integrating Light Detection and Ranging (LiDAR) technology with a modified Voxel-RCNN framework, the system detected and classified six object classes: human, wall, car, cyclist, tree, and cart. This integration improved the safety and reliability of autonomous navigation. The methodology included the preparation of a point cloud dataset, conversion into the KITTI format for compatibility with the Voxel-RCNN pipeline, and comprehensive model training. The framework was evaluated using metrics such as precision, recall, F1-score, and mean average precision (mAP). Modifications to the Voxel-RCNN framework were introduced to improve classification accuracy, addressing challenges encountered in complex navigation scenarios. Experimental results demonstrated the robustness of the proposed modifications. Modification 2 consistently outperformed the baseline, with 3D detection scores for the car class in hard scenarios increasing from 4.39 to 10.31. Modification 3 achieved the lowest training loss of 1.68 after 600 epochs, indicating significant improvements in model optimization. However, variability in the real-world performance of Modification 3 highlighted the need to balance optimized training with practical applicability. Overall, training loss decreased by up to 29.1%, and the system achieved substantial improvements in detection accuracy under challenging conditions. These findings underscore the potential of the proposed system to advance the safety and intelligence of autonomous vehicles, providing a solid foundation for future research in autonomous navigation and object recognition.
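The evaluation metrics named in the abstract (precision, recall, F1-score) are derived from per-class detection counts. A minimal sketch of that computation; the function name and the example counts are illustrative, not taken from the paper:

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1-score from detection counts.

    tp: true positives (correct detections)
    fp: false positives (spurious detections)
    fn: false negatives (missed objects)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one class (e.g. "car"): 8 correct detections,
# 2 false alarms, 2 missed objects.
p, r, f = precision_recall_f1(tp=8, fp=2, fn=2)
```

mAP additionally averages per-class average precision over recall thresholds, which requires ranked detection scores rather than raw counts, so it is not reproduced here.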
References
M. N. Ahangar, Q. Z. Ahmed, F. A. Khan, and M. Hafeez, “A Survey of Autonomous Vehicles: Enabling Communication Technologies and Challenges,” Sensors, vol. 21, no. 3, p. 706, 2021. https://doi.org/10.3390/s21030706
R. Keith and H. La, “Review of Autonomous Mobile Robots for the Warehouse Environment,” 2024. https://doi.org/10.48550/arXiv.2406.08333
A. Roshanianfard, N. Noguchi, H. Okamoto, and K. Ishii, “A review of autonomous agricultural vehicles (The experience of Hokkaido University),” J. Terramechanics, vol. 91, pp. 155–183, 2020. https://doi.org/10.1016/j.jterra.2020.06.006
M. Ibiyemi and D. Olutimehin, “Revolutionizing logistics: The impact of autonomous vehicles on supply chain efficiency,” Int. J. Sci. Res. Updat., vol. 8, pp. 9–26, 2024. https://doi.org/10.53430/ijsru.2024.8.1.0042
E. Yurtsever, J. Lambert, A. Carballo, and K. Takeda, “A Survey of Autonomous Driving: Common Practices and Emerging Technologies,” IEEE Access, vol. 8, pp. 58443–58469, 2020. https://doi.org/10.1109/ACCESS.2020.2983149
R. Qian, X. Lai, and X. Li, “3D Object Detection for Autonomous Driving: A Survey,” Pattern Recognit., vol. 130, p. 108796, 2022. https://doi.org/10.1016/j.patcog.2022.108796
F. Liu, Z. Lu, and X. Lin, “Vision-based environmental perception for autonomous driving,” Proc. Inst. Mech. Eng. Part D J. Automob. Eng., 2023. https://doi.org/10.1177/09544070231203059
L. Peng, H. Wang, and J. Li, “Uncertainty Evaluation of Object Detection Algorithms for Autonomous Vehicles,” Automot. Innov., vol. 4, 2021. https://doi.org/10.1007/s42154-021-00154-0
L. Bai, Y. Zhao, and X. Huang, “Enabling 3D Object Detection with a Low-Resolution LiDAR,” IEEE Embedded Syst. Lett., 2022.
Y. Wang, W. Chao, D. Garg, B. Hariharan, M. Campbell, and K. Q. Weinberger, “Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2019.
J. Deng, S. Shi, P. Li, W. Zhou, Y. Zhang, and H. Li, “Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection,” in Proc. AAAI Conf. Artif. Intell., vol. 35, no. 2, pp. 1201–1209, 2021.
S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, 2017. https://doi.org/10.1109/TPAMI.2016.2577031
K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” 3rd Int. Conf. Learn. Represent. ICLR 2015 - Conf. Track Proc., pp. 1–14, 2015.
N. L. W. Keijsers, “Neural Networks,” Encycl. Mov. Disord. Three-Volume Set, pp. V2-257-V2-259, 2010. https://doi.org/10.1016/B978-0-12-374105-9.00493-7
K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 2, pp. 386–397, 2020. https://doi.org/10.1109/TPAMI.2018.2844175
J. Shin, J. Kim, K. Lee, H. Cho, and W. Rhee, “Diversified and Realistic 3D Augmentation via Iterative Construction, Random Placement, and HPR Occlusion,” Proc. 37th AAAI Conf. Artif. Intell. AAAI 2023, vol. 37, pp. 2282–2291, 2023. https://doi.org/10.1609/aaai.v37i2.25323
A. Dosovitskiy et al., “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale,” ICLR 2021 - 9th Int. Conf. Learn. Represent., 2021.
D. P. Kingma and J. L. Ba, “Adam: A method for stochastic optimization,” 3rd Int. Conf. Learn. Represent. ICLR 2015 - Conf. Track Proc., pp. 1–15, 2015.
Y. Bengio, Learning deep architectures for AI, vol. 2, no. 1. 2009. https://doi.org/10.1561/2200000006
T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” 1st Int. Conf. Learn. Represent. ICLR 2013 - Work. Track Proc., pp. 1–12, 2013.
J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” NAACL HLT 2019 - 2019 Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. - Proc. Conf., vol. 1, pp. 4171–4186, 2019.
Z. Chao, F. Pu, Y. Yin, B. Han, and X. Chen, “Research on real-time local rainfall prediction based on MEMS sensors,” J. Sensors, vol. 2018, pp. 1–9, 2018. https://doi.org/10.1155/2018/6184713
G. Cohen and R. Giryes, “Generative Adversarial Networks,” in Machine Learning for Data Science Handbook, 3rd ed., pp. 375–400, 2023. https://doi.org/10.1007/978-3-031-24628-9_17
Z. Tian, C. Shen, H. Chen, and T. He, “FCOS: Fully convolutional one-stage object detection,” Proc. IEEE Int. Conf. Comput. Vis. (ICCV), pp. 9626–9635, 2019. https://doi.org/10.1109/ICCV.2019.00972
Q. Zhong and X.-F. Han, “Point Cloud Learning with Transformer,” 2021. https://doi.org/10.48550/arXiv.2104.13636
A. Howard et al., “Searching for MobileNetV3,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1314–1324. https://doi.org/10.1109/ICCV.2019.00140