Optimizing Autonomous Navigation: Advances in LiDAR-based Object Recognition with Modified Voxel-RCNN
Corresponding Author(s): Firman
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, Vol. 10, No. 2, May 2025
Abstract
This study aimed to enhance the object recognition capabilities of autonomous vehicles in constrained and dynamic environments. By integrating Light Detection and Ranging (LiDAR) data with a modified Voxel-RCNN framework, the system detected and classified six object classes: human, wall, car, cyclist, tree, and cart. This integration improved the safety and reliability of autonomous navigation. The methodology comprised preparing a point cloud dataset, converting it into the KITTI format for compatibility with the Voxel-RCNN pipeline, and comprehensive model training. The framework was evaluated using precision, recall, F1-score, and mean average precision (mAP). Modifications to the Voxel-RCNN framework were introduced to improve classification accuracy and address challenges encountered in complex navigation scenarios. Experimental results demonstrated the robustness of the proposed modifications: Modification 2 consistently outperformed the baseline, raising the 3D detection score for the car class in hard scenarios from 4.39 to 10.31, while Modification 3 achieved the lowest training loss of 1.68 after 600 epochs, indicating significant improvements in model optimization. However, variability in the real-world performance of Modification 3 highlighted the need to balance training-time optimization against practical applicability. Overall, training loss decreased by up to 29.1%, and the system achieved substantial improvements in detection accuracy under challenging conditions. These findings underscore the potential of the proposed system to advance the safety and intelligence of autonomous vehicles, providing a solid foundation for future research in autonomous navigation and object recognition.
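
For readers unfamiliar with the evaluation metrics named above, the short Python sketch below shows how precision, recall, and F1-score follow from per-class true-positive (TP), false-positive (FP), and false-negative (FN) counts. It is an illustrative sketch only: the counts are hypothetical placeholders, not code or results from the study.

```python
# Illustrative sketch: precision, recall, and F1 from per-class detection counts.
# The TP/FP/FN values below are hypothetical, not results from the paper.

def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Derive precision, recall, and F1-score from raw detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of predictions that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of ground-truth objects found
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical counts for a single class (e.g., "car"):
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=20)
print(f"precision={p:.3f}  recall={r:.3f}  f1={f1:.3f}")
# precision=0.900  recall=0.818  f1=0.857
```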