Istraživanja i projektovanja za privredu / Journal of Applied Engineering Science

TRAINING NEURAL NETWORKS USING REINFORCEMENT LEARNING FOR REACTIVE PATH PLANNING


DOI: 10.5937/jaes0-26386 
This is an open access article distributed under the Creative Commons Attribution 4.0 License (CC BY 4.0)

Volume 19, article 762, pages 48–56

Milton Vicente Calderón*
Universidad Distrital Francisco José de Caldas, Faculty of Technology, Department of Electronics, Bogotá D.C., Colombia

Esperanza Camargo Casallas
Universidad Distrital Francisco José de Caldas, Faculty of Technology, Department of Electronics, Bogotá D.C., Colombia

Mobile robots are devices in rapid expansion given the possibilities their applications offer, and to an even greater extent autonomous robots, which do not require an operator to perform their functions. Consolidating this autonomy requires a path-planning system that produces a viable and, as far as possible, optimal route. This study develops a reactive two-dimensional path-planning method using neural networks trained with reinforcement learning. The complexity of the scenario between the initial and final points comes from warning and forbidden obstacle zones, and the experimentation is carried out on different neural network architectures, each acting as the agent of the reinforcement-learning algorithm, of the DQN and DDQN types. The best results are obtained with DDQN training, which reaches the objective in 89% of the validation episodes, although the DQN method proves to be 15.63% faster in its successful cases. This work was carried out within the DIGITI research group of the Universidad Distrital Francisco José de Caldas.
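The article does not reproduce its training code, but the difference between the two agents it compares comes down to one line of the training update. The sketch below is an illustrative PyTorch fragment, not the authors' implementation: the network shape, the 4-value state encoding, the reward scheme for warning and forbidden zones, and every identifier (QNet, td_targets, train_step) are assumptions introduced here. It shows that a standard DQN target lets the target network both select and evaluate the next action through the max operator, while the DDQN target selects the action with the online network and evaluates it with the target network.

```python
import random
from collections import deque

import torch
import torch.nn as nn

# Assumed encoding (not from the paper): state = (x, y, goal_x, goal_y),
# four actions (up, down, left, right). A plausible reward scheme would be
# a small step cost, a penalty inside warning zones, a large terminal
# penalty inside forbidden zones, and a bonus on reaching the goal.
N_ACTIONS = 4

class QNet(nn.Module):
    """Small fully connected Q-network: state -> one Q-value per action."""
    def __init__(self, n_inputs=4, n_hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_inputs, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, N_ACTIONS),
        )

    def forward(self, x):
        return self.net(x)

online, target = QNet(), QNet()
target.load_state_dict(online.state_dict())   # target starts as a copy;
                                              # full training re-syncs it periodically
optimizer = torch.optim.Adam(online.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                 # experience replay buffer
GAMMA = 0.99

def td_targets(r, s2, done, double_dqn):
    """TD targets; the `double_dqn` flag is the only DQN/DDQN difference."""
    with torch.no_grad():
        if double_dqn:
            # DDQN: the online net picks the action, the target net evaluates it.
            best_a = online(s2).argmax(dim=1, keepdim=True)
            next_q = target(s2).gather(1, best_a).squeeze(1)
        else:
            # DQN: the target net both picks and evaluates (max operator).
            next_q = target(s2).max(dim=1).values
    return r + GAMMA * (1.0 - done) * next_q

def train_step(double_dqn=True, batch_size=32):
    """One gradient step on a random minibatch from the replay buffer."""
    if len(replay) < batch_size:
        return
    s, a, r, s2, done = map(torch.stack, zip(*random.sample(replay, batch_size)))
    q = online(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    loss = nn.functional.smooth_l1_loss(q, td_targets(r, s2, done, double_dqn))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Decoupling action selection from evaluation is what reduces the overestimation bias of the max operator in Van Hasselt's double Q-learning, which is consistent with the paper's finding that the DDQN agent reaches the goal more reliably even though the plain DQN agent was faster in its successful episodes.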


The current work was developed with the support of CECAD (the high-performance computing center of the Universidad Distrital Francisco José de Caldas), which provided the technological platform required to train and test the neural networks. The name of the project is "Plataforma estratosférica de vuelo autónomo Sabio Caldas", project code 3307348415.
