Felipe R. Campos
Programa de Pós-Graduação em Instrumentação, Controle e Automação de Processos de Mineração, Universidade Federal de Ouro Preto, Ouro Preto, MG; Instituto Tecnológico Vale, Ouro Preto, MG

Aline X. Fidêncio
Faculty of Electrical Engineering and Information Technology, Ruhr-University Bochum, Germany

Gustavo Pessin
Instituto Tecnológico Vale, Ouro Preto, MG

Gustavo M. Freitas
Departamento de Engenharia Elétrica, Universidade Federal de Minas Gerais, Belo Horizonte, MG
Keywords:
Robotics, Machine Learning, Reinforcement Learning, DDPG, PPO
Abstract
Autonomous robots play an important role in industry and in everyday life. Among their applications, manipulating and moving objects stand out because of the wide variety of tasks they enable. In static, known environments these tasks can be solved with logic planned by the developer, but this approach is not feasible in dynamic environments. Machine learning techniques, in particular Reinforcement Learning (RL) algorithms, seek to replace pre-defined programming by teaching the robot how to act. This paper presents the implementation of two RL algorithms, Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO), for position and orientation control of a 6-degree-of-freedom (6-DoF) robotic manipulator. The results show that DDPG converges faster on simpler tasks, but as the complexity of the problem increases it may fail to reach satisfactory behavior. PPO, on the other hand, can solve more complex problems, although it limits the rate of convergence to the best result in order to avoid learning instability.
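As a purely illustrative sketch of the two algorithms compared above, the snippet below trains DDPG and PPO agents with the stable-baselines3 library on a placeholder continuous-control Gymnasium environment standing in for the 6-DoF manipulator task. The library choice, the "Pendulum-v1" environment, and the hyperparameters are assumptions for illustration only, not the configuration used in this work.

```python
# Minimal sketch: training DDPG and PPO agents on a continuous-control task.
# Assumptions: stable-baselines3 and Gymnasium are installed; "Pendulum-v1"
# is a stand-in for the paper's 6-DoF manipulator environment.
import gymnasium as gym
from stable_baselines3 import DDPG, PPO

def train(algo_cls, env_id="Pendulum-v1", timesteps=50_000):
    """Train one agent with default hyperparameters and return the model."""
    env = gym.make(env_id)
    model = algo_cls("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=timesteps)
    env.close()
    return model

if __name__ == "__main__":
    ddpg_model = train(DDPG)  # off-policy, deterministic actor-critic
    ppo_model = train(PPO)    # on-policy, clipped surrogate objective
```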