Although their presence is not as common as predicted in the sci-fi movies of the 80s, robots are becoming increasingly integrated into our daily lives. From robotic vacuum cleaners that handle the dust in our homes to humanoids performing challenging parkour, the advance of robotics is visible in many fields. Despite this growing presence, however, robots are still expensive to build, fragile in most cases and, more importantly, difficult to train to adapt to different environments.
Strong perception is critical for a robot: it must understand its environment and operate within it. Imagine buying a robotic vacuum cleaner whose first instruction is to rearrange the layout of your home to resemble the test rooms in the factory so that the robot can work properly. That would be annoying and would probably cause most of us to return the product immediately. To overcome such limitations, the robot must be trained properly.
Training a robot in the physical world to expose it to different environments is possible, but it comes with numerous drawbacks. Setting up the logistics and replacing damaged robots is expensive, learning speed is limited to real time, and running many robots in parallel multiplies the logistical cost. Consequently, training in simulation has grown in popularity and remains an active research topic.
The most important issue to address when learning in simulation is generalization from simulation to the real world: how can we ensure that the physics of the simulation mimics the real world closely enough, and that its visuals are photorealistic enough to pass for real? To answer these questions, researchers from Stanford and UC Berkeley proposed the Gibson Environment, a perceptual and physical simulator.
Named after James J. Gibson, the author of The Ecological Approach to Visual Perception, the Gibson Environment aims to mimic both the physical and visual properties of the real world, so that perceptual agents can be trained and tested for real-world operation. Any agent, a car or a humanoid, for example, can be imported into the simulation, where it is embodied in its physical form and placed in a variety of real spaces. A new rendering mechanism provides a photorealistic, real-time visual stream, as if the agent had a built-in camera, while physical constraints such as collision and gravity act on the agent as they would in the real world.
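To make this concrete, the interaction between an embodied agent and such a simulator can be sketched as a sense-act loop: each step applies physics to the agent's body and returns a camera frame as the observation. The toy environment below is an illustrative assumption written in the style of an OpenAI Gym interface; it is not Gibson's actual API, and the class and method names are invented for this sketch.

```python
import numpy as np

class ToyEmbodiedEnv:
    """Stand-in for a perceptual simulator: each step applies simple
    physics (gravity plus a floor-collision constraint) and returns an
    RGB frame as if captured by the agent's onboard camera."""

    def __init__(self, frame_shape=(64, 64, 3)):
        self.frame_shape = frame_shape
        self.height = 1.0    # agent's height above the floor, metres
        self.velocity = 0.0  # vertical velocity, m/s
        self.dt = 0.1        # physics timestep, seconds

    def reset(self):
        self.height, self.velocity = 1.0, 0.0
        return self._render()

    def step(self, action):
        # Gravity pulls the agent down; a "jump" from the floor adds
        # upward velocity.
        if action == "jump" and self.height == 0.0:
            self.velocity = 2.0
        self.velocity -= 9.8 * self.dt
        self.height = max(0.0, self.height + self.velocity * self.dt)
        if self.height == 0.0:
            # Floor collision: the floor absorbs downward momentum.
            self.velocity = max(0.0, self.velocity)
        return self._render(), 0.0, False, {}

    def _render(self):
        # Placeholder for a photorealistic renderer: a blank frame with
        # the same shape a real camera stream would have.
        return np.zeros(self.frame_shape, dtype=np.uint8)

env = ToyEmbodiedEnv()
obs = env.reset()
for _ in range(20):
    obs, reward, done, info = env.step("forward")
```

The point of the sketch is the coupling: the agent never receives privileged state, only rendered frames, and its body is subject to the simulator's physics at every step, which is what allows policies trained this way to face the same constraints a physical robot would.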
Gibson is designed to preserve the performance of agents trained in simulation once they are deployed in the real world. This is achieved through the way the visual side of the simulation is constructed. First, the environment is built from scans of real spaces, not artificial ones. Second, a mechanism integrated into the simulation closes the gap between Gibson's renderings and actual camera captures. Together, these two mechanisms ensure that images from a real camera and from Gibson's visual renderer are statistically indistinguishable to the agent.
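The idea behind that second mechanism can be illustrated with a toy example: rather than demanding a perfectly photorealistic renderer, one can learn a corrective transform applied to the real camera feed so that rendered and real images end up statistically alike from the agent's point of view. The paper uses a learned neural transform for this; the sketch below substitutes a drastically simplified stand-in, per-channel mean and standard-deviation matching, purely to show the shape of the approach. The function names are our own, not from the Gibson codebase.

```python
import numpy as np

def fit_goggles(real_frames, rendered_frames):
    """Fit a per-channel affine transform (scale, shift) that maps the
    statistics of real camera frames onto those of rendered frames."""
    real = np.stack(real_frames).astype(np.float64)
    rend = np.stack(rendered_frames).astype(np.float64)
    axes = (0, 1, 2)  # average over frames and spatial dims, keep channels
    scale = rend.std(axis=axes) / (real.std(axis=axes) + 1e-8)
    shift = rend.mean(axis=axes) - scale * real.mean(axis=axes)
    return scale, shift

def apply_goggles(frame, scale, shift):
    """Deployment-time correction applied to one real camera frame."""
    return frame.astype(np.float64) * scale + shift

# Synthetic stand-ins for the two image domains: renders are brighter
# and less noisy than the (simulated) camera captures here.
rng = np.random.default_rng(0)
rendered = [rng.normal(120, 20, (32, 32, 3)) for _ in range(8)]
real = [rng.normal(100, 35, (32, 32, 3)) for _ in range(8)]

scale, shift = fit_goggles(real, rendered)
corrected = [apply_goggles(f, scale, shift) for f in real]
```

After correction, the first- and second-order statistics of the camera frames match those of the renders, so a policy trained on rendered input sees familiar-looking observations at deployment. The real system learns a far richer transform, but the design choice is the same: meet the renderer halfway instead of chasing perfect photorealism.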
The researchers demonstrated that a range of active perceptual tasks, such as obstacle avoidance, long-distance navigation, and stair climbing, learned in Gibson transferred successfully to the real world. While exciting and promising, Gibson still has limitations to address (e.g., including dynamic content such as moving objects, allowing manipulation within the simulation, etc.), as the authors note at the end of their research paper.
This article is a research summary written by Marktechpost staff based on the research paper 'Gibson Env: Real-World Perception for Embodied Agents'. All credit for this research goes to the researchers on this project. Check out the paper, the GitHub link, and the project link.
Ekrem Chetinkaya received a B.Sc. in 2018 and an M.Sc. in 2019 from Ozyegin University, Istanbul, Turkey. He wrote his M.Sc. thesis on image denoising using deep convolutional networks. He is currently pursuing a Ph.D. at the University of Klagenfurt, Austria, and works as a researcher on the ATHENA project. His research interests include deep learning, computer vision, and multimedia networking.