In the past decade, artificial intelligence (AI) technology has made incredible strides forward. Writing in the New York Times, technology columnist Kevin Roose recently announced “the golden age of progress” for AI.
These advances could not have been made without improvements in machine learning (ML) technology, including deep learning and reinforcement learning techniques. The emergence of hardware that can support computationally demanding AI models has also contributed. Some of the developments are slow and steady, while others present themselves as breakthroughs.
DeepMind’s AlphaFold is one such breakthrough in the scientific space. The company’s XLand technology may be set to repeat that success.
What is XLand?
XLand is a digital 3D learning environment for artificial intelligence agents. The environment looks like a colorful playground similar to video games. In this playground, players are faced with billions of different tasks to solve.
In this respect, XLand is like other AI training tools. As we delve deeper into this new release, it will become clear just how much further this environment goes.
This tool is much more than an AI playground. Tasks are set by changing the composition of the environment, the rules of the game and the number of players. There is also a playground manager who is responsible for adapting the rules and layout of the environment. Players are AI agents who use XLand to tackle progressively more complex tasks.
Both playground managers and AI players use a technique called reinforcement learning. They learn by trial and error, being rewarded for solving a problem correctly and punished when they get it wrong.
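To make the trial-and-error idea concrete, here is a minimal sketch in Python. It is not DeepMind's code or XLand's algorithm. It is a toy epsilon-greedy learner, and the function name `train_bandit`, the success probabilities and the +1/-1 reward scheme are all assumptions chosen for illustration.

```python
import random


def train_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy trial-and-error learner (epsilon-greedy); all values are illustrative."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    values = [0.0] * n_actions  # running estimate of each action's average reward
    counts = [0] * n_actions    # how often each action has been tried

    for _ in range(steps):
        # Occasionally try a random action, otherwise pick the best estimate so far.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])

        # Reward +1 for a success and -1 (the "punishment") for a failure.
        reward = 1.0 if rng.random() < reward_probs[action] else -1.0

        # Incremental average: nudge the estimate toward the observed reward.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


# Three hypothetical actions that succeed 20%, 50% and 80% of the time.
values, counts = train_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=lambda a: values[a])
print("estimated values:", [round(v, 2) for v in values])
print("preferred action:", best_action)
```

The learner is never told which action is best. It only sees rewards and penalties, and its estimates gradually steer it toward the action that succeeds most often.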
Features of XLand
Besides using reinforcement learning, XLand is based on open-ended learning. In this way, the tool mimics how people learn. Children during play, for example, learn without an explicit goal. They simply explore their surroundings with different toys to develop a better understanding of their world.
Here are some of the key features of XLand:
- Learning progresses from simple to complex tasks
- Learning is open-ended and reinforcement-based
- Players learn by experimenting
In the XLand environment, AI players start small and then move on to more complex tasks. This feature is another parallel to human learning. Babies tend to play with simple toys, solving simple tasks. As they age, their toys become more complex, at some point involving entire worlds.
AI players in XLand start by playing single-player games based on simple tasks like identifying a shape in a certain color. After performing well in simple single-player games, XLand presents players with more complex challenges. The tasks become more difficult and more players are added to the game.
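The progression from simple to complex tasks can be sketched as a small curriculum rule. This is only an illustration under stated assumptions: the `Task` fields, the 0.8 success threshold and the cap of three players are hypothetical and are not taken from XLand itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical stand-ins for task difficulty, not XLand's real task format.
    world_size: int   # larger worlds take longer to explore
    num_goals: int    # more goal conditions make the game more complex
    num_players: int  # more players join as difficulty grows


def next_task(task: Task, success_rate: float, threshold: float = 0.8) -> Task:
    """Return a harder task once the agent performs well on the current one."""
    if success_rate < threshold:
        return task  # keep practising at the current level
    return Task(
        world_size=task.world_size + 2,
        num_goals=task.num_goals + 1,
        num_players=min(task.num_players + 1, 3),  # assumed cap of three players
    )


# Start with a simple single-player game and feed in made-up success rates.
task = Task(world_size=4, num_goals=1, num_players=1)
for success_rate in [0.5, 0.9, 0.85, 0.3]:
    task = next_task(task, success_rate)
print(task)
```

Difficulty only increases after the agent has shown it can handle the current level, which mirrors the idea of moving on from simple single-player games to harder multiplayer ones.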
XLand challenged some of its players with up to 4,000 worlds and hundreds of thousands of different games. Some completed more than three million unique tasks. The space of possible tasks is effectively unbounded, meaning there is no one best thing to do in every situation.
This is a clear departure from the way most existing reinforcement learning tools work. With XLand, AI players are free to experiment. They can try a solution to see what happens instead of being limited by yes or no decisions. They may also try to use objects as tools to reach another object or hide behind something large enough. Again, the idea is not to limit learning and to allow players to learn as humans learn.
Human children experiment with their toys as well as their food. They discover the world around them naturally and iteratively. Scientists, for example, still apply the principle of experimentation later in life. Although they are usually guided by a hypothesis rather than asking a completely open-ended question, scientists are open to completely unexpected discoveries during their experimental process.
Theories of intelligence
The concept of artificial intelligence has been around for roughly 70 years. While some sources credit code breaker Alan Turing with laying the foundation of today’s AI, it was John McCarthy who coined the term for the Dartmouth workshop of 1956.
AI allows machines to perform tasks by simulating human intelligence. This technology does not replace the type of intelligence exhibited by humans or animals, but instead augments and copies it. These fundamentals have not changed since the early days of AI. What has changed, however, is that this technology has entered every aspect of our lives. From suggested viewing on streaming services to home assistants like Amazon’s Alexa, we’re surrounded by AI applications.
To better understand modern AI, it is useful to divide it into two categories: narrow AI and broad, or general, AI. The examples above are instances of narrow AI. Even chatbots count as narrow AI applications.
General AI is a much broader application of AI technology, aiming to approach the flexibility and adaptability of the human brain. At this point, true general AI remains more of a concept than a reality. However, tools like XLand may begin to change that.
Differences and challenges between simulations and the real world
Simulations are necessary for machine learning and any kind of AI application training. They allow machines to compress into a short time the lifetime of accumulated experience that humans benefit from. Without simulations, machines would probably take years to acquire human skills.
However, as powerful as simulations are, they cannot perfectly depict reality. Experts refer to this as the mismatch between the simulated and the real environment, and to the difficulty of transferring experience from one to the other as the “reality gap.” Although it is possible to improve the simulations, this type of optimization requires extreme effort, making the simulation somewhat less efficient.
Also, most simulators have flaws. Powerful machine learning algorithms manage to exploit these flaws and effectively fool the simulation. The problem is that these tricks rely on behavior that wouldn’t work in reality.
Technology is improving and the gap between simulation and reality is closing. For now, however, a combination of simulation and reality remains the best way to train RL applications.
The challenges of deep reinforcement learning
Before we look at the challenges of deep reinforcement learning, it’s worth clarifying some terms. Reinforcement learning is a part of machine learning. Machine learning refers to machines, such as computers, that learn from data without needing additional human input.
Deep learning takes this approach one step further by enabling the machine to analyze and process massive amounts of data. Data can be unstructured such as images, audio files and text. Deep learning allows computers to process much more data than humans could. To do this, the computer uses skills normally associated with human intelligence. These include learning, problem solving, observation and, of course, the ability to analyze data.
Reinforcement learning (RL) involves a trial-and-error process. Deep reinforcement learning uses the same principle but deals with larger amounts of data. Typically, RL involves AI players participating in round after round of games, having to repeat the process from scratch whenever they have to learn another game.
This limitation of RL is one of the biggest challenges that developers have to overcome when using these principles. Learning one game at a time is a relatively slow process compared to the human ability to adapt already learned skills to a new scenario.
Are we one step closer to general AI with XLand?
It would be fair to talk about general AI as the holy grail of artificial intelligence. To date, general AI remains just a concept. Opinions differ as to when the world will reach that stage. Some scientists predict that general AI will be a reality in less than 20 years. Others believe that due to our limited understanding of the human brain, true general AI may be centuries away.
So what role does XLand play in the process? XLand breaks the mold of reinforcement learning as we know it. Instead of repeating the same RL process over and over again, XLand presents AI agents with new tasks and trains them in a way that encourages them to apply already learned behaviors.
So far, the results are promising. DeepMind, the developer of XLand, has found that its training leads to “more generally capable agents.” The team notices emerging heuristic behaviors rather than the highly specific behaviors that AI agents typically display for their individual tasks. The DeepMind team has also seen agents experiment when they are unsure of the exact solution to apply to a given situation.
At this point, developers still have a lot of work to do before AI technology becomes truly general AI. However, tools like XLand bring us several steps closer to the goal. By changing the way reinforcement learning trains AI players and building more human-like learning environments, XLand has the potential to transform AI training, resulting in far more capable players.