Google computer mimics human brain, learns from new experience
Is playing 'Space Invaders' a milestone in artificial
intelligence?
Researchers with Google's DeepMind project created a
computer loosely based on brain architecture that mastered computer games --
such as Space Invaders -- without any knowledge of their rules.
By Geoffrey Mohan
Computers have beaten humans at chess and
"Jeopardy!," and now they can master old Atari games such as
"Space Invaders" or "Breakout" without knowing anything
about their rules or strategies.
Playing Atari 2600 games from the 1980s may seem a bit
"Back to the Future," but researchers with Google's DeepMind project
say they have taken a small but crucial step toward a general learning machine
that can mimic the way human brains learn from new experience.
Unlike the Watson and Deep Blue computers that beat
"Jeopardy!" and chess champions with intensive programming specific
to those games, the Deep-Q Network built its winning strategies from keystrokes
up, through trial and error and constant reprocessing of feedback.
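In Q-learning terms (the standard reinforcement-learning formulation the network's "Q" refers to), that trial-and-error loop amounts to repeatedly nudging an estimate of a move's value toward the feedback the move actually produced. The tabular Python sketch below is purely illustrative -- the published system replaces the table with a deep neural network -- and the function names are invented for this example.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.05  # learning rate, discount, exploration
Q = defaultdict(float)                   # Q[(state, action)] -> estimated value

def choose_action(state, actions):
    # Random exploration "from keystrokes up": occasionally try anything.
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, actions):
    # Nudge the estimate toward reward + discounted best future value.
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```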
“The ultimate goal is to build smart, general-purpose
[learning] machines. We’re many decades off from doing that," said
artificial intelligence researcher Demis Hassabis, coauthor of the study
published online Wednesday in the journal Nature. "But I do think this is
the first significant rung of the ladder that we’re on."
The Deep-Q Network computer, developed by the
London-based Google DeepMind, played 49 old-school Atari games, scoring
"at or better than human level" on 29 of them, according to the
study.
The algorithm approach, based loosely on the architecture
of human neural networks, could eventually be applied to any complex and
multidimensional task requiring a series of decisions, according to the
researchers.
The algorithms employed in this type of machine learning
depart strongly from approaches that rely on a computer's ability to weigh
stunning amounts of inputs and outcomes and choose programmed models to
"explain" the data. Those approaches, known as supervised learning,
require artful tailoring of algorithms around specific problems, such as a
chess game.
The computer instead relies on random exploration of
keystrokes bolstered by human-like reinforcement learning, where a reward
essentially takes the place of such supervision.
“In supervised learning, there’s a teacher that says what
the right answer was," said study coauthor David Silver. "In
reinforcement learning, there is no teacher. No one says what the right action
was, and the system needs to discover by trial and error what the correct
action or sequence of actions was that led to the best possible desired
outcome.”
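In code, the distinction Silver draws comes down to what supplies the learning signal: there is no labeled right answer, only the score the game hands back. Below is a deliberately tiny, invented one-state "game" in Python that learns from reward alone which of two moves pays off; it is a sketch of the principle, not the DeepMind system.

```python
import random

ACTIONS = ["left", "right"]
Q = {a: 0.0 for a in ACTIONS}   # value estimates for a one-state toy game
ALPHA = 0.1

def play(action):
    # Hypothetical environment: no teacher ever says "right" is correct --
    # the agent only observes that this move tends to score.
    return 1.0 if action == "right" and random.random() < 0.8 else 0.0

for _ in range(1000):
    if random.random() < 0.1:                 # occasional random exploration
        action = random.choice(ACTIONS)
    else:                                     # otherwise exploit best estimate
        action = max(ACTIONS, key=Q.get)
    reward = play(action)                     # the only feedback available
    Q[action] += ALPHA * (reward - Q[action])  # move estimate toward reward

print(Q)  # after training, "right" carries the higher estimated value
```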
The computer "learned" over the course of
several weeks of training, in hundreds of trials, based only on the video pixels
of the game -- the equivalent of a human looking at screens and manipulating a
cursor without reading any instructions, according to the study.
Over the course of that training, the computer built up
progressively more abstract representations of the data in ways similar to
human neural networks, according to the study.
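Those progressively more abstract representations come from a convolutional network that maps raw screen pixels to an estimated value for each joystick action. The sketch below follows the layer sizes reported in the Nature paper, but it is a reconstruction in modern PyTorch rather than the authors' original code, so treat it as an approximation.

```python
import torch
import torch.nn as nn

class DeepQNetwork(nn.Module):
    """Pixels in, one estimated value per joystick action out.

    Layer sizes follow the architecture described in the Nature paper:
    four stacked 84x84 grayscale frames pass through three convolutional
    layers and two fully connected layers.
    """
    def __init__(self, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),   # coarse edges/motion
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),  # object-level features
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),  # finer abstractions
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),   # estimated value of each action
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.net(frames / 255.0)  # scale raw pixel values to [0, 1]

# Example: one stacked observation for a game with 18 joystick actions.
q_values = DeepQNetwork(18)(torch.zeros(1, 4, 84, 84))
print(q_values.shape)  # torch.Size([1, 18])
```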
There was nothing about the learning algorithms, however,
that was specific to Atari, or to video games for that matter, the researchers
said.
The computer eventually figured out such insider gaming
strategies as carving a tunnel through the bricks in "Breakout" to
reach the back of the wall. And it found a few tricks that were unknown to the
programmers, such as keeping a submarine hovering just below the surface of the
ocean in "Seaquest."
The computer's limits, however, became evident in the
games at which it failed, sometimes spectacularly. It was miserable at
"Montezuma's Revenge," and performed nearly as poorly at "Ms.
Pac-Man." That's because those games also require more sophisticated exploration,
planning and complex route-finding, said coauthor Volodymyr Mnih.
And though the computer may be able to match the
video-gaming proficiency of a 1980s teenager, its overall
"intelligence" hardly reaches that of a pre-verbal toddler. It cannot
build conceptual or abstract knowledge, doesn't find novel solutions and can
get stuck trying to exploit its accumulated knowledge rather than abandoning it
and resorting to random exploration, as humans do.
“It’s mastering and understanding the construction of
these games, but we wouldn’t say yet that it’s building conceptual knowledge,
or abstract knowledge," said Hassabis.
The researchers chose the Atari 2600 platform in part
because it offered an engineering sweet spot -- not too easy and not too hard.
They plan to move into the 1990s, toward 3-D games involving complex
environments, such as the "Grand Theft Auto" franchise. That
milestone could come within five years, said Hassabis.
“With a few tweaks, it should be able to drive a real
car,” Hassabis said.
DeepMind was formed in 2010 by Hassabis, Shane Legg and
Mustafa Suleyman, and received funding from Tesla Motors' Elon Musk and
Facebook investor Peter Thiel, among others. It was purchased by Google last
year, for a reported $650 million. Hassabis, a chess prodigy and game designer,
met Legg, an algorithm specialist, while studying at the Gatsby Computational
Neuroscience Unit at University College London. Suleyman, an entrepreneur who
dropped out of Oxford University, is a partner in Reos, a conflict-resolution
consulting group.