This clever AI hid data from its creators to cheat at its appointed task
This incident reveals a problem as old as computing itself: computers do exactly what you tell them to do.
The intention of the researchers was, as
you might guess, to accelerate and improve the process of turning satellite
imagery into Google’s famously accurate maps. To that end the team was working
with what’s called a CycleGAN — a neural network that learns to transform images of two types, X and Y, into one another as efficiently and accurately as possible through a great deal of experimentation.
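The key signal a CycleGAN trains on is a cycle-consistency score: convert an image to the other domain, convert it back, and measure how much was lost in the round trip. Here is a toy sketch of that idea — the generators below are stand-in functions I've invented for illustration, not anything from the paper:

```python
# Toy sketch of CycleGAN's cycle-consistency objective (illustrative only).
# G maps domain X (aerial photo) to Y (street map); F maps Y back to X.
# The loss rewards round-trip fidelity — the very score the agent gamed.
import numpy as np

def G(x):
    # Stand-in generator X -> Y (a real one is a neural network).
    return x * 0.5 + 1.0

def F(y):
    # Stand-in generator Y -> X, here the exact inverse of G.
    return (y - 1.0) * 2.0

def cycle_consistency_loss(x, y):
    # || F(G(x)) - x ||_1  +  || G(F(y)) - y ||_1
    return np.abs(F(G(x)) - x).mean() + np.abs(G(F(y)) - y).mean()

x = np.random.rand(8, 8)  # fake "aerial" image
y = np.random.rand(8, 8)  # fake "street map" image
print(cycle_consistency_loss(x, y))  # near zero: these toy maps invert cleanly
```

Because the loss only checks the round trip, a model can score perfectly by smuggling the original image through the intermediate one rather than truly translating it.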
In some early results, the agent was doing
well — suspiciously well.
What tipped the team off was that, when the agent reconstructed aerial
photographs from its street maps, there were lots of details that didn’t seem
to be on the latter at all. For instance, skylights on a roof that were
eliminated in the process of creating the street map would magically reappear
when they asked the agent to do the reverse process:
The intention was for the agent to be able
to interpret the features of either type of map and match them to the correct
features of the other. But what the agent was actually being graded on (among other things) was how closely a reconstructed aerial photo matched the original, and the clarity of the street map.
So it didn’t learn how
to make one from the other. It learned how to subtly encode the features of one
into the noise patterns of the other. The details of the aerial map are
secretly written into the actual visual data of the street map: thousands of
tiny changes in color that the human eye wouldn’t notice, but that the computer
can easily detect.
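The CycleGAN invented its own high-frequency encoding rather than using any standard scheme, but the principle is the same as classic least-significant-bit steganography. This hypothetical sketch (my illustration, not the paper's method) shows how changes of ±1 in 8-bit color values stay invisible to the eye yet remain perfectly machine-readable:

```python
# Minimal LSB-steganography sketch (illustrative, not the paper's method):
# hide one bit of secret data in the lowest bit of each cover pixel.
import numpy as np

def hide(cover, secret_bits):
    # Clear each pixel's lowest bit, then write one secret bit into it.
    return (cover & 0xFE) | secret_bits

def reveal(stego):
    # Read the lowest bit of each pixel back out.
    return stego & 0x01

cover = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)   # "street map"
secret = np.random.randint(0, 2, size=(4, 4), dtype=np.uint8)    # "aerial" bits

stego = hide(cover, secret)
# Each pixel changed by at most 1 out of 255 — imperceptible to a human...
assert np.all(np.abs(stego.astype(int) - cover.astype(int)) <= 1)
# ...yet the hidden data comes back exactly.
assert np.array_equal(reveal(stego), secret)
```

One low bit per channel is already enough capacity to stash a surprising amount of data, which is why the smuggled aerial details cost the street map almost nothing visually.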
In fact, the computer got so good at slipping these details into the street maps that it learned to encode the data from any aerial map into any street map! It doesn’t even have to pay attention to the “real” street map — all the data needed for reconstructing the aerial photo can be superimposed harmlessly on a completely different street map, as the researchers confirmed:
This practice of encoding data into images
isn’t new; it’s an established science called steganography, and it’s used all
the time to, say, watermark images or add metadata like camera settings. But a
computer creating its own steganographic method to evade having to actually
learn to perform the task at hand is rather new.
(Well, the research came out last year, so it isn’t new new, but it’s
pretty novel.)
One could easily take this as a step in
the “the machines are getting smarter” narrative, but the truth is it’s almost
the opposite. The machine, not smart enough to do the actual difficult job of
converting these sophisticated image types to each other, found a way to cheat
that humans are
bad at detecting. This could be avoided with more stringent evaluation of the
agent’s results, and no doubt the researchers went on to do that.
As always, computers do exactly what they
are asked, so you have to be very specific in what you ask them. In this case
the computer’s solution was an interesting one that shed light on a possible
weakness of this type of neural network — that the computer, if not explicitly
prevented from doing so, will essentially find a way to transmit details to
itself in the interest of solving a given problem quickly and easily.
This is really just a lesson in the oldest adage in computing: PEBKAC. “Problem exists between keyboard and chair.” Or
as HAL put it: “It can only be attributable to human error.”
The paper,
“CycleGAN, a Master of Steganography,” was presented at the
Neural Information Processing Systems conference in 2017. Thanks to Fiora Esoterica and
Reddit for bringing this old but interesting paper to my attention.