IARPA Needs More Training Data for Video Surveillance Algorithms
By Jack Corrigan, Staff Correspondent, Nextgov, May 10, 2019
The data would improve the tech’s ability to link
together footage shot across a broad geographic space, allowing it to better
track and identify potential targets.
The intelligence community’s research arm wants to train
algorithms to track people across sprawling video surveillance networks, and it
needs more data to do it.
The Intelligence Advanced Research Projects Activity is
recruiting teams to build bigger, better datasets to train computer vision
algorithms that would monitor people as they move through urban environments.
The training data would improve the tech’s ability to link together footage
from a large network of security cameras, allowing it to better track and
identify potential targets.
Computer vision is a type of artificial intelligence that
allows computers to interpret images and videos. Many law enforcement and
public safety organizations already use the tech to investigate crimes, monitor
critical infrastructure and secure major events that could be targets for
terrorists. An early version of the tech was used to identify the perpetrators
of the Boston Marathon bombing in 2013, for instance, and its popularity has
only grown in the years since.
But according to IARPA, the data used to train algorithms
today is fairly narrow, which limits their ability to handle the wide
range of situations they'd encounter in the real world. With the new datasets,
officials aim to improve the training process and enable computer vision
systems to connect footage shot from cameras positioned across a broad
geographic area.
“Further research in the area of computer vision within
multi-camera video networks may support post-event crime scene reconstruction,
protection of critical infrastructure and transportation facilities, military
force protection, and in the operations of National Special Security Events,”
IARPA officials wrote in the solicitation.
Under the solicitation, selected vendors would compile
roughly 960 hours of video footage covering a variety of environments and
scenarios.
The dataset must include footage from at least 20
different security cameras with “varying positions, views, resolutions and
frame rates” scattered across roughly 2.5 acres of “urban or semi-urban space.”
The videos would be shot at all hours of the day and in different weather
conditions, and include pedestrians, moving vehicles, street signs and other
“distractors.”
The footage must also include at least 200 test subjects
behaving in different ways across the camera network. Ultimately, these are the
people the algorithms would focus on to sharpen their identification and
tracking skills.
Interested vendors must respond to the solicitation by
May 17.