Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, they were able to recover intelligible speech from the vibrations of a potato-chip bag photographed from 15 feet away through soundproof glass.
In other experiments, they extracted useful audio signals from videos of aluminum foil, the surface of a glass of water, and even the leaves of a potted plant. The researchers will present their findings in a paper at this year’s Siggraph, the premier computer graphics conference.
“When sound hits an object, it causes the object to vibrate,” says Abe Davis, a graduate student in electrical engineering and computer science at MIT and first author on the new paper. “The motion of this vibration creates a very subtle visual signal that’s usually invisible to the naked eye. People didn’t realize that this information was there.”
Joining Davis on the Siggraph paper are Frédo Durand and Bill Freeman, both MIT professors of computer science and engineering; Neal Wadhwa, a graduate student in Freeman’s group; Michael Rubinstein of Microsoft Research, who did his PhD with Freeman; and Gautham Mysore of Adobe Research.
Reconstructing audio from video requires that the frequency of the video samples — the number of frames of video captured per second — be higher than the frequency of the audio signal. In some of their experiments, the researchers used a high-speed camera that captured 2,000 to 6,000 frames per second. That’s much faster than the 60 frames per second possible with some smartphones, but well below the frame rates of the best commercial high-speed cameras, which can top 100,000 frames per second.
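The frame-rate requirement can be made concrete with a little arithmetic. The sketch below is an illustration, not the researchers' code. It applies the standard sampling-theorem condition, which is somewhat stricter than the wording above: uniform sampling at a given frame rate represents frequencies unambiguously only up to half that rate. The function names are hypothetical and chosen for this example.

```python
def nyquist_limit_hz(frames_per_second: float) -> float:
    """Highest frequency that uniform sampling at this frame rate can
    represent without ambiguity (half the sampling rate)."""
    return frames_per_second / 2.0


def apparent_frequency_hz(true_hz: float, frames_per_second: float) -> float:
    """Frequency at which a pure tone appears after sampling.

    Tones above the Nyquist limit alias: they fold back into the
    range [0, frames_per_second / 2].
    """
    folded = true_hz % frames_per_second
    return min(folded, frames_per_second - folded)


if __name__ == "__main__":
    # Frame rates mentioned in the article: a 60 fps camera and the
    # 2,000 to 6,000 fps high-speed camera.
    for fps in (60, 2000, 6000):
        limit = nyquist_limit_hz(fps)
        seen = apparent_frequency_hz(440.0, fps)
        print(f"{fps} fps: limit {limit} Hz, a 440 Hz tone appears at {seen} Hz")
```

Under these assumptions, frame-by-frame sampling at 60 frames per second cannot distinguish a 440 Hz tone from a much lower one, while the high-speed frame rates can represent it directly.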
In other experiments, however, they used an ordinary digital camera. Because of a quirk in the design of most cameras’ sensors, the researchers were able to infer information about high-frequency vibrations even from video recorded at a standard 60 frames per second. While this audio reconstruction wasn’t as faithful as that with the high-speed camera, it may still be good enough to identify the gender of a speaker in a room; the number of speakers; and even, given accurate enough information about the acoustic properties of speakers’ voices, their identities.
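The passage above does not name the sensor quirk. The sketch below assumes it is the rolling shutter, in which a sensor reads out one row of pixels at a time, so each row of a frame records a slightly different instant. It is a rough illustration of why that yields many more time samples than frames, not the researchers' method. The row count and the `readout_fraction` parameter are hypothetical values chosen for the example.

```python
def row_sample_times(frames_per_second, rows_per_frame, num_frames,
                     readout_fraction=1.0):
    """Capture time in seconds of every row, assuming rows are read out
    one after another (rolling shutter).

    readout_fraction is the assumed share of each frame period spent
    reading rows; the rest is an idle gap before the next frame.
    """
    frame_period = 1.0 / frames_per_second
    row_delay = readout_fraction * frame_period / rows_per_frame
    return [
        frame * frame_period + row * row_delay
        for frame in range(num_frames)
        for row in range(rows_per_frame)
    ]


def row_rate_hz(frames_per_second, rows_per_frame, readout_fraction=1.0):
    """Rate at which successive rows are captured during readout."""
    return frames_per_second * rows_per_frame / readout_fraction


if __name__ == "__main__":
    # Hypothetical sensor: 1080 rows, recording at 60 frames per second.
    print(row_rate_hz(60, 1080))
    print(row_sample_times(60, 4, 2))
```

Under these assumptions, each frame contributes one time sample per row instead of a single sample, which is consistent with recovering vibrations well above what 60 frames per second alone would allow. Any gap between frames leaves stretches with no samples, one plausible reason the reconstruction is less faithful than with a high-speed camera.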