A Cambridge non-profit partnered with Google to help people with ALS preserve their voice through A.I.

Project Euphonia pulls from thousands of voice recordings to translate compromised speech back into a person's original voice.

A WAV file that's been converted into an image file and then graphed to show the quality of someone's voice. –Steve Perrin, ALS Therapy Development Institute

“I owe you a yoyo today.”

This phrase started a whole database of words and idioms that researchers have used to help patients be understood after they’ve been diagnosed with amyotrophic lateral sclerosis, or ALS. 

“We picked that phrase when we started the project in 2014 because that phrase has a lot of interesting resonance components to it,” said Steven Perrin, CEO of the Cambridge-based ALS Therapy Development Institute (ALS TDI). “It’s used in a lot of speech analysis scientific studies over the years, and so we copied it.”

Starting with just this sentence, and collecting thousands more, Perrin sent the data to Google in hopes of finding better ways to track the progression of the disease. But it grew into a project that could help voice recognition technology understand compromised speech and, eventually, translate that speech back into a person’s original voice.

“Project Euphonia, I would say, started by accident,” Perrin said.

It started as just another way of collecting data for ALS TDI’s precision medicine program in 2014. 

“If there were 100 newly diagnosed patients in the room here with us, I couldn’t tell you which one is going to lose their battle with ALS in two months, and which one could live as long as Stephen Hawking,” Perrin said.

So the precision medicine program sought a better way to measure ALS’s progression by learning from those living with the neurodegenerative disease. 

And since no one had ever tried such a program for ALS, Perrin said, they decided to record as much data in as many forms as possible: asking 600 questions about each patient’s life history, sequencing their full genomes after quarterly blood draws, and asking for monthly recordings of their voice.

They moved forward analyzing most of the data except for the recordings, having no idea what to do with them.

But then, Perrin said he met with Google and asked if they could help look at the voice recordings to see if they could correlate the client’s voice with the disease’s progression.

“At first they laughed and said, ‘Ah that’s not big enough data for Google.’” 

But a year later, once ALS TDI had 600 people in the program uploading monthly recordings, Google said yes.

Google used a Fourier transform to convert the WAV file recordings into colorimetric patterns, or image files, and then applied its machine learning algorithm to them.
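
The article does not describe Google's actual pipeline, so the following is only a minimal sketch of the general idea: a short-time Fourier transform turns audio samples into a grid of frequency magnitudes over time (a spectrogram), which can then be scaled to pixel intensities. The function names, the frame sizes, and the synthetic 1,000 Hz tone standing in for a real WAV recording are all assumptions made for illustration.

```python
import cmath
import math


def synthetic_tone(freq_hz, sample_rate, n_samples):
    """Hypothetical stand-in for samples that would be read from a WAV file."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def spectrogram(samples, frame_size=64, hop=32):
    """Short-time Fourier transform magnitudes.

    Rows are time frames; columns are frequency bins from 0 Hz up to
    half the sample rate.
    """
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        row = []
        for k in range(frame_size // 2 + 1):
            # Direct (slow) discrete Fourier transform of one frame at bin k.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            row.append(abs(coeff))
        rows.append(row)
    return rows


def to_grayscale(rows):
    """Scale magnitudes to 0-255 so the grid can be stored as an image."""
    peak = max(max(row) for row in rows) or 1.0
    return [[round(255 * value / peak) for value in row] for row in rows]


sample_rate = 8000
samples = synthetic_tone(1000, sample_rate, 512)
image = to_grayscale(spectrogram(samples))

# Each frequency bin spans sample_rate / frame_size = 125 Hz.
brightest_bin = image[0].index(max(image[0]))
print(len(image), "frames x", len(image[0]), "frequency bins")
print("brightest bin in first frame:", brightest_bin)
```

A production system would more likely use an optimized FFT library and a perceptual frequency scale; the direct transform here is kept only so the sketch has no dependencies.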

Through that, “they were able to more sensitively predict disease progression than anything else we’re using in ALS,” Perrin said.

Google’s A.I. model trains itself independently, which is why it requires so much data. 

The more data it has, the more it can pick out patterns from the WAV files after they’ve become image files.

That’s when, Perrin said, a light bulb went off at Google.

“They said to us, you know, we never thought about it before, but people lose their voice and we have all of their voice recordings before they lost them,” Perrin said. “Maybe we could reconstruct somebody’s voice.” 

And so Project Euphonia began. At first, only having access to data from patients with ALS, Perrin and Google saw it as a way to adapt voice recognition technology to better help anyone with voice impairment issues. But Perrin said the project has developed a broader goal: to restore patients’ original voices.

“Sure, there’s devices out there now that help with communication, but out comes this computerized voice that’s not your own,” he said. “It’s kind of sterile, it’s not the most inviting thing.” 

Perrin said it’s been profound to watch people use Project Euphonia and hear their own voice come out of a computer. 

One patient’s voice, once it was fully reconstructed, sounded so close to his original that his wife called Perrin in tears. 

She hadn’t heard her husband’s voice since 2010.

‘The time would pass either way’

Perrin said sharing your voice as part of the program is free to any patient who wants to contribute, and most do, despite a diagnosis telling them they only have a maximum of five years to live.

Like Andrea Lytle Peet, who was diagnosed with ALS in May 2014 at 33 years old and founded the Team Drea Foundation while also participating in ALS TDI’s Precision Medicine Program.

Peet continues to attend races, despite her diagnosis. She will be in Boston this June to participate in the Tri-State Trek’s ride to end ALS, where she will be biking 270 miles to Greenwich, Connecticut. Photo courtesy of Chris Szagola.

“I realized when I was diagnosed that I could choose whether to be depressed or to live life the best way I knew how. The time would pass either way,” Peet said in an email. “I have chosen to dedicate my remaining time to finding a cure for ALS and helping to advance the science so that one day, no other families will have to go through this cruel disease.”

Only a year before her diagnosis, she had been doing nine workouts a week to take part in a half Ironman triathlon in September 2013. 

Peet said she went to five neurologists before getting her diagnosis, and she’s been fighting ALS for five and a half years since, outliving the normal life expectancy of two to five years.

“I went from the strongest I’d ever been to walking with a cane in eight months,” she said.

And everything about her life and future changed. 

“My husband and I no longer plan to have children,” Peet said. “We don’t get to imagine growing old together. I cashed out my 401k because I won’t live long enough to retire.”

Peet and her husband before she was diagnosed with ALS. Photo courtesy of Andrea Peet.

But she’s grateful for what she can still do, like speaking despite slurred words, walking with a walker, eating, driving, and using the bathroom on her own. 

“These are all things that most people take for granted, but people with ALS lose over time,” Peet said. “I will never take for granted the neurological glue that is still holding me together.”

She said after the diagnosis, she was nervous about a lot of things.  

“I worried after I was diagnosed that I would no longer have a purely happy thought,” Peet said. “But the happy memories are sweeter, and we don’t often argue about little things that don’t matter. We take adventures now. We don’t wait for someday anymore.”

And Project Euphonia has eased some of her worries, too, giving her independence by allowing her to turn on lights, turn on the TV, or lock the door using just her voice.

Peet said she’s been using the technology every day for the past six months. 

“This technology also allows me to continue giving presentations for my foundation to keep raising money for ALS research,” she said. “It ‘live captions’ what I say and I don’t have to worry about being understood.”

The project has also offered her peace of mind. 

“If my hands stop working, I know that I can use my voice to turn on the TV, turn on a podcast, or set an alarm,” she said. “Any small measure of independence means so much when you’ve lost everything else.”

The most traumatic part

Perrin said losing the ability to speak might be the most traumatic part of the ALS disease course. 

“Communication is key to our existence, our well being, our mental health,” he said. 

And Project Euphonia is giving patients the ability to not worry as much about losing access to that vital part of being human. 

While it took almost four years for ALS TDI and Google to get to where they are today with the project, there’s still more to be adjusted. 

Perrin said the systems aren’t automated yet. 

Google takes the audio recordings from ALS TDI and its A.I. learns how to translate them, but to fully recreate and fine-tune someone’s voice, Google needs more than twenty minutes of perfectly clear, pre-recorded audio of a patient’s voice from before ALS affected it.

The problem is, aside from wedding speeches or recorded business conference calls, most people don’t regularly record their voice. 

“I think the vision is to try to get it down to a minimal amount of high-quality audio,” Perrin said. “Less than a minute would be optimal, because probably everybody could find that.”

He said they’re moving forward nonetheless, continuing to collect words and working toward a point where the translation is automatic.

But the progress is ongoing, and “it’s not going to happen overnight,” Perrin said. 
