It's another morning at home, and the star of the multimillion-dollar Human Speechome Project is sitting on his living room floor. "What's that over there?" his father asks. "Over there. That ball?"
The boy looks across the room. "Green ball!" he says.
"O-oh," says his father, a single syllable full of praise.
It's the kind of mundane conversation that burbles incessantly through every household populated by a toddler. But inside this yellow house on a quiet suburban street, the dialogue may help unlock the secret of how children learn to speak.
Nearly every word uttered by the boy, now almost 3, is recorded by microphones and video cameras lodged in the ceiling of each room - the most comprehensive study ever made of a child learning to talk.
Scientists will pore over the way he first said "ga-ga," then "wa-wa" for water. They will analyze the direction he gazed as he spoke important words, and the way he interacted with adults. They will hear what he said as he took his first wobbly steps.
The Human Speechome Project is the creation of Deb Roy, an associate professor and the director of the Cognitive Machines group at MIT's Media Lab in Cambridge. But Roy has another, more primal role: He is the boy's father.
Roy builds talking robots and has long been stymied by how little scientists know about the way humans learn to speak. He and his wife, Rupal Patel, associate professor and director of the Communication Analysis and Design Laboratory at Northeastern University, often collaborate on research. So about four years ago, they saw a way to collect an unprecedented amount of data about how children learn to speak that could both shed light on this little-understood milestone and help Roy and his colleagues create robots capable of learning speech.
"When we knew I was pregnant, it was a conversation that first night: How are we going to capture the language development of this child?" she said. "It was part and parcel of having a kid."
Roy installed 11 cameras and 14 microphones in their Arlington home's ceilings, leaving no room in the house, including the bathroom, out of range. He added about 3,000 feet of concealed wiring and a computer server room in their basement. When their son was born in 2005, they began recording.
Now, nearly three years later, he has collected 230,000 hours of raw data. Linda B. Smith, director of the Cognitive Development Lab at Indiana University, said the Speechome project goes far beyond the scale of any other project studying language development in children.
"People have taken bits and pieces, transcribing 10 minutes here and an hour there, but we really do not know . . . the mass of language that children hear and its relation to the child's own activity," she said in an e-mail message. "This is what Roy is trying to do."
Given the intimate - and unparalleled - nature of his research, Roy has taken great pains to protect his family's privacy. He never speaks his son's name when talking about his research.
Everyone who works on the project must sign a confidentiality agreement. And he cautiously limits access to the data - hundreds of thousands of hours of his son's life.
Roy and Patel also made it easy to erase moments too personal for the Speechome project. Beside a light switch in each room sits a control switch that not only turns the cameras and microphones off but can erase a period of time already recorded. He and Patel call it the "oops" button.
"It's like the anti-Tivo button," he said.
Roy and his wife have often deleted dinnertime conversations when they veer into politically sensitive topics or discussions of work colleagues. Sometimes, the video needs to be erased, such as when Patel was nursing. "I've probably walked out of the shower a couple of times," Roy said. But they both say they rarely erase.
People who come to the house must also be briefed on the unusual infrastructure. One nanny wanted to know why there was a camera in the bathroom. (The answer: to capture the boy's speech during bath time - but it's rarely turned on.)
If Roy and Patel have visitors for a single night, they turn off the cameras and microphones. But if friends or relatives stay longer, the couple ask them to sign consent forms, allowing them to be recorded.
Few people will be permitted to view the entire body of data. Students who transcribe audio cannot see the video. Those who work with the video cannot hear the recorded sound.
Roy says people often ask him how the recordings have changed the way his family, which now includes a year-old daughter, interacts. He says the privacy restrictions, including his decision not to make available all the raw data, have limited the effects on his home life.
His son likes watching video of himself, but is otherwise bored by the technology, Roy said. Since the project only records in their house, there are lapses when the family goes on vacation or takes a weekend trip. Roy estimates he has captured 70 percent of his son's waking hours. One of the limitations of earlier projects trying to capture the way children learn to speak was the difficulty of observing children in their natural environment - as opposed to a lab.
"It's a very simple question that really took me down this path, which is, 'How do children learn the meaning of words?' " Roy said. "One of the things you remember about your kid is what their first word was and [you] kind of wonder why that word. . . . We don't actually have any well-developed theories of how does a kid actually perform this magic of trying to use words meaningfully."
Roy named his research the Human Speechome Project, knowingly alluding to the Human Genome Project. Whereas that research mapped DNA, Roy wanted to examine the environmental factors that influence the way children learn to speak.
As the data collection phase has begun to wind down - Roy's son has mastered the basics of language as he approaches 3 - Roy and his team have tackled one of the biggest challenges of the project: how to analyze hundreds of thousands of hours of raw video and audio.
"When you have that much data, no person's going to watch it," Roy said. "There's too much of it."
So he and his team devised a computer program to speed up transcription of the audio tapes. They created "space-time worms," computer models that track human movement over time. And they figured out a way for computers to analyze that movement, the direction people gaze and the way they interact with one another.
"If you have a theory that's precise about how a child takes what they hear and what they see . . . and converts that into a mental lexicon, then you should be able to build a machine that actually performs according to that theory," he said. "So we're going to build a series of such language-learning machines."
Roy sees one important application of his research as early detection and treatment of developmental disorders, such as autism.
He is creating a portable system that could observe children at risk for such disorders in their home, and monitor their progress once treatment has begun.
Outside his lab, Roy has found another, more personal, benefit of the Speechome project: a nearly complete record of his son's first three years.
When the boy was 15 months old, the camera in the hallway ceiling documented his first tentative steps. As the boy starts to totter toward him, Roy asks, "Can you do it?"
The child takes off. Astounded at this new world, the toddler whispers, "Wow."
The father watches. Then he, too, says softly, "Wow."
Kathleen Burge can be reached at email@example.com.