Mark Turner is brimming with confidence, wearing a black hat that wouldn’t have trouble finding its way in academia. He is a kind soul, with a keen interest in the person sitting across from him, not the computer in front of him. Turner is the type to offer to buy coffee as if it’s a right, not simply a courtesy. This attitude, perhaps, has led him to co-found an international big-data project whose purpose is to be a tool for social science research.
This month that project, called the Red Hen Lab, received a Google Summer of Code 2015 grant. The grant provides stipends for students to work on open source projects, encouraging them to build tools while also giving them a great learning opportunity.
That’s exactly what the Red Hen Lab will give them.
Imagine a group of servers that contain a copy of every news and talk show broadcast from the past five years. This is the core of the Red Hen Lab.
Even more importantly, the servers also have closed captioning. The database timestamps the closed captions in relation to the audiovisual input. To Turner, this is key because it makes the broadcasts searchable. He can type in a few keywords and, much like a Google search, the database will list the broadcasts that included the words he was looking for. The channel and show are named in bold at the top of each result, much like the title of a website. The text surrounding his search words appears below.
Unlike a Google search page, however, the results page shows the videos that the captions come from. Turner can click on any word in a caption, and the video will play from that point.
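For readers curious how timestamped captions make video searchable, here is a minimal sketch of the general idea. It is not the Red Hen Lab's actual code: the `Caption` record, the `build_index` and `search` functions, and the sample shows are all hypothetical, and the real system is certainly more elaborate. Each caption carries the time at which it was spoken, so a keyword match can point a video player at the right moment.

```python
from collections import defaultdict
from dataclasses import dataclass

PUNCTUATION = ".,!?;:\"'"


@dataclass(frozen=True)
class Caption:
    """One caption line with the time it appears (hypothetical record)."""
    show: str
    start_seconds: float
    text: str


def normalize(text):
    """Lowercase the text and strip punctuation from each word."""
    words = (word.strip(PUNCTUATION) for word in text.lower().split())
    return [word for word in words if word]


def build_index(captions):
    """Map every word to the captions that contain it."""
    index = defaultdict(list)
    for caption in captions:
        for word in set(normalize(caption.text)):
            index[word].append(caption)
    return index


def search(index, query):
    """Return captions containing all query words, ordered by show and time."""
    keywords = normalize(query)
    if not keywords:
        return []
    matches = set(index.get(keywords[0], []))
    for keyword in keywords[1:]:
        matches &= set(index.get(keyword, []))
    return sorted(matches, key=lambda c: (c.show, c.start_seconds))


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    captions = [
        Caption("Evening News", 12.5, "Lawmakers debated climate change today."),
        Caption("Evening News", 40.0, "In sports, the home team won."),
        Caption("Morning Talk", 95.0, "Is climate policy about to change?"),
    ]
    index = build_index(captions)
    for hit in search(index, "climate change"):
        # A player could seek to hit.start_seconds to replay the moment.
        print(hit.show, hit.start_seconds, hit.text)
```

Because every match keeps its start time, clicking a result only requires telling the video player to jump to that timestamp.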
“You see the human being perform it,” he said. “You see their eyes, how they move their eyebrows, what they do with their gestures, how their voice moves; you see the onscreen text; you see the graphics they use.”
The database allows him to easily observe people in a natural environment and see how the words he chose affect the people on screen. In other words, it makes conducting a study extremely fast.
However, the researchers didn’t stop there. When the servers receive the closed captions, they put them through a rigorous grammar analysis that identifies all the parts of a sentence. This makes linguistic research a lot easier.
Still, Turner’s favorite part of the database was added just last summer: a query tool known as Framenet. Framenet can take an abstract idea and find when people are talking about it, even if the specific words aren’t used.
A big problem that social science researchers face is that subjects are always aware that they are in a laboratory setting. The Red Hen Lab doesn’t face this dilemma. At the same time, it’s extremely efficient. Conducting a study normally takes months of work, but the lab speeds up the process greatly.
“I can type in two seconds, go have dinner, and the next morning it will all be there,” Turner said with a laugh.