When you tweet, you can tag your location, and all 335 million active Twitter users can see it. That includes researchers, who can use it to draw conclusions about topics you’d never tweet about.
According to a study from N.C. State, scientists have figured out how to combine social media location information with satellite imagery and other data sources, with the goal of monitoring international nuclear testing and activity.
The Department of Energy National Nuclear Security Administration funded the research, published last month, to explore using social media as “sensors” for sensitive information.
“They want to come up with ways to very closely monitor unauthorized or illegal nuclear activity by remote sensing,” said study co-author Hamid Krim, professor at N.C. State, in a phone interview. “It’s a very challenging problem in the sense that (they) want to use a diverse set of sensors, which includes things like texts and invoices.”
For example, he added, “How do you (mix) computationally incompatible data like images and text? Or audio, what do you do with it? They want to fuse all this information with a robust and more reliable way to make (conclusions) and decisions.”
Cutting through Twitter noise
Since nuclear activity data is classified, the researchers looked at emergency flood management in the paper, Krim said. Government agencies need to know where flooding is critical, what roads need attention and where citizens are in distress.
The researchers created a formula that takes in satellite data and tweets sent during a flood and spits out a map of estimated water heights. The computer output closely resembled a map of recorded high-water marks from a 2013 flood in Boulder, Colo., which suggests the algorithm worked.
Only 1 to 2 percent of the 6,000 tweets sent per second include location data, per the study. The researchers used only the location-tagged tweets and had to tweak the location data before putting it into their formula.
“When somebody tweets about the flood itself, they aren’t exactly in the middle of a flood. They’re a distance away from the flood,” Krim said. “We had to make up for that.”
“The other thing we discovered was how noisy Twitter is. You may be tweeting about flooding, but it’s flooding of your bathtub. This is where we used some of the text to filter the good tweets from the bad tweets, and then we had to take out those ones that weren’t of any use.”
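The two filtering steps Krim describes, keeping only location-tagged tweets and then using the text to discard off-topic ones, can be illustrated with a minimal Python sketch. This is not the researchers' method: the `Tweet` class, the keyword lists and the `usable_tweets` function are hypothetical names invented for illustration, and the sketch does not attempt the location adjustment the study also required.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Tweet:
    text: str
    # (latitude, longitude), or None when the tweet carries no location tag
    coords: Optional[Tuple[float, float]]


# Hypothetical keyword lists, chosen only for illustration.
FLOOD_TERMS = ("flood", "high water")
OFF_TOPIC_TERMS = ("bathtub", "sink", "toilet")


def is_relevant(tweet: Tweet) -> bool:
    """Return True if the text mentions flooding and no off-topic term."""
    text = tweet.text.lower()
    mentions_flood = any(term in text for term in FLOOD_TERMS)
    off_topic = any(term in text for term in OFF_TOPIC_TERMS)
    return mentions_flood and not off_topic


def usable_tweets(tweets: List[Tweet]) -> List[Tweet]:
    """Keep tweets that are both location-tagged and relevant."""
    return [t for t in tweets if t.coords is not None and is_relevant(t)]


# Made-up sample tweets, not data from the study.
sample = [
    Tweet("Boulder Creek is flooding over the bike path", (40.01, -105.28)),
    Tweet("My bathtub is flooding again", (40.02, -105.27)),
    Tweet("Flooding downtown, stay safe", None),
    Tweet("Great coffee this morning", (40.00, -105.25)),
]

for tweet in usable_tweets(sample):
    print(tweet.coords, tweet.text)
```

A simple keyword match like this would miss sarcasm, misspellings and tweets in other languages, so a real system would likely need a more careful text classifier.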
By using social media and satellite imagery in this way, cities may not need to use staff to observe floodwaters and decide where to send first responders.
How would that work in Raleigh? The city measures water height along five creeks with 14 stream gauges, Raleigh communications analyst Kristin Freeman wrote in an email. You can track these gauges online in real time at the U.S. Geological Survey website. But the city still sends people out to inspect the gauges, flood-prone areas and construction sites, Freeman wrote.
“We have field staff go out to monitor and maintain the gauges as a safeguard to make sure the data is accurate and make sure everything is still functioning,” Freeman said in a phone interview. “We also have field staff who monitor locations prone to flooding and we have staff who monitor (construction) projects underway.”
Power, with limits
The paper, published in the journal IEEE Transactions on Geoscience and Remote Sensing, combines Twitter data with satellite imagery and historical information. University of Maryland professor Jessica Vitak, who researches interaction between humans and computers, warns that we can’t come to conclusions using social media information by itself.
“It’s not going to give you the whole story,” Vitak said. “Probably the people most affected by national disasters aren’t tweeting about it because they’re responding to the issue.”
Every day people send 500 million tweets, according to Internet Live Stats. The sheer volume of data makes it hard to separate good information from bad, said Brigham Young University professor Amanda Hughes in a phone interview.
“A secondary problem that’s existed from the beginning is the problem with misinformation, false rumors and disinformation, when people are trying to inject misleading information,” Hughes said. “(For instance,) people in other countries have injected information that’s not true to get people to believe different things.”
Since so many people use Twitter, it reflects backgrounds overlooked by medical or academic institutions, Vitak said.
“Until the 1990’s, health studies and medical studies focused on men, and we’re still seeing the problems of understanding how various health conditions affect people, especially for female-specific issues,” Vitak said. “User-generated data shared online is much more diverse.”
“Everybody can report on something because they have their phone in their pocket. We have all these perspectives we wouldn’t have gotten before.”
The ethical ‘slippery slope’
Though scientists like Hughes and Vitak have been researching social media for a decade, the research community has no consistent ethical standards for using people’s information, Vitak said.
“A lot of people don’t necessarily realize that people are farming their data,” Hughes said. “There are people posting very personal information, especially in emergency response things where people will say ‘I need help at this particular address’. […] People post personal information that they don’t realize is being used, and don’t realize what it’s being used for.”
While some users don’t know that they’re research guinea pigs, academics might not think through the ethics of their research, Vitak said.
“It’s a slippery slope with algorithms and (making conclusions),” Vitak said. “Researchers often do research for research’s sake and they don’t think about the ways their research can be used for other things.”
“For example, there’s been research into data about a person to figure out a person’s sexual orientation. From a scientific perspective, we added to the list of things we can infer from social media data. But now we can out them just from their likes and preferences and friends.”
When asked about the ethics of using citizens’ social media data to track their countries, Krim wasn’t worried. This project came from an agency dedicated to preventing the spread of nuclear weapons.
“One has to look at the bigger picture and the greater good, and keeping the world safe is the greatest of the greater good,” Krim said. “A baseball bat is to show the skills of the [athletes], but somebody can take that bat and go kill somebody else with that bat. There’s always the possibility that somebody can use it to a nasty purpose.”
“Personally, it never crossed my mind that the government would do something bad. … I look at the positive and sleep well at night when I look at the positive.”