A resident installs an air-quality sensor outside their home, hoping to contribute to a growing network of citizen-collected environmental data. But before making the information public, they drag the sensor’s location pin on the digital map slightly down the street.
That small act of digital self-protection is at the center of new research from Yue Lin, a professor in the Department of Geography and Geographic Information Science at the University of Illinois Urbana-Champaign.
Lin’s recent study examined “location masking” — the practice of intentionally obscuring a sensor’s true location before sharing data publicly — using a national dataset of PurpleAir sensors, a popular crowdsourced air-quality monitoring platform. The findings, recently published in The Professional Geographer, reveal that the data increasingly used by researchers and government agencies is shaped not only by technology but also by the people who contribute to it and their concerns about privacy, trust, and surveillance.
For Lin, the findings connect to larger questions about how human behavior and social context shape the data that modern technologies depend on.
“Sometimes when people think about biases, the problem becomes purely technical,” Lin said. “But a large part of my research is to answer how these biases are not purely technical.”
Those questions have become increasingly important as artificial intelligence and machine learning systems rely on enormous quantities of data gathered from online platforms and public contributions.
“In an ideal world, everything can be collected as data,” she added. “But in reality, our world is political; it’s not neutral.”