FREQUENT visitors to the Hustler Club, a gentlemen’s entertainment venue in New York, could not have known that they would become part of a debate about anonymity in the era of “big data”.
But when, for sport, a data scientist called Anthony Tockar mined a database of taxi-ride details to see what fell out of it, it became clear that, even though the data concerned included no direct identification of the customer, there were some intriguingly clustered drop-off points at private addresses for journeys that began at the club. Stir voter-registration records into the mix to identify who lives at those addresses (which Mr Tockar did not do) and you might end up creating some rather unhappy marriages.
The anonymisation of a data record typically means the removal from it of personally identifiable information. Names, obviously. But also phone numbers, addresses and various intimate details like dates of birth. Such a record is then deemed safe for release to researchers, and even to the public, to make of it what they will. Many people volunteer information, for example to medical trials, on the understanding that this will happen.
But the ability to compare databases threatens to make a mockery of such protections. Participants in genomics projects, promised anonymity in exchange for their DNA, have been identified by simple comparison with electoral rolls and other publicly available information. The health records of a governor of Massachusetts were plucked from a database, again supposedly anonymous, of state-employee hospital visits using the same trick. Reporters sifting through a public database of web searches were able to correlate them in order to track down one, rather embarrassed, woman who had been idly searching for single men. And so on.
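The "simple comparison" behind such re-identifications can be sketched in a few lines. What follows is a minimal illustration, not a reconstruction of any of the cases above: all records are invented toy data, and the field names (`zip`, `birth_date`, `sex`) and the function `link_records` are assumptions made for the example. It shows how a record stripped of names can still be tied to one person when the details left in it match exactly one entry in a public list.

```python
from collections import defaultdict

# Invented "anonymised" records: names removed, other details kept.
hospital_visits = [
    {"zip": "10001", "birth_date": "1960-01-15", "sex": "F", "diagnosis": "condition A"},
    {"zip": "10002", "birth_date": "1975-06-30", "sex": "M", "diagnosis": "condition B"},
    {"zip": "10003", "birth_date": "1980-03-02", "sex": "F", "diagnosis": "condition C"},
]

# Invented public list (standing in for an electoral roll) that includes names.
voter_roll = [
    {"name": "Voter A", "zip": "10001", "birth_date": "1960-01-15", "sex": "F"},
    {"name": "Voter B", "zip": "10002", "birth_date": "1975-06-30", "sex": "M"},
    {"name": "Voter C", "zip": "10002", "birth_date": "1975-06-30", "sex": "M"},
    {"name": "Voter D", "zip": "10004", "birth_date": "1990-12-12", "sex": "F"},
]


def link_records(anonymised, public, keys=("zip", "birth_date", "sex")):
    """Return (name, record) pairs where the shared fields match exactly one person."""
    # Index the public list by the combination of shared fields.
    index = defaultdict(list)
    for person in public:
        index[tuple(person[k] for k in keys)].append(person["name"])

    matches = []
    for record in anonymised:
        names = index.get(tuple(record[k] for k in keys), [])
        # Only a unique match identifies someone; ties stay ambiguous.
        if len(names) == 1:
            matches.append((names[0], record))
    return matches


reidentified = link_records(hospital_visits, voter_roll)
for name, record in reidentified:
    print(name, "->", record["diagnosis"])
```

In this toy data only the first record is tied to a single name. The second matches two people on the roll and so stays ambiguous, and the third matches nobody. The more fields the two lists share, the fewer such ties remain.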
Each of these headline-generating stories creates a demand for more controls. But that, in turn, deals a blow to the idea of open data—that the electronic “data exhaust” people exhale more or less every time they do anything in the modern world is actually useful stuff which, were it freely available for analysis, might make that world a better place.
Of cake, and eating it
Modern cars, for example, record in their computers much about how, when and where the vehicle has been used. Comparing the records of many vehicles, says Viktor Mayer-Schönberger of the Oxford Internet Institute, could provide a solid basis for, say, spotting dangerous stretches of road. Similarly, an opening of health records, particularly in a country like Britain, which has a national health service, and cross-fertilising them with other personal data, might help reveal the multifarious causes of diseases like Alzheimer’s.
This is a true dilemma. People want both perfect privacy and all the benefits of openness. But they cannot have both. The stripping of a few details as the only means of assuring anonymity, in a world choked with data exhaust, cannot work. Poorly anonymised data are only part of the problem. What may be worse is that there is no standard for anonymisation. Every American state, for example, has its own prescription for what constitutes an adequate standard.
All these approaches, though, are anathema to the open-data movement, because they limit the scope of studies. “If we’re making it so hard to share that only a few have access,” says Tim Althoff, a data scientist at Stanford University, “that has profound implications for science, for people being able to replicate and advance your work.”