Tor just announced that they have detected and blocked an attack that may have allowed hidden services and possibly users to be de-anonymized.
It looks like this may be connected to the recently canceled BlackHat talk on Tor vulnerabilities. One hopes so, otherwise the attack may have been more hostile than simple research.
Tor is releasing updated server and client code to patch the vulnerability used in this attack. This highlights once again one of Tor’s key architectural weaknesses: its distributed volunteer infrastructure. On the one hand, it means that you are not putting all of your trust in one entity. On the other hand, you really don’t know whom you are trusting, and anyone could be running the nodes you are using. Many groups hostile to your interests would have good reason to run Tor nodes and to try to break your anonymity.
The announcement from Tor is linked below.
Thanks to WhoIsHostingThis for providing this informative infographic. They provide a cool service that allows you to look up the hosting service behind any website.
- A decision giving Canadians more rights to Anonymity
- Iraq’s recent blocking of social media and more
- Iran’s outright criminalization of social media
- A court decision requiring warrants to access cell tower location data
- Another court stating that irrelevant seized data needs to be deleted after searches
- A massive failure of data anonymization in New York City
- A court requiring a defendant to decrypt his files so they can be searched
- The Supreme Court ruling protecting cellphones from warrantless search
- Phone tracking streetlights in Chicago
- And a small change for iPhones bringing big privacy benefits
The Importance of Privacy & The Power of Anonymizers: A Talk With Lance Cottrell From Ntrepid — The Social Network Station
A recent interview I did, talking about data anonymization and mobile device privacy. Lance Cottrell is the Founder and Chief Scientist of Anonymizer.
One often hears that some massive collection of data will not have privacy implications because it has been “anonymized”. Any time you hear that, treat the statement with great skepticism. It turns out that effectively anonymizing data, making it impossible to identify the individuals in the data set, is much harder than you might think. The reason comes down to combinatorics and structured information.
This article on Medium by Vijay Pandurangan discusses a massive data set of NYC taxis, complete with medallion number, license number, time and location of every pickup and drop-off, and more. The key to unraveling it is that there are just not that many taxi medallions, and the numbering structure allows only a manageable number of possible combinations (under 24 million). While that would be a lot to work through by hand, Vijay was able to hash and identify every single one in the database in under 2 minutes.
Another approach would have been to take a set of known trips, note the location, time, etc., then use that to map the hash to the true identity. More work, but very straightforward.
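The attack works because the space of valid medallion numbers is small enough to enumerate exhaustively. Here is a minimal sketch, assuming unsalted MD5 hashing as described in the article, and using one illustrative medallion pattern (digit, letter, digit, digit); the real dataset spans a few such patterns totaling under 24 million values:

```python
import hashlib
import string
from itertools import product

def md5_hex(s: str) -> str:
    return hashlib.md5(s.encode()).hexdigest()

# Precompute the hash of every value matching one illustrative pattern.
# Enumerating all real medallion patterns takes only a couple of minutes.
table = {}
for d1, letter, d2, d3 in product(string.digits, string.ascii_uppercase,
                                  string.digits, string.digits):
    medallion = d1 + letter + d2 + d3
    table[md5_hex(medallion)] = medallion

# "De-anonymizing" a hashed medallion is then a single dictionary lookup.
anonymized = md5_hex("5X55")  # a hash as it might appear in the released data
print(table[anonymized])      # -> 5X55
```

Because the hashes were not salted and the input space was tiny, hashing here provided essentially no anonymity at all.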
Even trickier is the problem of combinatorics when applied to “non-identifying” data. One will often see birth date (or partial birth date), zip code, gender, age, and the like treated as non-identifying. Yet just five-digit zip code, date of birth, and gender will uniquely identify people 63% of the time.
A study of cell phone location data showed that just four location references were enough to uniquely identify individuals.
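The quasi-identifier problem is easy to demonstrate: group records by their supposedly non-identifying attributes and count how many combinations occur only once. A toy sketch with entirely hypothetical records:

```python
from collections import Counter

# Hypothetical records reduced to their "non-identifying" attributes:
# (zip code, date of birth, gender)
records = [
    ("92101", "1970-03-15", "M"),
    ("92101", "1970-03-15", "M"),  # shares all three attributes with the first
    ("92101", "1985-07-02", "F"),
    ("10001", "1990-11-30", "F"),
    ("10001", "1962-01-09", "M"),
]

# Any combination that appears exactly once pins down a single individual.
counts = Counter(records)
unique = [r for r in records if counts[r] == 1]
print(f"{len(unique)} of {len(records)} records are unique on zip + DOB + gender")
```

In real data sets the fraction of unique combinations grows quickly with each attribute added, which is why "anonymized" releases so often fail.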
This is a great resource on all kinds of de-anonymization.
The reality is that, once enough data is collected, it is almost certainly identifiable. Aggregation provides the best anonymization, where individual records represent large groups of people rather than individuals.
Update: small edit for clarification of my statement about aggregation.
Canada’s Supreme Court just released a ruling providing some protection for on-line anonymity. Specifically, the ruling requires law enforcement to obtain a warrant before going to an Internet provider to obtain the identity of a user. Previously they were free to simply approach the provider and ask for (though not compel) the information.
The judges found that there is a significant expectation of privacy with respect to the identifying information, and that anonymity is a foundation of that right.
Unfortunately the case in question revolves around child pornography, which creates a great deal of passion. Much of the reaction against the decision has come from those working to protect abused children. Because the ruling’s implications extend far beyond child pornography cases, I applaud the court for taking the larger and longer view of the principle at work.
It is important to remember that the court is not saying that the information cannot be obtained. This is not an absolute protection of anonymity. The decision simply requires a warrant for the information, ensuring that there is at least probable cause before the veil of anonymity is pierced.
Paying for anonymity is a tricky thing, mostly because on-line payments are strikingly non-anonymous. The default payment mechanism on the Internet is the credit card, which generally requires hard identification. There are anonymous pre-paid cards, but they are getting harder to find, and most pre-paid cards now require registration with a real name and (in the US) Social Security number.
We are working on supporting Bitcoin which provides some anonymity, but not as much as you might think. New tools for Bitcoin anonymity are being developed, so this situation may improve, and other crypto currencies are gaining traction as well.
When it comes to anonymity, cash is still king. Random small US bills are truly anonymous and widely available (a 1996 study showed over half of all physical US currency circulates outside the country). While non-anonymous payments only allow Anonymizer to know who its customers are, not what they are doing, even that information might be sensitive and important to protect for some people.
That is why Anonymizer accepts cash payments for its services. Obviously it is slower and more cumbersome, but for those who need it, we feel it is important to provide the ultimate anonymous payment option. If you are looking at a privacy provider, even if you don’t plan to pay with cash, take a look at whether it is an option. It could tell you something about how seriously they take protecting your privacy overall.
Here is more evidence that, if a service has access to your information, it can get out. In this case the privacy services Whisper and Secret have privacy policies saying they will release messages tied to your identity if presented with a court order, but also to enforce their terms of service, and even in response to a simple claim of “wrongdoing” (whatever that might mean).
Anonymizer has no logs connecting user activity to user identity, thus we don’t have these problems.
Janet Vertesi, a sociology professor at Princeton, recently tried an on-line experiment. She had just discovered that she was pregnant, and wanted to see if it would be possible to hide that fact from “big data”. Could she prevent advertisers and social media companies from discovering this one fact, and using it to profile and target her?
Janet only tried to hide this one fact. She used pre-paid payment methods and the Tor anonymity tools, and took great pains to prevent her “friends” from mentioning the pregnancy on any social media platform. She had already opted out of using Gmail, which would otherwise have been scanning her emails as well.
While she was reasonably effective, the effort and cost involved were significant, and there were some slips from within her social network. This is a great demonstration of the idea that you really need to be specific about what it is you want to hide. The personal and social costs of trying to stay “off the grid” entirely are unacceptable for most people. The more you can identify and isolate just the individual facts or activities you want to protect, the easier the task is and the more likely you are to succeed.
On Monday, Dec 16, during final exams, someone sent an email to Harvard University administrators saying that there were bombs in two of four named buildings on campus. The threat was a hoax to get out of final exams. The sender used Tor and Guerrilla Mail, a disposable email address service, to hide his identity.
Despite that, police quickly identified Eldo Kim, who confessed and was arrested. So, why did the privacy tools fail?
According to the FBI affidavit, the lead came from Harvard University, which was able to determine that Mr. Kim had accessed Tor from the university wireless network shortly before and while the emails were being sent.
This is really a case of classic police work. A bomb threat during finals is very likely to be from a student trying to avoid the tests. A student trying to avoid a test is unlikely to have the discipline to find and use a remote network. Therefore, the one or handful of students using Tor at the time of the email are the most likely suspects… and it turns out that they were right.
This case provides some important lessons to the rest of us who are trying to protect our identities for less illegal reasons.
First, clearly the Harvard Wireless network is being actively monitored and logged. It is reasonable to assume that your ISP or government might be monitoring your activities. One way to reduce correlations of your activity is to use privacy tools all the time, not just when you need them. This provides plausible deniability.
After all, if you never use such services, except for ten minutes exactly when some message was sent, and you are a likely suspect, then the circumstantial evidence is very strong. If you are using them 24/7, then the overlap says nothing.
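The correlation is simple to sketch. Given session logs and a send time, investigators just intersect the two; the names, times, and log format below are hypothetical, purely to illustrate the reasoning:

```python
from datetime import datetime

# Hypothetical campus logs of Tor sessions: (user, start, end).
sessions = [
    ("alice", datetime(2013, 12, 16, 0, 0),
              datetime(2013, 12, 16, 23, 59)),  # uses Tor around the clock
    ("bob",   datetime(2013, 12, 16, 8, 25),
              datetime(2013, 12, 16, 8, 40)),   # one short session
]
sent_at = datetime(2013, 12, 16, 8, 30)  # when the hoax email went out

# Everyone online at the send time overlaps, but a brief session that
# tightly brackets the message is far stronger circumstantial evidence
# than an always-on connection, which says nothing about timing.
suspects = {user: (end - start).total_seconds() / 60
            for user, start, end in sessions
            if start <= sent_at <= end}
for user, minutes in suspects.items():
    print(f"{user}: on Tor at send time, session length {minutes:.0f} minutes")
```

Both users overlap the send time, but only the short session is incriminating, which is exactly why always-on use provides cover.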
Second, if Mr. Kim used anonymous email, how did they know he used Tor to access the email service? Because GuerrillaMail embeds the sending IP address in every outgoing email. The service only hides your email address, not your IP. In this case, the embedded address must have been that of the Tor exit node. Even if the IP had not been embedded, GuerrillaMail keeps logs, which would have been available to the FBI with a warrant.
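An embedded IP is trivial for an investigator to pull out. A sketch using Python’s standard email module; the header name `X-Originating-IP`, the addresses, and the message itself are assumptions for illustration, since services differ in where (header vs. body) they embed the sender’s IP:

```python
from email import message_from_string

# A toy raw message with a hypothetical originating-IP header.
raw = """\
From: anon@example.invalid
To: admin@example.invalid
Subject: hello
X-Originating-IP: 203.0.113.7

message body
"""

# Parse the message and read the embedded sending address directly.
msg = message_from_string(raw)
print(msg["X-Originating-IP"])  # -> 203.0.113.7
```

If the sender came through Tor, that address is an exit node; if not, it points straight back at them.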
The lesson here is to look closely at your privacy tools, and to understand what they do protect and what they don’t.
The most important takeaway is that there is no privacy tool which will let you turn it on and turn off your brain. You always need to be thinking about what you are hiding, from whom, and how much effort they are likely to expend in finding you.
If you are hiding your IP address to get a better price on airline tickets, the threat is very low across the board. If you make terrorist threats, it is very hard to stay hidden afterwards.