Fragile Anonymity

 Bruce Schneier, in Crypto-Gram: January 15, 2008, writes an excellent article on the ease of re-identifying "anonymized" data. The Census, research results, survey results, and many other databases are released with identifying information removed with the intent to protect the identity of the subjects in the database. It turns out that it is disturbingly easy to attach the real identities again.
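To make the risk concrete, here is a minimal sketch of one common way re-identification can work: joining the "anonymized" release against a public list on the fields that were left in. It is not taken from Schneier's article. All of the records, the field names, and the choice of zip code, birth date, and sex as the linking fields are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical "anonymized" release: names removed, other fields kept.
released = [
    {"zip": "02138", "birth": "1945-07-31", "sex": "M", "diagnosis": "hypertension"},
    {"zip": "02139", "birth": "1972-03-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth": "1980-11-02", "sex": "F", "diagnosis": "diabetes"},
]

# Hypothetical public list that still carries names (for example, a voter roll).
public = [
    {"name": "Alice Example", "zip": "02139", "birth": "1972-03-14", "sex": "F"},
    {"name": "Bob Example", "zip": "02138", "birth": "1945-07-31", "sex": "M"},
    {"name": "Carol Example", "zip": "02140", "birth": "1990-01-01", "sex": "F"},
]

# Assumed linking fields shared by both data sets.
QUASI_IDENTIFIERS = ("zip", "birth", "sex")


def key(record):
    """Build the join key from the shared fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(released_rows, public_rows):
    """Attach a name to every released row that matches exactly one person."""
    names_by_key = defaultdict(list)
    for person in public_rows:
        names_by_key[key(person)].append(person["name"])

    matches = {}
    for row in released_rows:
        names = names_by_key.get(key(row), [])
        if len(names) == 1:  # a unique match is a re-identification
            matches[names[0]] = row["diagnosis"]
    return matches


matches = reidentify(released, public)
for name, diagnosis in matches.items():
    print(name, "->", diagnosis)
```

In this toy data two of the three "anonymous" rows match exactly one named person. The third row stays anonymous only because nobody in the public list shares its combination of fields.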

A question of identity

In the article What's In A Name at Design Observer, Steven Heller argues against the use of pseudonyms and anonymity in blogs. He states, but never really argues, that pseudonyms are:

  1. Cowardly
  2. Deceitful
  3. Unacceptable

Despite the fact that I blog under my real name, few will find it surprising that I disagree with his claims. In this age where every word we post will last well beyond our years on earth, one should take great care about posting anything under a real name. I hold very different opinions now than I did when I was young, and I would not want to have those old thoughts thrown back in my face. Many bloggers hold opinions that run counter to those of their employers. Making strong arguments that might be detrimental to one's employer could well be a "career limiting move". The fear of such retaliation is often much worse than the reality, but the chilling effect on speech can be significant. Far from being cowardly, pseudonymous blogging is, I argue, simply prudent in many cases.

That pseudonyms are deceitful would seem to apply to only a very small subset of bloggers: those who use a pseudonym that appears to be a real name but is not, and which masks a true identity that, if known, would significantly color a reader's interpretation of the blog. In other words, it applies where the pseudonym is chosen with an intent to deceive. The vast majority of pseudonyms I have seen are obviously pseudonyms. There is no doubt that the author is using one. The desire to speak from behind a mask is completely overt.

In addition to security and privacy concerns, one may well choose a pseudonym to allow the writing and arguments to stand on their own, completely apart from the identity of the writer. For example, in a forum on Israeli/Palestinian issues, the ethnicity of a poster's name is likely to completely overshadow the content of the message. A pseudonym allows the reputation of the blogger to be developed on its own. If the arguments and information are sound, the reputation will grow.

Because names are not unique identifiers, the use of a real name (or apparently real name) in a blog may give an unrealistic sense of attribution.

I completely support the right of people to create spaces where people must be identified. It is their right to do so, and is completely appropriate and reasonable. It is unreasonable and inappropriate to suggest that this should be imposed on the entire Internet and all communications therein.

US drafting plan to allow government access to any email or Web search

The Raw Story | US drafting plan to allow government access to any email or Web search

National Intelligence Director Mike McConnell is developing new policies for Internet intelligence gathering. It looks like the changes may be very broad and deep. I worry that this kind of change often has significant impacts on civil liberties while providing minimal improvements to our security.

Bad guys have any number of ways of protecting their communications and activities. It is the innocent Internet user that will be caught in this bigger and tighter net.

Consumer Advocates Seek a ‘Do-Not-Track’ List - New York Times

Consumer Advocates Seek a ‘Do-Not-Track’ List - New York Times

This idea of a "do not track" list is very interesting but also very problematic. Right off the bat is the problem of how a website would know NOT to track you. If the default is that you are tracked, you would need to pass some kind of token to every website that you do not want tracking you. This would probably be a cookie, which would be vulnerable to deletion every time a user clears her cookies. It also puts the responsibility on the user to keep track of all the websites which might track her information and to maintain that preference across all of them.

This is very different from the phone number based "do not call" list, where the marketer can check against a list of numbers they should not call. In this case, the user hits the website out of the blue, and the website needs to work out whether to track or not. One solution would be some kind of universal identifier that all websites could check against the list, but this would simply replace one kind of tracking with a much worse kind.

This could all be avoided if the default were set to "do not track" and users could opt in. Of course, almost no one would bother to opt in to the targeted tracking. This is a problem because it is exactly this kind of targeted advertising that makes so many free Internet services possible right now. Without ad targeting, the advertising revenue would likely be too low to make the services viable.

As usual, I am in favor of user-controlled privacy technology for opting out, which does not require the consent or support of the tracking websites. If you don't want to be tracked, tools exist (like Anonymizer) to prevent that tracking. Just use them.
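The fragility of a cookie-based opt-out can be shown in a few lines. This is only a sketch of how a site might check for such a token, not a description of any real advertiser's system; the cookie name `do_not_track` and its value are hypothetical.

```python
from http.cookies import SimpleCookie

# Hypothetical opt-out cookie name; no real ad network is implied.
OPT_OUT_COOKIE = "do_not_track"


def should_track(cookie_header):
    """Return True unless the request carries the opt-out cookie."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    morsel = cookies.get(OPT_OUT_COOKIE)
    opted_out = morsel is not None and morsel.value == "1"
    return not opted_out


# The user has opted out, so the site leaves her alone.
before_clearing = should_track("session=abc123; do_not_track=1")

# The user clears her cookies; the preference silently disappears with them.
after_clearing = should_track("")

print(before_clearing, after_clearing)
```

Because tracking is the default, losing the cookie puts the user straight back into the tracked population without any notice.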

Online privacy? For young people, that's old-school - USATODAY.com

Online privacy? For young people, that's old-school - USATODAY.com

Being over 35, I fall into the "old-school" category described in this article. While I have a presence on a number of social networking sites, I have been very stingy with the information I have posted there. I think the root cause of the high-risk behavior on these sites is in the way they are used. People treat them as an extension of in-person, phone, and text message communications. It is just one more mode of communication. Unfortunately, this mode of communication has some significant differences. The most important is that it is generally very public, searchable, and archived. It is almost impossible to take something back once it makes its way out onto the net.

As a high school or college student, it may be cool to show the dark side of your personality and not to care what people think. 5-10 years later when you are looking for a job with a high level of trust, requiring a clean reputation, the historical artifacts floating out on the web may turn out to be a real disadvantage.

It may turn out one day that our culture comes to understand this trend and ignores youthful indiscretions memorialized on the Internet, but I would not want to bet my future on that level of forgiveness.

Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise

Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise

In a follow-up to this post I wrote a few weeks ago, we now understand how the 1000 government email accounts were compromised. It turns out that the attacker, Dan Egerstad, did it using TOR.

I have said for a long time that I am amazed that anyone operates TOR servers other than government people and criminal/terrorist people. As the operator of a TOR server, you have access to the clear text of the data flowing through your server when you are the exit node (about 1/3 of the traffic typically). While the TOR documentation is clear about this vulnerability, it really understates it, and does not address what you should do about communicating with public services that do not provide an option for end-to-end encryption of the information.

As a user of TOR, you are trusting the operators of the servers not to monitor your information. Dan Egerstad's attack was simply to violate that trust. He actively monitored all of the traffic through his 5 TOR servers. He ran multiple servers to increase the amount of data he could collect. He identified the government accounts by searching the captured data for simple strings that would indicate the message was an email being sent or received in the clear, then further searching for keywords that would indicate it was government or military related.

Many other TOR servers could currently be searching for financial, medical, trade secret, or other information.

With any privacy service, you need to trust the operators of that service. The theory was that you would not need to trust the operators of the TOR network. The reality is that, in real world use, you do have to trust them, but you typically know very little about them. There is almost no hurdle to establishing a new TOR server. Just about anyone with access to a server can set it up as a TOR server. You must assume that many of those people will not have your best interests at heart.

My personal approach is to work with people with a long track-record of trustworthy behavior. Anonymizer has been providing services for almost 12 years. I personally have been operating privacy services since 1992. In that time I have protected millions of people and billions of web pages and emails. Our track record for integrity is long and unblemished. I think that is the kind of basis one should use for deciding who to trust.

Yahoo seeks to dismiss China case - Yahoo! News

Yahoo seeks to dismiss China case - Yahoo! News

This is a really interesting legal case. Yahoo was sued in the US by people representing some Chinese journalists who were convicted in China of violating Chinese law. Yahoo's involvement was to provide evidence from their logs and stored account data. The argument is that Yahoo should have resisted more and provided less information under US and international laws.

The people working for Yahoo in China are in a tough place because they could easily be arrested and held in contempt for failing to comply. Widespread corruption in China would almost certainly lead to extra-legal consequences for Yahoo if they resisted.

One might well criticize Yahoo for designing their systems in such a way as to be vulnerable to such foreseeable attempts to gather information on journalists and dissidents.

I think it is a mistake to trust such potentially damaging information to any company like Yahoo, Google, AOL, etc. International law will be cold comfort if you are sitting in a jail somewhere. The only real solution is to take control of your own information. Use encryption and anonymity to ensure that your information cannot be handed over.

Hacks hit embassy, government e-mail accounts worldwide

Hacks hit embassy, government e-mail accounts worldwide

Usernames and passwords for more than 100 e-mail accounts at embassies and governments worldwide have been posted online. Using the information, anyone can access the accounts that have been compromised.

I am not sure how much needs to be said about this. In general email security is very lax. People often forget just how much information lives in their email accounts. Especially when using Exchange or IMAP type email, all of your old email archives will be compromised if your account is breached. When you consider all of the file attachments most of us get every day, there is probably little sensitive information any of us handle that is not contained in those email archives.

Germany wants to spy on suspects via Web

Germany wants to spy on suspects via Web

Germany is proposing to use trojan horse software to enable surveillance of target computers. I have to wonder how effective this will actually be. They are talking about distributing it in an apparently official email from a government email address.

  1. Now that the bad guys know this, it seems likely that they will take more care with the attachments from the government.
  2. Anti-virus / anti-malware programs should be able to identify and block this software.
  3. If the anti-virus software makers are convinced to leave a hole for this software, it will be a huge back door for other hackers to use to deploy their trojan horse software.

In general this seems like a high risk operation for the Germans. I suspect that it will be used rarely and very selectively.

Wikipedia Spin Doctors Revealed - Yahoo! News

Wikipedia Spin Doctors Revealed - Yahoo! News

Once again, people use the Internet in inappropriate ways, assuming that they are anonymous. In this case, Virgil Griffith has created WikiScanner. The idea is really simple: look through Wikipedia for the IP addresses of everyone who has submitted edits without logging in. The site also provides tools that make it easy to see what changes have been submitted by people within specific organizations.

It will come as no surprise that this turns up many blatant attempts by organizations to whitewash articles about themselves (or their leaders), or to turn a Wikipedia entry into a veritable marketing vehicle. I am amazed that people who are net-savvy enough to think of altering Wikipedia entries like this would simultaneously be unaware that they could easily be identified while doing so.
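The core lookup is easy to sketch: take the IP address recorded with an edit and check which organization's address block contains it. The code below is only an illustration of that idea, not WikiScanner's actual implementation. The organization names, the edit records, and the address ranges (reserved documentation ranges) are all made up.

```python
import ipaddress

# Hypothetical organizations mapped to address blocks.
# These are reserved documentation ranges, not real assignments.
ORG_RANGES = {
    "Example Corp": ipaddress.ip_network("192.0.2.0/24"),
    "Example Agency": ipaddress.ip_network("198.51.100.0/24"),
}

# Hypothetical edit records: the article edited and the IP recorded with the edit.
edits = [
    {"article": "Example Corp", "ip": "192.0.2.17"},
    {"article": "Some Town", "ip": "203.0.113.5"},
    {"article": "Example Agency director", "ip": "198.51.100.200"},
]


def owner_of(ip_text):
    """Return the organization whose block contains the address, or None."""
    address = ipaddress.ip_address(ip_text)
    for org, network in ORG_RANGES.items():
        if address in network:
            return org
    return None


attributed = [(edit["article"], owner_of(edit["ip"])) for edit in edits]
for article, org in attributed:
    print(article, "<-", org or "unknown network")
```

An edit to an organization's own article coming from that organization's own address block is exactly the pattern that stands out.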

How search engines rate on privacy | CNET News.com

How search engines rate on privacy | CNET News.com

CNET has done a nice little study on the privacy policies and practices of the top 5 search engines. The results show that those policies leave a lot to be desired. In particular, Google and Yahoo never actually delete search data, and only partially "anonymize" it after more than a year. As has been shown many times, the "anonymized" data can still easily be used to determine the actual identity of the searcher.

Sidejacking

Report: "Sidejacking" session information over WiFi easy as pie

While this is not really news, it is a very nice description of a very widespread risk. The issue here is that many websites simply use a serial number in a cookie to keep track of user sessions. The implicit behavior is that if you have the cookie, you are authenticated and logged in. The big problem is that most of these sites also send that cookie over insecure, unencrypted connections. With the popularity of insecure WiFi networks, capturing those cookies has become very easy. Once an attacker has the cookie, he can act as you for all purposes on those websites.

The simplest solutions are: enable SSL on the website (if possible), only use WPA secured WiFi, use a VPN, or use Anonymizer with the encrypted surfing option enabled (which effectively makes all websites SSL protected).
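On the website side, the first of those fixes can be sketched as follows, assuming the site can serve its pages over SSL. The cookie name `session_id` and the helper function are hypothetical. The sketch swaps the guessable serial number for a random token and marks the cookie `Secure`, which tells the browser to send it only over encrypted connections.

```python
import secrets
from http.cookies import SimpleCookie


def make_session_cookie():
    """Build a session cookie that the browser will only send over SSL."""
    cookie = SimpleCookie()
    # Random token instead of a serial number (hypothetical cookie name).
    cookie["session_id"] = secrets.token_urlsafe(32)
    cookie["session_id"]["secure"] = True    # never sent over plain HTTP
    cookie["session_id"]["httponly"] = True  # not readable by page scripts
    cookie["session_id"]["path"] = "/"
    return cookie


header = make_session_cookie().output(header="Set-Cookie:")
print(header)
```

This only helps if the whole logged-in session runs over SSL. A cookie without the `Secure` flag is still sent in the clear on any unencrypted request to the site.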

Testing if OPT-OUT really lets you OPT-OUT

I am posting this to help the World Privacy Forum test if web advertisers actually honor their own opt-out systems. This should provide some very interesting hard data on the actual activities of big on-line web advertisers. They are running a test on the Opt Out page of the Network Advertising Initiative site and are looking for volunteers. The idea is to determine how well the opt out page is working, for which systems and which browsers. 

Here are the directions:

(To run this test, you will need to set your browser to accept cookies)

1. Open site: http://www.networkadvertising.org/managing/opt_out.asp

2. Check all of the opt out boxes you see in the right-hand column of the screen.

3. Click the submit button. (bottom of page)

4. Note how many of the opt outs were successful. (Successful opt outs will have a green check mark next to them; unsuccessful opt outs will have a red X mark next to them.)

5. Please tell us your OS and OS version, and your browser and browser version. 

6. If you can, please send us a screen shot of your result page. 

7. Please email results to nai_test@nyms.net  

8. We are closing the test period on Thursday, July 26, at close of business (Pacific). 

Tor hack proposed to catch criminals

Tor hack proposed to catch criminals

This article is a couple of months old now, but I have been thinking about it a lot. Basically, HD Moore has created a set of tools to scan the contents of traffic leaving a TOR exit node, and to inject active tracking code into the data returned to the user. While this is possible in any anonymity system, the fact that almost anyone can run a TOR node makes the question of trust much more tricky.

I have talked to Roger Dingledine (one of the creators of TOR) about this but we seem to talk past each other. As I understand it, Roger feels that a user needs to take additional action to protect himself from such threats, including blocking all active content. He would further argue that if you are going to an insecure site, then you are putting yourself at risk. TOR is about anonymity, not security.

While all this is true, it runs aground on the reefs of reality. I am reminded of a statement by Yogi Berra: "In theory there is no difference between theory and practice. In practice there is." People want active content. People want to go to insecure websites. People want privacy. People don't want to work for it.

At the end of the day, that is really the difference between the TOR philosophy and the Anonymizer philosophy. We think that users should not need to be security experts. We think they should not have to research the trustworthiness of a number of different individuals or groups. We think that the privacy threats normal people actually face in the real world are a long way from the unlimited money and resource attacks imagined by academic security researchers. Security is a balance. We strive to be secure, fast, and user friendly. I think 11 years without a single breach of a user's identity from using the service is good evidence that we are doing something right.

Google-DoubleClick Merger Concerns

Google's acquisition of DoubleClick raises many major privacy concerns. Throughout the late '90s DoubleClick was the bogeyman of the privacy community. More recently Google has taken on that mantle. The combination creates an information harvesting juggernaut. Google is in a position to see the search terms, and thus the focus of interest, of the vast majority of Internet users. Most users start most searches or web expeditions with a Google search. Google's logs contain a fairly complete history of the interests of its users going back for years.

DoubleClick has a view of user activity after the search across thousands of websites. Banner and other website ads are not actually hosted on the websites on which they appear. DoubleClick serves the content from their servers, and handles any clicks on the ads. Importantly, DoubleClick can gather your information even if you don't click on the link. Simply viewing the ad is enough for them to cookie you, to gather your IP address, and store that along with the URL you are viewing.

Combined, this enables the creation of a database of most searches along with most subsequent web surfing activities. Nearly ubiquitous Internet monitoring by a single entity will be a reality after this merger. Having both the search information and the surfing activity gives the answer to both the what and the why of a user's actions. The merged data is much more powerful than the individual components, because each fills in the gaps in the other's coverage.
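Here is a rough sketch of why the merged data is so much more revealing than either log alone. Everything in it is hypothetical: the log formats, the sample entries, and above all the assumption that the two logs share a common user identifier. How the real systems could link their records is not something I can describe.

```python
from collections import defaultdict

# Hypothetical search log (the "why"): what the user was looking for.
search_log = [
    {"user": "u1", "time": 100, "query": "knee surgery recovery"},
    {"user": "u2", "time": 105, "query": "cheap flights"},
]

# Hypothetical ad-impression log (the "what"): pages where an ad was displayed.
ad_log = [
    {"user": "u1", "time": 130, "page": "http://health.example/knee"},
    {"user": "u1", "time": 190, "page": "http://insurance.example/quotes"},
    {"user": "u2", "time": 140, "page": "http://travel.example/deals"},
]


def build_profiles(searches, impressions):
    """Merge both logs into one time-ordered history per user.

    Assumes both logs carry the same "user" identifier (an illustration only).
    """
    profiles = defaultdict(list)
    for row in searches:
        profiles[row["user"]].append((row["time"], "searched", row["query"]))
    for row in impressions:
        profiles[row["user"]].append((row["time"], "viewed", row["page"]))
    return {user: sorted(events) for user, events in profiles.items()}


profiles = build_profiles(search_log, ad_log)
for user, events in profiles.items():
    print(user, events)
```

Neither log alone says that the person who searched for knee surgery then went shopping for insurance quotes. The merged timeline does.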

Ironically, even Microsoft is talking about the privacy risks of this merger. Redmond | News: Microsoft Warns of Google-DoubleClick Danger

The Electronic Privacy Information Center (EPIC) has gone so far as to file a complaint with the FTC.

Mixed feelings about White House use of outside email accounts.

I have been following a number of stories like this one, Congress Follows Email Trail - WSJ.com, about the White House use of RNC-controlled email accounts to discuss the firings of federal prosecutors. The law appears quite clear. Official White House email is a document that must be retained. Discussions of firing federal prosecutors sound official to me. Therefore the White House was wrong to use outside email addresses to keep the discussions secret.

That said, I am not comfortable with the law in the first place. Email and other electronic communication media like chat and IM are often used more like casual conversation than formal memos. Few would argue that the President's every word should be recorded at all times. It would make discussion and debate next to impossible. In the process of thinking through an issue one may consider many potentially unpopular ideas, if only for the purpose of argument. Free and unconstrained give and take generally leads to the best understanding and decisions. Free and unconstrained debate cannot take place with the world looking over your shoulder and scrutinizing every word.

If we accept that email and chat are used like conversation to hash out ideas, then it is very damaging to the process to place heavy recording and monitoring requirements on it. At the same time, having no oversight substantially reduces accountability. It might even facilitate corruption.

This really shows in a microcosm the greater question of general communications privacy vs. law enforcement access. It is a hard balancing act because there is very little middle ground. Basically you are either monitored or not. Having monitoring of a random half of the messages is going to make everyone unhappy.

Report: IRS bungles may imperil data

As a followup to my discussion of risks of online tax filing, here is an article on security weaknesses at the IRS: Report: IRS bungles may imperil data

It does not appear that this is particularly connected to online filing, but rather an overall laxness in their security.

Google Changes to Privacy Practices

On March 14th Google announced plans to improve their privacy practices by "anonymizing" their logs after 18-24 months. As usual Google is getting slammed for their efforts, despite the fact that no other search engine is making any efforts at privacy at all. I am going to join in, not to pick on Google, but because this affords us a chance to discuss these issues and debate what the policy SHOULD be.
