The Privacy Blog: Privacy, Security, Cryptography, and Anonymity

July 31, 2014

Canvas Fingerprinting: a reality check

The Internet is buzzing with discussions about a new kind of tracking called Canvas Fingerprinting. In fact, the technique goes back to a 2012 paper by Mowery and Shacham. Canvas Fingerprinting gets most of its information from the hardware and software used to render images on a given computer. When asked to render a geometric curve or a modern font to the screen, the system has many decisions to make in turning that into the brightness and color values of the pixels in the image, and different systems make those decisions slightly differently. The technique for creating the Canvas Fingerprint is to give the browser a somewhat complex image to render, capture the actual pixel values produced, and hash them down to make the fingerprint.
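
To make the "render, capture, hash" step concrete, here is a minimal sketch of the final hashing stage only. It assumes the pixel values have already been captured as a raw RGBA byte buffer (in a real browser they would come from the canvas itself, for example via `getImageData` or `toDataURL`). The function name, the two tiny 2x2 "renderings", and the choice of SHA-256 truncated to 16 hex characters are all illustrative assumptions, not the method used in the 2012 paper or by any particular tracker.

```python
import hashlib


def canvas_fingerprint(pixels: bytes) -> str:
    """Hash a raw RGBA pixel buffer down to a short hex fingerprint.

    SHA-256 truncated to 16 hex characters is an arbitrary choice
    made for this sketch.
    """
    return hashlib.sha256(pixels).hexdigest()[:16]


# Two hypothetical 2x2 RGBA renderings of the same drawing commands.
# machine_b differs only in the alpha value of one anti-aliased pixel.
machine_a = bytes([255, 0, 0, 255,  255, 0, 0, 128,  0, 0, 0, 0,  255, 0, 0, 255])
machine_b = bytes([255, 0, 0, 255,  255, 0, 0, 127,  0, 0, 0, 0,  255, 0, 0, 255])

fp_a = canvas_fingerprint(machine_a)
fp_b = canvas_fingerprint(machine_b)

# The same pixels always give the same fingerprint; a one-level
# difference in a single pixel gives a different one.
print(fp_a, fp_b, fp_a == fp_b)
```

The point of hashing is that the server only needs to store and compare a short string, while any rendering difference, however small, still changes the result.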

Canvas Fingerprinting is really just another technique for capturing information about a user’s computer as part of a larger system fingerprint. I have been talking about tools like Panopticlick, which take all kinds of information they can see about your computer’s configuration and try to create a unique identifier from it. Testing my computer right now, Panopticlick reports that my browser fingerprint contains at least 22 bits of entropy and is unique among the roughly 4.3 million users it has tested so far. Panopticlick uses information about the browser, operating system, time zone, fonts, plugins, and such to create the identifier.

By comparison, Canvas Fingerprinting contains on average 5.7 bits of entropy, meaning that about one in 52 people on the Internet would have the exact same fingerprint. That makes it a lousy identifier on its own.
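
The conversion between "bits of entropy" and "one in N people" is just a power of two. The short sketch below reproduces the two figures quoted so far; the function names are made up for illustration, and it makes the simplifying assumption that fingerprints are spread evenly over the population.

```python
import math


def bits_from_population(n: float) -> float:
    """Bits needed to single out one member of a population of n."""
    return math.log2(n)


def one_in(bits: float) -> float:
    """Size of the group that shares a fingerprint worth `bits` bits."""
    return 2.0 ** bits


# Being unique among roughly 4.3 million tested browsers
# corresponds to about 22 bits.
print(round(bits_from_population(4.3e6), 2))  # 22.04

# 5.7 bits only narrows things down to about one in 52.
print(round(one_in(5.7)))  # 52
```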

The real power of this new technique is in combination with other fingerprints like those used in Panopticlick. Combining the two gives about 27.7 bits of entropy, which would narrow me down to one in 218 million people. One of the strengths of Canvas Fingerprinting is that it captures very different kinds of information than many other methods. For example, because a Windows machine comes with a whole bunch of fonts installed, knowing that a computer is running Windows immediately tells you a lot about its fonts; the two pieces of information are highly correlated. The Canvas Fingerprint, by contrast, mostly gives information about the graphics subsystem. Knowing the operating system tells you very little about the specific chipset or firmware in the graphics processor; the two are mostly independent, so their entropies add almost in full.
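
The difference between correlated and independent attributes can be seen in a toy calculation. The sketch below uses a made-up population of four visitors in which the font set is completely determined by the operating system while the GPU varies independently of it; the attribute values and the `entropy` helper are illustrative assumptions, not real measurements.

```python
import math
from collections import Counter


def entropy(samples) -> float:
    """Shannon entropy, in bits, of the empirical distribution of samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical visitors as (operating system, font set, GPU) tuples.
# Fonts follow directly from the OS; each OS appears with each GPU.
visitors = [
    (os_name, fonts, gpu)
    for os_name, fonts in [("windows", "win-fonts"), ("mac", "mac-fonts")]
    for gpu in ["gpu-a", "gpu-b"]
]

h_os = entropy([v[0] for v in visitors])
h_os_fonts = entropy([(v[0], v[1]) for v in visitors])
h_os_gpu = entropy([(v[0], v[2]) for v in visitors])

print(h_os)        # 1.0 bit from the OS alone
print(h_os_fonts)  # 1.0: correlated fonts add nothing new
print(h_os_gpu)    # 2.0: an independent GPU adds its full bit

# The same additivity gives the combined figure in the text:
# 22 + 5.7 = 27.7 bits, or about one in 218 million.
print(round(2 ** (22 + 5.7) / 1e6))  # 218
```

Real attributes are never perfectly independent, so the 27.7-bit figure is best read as an upper estimate for the combination.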

So, in short, Canvas Fingerprinting on its own is not that big a deal, and folks should not get so worked up about it. System fingerprinting in general, however, IS a big deal. It is now good enough to allow individual users to be tracked even if they are deleting all their cookies and hiding their IP addresses with tools like Anonymizer Universal. System fingerprints are not identifying in the same way an IP address is, but they do allow a person to be recognized when they revisit a website, or visit a cooperating website.

Current best practice to minimize System Fingerprint based tracking (including Canvas Fingerprinting) is to run the browser inside a clean and uncustomized virtual machine, which you then revert to the clean state at the end of every use. That will give your browser a maximally generic identifier, while also eliminating other kinds of tracking that rely on state stored on your machine.

6 comments

  • Uri · August 6, 2014 at 5:01 pm

    Very interesting. Thanks for the clear information.

    I have 2 questions:

    (1) The so-called fingerprint that is created for an individual computer, real or virtual, has to be stored somewhere for future comparison, either (a) locally on said computer, or (b) somewhere else. If it is stored locally, it should be possible to locate and identify the fingerprint and either delete it or corrupt it. If it is stored remotely, how is it associated with the local computer? There must be some tracker that establishes a 1-to-1 relationship with the fingerprint. Then this tracker can be identified, located and deleted or corrupted.

    (2) There must be something about the fingerprinting process that is unique. For example, the code collects data and acts as if it is about to instruct the browser to paint something in a window but never actually does it. It should be possible to identify such processes, block their operation or delete its final outcome.

    In general, shouldn’t it be possible to filter out either the fingerprinting process or the resulting fingerprints?

    Thank you very much,

    Uri

    • Author comment by Lance Cottrell · August 6, 2014 at 6:29 pm

      The fingerprint is stored on the server, rather than on your computer. In some ways, the name fingerprint is very apt. If I see a given fingerprint, and then see it again later, I can assume the same person left both prints. If at any time I can attach an identity to that fingerprint, then I know who it is every time I see it.
      Human fingerprints are not actually unique, but any given pattern of swirls is so uncommon that it can be treated as such for most purposes. Likewise, if I gather enough different kinds of information about your computer, the signature may be shared by only a few other computers, or might even be unique to you. That is what the measurement of entropy tells you. The more variation in a given bit of information, the fewer people will share it with you.
      It is similar to the de-anonymization of databases issue I wrote about a while back. There are a huge number of people in your zipcode, and many people in the world with your birthday, gender, occupation, and such. However, it is likely that you are the only person in your zipcode with your birthday, gender and occupation all at once.
      It is not easy to avoid fingerprinting, and doing so is unusual and therefore its own fingerprint. Looking like a very new computer is going to provide the least amount of useful information.

  • lower bound · August 7, 2014 at 7:12 pm

    Let’s not miss the fact that 5.7 bits of entropy is a LOWER BOUND and comes from an experiment done with only *294* participants.

    This value (as a lower bound) is only valid for the specific fingerprinting method used in the 2012 paper, which is pretty naive compared to that of AddThis.

    • Author comment by Lance Cottrell · August 8, 2014 at 8:45 am

      Do you have any references I could look at for the AddThis methodology? I looked around a bit and could not find anything more than vague generalizations.

  • lower bound · August 13, 2014 at 4:03 pm

    The following study finds 6.90 bits of entropy when they repeat the canvas fingerprinting experiment with 527 participants.

    https://github.com/qwelyt/Browser-Fingerprinting

    n=294, H=5.73 bits (1 in ~53 shares the same fingerprint on avg)
    n=527, H=6.90 bits (1 in ~119 shares the same fingerprint on avg)

    It’s a bachelor’s thesis, so give it whatever weight you see fit, but it supports my point precisely: the original experiment was extremely limited for measuring the actual entropy. This was indeed noted by the authors of the 2012 paper in the conclusion (3rd paragraph).

    AddThis’s script itself may be the best place to check their methodology: https://ct1.addthis.com/static/r07/core130.js

    • Author comment by Lance Cottrell · August 18, 2014 at 10:01 am

      Thanks for the info. This paper suggests there might be as much as 10 bits of entropy.
