Wednesday, October 5, 2011

Is Facebook Tracking our Web Surfing Activity?

Over the last few days, ever since Facebook rolled out the major update to its user interface, I have been reading a lot of articles on the web on the topic "Does Facebook track our web surfing activity even after we log out?". If true, this would be a major breach of online privacy, as no website has the right to track our surfing activity without our knowledge.

Out of curiosity, I decided to do my own research and dissect the Facebook cookies and see if they are indeed tracking a user's activity on third party sites. I give my findings below in 3 parts.

Part 1: A Step by Step Method on how Facebook "could" Monitor a User's Activity if they want
Part 2: The Facebook Cookies Dissected and their Usage Identified
Part 3: A Conclusion on the Findings


Part 1: 
A Step by Step Method on how Facebook "could" Monitor a User's Activity if they want

For those of you who are curious as to how Facebook could be doing this, I have described one such way below. This is how I would do it, and how I have done it in the past as a project, which I later dumped as I didn't think it was ethical to track a user's surfing patterns, even though my intention was to build a recommendations engine that would improve the user experience.

Step 1) User opens the browser and goes to www.facebook.com

Step 2) User logs in

Step 3) Facebook drops multiple "cookies" on the "facebook.com" domain. Each cookie holds a small bit of data that is used INTERNALLY on the site to identify who you are, what you are doing, etc. Remember, the keyword here is INTERNALLY: as you are signed into Facebook, it's fine for Facebook to do this, as it can improve the user experience, serve you relevant ads, etc.

Also, technically, a website (i.e. facebook.com) can only read and write cookies on its own domain. For example, Facebook can't write some code on its web pages to read cookie information that Twitter stored in your browser when you surfed twitter.com. So think of cookies as secured boxes of data that only the owner can open, inspect and modify.

Box A (Cookie) - belongs to the facebook.com domain
Box B (Cookie) - belongs to the twitter.com domain

Facebook can only read, write and modify Box A
Twitter can only read, write and modify Box B
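
To make the "boxes" idea concrete, here is a minimal sketch in Python of the kind of domain check a browser applies before handing a cookie to a site. The function name `can_access` and the matching rule are my own simplification for illustration; real browsers follow more detailed rules (paths, secure flags, public suffixes and so on).

```python
def can_access(cookie_domain: str, page_host: str) -> bool:
    """Simplified check: may a request to page_host see a cookie set for cookie_domain?

    Assumption: a plain suffix match on the host name, which ignores
    paths, flags and the other details real browsers also check.
    """
    cookie_domain = cookie_domain.lstrip(".").lower()
    page_host = page_host.lower()
    return page_host == cookie_domain or page_host.endswith("." + cookie_domain)


# The two boxes from the example above.
boxes = {"Box A": "facebook.com", "Box B": "twitter.com"}

for host in ("www.facebook.com", "twitter.com", "newbreedofgeek.blogspot.com"):
    visible = [name for name, domain in boxes.items() if can_access(domain, host)]
    print(host, "->", visible)
```

In this sketch a request to www.facebook.com only ever sees Box A, a request to twitter.com only sees Box B, and a request to my blog sees neither.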

Step 4) As mentioned above, as you are logged into facebook.com and move around the site, Facebook continues to update the cookies as well as add new cookies that track your movements within the site. Some of these cookies are also used to identify you and would usually have some unique code that represents you.

Step 5) User now logs out of facebook.com. This is when facebook.com, like any other site, SHOULD delete all cookies that were created while you were signed in. But a lot of sites don't do this and, when confronted about it, say things like "we don't store any personal data that can identify you and we only use it to enhance your surfing experience when you visit the site again".

Step 6) User now moves to a third party site that uses Facebook plugins like the "Like" button. For example, the user moves on to one of my blog posts, which has the Like button plugin at the bottom of each post:
e.g: http://newbreedofgeek.blogspot.com/2011/08/parramatta-park-duathlon.html

Step 7) Now here is where the privacy accusation comes in. Some people are claiming that even though you are NOT signed into facebook.com, because Facebook does not clear all the cookies when you sign out, it is using some kind of mechanism to track you on the third party site (in this example it's http://newbreedofgeek.blogspot.com) and log what you are reading.

But how can that be? As I mentioned in step 3, a site can only read cookies on its own domain, so now that we are on the http://newbreedofgeek.blogspot.com domain, how can any code get access to the cookies that were stored on the facebook.com domain? Simple: the "Like" button plugin is an "iframe" based tool that is placed onto your own HTML page on http://newbreedofgeek.blogspot.com, but the plugin URL comes from facebook.com. So each time the browser makes a request to fetch the plugin from facebook.com, all of the facebook.com cookies are sent along in the HTTP header.

So if facebook.com did NOT clear every cookie that can uniquely identify you when you logged out, then, because these cookies are sent in the HTTP header of the request that fetches the Like button plugin code, Facebook can grab the URL of the third party site you are on and store it against the unique value that identifies you. Essentially, they would then know what you are reading on any website that uses the Like button or other Facebook plugins. And as millions of sites use Facebook plugins, your reading habits could be tracked very easily.
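
Here is a rough Python sketch of what the receiving end of such a plugin request could look like. To be clear, this is NOT Facebook's code. It is just my illustration of the idea, and it rests on a few assumptions: that the page URL arrives in the "Referer" header, that "datr" is the cookie used as the identifier, and that an in-memory dictionary (`visits`) stands in for a real log store. The function name `log_plugin_request` is made up for the example.

```python
from collections import defaultdict
from http.cookies import SimpleCookie

# Hypothetical store: identifier value -> list of pages where the plugin was loaded.
visits = defaultdict(list)


def log_plugin_request(headers: dict, id_cookie: str = "datr") -> None:
    """Record the embedding page against an identifying cookie, if both are present."""
    cookies = SimpleCookie()
    cookies.load(headers.get("Cookie", ""))
    morsel = cookies.get(id_cookie)
    referer = headers.get("Referer")  # assumption: the browser sends the page URL here
    if morsel is not None and referer:
        visits[morsel.value].append(referer)


# A request like the one the browser makes to fetch the Like button iframe.
request_headers = {
    "Host": "www.facebook.com",
    "Referer": "http://newbreedofgeek.blogspot.com/2011/08/parramatta-park-duathlon.html",
    "Cookie": "datr=rHRKBTo-8PBAwlpSFTIRL1C8gcE; p=2439; locale=en_GB",
}
log_plugin_request(request_headers)
print(dict(visits))
```

The point of the sketch is only that nothing clever is required: one leftover cookie plus the address of the embedding page is enough to build a reading history.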


Part 2: 
The Facebook Cookies Dissected and their Usage Identified

I used the Firebug cookie inspector to monitor and inspect the cookies Facebook drops and updates when you use their site. I monitored the cookie activity during the following 3 cases (note: I only paid attention to the cookies dropped on the facebook.com domain; a lot more cookies are dropped as well, which are probably used for ad serving but are on different domains).

Case 1: User successfully signs into facebook.com and moves around the site.

Here are the cookies found during this phase (cookie name : cookie value):

presence : EM317106464L243REp999_5f500898692F0X317106362035Y0OQ0EsF0CEblFDacF14G317106464PCC
p : 2439
xs : 62%3A637f02f1459828704e7e967c0db99972601
sct : 138171086356
s : Aa6KmQVH-IzYB6taNp
lu : RANTXi3TVg3XLscvXPVF7yp29w
c_user : 5000006920
checkpoint :%7B%22u%22%3A500898692%2C%22t%22%3A1317106349%2C%22ca%22%3A%22%22%2C%22step%22%3A1%2C%22n%22%2C%2%3A%22skhGRVhfs78%3D%22%2C%22f%22%3A205050996206056%2C%22la%22%3Anull%2C%22ln%22%3Anull%2C%22ssp%22%3Anull%2C%22s%22%3A%22AWXufr8GYQZjn8ke%22%2C%22cs%22%2C%2%3Anull%7D
datr : rHRKBTo-8PBAwlpSFTIRL1C8gcE
act : 131971063C43425%2F6
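
As a side note, the "checkpoint" value is percent-encoded, so it is easy to make it readable. The short Python sketch below decodes only the first part of the value from the list above. Reading "t" as a Unix timestamp is my guess based on how the number looks, not something Facebook documents here.

```python
from datetime import datetime, timezone
from urllib.parse import unquote

# First part of the checkpoint value listed above (the rest is left out for brevity).
checkpoint_prefix = (
    "%7B%22u%22%3A500898692%2C%22t%22%3A1317106349"
    "%2C%22ca%22%3A%22%22%2C%22step%22%3A1"
)

# Turn %7B, %22, %3A ... back into {, ", : ...
decoded = unquote(checkpoint_prefix)
print(decoded)

# Assumption: "t" is seconds since the Unix epoch.
stamp = datetime.fromtimestamp(1317106349, tz=timezone.utc)
print(stamp.isoformat())
```

The decoded text looks like the start of a JSON object. Keep in mind that I altered the cookie values slightly, so the full value may not parse as valid JSON.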


Case 2: User successfully logs out of facebook.com

Here are the cookies found during this phase (cookie name : cookie value):

datr : rHRKBTo-8PBAwlpSFTIRL1C8gcE
checkpoint : %7B%22u%22%3A500898692%2C%22t%22%3A1317106349%2C%22ca%22%3A%22%22%2C%22step%22%3A1%2C%22n%22%2C%2%3A%22skhGRVhfs78%3D%22%2C%22f%22%3A205050996206056%2C%22la%22%3Anull%2C%22ln%22%3Anull%2C%22ssp%22%3Anull%2C%22s%22%3A%22AWXufr8GYQZjn8ke%22%2C%22cs%22%2C%2%3Anull%7D

p : 2439
act : 1317p1065068B02%2F5
locale : en_GB
lu : RAfUXyaWkITwQ_jARpBW-zrvg
lsd : GsgCC
reg_fb_gate : http%3A%2F%2Fwww.facebook.com%2Findex.php%3Flh%3D0662b518dfa1f7c48be10aa18ca065b4%26eu%3D7WjU9u9xwejEGQYxDqcBZw
wd : 1600x459

Case 3: User then moves on to a third party site which uses Facebook's "Like" plugin, in this case one of my blog posts


datr : rHRKBTo-8PBAwlpSFTIRL1C8gcE
checkpoint : %7B%22u%22%3A500898692%2C%22t%22%3A1317106349%2C%22ca%22%3A%22%22%2C%22step%22%3A1%2C%22n%22%2C%2%3A%22skhGRVhfs78%3D%22%2C%22f%22%3A205050996206056%2C%22la%22%3Anull%2C%22ln%22%3Anull%2C%22ssp%22%3Anull%2C%22s%22%3A%22AWXufr8GYQZjn8ke%22%2C%22cs%22%2C%2%3Anull%7D
p : 2439
act : 1317p1065068B02%2F5
locale : en_GB
lu : RAfUXyaWkITwQ_jARpBW-zrvg
lsd : GsgCC
reg_fb_gate : http%3A%2F%2Fwww.facebook.com%2Findex.php%3Flh%3D0662b518dfa1f7c48be10aa18ca065b4%26eu%3D7WjU9u9xwejEGQYxDqcBZw
reg_fb_ref : http%3A%2F%2Fwww.facebook.com%2Findex.php%3Flh%3D0662b518dfa1f7c48be10aa18ca065b4%26eu%3D7WjU9u9xwejEGQYxDqcBZw


What do we notice from this?
1) Firstly, after the user logs out of facebook.com in Case 2, you will notice that half of the cookies that were used when you were logged in (5 of 10) are still around. The state of some of these cookies has changed, i.e. the values have changed, and we'll look at that below in more detail.

2) When the user moves on to the third party site, you can see that a lot of Facebook's cookies from Case 2 are still present when the Like plugin loads on that site.

Now the question we have to ask is this: after the user logs out of facebook.com, is there still some data stored in the cookies that can "uniquely" identify you? The first step towards answering this would be to compare the cookie values between Case 1 and Case 2, i.e. the logged in state and the logged out state.

Here is the comparison (cookie name : result of comparison)

presence : Cookie Removed
p : Still there and same value
xs : Cookie Removed
sct : Cookie Removed
s : Cookie Removed
lu : Still there but different value
c_user : Cookie Removed
checkpoint : Still there and same value
datr : Still there and same value
act : Still there but different value
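
If you want to repeat this comparison yourself, it can be done in a few lines of Python. This is only a sketch of the comparison I did by eye: the dictionaries hold a subset of the values listed in Case 1 and Case 2 (the long "presence" and "checkpoint" values are left out for brevity), and the function name `compare` is my own.

```python
# Subset of the Case 1 (logged in) cookies from the list above.
logged_in = {
    "p": "2439",
    "xs": "62%3A637f02f1459828704e7e967c0db99972601",
    "sct": "138171086356",
    "s": "Aa6KmQVH-IzYB6taNp",
    "lu": "RANTXi3TVg3XLscvXPVF7yp29w",
    "c_user": "5000006920",
    "datr": "rHRKBTo-8PBAwlpSFTIRL1C8gcE",
    "act": "131971063C43425%2F6",
}

# Matching subset of the Case 2 (logged out) cookies.
logged_out = {
    "datr": "rHRKBTo-8PBAwlpSFTIRL1C8gcE",
    "p": "2439",
    "act": "1317p1065068B02%2F5",
    "lu": "RAfUXyaWkITwQ_jARpBW-zrvg",
}


def compare(before: dict, after: dict) -> dict:
    """Classify each cookie from `before` as removed, unchanged or changed in `after`."""
    result = {}
    for name, value in before.items():
        if name not in after:
            result[name] = "Cookie Removed"
        elif after[name] == value:
            result[name] = "Still there and same value"
        else:
            result[name] = "Still there but different value"
    return result


for name, status in compare(logged_in, logged_out).items():
    print(name, ":", status)
```

Cookies that only appear after logging out (like "locale" or "lsd") are not covered by this sketch, since it only walks through the logged in list.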


So here is the list of cookies that are still present, with the same value or a different value, after the user has logged out. In other words, if Facebook wanted to track you, they could use any of these cookies. It would be easier to track you if the cookie had the same value regardless of your signed in or signed out state, but even if the values are different they could still use some kind of mapping to identify you.

p : Still there and same value
lu : Still there but different value
checkpoint : Still there and same value
datr : Still there and same value
act : Still there but different value


And now to confirm whether these cookies (which have the same value or a different value) are sent in the HTTP header of the request that fetches the Facebook Like plugin code on a third party site.

Here is a screen shot of the HTTP header call that is made as soon as you load the third party site and the Facebook Like plugin is about to be rendered on the page.


So as you can see, all the above cookies are sent to Facebook (except the 'checkpoint' one, not sure why the HTTP request did not include this in the header).


Part 3: 
A Conclusion on the Findings

If you look at the values of cookies like "act", "datr", "lu" and "lsd", which are sent to Facebook via the HTTP header when the plugin loads on the third party site, they do look like some kind of unique key values. So the real question is: what are they using these values for? Is it something totally unrelated, or are they using them to map viewing habits to Facebook users?

In my humble opinion, Facebook is NOT tracking anyone. Doing something like this is a major violation of a user's privacy, and I like to think that the business decision makers at Facebook are not dumb enough to do it, as the risk of getting caught and exposed could lead to a mass exodus of users who feel like they are being spied on. That being said, it is indeed possible for Facebook or any other website to build these kinds of tracking mechanisms, so don't be surprised if your surfing habits are being logged by websites. After all, it's a well known fact that ISPs maintain logs about your surfing habits, and yet a lot of people choose to be in denial about it.

So could Facebook be tracking your surfing activity on third party websites? My answer is, they could if they wanted to but I doubt they will risk it.


Disclaimer:
1) All information provided here is based on my research and opinions and in no way should be taken as fact. It's provided for educational purposes only.
2) The snapshot of cookies and their values was taken around 26th September 2011. Facebook could have very well changed these cookies and values since then.
3) I have altered the values of the cookies slightly.


Please let me know your thoughts in the comments section and happy surfing!



2 comments:

  1. It's unbelievable.. but in the same time I'm not surprised that Facebook will do just about anything to increase their earnings caring not about us (the users) and our right to privacy!! 

    They of course have cookies tracking, so even if they deny it, how could we possibly trust them?! 

    My take on this: http://www.vectorash.ro/facebook_tracking_activity_after_logoff/

  2. Just a sample comment to test if this Disqus comment tool is working as i've seen some issue lately...
