
Category Archives: Ethical Concerns

Thwarting SSL Inspection Proxies

A disturbing trend in corporate IT departments everywhere is the introduction of SSL inspection proxies.  This blog post explores some of the ethical concerns about such proxies and proposes a provider-side technology solution to allow clients to detect their presence and alert end-users.  If you’re well-versed in concepts about HTTPS, SSL/TLS, and PKI, please skip down to the section entitled ‘Proposal’.

For starters, e-commerce and many other uses of the public Internet are only possible because messages can be encrypted.  The encryption of information across the World Wide Web is possible through a suite of cryptography technologies and practices known as Public Key Infrastructure (PKI).  Using PKI, servers can offer a “secure” variant of the HTTP protocol, abbreviated as HTTPS.  This variant encapsulates an application-level protocol, HTTP in this case, inside a secure transport protocol called Secure Sockets Layer (SSL), which has since been superseded by a similar, more secure version, Transport Layer Security (TLS).  Most users of the Internet are familiar with the symbolism common to such secure connections: when a user browses a webpage over HTTPS, some visual iconography (usually a padlock) as well as a stark change in the presentation of the page’s location (usually a green indicator) show the end-user that the page was transmitted over HTTPS.

SSL/TLS connections are protected in part by a server certificate stored on the web server.  Website operators purchase these server certificates from a small number of competing companies, called Certificate Authorities (CA’s), that can generate them.  The web browsers we all use are preconfigured to trust certificates that are “signed” by a CA.  The way certificates work in PKI allows certain certificates to sign, or vouch for, other certificates.  For example, when you visit Facebook.com, you see your connection is secure, and if you inspect the message, you can see the server certificate Facebook presents is trusted because it is signed by VeriSign, and VeriSign is a CA that your browser trusts to sign certificates.

So… what is an SSL Inspection Proxy?  Well, there is a long history of employers and other entities using technology to do surveillance of the networks they own.  Most workplace Internet Acceptable Use Policies state clearly that the use of the Internet using company-owned machines and company-paid bandwidth is permitted only for business use, and that the company reserves the right to enforce this policy by monitoring this use.  While employers can easily review and log all unencrypted traffic that flows over their networks, that is, any request for a webpage and the returned rendered output, the increasing prevalence of HTTPS as a default has frustrated employers in recent years.  Instead of being able to easily monitor the traffic that traverses their networks, they have had to resort to less-specific ways to infer usage of secure sites, such as DNS recording.

(For those unaware and curious, the domain-name system (DNS) allows client computers to resolve a hostname, such as Yahoo.com, to its IP address, 72.30.38.140.  DNS traffic is not encrypted, so a network operator can review the requests of any computers to translate these names to IP addresses to infer where they are going.  This is a poor way to survey user activity, however, because many applications and web browsers do something called “DNS pre-caching”, where they will look up name-to-number translations in advance to quickly service user requests, even if the user hasn’t visited the site before.  For instance, if I visited a page that had a link to Playboy.com, even if I never click the link, Google Chrome may look up that IP address translation just in case I ever do, in order to load the page faster.)

So, employers and other network operators are turning to technologies that range from the ethically questionable, such as Deep Packet Inspection (DPI), which looks into all the application traffic you send to determine what you might be doing, to the downright unethical practice of using SSL Inspection Proxies.  Now, I concede I have an opinion here, that SSL Inspection Proxies are evil.  I justify that assertion because an SSL Inspection Proxy causes your web browser to lie to its end-user, giving them a false assurance of security.

What exactly are SSL Inspection Proxies?  SSL Inspection Proxies are servers set up to execute a Man-In-The-Middle (MITM) attack on a secure connection, on behalf of your ISP or corporate IT department snoops.  When such a proxy exists on your network and you make a secure request for https://www.google.com, the network redirects your request to the proxy.  The proxy then makes a request to https://www.google.com for you, returns the results, and then does something very dirty: it creates a lie in the form of a bogus server certificate.  The proxy will create a false certificate for https://www.google.com, sign it with a different CA it has in its software, and hand the response back.  This “lie” happens in two manners:

  1. The proxy presents itself as the server you request, instead of the actual server you requested.
  2. The certificate the proxy hands back with the page response is a different one than the one actually presented by that provider, https://www.google.com in this case.

This interchange would look like this:

It sounds strange to phrase the activities of your own network as an “attack”, but this type of interaction is precisely that, and it is widely known in the network security industry as a MITM attack.  As you can see, a different certificate is handed back to the end-user’s browser than what http://www.example.com presented in the above image.  Why?  Well, each server certificate that is presented with a response is used to encrypt that data.  Server certificates have what is called a “public key” that everyone knows, which uniquely identifies the certificate, and they also have a “private key”, known only by the web server in this example.  A public key can be used to encrypt information, but only a private key can decrypt it.  Without an SSL Inspection Proxy, that is, in what normally happens, when you make a request to http://www.example.com, example.com first sends back the public key of the server certificate for its server to your browser.  Your browser uses that public key to encrypt the request for a specific webpage as well as a ‘password’ of sorts, and sends that back to http://www.example.com.  Then, the server uses its private key to decrypt the request, processes it, and then uses that ‘password’ (called a session key) to send back an encrypted response.  That doesn’t work so well for an inspection proxy, because this SSL/TLS interchange is designed to thwart any interloper from being able to intercept or see the data transmitted back and forth.

The reason an SSL Inspection Proxy sends a different certificate back is so it can see the request the end-user’s browser is making so it knows what to pass on to the actual server as it injects itself as a proxy to this interchange.  Otherwise, once the request came to the proxy, the proxy could not read it, because the proxy wouldn’t have http://www.example.com’s private key.  So, instead, it generates a public/private key and makes it appear like it is http://www.example.com’s server certificate so it can act on its behalf, and then uses the actual public key of the real server certificate to broker the request on.
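To make the key juggling above concrete, here is a minimal sketch using toy “textbook RSA” with tiny primes.  Everything in it is an assumption for illustration: the function names, the primes, and the session key value of 42 are made up, and real TLS uses far larger keys, padding, and often a different key exchange entirely.

```python
# Toy "textbook RSA" with tiny primes, for illustration only.
# Requires Python 3.8+ for pow(e, -1, phi).

def make_keypair(p, q, e=17):
    """Return (public_key, private_key) built from primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e
    return (n, e), (n, d)

def encrypt(public_key, message):
    n, e = public_key
    return pow(message, e, n)

def decrypt(private_key, ciphertext):
    n, d = private_key
    return pow(ciphertext, d, n)

# Hypothetical key pairs: one for the real server, one minted by the proxy.
server_public, server_private = make_keypair(61, 53)
proxy_public, proxy_private = make_keypair(89, 97)
session_key = 42  # stands in for the 'password' the browser picks

# Normal case: the browser seals the session key with the server's public key.
sealed_for_server = encrypt(server_public, session_key)
seen_by_server = decrypt(server_private, sealed_for_server)

# An interloper holding only its own private key cannot recover it.
seen_by_eavesdropper = decrypt(proxy_private, sealed_for_server)

# Inspection case: the browser was handed the proxy's public key instead,
# so the proxy can read the session key and re-seal it for the real server.
sealed_for_proxy = encrypt(proxy_public, session_key)
seen_by_proxy = decrypt(proxy_private, sealed_for_proxy)
forwarded = encrypt(server_public, seen_by_proxy)
```

The only way the proxy gets to read anything is by convincing the browser to encrypt against the proxy's key rather than the server's, which is exactly what the forged certificate accomplishes.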

Proposal

The reason an SSL Inspection Proxy can even work is because it signs a fake certificate it creates on-the-fly using a CA certificate trusted by the end user’s browser.  This, sadly, could be a legitimate certificate (called a SubCA certificate), which would allow anyone who purchases a SubCA certificate to create any server certificate they wanted to, and it would appear valid to the end-user’s browser.  Why?  A SubCA certificate is like a regular server certificate, except it can also be used to sign OTHER certificates.  Any system that trusts the CA that created and signed the SubCA certificate would also trust any certificate the SubCA signs.  Because the SubCA certificate is signed by, let’s say, the Diginotar CA, and your web browser is preconfigured to trust that CA, your browser would accept a forged certificate for http://www.example.com signed by the SubCA.  Thankfully, SubCA’s are frowned upon and increasingly difficult for any organization to obtain because they do present a real and present danger to the entire certificate-based security ecosystem.
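The trust relationship described above can be sketched as a walk up a chain of issuers.  This is a simplified model under stated assumptions: the `Cert` class, the `chain_is_trusted` function, and all CA names are hypothetical, and no cryptographic signatures are checked here, only who claims to have issued whom.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    is_ca: bool = False  # True for certificates allowed to sign others

def chain_is_trusted(leaf, intermediates, trusted_roots):
    """Follow issuer links from the leaf until a trusted root is reached."""
    by_subject = {cert.subject: cert for cert in intermediates}
    current = leaf
    seen = set()
    while current.subject not in seen:
        seen.add(current.subject)
        if current.issuer in trusted_roots:
            return True
        parent = by_subject.get(current.issuer)
        if parent is None or not parent.is_ca:
            return False  # unknown issuer, or issuer may not sign certificates
        current = parent
    return False  # issuer loop

# Hypothetical names throughout.
browser_roots = {"Public Root CA"}
forged_by_proxy = Cert("www.example.com", issuer="Corp Proxy CA")
sub_ca = Cert("Purchased SubCA", issuer="Public Root CA", is_ca=True)
forged_by_sub_ca = Cert("www.example.com", issuer="Purchased SubCA")

# A stock browser rejects the proxy's forgery...
rejected = chain_is_trusted(forged_by_proxy, [], browser_roots)
# ...but accepts a forgery signed by a SubCA chained to a root it trusts.
accepted_via_sub_ca = chain_is_trusted(forged_by_sub_ca, [sub_ca], browser_roots)
# Once IT installs the proxy's CA as a root, the proxy's forgery passes too.
accepted_after_install = chain_is_trusted(
    forged_by_proxy, [], browser_roots | {"Corp Proxy CA"}
)
```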

However, as long as the MITM attacker (or, your corporate IT department, in the case of an SSL Inspection Proxy scenario) can coerce your browser to trust the CA used by the proxy, then the proxy can create all the false certificates it wants, sign them with the CA certificate they coerced your computer to trust, and most users would never notice the difference.  All the same visual elements of a secure connection (the green coloration, the padlock icon, and any other indicators made by the browser) would be present.  My proposal to thwart this:

Website operators should publish a hash of the public key of their server certificates (the certificate thumbprint) as a DNS record.  For DNS top-level domains (TLD’s) that are protected with DNSSEC, as long as this DNS record that contains the hash for http://www.example.com is cryptographically signed, neither the corporate IT department of local clients nor a network operator could forge a certificate without creating a verifiable breach that clients could check for and then warn end users about.  Of course, browsers would need to be updated to do this kind of verification in the form of a DNS lookup in conjunction with the TLS handshake, but provided their resolvers checked for an additional certificate thumbprint DNS record anyway, this would be a relatively trivial enhancement to make.
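The client-side check could look roughly like the sketch below.  It assumes SHA-256 as the hash, assumes the thumbprint has already been fetched from DNS and validated with DNSSEC by some other component, and uses hypothetical function names and placeholder key bytes; none of this is a real browser API.

```python
import hashlib
import hmac

def thumbprint(public_key_bytes: bytes) -> str:
    """Hex SHA-256 digest of the raw public key bytes (hash choice is an assumption)."""
    return hashlib.sha256(public_key_bytes).hexdigest()

def certificate_matches_dns(presented_public_key: bytes, published_thumbprint: str) -> bool:
    """Compare the key seen in the TLS handshake with the DNS-published thumbprint."""
    return hmac.compare_digest(
        thumbprint(presented_public_key), published_thumbprint.lower()
    )

# Placeholder bytes standing in for real key material.
real_key = b"placeholder for www.example.com's real public key"
proxy_key = b"placeholder for the key inside the proxy's forged certificate"

# What the site operator would publish in its signed DNS zone.
published = thumbprint(real_key)

direct_connection_ok = certificate_matches_dns(real_key, published)
intercepted_connection_ok = certificate_matches_dns(proxy_key, published)
```

A browser that saw `intercepted_connection_ok` come back false would know the certificate it was handed is not the one the site operator vouched for, no matter which CA signed it.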

EDIT: (April 15, 2013): There is in fact an IETF working group now addressing this proposal, very close to my original proposal! Check out the work of the DNS-based Authentication of Named Entities (DANE) group here: http://datatracker.ietf.org/wg/dane/ – on February 25, they published a working draft of this proposed resolution as the new “TLSA” record.  Great minds think alike. :)

 
 


CNN Lies to Every One of Its Web Viewers

When is it okay to flat out lie to your users?  I would argue: Never.  But the website of one of the world’s most watched sources of news, CNN, does just that.

Near the bottom of every article is a section called “We recommend” and “From around the web”.  These sections list about six links to other articles either on CNN itself, other Turner properties, or simply as a paid referral service for selected partners.  So what’s my beef with this?  It’s not the targeted marketing, it’s the outright lie I noticed they make when you hover over any of those links with your mouse.

For some background, I’m a huge dissident against outbound link tracking.  It’s fundamentally the same as gluing a GPS tracking device to your forehead and giving a tracking device to the website you’re visiting.  I have a problem with it because I think there is a fundamental freedom that is eroded by this technology – the freedom to consume information without being tracked for doing so.  Do I have the right to pick up a magazine and browse through it without giving someone my telephone number?  I would say yes; I think it is a natural right to be able to consume information without having your consumption observed.

But my belief here isn’t realistic — tracking basic visitor behavior and consumer preferences is the basic monetization and sustainability model for most of the Web as we know it.  So, this world doesn’t mesh with my perfect world, but at least I should know if someone is observing my behavior, right?  Observing CNN’s privacy policy one can clearly see the word “link” is referenced twice, once in relation to third-party sites that may cookie you, and once for integration to social media or other partner sites that may have differing privacy policies.

Okay, fair enough, therefore I should expect that if I am surfing just CNN’s website, if I disable cookies, and if I turn on my do not track header, I should expect not to be tracked, right?  No, and the reason is that I cannot tell when I am still on the CNN site in order to stay within it.  CNN has specifically coded its site to lie to me about whether I’m staying within it or navigating away.  For example, if I hover over one link in these two sections, I see the following in my browser status bar:

http://www.cnn.com/2012/07/15/sport/jason-kidd-arrested/index.html

I right-clicked the link in Chrome and copied the URL.  Then curiously I noticed the link read differently in the browser status bar when hovering over it, this time reading:

http://traffic.outbrain.com/network/redir?key=ad68e2a0a57f3eb04e4553bf2e80b6b2&rdid=349349184&type=MVLVS_d/t1_ch&in-site=false&req_id=968ab83e0a0f44e584d8744520d2aea0&agent=blog_JS_rec&recMode=4&reqType=1&wid=100&imgType=0&refPub=0&prs=true&scp=false&version=59070&idx=3

Youch, what’s that, and why did it change?  On closer inspection, by viewing the source of the page, I can see the target href of the link is exactly as reproduced above, going to traffic.outbrain.com.  I peeked at some other URL’s in the same section that I had not yet left-clicked or right-clicked and noticed this:

<a target="_self" href="http://www.cnn.com/2012/07/15/sport/jason-kidd-arrested/index.html" onmousedown="this.href='http://traffic.outbrain.com/network/redir?key=10b8398e7c07227c8a8786b1682f1707&amp;rdid=349349184&amp;type=WMV_d/t1_ch&amp;in-site=false&amp;req_id=968ab83e0a0f44e584d8744520d2aea0&amp;agent=blog_JS_rec&amp;recMode=4&amp;reqType=1&amp;wid=100&amp;imgType=0&amp;refPub=0&amp;prs=true&amp;scp=false&amp;version=59070&amp;idx=4';return true;" onclick="javascript:return(true)">Knicks’ Jason Kidd arrested on suspicion of DWI</a>

And herein is the deception: this piece of inline JavaScript code changes the target of the link at the moment it is clicked to go to the traffic.outbrain.com address.  Because the target href originally points to the final destination of the article, hovering over it gives the false impression that my click will take me directly to it.  Instead, at the moment I click it, the target href is changed to the potentially unscrupulous third party, and I am given no browser notification this will happen prior to my click.  Once traffic.outbrain.com responds, it redirects me back to the original CNN article I initially wanted to view.  On a broadband connection, you probably wouldn’t even notice the superfluous page load and redirect back to CNN’s site.  Deceptive!
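If you wanted to hunt for this trick on other pages, one rough approach is to scan the markup for anchors whose mouse handlers assign a new `this.href` on a different host than the href shown on hover.  The sketch below is only that, a sketch: the class and function names are made up, the regular expression only catches the simple inline pattern shown above, and the sample link uses a shortened placeholder query string rather than the real one.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Matches the simple inline pattern: this.href='...'
HREF_REWRITE = re.compile(r"this\.href\s*=\s*['\"]([^'\"]+)['\"]")

class RewriteFinder(HTMLParser):
    """Collects (href shown on hover, href swapped in on click) pairs."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        shown = attr.get("href") or ""
        for event in ("onmousedown", "onclick"):
            match = HREF_REWRITE.search(attr.get(event) or "")
            if match and urlparse(match.group(1)).netloc != urlparse(shown).netloc:
                self.findings.append((shown, match.group(1)))

def find_rewritten_links(html):
    finder = RewriteFinder()
    finder.feed(html)
    return finder.findings

# Sample modeled on the CNN link; "key=EXAMPLE" is a placeholder.
SAMPLE = (
    '<a target="_self" '
    'href="http://www.cnn.com/2012/07/15/sport/jason-kidd-arrested/index.html" '
    'onmousedown="this.href=\'http://traffic.outbrain.com/network/redir'
    '?key=EXAMPLE&amp;idx=4\';return true;" '
    'onclick="javascript:return(true)">Knicks\' Jason Kidd arrested</a>'
)
HONEST = '<a href="http://www.cnn.com/some/article.html">An honest link</a>'

deceptive_links = find_rewritten_links(SAMPLE)
honest_links = find_rewritten_links(HONEST)
```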

So, sure, why should anyone care?  Isn’t this just plumbing, technology, and a toolbox of tricks inherent to the Web?  Maybe, but the problem here is the lie.  You do not lie to your users.  Ever.  Outbound web tracking is not a web beacon.  Web beacons are a different kind of “evil” – usually some JavaScript that opens an IFRAME to a third-party site that issues a cookie to track you; however, web beacons are covered by CNN’s privacy policy, so if they were equivalent, it’s all fair.  Web beacons can be simply disabled by turning off third-party cookies in today’s browsers.  This is precisely why outbound link tracking is becoming popular – it circumvents the privacy management tools most users have available and have knowledge of.  Outbound link tracking is no more insidious than web beacons are, but the implementation of it often lies to the end user about what their action will do (a click in this case).  An honest implementation would either clearly state in the privacy policy that any links you click may be link tracked, or simply point the target href at the link tracking site from the start, rather than rewriting it the moment the user clicks, so the browser status bar is truthful on hover (Twitter’s t.co strategy).

Well, at least it’s just CNN at fault here.  At least no one else would stoop to such shady tactics.  Surely not Google (/url) or Facebook (l.php).. no, definitely not…

 
 


P2P DNS: Not solving the real problem of centralized control

The more tech-savvy probably noted with passing interest the news blip this last week from Peter Sunde, co-founder of The Pirate Bay (a notorious website for finding BitTorrent .torrent files for everything from public domain books to copyrighted music, video, and warez), about a new peer-to-peer Domain Name System in response to recent US authoritarian action in seizing domain names.  The specific instance that is causing so much cyberangst is that the Department of Homeland Security and Immigration and Customs Enforcement, bowing to the pressures of media giants, have shut down RapGodfathers.com.  By “shut down”, I mean these enforcement agencies didn’t just confiscate server equipment, they actually seized DNS hostnames assigned by the site’s registrar, through ICANN.  The rest of the world has long complained that IANA and ICANN, bodies that assign all sorts of global numbering and addressing schemes, are puppets of the U.S. Government, and even a number of the American tech crowd feel that the actions of these bodies over time are counter to the perceived free and open nature of the Internet.

While DNS isn’t that important from a purely technological networking perspective, that is, it is simply a redirection service, almost no denizens of the web could find Google, Facebook, or Bing without it.  DNS is a protocol that allows a simple name, such as example.com, to be translated into an IP address, serving the role of a phone book of sorts.  I’ll have to admit, I’d probably lose all my friends if I lost my EVO, since I depend on my address books over memorized phone numbers these days.  Likewise, I only know some of Google’s servers, my work, and my home IP address by heart, but for everything else, I’m dependent on DNS to tell me (and my browser) where to find things.  In response to ICE’s attack on the perception that domain names should not be commandeered by governments, Sunde has started a project to offer up an alternative DNS service over peer-to-peer networks, to remove the ability for corporations or governments to seize domains.  Unlike failed ‘alternate root’ schemes in the past, this shift in technology would, as the thought goes, allow the domain name resolution service to be operated by consensus.  In such a world, ICE couldn’t have seized the RapGodfathers.com domain, nor could any corporation with a similar name as a private individual file a copyright claim to take a domain name away from them.  Do we have a fundamental right to allow the public to sign off on who gets to hold what URL properties?

The rhetoric on the issue has been amusing at best and eye-rolling at worst, when people like Keir Thomas make outlandish claims that an alternate DNS scheme will be ‘heartily embraced by terrorists and pedophiles’.  Sadly, such claims showcase a true lack of technical understanding about how the networking protocols of the Internet actually work.  Coming back to my phone book analogy, a P2P DNS scheme would be akin to GOOG-411 providing phone numbers instead of my local phonebook (which sits unused, now 5 years old, mind you):  Anyone can own a phone number or IP address, but the way you resolve a name to a number doesn’t really, on a true technical level, change anything about who controls access and availability to resources.  If I could configure my computer to point cocacola.com to illegal content, that doesn’t change the fact the content was out there to point to in the first place, nor does it make it any easier to find for those not seeking how to access it.

The real threat is when governments start mandating control over a protocol that hasn’t yet become a household name — BGP.  Around in some form since 1982, BGP doesn’t translate human-recognizable names into network numbers, it actually describes where to route those numbers.  When the Great (Fire)wall of China censors where its citizens can go, it does so by dictating that the numbers it doesn’t want you to dial call non-existent places, or more realistically in the network world, that the paths to route your request to are wrong or dead-end.  Back to the analogy, controlling BGP is the end-game on the Internet– instead of taking over the phone book’s printing presses, you take over the phone company’s switching stations themselves.  For those wishing to make the Internet more autonomous and decentralized, the future to securing the existing global communications network from superpowers’ total control lies in alternatives to BGP, not DNS.
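The phone book versus switching station distinction can be shown with a toy model.  This is an illustration under loud assumptions: the names, the "blackhole" label, and the tables are invented, the addresses come from the ranges reserved for documentation, and real BGP involves route advertisements between networks rather than a single lookup table.

```python
import ipaddress

def resolve(name, phone_book):
    """The DNS role: turn a name into a number."""
    return phone_book.get(name)

def next_hop(address, routing_table):
    """The routing role: longest-prefix match; None when no route covers the address."""
    ip = ipaddress.ip_address(address)
    best = None
    for prefix, hop in routing_table.items():
        network = ipaddress.ip_network(prefix)
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    return best[1] if best else None

# Two ways of looking up the same name: the official one and a P2P alternative.
official_phone_book = {"example.com": "192.0.2.10"}
p2p_phone_book = {"example.com": "192.0.2.10"}

# The operator's routers, before and after a more specific dead-end route is added.
open_routes = {"192.0.2.0/24": "upstream"}
censored_routes = {"192.0.2.0/24": "upstream", "192.0.2.10/32": "blackhole"}

address = resolve("example.com", p2p_phone_book)
reachable = next_hop(address, open_routes)
still_blocked = next_hop(address, censored_routes)
```

Swapping `official_phone_book` for `p2p_phone_book` changes nothing about `still_blocked`: whoever controls the routing table decides whether the number can be dialed at all.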

However, P2P BGP isn’t going to happen, because whereas DNS instructs your computers where to go to find information, an attribute you can control yourself, BGP instructs your ISP’s routers where to get their information, and you won’t ever control their hardware.  And really, the fundamental issue is there’s no clear way to keep the current networking stack of protocols we collectively call the Internet free and open, as we like to believe it should be.  Instead, for those wanting to leverage the crowd to free the Internet from tyrannous regimes or powerful special interests, your best bet for the future is Freenet or Tor, layers that sit on top of the Internet’s infrastructure and provide their own.  They route requests and traffic through a “tunnel-atop-the-tunnels” approach that cannot be easily discerned nor controlled.  If the history of Internet governance has taught us anything, it’s that if something can be controlled, the wrong entities end up controlling it.  The approach that Freenet and similar onion routing networks take is to remove control and technologically favor independent voices.  Instead of writing new technologies like P2P DNS to address yesterday’s problems, I heartily recommend those with the interest and aptitude look into key-routing networks like Freenet, which by their very design prevent eavesdropping and circumvent traditional control mechanisms.  Just in their awkward teenage years, these will be the technology tools of digital patriots in the future, not P2P DNS on a network protocol stack that is increasingly being pulled out of the grasps of its grandfathers and architects.

I will have to commend Sunde’s efforts though, on the principle that if you do some Google keyword searching, ICE’s seizure of RapGodfathers.com was only a speck on the web’s map until Sunde’s project was announced.  Raising awareness of who holds the keys to the words we write, read, and share is paramount in a world where most of the people who write, read, and share their thoughts over the Internet are generally otherwise without a clue to how their ideas are allowed or blocked by the powers above.

 

Posted by on December 3, 2010 in Ethical Concerns, Privacy

 

Facebook OpenGraph: A Good Laugh or a Chilling Cackle?

If you want to sell a proprietary technology for financial gain or to increase user adoption for eventual financial gain once a model is monetized, the hot new thing is to call it “open” and ascribe intellectual property rights to insignificant portions of the technology to a “foundation”.  The most recent case in point that has flown across my radar is Facebook’s OpenGraph, a new ‘standard’ the company is putting forward to replace their existing Facebook Connect technology, a system by which third-parties could integrate a limited number of Facebook features into their own sites, including authentication and “Wall”-like communication on self-developed pages and content.  The impetus for Facebook to create such a system is rather straightforward:  If it joins other players in the third-party authentication product-space, such as Microsoft’s Windows Live ID, Tricipher’s myOneLogin, or OpenID, it can minimally drive your traffic to its site for authentication, where it requires you to register for an account and log in.  These behemoths have much more grand visions though, for there’s a lot more in your wallet than your money: your identity is priceless.

Facebook and other social networking players make a majority of their operating income from targeted advertising, and displaying ads to you during or subsequent to the login process is just the beginning.  Knowing where you came from as you end up at their doorstep to authenticate lets them build a profile of your work, your interests, or your questionable pursuits based on what comes through a browser “referrer header”, a request header most modern web browsers send to pages that tells them “I came to your site through a link on site X”.  But, much more than that, these identity integration frameworks often require rich information that describes the content of the site you were at, or even metadata that site collected about you that further identifies or profiles you, as part of the transaction to bring you to the third-party authentication page.  This information is critical to building value in a targeted marketing platform, which is all Facebook really is, with a few shellacs of paint and Mafia Wars added for good measure to keep users around, and viewing more ads.

OpenGraph, the next iteration from their development shop in the same aim, greatly expands both the flexibility of the Facebook platform, as well as the amount and type of information it collects on you.  For starters, this specification proposes content providers and web masters annotate their web pages with Facebook-specific markup that improves the semantic machine readability of the page.  This will make web pages appear to light up and become interactive when viewed by users who have Facebook accounts, provided either the content provider has enabled custom JavaScript libraries that make behind-the-scenes calls to the Facebook platform or the user runs a Facebook plug-in in their browser, which does the same.  (An interesting aside is, should Facebook also decide to enter the search market, they will have a leg up on a new content metadata system they’ve authored, but again, Google will almost certainly, albeit quietly, be noting and indexing these new fields too.)

However, even users not intending to reveal their web-wanderings to Facebook do so when content providers add a ‘Like’ button to their web pages.  Either the IFRAME or JavaScript implementations of this make subtle calls back to Facebook to either retrieve the Like image, or to retrieve a face of a friend or the author to display.  Those who know what “clearpixel.gif” means realize this is just a ploy to use the delivery of a remotely hosted asset to mask the tracking of a user impression:  When my browser makes a call to your server to retrieve an image, you not only give me the image, you also know my IP address, which in today’s GeoIP-coded world, also means if I’m not on a mobile device, you know where I am by my IP alone.  If I am on my mobile device using an updated (HTML5) browser, through Geolocation, you know precisely where I am, as leaked by the GPS device in my phone. Suddenly, impression tracking became way cooler, and way more devious, as you can dynamically see where in the world viewers are looking at which content providers, all for the value of storing a username or password… or if I never actually logged in, for no value added at all.  In fact, the content providers just gave this information to them for free.
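To see how much a single remotely hosted image gives away, consider what the server on the other end can jot down for each request.  The function name and record fields below are hypothetical and the sample values are placeholders; the header names (`Referer`, `User-Agent`, `Cookie`) are the standard HTTP ones.

```python
def impression_record(remote_addr, headers):
    """Collect what one image request reveals to the host serving the image."""
    return {
        "ip": remote_addr,                      # feeds a GeoIP lookup
        "page_viewed": headers.get("Referer"),  # the page embedding the button
        "browser": headers.get("User-Agent"),
        "known_user": "Cookie" in headers,      # a returning, identifiable visitor
    }

# Placeholder request, as it might arrive when a page with a Like button loads.
record = impression_record(
    "203.0.113.7",
    {
        "Referer": "http://www.example.com/Page.html",
        "User-Agent": "ExampleBrowser/1.0",
    },
)
```

Note that nothing here required the visitor to click anything or even to have an account; the request for the image alone is enough to fill in the record.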

Now, wait for it…  what about this new OpenGraph scheme?  Using this scheme, Facebook can not only know where you are and what you’re looking at, but they know who you are, and the meaning behind what you’re looking at, through their proprietary markup combined with OpenID’s Immediate Mode, triggered through AJAX technology.  Combined with the rich transfer of metadata through JSON, detailing specific fields that describe content, not just a URL reference, now instead of knowing what they could only know a few years ago, such as “A guy in Dallas is viewing http://www.example.com/Page.html”, they know “Sean McElroy is at 32°46′58″N 96°48′14″W, and he’s looking at a page about ‘How to Find a New Job at a Competitor’, which was created by CareerBuilder”.  That information has to be useful to someone, right?

I used to think, “Hrm, I was sharing pictures and status updates back in 2001, what’s so special about Facebook?”, and now I know.  Be aware of social networking technology; it’s a great way to connect to friends and network with colleagues, but with it, you end up with a lot more ‘friends’ watching you than you knew you ever had.

References:

http://www.facebook.com/advertising/?connect

http://opengraphprotocol.org/

http://developers.facebook.com/docs/opengraph

http://openid.net/specs/openid-authentication-2_0.html

 
 