My problems with OpenID

I’ve been tempted to write why OpenID has been driving me up the wall.

I have not implemented OpenID in any application, so I come at it not as an implementor or programmer but as an end user: a number of sites I’ve used, including Stack Overflow and Sourceforge, have either allowed or insisted upon OpenID authentication.

My first OpenID account was at Verisign Labs (PIP).  They’re well established in web security, so I figured it would be a reliable service, and a company that wasn’t likely to disappear on me.  Their service, however, left me frustrated for a few reasons.

  • For some reason (early onset dementia?), I could never remember my OpenID URL and found myself needing to look it up all the time, which meant starting up my email client.  Because it’s not only a username I chose, but also includes the web address of the OpenID provider, I found it easier to forget.  I can’t really see ordinary web users finding the URL thing intuitive; for some time now, favourites/bookmarks and search engines have been teaching us that remembering URLs shouldn’t be necessary.
  • The Verisign Labs PIP has one of the most user-unfriendly features I have ever experienced.  With the aim of preventing phishing attacks, a well-meaning goal, it does not allow you to authenticate yourself at any OpenID supported site at all unless you have already logged in directly at Verisign’s website during the same browser session.  Try typing in your OpenID to your favourite site, and you get a message from Verisign telling you that no, you haven’t logged in to Verisign this session, so you can’t proceed.  When I encounter this, I have no choice but to open up a second tab and head over to their site to log in, except that much of the time I can’t, because I don’t have a browser certificate installed on the computer I’m using at the time (I don’t think it’s abnormal to use more than one computer regularly).  So in order to authenticate me, it has to send me an email containing a single-use PIN.  Thank goodness my email account doesn’t use OpenID authentication and I can get to that fairly easily.  I’ve never had to jump through so many hoops, just to log in to an application I already have an account at.
  • Once I’ve started using an OpenID identity from a certain provider on a site or two, it would appear that I am tied to that OpenID provider for life.  It makes it very hard to evaluate OpenID providers when your choice is a permanent one.  Yes, I realise that it is possible to use delegation, or even to install your own OpenID server, but if we’re going to be talking about end users, neither of these is really practical, and both of them are likely to result in decreased security.
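
For anyone who hasn’t come across delegation: as I understand it, you add a few link tags to the head of a web page you control, so that the page’s own URL becomes your OpenID while a provider still does the actual authentication.  Here is a rough Python sketch that generates those tags.  The rel values are the ones I know from the OpenID 1.1 and 2.0 specifications; the function name and the example URLs are made up for illustration.

```python
from html import escape


def delegation_tags(provider_endpoint: str, provider_identity: str) -> str:
    """Build the <link> tags that delegate a page's URL to an OpenID provider.

    provider_endpoint: the provider's OpenID server URL (hypothetical here).
    provider_identity: the identity URL that the provider issued to you.
    """
    links = [
        # Tags read by OpenID 2.0 sites.
        ("openid2.provider", provider_endpoint),
        ("openid2.local_id", provider_identity),
        # Tags read by older OpenID 1.x sites.
        ("openid.server", provider_endpoint),
        ("openid.delegate", provider_identity),
    ]
    return "\n".join(
        f'<link rel="{rel}" href="{escape(href, quote=True)}">'
        for rel, href in links
    )
```

Calling delegation_tags("https://provider.example/server", "https://alice.provider.example/") returns four tags to paste into the page’s head.  Changing provider later would mean editing those tags, which rather proves the point: it assumes a user who has their own web page and is happy editing HTML.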

My second OpenID provider, MyOpenID, appears to be a fair bit easier to get along with, and doesn’t suffer from many of the problems I’d previously encountered.

Simply by opening another OpenID account, however, everything has become exponentially more complicated: if I switch providers, there’s no easy way that I can see to move all the site accounts based on an identity at my previous provider across to the new one.  It seems like changing providers may mean ditching a bunch of old accounts and signing up for all new ones.  I was impressed at the way Stack Overflow’s implementation allowed switching the OpenID identity associated with my account there.  Unfortunately, this flexibility is a result only of Stack Overflow’s thoughtful design, and such a feature is not part of a typical OpenID implementation.

MyOpenID, thankfully, allows me to authenticate myself without having to twiddle around with going to the OpenID provider’s site in a separate browser window or getting a single-use PIN.  I suppose it is similar to what the OpenID experience should have been like from the start.  Maybe my Verisign Labs PIP account just had too many optional features turned on.

I still find, however, that some things about OpenID underwhelm me:

  • Signing up for a new account at an OpenID-enabled site appears no easier when using OpenID.  After authenticating with my OpenID URL and whatever authentication I need to do at the OpenID provider’s end, when I return to the client site I still have to fill out a form, and most of the time I still have to confirm my email address.  Some fields have been pre-filled by my OpenID account, but I still need to choose a username that is unique to that application, and likely even fill in a Captcha.
  • Users are well experienced already with simple username/password combinations.  They know, for example, that the password should be kept secret, and it’s that secret that provides their security.  Even though they might have several username/password combinations at different sites, this doesn’t make things any more complicated, because the same concept is just repeated.  With an OpenID account, however, not only do they now have a username and password at their OpenID provider, but they also have this OpenID URL, and maybe even a browser certificate.  That is three or four pieces of information.  Furthermore, how will they understand that authenticating with an OpenID URL alone can provide any security, when the OpenID URL is not a secret, and there is no password?  I wouldn’t be surprised if users thought that OpenID was grossly insecure, because they don’t understand that all the real security is hidden from them.
  • I also wouldn’t be surprised if the idea that their identity is passed between sites made users a bit worried.  For instance, how can an OpenID implementor reassure the user that even if they use their OpenID URL to log in and register, that doesn’t mean the implementor now has the password to the user’s OpenID account?  All the beneficial security concepts are a black box to the users, who may just assume that the OpenID account is a way for their password and identity to be freely passed around between sites.  Far from using it only when high security is needed, we may find that users, unaware of the security benefits to OpenID, only trust OpenID with information they don’t mind losing.

So far I haven’t been convinced that using OpenID is significantly safer – even when comparing it to re-using the same username and password at a whole bunch of different sites, which is itself a dubious security practice.  With OpenID, I still have all my eggs in one basket.  If an attacker gains access to my OpenID account, he can still impersonate me at all sites where I rely on that identity.

OpenID is a well-meaning idea, and with more experience I am sure that I will get more comfortable with it, but being this confusing and headache-inducing even to a web developer is a clear indication that it has some way to go before it can be considered fit for general use.  Get this: the Wikipedia page for OpenID displays a prominent warning which reads “This page may be too technical for a general audience” applying to various sections, including the section titled “Logging in”.  If it is too hard to describe how to “log in” without alienating a non-technical audience, it is a sign that the process is not very usable, and anyone thinking that they are implementing OpenID in order to “simplify” things for end-users may need to think twice.

While some boast about big companies like Google adopting OpenID, it’s not really all that much to crow about – their support is only as a provider, not as an implementor.  I cannot, for example, use an existing OpenID to authenticate myself at Google; I can only use a Google ID to authenticate myself elsewhere.  Not allowing OpenID authentication themselves doesn’t contribute to the widespread use of OpenID but further segregates it, which is probably just as much of an injustice to OpenID as its indecipherable Wikipedia page.

Why CAPTCHA are not a security measure

CAPTCHA, those little groups of distorted letters or numbers we are being asked to recognise and type into forms, are becoming more and more common on the web, and at the same time, frustratingly, more and more difficult to read, like visual word puzzles.  The general idea behind them is that by filling them in, you are proving that you are a human being, given that the task would be difficult for a computer program to carry out.  CAPTCHA aim to reduce the problem of spam, such as comment spam on blogs or forums, by limiting the rate at which one can send messages or sign up for new accounts enough so that it becomes undesirable for spammers.

Example CAPTCHA

There is a problem, however, when these are talked about as a ‘security measure’.  They are not a security measure, and this misconception is based on a flawed understanding of security: that humans can be trusted whereas computers – bots – cannot.  CAPTCHA are not a bouncer; they cannot deny entry to any human that looks like they are going to start trouble.  They can only let in all humans.  If your security strategy is something along the lines of ‘If it’s a human, we can trust them’, then you are going to have problems.

Another problem with CAPTCHAs is that they are relatively easy to exploit, and by this I mean, pass a large number of tests easily in order to spam more efficiently. While a single human can only look at a certain number of images per hour, a number of humans, with a lot of time on their hands, can look at thousands of them per hour.  If the internet has taught us anything, it’s that there are a lot of humans on the internet with a lot of time on their hands. So, while a CAPTCHA will slow one human down, it won’t slow hundreds of humans down.

The so-called ‘free porn exploit’, a form of relay attack, takes advantage of this. CAPTCHA images are scraped from a site and shown on the attacker’s site. On the attacker’s site is a form instructing its visitors to identify the letters in the image, often by promising access to something such as free porn. All the attacking computer needs to do is drive a whole bunch of people to that form and then use all the resulting solutions to carry out the intended dirty work at the original site.

It doesn’t have to be porn, of course – that is just a popular way of illustrating this form of circumvention.  Any time a human wants something, or even is a little bit bored, you can ask them to fill out a form. Get free jokes in your inbox, fill out this form. If you get many humans working against a CAPTCHA it makes the CAPTCHA ineffective.

Technology for solving CAPTCHA automatically, without requiring any human intervention at all, is also evolving at a high rate.  Scripts that can solve known CAPTCHA variants are becoming available all the time, and in response, new CAPTCHA variants are emerging too, each one more difficult to read than the last.  The computer power required to recognise characters visually is trivial compared to, for example, the computer power required to crack a good password from its hash, or to break an encryption scheme based on encrypted data.  The CAPTCHA that are unbreakable today are only unbreakable by obscurity; they are constructed differently enough from previous ones that current computer programs don’t recognise the letters and numbers in them.

What are alternatives to CAPTCHA?

The alternative will vary depending on what you are using it for.  If you are using CAPTCHA for reducing spam on your blog, then they will probably continue to do so, though you may find yourself resorting to other options.  Bayesian or rule-based filtering, or a combination of both, are effective methods of reducing spam, and have the added benefit that they do not annoy the user or impede usability or accessibility like CAPTCHA would.
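
To give a rough idea of what Bayesian filtering means in practice, here is a toy sketch in Python.  It is a bare-bones naive Bayes classifier with add-one smoothing, not the code of any particular spam filter; the class name, the labels and the training phrases in the usage note are all invented for illustration, and a real filter would need far more training data and tuning.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    """Split text into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


class TinyBayesFilter:
    """Toy naive Bayes classifier; train on both labels before classifying."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokens(text))
        self.message_counts[label] += 1

    def _log_score(self, words: list[str], label: str) -> float:
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        # Prior: the fraction of training messages carrying this label.
        score = math.log(self.message_counts[label] / total_messages)
        for word in words:
            # Add-one smoothing, so an unseen word does not zero out the score.
            count = self.word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        return score

    def is_spam(self, text: str) -> bool:
        words = tokens(text)
        return self._log_score(words, "spam") > self._log_score(words, "ham")
```

After training it on a couple of comments of each kind, say “cheap pills buy now” as spam and “great post thanks for writing” as ham, is_spam("buy cheap pills") comes back true while a comment that thanks the author for the post does not.  The visitor never sees any of this, which is the whole appeal.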

If you are using CAPTCHA as a security measure, you would need to ensure that you are only doing so based on a proper understanding of what kind of security they bring.  Certainly, they cannot do anything about keeping unauthorised or unwanted people out, as this is not what they are designed for.  They also have severe limitations in their ability to reduce spamming or flooding, due to them being relatively easy for a sufficiently organised attacker to bypass.

An earlier version of this article was published at SitePoint.com in November, 2005.

The password problem

The first problem with passwords on the web is that passwords alone are not strong authentication.  The second problem is that people have too many of them to remember.

Some people will reuse the same password on several different services, leaving all those services vulnerable if their password is compromised through any one of them.  Other people use different passwords for several different services, but need to resort to writing passwords down or making heavy use of ‘forgotten password’ features because they are too hard to remember.

Online authentication

In the offline world, pretty good authentication can be achieved by combining a card with a PIN.  This is an implementation of two-factor authentication.  In short, this means that authentication is based not just on something a person has in their possession (like a key or card), something a person knows (like a PIN or password) or their own body (like a fingerprint or DNA), but on at least two of those three categories.  The principle behind this is that it is significantly more difficult for an attacker to steal your identity if they need to both obtain something you have, and find out something you know.  When your bank card is stolen, the thief cannot access your account unless they also know your PIN.  If your PIN is guessed, overheard or intercepted somehow, the snoop cannot access your account without your card.
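
As a rough sketch of the card-and-PIN idea, the following Python checks both factors and only succeeds when both match.  The Account class, the function names and the iteration count are my own illustrative choices, not how any bank actually does it.

```python
import hashlib
import hmac
from dataclasses import dataclass


def hash_pin(pin: str, salt: bytes) -> bytes:
    """Derive a hash from the PIN so the PIN itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 100_000)


@dataclass
class Account:
    card_number: str  # something you have
    salt: bytes
    pin_hash: bytes   # derived from something you know


def authenticate(account: Account, presented_card: str, entered_pin: str) -> bool:
    """Succeed only if BOTH the card and the PIN match."""
    has_card = hmac.compare_digest(
        account.card_number.encode(), presented_card.encode()
    )
    knows_pin = hmac.compare_digest(
        account.pin_hash, hash_pin(entered_pin, account.salt)
    )
    return has_card and knows_pin
```

A thief with the card but the wrong PIN fails the second check, and a snoop with the PIN but no card fails the first.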

Online, however, strong authentication is a lot more difficult, because authentication has to rely almost entirely on something you know.  This means that rather than just being one factor in authenticating you, your user name and password combination becomes the only factor.  It becomes a lot easier for someone to steal your identity, as they only need to intercept your password somehow.

When we sign up for online accounts, we are told to create passwords that are “strong”, and unique.  The general idea of “strong” here is “hard to guess”, though a strong password is also harder for someone to read over your shoulder.  However, a strong password still does not protect against a situation where someone bugs your computer, or your ISP’s computer, and sees your password as it is transmitted.  This is relatively easy to do – as easy as exploiting a bug in any software on your system or your ISP’s, or an ISP betraying your trust, etc.

In making passwords stronger, too, they are also made harder to remember.  This is a good thing to a point.  It means that someone who does watch you typing your password is less likely to recall it or catch all the letters.  But after a point, being harder to remember really detracts from security, because users are more likely to write them down in order to remember them.  When the password is the only secret thing that can authenticate a person for an online account, having that password written down makes a less than ideal situation worse.  It means that the password now becomes vulnerable not only to eavesdropping, but also to physical theft.  Comparing this to two-factor authentication, we have gone in the opposite direction.  Not only is there only a single factor, but there are two types of vulnerabilities for this single factor, and an attacker can choose either.

Multiple accounts

The number of accounts we need to authenticate ourselves (prove our identity) for is growing.  It is not uncommon for someone to have a dozen different accounts or more for different online services, ranging in importance from online bank accounts and auction websites right down to simple blogs and discussion forums.

If we are to assume that people use different passwords for each, we are expecting too much for them all to be remembered.  It is common practice for people to just write them all down, but as stated that detracts from security.

The alternative for users is to re-use the same password for multiple accounts, but this is putting all their eggs in one basket.  If their one good password is compromised, then all these accounts are vulnerable.  If your password is stolen or intercepted, it may not even be your fault – a company hosting one of your accounts may have let it slip through negligence.

A reasonable person probably uses a combination of the above – using unique passwords on only those very important accounts such as their online banking, while re-using a common good password for everything else.

Technological solutions

OpenID is a distributed authentication mechanism which aims to let someone log in to accounts with several different companies, without exposing their password to those companies.  The password is sent only to the single OpenID provider, which authenticates the person and then signals to the company that the person has been authenticated according to that OpenID account.
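
The division of labour can be sketched as a toy model in Python.  This is emphatically not the real OpenID message format: the class and function names are invented, the shared secret stands in for the association that a site and a provider set up between themselves, passwords are held in plain text only to keep the toy short, and the nonces that guard against replay are left out.  All it shows is that the site checks a signature from the provider and never sees the password.

```python
import hashlib
import hmac
from typing import Optional


def sign(secret: bytes, identity_url: str, site: str) -> str:
    """Sign the statement 'identity_url has authenticated for site'."""
    message = f"{identity_url}|{site}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class Provider:
    """Stands in for the OpenID provider, the only party that sees the password."""

    def __init__(self, passwords: dict[str, str], shared_secret: bytes) -> None:
        self._passwords = passwords  # plain text only because this is a toy
        self._secret = shared_secret

    def assert_identity(
        self, identity_url: str, password: str, site: str
    ) -> Optional[str]:
        stored = self._passwords.get(identity_url)
        if stored is None or not hmac.compare_digest(
            stored.encode(), password.encode()
        ):
            return None  # wrong password: no assertion is issued
        return sign(self._secret, identity_url, site)


def site_accepts(
    secret: bytes, identity_url: str, site: str, signature: Optional[str]
) -> bool:
    """The site verifies the provider's signature; it never handles the password."""
    if signature is None:
        return False
    return hmac.compare_digest(sign(secret, identity_url, site), signature)
```

Note that site_accepts takes no password argument at all; everything the site learns is whether the provider vouched for that identity URL at that site.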

This helps cut down the number of ways that a password could be intercepted.  However, it does not change the fact that if that one single password does get compromised, the attacker can gain (albeit temporary) access to all those accounts.  In fact, it could make it worse: upon gaining entry to your OpenID account, the attacker might be presented with a nice little list of all approved sites – i.e. a list of where they can use this OpenID account.  This would depend on the OpenID provider.

Educating users

Educating users is something that does not work as well in practice as it does in intention.  Most users already know that they should not write down their passwords, and that their passwords should be as strong and unique as possible.  However, they will continue to behave in the way that is most convenient to them, creating easier to remember passwords, re-using them, or writing them down, just to make it easier to deal with so many.

Security warnings that fail to be helpful: Who are you trying to confuse?

Warning: This file may contain malicious code, by executing it your system may be compromised.

If your average user is a computer security professional, this warning, found in a web application for corporate intranets, may be appropriate.  But does the average person want to understand the concepts of ‘malicious code’ or a ‘compromised’ system?

Wouldn’t they rather just be gently reminded that you can’t trust every file that you download on the web?

Let’s consider for a second that a user sees the original warning and doesn’t ignore it.  They may ask, ‘what is malicious code?’  Well, it is code, erm …  Well, computer programs are written in what’s called code … and sometimes that code might be erm … written to do bad things to your computer.  If they are not confused enough already they might try and ask what a compromised system is.  And no, it doesn’t really have much to do with a ‘compromise’ on anyone’s part.

If you really want to scare people, you could at least use terms that are more likely to be widely understood, like ‘files may contain viruses!’  ‘Be careful!’

Writing language that people are likely to understand is not dumbing down.  It would be dumbing down if you removed some of the most relevant and important details from your message, leaving your users feeling cheated because details were withheld.  But in this case, the fact that software is made up of code, and some code may be a bit nefarious, or the concept of a compromised system, detracts from the most important detail of the message, which is that it might not be safe to trust the file you are downloading.  There are also some more clear, more concise alternatives, such as the word ‘virus’.  This word has come to collectively signify everything nasty that you could let loose on your computer without meaning to.

Telling people not to get phished

If you provide users with a password, you should probably think about telling them how to keep their password safe.  Teaching users how to avoid being tricked into giving away their account data – being ‘phished’ – can be difficult.

Social engineering is a method of obtaining access to a secured system by exploiting a person’s trust.  It consists of deceiving a person into granting access to a system by some sort of pretense.  For example, let’s say I receive a rather desperate sounding phone call from an intern over in IT who has screwed up and lost their password and needs desperately to fix some problem for their boss.  They know I have an account and are hoping I would be so kind as to log in for them, something which I might be happy to do for a colleague.  However, the person on the phone is not an intern in IT at all, and doesn’t even work for the company.  What’s more, the second I have given them access via my own username and password, all of the security precautions are now absolutely useless; the attacker has gained access to the system they wanted to access.  What if they pretended to be working at the bank?  They might goad me into letting them empty my bank account.

A second example of a social engineering attack is to exploit a person’s guilt – to make the person believe that they have been caught doing something wrong and may get in a lot of trouble if they do not cooperate.  This ‘cooperation’ may involve handing over their personal details.  This kind of attack can even work if the victim did not do anything wrong; the act of being ‘accused’ can put someone into a defensive state.  The desire to cover up any wrongdoing they have been accused of may distract them from the fact they are being conned.

The term phishing is used to describe such attacks when they are done over a message service, such as over email or text messages.  Phishing is also often done on a large scale; a would-be attacker sends an email to perhaps thousands of people pretending to be from the IT department, or a bank, or something, hoping that at least one person will fall for the scheme.  Some such schemes are wildly inventive, while there are just as many that are stock standard: ‘we need to confirm your account details’, or ‘we need to verify that your account is active’.

From the point of view of anybody involved in computer security, the fact that such attacks are so effective is depressing.  They are effective for many reasons.

One reason such attacks are effective is that, as with any security precaution, a system is only as strong as its weakest point.  In a large organisation in which lots of people have access to a system, only one person needs to slip up and accidentally give their username and password to the wrong person in order for the system to be compromised.

Another reason is that the users of a system may, being less confident with technology, be naturally inclined to trust and be a little fearful of somebody who both seems to know a lot more about technology and is in a position of authority; for example, someone from an IT department, or law enforcement, or who has access to their bank account.  The consideration as to whether or not the person who contacted them is legitimate takes second place to the desire to comply with this person who seems so much more knowledgeable about the system.

It may also be that people don’t realise that computer security does not stop at some unseen attacker trying to guess or steal your password; that in large part an attacker can just walk right up to you and ask for it.

So, what do we tell the users?

Systems administrators often use the phrase ‘we will never ask for your password’.  This is a good message, because it at least signals to users that there may be nefarious motivations behind someone official asking you to confirm your password.

However, in most cases where someone is duped into giving their account information, they actually believe the person who has contacted them is legitimate.  The phrase ‘we will never ask for your password’ can quickly develop exceptions to the rule; an attacker might say ‘Oh, but our systems are down and we have to log people in manually today.’  As it is coming from a person genuinely believed to be legitimate, such an exception is easily accepted to be true both because it is plausible, and because the victim trusts the attacker to know more about the issue than they do.

I think that users should be instructed that if they are ever asked for their password, even by genuine system administrators, they should not give it over the phone or in reply to the email.  Instead, the receiver should call back the company on the known correct phone number and then give the password.  Let’s say that I call you up and tell you that we are in the process of deleting unused accounts and we need your password to confirm whether your account is used or not.  If you truly believe that my story is legitimate, you may ignore the advice that we never ask you for your password, because my story seems like a plausible reason for an exception to the rule.  But if you have been told that you should always call me back when asked for a password, you may be less likely to be convinced by my insistence that that isn’t necessary.  I might say that you won’t be able to call me back if you tried, or that the matter is urgent, but this may raise more red flags.

In terms of email phishing, too, we can instruct users never to click through a link to a site on which they have an account; instead, should they wish to visit the site they should type the site address or name into their browser.
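
To illustrate why a link is so hard to judge by eye, here is a small Python sketch that pulls out the host a link really points at and compares it with the host you expected.  The function name and the example addresses are made up, and it is only an illustration of the problem, not a phishing detector.

```python
from urllib.parse import urlsplit


def link_goes_to(url: str, expected_host: str) -> bool:
    """Return True only if the link's actual host is exactly the expected one."""
    host = (urlsplit(url).hostname or "").lower()
    return host == expected_host.lower()
```

With "www.mybank.example" as the expected host, a link to https://www.mybank.example/login passes, but http://www.mybank.example.account-check.test/login and http://www.mybank.example@203.0.113.7/login both fail, even though each of them starts with the bank’s name.  Typing the address yourself sidesteps the question entirely.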

Whether this is all effective is speculation, and it must still be remembered that no matter how security conscious an organisation is as a whole, it only takes one weak link: one uninformed or absent-minded person to slip up and allow a breach of security.