Everything you know (about security) is (probably) wrong!
I read a post on dev.to about how most websites validate passwords incorrectly. Its premise centres on this well-known xkcd strip.
Basically, things which look like complicated passwords, in this case Tr0ub4dor&3, aren't always more secure than something which doesn't look as secure, like correct horse battery staple. Why is that? Put simply, it's down to the rules which you often see on websites. If you're creating a password, you'll often see the following requirements:
- Must be at least 8 characters
- Must contain a number
- Must contain upper and lowercase letters
- Must contain a special character
The password Tr0ub4dor&3 meets all of these requirements, and is seen as "strong" by a lot of websites. In fact my password manager, LastPass, shows me it's a strong password.
LastPass shows the passphrase correct horse battery staple to be only 4 bars good.
Now the lesson on entropy
The problem is, the maths doesn't back up the indicator. The maths in question is the password entropy. Password entropy is a measurement of how unpredictable a password is (copied from https://www.pleacher.com/mp/mlessons/algebra/entropy.html). Basically, it comes down to two things: how many possible characters the password could be drawn from, and how long it is. The possible characters are put into groups. These are usually:
- Lower case letters (26 characters)
- Upper case letters (26 characters)
- Numbers (10 characters)
- Special characters (32 characters): one of !"#$%&'()*+,-./:;<=>?@[\]^_{|}~ and space (if you can't see it)
Entropy is calculated as log2(R^L), where R is the size of the pool of unique characters, and L is the length of the password. Pools of characters get added together, so the password cD3! gets an R number of 26+26+10+32=94. However, it's only 4 characters long, so we work with log2(94^4), or log2(78074896).
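The calculation above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular library: the names pool_size and entropy_bits are made up for this example, and any character outside the four groups is simply ignored.

```python
import math

LOWER = "abcdefghijklmnopqrstuvwxyz"
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS = "0123456789"
# The special characters listed above, plus a trailing space (32 in total).
SPECIAL = "!\"#$%&'()*+,-./:;<=>?@[\\]^_{|}~ "


def pool_size(password: str) -> int:
    """R: add up the size of every group the password draws from."""
    return sum(
        len(group)
        for group in (LOWER, UPPER, DIGITS, SPECIAL)
        if any(ch in group for ch in password)
    )


def entropy_bits(password: str) -> float:
    """log2(R^L), computed as L * log2(R) to avoid a huge intermediate."""
    r = pool_size(password)
    return len(password) * math.log2(r) if r else 0.0


print(pool_size("cD3!"))                # 94
print(round(entropy_bits("cD3!"), 2))   # 26.22
```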
More advanced password checkers go a step further: they still calculate the entropy, but they stop counting a character once it has appeared a certain number of times, usually 2 or 3. For example, with a limit of 2 occurrences, password is seen as the same length as passwords (the third "s" is ignored). This discourages the duplication of characters.
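A rough sketch of that duplicate rule, assuming the checker simply caps how many times each character can count towards the length. The function name effective_length and the default cap of 2 are my own choices for illustration; a real library may count differently.

```python
from collections import Counter


def effective_length(password: str, max_repeats: int = 2) -> int:
    """Length of the password when each character counts at most max_repeats times."""
    return sum(min(count, max_repeats) for count in Counter(password).values())


print(effective_length("password"))   # 8
print(effective_length("passwords"))  # 8, the third "s" is ignored
```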
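Going back to our examples, the Tr0ub4dor&3 password uses all character sets, so has R=94, and L=11. The scores quoted in this post use the natural logarithm rather than log2, so its entropy is ln(94^11), or 49.98 (in bits, log2(94^11) is about 72.10).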
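The passphrase correct horse battery staple, on the other hand, would have R=58 (26 lowercase letters plus the 32 special characters, because spaces are part of the special character group), and L=21 rather than 28 (it has more than 2 spaces, and more than 2 "r", "t" and "e" characters, so the extra occurrences aren't counted). The entropy is ln(58^21), or 85.27 (about 123.02 in bits). Mathematically, it's the stronger of the two.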
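The two scores above (49.98 and 85.27) match the natural logarithm of R^L rather than log2. Here is a small sketch that prints both units side by side, using the R and L values from the text; the entropy helper is just a name I picked for the example.

```python
import math


def entropy(pool: int, length: int, base: float) -> float:
    """log_base(pool ** length), computed as length * log_base(pool)."""
    return length * math.log(pool) / math.log(base)


# (R, L) pairs taken from the worked examples in the text.
examples = {
    "Tr0ub4dor&3": (94, 11),
    "correct horse battery staple": (58, 21),
}

for password, (pool, length) in examples.items():
    nats = entropy(pool, length, math.e)
    bits = entropy(pool, length, 2)
    print(f"{password}: {nats:.2f} (natural log), {bits:.2f} bits")
```

Whichever unit you pick, the passphrase comes out ahead, because the two only differ by a constant factor.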
What can we do as developers?
As developers, it's our responsibility to take care of the users as best we can. Encourage them to use strong passwords or phrases, and remember that size matters (yes, as a guy I actually wrote that!).
We need to get away from forcing arbitrary rules upon the user, making them use every character set possible. That only leads to people resetting their password every time they log in, or picking stupid passwords that are easy to guess. I get around the 8 character, upper, lower, number, and special character restrictions with something like Letmein!23 (entropy 45.43). It covers all of those rules, but is so easy to guess that it's likely a combination in every rainbow table. If I skipped the upper case requirement, I could have 1 flew over the cuckoo's nest (entropy 97.05).
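To make the point concrete, here is a sketch of the typical rule-based check described earlier in the post. The function name meets_composition_rules is hypothetical, and the special character set is the one listed in the entropy section.

```python
# Special characters as listed earlier in the post, plus a trailing space.
SPECIAL = "!\"#$%&'()*+,-./:;<=>?@[\\]^_{|}~ "


def meets_composition_rules(password: str) -> bool:
    """The classic 8+ characters, number, upper, lower and special check."""
    return (
        len(password) >= 8
        and any(ch.isdigit() for ch in password)
        and any(ch.isupper() for ch in password)
        and any(ch.islower() for ch in password)
        and any(ch in SPECIAL for ch in password)
    )


print(meets_composition_rules("Letmein!23"))                     # True
print(meets_composition_rules("1 flew over the cuckoo's nest"))  # False
```

The easy-to-guess password passes, and the much longer passphrase is rejected purely for lacking an upper case letter.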
We also need to stop limiting password lengths. It takes no additional database space to (properly) store a passphrase over a password, because a properly hashed password takes up the same space whatever the length of the original. If I find you're limiting password lengths, I will name and shame you (yes I'm talking about you, cula.io). Let people use passphrases (and encourage it) so they can build a stronger password, and not always require a password manager.
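Here is a quick sketch of why length costs nothing to store, assuming passwords are stored as salted hashes. It uses PBKDF2 from Python's standard library purely as an example; the hash_password name, the 16-byte salt and the iteration count are illustrative assumptions, and a real application should follow its framework's password hashing guidance.

```python
import hashlib
import os
from typing import Tuple


def hash_password(password: str, iterations: int = 100_000) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


_, short_digest = hash_password("Letmein!23")
_, long_digest = hash_password("1 flew over the cuckoo's nest")

# Both digests are 32 bytes, however long the input was.
print(len(short_digest), len(long_digest))
```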
It's not particularly difficult to implement in applications either. If you write your application in Go, there's a project by Lane Wagner on GitHub for such an occasion. If you usually use PHP, then check out my PHP port of the same library. If you use a different language, write your own port. It's not too difficult.
If you do use my PHP port, be aware that the entropy calculation is aggressively harsh: it floors the natural logarithm of the R^L calculation. It always returns an integer smaller than the actual entropy, so you can encourage a stronger password by being overly pessimistic about the strength (Tr0ub4dor&3 gets returned with entropy 49, not 49.98 or 50 like you might expect).
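The flooring step can be sketched like this. It is a Python reconstruction from the description above, not the PHP port's actual code, and pessimistic_entropy is a name I made up; it takes R (the pool size) and L (the counted length) directly.

```python
import math


def pessimistic_entropy(pool: int, length: int) -> int:
    """Floor of ln(pool ** length), computed as length * ln(pool)."""
    return math.floor(length * math.log(pool))


print(pessimistic_entropy(94, 11))  # 49, for Tr0ub4dor&3 (R=94, L=11)
```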
What can we do as users?
Use better passwords or passphrases (assuming the service you are using allows it). Maybe use one of your favourite books as an inspiration. Be it a quote from the book, part of the title, or something similar. If you are a Harry Potter fan, using 1/2 blood Prince as a password gives entropy of 72 in my pessimistic library (don't use it - it's published here now!). Be aware that this may be guessed by social engineering, so might not be overly secure.
Use a password manager. Turn on all of the settings you can for it, and up the length. My default settings are for all of the regular password rules to be matched, and for a 32 character length password to be generated. I randomly got the password TUv%lAPnSrvpvpIIeuWRM#IE09EPL2P6 returned by LastPass when I tried. It has a pessimistic entropy score of 131. It's likely to take trillions of years for a computer to crack that password by itself. Also note it has more than 2 "v", "I" and "P" characters, so my library won't see it as being 32 characters long (it counts 29).
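If you're curious what a generator with those settings might look like, here is a minimal sketch using Python's secrets module. This is not how LastPass does it; the generate_password name and the choice to leave the space out of the alphabet are my own assumptions.

```python
import secrets
import string

SPECIAL = "!\"#$%&'()*+,-./:;<=>?@[\\]^_{|}~"
ALPHABET = string.ascii_letters + string.digits + SPECIAL


def generate_password(length: int = 32) -> str:
    """Random password containing at least one character from each group."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if (
            any(ch.islower() for ch in candidate)
            and any(ch.isupper() for ch in candidate)
            and any(ch.isdigit() for ch in candidate)
            and any(ch in SPECIAL for ch in candidate)
        ):
            return candidate


print(generate_password())
```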
Whilst this post has been about passwords and strength, there is another thing you may be able to do on some services (quite a lot of them, really). Set up 2-factor authentication. This will (usually) mean you need to enter a unique pass code when you log in, and this pass code changes every 30-60 seconds. The code is generated and shown on a device (usually your phone), and proves you have all the credentials you should for accessing the service. It might seem like an inconvenience, but I personally find it a better alternative than all my data being one password guess from being exposed.
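For the curious, many authenticator apps generate those changing codes with TOTP (RFC 6238). Services vary, so treat this as a sketch of one common scheme rather than what any particular site does; the totp function name and its parameters are my own.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time code (RFC 6238, HMAC-SHA1) for a shared secret."""
    if at is None:
        at = time.time()
    counter = int(at // step)  # number of whole time steps since the Unix epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# The service and your phone share the secret, so both can work out the same code.
print(totp(b"12345678901234567890"))
```

Because the code depends on the current 30 second window, a guessed or stolen password alone isn't enough to log in.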
Everything you know is wrong
Despite what you have been conditioned to think about passwords, you're actually causing yourself issues, and not overly impacting the malicious actors out there trying to steal and sell your data. It's not all doom and gloom though. Here's Weird Al to help get that message to hit home: