Security Considerations During Authentication

One of the most common sources of security vulnerabilities in applications and websites is authentication, that is, login. Gaining access to important data and functionality can be very profitable, so users’ credentials are a prime target, as we have seen in the past. Here are a few pitfalls to watch out for when designing your authentication interface.

Proper Password Controls

Force your users to select strong passwords! A good password policy makes brute-force and dictionary attacks (that is, guessing a user’s password through manual or automated means) impractical. The most important restriction is a minimum length of at least 8 characters, which should always be enforced. Beyond that, here are a few common restrictions that attempt to ensure a minimum level of password complexity:

  • A mixture of both upper and lower case letters.
  • A mixture of letters and numbers.
  • At least one special character: !@#$%^&*() etc.

Still, research sponsored by the US NIST ¹ has found that users tend to respond to these additional restrictions in very predictable ways. For example, a user who intended to use the password “password” but ran into the above restrictions would likely change it to the predictable “Password1!”.
Instead, NIST suggests comparing the password to a blacklist of unacceptable passwords and rejecting any that match. Quote:

This list should include passwords from previous breach corpuses, dictionary words, and specific words (such as the name of the service itself) that users are likely to choose. Since user choice of passwords will also be governed by a minimum length requirement, this dictionary need only include entries meeting that requirement.

NIST
[Figure: error message prompting the user to choose a stronger password]

Implementing this yourself may be a daunting task, but luckily, there are services and libraries designed to help you. The zxcvbn library, available in many languages, is one option: it estimates how guessable a password is. The Pwned Passwords database contains hundreds of millions of passwords exposed in data breaches, against which you can check the user’s chosen password. You can either host it locally or query it through its API, which only needs the first five characters of the password’s SHA-1 hash, never the password itself.
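As a rough sketch of the hash-based lookup, the Pwned Passwords range API at https://api.pwnedpasswords.com/range/<prefix> takes the first five hex characters of the password's SHA-1 hash and returns lines of the form SUFFIX:COUNT. The helper names below are made up for illustration, and the network request itself is left out so the example stays self-contained.

```python
import hashlib


def split_sha1(password: str) -> tuple[str, str]:
    """Return the 5-character prefix and 35-character suffix of the SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix: str, range_body: str) -> int:
    """Look up a hash suffix in a range response made of 'SUFFIX:COUNT' lines."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix not listed: the password was not found in the database
```

Only the prefix leaves your server. The response body for that prefix is then passed to breach_count together with the suffix, and any non-zero result is a reason to reject the password.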

If you expect your users to speak languages that use a non-Latin alphabet or an extended Latin one (with diacritics or special letters like ø, æ, œ), allowing the full range of Unicode characters will, in theory, enhance security. It defeats most brute-force attacks, which tend to assume ASCII anyway, and hugely increases the number of possible combinations². While this might sound like a no-brainer, in practice you must take into account that you may fail to handle Unicode correctly, for example by treating canonically equivalent character sequences as different passwords. Improper Unicode handling can reduce usability and even hurt security. Only do this if you are confident in your abilities.
As a last note, it is important to set a maximum length restriction. Not doing so may expose you to a denial-of-service (DoS) attack in which very long passwords are submitted to tie up the server with hashing. 64 characters is a good upper limit.
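The rules in this section could be combined roughly as follows. This is a minimal sketch: the tiny blocklist and the service name "examplesvc" are placeholders, and NFKC is assumed as the normalization form so that canonically equivalent Unicode sequences are treated as the same password.

```python
import unicodedata

MIN_LENGTH = 8   # minimum length from the text
MAX_LENGTH = 64  # upper limit from the text

# Placeholder blocklist; a real one would be built from breach corpuses and dictionaries.
BLOCKLIST = {"password", "password1!", "12345678", "qwertyuiop"}


def check_password(password: str, service_name: str = "examplesvc") -> list[str]:
    """Return a list of policy violations; an empty list means the password is accepted."""
    # Normalize first so that equivalent Unicode sequences compare (and count) the same.
    normalized = unicodedata.normalize("NFKC", password)
    problems = []
    if len(normalized) < MIN_LENGTH:
        problems.append("too short")
    if len(normalized) > MAX_LENGTH:
        problems.append("too long")
    lowered = normalized.casefold()
    if lowered in BLOCKLIST or service_name.casefold() in lowered:
        problems.append("too common or too guessable")
    return problems
```

Note that the length is counted in code points after normalization, so a letter typed as a base character plus a combining accent counts once, not twice.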

Password Hashing

The first rule of credential storage is to never store raw passwords on the server. Instead, store hashes for authentication. Hashing is a much better method than restricting the use of certain special characters.

To set the record straight: hashing is not encryption. Hashing is a one-way function that maps each password to a unique, or almost always unique, hash (yes, colliding hashes have been discovered for certain hash functions, and this is an attack vector, so the choice of function is important). Encryption, by contrast, can be reversed to recover the original password. For that reason, passwords should never be stored in encrypted form on the server; they should be hashed.

To store a password, it can be hashed on the client’s end, then sent to the server and hashed again over many iterations (thousands or more). Client-side hashing alone has only marginal benefits: a man-in-the-middle (MITM) attacker can simply grab the hash and send it to the server to log in just like a legitimate user. It is still better than nothing at all, especially since it can prevent the actual password from leaking and being reused to break into other services. Combining it with server-side hashing is far more powerful: because the server hashes whatever it receives before comparing, an attacker cannot authenticate simply by sending a stored hash, even if all your hashes leaked. Server-side hashing should be seen as mandatory.

Even if all of the above steps are taken, it is important to prepare for the possible case of your database leaking along with the hashes: an attacker can then brute-force the hashes. This is done by

  1. Hashing a candidate password.
  2. Testing if the hash matches a hash in the database.

Salting the hashes is therefore important. Salting is concatenating a random string to the password, hashing the result, and then storing the hash along with the salt in the database. The attacker is then unable to check the hash of a possible password against every hash in the database. Instead, he has to add a user’s salt to the proposed password, hash it, then test it against the user’s hash, forcing him to crack one hash at a time. Modern password hashing functions usually automatically salt the input before hashing.
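To make the effect of salting concrete, here is a small sketch using PBKDF2-HMAC-SHA256 from Python's standard library. The record layout and function names are illustrative assumptions, not a prescribed schema.

```python
import hashlib
import os


def hash_with_salt(password: str, salt: bytes, iterations: int = 310_000) -> bytes:
    """Derive a hash from the password and a per-user salt using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def new_record(password: str) -> dict:
    """Create the data stored for one user: a fresh random salt and the salted hash."""
    salt = os.urandom(16)
    return {"salt": salt, "hash": hash_with_salt(password, salt)}


# Two users who happen to choose the same password...
alice = new_record("correct horse battery staple")
bob = new_record("correct horse battery staple")
# ...end up with different stored hashes, because their salts differ.
```

An attacker who hashes a candidate password once can no longer compare the result against both records; the work has to be repeated with each user's salt.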

The best key stretching algorithms available today for passwords are Argon2id ³, a variant of Argon2, which won the Password Hashing Competition in 2015, and PBKDF2 ⁴, which is recommended by NIST. PBKDF2 uses a common hash function such as SHA-256 as its ‘backend’, applying it many thousands of times to the salted password, while Argon2id additionally requires a configurable amount of memory. Therefore, do not directly use simple hash functions like SHA-256 or MD5, which were not designed for passwords, and do not attempt to construct your own key stretching method; it is a fool’s errand.
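As one possible way to store and verify passwords with PBKDF2 using only Python's standard library, the sketch below keeps the algorithm name, iteration count, salt and hash together in one string so the parameters can be raised later. The "pbkdf2_sha256$..." storage format is an assumption made for this example, and the iteration count is the PBKDF2-HMAC-SHA256 figure from this article's footnotes.

```python
import base64
import hashlib
import hmac
import os

ITERATIONS = 310_000  # PBKDF2-HMAC-SHA256 figure quoted in the footnotes


def hash_password(password: str) -> str:
    """Return 'pbkdf2_sha256$<iterations>$<salt>$<hash>' for storage."""
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return "$".join([
        "pbkdf2_sha256",
        str(ITERATIONS),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(derived).decode("ascii"),
    ])


def verify_password(password: str, stored: str) -> bool:
    """Re-derive the hash with the stored salt and iteration count, then compare."""
    _algorithm, iterations, salt_b64, hash_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(hash_b64)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, int(iterations))
    # compare_digest takes the same time regardless of where the bytes differ.
    return hmac.compare_digest(derived, expected)
```

In a real service a maintained password hashing library (ideally one offering Argon2id) is usually the safer choice; this only illustrates what such a library does for you.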

TLS and Authentication

Another mandatory precaution during authentication is to ensure the whole transaction goes over TLS, which makes MITM attacks significantly harder by encrypting the traffic and using certificates to authenticate the server.

TLS (Transport Layer Security) is the successor protocol to SSL (Secure Sockets Layer). In short, TLS begins with a handshake in which the server sends a certificate, issued by a trusted authority, that vouches for its legitimacy. The two sides then agree on shared session keys, typically using a Diffie-Hellman key exchange, in which each side generates a public and a private key. The rest of the session is encrypted and decrypted using these session keys.
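The certificate check described above is what a TLS client performs during the handshake. As a small illustration from the client's side, Python's standard ssl module can build a context that verifies the server's certificate and hostname; the function name and the choice of TLS 1.2 as the minimum version are assumptions for this sketch.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that verifies certificates and hostnames."""
    # create_default_context() loads the system's trusted CA certificates and
    # turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse SSL and early TLS versions.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

Such a context would then be used to wrap a socket with context.wrap_socket(sock, server_hostname=host); the handshake fails if the certificate is not signed by a trusted authority or does not match the hostname.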

How to correctly implement TLS for your service is a topic in and of itself, so we won’t get into it today, but it is worth reading up on the current best practices yourself.

Multi-Factor Authentication

By far the best way to prevent password security mishaps is to recommend or require Multi-Factor Authentication (MFA). This means using one or more additional methods to authenticate the user, such as email, SMS, QR codes, biometric scans, or authenticator apps. The most common approach is to send the user a one-time password (OTP) by email or SMS, which he must enter in order to gain access. An attacker would then also need access to the user’s email or phone in order to get into your service.
This opens other attack vectors, but for most applications it should be enough.
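A one-time password of the kind sent by email or SMS could be issued and checked along these lines. The six-digit length and five-minute lifetime are assumed values, and delivery and storage of the code are out of scope here.

```python
import hmac
import secrets
import time
from typing import Optional

OTP_TTL_SECONDS = 300  # assumed validity window of five minutes


def issue_otp(now: Optional[float] = None) -> tuple[str, float]:
    """Return a random 6-digit code and its expiry timestamp."""
    now = time.time() if now is None else now
    # secrets uses a cryptographically secure random source, unlike random.
    code = f"{secrets.randbelow(10**6):06d}"
    return code, now + OTP_TTL_SECONDS


def check_otp(submitted: str, code: str, expires_at: float,
              now: Optional[float] = None) -> bool:
    """Accept the submitted code only if it matches and has not expired."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    return hmac.compare_digest(submitted.encode("utf-8"), code.encode("utf-8"))
```

A real implementation would also invalidate the code after its first successful use and limit the number of guesses.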

One way to reduce the overhead MFA carries for both the developer and the user is to only require MFA during important transactions. For example, when

  • Changing passwords.
  • Performing monetary transactions.
  • Disabling MFA.
  • Gaining administrator access.

It’s also possible to mandate MFA only every couple of days, after a long period of inactivity, or after failed login attempts (with the added benefit of hindering brute-force attacks).
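The rules above amount to a small policy function. In this sketch the action names, the three-day interval and the failed-attempt threshold are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical action names mirroring the list of important transactions.
SENSITIVE_ACTIONS = {"change_password", "transfer_money", "disable_mfa", "admin_access"}
MFA_MAX_AGE = timedelta(days=3)  # assumed meaning of "every couple of days"
FAILED_ATTEMPT_THRESHOLD = 3     # assumed


def mfa_required(action: str, last_mfa_at: datetime,
                 failed_attempts: int, now: datetime) -> bool:
    """Decide whether the user must pass MFA before performing this action."""
    if action in SENSITIVE_ACTIONS:
        return True  # important transactions always require MFA
    if failed_attempts >= FAILED_ATTEMPT_THRESHOLD:
        return True  # recent failed logins make the session suspicious
    return now - last_mfa_at > MFA_MAX_AGE  # the last MFA check is too old
```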

Bot Prevention

One way to prevent automated login attempts, whether as part of a brute-force attack or a DoS attack, is to impose time-outs (temporary lockouts) after several failed attempts. This makes brute-force and dictionary attacks take too long to be feasible. Another possibility is to require the user to solve a CAPTCHA on every login, or perhaps only after a predetermined number of failed attempts.
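A time-out after repeated failures might look like the sketch below. It keeps its state in memory and keys it by username, both simplifying assumptions; the limit of five failures and the fifteen-minute lockout are also assumed values.

```python
import time
from typing import Optional

MAX_FAILURES = 5            # assumed number of allowed consecutive failures
LOCKOUT_SECONDS = 15 * 60   # assumed lockout duration


class LoginThrottle:
    """Track failed logins per username and lock out after too many in a row."""

    def __init__(self) -> None:
        self._failures: dict[str, int] = {}
        self._locked_until: dict[str, float] = {}

    def is_locked(self, username: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        return now < self._locked_until.get(username, 0.0)

    def record_failure(self, username: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        count = self._failures.get(username, 0) + 1
        if count >= MAX_FAILURES:
            # Start the lockout and reset the counter for the next round.
            self._locked_until[username] = now + LOCKOUT_SECONDS
            count = 0
        self._failures[username] = count

    def record_success(self, username: str) -> None:
        self._failures.pop(username, None)
```

The login handler would call is_locked before checking credentials, then record_failure or record_success afterwards. Since locking by username lets an attacker lock out legitimate users, a real service may prefer to throttle by source address as well.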

[Figure: a solved reCAPTCHA]

Possibly the most popular third-party CAPTCHA service is Google’s reCAPTCHA, but it has drawn some criticism for its privacy policies. reCAPTCHA also effectively forces your users to accept a third party’s terms of service in order to use your service, which some view as ethically problematic. Still, using reCAPTCHA is probably the easiest and one of the most reliable ways to stop spam and brute-force bots. There are other CAPTCHA solutions, like hCaptcha, but as third-party solutions they may all suffer from the same issues reCAPTCHA does.

Error Messages

To stop attackers from being able to determine whether a username is valid or not, it is considered best practice to have a single, ambiguous error message, regardless of whether the error is because

  • The password was incorrect.
  • The account doesn’t exist.
  • The account is locked, disabled, or banned.

A proper, secure error message would be something like

"Login failed; Invalid user ID or password."

Password recovery and account creation should likewise never reveal whether an account exists, but rather send responses like

"If that email address is in our database, we will send you an email to reset your password."

and

"A link to activate your account has been emailed to the address provided."

respectively.

Even if your error messages are vague, a tenacious attacker may compare response times to determine whether the password was wrong or the account doesn’t exist. For example, this pseudo-code

// Vulnerable: the expensive hash is only computed when the user exists.
if (user_exists(username)) {
    String password_hash = hash(password);
    bool valid = lookup_credentials(username, password_hash);
    if (!valid) {
        return Error("Invalid Username or Password!");
    }
} else {
    // No hashing on this path, so the response comes back noticeably sooner.
    return Error("Invalid Username or Password!");
}

will return faster if the user doesn’t exist, which the attacker can exploit, whereas this code

// The hash is always computed, whether or not the user exists.
String password_hash = hash(password);
bool valid = lookup_credentials(username, password_hash);
if (!valid) {
    return Error("Invalid Username or Password!");
}

will take roughly the same amount of time in both cases.
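With salted hashes, the second variant needs one extra detail: there is no salt to hash with when the user doesn't exist. One common workaround, sketched here under assumed names and an in-memory user table, is to verify against a dummy record so that the same amount of hashing happens either way.

```python
import hashlib
import hmac
import os

ITERATIONS = 310_000  # PBKDF2-HMAC-SHA256 figure from the footnotes


def _derive(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


# Hypothetical in-memory user table: username -> (salt, hash).
_salt = os.urandom(16)
USERS = {"alice": (_salt, _derive("correct horse battery staple", _salt))}

# Dummy record used for unknown usernames so the same work is always done.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = _derive(os.urandom(16).hex(), _DUMMY_SALT)


def login(username: str, password: str) -> bool:
    """Return True only for a known user with the right password."""
    salt, expected = USERS.get(username, (_DUMMY_SALT, _DUMMY_HASH))
    derived = _derive(password, salt)                 # always runs the slow hash
    matches = hmac.compare_digest(derived, expected)  # constant-time comparison
    return matches and username in USERS
```

Whatever the outcome, the caller would respond with the same ambiguous error message on failure.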

Single Sign-On Authentication

In recent years, an authentication method known as Single Sign-On (SSO) has come into vogue. When a user wants to log in to your service, he is prompted to choose a trusted identity provider, usually a mega-corporation that offers that service, like Google, Amazon or Facebook. The user is then redirected to the identity provider’s website and has to log in using his account there. The identity provider then sends your service a guarantee that the user is authenticated, usually in the form of a signed assertion or token (an XML document in SAML, a JSON Web Token in OpenID Connect), which also provides details about the user’s identity. There are a couple of protocols for this, but the most popular by far is OpenID Connect.
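The redirect step can be sketched as follows, assuming the OpenID Connect authorization code flow. The endpoint, client ID and redirect URI are made-up placeholders; a real identity provider publishes its own endpoint and issues the client ID when you register your service.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical values for illustration only.
AUTHORIZATION_ENDPOINT = "https://idp.example.com/authorize"
CLIENT_ID = "my-client-id"
REDIRECT_URI = "https://app.example.com/callback"


def build_login_redirect() -> tuple[str, str, str]:
    """Return the redirect URL plus the state and nonce to keep in the user's session."""
    state = secrets.token_urlsafe(16)  # echoed back by the provider; guards against CSRF
    nonce = secrets.token_urlsafe(16)  # embedded in the ID token; guards against replay
    query = urlencode({
        "response_type": "code",
        "scope": "openid email",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "state": state,
        "nonce": nonce,
    })
    return f"{AUTHORIZATION_ENDPOINT}?{query}", state, nonce
```

When the provider redirects the user back, your service checks that the returned state matches the stored one before exchanging the authorization code for tokens.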

The main benefit is ease of use. The user only has to maintain a single account with a single password to access many different services. Furthermore, if the user is already logged in to that account, he need not re-enter his credentials. A benefit for the developer is that you no longer need to handle authentication yourself; you delegate it to a trusted third party like Google.

But there are security drawbacks to this approach. The most glaring one is that if the user’s identity provider account is breached, the attacker can then access a myriad of other services as well. This has the same effect as using the same username and password on all accounts, a practice many users follow anyway.

Many modern websites offer both SSO and traditional single-service accounts, leaving the choice to the user.


¹ National Institute of Standards and Technology


² There are 143,859 Unicode characters available. This means there are at most 143,859⁸ ≈ 1.83×10⁴¹ possible 8-character combinations. Compare that to the 128⁸ ≈ 7.2×10¹⁶ combinations of ASCII.

³ Argon2id should be used with one of the following configurations:

  • m=37 MiB, t=1, p=1
  • m=15 MiB, t=2, p=1

Where m is the minimum memory, t is the number of iterations, and p is the degree of parallelism. The two are equivalent in security; the first uses more memory but less processor time, and the second the reverse. Remember to use the Argon2id variant.

⁴ When hashing with PBKDF2, the number of iterations should be set based on the internal hashing algorithm used. The following suggestions are equivalent in security:

  • PBKDF2-HMAC-SHA1: 720,000 iterations
  • PBKDF2-HMAC-SHA256: 310,000 iterations
  • PBKDF2-HMAC-SHA512: 120,000 iterations