Encryption: Challenges in Data Recovery
Warning: More complicated subject = longer blog post. But hopefully more interesting for some of you.
Encryption is a wonderful tool. It has helped create a more robust information security infrastructure for consumers and businesses all over the world and continues to become more secure each year.
Having said that, there are plenty of issues that come with data encryption. Similar to the great debate on security vs. privacy, there comes a question of how willing we should be to encrypt our data when there is an inherent risk of not being able to access that data again.
In a digital world where the “I forgot my password” button guards against any sort of accessibility concerns, many consumers and businesses are perplexed when they find out their encrypted data is absolutely inaccessible without the key (in most cases). Even if you contact the manufacturer, they won’t be able to help.
Remember, encryption is not the same as password protection.
First, How Does Encryption Work?
On a basic level, when data is encrypted, the information is taken and converted to essentially gibberish using an algorithm from software, or directly from the hardware in the case of self-encrypting drives (SEDs). Only the decryption key can then convert that data back into its original state so that it may be read. Otherwise, it remains gibberish forever.
Password protection is just having a password without the encryption. This is the data security equivalent of hiding the key to your front door under the welcome mat. There are a ton of different ways cybercriminals can get through or around password protection, but having the data encrypted makes it way harder (read: nearly impossible) to access. Breaking a modern encryption algorithm is insanely hard compared to cracking a 10-character password.
For a greater understanding, there are presently two basic types of encryption, both with their own strengths and weaknesses. The first is known as symmetric encryption.
Symmetric Encryption
Symmetric encryption uses the same key to both encrypt and decrypt data, so both you and whoever you want to access the data possess the same key. This poses some security risks, since the more people who hold the key, the more vulnerable the data is to infiltration and extraction. After all, a lock with many keys isn’t considered very secure. The exception is the aforementioned SEDs, which use a form of symmetric encryption in the hardware, but they’re outliers.
Imagine you have a lockbox and have two keys made for the lock, one for you and one for whoever you need to share the contents with. You put your data in the lockbox, lock it up, and send it to your partner who can then open the lockbox with their copy of the key. This is symmetric encryption in a nutshell.
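For the curious, here is a rough sketch of the lockbox idea in Python. It is a toy: the cipher (XOR against a SHA-256 keystream) and the names keystream and toy_crypt are made up for illustration and are not how any real product does it. Real software should use a vetted algorithm such as AES from a maintained library. The point is only that one shared key both locks and unlocks.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (SHA-256 in counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream. Applying it twice with the same key undoes it."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(32)      # the one key both parties hold a copy of
message = b"Q3 payroll figures"

locked = toy_crypt(shared_key, message)   # you lock the box
unlocked = toy_crypt(shared_key, locked)  # your partner unlocks it with the same key

wrong_key = secrets.token_bytes(32)
garbage = toy_crypt(wrong_key, locked)    # anyone with a different key just gets noise
```

Notice there is no separate "decrypt" function. With the shared key the gibberish turns back into the message, and without it there is nothing to work from.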
The second kind of encryption is known as, unsurprisingly, asymmetric encryption.
Asymmetric Encryption
Asymmetric encryption uses different keys for encryption and decryption. The way it works is you have both a public key and a private key. You send your public key to whoever you want to have it, while keeping your private key secure on your own computer for only you to access. Your private key is the only thing that can decrypt data that was encrypted by your public key, and vice versa.
If someone wants to send you secure, encrypted data, they encrypt it using your public key and then send it to you. As you are the only one with access to your private key, you are the only one able to decrypt that data.
Likewise, if you want to send someone encrypted data, you use their public key to encrypt it and they use their private key to decrypt it.
To give a lockbox example, asymmetric encryption is like having multiple lockboxes, each unlockable only by the one unique key you possess. You send these lockboxes out to everyone you need encrypted data from. When they need to send you something, they use the lockbox you sent them (that’s your public key). They put the data in, shut the lockbox, and send it back to you. Since you’re the only one with the key to open it (your private key), it doesn’t matter if someone intercepts the lockbox. They won’t be able to get in.
One drawback of public-key encryption, or asymmetric encryption, is that it is much slower than symmetric encryption, so encrypting large quantities of data with it takes much longer.
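If you want to see the public/private split in action, here is a minimal sketch using textbook RSA with tiny numbers. RSA is just one asymmetric algorithm, picked here for illustration. The primes, the exponent, and the helper name apply_key are all assumptions for the demo, and real keys use enormous primes plus padding from a proper library.

```python
# Toy key generation with tiny primes (real keys use primes hundreds of digits long).
p, q = 61, 53
n = p * q                  # 3233, shared by both halves of the key pair
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent: hand this out freely
d = pow(e, -1, phi)        # private exponent: never leaves your machine (Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def apply_key(value: int, key) -> int:
    """Raise the value to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                                      # data encoded as a number smaller than n
ciphertext = apply_key(message, public_key)       # anyone can lock with your public key
recovered = apply_key(ciphertext, private_key)    # only your private key unlocks it
intercepted = apply_key(ciphertext, public_key)   # the public key cannot undo its own work
```

Here recovered equals the original message, while intercepted does not, which is the lockbox story above in arithmetic form.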
Now, there are plenty of algorithms to dive into between the two so I won’t get into them here, but this article has some great information and history on the topic. This other article distinguishes between disk-based and file-based encryption. A word of caution: Falling down the encryption literature rabbit hole can lead to hours lost and the uneasy feeling that the more you read about it, the less you’ll actually understand.
Bonus: The coolest way to use asymmetric encryption between people (in my opinion) is as follows. First, you encrypt the data using your private key. Then you encrypt the data AGAIN using their public key. So now there are two layers of encryption on the data. When they receive the data, they use their private key to remove the outer layer, the one you added last. So far, they know the data was at least secure. Then they use your public key to remove the inner layer. Because you used both, they now know that not only was the data secure, but also that you’re absolutely the one who sent it. Since you’re the only one with your private key, which you used to encrypt it the first time, you’re the one who had to send it. (That private-key step is what’s usually called a digital signature.) So it’s an authentic message and it was definitely secure. Pretty cool, huh? But I digress.
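Here is a small sketch of that two-layer trick, again with toy textbook RSA. The helper names make_keypair and apply_key and all the numbers are hypothetical. The recipient’s modulus is deliberately the larger of the two so the first layer’s output still fits inside the second. Real systems use proper signature and encryption schemes from a vetted library instead of raw exponentiation.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Build a toy RSA key pair from two small primes. Returns (public, private)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)    # modular inverse, Python 3.8+
    return (e, n), (d, n)


def apply_key(value: int, key) -> int:
    """Raise the value to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


sender_pub, sender_priv = make_keypair(61, 53)         # n = 3233
recipient_pub, recipient_priv = make_keypair(89, 97)   # n = 8633, big enough to hold layer one

message = 65

# Sender: own private key first (proves origin), then recipient's public key (keeps it secret).
signed = apply_key(message, sender_priv)
sealed = apply_key(signed, recipient_pub)

# Recipient: peel the layers off in reverse order.
opened = apply_key(sealed, recipient_priv)   # outer layer: only the recipient can remove it
verified = apply_key(opened, sender_pub)     # inner layer: only the sender could have made it
```

If verified matches the expected message, the recipient knows both that nobody else could read it and that it came from the holder of the sender’s private key.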
Why not have an encryption recovery option?
With so many things that can go wrong (e.g. a hard drive failing, an encryption key getting lost), why would someone create or use encryption without any sort of recovery option?
The answer is that a recovery option would completely defeat the point of encryption. On a basic level, we encrypt data to keep cybercriminals and other prying eyes from viewing it. If we had a way to get around encryption, don’t you think cybercriminals would too?
If you truly want your data to be secure, you have to be willing to accept that your data is also secure from being recovered in the event of a failure, even by you.
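To make that concrete, here is a minimal sketch assuming the encryption key is derived from a password, which is one common setup and not something every product does. The function name derive_key, the example passphrase, and the iteration count are illustrative assumptions. The takeaway is that the key is recomputed from the password every time and stored nowhere, so there is nothing for an “I forgot my password” button to look up.

```python
import hashlib
import secrets

salt = secrets.token_bytes(16)   # random, stored next to the encrypted data, not secret


def derive_key(password: str, salt: bytes) -> bytes:
    """Turn a password into a 32-byte key with PBKDF2-HMAC-SHA256."""
    # 200_000 iterations is an illustrative figure, not a recommendation.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000, dklen=32)


key = derive_key("correct horse battery staple", salt)
same_key = derive_key("correct horse battery staple", salt)   # right password: same key again
other_key = derive_key("correct horse battery stapl", salt)   # one letter off: unrelated key
```

A password that is almost right produces a completely different key, which is why neither you nor the manufacturer can get “close enough” to open the data.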
Not so fun fact: The security of encryption is actually the basis behind the CryptoLocker ransomware, which encrypts your files and ransoms them for the key. Since the only way to get through encryption is the key, the attackers are able to hold all your files ransom until you pay them, unless of course you have an automatic backup that saves some earlier versions of those files.
So Why Even Bother with Encryption?
The truth is, average computer users don’t need to encrypt all of their data. Just ask yourself, does this need to be secure? If you’re a college student and you just have funny cat photos and a bunch of homework on your drive, then probably not.
If you’re sending sensitive personal information like your social security number to someone, then I think encryption is certainly a good idea. If you have sensitive business information or your tax returns on a drive, the same holds true.
Many businesses also have regulations regarding a lot of the information they handle, such as payroll information or customer information. For example, Gillware has rules to follow regarding the security of our customers’ data as laid out by the SOC 2 Type II audit. In these cases, encryption is not only necessary but required in order to be compliant with those regulations. Some organizations also have their own regulations to follow, such as required encryption across an email network.
Where we run into a contested area is when we consider that many new drives, self-encrypting drives to be exact, are coming with encryption standard as part of their hardware. All the data that goes onto the drive is encrypted before it’s stored. When the user wants to read that data, the drive then decrypts the data before sending it along to the user.
Since these drives encrypt extremely transparently (invisible to the user, no effects on performance) and self-encryption isn’t heavily advertised as a feature, most people who have them don’t even know it. They only find out after it’s too late; their laptop’s SED has failed and their data is locked inside the encryption.
Of course, if you ARE aware of your SED and you’ve set up a password to be prompted upon start-up, then you stand a chance. Otherwise, the data was being encrypted all along without the user’s knowledge and now it is unrecoverable.
There are advantages to SEDs to be sure, but they tend to cause us a few problems now and then.
What’s the Endgame?
The short answer is we’re not sure. There are currently plenty of smart people working to address this issue with encryption, but as it stands there is still a dichotomy between data security and recoverability. In some cases, our engineers have been able to recover data from self-encrypting drives using some cool techniques, but most of the time, without a key, the data is unrecoverable.
We’ve even started the Data Recovery/Erase Special Interest Group (listed in the update at the end of the article) with data recovery labs, SSD manufacturers, and computer organizations like the Trusted Computing Group, all working together to find solutions.
For anyone needing encryption, being aware of how you’re using encryption and when it’s actually necessary are two big steps to avoiding data loss from encryption gone wrong. Beyond that, luck helps.
Perhaps a better tomorrow exists where we don’t have to trade security for recoverability in our encryption. We’re optimistic, and you should be too. Like I said, there are a lot of smart and dedicated people working on this. But for today and the near future, encryption with recoverability remains an enigma.