My commentary on the thread was related to classifying data by state, I’ll replicate it here:
I’ve been following this thread, and I think it’s gone a little out of focus. I like the original idea, but if you’re going to talk about classifying data in those three states (at rest, in use, in transit), you need to define exactly what you mean by those terms.
Robert (who kicked off the thread) said:
However, Data at Rest is almost by definition completely useless. Generally speaking, at least in most enterprise environments, data is worthless unless it can be shared with someone else, and that implies Data in Transit. And that includes data being physically transported on a USB flash drive, as well as transmitted electronically.
Right at the beginning of the thread, we have ambiguity as to what “Data at Rest” means. If Robert is including physical transportation of media as part of the classification of “Data in Transit” (edited from the original typo), then… well, the term “Data at Rest” itself can really be regarded as a useless category: you can always pick up a disk and carry it away… effectively (from a security analysis standpoint), the class of data which can be regarded as “at rest” is the null set.
Similarly, what is the difference between “Data in Transit” and “Data in Use”? We all seem to have some sort of assumed idea about what those states are, but there’s no rigor attached to it… if data is being read off of media by some process, being transferred to the application layer in order to be presented to a user, is that “Data in Use” or “Data in Transit”? It’s certainly changing states, which could imply that it’s “in Transit”… I could go on, but you get my point.
We’re lacking the base context. So, being obsessed by classification, I’ll propose one.
“Data at Rest” is data recorded on storage media. This data can be regarded as “secure” if and only if the data is protected by strong encryption (where “strong encryption” is defined as “encryption requiring a computationally infeasible amount of time to brute-force attack”) AND the key is (a) not present on the media itself, (b) not present on the node associated with the media, and (c) of sufficient length and randomness to be functionally immune to a dictionary attack.
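For a sense of what “computationally infeasible” means in practice, here’s a quick back-of-the-envelope sketch; the guess rate is an assumed figure for illustration, not a claim about any particular attacker:

```python
# Years needed to exhaust a key space at an assumed (generous) guess
# rate of 1e12 guesses/second. Purely illustrative arithmetic.

def brute_force_years(key_bits: int, guesses_per_second: float = 1e12) -> float:
    """Worst-case years to try every key of the given length."""
    seconds = (2 ** key_bits) / guesses_per_second
    return seconds / (60 * 60 * 24 * 365)

# A 56-bit key (old DES) falls in under a day at this rate...
print(f"56-bit:  {brute_force_years(56):.6f} years")
# ...while a 128-bit key outlasts the age of the universe many times over.
print(f"128-bit: {brute_force_years(128):.3e} years")
```

The point of condition (c) is that none of this matters if the key itself is guessable: a dictionary attack sidesteps the key space entirely.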
“Data in Use” is all data not in an at-rest state that resides on only one particular node in a network (for example, in resident memory, swap, processor cache, or disk cache). This data can be regarded as “secure” if and only if (a) access to the memory is rigorously controlled (the process that read the data off of the storage media into memory is the only process that has access to that memory, and no other process can either access the data in memory or man-in-the-middle the data while it passes through I/O), and (b) regardless of how the process terminates (successful completion, killing of the process, or shutdown of the computer), the data cannot be retrieved from any location other than the original at-rest state, requiring re-authorization.
“Data in Transit” is all data being transferred between two nodes in a network. This data can be regarded as secure if and only if (a) both hosts are capable of protecting the data in the previous two classifications and (b) the communication between the two hosts is identified, authenticated, authorized, and private, meaning no third host can eavesdrop on the communication between the two hosts.
Looking at these three classifications, here are your vulnerabilities:
Protecting Data at Rest:
You must either (a) encrypt the entire contents of the storage media, or (b) have complete knowledge of how any system or user organizes data when writing to the storage media, so that you can encrypt the data that needs to be protected. (a) is FDE. (b) can be accomplished by any one of a number of other solutions, but is very, very difficult, because even if you know how the system stores everything, you don’t know (or have to enforce through restriction) how the user may store something (you must disable his/her ability to store anything “sensitive” on the media in a location that is not encrypted). Furthermore, (c) you must enforce strong keys/passwords and (d) you must prevent the user from storing the password on the media.
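As a sketch of what enforcing (c) might look like, here’s a minimal passphrase gate. The wordlist is a tiny hypothetical stand-in for a real dictionary file, and the length and character-class thresholds are arbitrary policy choices, not a standard:

```python
# Reject passphrases that are short, on a known wordlist, or drawn
# from too few character classes. COMMON_WORDS is a placeholder for
# a real dictionary-attack wordlist.
COMMON_WORDS = {"password", "letmein", "123456", "qwerty", "admin"}

def passphrase_acceptable(p: str, min_length: int = 14) -> bool:
    if len(p) < min_length:
        return False
    if p.lower() in COMMON_WORDS:
        return False
    # Require at least three character classes to raise entropy.
    classes = sum([
        any(c.islower() for c in p),
        any(c.isupper() for c in p),
        any(c.isdigit() for c in p),
        any(not c.isalnum() for c in p),
    ])
    return classes >= 3

print(passphrase_acceptable("password"))                 # False: short, on the list
print(passphrase_acceptable("correct-Horse-battery-9"))  # True
```

Note that a check like this does nothing for (d): no code can stop the user from writing the passphrase on a sticky note.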
Finally, remember, (e) for detachable media, including laptop hard drives, the USER is considered the “node associated with the media”, so really, your data can’t be considered secure, because the user is the node, and the user has the key. (Unless, I suppose, you have the ability to revoke the key remotely, preventing Disgruntled Joe from taking a laptop out and then quitting with a copy of your code base already in his possession).
By far, (c)/(d)/(e) are going to be the hardest. A suitably strong password that prevents a dictionary attack is going to be burdensome for the user to retain, so they’re either going to forget it, or write it down and sticky-note it to the monitor, etc. The only way to mitigate this risk effectively is to *limit access to the data in the first place* – people look at FDE as a “silver bullet” that lets them say, “We can now allow our vice president to take a copy of the financial database home on his laptop, because it is encrypted, so we don’t have to worry if the laptop is stolen”, but that assumes that (c)/(d)/(e) aren’t problems, which is screwy. Sensitive data shouldn’t leave the house, people. If the VP wants access to the data because it makes his life easier, say “No, you need to be in the office to get access to that,” or make sure ahead of time that everyone at the CEO/Board of Directors level knows that you have *no real data protection* – your data is only as secure as everyone is trustworthy. And while I may trust a particular worker not to read data to a corporate rival over the phone, I simply don’t trust any number of workers > 2 to *not put their password on a sticky note on the screen of their laptop*.
Protecting Data in Use:
This is basically impossible in today’s OS market… anyone who claims to have “secure data in use” is full of baloney. The best you can do here is mitigate the attack vectors. If you use FDE, you solve some of the problems because the swap space is encrypted, which closes one attack vector; or you can get rid of swap altogether (and make sure you’re not using NVRAM). However, if you look at the various ways that Data in Use can be mishandled, virtually all of the major vulnerabilities are exploitable at the OS level, which is something you’ve more or less outsourced to your OS vendor. Your only mitigation here is to lock down the OS as much as you possibly can (including using FDE to protect the OS files at rest!), and this is often way more trouble than it is worth, given that even if you could cover all of your bases, it doesn’t protect you from Kevin Mitnick. From a cost/benefit analysis, aside from taking basic steps to secure an operating system, you’re wasting money – locking down Windows to the point of near-unusability isn’t going to protect you from a zero-day IE exploit.
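One of the few in-use mitigations available at the application layer is minimizing how long a secret lives in memory. Here’s a hedged Python sketch; note that a runtime like CPython may keep copies you can’t reach, so this narrows the exposure window rather than closing it:

```python
# Keep a secret in a mutable buffer and overwrite it the moment it is
# no longer needed, so a later memory or swap dump is less likely to
# contain it. Best-effort only: the interpreter may have made copies.
import contextlib

@contextlib.contextmanager
def ephemeral_secret(secret: bytes):
    buf = bytearray(secret)   # mutable, so we can zero it in place
    try:
        yield buf
    finally:
        for i in range(len(buf)):
            buf[i] = 0        # best-effort wipe on any exit path

with ephemeral_secret(b"s3kr1t") as key:
    in_use = bytes(key)       # the secret is available inside the block
    wiped = key               # keep a reference to inspect afterwards

print(in_use)                 # b's3kr1t'
print(bytes(wiped))           # b'\x00\x00\x00\x00\x00\x00'
```

This maps onto requirement (b) of the in-use definition above: whatever happens to the process, the secret should not be recoverable from anywhere but its at-rest home.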
The two best ways to prevent OS-level exploits are (1) to use a web proxy at your border and (2) to disallow all attachments via email. Anybody who can successfully sell #2, please let me know how you did it. If you can’t do those two things, though, spending more than a minimal effort locking down the host OS is largely a waste of time.
Protecting Data in Transit:
Here’s where S/MIME and SSL and IPSec and all that good stuff comes in. Actually, next to protecting Data at Rest, protecting Data in Transit is probably one of the easier tasks to accomplish at the present time, except for the fact that both hosts have to be able to protect the Data in Use, and we illustrated in the previous paragraph how hard that is. Yes, you can man-in-the-middle data in transit in many, many instances in today’s networked world, but we already have many of the technologies to mitigate this; we just don’t deploy them properly.
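As an illustration of “deploying them properly”: Python’s standard ssl module ships a default client context whose settings already cover the identification and authentication requirements from the Data in Transit definition, and many real-world transit failures come from code that switches these checks off:

```python
# The stdlib default context verifies the peer's certificate and
# checks the hostname; privacy comes from the negotiated cipher.
import ssl

ctx = ssl.create_default_context()
print(ctx.check_hostname)                    # True: the peer is identified
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: the peer is authenticated
# The default context also refuses the long-broken SSLv2/SSLv3
# protocols outright, so a downgrade to them cannot be negotiated.
```

The common anti-pattern is setting `verify_mode` to `CERT_NONE` to silence certificate errors, which reduces the connection to privacy against passive eavesdroppers only and invites exactly the man-in-the-middle attacks described above.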
Looking at all of the above, it should be obvious to everybody that you can’t claim your data is “secure”. So, what you need to do is decide what constitutes “reasonably secure” and shoot for that; and that is an organizationally-dependent classification. There is no industry-wide Best Practice available here.
Assuming you are using a hardware token to provide two-factor authentication, hopefully it has a big red light on it to let you know when it is being used for encryption or decryption. And hopefully you log off of the token as soon as you have finished encrypting a file, and likewise whenever the screen-saver locks.
Although I agree this would be awesome if used properly, it’s simply never going to be used properly. People won’t notice if their hardware token flashes red. They don’t notice when their browser doesn’t have the SSL lock icon. And even if they do notice, in any set of users bigger than a few, most of them aren’t going to care. You can’t solve this problem with education or training.
Unfortunately, really large files can become rather cumbersome to deal with, and particularly the .pst files created by Outlook – some of which can grow to 4 GB. So archive your e-mail religiously to keep the working set small, and use s/mime for all your important correspondence.
Also great tips, which unfortunately are going to fail for any reasonable number of users.
Finally, plan ahead. File formats change, disk crashes occur, encryption hardware gets lost or broken, and your wife might need to access your income tax returns if you run into a tree some night.
Absolutely. Also, remember, your data is only as secure as your backups. If you’re busting your chops protecting your data, you should be busting your chops equally to protect your backups, whatever they are.
Edited to add (07-2010): Bruce has a recent post up that is related.