DLP Should Be A Help Not A Hindrance

Data loss prevention doesn’t have to be an “all-up” approach. Sometimes, it’s best to start with the simple things, says PJ Connolly

Data loss prevention isn’t a new idea, but it’s a concept that’s become increasingly important to IT as organisations recognise the threat to their operations from leaks by disgruntled insiders or intrusions by hostile outsiders. At least, that’s the pitch DLP (data-loss-prevention) vendors use.

But in most cases, notes Securosis analyst and CEO Rich Mogull, organisations that deploy DLP find that data leaks are more likely to be caused by accident or bad procedure than by malice. As he explained in an interview with eWEEK, when the causes of leaks are explored, the “whoops” factor surfaces repeatedly. Someone in the US transmits unencrypted medical data in violation of HIPAA (Health Insurance Portability and Accountability Act) rules, or a file containing credit card numbers is moved into an unsecured area. This sort of thing, when discovered during an audit, can be a career-killer; if it becomes a news story, it damages the reputation of the business itself.

Of course, that doesn’t mean that companies not subject to regulatory regimes – such as HIPAA and the Sarbanes-Oxley Act in the US, or the Data Protection Act in the UK – can simply pass on implementing DLP. As Mogull explained, the risk of data loss isn’t always visible: when data is stolen, or merely mishandled, “you don’t even have the base monitoring to know about the problem.”

Effective data loss prevention

How can an organisation introduce DLP in an effective manner, when the potential for leakage or loss is so pervasive? Let’s start with a conceptual discussion before moving on to specifics.

One can begin thinking about DLP by treating data as being in one or more states: data in motion, data at rest and data in use. But there’s a danger in focusing on only one of these aspects, because methods that work exceedingly well on, say, the network – or data in motion – may be of little or no use against threats that seek to obtain data in use at an endpoint. A sound DLP strategy will consider all three of these against the needs of an organisation, whether these are regulatory, operational or cultural. The poser, for both IT managers and security specialists, is that no single product adequately addresses all three categories.

Alternatively, one can consider DLP from the standpoint of threat vectors. In this view, the tripod’s legs are email, the web and the endpoint. Protecting against the first two is fairly well understood and easily implemented. The third is a little more complicated. Removable media can be blocked or screened, but the near-ubiquity of phone-based cameras makes it possible to record on-screen data, albeit in a clumsy and terribly obvious fashion.

The first step in implementing a DLP strategy is data identification. Although it may be easy to specify the general nature of the data to be protected, such as financial records, customer information or product plans, it’s not always that simple to assign a risk value to an individual document. Mogull points out that one has to “understand what to protect.”

Context + Content

Perhaps it’s best to regard the context of data, with its content, as two sides of a coin. Context can take the form of file metadata, email headers or the application that’s consuming the data. In more complex forms of context analysis, a DLP process might look at file formats or network protocols, or use network information from a DHCP server and a directory service to identify who’s consuming the data. This can be expanded to take specific web services or network destinations into account or identify individual storage devices such as a USB drive.
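
To make the idea of context concrete, here is a minimal sketch in Python of a context-only decision: it looks at who is sending what, over which channel and to where, without opening the data itself. Everything in it is an assumption for illustration, not any vendor’s implementation: the TransferContext fields, the trusted-destination list, the file types and the needs_content_inspection function are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferContext:
    """Hypothetical record of the context around one data transfer."""
    user: str         # assumed to be resolved via DHCP logs and a directory service
    channel: str      # "email", "web" or "usb"
    destination: str  # network domain or storage-device identifier
    file_type: str    # file extension, e.g. "xlsx"


# Illustrative policy values only; a real deployment would define its own.
TRUSTED_DESTINATIONS = {"example.com"}
SENSITIVE_FILE_TYPES = {"xlsx", "csv", "mdb"}


def needs_content_inspection(ctx: TransferContext) -> bool:
    """Decide from context alone whether the content should be examined."""
    if ctx.channel == "usb":
        # Removable media always gets a closer look in this sketch.
        return True
    if ctx.destination in TRUSTED_DESTINATIONS:
        return False
    return ctx.file_type in SENSITIVE_FILE_TYPES
```

In this sketch, context acts as a cheap first filter: only transfers that look risky on their metadata go on to the more expensive step of examining the content.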

Content, as one might think, is pretty self-explanatory. Being aware of the contents of data can often give one a good indication of what kind of protection needs to be applied. Analysing content is where things get tricky, because one has to start with the context of data, and then examine the contents. This might take a rules-based approach using regular expressions, file matching, database fingerprinting or statistical analysis. This “content awareness,” as Securosis’ Mogull puts it, is what defines true DLP.
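
As an illustration of the rules-based approach, the following Python sketch uses a regular expression to find candidate credit card numbers and then applies the Luhn checksum to weed out random digit strings. It is a simplified example under stated assumptions, not how any particular DLP product works: the pattern, the 13-to-19-digit length range and the function names are choices made for this sketch.

```python
import re

# Candidate pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings in text that look like valid card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            hits.append(digits)
    return hits
```

The checksum step matters because a pattern match alone would flag order numbers, phone numbers and other long digit strings, producing the kind of false positives that turn DLP from a help into a hindrance.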
