By Ben Rothke, CISSP, CISA, Senior Security Consultant, BT Global Services
What do Amazon and Starbucks have to do with information security data? They seem to be the mechanism being used to obtain data and metrics from security practitioners.
In any given week, I, like many other information security professionals, receive a number of emails offering Amazon and Starbucks gift certificates in exchange for completing various surveys and questionnaires.
Often that data is used to create global security metrics, vendor statistics and reports. The question is: how effective is that data?
Many times, the results and underlying data are unqualified. To use a more technical term, they are worthless: the respondents may not be qualified to answer the questions, the data is not verified, and the answers can be biased by the underlying desire to get the gift cards.
The truth is that good infosec data is quite difficult to find. Part of the issue is that the people who create the surveys, often from the marketing department of an organization, may themselves not be qualified to do so. The questions asked are often vague and the terms ambiguous. Terms such as “data breach” and “hacking attack” mean different things to different people.
A frequently asked question is, “How many losses have you suffered due to data breaches in the past year?” Quantifying data losses is often more of an art than a science. Take this scenario: an Arkansas-based retail firm has an encrypted backup tape, containing the credit card numbers of 10 million customers, go missing in transit. What is the loss? An aggressive litigator may opine that the damages should be calculated as the number of victimized customers multiplied by the average cost to recover from such an identity theft attack.
On average, it costs $8,000 for a person to recover from identity theft, according to Northwestern University. So the litigator will sue for $80 billion in losses. The defense attorney will note that the $80 backup tape was encrypted with AES-256, and therefore the losses should be limited to incidental costs and a replacement backup tape. So is the loss in this case $80 or $80 billion? Same survey question, very different answers.
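The gap between the two answers is easy to see once the arithmetic is written out. The following is a minimal sketch of both valuations using only the figures quoted above; the function names are hypothetical, and the incidental costs are left at zero because the scenario does not put a number on them.

```python
def plaintiff_loss_estimate(customers: int, recovery_cost_per_person: int) -> int:
    """Litigator's view: every affected customer bears the full recovery cost."""
    return customers * recovery_cost_per_person


def defense_loss_estimate(tape_cost: int, incidental_costs: int = 0) -> int:
    """Defense's view: the data was encrypted, so only the tape and incidentals count."""
    return tape_cost + incidental_costs


CUSTOMERS = 10_000_000   # card numbers on the missing tape
RECOVERY_COST = 8_000    # dollars per person, the figure cited in the article
TAPE_COST = 80           # dollars, the replacement backup tape

high = plaintiff_loss_estimate(CUSTOMERS, RECOVERY_COST)
low = defense_loss_estimate(TAPE_COST)  # incidental costs assumed to be 0 here

print(f"Plaintiff's estimate: ${high:,}")
print(f"Defense's estimate:   ${low:,}")
print(f"Ratio between the two answers: {high // low:,}x")
```

Two respondents describing the same incident could honestly report either number, which is why an aggregate "average loss" built from such answers says very little.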
What this means is that before you make any information security decisions, understand the underlying data. Dust off your statistics books, and see how the conclusions in the report were determined. Ask basic questions: How large was the sample size? Were all of the respondents qualified companies and/or individuals?
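As one example of what the statistics books offer, the sample size question can be made concrete with the standard margin-of-error formula for a reported percentage. This is a rough sketch only: it assumes a simple random sample and a 95% confidence level (z = 1.96), and the function name is hypothetical. A gift-card survey is rarely a random sample, so its real uncertainty is likely larger than this formula suggests.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a survey percentage.

    Assumes a simple random sample; proportion=0.5 gives the worst case.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Illustrative sample sizes, not taken from any particular report.
for n in (50, 400, 1000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

A report that claims "60% of firms were breached" based on a few dozen responses carries an uncertainty of well over ten points in either direction, before any question of respondent qualification or bias is considered.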
One of the tricky things here is that there are so many different types of data that it’s often difficult to obtain effective data from a generalized online survey. For example, there is a huge difference between opinions (stated preference) and more objective data (revealed preference).
The big question always centers on “bias.” Vendors have a particular incentive to connect the data to the solution they are proposing, and the questions they create will often be tilted toward that solution. This does not mean that data from vendors can’t be trusted; it means that vendor-supplied data deserves extra scrutiny.
Pete Lindstrom, Research Director at Spire Security, astutely observed that, “There are many problems with data, but if you look a little closer, you will find the same problems and more with the everyday, qualitative information we base our decisions on. Our goal should be to get better with the data, not bash it and use that as justification to return to the ways of the medicine man.”
One can get a great cappuccino at Starbucks, but someone’s desire to get a $50 Starbucks gift card by entering spurious answers should not affect your ability to make an educated decision regarding information security.
So where can you find good information security data? Stay tuned for Part 2 of this piece in which I’ll provide details on some excellent sources.
