In Learning Tree’s System and Network Security Introduction course we talk a little about risk management. Unless you can put numbers on a thing, it’s hard to discuss it or plan for it (or plan how to avoid it!) in a meaningful way.
A startling result from late summer is still getting some attention. It’s an important thing to know about, as it has to do with those elusive numbers we need to do meaningful risk analysis.
“Brute force searching, the typical set and Guesswork” is an interesting technical paper you can download from the arXiv.org archive.
Claude Shannon largely founded the field of information theory back in the 1940s, as described in James Glieck’s book The Information which I have mentioned before. Shannon explored the concept of entropy, which can measure both randomness or unpredictability as well as potential information content.
The problem with most current analyses of cryptographic strength is that they’re using the wrong measurement of randomness or entropy, and therefore they overestimate the difficulty of certain types of cryptanalytic attack.
Put simply: Things aren’t as safe as we thought they were.
Don’t panic! The analysis by the electrical engineers from MIT and the National University of Ireland did not show that it was easy to break current cryptographic systems. Far from it, their attack is still extremely difficult. But it’s not as extremely difficult as we thought it would be.
The problem comes from the Shannon notion of entropy being based on overall or average characteristics, the probability or the frequency at which certain strings of bits will appear in certain types of files. Shannon worked for Bell Telephone, and this was precisely the correct approach for designing communications systems. You need overall statistic models for the traffic content.
Cryptography, however, specifically cryptanalysis or attacks on cipher systems, isn’t concerned with overall statistics of what happens on randomly selected chunks from continuous data streams. Cryptanalysis tries to break the system, so you’re interested in weak points. They’re extremely difficult to find automatically, but they make the overall attack much easier than a blind or “naive” search.
Typical data stored in files or transmitted as streams of packets across a network is not of perfectly uniform distributions. Compression (for example, encoding images or audio with algorithms like MP3 or JPEG) makes files smaller by removing much of the entropy or randomness. But compression isn’t a perfect process. Typical data files or streams will contain segments with lower entropy. The variations in entropy in typical data makes brute force searching more effective than typical analyses have assumed.
This doesn’t seem to favor one cipher over another, there’s no need to change to a different algorithm. The important thing to realize is that the absolute numbers may be a little lower.
What’s interesting to me is how this very nicely demonstrates the difficulty of meaningful cybersecurity analysis. Really sharp people are working on these problems, we must carefully follow their discoveries.