Linear Cryptanalysis Applied to Reduced-round RC5
This page is about a linear attack on the RC5 block cipher. I discovered this after learning the basics of Matsui's linear cryptanalysis while examining RC5. This is my first stab at analyzing a modern real-world cipher. I encourage the reader to email me if you notice any mistakes. In the future, I may write a formal paper about this but I'd like to work on it more and become more familiar with the conventions used in writing academic papers. Also, it should be noted that I have not done a lot of in-depth reading of existing papers analyzing RC5. It is very possible that this attack is simply re-inventing the wheel. I feel that rediscovering attacks is important to the growth of novice cryptographers like myself. If anyone knows of a paper that describes something similar to what I propose here, I would be grateful if they sent me a link to it.
This section will update as I add more to this page. At the moment, this attack has only been developed and tested against 1 round of a variant of RC5 (16 bit block size, 8 bit subkey). The next step is to scale the attack to a real 1 round RC5 (32 bit block size, 16 bit subkey). After that, we'll add more rounds and see how the effectiveness scales.
What is RC5?
RC5 is a very simple block cipher designed by crypto legend Ron Rivest. It is also very flexible: the data sizes used, as well as the number of rounds, are not set in stone. It uses a Feistel-like structure and is centered around an operation called a data-dependent rotation. This operation takes an input and rotates the bits to left a number of times determined by a second input; the rotated result is the function's output. This rotation adds diffusion and the data-dependence aspect may also create confusion. The other components of the cipher are XOR (used to mix the left and right halves) and modular addition (for key mixing). Rivest allows a minimum block size of 32 bits. The key size can be any number of bytes and any number of rounds are valid. For now, I will refer to the parameters as: 16 bit block size, 8 bit key, and 1 round. I realize that these are very small parameters and even fall below the loose requirements of the cipher itself. However, working with small sizes will allow easier explanation and faster testing for now.
Modular Addition
The first component of RC5 that I'd like to discuss is the modular addition key-mixing portion. We will skip discussion of the XOR that combines the left and right halves for now because it does not hinder the attack. Modular addition takes two inputs and adds them. The result is then reduced to the original data size of the inputs(not the combined size of them) by modulo. In our examples, the 8 bit input is added to the 8 bit subkey and then taken modulo 256 to produce an 8 bit output. We are trying to build a classic linear attack against this cipher and the modulo addition is a problem: it is not entirely linear; it is not completely non-linear. However, it is non-linear enough that we can justify trying something strange: lets treat it as an s-box. We will build linear expressions based on the input, the key, and finally the output. We will then process all of these equations against all possible inputs and all possible subkeys. By keeping track of when the equations holds true and when they do not, we can discover what exactly makes this modular addition function partially linear.
Data-dependent Rotations
This is the really scary part of the cipher that everyone seems to beat their heads against. It takes an input (8 bit in our example) and a rotation value (also 8 bit in our example). The input's bits are shifted in a circular fashion to the left a number of times determined by the rotation value. For example: if the input is 3 (00000011) and the rotation value is 9 (00001001), then the output is 6 (00000110). This seems like a really complicated function to analyze for a linear attack. However, there appears to be a case that makes the rotation aspect of this cipher completely irrelevant (perfect!). If we can find a linear expression for the modular addition which XORs together every single input bit; we can ignore the data-dependent rotation! Because parts of an XOR sum can be XOR'd together no matter what order they are in; it does not matter if they are rotated around or scrambled up. However, if we are only XORing together some of the addition's input bits; those bits do not stay consistent between the plaintext and the output of the rotation.
Finding Linear Expressions
Now we must find some linear expressions for the modular addition which hold true for more or less than half of the cases. We treat the 8 bit input and the 8 bit key as a single 16 bit input. However, we do have a stipulation for the sorts of linear expressions we are looking to generate. Our program needs to always use the XOR sum of EVERY single bit of the input. Remember that this input is being scrambled by the data-dependent rotation; therefore we can only use linear expressions that utilize every single one of these bits and not a subset of them. Taking this into account, the program will generate every single linear expression for both the key bits and the output bits while XOR-summing every input-bit. The end result of my program (keeping only the most biased expressions) was this listing. Notice that at the very end of that list, there is an expression which uses the XOR sum of every single input, output, and key bit involved. It also holds true more than half the time; we can use it! What this expression basically means is that, statistically, we can ignore any keys whose bits do not XOR-sum to a value that equals the XOR-sum of known-inputs to the modular addition operation.
Making the Expression Useful
We have now found a useful linear expression that linearly approximates the operation of the modular addition. How can we expand this to be applicable to the entire round and how can we use it to recover the key? I encourage you to refer to a block diagram of the RC5 round while reading this. We can see that the output of the addition becomes R1: a known value. K0 is the key that we are searching for (hopefully non-exhaustively). The real question is: how can we get from the plaintext input to the round down to the input of the modular addition? Firstly we must XOR together L0 and R0; we just add these bits to a list of input bits that will become the input XOR-sum. This listing of bits next pass through the data-dependent rotation. Because we are XOR-summing every single input bit to the addition and the order of these bits does not matter for XOR, the data-dependent rotation does not matter at all. We chose an expression that neutralized it for this analysis. Now we see that the XOR-sum of every single plaintext bit and every single R1 ciphertext bit should statistically equal the XOR-sum of every single key bit.
Implementing the Attack
This is a known-plaintext attack. This means that it requires a collection of known plaintexts that have all been encrypted using the same key as well as the resulting ciphertexts. Our job is to deduce the key without simply testing every single one (brute force). The first thing to do is determine what the XOR-sums of all of the known plaintext-ciphertext pairs. Then we figure out what the most common XOR-sum is among the collected known pairs. We will call this the known XOR-sum. Next we will loop through all possible keys and test encrypt every plaintext with each test-key until we determine that it produces the correct ciphertexts. When that test-key encrypts all of the known-plaintexts to their respective known-ciphertexts; the test-key is considered the real key and we win. The trick here is to completely ignore test-keys whose XOR-sum does not equal the known XOR-sum. Statistically this will allow us to determine the correct key more than 50% of the time and by testing less than 25% of the keyspace. This is better than brute force and can be considered a break. Here is the code which I used to test the attack written in C: RC5 Attack Code.
Results
One Round, 16 Bit Input, 8 Bit Key : 20 Plaintexts; 10,000 Runs Averaged:- Total Keys Found = 57%
- Keys Checked Average when Found = 60/255