Encryption is based on two principles: confusion and diffusion.
Confusion means that the process drastically changes data from the input to the output, for example by translating the data through a non-linear table created from the key. We have lots of ways to reverse linear calculations (starting with high school algebra), so the more non-linear the process is, the more analysis tools it breaks.
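To make the "high school algebra" point concrete, here is a minimal sketch of a purely linear toy cipher and how it falls apart. The cipher (c = a*p + b mod 256), the key values and the function names are all made up for illustration; this is not part of any real algorithm.

```python
# Toy "cipher" that is purely linear (affine) over bytes: c = (a*p + b) mod 256.
# Everything here is a hypothetical illustration, not a real algorithm.

def affine_encrypt(data: bytes, a: int, b: int) -> bytes:
    return bytes((a * p + b) % 256 for p in data)

def recover_affine_key(p1: int, c1: int, p2: int, c2: int) -> tuple[int, int]:
    # Subtracting the two equations eliminates b:
    #   c1 - c2 = a * (p1 - p2)  (mod 256)
    dp = (p1 - p2) % 256
    # The modular inverse exists only when dp is odd.
    # pow(x, -1, m) needs Python 3.8 or later.
    a = ((c1 - c2) * pow(dp, -1, 256)) % 256
    b = (c1 - a * p1) % 256
    return a, b

secret_a, secret_b = 77, 201          # the "key" the attacker does not know
plaintext = b"attack at dawn"
ciphertext = affine_encrypt(plaintext, secret_a, secret_b)

# The attacker knows (or guesses) just the first two plaintext bytes.
recovered = recover_affine_key(plaintext[0], ciphertext[0],
                               plaintext[1], ciphertext[1])
print(recovered)
```

Two known plaintext bytes are enough to solve for the whole key. A non-linear table gives the attacker no such equations to solve.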
Diffusion means that changing a single character of the input will change many characters of the output. Done well, every part of the input affects every part of the output, making analysis much harder. No confusion process is perfect: it always lets through some patterns. Good diffusion scatters those patterns widely through the output, and if several patterns make it through, they scramble each other. This makes patterns vastly harder to spot, and vastly increases the amount of data an attacker must analyze to break the cipher.
AES has excellent confusion and excellent diffusion. Its confusion lookup tables are very non-linear and good at destroying patterns. Its diffusion stage spreads every part of the input to every part of the output: changing one bit of input changes half the output bits on average. Both confusion and diffusion are repeated several times for each input to increase the amount of scrambling. The secret key is mixed in at every stage so that an attacker cannot precalculate what the cipher does.
None of this would happen if you used a simple one-stage scramble based on a key. Input patterns would flow straight through to the output. It might look random to the eye, but analysis would find obvious patterns and the cipher could be broken.
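A rough way to see the difference is to flip one input bit and count how many output bits change. Python's standard library has no AES, so this sketch uses SHA-256 as a stand-in for a well-diffusing primitive (it is a hash, not a cipher, so treat it only as an illustration of the "half the bits change" behavior). The one-stage scramble is a hypothetical single-byte XOR, and all names are made up for the example.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def xor_scramble(data: bytes, key: int) -> bytes:
    """Hypothetical one-stage scramble: XOR every byte with the same key byte."""
    return bytes(b ^ key for b in data)

msg1 = b"attack at dawn"
msg2 = bytes([msg1[0] ^ 0x01]) + msg1[1:]   # same message with one bit flipped

# One-stage scramble: the single-bit change passes straight through.
weak_changed = bit_difference(xor_scramble(msg1, 0x5A), xor_scramble(msg2, 0x5A))

# Stand-in for good diffusion: the change spreads over the whole output.
d1 = hashlib.sha256(msg1).digest()
d2 = hashlib.sha256(msg2).digest()
strong_changed = bit_difference(d1, d2)

print(f"XOR scramble: {weak_changed} output bit(s) changed")
print(f"SHA-256:      {strong_changed} of {len(d1) * 8} output bits changed")
```

With the XOR scramble exactly one output bit changes, so the attacker can see precisely where the input differed. With the stand-in, roughly half of the 256 output bits should change.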
I guess I had always assumed attacks were brute force input = output. But you seem to be implying that patterns would be looked for that would make the encryption/decryption vastly easier. That would probably be the piece that I wasn't grasping because if you have:
a -> f(a) -> b
vs.
a -> g(a) -> b
Then brute force really wouldn't be affected by complexity unless it was a time factor, and thus the two functions should be identical in security. But if analysis is the key to a good attack, then this makes perfect sense. Thanks for the info.
Yes. As a simple example, if you XORed your input text with a 1-byte key, the letter frequency of English text would be unchanged.
So the most common byte in your output would likely be the encrypted character 'e' (or the encrypted space, if the text includes spaces). You could then recover the key by XORing 'e' with that most common byte.
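Here is a minimal sketch of that attack. It assumes the plaintext is lowercase English with no spaces, so that 'e' really is the most frequent byte. The sample text, the key and the function names are made up for the example.

```python
from collections import Counter

def xor_single_byte(data: bytes, key: int) -> bytes:
    """XOR every byte with the same 1-byte key (encrypts and decrypts)."""
    return bytes(b ^ key for b in data)

def guess_key(ciphertext: bytes, assumed_most_common: bytes = b"e") -> int:
    """Assume the most frequent ciphertext byte is the encrypted 'e'."""
    most_common_byte, _count = Counter(ciphertext).most_common(1)[0]
    return most_common_byte ^ assumed_most_common[0]

# Illustrative plaintext without spaces, so 'e' dominates the byte counts.
plaintext = b"meetmewherethegreentreesmeettheriver"
secret_key = 0x5A                      # unknown to the attacker
ciphertext = xor_single_byte(plaintext, secret_key)

recovered_key = guess_key(ciphertext)
recovered_text = xor_single_byte(ciphertext, recovered_key)
print(hex(recovered_key), recovered_text)
```

The frequency counts of the ciphertext are exactly those of the plaintext, just attached to different byte values, which is why a single guess is enough. For ordinary text with spaces you would guess that the most common byte is a space instead.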
More complex systems have more complex patterns, but that's what cryptographers do: try to spot weaknesses in an algorithm.
Search for "Garbling a block to confuse an app" and read on from there for a real-world use of output patterns (used when you can partially control the input).
Multibyte XOR keys are almost as trivial to break as single-byte XOR keys, for what it's worth. And if you know how to do that, there are "best practices" AES modes in which common implementation errors lead to the same attack.
You can brute force AES just like anything else. But AES keys are at least 128 bits long, which means at least 2^128 possible keys, so you're racing the lifespan of the solar system to attempt it.
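As a back-of-the-envelope check, here is the arithmetic for a 128-bit key. The guessing rate of 10^12 keys per second is purely an assumption picked for illustration, not a measured figure.

```python
KEY_BITS = 128
GUESSES_PER_SECOND = 10**12            # assumed attacker speed, illustrative only
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

keyspace = 2 ** KEY_BITS               # number of possible 128-bit keys
worst_case_years = keyspace / GUESSES_PER_SECOND / SECONDS_PER_YEAR
average_years = worst_case_years / 2   # on average the key turns up halfway through

print(f"keys to try:      {keyspace:.3e}")
print(f"worst case years: {worst_case_years:.2e}")
print(f"average years:    {average_years:.2e}")
```

At that assumed rate the search takes on the order of 10^19 years, and making the attacker a million times faster still leaves around 10^13 years.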