It's a length extension on the beginning part of the PDF. There are two headers that have the same hash. As long as you append the same suffix to both of those headers, the hash will be the same. In this case the headers happen to contain a switch that selects one of two images. So the length extension is adding both images to both of the headers. Since the headers have the same hash and the suffix is identical, the overall documents have the same hash. But because of the switch in the header you see two completely different contents.
So basically H1 and H2 have the same SHA-1 hash. By adding suffix I1I2 to both you get H1I1I2 and H2I1I2. That's the length extension.
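To make the H1/H2 argument concrete, here is a minimal sketch using a toy Merkle-Damgård hash. Everything in it is an assumption for illustration: `toy_md`, `compress`, and the 16-bit state are made up so that a collision can be brute-forced in a moment. It is not SHA-1 and these are not the real PDF headers. It only shows that once two equal-length prefixes reach the same internal state, any shared suffix keeps the hashes equal.

```python
import hashlib
from itertools import count

BLOCK = 8  # bytes per message block (toy value)
STATE = 2  # bytes of internal state, tiny on purpose so collisions are cheap

def compress(state: bytes, block: bytes) -> bytes:
    # Stand-in compression function: truncated SHA-256 of state || block.
    return hashlib.sha256(state + block).digest()[:STATE]

def toy_md(msg: bytes) -> bytes:
    # Merkle-Damgard padding: 0x80, zeros, then the 4-byte message length.
    padded = msg + b"\x80"
    padded += b"\x00" * (-(len(padded) + 4) % BLOCK)
    padded += len(msg).to_bytes(4, "big")
    state = b"\x00" * STATE
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state  # the digest IS the final internal state

def find_colliding_prefixes():
    # Birthday search for two distinct one-block prefixes with equal state.
    seen = {}
    iv = b"\x00" * STATE
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        s = compress(iv, block)
        if s in seen:
            return seen[s], block
        seen[s] = block

h1, h2 = find_colliding_prefixes()   # play the role of H1 and H2
suffix = b"image-one|image-two"      # plays the role of I1I2

print(h1 != h2)                                    # distinct prefixes
print(toy_md(h1) == toy_md(h2))                    # same hash
print(toy_md(h1 + suffix) == toy_md(h2 + suffix))  # still the same hash
```

The prefixes have the same length, so the padding is identical too, which is why the collision survives all the way to the final digest.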
If you got two different messages to get into the same internal state with SHA-3, the same thing would apply, wouldn't it? Would that be a length extension attack on SHA-3?
This is really a terminology question. I had a clear understanding of "length extension attacks", but it seems people on this comment page are using the term for something else now. I've been looking over crypto.stackexchange and Twitter to see if I missed the memo, but this looks like a new usage.
It's a different usage of the term than the normal case, but it takes advantage of the same vulnerability to length extension. It doesn't necessarily apply to SHA-3 and other sponge functions because there is a difference there between the internal state of the hash function and the actual hash itself. The internal state is much larger than the hash output for a sponge function like SHA-3, which means you can get two messages that have the same hash without having the same internal state, and therefore appending a suffix would most likely change that internal state enough to no longer have a hash collision.
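A rough way to see the "state wider than output" point is the sketch below. It is not a sponge and not SHA-3. It is a toy truncated construction (the names `absorb` and `wide_hash` and all the sizes are assumptions) that only mimics the one property that matters here: the published hash is smaller than the hidden state. Two messages can then collide in the output while their internal states differ, and a shared suffix usually breaks the collision.

```python
import hashlib
from itertools import count

BLOCK = 8  # bytes per block (toy value)
STATE = 4  # bytes of hidden internal state
OUT = 1    # bytes actually published as the hash

def compress(state: bytes, block: bytes) -> bytes:
    return hashlib.sha256(state + block).digest()[:STATE]

def absorb(msg: bytes) -> bytes:
    # Full internal state after padding and processing the message.
    data = msg + b"\x80"
    data += b"\x00" * (-(len(data) + 4) % BLOCK)
    data += len(msg).to_bytes(4, "big")
    state = b"\x00" * STATE
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def wide_hash(msg: bytes) -> bytes:
    # Only a slice of the state is revealed.
    return absorb(msg)[:OUT]

def find_output_only_collision():
    # Same published hash, different hidden state.
    seen = {}
    for n in count():
        m = n.to_bytes(BLOCK, "big")
        h = wide_hash(m)
        if h in seen and absorb(seen[h]) != absorb(m):
            return seen[h], m
        seen.setdefault(h, m)

m1, m2 = find_output_only_collision()

# Try every one-byte suffix and count how often the collision survives.
survivors = sum(
    wide_hash(m1 + bytes([b])) == wide_hash(m2 + bytes([b]))
    for b in range(256)
)
print(wide_hash(m1) == wide_hash(m2), survivors)
```

With a 1-byte output each suffix has roughly a 1 in 256 chance of colliding again by accident, so `survivors` should be small, whereas in the narrow-state case every suffix would preserve the collision.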
> It's a different usage of the term than the normal case
Is it? How? It's a simple case of length extension, just that here, since we have two independent starting points sharing the same state, we start with a collision and we extend to a collision.
In other words, these are two length extensions on independent prefixes. It just happens that these prefixes share the same state / hash, hence the surprising result (at first glance).
Normally when talking about length extension attacks the original plaintext is unknown, but we can compute the hash of the plaintext plus an extension if we know the hash of the plaintext. In this case we know what the plaintext is, and we happen to have two different texts that produce the same hash, which we can extend to generate many collisions. It's the same property but it's a different scenario than what is commonly referred to as length extension.
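For contrast, the classic scenario (unknown plaintext, known hash) looks roughly like this on a toy Merkle-Damgård hash. Again, `toy_md`, `pad`, the sizes, and the secret and message values are all made up for illustration. The attacker only uses the published digest and the length of the hidden input.

```python
import hashlib

BLOCK = 8  # bytes per block (toy value)
STATE = 4  # bytes of state; the digest is the whole state

def compress(state: bytes, block: bytes) -> bytes:
    return hashlib.sha256(state + block).digest()[:STATE]

def pad(length: int) -> bytes:
    # Padding appended to any message that is `length` bytes long.
    zeros = -(length + 1 + 4) % BLOCK
    return b"\x80" + b"\x00" * zeros + length.to_bytes(4, "big")

def toy_md(msg: bytes, state: bytes = b"\x00" * STATE, already: int = 0) -> bytes:
    # Hash msg, optionally resuming from `state` after `already` bytes
    # (which must be a whole number of blocks) were processed earlier.
    data = msg + pad(already + len(msg))
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

# Victim side: the secret is never shown to the attacker.
secret = b"hunter2!"
message = b"amount=10"
tag = toy_md(secret + message)

# Attacker side: knows `tag` and the total length, not the secret.
known_len = len(secret) + len(message)
glue = pad(known_len)            # the padding the victim's hash already consumed
extension = b"&amount=9999"
forged = toy_md(extension, state=tag, already=known_len + len(glue))

# The forged digest matches what the victim would compute.
print(forged == toy_md(secret + message + glue + extension))
```

The collision case in this thread uses the same resume-from-state property, just with two known prefixes instead of one unknown one.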
Neither SHA-3 nor BLAKE2 is susceptible to length extension. Keyed BLAKE2 is actually just a prefix MAC (which would be completely insecure with SHA-1/2 / classic Merkle-Damgård).
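For reference, keyed BLAKE2 is exposed directly in Python's `hashlib`, so a MAC needs no HMAC wrapper. A minimal sketch, where the key and message are made-up values and `blake2_mac` and `verify` are hypothetical helper names:

```python
import hashlib
import hmac

def blake2_mac(key: bytes, msg: bytes) -> bytes:
    # Built-in keyed mode of BLAKE2b; the key may be up to 64 bytes.
    return hashlib.blake2b(msg, key=key, digest_size=32).digest()

def verify(key: bytes, msg: bytes, tag: bytes) -> bool:
    # Constant-time comparison to avoid leaking where the tags differ.
    return hmac.compare_digest(blake2_mac(key, msg), tag)

key = b"illustrative-secret-key"  # hypothetical key
msg = b"amount=10"
tag = blake2_mac(key, msg)

print(verify(key, msg, tag))                    # genuine message
print(verify(key, msg + b"&amount=9999", tag))  # extended message is rejected
```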