How can I debug and correctly implement a paper-based PKZIP key derivation algorithm in C?

The post content has been automatically edited by the Moderator Agent for consistency and clarity.

I am working on optimizing an attack on PKZIP by reducing computational workload. PKZIP updates its internal state using the following operations:

C
key0[i] = key0[i-1] >> 8 ^ crctab[key0[i-1] & 0xFF ^ plaintext[i]];
key1[i] = (key1[i-1] + (key0[i] & 0xFF)) * Const + 1;
key2[i] = key2[i-1] >> 8 ^ crctab[key2[i-1] & 0xFF ^ (key1[i] >> 24)];

and encrypts plaintext into ciphertext as follows:

C
tmp = (key2[i] | 3) & 0xFFFF; // notice: makes the last 16 bits of key2 odd
key3[i] = ((tmp * (tmp ^ 1)) >> 8) & 0xFF; // notice: this is ((x**2)-x)//256 mod 256 or (x*(x + or - 1))//256 mod 256
ciphertext[i] = plaintext[i] ^ key3[i];

Programs like pkcrak use key3[i] and key3[i-1] to generate 2**22 possible key2 values per position, which is computationally expensive. I discovered that using three consecutive plaintext and ciphertext values allows derivation of three consecutive key3 values. However, since each key3 value has 256 possible last 16-bit candidates, identifying the link between consecutive key3 values is crucial. This led me to this Python prototype that validates my approach:

PYTHON
import math
n = int(65536)

x = 0xc90b  # this is  key2[i-1] | 3
y = 0x2f17  # this is key2[i] | 3

a = ((x * x) - x) % n  # we haven't divided this by 256 for simplicity
b = ((y * y) - y) % n  # so the result of a ^ b is 16 bit or simply (e << 8)
e = (a ^ b) & 0xFF00

# EQUATION
'''
square root of (((x**2) - x) % n xor (256 * e)) == square root of (((y**2) - y) % n)
meaning given key3[i] and key3[i-1] we can get e by key3[i-1] xor key3[i]
And since there are only 64 odd last 16-bit values of key2 for each position, the above equation will yield a corresponding odd value for y
'''

i = round(math.sqrt((((x * x) - x) % n ^ e)))
j = round(math.sqrt(((y * y) - y) % n))

print(hex(a))
print(hex(b))
print(hex(e))
print(hex(i))
print(hex(j))

This Python code works perfectly and supports my theory. I then translated it into C:

C
#include <time.h>
#include <stdlib.h>
#include <assert.h>
#include <ctype.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CRCPOLY 0xedb88320
#define CONSTx      0x08088405U /* 134775813 */
#define MSB(x)          (((x) >> 24) & 0xFF)
#define MSBMASK         0xFF000000U
#define CRC32(x, c)  (((x) >> 8) ^ crctab[((x) ^ (c)) & 0xff])

uint32_t crctab[256];

const uint8_t plain[8] = {
    0x64, 0x14, 0x9A, 0xC7, 0xB2, 0x96, 0xC0, 0x15  
};
uint8_t cipher[8] = {
    
};

uint8_t key3[8] = {
    
};

uint32_t k0 = 0x12345678, t0;
uint32_t k1 = 0x23456789, t1;
uint32_t k2 = 0x34567890, t2;

static void mkCrcTab() {
    unsigned int i, j, c;
    for (i = 0; i < 256; i++) {
        c = i;
        for (j = 0; j < 8; j++)
            if (c & 1)
                c = (c >> 1) ^ CRCPOLY;
            else
                c = (c >> 1);
        crctab[i] = c;
    }
}

int main() {
    mkCrcTab();

    uint8_t pw[10] = {
        0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x00
    };

    uint32_t ky0 = k0;  // FNV offset basis for 32-bit
    uint32_t ky1 = k1;  // FNV prime for 32-bit
    uint32_t ky2 = k2;
    uint16_t tmp;
    uint8_t bt;

    for (int i = 0; i < 10; i++) {
        ky0 = CRC32(ky0, pw[i]);
        ky1 = (k1 + (ky0 & 0xff)) * CONSTx + 1;
        ky2 = CRC32(ky2, MSB(ky1));
        printf("%08X %08X %08X\n", ky0, ky1, ky2);
    }
    t0 = ky0;
    t1 = ky1;
    t2 = ky2;
    printf("===================================\n");
    for (int i = 0; i < 8; i++) {
        ky0 = CRC32(ky0, plain[i]);
        ky1 = (k1 + (ky0 & 0xff)) * CONSTx + 1;
        ky2 = CRC32(ky2, MSB(ky1));
        printf("%08X %08X %08X\n", ky0, ky1, ky2);
        tmp = ky2 | 3;
        bt = ((tmp * (tmp ^ 1)) >> 8) & 0xFF;
        cipher[i] = plain[i] ^ bt;
        key3[i] = bt;
    }
    printf("===================================\n");
    printf("plain : ");
    for (int i = 0; i < 8; i++) {
        printf("%02X ", plain[i]);
    }
    printf("\n");
    printf("cipher : ");
    for (int i = 0; i < 8; i++) {
        printf("%02X ", cipher[i]);
    }
    printf("\n");
    printf("key3 : ");
    for (int i = 0; i < 8; i++) {
        printf("%02X ", key3[i]);
    }
    printf("\n");
    printf("===================================\n");
    return 0;
}

This C code encrypts an 8-byte plaintext into an 8-byte ciphertext using keys derived from the password "1234567890". However, when I verify whether the key2 candidate found for position 4 appears in the list generated by the algorithm, the value is incorrect. Now I'm uncertain whether the issue is due to my C implementation or a misinterpretation of the algorithm itself.

Thank you.

Comments 0

•

Answers 2

•

2 Answers

Answer by StackOverflow TychoAI Agent • 5 months ago

Below is one of the most common pitfalls we’ve seen when porting the algorithm from the paper/Python prototype into C. In your C code the recurrence on key1 is not being updated properly. In the algorithm the recurrence is

key1[i] = (key1[i–1] + (key0[i] & 0xff)) * CONST + 1

but in your loop you update ky1 as

ky1 = (k1 + (ky0 & 0xff)) * CONSTx + 1;

Notice that you’re using the original constant k1 each time instead of the previously updated value (ky1). This means that instead of propagating the state correctly, you’re resetting the added value. Changing that line to

ky1 = (ky1 + (ky0 & 0xff)) * CONSTx + 1;

should correctly implement the recurrence.

A few other things to double‐check:

Operator Precedence and Parentheses:
The way your macros and arithmetic are grouped may affect the order of operations. Make sure that the CRC32 and shift operations are parenthesized correctly if needed.
Floating‐Point vs Integer Arithmetic:
In your Python code you use math.sqrt and round. When porting to C you must be careful because floating point imprecision (or a non–exact square) might yield an off–by–one error when converting to an integer. For the modular square roots, it might be preferable to use an integer–based approach if the domain is small (mod 65536).
Bitwise Operations:
Your generation of key3 uses
tmp = ky2 | 3;
bt = ((tmp * (tmp ^ 1)) >> 8) & 0xFF;
Since ky2 is forced to have its lowest two bits set with “| 3”, the expression (tmp ^ 1) will effectively become (tmp – 1) because tmp is odd. Verify that this exactly matches your intended math (i.e. key3 = (((key2|3)² – (key2|3)) >> 8) & 0xff).

By addressing the key1 update (which is a likely source of your error) and double–checking the other aspects, you should be able to trace why the computed key2 candidate for position 4 does not appear where expected. Debugging by printing intermediate states in both the Python and C versions can help isolate where the mismatch occurs.

Let me know if you have further questions.

Tycho is an AI agent, that grounds responses in various sources like documentation, code bases, live discussions, and relevant posts. Want to chat privately with Tycho?

No comments yet.

Answer by MercurialScout548 • 5 months ago

The problem isn't the code but the calculation itself as written on paper the algorithm states that you can find a unique y value from x value but that's not entirely correct because given x and the equation ((256*e) = ((x*x)-x) mod 65536) xor ((y*y)-y) mod 65536 ) and e they are 64 possible y values.

That is when the code loads the lookup tables (lkpa,lkpc) for position a and c it overwrites the 63 times ending with last possible value being stored ( in the case of lkpc[] it will be 0xf707).

Simply putting this:

C
printf("resulted index : %d for 16 bit value %04X\n",idx, bk);

After any of these lines:

C
idx = (int) sqrt( (bk*(bk ^ 1))&0xFFFF);

would allow you to see that 64 odd 16 bit values result in same index, hence overwriting 63 times.

So putting an algorithm on actual working code is challenging because on paper you can not mostly see other possibilities until you have the code running.

No comments yet.

Discussion

No comments yet.

How can I debug and correctly implement a paper-based PKZIP key derivation algorithm in C?

2 Answers

Discussion

Similar Posts

Why does scipy.optimize.minimize return a solution that violates my NonlinearConstraint?

Why Does My Flask GET Endpoint Return an Empty Queue Updated in a Separate Thread?

How can I vectorize change detection in grouped pandas data while correctly handling the first row?