So for those who don't know, most of the actual data is in cd.wad and the metadata that actually tells you where everything is is cd.hed, which consists of a bunch of 3-uint32_t tuples with the filename_hash, the offset_into_the_wad, and the length. The upside is it's nice and fast to look things up. The downside is it's a nuisance to rip as you lack the filenames.

Here's the hash algorithm. It's a broken version of CRC-32 which may have been "obfuscated"... but for some reason they still kept the usual 0xEDB88320 polynomial so it sticks out like a sore thumb anyway.

Code: Select all

`uint32_t namehash(const char *s)`

{

uint32_t csum = (uint32_t)(int32_t)-1;

for(int i = 0; s[i] != '\x00'; i++) {

uint32_t a1 = (uint32_t)(uint8_t)s[i];

// Convert to lower case

if(a1 >= 'A' && a1 <= 'Z') {

a1 += 32;

}

// Broken CRC-32 implementation

//

// Normal CRC-32:

// * Shift right, if carry then XOR by 0xEDB88320

//

// THPS2 Special! Wow! edition:

// * Rotate left, shift original XOR'd val right, if carry then XOR by 0xEDB88320

uint32_t v0 = (a1 ^ csum);

uint32_t bits = (v0 & 0xFF);

// Reference implementation

for(int j = 0; j < 8; j++)

{

// Rotate left

csum = ((csum>>31)&1)|(csum<<1);

// Then apply the polynomial

if((bits & 1) != 0) {

csum ^= 0xEDB88320;

}

// Shift along

bits >>= 1;

}

}

return csum;

}

Also, huge shoutouts to Graymatter LTI for including a whole bunch of juicy stuff in the PC version that really shouldn't be there, like, ooh, the build logs and the complete symbol table for the PS1 demo version, and all the definitions used in the TRG files which makes reverse-engineering the maps that much easier. But on a more humble note, they at least kept the filenames fully intact on the PC version, which was invaluable for a few sets of filenames.

EDIT: Oops, the comment said "rotate right" - it's a left rotate.