Page 1 of 1

Working out the THPS2 PS1 filenames

Posted: Sat Aug 05, 2017 8:56 pm
by iamgreaser
First post so it would be unwise of me to post a link right now. I'm bruteforcing the one filename I don't have, and this will take a long time (it's a file from THPS1). You can find it on my GitHub Gist as "THPS2 PS1 (almost) complete filename list".

So for those who don't know, most of the actual data is in cd.wad and the metadata that actually tells you where everything is is cd.hed, which consists of a bunch of 3-uint32_t tuples with the filename_hash, the offset_into_the_wad, and the length. The upside is it's nice and fast to look things up. The downside is it's a nuisance to rip as you lack the filenames.

Here's the hash algorithm. It's a broken version of CRC-32 which may have been "obfuscated"... but for some reason they still kept the usual 0xEDB88320 polynomial so it sticks out like a sore thumb anyway.

Code: Select all

uint32_t namehash(const char *s)
	uint32_t csum = (uint32_t)(int32_t)-1;
	for(int i = 0; s[i] != '\x00'; i++) {
		uint32_t a1 = (uint32_t)(uint8_t)s[i];

		// Convert to lower case
		if(a1 >= 'A' && a1 <= 'Z') {
			a1 += 32;

		// Broken CRC-32 implementation
		// Normal CRC-32:
		// * Shift right, if carry then XOR by 0xEDB88320
		// THPS2 Special! Wow! edition:
		// * Rotate left, shift original XOR'd val right, if carry then XOR by 0xEDB88320

		uint32_t v0 = (a1 ^ csum);
		uint32_t bits = (v0 & 0xFF);
		// Reference implementation
		for(int j = 0; j < 8; j++)
			// Rotate left
			csum = ((csum>>31)&1)|(csum<<1);

			// Then apply the polynomial
			if((bits & 1) != 0) {
				csum ^= 0xEDB88320;

			// Shift along
			bits >>= 1;

	return csum;
Also, huge shoutouts to Graymatter LTI for including a whole bunch of juicy stuff in the PC version that really shouldn't be there, like, ooh, the build logs and the complete symbol table for the PS1 demo version, and all the definitions used in the TRG files which makes reverse-engineering the maps that much easier. But on a more humble note, they at least kept the filenames fully intact on the PC version, which was invaluable for a few sets of filenames.

EDIT: Oops, the comment said "rotate right" - it's a left rotate.

Re: Working out the THPS2 PS1 filenames

Posted: Sun Aug 06, 2017 7:54 am
by Morten1337
I think this hash function is called crc32-b.

Re: Working out the THPS2 PS1 filenames

Posted: Mon Aug 07, 2017 3:04 am
by iamgreaser
crc32b in that file is just an implementation of the usual zlib crc32.

It looks suspiciously like they took a working version of CRC-32, then decided to rotate the output to the left instead of shifting it to the right. The math is completely ruined but at least it keeps a somewhat decent level of entropy.

Oh, and now that you know I'm not a filthy spambot, here's the filename list. Have fun. ... 8494703730