Identify Unknown Compression

ngc_kor, posted Tue Oct 28, 2014 4:46 am (1296)


I have a file with a strange compression format that I can't find any clue about.
(From Eternal Darkness by Silicon Knights, published by Nintendo.)
The file signature starts with *SK_ASC*, and the compression is unknown.

Below are the compression methods I tested; none of them match:

    LZ10
    LZ11
    LZ77
    LZO1x-1
    LZO1x-999
    LZSS
    LZW
    LZMA
    HUFF blocksize 4 & 8 byte
    RLE
    ZLIB

I have the decompressed data, but after a few days I still have no clue.
Can anyone identify this compression method?

Here is the file: http://goo.gl/JJQfl4

P.S. It looks like BPE (Byte Pair Encoding), but I'm not sure :(


[File Information]

CMP is the compressed data, which always starts with SK_ASC.
BIN is the same as CMP.
DCMP is a direct copy from the memory dump, extracted from offset 0x80CA6980.
DMP is the DCMP data with its values corrected by subtracting 0x80CA6980, because the copy contains a lot of dummy values scraped from memory (so this is the cleaned decompressed data, if I guess right).
RAM is a RAM dump.

I have already posted on several forums, but still no clue after weeks :(
http://reverseengineering.stackexchange ... d-gamecube
http://encode.ru/threads/2074-Identifyi ... #post41188

aluigi, posted Tue Oct 28, 2014 8:21 am (1303)


If we consider that offset 8 may contain a big-endian uncompressed size (32-bit) and that the data starts at offset 0xc, the only "good" result I got from the scanner was 230.dat, which is RLE3 (which is just for tests, it's not used), but it's surely invalid because it doesn't contain strings or fields.
I also tried lzma from offset 8, but still nothing.
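
A minimal sketch of the layout guessed above, for anyone who wants to repeat the check on other files. It assumes (unconfirmed) an 8-byte signature, a 32-bit big-endian field at offset 8, and payload from offset 0xC. The function name `split_sk_asc` is made up for illustration.

```python
import struct

def split_sk_asc(blob: bytes):
    """Split a file under the *hypothesized* layout:
    0x00-0x07 signature, 0x08-0x0B big-endian u32, 0x0C onward payload."""
    if len(blob) < 0x0C:
        raise ValueError("file too short for the assumed 12-byte header")
    signature = blob[:8]
    # ">I" = big-endian unsigned 32-bit; whether this is a size is only a guess
    (candidate_size,) = struct.unpack_from(">I", blob, 8)
    payload = blob[0x0C:]
    return signature, candidate_size, payload
```

Comparing `candidate_size` against the size of the matching decompressed file is a quick way to accept or reject the hypothesis.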

ngc_kor, posted Tue Oct 28, 2014 9:43 am (1306)


aluigi wrote:
If we consider that offset 8 may contain a big-endian uncompressed size (32-bit) and that the data starts at offset 0xc, the only "good" result I got from the scanner was 230.dat, which is RLE3 (which is just for tests, it's not used), but it's surely invalid because it doesn't contain strings or fields.
I also tried lzma from offset 8, but still nothing.


In all of the files, the bytes at 0x08 are 0x00 0x**, so we could guess that it's some kind of size-related value.
But I compared a lot of files, and AFAIK it is not a size value.

E.g., the EKisokMenu.dmp file is 2184192 bytes, which means the size is 0x215400, but there is no matching value inside EKisokMenu.cmp!

I'm just guessing that the first 2-4 bytes from 0x08 are some kind of dictionary size.

Anyway, I uploaded a data comparison sheet here: http://goo.gl/wu6Pkv (it also lists the recompressed sizes for various compression methods)
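
The size check described above can be made repeatable with a small search. This is only a sketch: the function name is made up, and it looks for the size as a plain 32-bit value in either byte order, so a size stored in some other encoding would not be found.

```python
import struct

def find_size_candidates(data: bytes, size: int):
    """Return sorted (offset, byte order) pairs where `size` appears as a 32-bit value."""
    patterns = {
        "big-endian": struct.pack(">I", size),
        "little-endian": struct.pack("<I", size),
    }
    hits = []
    for name, pattern in patterns.items():
        start = data.find(pattern)
        while start != -1:
            hits.append((start, name))
            start = data.find(pattern, start + 1)
    return sorted(hits)
```

An empty result for every .cmp/.dmp pair supports the conclusion that the value at 0x08 is not a plain uncompressed size.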

aluigi, posted Tue Oct 28, 2014 9:53 am (1308)


I ran another compression scan from offset 0x8 of EMnMenu.cmp with an uncompressed size of 0x26a920 (the size of the .dmp one), but still nothing.
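
For readers without the scanner, the idea can be sketched with Python's standard library. This is not the scanner used above and covers only four common codecs; the function names, the offsets tried, and the 64 MiB memory limit are all assumptions for illustration.

```python
import bz2
import lzma
import zlib

def _zlib(data):
    return zlib.decompressobj().decompress(data)

def _raw_deflate(data):
    # negative wbits = raw deflate stream without the zlib header
    return zlib.decompressobj(-15).decompress(data)

def _bz2(data):
    return bz2.BZ2Decompressor().decompress(data)

def _lzma(data):
    # memlimit keeps a bogus header from requesting a huge dictionary
    return lzma.LZMADecompressor(memlimit=64 * 1024 * 1024).decompress(data)

DECODERS = {"zlib": _zlib, "raw deflate": _raw_deflate, "bz2": _bz2, "lzma/xz": _lzma}

def scan(blob, offsets=(0x0, 0x8, 0xC), expected_size=None):
    """Try each decoder at each offset; return (offset, codec, output length) hits."""
    hits = []
    for offset in offsets:
        for name, decode in DECODERS.items():
            try:
                out = decode(blob[offset:])
            except (zlib.error, lzma.LZMAError, OSError, EOFError, ValueError):
                continue  # this codec does not accept the data at this offset
            if not out:
                continue
            if expected_size is not None and len(out) != expected_size:
                continue
            hits.append((offset, name, len(out)))
    return hits
```

Passing the size of the .dmp file as `expected_size` filters out accidental partial decodes.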

ngc_kor, posted Sun Nov 09, 2014 9:41 am (1660)


Hump...

Wulf, posted Mon Nov 10, 2014 6:03 pm (1705)


http://blog.delroth.net/2012/03/gcwii-d ... r-ida-6-1/

If you have access to IDA you can use the above plugin to load the DOL file. GC/Wii isn't my thing so I'm not able to tell you much about it, but with enough time you should be able to figure it out with that.

IDA found a few references to decompression.

ngc_kor, posted Fri Nov 14, 2014 10:37 am (1782)


Thanks to Wulf, I now have a clue. It looks like this:

sub_8014191C: (IDA PRO DOL PLUGIN START.dol)

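.set var_28, -0x28
.set var_14, -0x14
.set arg_4, 4

stwu r1, -0x30(r1)            # allocate a 0x30-byte stack frame
mflr r0
stw r0, 0x30+arg_4(r1)        # save the return address
stmw r27, 0x30+var_14(r1)     # save r27-r31
mr r28, r4
mr r27, r3
mr r30, r6
mr r29, r5
mr r12, r28
mr r31, r7
mr r5, r30
addi r3, r1, 0x30+var_28      # r3 = 8-byte buffer on the stack
li r4, 8
mtctr r12
bctrl                         # indirect call through the pointer passed in r4
lis r3, ((aSk_asc+0x10000)@h) # "*SK_ASC*"
addi r4, r1, 0x30+var_28
addi r3, r3, -0xB68 # aSk_asc
li r5, 8
bl sub_800F3B34               # looks like an 8-byte compare (assumption)
cmpwi r3, 0
bne loc_80141994              # branch if the result is non-zero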

So this could be part of the decompression routine.
But what should I do next? I'm not good at disassembly. :(

My goal is to find out the compression method and make a compressor/decompressor tool.
If anyone could help me, I would appreciate it.
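
As a rough reading aid, the listing above appears to do no more than the following. This is a sketch under two unconfirmed assumptions: that the indirect call through r12 is a "read N bytes" callback, and that sub_800F3B34 is a memcmp-style comparison. The function name is made up.

```python
import io

SIGNATURE = b"*SK_ASC*"  # the 8-byte string referenced as aSk_asc

def starts_with_signature(read) -> bool:
    """`read(n)` models the callback passed in r4: fetch the next n bytes."""
    magic = read(8)             # li r4, 8 / bctrl
    return magic == SIGNATURE   # bl sub_800F3B34 / cmpwi r3, 0
```

For example, `starts_with_signature(io.BytesIO(data).read)` checks a file already loaded into `data`. If this reading is right, the actual decompression happens after this check, not in the lines shown.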

Wulf, posted Fri Nov 14, 2014 5:08 pm (1784)


Do you have the capability to debug the game while it runs?

If so, I'd start by feeding it smaller and smaller chunks of the archive and watching how far it gets. Does it read the first 20 bytes before branching to a new sub? 40? Etc. Once it copies out the first chunk of the code, does it do some math on it that converts it into chunks of the RAM dump you found? Or does it create some sort of header information that is then processed in its decrypted state? It could easily be a known compression method wrapped in some form of simple encryption.
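
A minimal sketch of the "smaller and smaller chunks" idea, assuming the test inputs are simply prefixes of the original archive. The halving step and the function name are arbitrary choices for illustration.

```python
def truncated_copies(blob: bytes, min_len: int = 16):
    """Yield progressively shorter prefixes of `blob`, halving the length each time."""
    length = len(blob)
    while length >= min_len:
        yield blob[:length]
        length //= 2
```

Each prefix can be written to its own file and swapped in for the original, to see how far the routine gets before it fails.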

What setup are you using to play around with the game? Original disc on original hardware, one of the emulators, something else? Do you have IDA 6.1, 6.5, or other?

Disclaimer: I have no clue what I'm talking about but usually fake it well enough to fluke into a solution.

ngc_kor, posted Sat Nov 15, 2014 1:54 am (1788)


I'm using IDA Pro 6.1. I have an original disc and original hardware, but when I run tests I use emulators.
The emulator has a debug function, so I can debug.
For now I've set a breakpoint at 8014191C to start looking.

Wulf wrote:
Do you have the capability to debug the game while it runs?

If so, I'd start by feeding it smaller and smaller chunks of the archive and watching how far it gets. Does it read the first 20 bytes before branching to a new sub? 40? Etc. Once it copies out the first chunk of the code, does it do some math on it that converts it into chunks of the RAM dump you found? Or does it create some sort of header information that is then processed in its decrypted state? It could easily be a known compression method wrapped in some form of simple encryption.

What setup are you using to play around with the game? Original disc on original hardware, one of the emulators, something else? Do you have IDA 6.1, 6.5, or other?

Disclaimer: I have no clue what I'm talking about but usually fake it well enough to fluke into a solution.

ngc_kor, posted Sat Nov 15, 2014 5:36 am (1791)


When I set breakpoints at 80141374 / 800f3b34 / 8014191c / 80141998 / 8013FC58, I saw that
(apart from the signature at the start) the first 4000 bytes of the compressed file (JMnMenu.cmp) are loaded at (80)5ADEC0~(80)5AC3E0 in memory.
This data is then decompressed, and the result is stored at (80)5AC3E0~

So I'm guessing that 8014191C (main) / 8013FC58 (subroutine) are definitely the decompression algorithm.
I tried to figure it out by myself, but I'm not good at PPC disassembly, so I couldn't.
Would anyone help me work out and describe how it works?

I uploaded the PowerPC ASM code (the decompression algorithm) here: http://goo.gl/2bQNfj

I also uploaded the decompressed file, the compressed file, the main executable file, and the RAM dump file.

Wulf, posted Sun Nov 16, 2014 6:09 am (1805)


I'll see if I can get something set up to debug the code on my own tomorrow. Understanding it all hands-off just from seeing the code is beyond my skills.

What I'd do is confirm the first instruction to read that byte, see if any operations are done to it, then see where it is stored. Then set a read breakpoint on the new location where it is stored, and repeat.

The fewer steps the data takes between first read and final write, the easier it is to figure out what's being done to it.

Wulf, posted Mon Nov 17, 2014 2:07 am (1829)


I got everything set up and got the debugger working, but I don't have it in me tonight to do more than that. See if I can do more tomorrow.

ngc_kor, posted Mon Nov 17, 2014 9:52 am (1832)


Wulf wrote:
I got everything set up and got the debugger working, but I don't have it in me tonight to do more than that. See if I can do more tomorrow.


I'm really counting on you. Thank you, Wulf.
I can't wait for tomorrow! :)


P.S. Yesterday, I got some decompiled code.
I'm not sure whether the code is correct, so it may not be much help for now, but it's better than the asm code.

Here-> http://goo.gl/343oVv

There are 3 pieces of code, starting at offsets 8013F29C, 8013F40C, and 8013FC58 respectively.
Entrypoint: 8014191C, End offset: 801419CC
Decompiled with http://decompiler.fit.vutbr.cz/decompilation-run.

Wulf, posted Mon Nov 17, 2014 5:25 pm (1835)


The biggest holdup at this moment is trying to get the Dolphin debugging environment similar enough to what I'm used to working in.

And I'll give it my best shot, but this is still pretty unfamiliar ground for me. I wouldn't even want to attach a number to my probability of success with it.

Wulf, posted Tue Nov 18, 2014 2:53 pm (1840)


Didn't get a chance to work on it last night, and if I do get a chance tonight it won't be enough time to do much.

Have you made any progress yourself?

ngc_kor, posted Wed Nov 19, 2014 10:01 am (1846)


I'm looking for compression algorithms that were in use before 2002 (when the game was released), so I can find their source, check the compression ratio, and compare it with the structure of the decompiled algorithm.
Because the compression ratio is better than gzip/deflate (zlib), I'm guessing it is a combination of 2 or more algorithms (like LZ77 + Huffman + Dictionary).

I'm also checking suspected algorithms like Arithmetic Coding / Byte Pair Encoding.
Interestingly, some of the decompressed script data uses BPE compression. (Check here: http://goo.gl/5ztlRu)
It could be related to the original one, so I'm analysing it in depth too.
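
The ratio comparison can be scripted. This is a sketch that uses only the codecs shipped with Python, which are stand-ins chosen for convenience, not a list of what the game could have used. The function name and dictionary keys are made up.

```python
import bz2
import lzma
import zlib

def compression_ratios(raw: bytes, target_size: int):
    """Return compressed/raw size ratios for a few codecs plus the unknown target."""
    if not raw:
        raise ValueError("raw data must not be empty")
    sizes = {
        "zlib (level 9)": len(zlib.compress(raw, 9)),
        "bz2 (level 9)": len(bz2.compress(raw, 9)),
        "lzma": len(lzma.compress(raw)),
        "target (.cmp)": target_size,  # size of the unknown compressed file
    }
    return {name: size / len(raw) for name, size in sizes.items()}
```

Here `raw` would be the contents of a .dmp file and `target_size` the size of the matching .cmp file; a target ratio well below the zlib one is the observation made above.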

Wulf wrote:
Didn't get a chance to work on it last night, and if I do get a chance tonight it won't be enough time to do much.

Have you made any progress yourself?

ngc_kor, posted Sun Nov 23, 2014 1:58 am (1921)


Last night, by debugging, I found that the compressed data is loaded through register 28 (r28) and the decompressed bytes are stored through register 25 (r25).
With a breakpoint set at 8013fca8, the loaded data is in r28, and after breaking at 80140688, the data is stored to memory (e.g. 805ac3e0).
Um... I've almost reached the goal... right?


Anyway, I uploaded the decompiled routine graphs here.

[English ver routine 80144F48-80147E64 (Entry: 80146110)] http://goo.gl/oc78ja
[Japanese ver routine 8013F29C-801419CC (Entry: 8013Fce8)] http://goo.gl/rJlS9J
[English ver routine 80146110-80148EB4 (Entry: 80146110)] http://goo.gl/X0elVU

ngc_kor, posted Sun Nov 23, 2014 9:17 am (1930)


Uploaded debugging (register / breakpoint) data: http://goo.gl/hm2Q1z


P.S. Can anyone help me compile the decompression routine into a program?

Argonaut, posted Sun Nov 23, 2014 1:50 pm (1934)


Would you by any chance be interested in cracking another unknown compression type which would unlock the data for five games? Just a thought, no problem in taking a look:

http://forum.xentax.com/viewtopic.php?f=21&t=12133

Thanks (even if you don't check it out)

Wulf, posted Mon Nov 24, 2014 3:29 pm (1967)


ngc, I still haven't given it my best shot but I haven't been able to get Dolphin into any sort of debug setup that I'm comfortable working in. I'm not giving up yet, but if I can't figure out how to set it up how I need, I won't be able to figure it out.

Argonaut, if that was directed at me then I've got too many projects going on to take a look. If it wasn't, you'll probably have better luck creating a new topic.

edit: Just read viewtopic.php?p=1829#p1921 closer, that definitely seems like you're close. I'll take another crack at it tonight, focusing on that area.
Are you working with the US or JPN version mainly? Also, what did you use to generate those charts? They look pretty slick.

ngc_kor, posted Tue Nov 25, 2014 11:13 am (1980)


I mainly use the JPN version. I've tested the US version to compare it with the JPN version, and all of the routines and structures are the same except that the memory locations are different.
So all of the breakpoint offsets and related things I mentioned only work in the JPN version. (If you need the JPN version, PM me.)
And I created the charts manually to make them look better. I did not use a program.

Wulf, I'm wondering what you mean by setting up a comfortable debug environment.
I know that Dolphin's basic debug mode isn't a good tool,
but it's just enough for simple debugging, and there are no other options AFAIK.
Or is there another way to debug better?

Wulf wrote:
ngc, I still haven't given it my best shot but I haven't been able to get Dolphin into any sort of debug setup that I'm comfortable working in. I'm not giving up yet, but if I can't figure out how to set it up how I need, I won't be able to figure it out.

Argonaut, if that was directed at me then I've got too many projects going on to take a look. If it wasn't, you'll probably have better luck creating a new topic.

edit: Just read viewtopic.php?p=1829#p1921 closer, that definitely seems like you're close. I'll take another crack at it tonight, focusing on that area.
Are you working with the US or JPN version mainly? Also, what did you use to generate those charts? They look pretty slick.

Wulf, posted Wed Nov 26, 2014 9:58 pm (1995)


On the PS3 I'm used to having a lot more information at my disposal.

I can have one large window showing the contents of memory live, highlighting anything that changes in red. Split it vertically, have the left half showing the section of memory that it's reading from and the right showing where it's writing to. Set a breakpoint on reading the source and a breakpoint on writing to the destination, tap F5 a few times to see which registers change and which remain constant. It will also show the contents of the stack, so you can tell that you're 7 functions deep from the main game loop for example.

I suppose that goes beyond the needs of a pure debugger, but it's incredibly useful to have all the information so nicely presented.

Tonight's the night though. After my kid goes to bed I'll sit down and give it my best shot, and declare either victory or failure.

Do you know the NTSC function/memory locations for the information you posted previously? And what's on the screen when it's decrypting the files? During/after the intros, or do you need to press a key first?

Wulf, posted Thu Nov 27, 2014 2:58 am (1999)


Well, I'm gonna have to give up on solving this directly. I'm just out of my element with GC/Wii emulation and things aren't clicking together for me.

On your chart, which variables can you identify? Are any of them constant through every run? At what point is the variable encrypted, and at what point is it fully decrypted? If you edit the memory so that the encrypted data is entirely zeros, does the decrypted data have any sort of repeating pattern to it? With 1s? With 2s? If you replace half with 0s and half with 1s, does the decrypted data switch to the 1s pattern at exactly the point the encrypted data did?

If you change the first byte, does the entire decrypted block become corrupted? If you change the middle byte? Does a corrupt byte cause a predictable decrypted corruption?
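
A sketch of how those probe inputs and comparisons could be generated offline. All names are made up, and getting the probes through the game's routine (for example by patching memory in the emulator) is left out, since that part depends on the setup.

```python
def make_probes(length: int):
    """Build the patterned inputs suggested above: all 0s, all 1s, all 2s, half/half."""
    half = length // 2
    return {
        "zeros": bytes(length),
        "ones": b"\x01" * length,
        "twos": b"\x02" * length,
        "half_zeros_half_ones": bytes(half) + b"\x01" * (length - half),
    }

def flip_byte(data: bytes, index: int) -> bytes:
    """Return a copy of `data` with one byte inverted (a single corrupted byte)."""
    out = bytearray(data)
    out[index] ^= 0xFF
    return bytes(out)

def diff_summary(a: bytes, b: bytes):
    """Return (first differing offset or None, number of differing positions)."""
    n = min(len(a), len(b))
    diffs = [i for i in range(n) if a[i] != b[i]]
    extra = abs(len(a) - len(b))  # a length mismatch counts as differences too
    first = diffs[0] if diffs else (n if extra else None)
    return first, len(diffs) + extra
```

If flipping one input byte changes only a few output bytes near the same position, a simple byte-wise transform is likely. If everything after that point changes, the format carries state forward, as LZ or entropy coders do.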

Wulf, posted Mon Dec 01, 2014 2:33 pm (2073)


I guess you've given up on this topic by now. Sorry I wasn't able to help, and good luck figuring it out.
This topic is now closed to further replies.
