
Shin'en GAX Sound Engine - .ghx2 file format



For a while now I have been trying to reverse-engineer the GAX Sound Engine .ghx2 project file format, and I have hit a roadblock regarding the compression method used for the sequence data.

From what I know, the format was used with:

  • The GAX editor program (GHX2), which was used to create and export the tracks. A screenshot of this program (taken by an employee at Shin'en) was uploaded by ItsT3K on Twitter (now X). In the screenshot, the program is working with one of these .ghx2 files (in this case, Iridion 24.ghx2).
  • GAXPlay, a program that reads these files and plays them back (with mix-rate adjustment and GBA speaker emulation).


I will be drawing on my GitHub documentation plus some new things I found.
This is what I have figured out so far:

  • The file is little-endian. Looking through GAX.ocx in Ghidra leads me to think it is written using MFC's CArchive serialization.
  • The file uses a RIFF-like chunk structure, though it's not actually RIFF, since it's missing the RIFF header. Certain tags in the LZO-compressed data stand out to me as similar to tags in a RIFF-formatted file: things like "AUTH", "SONG", "EXP", "INST", and "WAVE".
  • The files start with the file magic "GHXC". After this come 2 dwords that seem to hold the uncompressed data size and/or a checksum(?). I suspect a checksum because when I tried manually modifying the LZO data (e.g. changing one letter in a song's name), GAXPlay gave me an error message.
  • The compressed data starts with the ASCII string "LZOD", which suggests the compression scheme is LZO or something equivalent.
  • The compressed data size is stored as a dword. After this is an unknown uint16 value. Across the many files I have looked at, this value is 0x1900, 0x1500, etc.
  • The file ends with a 5-byte footer: 0x0000110000.
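To make the notes above easier to check against real files, here is a minimal Python sketch of a container parser. The field order and offsets are assumptions on my part (magic, two dwords, "LZOD", compressed size, uint16, payload, footer); the names `Ghx2Container`, `parse_ghx2` and `checksum_candidates` are hypothetical, and the checksum algorithms tried are just common guesses, not something confirmed from GAX.ocx.

```python
import struct
import zlib
from typing import NamedTuple

FOOTER_LEN = 5      # the observed 5-byte footer
HEADER_LEN = 22     # 4 + 4 + 4 + 4 + 4 + 2 bytes, under the ASSUMED layout


class Ghx2Container(NamedTuple):
    dword_a: int          # uncompressed size or checksum (unconfirmed)
    dword_b: int          # uncompressed size or checksum (unconfirmed)
    compressed_size: int  # dword after "LZOD" (assumed position)
    unknown_u16: int      # purpose unknown
    payload: bytes        # everything between the header and the footer
    footer: bytes


def parse_ghx2(data: bytes) -> Ghx2Container:
    """Split a .ghx2 file into fields, assuming the layout described above."""
    if len(data) < HEADER_LEN + FOOTER_LEN:
        raise ValueError("file too short for the assumed layout")
    if data[:4] != b"GHXC":
        raise ValueError("missing GHXC magic")
    # All integers are read as little-endian, per the notes.
    dword_a, dword_b = struct.unpack_from("<II", data, 4)
    if data[12:16] != b"LZOD":
        raise ValueError("LZOD tag not at assumed offset 12")
    compressed_size, unknown_u16 = struct.unpack_from("<IH", data, 16)
    payload = data[HEADER_LEN:len(data) - FOOTER_LEN]
    footer = data[-FOOTER_LEN:]
    return Ghx2Container(dword_a, dword_b, compressed_size,
                         unknown_u16, payload, footer)


def checksum_candidates(blob: bytes) -> dict:
    """Common 32-bit checksums to compare against the two header dwords."""
    return {
        "crc32": zlib.crc32(blob),
        "adler32": zlib.adler32(blob),
        "sum32": sum(blob) & 0xFFFFFFFF,
    }
```

Running `parse_ghx2` over every sample and comparing `len(payload)` with `compressed_size` would confirm or refute the assumed offsets. If neither header dword matches any value from `checksum_candidates(payload)`, the checksum may instead cover the decompressed data, which can only be tested once decompression works.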

The compression used may be a variant of the LZO algorithm that hasn't been documented yet, or at least I don't think it has been. The especially odd thing about this is that the song strings are mostly left intact, but some of them are partially garbled due to the compression. For example, "Beat the stage boss fanfare" becomes "Beat  Ñs#ðí boss fanfar  ".
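For what it's worth, partially readable strings are what any LZ-family compressor produces: bytes seen for the first time are stored as literals, and repeated substrings are replaced by short binary back-references. The sketch below demonstrates this with a toy token format I made up for illustration. It is not LZO and not the .ghx2 encoding; `toy_compress` and `toy_decompress` are hypothetical names.

```python
def toy_compress(data: bytes, min_match: int = 3) -> bytes:
    """Greedy LZ77-style packer using a made-up token format (NOT LZO).

    Token < 0x80 : that many literal bytes follow.
    Token >= 0x80: match of (token & 0x7F) + min_match bytes, followed by
                   one byte giving the backwards distance (1..255).
    """
    out = bytearray()
    literals = bytearray()

    def flush() -> None:
        # Emit pending literals in runs of at most 127 bytes.
        for start in range(0, len(literals), 127):
            chunk = literals[start:start + 127]
            out.append(len(chunk))
            out.extend(chunk)
        literals.clear()

    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for dist in range(1, min(i, 255) + 1):
            length = 0
            while (i + length < len(data) and length < 127 + min_match
                   and data[i + length - dist] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, dist
        if best_len >= min_match:
            flush()
            out.append(0x80 | (best_len - min_match))
            out.append(best_dist)
            i += best_len
        else:
            literals.append(data[i])
            i += 1
    flush()
    return bytes(out)


def toy_decompress(blob: bytes, min_match: int = 3) -> bytes:
    """Inverse of toy_compress."""
    out = bytearray()
    i = 0
    while i < len(blob):
        token = blob[i]
        i += 1
        if token & 0x80:
            length = (token & 0x7F) + min_match
            dist = blob[i]
            i += 1
            for _ in range(length):
                out.append(out[-dist])  # byte-wise copy handles overlaps
        else:
            out.extend(blob[i:i + token])
            i += token
    return bytes(out)


names = b"Beat the stage boss fanfare\x00Beat the final boss fanfare\x00"
packed = toy_compress(names)
print(packed)  # first name is readable; the second keeps only "final"
```

In the packed output the second name survives only as the new word "final", with the shared parts turned into binary tokens. So the garbling on its own does not have to mean an undocumented variant; a standard LZO1X-style stream would look similar in a hex editor.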

The song data is partially documented, but only in bits and pieces, since the compression makes it difficult for me to line things up.
I used QuickBMS to try to recover the uncompressed data, specifically with the comtype_scan2.bms script, but none of the test outputs seem to be correct: the samples are still garbled to hell and back and none of the data seems to make sense.

Regarding the compression, I cannot figure out which compression type (or LZO variant) these files use. Any pointers or tips are greatly appreciated.

 

Quick edit: sample files in this format are attached below.

ghx2.zip

Edited by beanieaxolotl
Grammatical errors fixed