Ninja Gaiden Sigma - Archive name table extraction


BetterDays

Hey, I'm looking into building a reliable databin (un)packer for Ninja Gaiden Sigma (and later 2 & 3). Reverse engineering the databin file itself was fairly straightforward. It doesn't, however, contain any names for the archived content. Looking at the executable, a complete name table exists in the PE .rdata section. Being .rdata, though, it is interspersed with unrelated content. Manually removing that yields the correct number of entries (6515, as per the archive header), but I have concerns about ordering and would prefer to automate the process.

Does anyone have any experience with such a situation? How are the strings in .rdata actually referenced?

Thanks in advance

I've managed to complete the content unpacking and repacking, and determined content-type hints for the majority of the file types. Additionally, I've got the algorithm for Index -> Name working; all I need now are the actual file names. I've located the array in the exe's .data section, which references the string data in the .rdata section in the correct order.

I'm not especially familiar with disassembly or the PE format. Does anyone know how to calculate the file offset of a .rdata entry from the address stored in the .data section (as displayed in the hex view)? I'm essentially looking to dump the ordered names from the two respective file tables.

ida.png

Edited by BetterDays

  • Solution

Success. Full name tables dumped for NG1-3, with a reasonable degree of automation. I ultimately ended up parsing the EXEs' PE headers, and then doing something like this:

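// Map an address that points into .rdata to its file offset by applying the
// section's raw-minus-virtual delta.
inline uint64_t NameTableExtractor::calculateRDataAddress(const uint32_t relative_to_data) const
{
    return relative_to_data + (pe_info.rdata_raw_addr - pe_info.rdata_virt_addr);
}

// Same translation for an address that points into .data.
inline uint64_t NameTableExtractor::calculateDataAddress(const uint32_t relative_to_text) const
{
    return relative_to_text + (pe_info.data_raw_addr - pe_info.data_virt_addr);
}

// Combine the low three bytes of a little-endian 8-byte (DQ) table entry;
// the upper bytes are ignored.
inline uint64_t NameTableExtractor::extractOffsetDQAddress(uint8_t op[8]) const
{
    return (op[2] << 16) | (op[1] << 8) | op[0];
}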
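
For anyone wanting to reproduce the header-parsing step, here is a minimal sketch of reading the image base and section table from an in-memory copy of the EXE, then translating a full virtual address to a file offset. It is not the actual NameTableExtractor code: SectionInfo, PeLayout, readLE, parsePeLayout and vaToFileOffset are hypothetical names, and the field offsets are the standard PE/COFF ones for PE32 and PE32+ images.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, trimmed-down view of a PE section header.
struct SectionInfo {
    std::string name;    // e.g. ".rdata"
    uint32_t virt_addr;  // VirtualAddress (RVA once loaded)
    uint32_t raw_addr;   // PointerToRawData (offset in the file)
    uint32_t raw_size;   // SizeOfRawData
};

struct PeLayout {
    uint64_t image_base = 0;
    std::vector<SectionInfo> sections;
};

// Read an n-byte little-endian unsigned integer; the caller checks bounds.
static uint64_t readLE(const std::vector<uint8_t>& buf, size_t off, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; ++i)
        v |= static_cast<uint64_t>(buf[off + i]) << (8 * i);
    return v;
}

// Pull the image base and section table out of an in-memory copy of the EXE.
std::optional<PeLayout> parsePeLayout(const std::vector<uint8_t>& buf)
{
    if (buf.size() < 0x40 || buf[0] != 'M' || buf[1] != 'Z') return std::nullopt;
    const size_t pe = static_cast<size_t>(readLE(buf, 0x3C, 4));  // e_lfanew
    if (pe > buf.size() || buf.size() - pe < 24 + 32) return std::nullopt;
    if (std::memcmp(&buf[pe], "PE\0\0", 4) != 0) return std::nullopt;

    const size_t num_sections = static_cast<size_t>(readLE(buf, pe + 6, 2));
    const size_t opt_size = static_cast<size_t>(readLE(buf, pe + 20, 2));
    const size_t opt = pe + 24;  // start of the optional header
    const uint64_t magic = readLE(buf, opt, 2);

    PeLayout out;
    if (magic == 0x20B)      out.image_base = readLE(buf, opt + 24, 8);  // PE32+
    else if (magic == 0x10B) out.image_base = readLE(buf, opt + 28, 4);  // PE32
    else return std::nullopt;

    const size_t table = opt + opt_size;  // section headers, 40 bytes each
    if (table > buf.size() || (buf.size() - table) / 40 < num_sections) return std::nullopt;
    for (size_t i = 0; i < num_sections; ++i) {
        const size_t e = table + i * 40;
        std::string name(reinterpret_cast<const char*>(&buf[e]), 8);
        name = name.substr(0, name.find('\0'));  // names are NUL-padded to 8 bytes
        out.sections.push_back({name,
                                static_cast<uint32_t>(readLE(buf, e + 12, 4)),
                                static_cast<uint32_t>(readLE(buf, e + 20, 4)),
                                static_cast<uint32_t>(readLE(buf, e + 16, 4))});
    }
    return out;
}

// Translate a virtual address to a file offset, or nullopt if no section's
// raw data covers it.
std::optional<uint64_t> vaToFileOffset(uint64_t va, const PeLayout& pe)
{
    if (va < pe.image_base) return std::nullopt;
    const uint64_t rva = va - pe.image_base;
    for (const SectionInfo& s : pe.sections)
        if (rva >= s.virt_addr && rva - s.virt_addr < s.raw_size)
            return rva - s.virt_addr + s.raw_addr;
    return std::nullopt;
}
```

Looking the section up by range, rather than hard-coding .rdata or .data, means the same helper works for pointers into either section, at the cost of a linear scan over a handful of headers.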

The dumper just scanned the .rdata section, finding strings that conformed to the file name convention. These were then correlated, by address, with their reference locations in the data sections. Entries were clustered and indexed by their reference location, and a second, more permissive name-matching pass was run on gaps within clusters to make sure any paths that didn't contain a separator were filled in.
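
A rough sketch of the scan-and-correlate idea, for illustration only. It assumes the name convention is "printable ASCII with a path separator and an extension" and that the table holds 8-byte little-endian pointers; both are guesses, as are the names looksLikeAssetPath, scanStrings and orderByReference. The clustering and the second, more permissive pass are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Assumed convention: contains a path separator and ends in ".ext".
static bool looksLikeAssetPath(const std::string& s)
{
    if (s.size() < 4) return false;
    const bool has_sep = s.find('/') != std::string::npos ||
                         s.find('\\') != std::string::npos;
    const size_t dot = s.rfind('.');
    return has_sep && dot != std::string::npos && dot + 1 < s.size();
}

// Collect NUL-terminated printable strings from the raw .rdata bytes,
// keyed by their offset within the section.
std::map<uint32_t, std::string> scanStrings(const std::vector<uint8_t>& rdata)
{
    std::map<uint32_t, std::string> found;
    size_t start = 0;
    for (size_t i = 0; i < rdata.size(); ++i) {
        const uint8_t c = rdata[i];
        if (c >= 0x20 && c < 0x7F) continue;  // still inside a printable run
        if (c == 0 && i > start) {
            std::string s(reinterpret_cast<const char*>(rdata.data() + start), i - start);
            if (looksLikeAssetPath(s)) found.emplace(static_cast<uint32_t>(start), s);
        }
        start = i + 1;  // run ended; the next candidate starts after this byte
    }
    return found;
}

// Walk `data` as an array of 8-byte little-endian pointers and return the
// matched names in table order. `rdata_va` is image base + .rdata VirtualAddress.
std::vector<std::string> orderByReference(const std::vector<uint8_t>& data,
                                          const std::map<uint32_t, std::string>& strings,
                                          uint64_t rdata_va)
{
    std::vector<std::string> ordered;
    for (size_t off = 0; off + 8 <= data.size(); off += 8) {
        uint64_t va = 0;
        for (size_t b = 0; b < 8; ++b)
            va |= static_cast<uint64_t>(data[off + b]) << (8 * b);
        if (va < rdata_va || va - rdata_va > UINT32_MAX) continue;  // not a .rdata pointer
        const auto it = strings.find(static_cast<uint32_t>(va - rdata_va));
        if (it != strings.end()) ordered.push_back(it->second);
    }
    return ordered;
}
```

Because the output follows the pointer table rather than the string layout, the names come out in reference order even when the strings are stored in a different order in .rdata.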

Still a lot left to do. I've got preliminary big-endian support for unpacking archives from the console versions, and have identified all content-type hints within the archive entries. A lot of the art assets share a similar header structure, and I'm pretty far along with the overall structure of those file types. Given the popularity of some of the formats, it shouldn't take too long to get decent bidirectional conversion done. Then it's just a matter of handling game-specific data, like movesets.

archiver.png
