BetterDays
Posted October 24, 2024

Hey,

I'm looking into building a reliable databin (un)packer for Ninja Gaiden Sigma (and later 2 and 3). Reverse engineering the databin file itself was fairly straightforward. It doesn't, however, contain any names for the archived content.

Looking at the executable, a complete name table exists in the PE .rdata section. Being .rdata, though, it is interspersed with unrelated content. Manually removing that yields the correct number of entries (6515, as per the archive header), but I have concerns over ordering and would prefer to automate this process.

Does anyone have any experience with such a situation? How are the strings in the .rdata section actually referenced?

Thanks in advance
BetterDays (Author)
Posted October 25, 2024 (edited October 25, 2024 by BetterDays)

I've managed to complete the content unpacking and repacking, and determined content-type hints for the majority of the file types. Additionally, I've got the algorithm for Index -> Name working; all I need now are the actual file names.

I've located the array in the exe's .data section, which references the string data in the .rdata section in the correct order. Not being especially familiar with disassembly or the PE format: does anyone know how I can calculate the file offset of a .rdata entry from the address present in the .data section (as displayed in the hex view)? I'm essentially looking to dump ordered names from the two respective file tables.
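For readers with the same question, the usual PE arithmetic is: subtract the image base from the stored pointer to get an RVA, find the section whose range contains that RVA, then add the difference between that section's PointerToRawData and VirtualAddress. The following is a minimal sketch of that calculation, not the code used in this thread. `SectionInfo` and `vaToFileOffset` are hypothetical names, and the image base and section values must be read from the executable's own PE headers.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, trimmed-down view of one PE section header entry.
struct SectionInfo {
    uint32_t virtual_address;  // RVA of the section when loaded (VirtualAddress)
    uint32_t raw_pointer;      // file offset of the section data (PointerToRawData)
    uint32_t raw_size;         // number of bytes stored on disk (SizeOfRawData)
};

// Convert an absolute virtual address, as stored in a pointer table,
// to an offset into the file. Returns nullopt when the address is not
// backed by file data in any of the given sections.
std::optional<uint64_t> vaToFileOffset(uint64_t va, uint64_t image_base,
                                       const std::vector<SectionInfo>& sections) {
    if (va < image_base) return std::nullopt;
    const uint64_t rva = va - image_base;
    for (const SectionInfo& s : sections) {
        if (rva >= s.virtual_address && rva - s.virtual_address < s.raw_size) {
            return rva - s.virtual_address + s.raw_pointer;
        }
    }
    return std::nullopt;
}
```

With an assumed image base of 0x140000000 and a section at RVA 0x2000 stored at file offset 0xC00, the pointer 0x140002010 maps to file offset 0xC10.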
BetterDays (Author, Solution)
Posted October 26, 2024

Success. Full name tables dumped for NG1-3, with a reasonable degree of automation.

I ultimately ended up parsing the EXE's PE headers, and then doing something like this:

```cpp
// Translate an address inside .rdata to a file offset by applying the
// difference between the section's raw (on-disk) and virtual addresses.
inline uint64_t NameTableExtractor::calculateRDataAddress(const uint32_t relative_to_data) const
{
    return relative_to_data + (pe_info.rdata_raw_addr - pe_info.rdata_virt_addr);
}

// Same translation for an address inside .data.
inline uint64_t NameTableExtractor::calculateDataAddress(const uint32_t relative_to_text) const
{
    return relative_to_text + (pe_info.data_raw_addr - pe_info.data_virt_addr);
}

// Read the low three bytes (little-endian) of an 8-byte pointer entry;
// the remaining five bytes are ignored.
inline uint64_t NameTableExtractor::extractOffsetDQAddress(uint8_t op[8]) const
{
    return (op[2] << 16) | (op[1] << 8) | op[0];
}
```

The dumper just scanned the .rdata section, finding strings that conformed to the file name convention. These were then correlated, by address, with their reference locations in the data sections. Entries were clustered and indexed by their reference location, and a second, more permissive name-matching pass was run on gaps within clusters to make sure any paths that didn't contain a separator were filled in.

Still a lot left to do. I've got preliminary big-endian support for unpacking archives from console versions, and identified all content-type hints within the archive entries. A lot of the art assets share a similar header structure, and I'm pretty far along with the overall structure of those file types. Given the popularity of some of the formats, it shouldn't take too long to get decent bidirectional conversion done. Then it's just a matter of handling game-specific data, like movesets.
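The scan-and-correlate step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the dumper from this thread: `looksLikeName`, `scanStrings` and `orderByReference` are hypothetical names, the "contains a path separator" rule merely stands in for the real file name convention, and the clustering and second permissive pass are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumed naming convention: a candidate must contain a path separator.
inline bool looksLikeName(const std::string& s) {
    return s.find('/') != std::string::npos || s.find('\\') != std::string::npos;
}

// Collect NUL-terminated printable ASCII runs from a section, keyed by RVA.
std::map<uint32_t, std::string> scanStrings(const std::vector<uint8_t>& section,
                                            uint32_t section_rva) {
    std::map<uint32_t, std::string> found;
    std::size_t start = 0;
    for (std::size_t i = 0; i < section.size(); ++i) {
        const uint8_t c = section[i];
        if (c == 0) {
            if (i > start) {
                std::string s(reinterpret_cast<const char*>(&section[start]), i - start);
                if (looksLikeName(s)) {
                    found.emplace(static_cast<uint32_t>(section_rva + start), s);
                }
            }
            start = i + 1;
        } else if (c < 0x20 || c > 0x7E) {
            start = i + 1;  // non-printable byte: restart the run after it
        }
    }
    return found;
}

// Order names by where they are referenced, not by where the strings live.
// Each ref is (location of the pointer in .data, RVA the pointer targets).
std::vector<std::string> orderByReference(std::vector<std::pair<uint32_t, uint32_t>> refs,
                                          const std::map<uint32_t, std::string>& strings) {
    std::sort(refs.begin(), refs.end());
    std::vector<std::string> names;
    for (const auto& ref : refs) {
        const auto it = strings.find(ref.second);
        if (it != strings.end()) names.push_back(it->second);
    }
    return names;
}
```

Sorting by the pointer's location rather than the string's address is what addresses the ordering concern from the first post: the strings may sit in .rdata in any order, while the table in .data lists them by archive index.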