  • Members
Posted

Author: Myro
Scope: Educational & research-oriented reverse engineering
Status: Ongoing research

 

Introduction:

  • This thread documents an ongoing technical research project focused on analyzing the MPK container system used by Where Winds Meet (WWM).
    The goal is to understand the structure, behavior, and data flow of the game’s asset containers from a reverse-engineering and archival perspective, strictly for educational and analytical purposes.

    No proprietary assets, binaries, or tools will be distributed as part of this work.

1. Patch*#.mpk vs Resource*#.mpk — Structural Similarity, Semantic Differences

At a container level, Patch*#.mpk and Resource*#.mpk share the same base format:

  • Indexed via mpkinfo / mpkdb
  • Entries contain: 
    • FileID
    • Size
    • Offset
    • MpkIndex
  • Use the same compression stack:
    • LZ4 / ZSTD / LZMA
    • EZST (AES-encrypted)
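To make the entry fields and the compression stack above concrete, here is a minimal Python sketch of an index record plus a first-bytes codec check. The Zstandard and LZ4 frame magics are the public ones; treating EZST as a 4-byte ASCII tag and the `IndexEntry` / `sniff_codec` names are assumptions for illustration, since the on-disk layout is not documented in this thread. Raw LZ4 blocks carry no magic at all, so a `None` result does not mean "uncompressed".

```python
from dataclasses import dataclass
from typing import Optional

ZSTD_MAGIC = bytes.fromhex("28b52ffd")  # Zstandard frame magic
LZ4F_MAGIC = bytes.fromhex("04224d18")  # LZ4 frame-format magic
EZST_TAG = b"EZST"                      # ASSUMPTION: EZST payloads start with this ASCII tag


@dataclass(frozen=True)
class IndexEntry:
    """One mpkinfo/mpkdb record (hypothetical in-memory form; field widths unknown)."""
    file_id: int
    size: int
    offset: int
    mpk_index: int


def sniff_codec(payload: bytes) -> Optional[str]:
    """Guess the codec from the leading bytes; None means unknown or raw data."""
    if payload.startswith(EZST_TAG):
        return "ezst"
    if payload.startswith(ZSTD_MAGIC):
        return "zstd"
    if payload.startswith(LZ4F_MAGIC):
        return "lz4-frame"
    # Legacy .lzma streams have a 13-byte header that usually starts with 0x5D.
    # This is only a weak hint, hence the question mark in the label.
    if len(payload) >= 13 and payload[0] == 0x5D:
        return "lzma-alone?"
    return None
```

A result from `sniff_codec` is only a hint; the payload still has to decompress cleanly before it is trusted.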

However, despite this structural similarity, their extraction semantics differ significantly.

 

2. Patch*#.mpk — Partial Transparency, Direct Asset Access

Patch MPKs behave primarily as delta / override containers:

  • Assets retain near-original offsets
  • Internal file structures remain largely recognizable
  • Minimal indirection compared to Resource MPKs

As a result:

  • ~99% of audio assets can be recovered successfully
  • A subset of Lua scripts becomes readable after decompression
  • Non-audio assets (including some images) are present and partially accessible

Testing on Patch3100.mpk confirmed that Patch archives are not limited to sound assets.

Conclusion:

Patch MPKs prioritize update efficiency and fast access, with limited transformation of payload data.

 

3. Resource*#.mpk — Indirection and Shared Data

Resource MPKs represent the primary asset pool and introduce several additional complexities:

  • The Size field often represents a logical or expanded size, not the physically stored payload
  • Naïve extraction pipelines tend to:
    • materialize padding and alignment blocks
    • duplicate shared data blobs
    • inflate archive size artificially (e.g. ~2 GB → tens of GB)

This behavior explains why Resource MPKs appear far more “opaque” when extracted without validation.
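One way to avoid that inflation is to deduplicate before writing anything to disk. The sketch below is only an illustration under assumptions: entries are plain `(file_id, size, offset, mpk_index)` tuples, and two entries that point at the same physical range are treated as aliases of one blob. The function names are hypothetical and not part of any existing tool.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

Entry = Tuple[int, int, int, int]  # (file_id, size, offset, mpk_index), assumed shape


def plan_extraction(entries: Iterable[Entry]) -> Dict[Tuple[int, int, int], List[int]]:
    """Group FileIDs by the physical range (mpk_index, offset, size) they reference."""
    ranges: Dict[Tuple[int, int, int], List[int]] = {}
    for file_id, size, offset, mpk_index in entries:
        ranges.setdefault((mpk_index, offset, size), []).append(file_id)
    return ranges


def dedupe_blobs(blobs: Iterable[Tuple[int, bytes]]) -> Dict[str, List[int]]:
    """Map SHA-256 digest -> FileIDs so identical payloads are written only once."""
    seen: Dict[str, List[int]] = {}
    for file_id, data in blobs:
        seen.setdefault(hashlib.sha256(data).hexdigest(), []).append(file_id)
    return seen
```

Each range from `plan_extraction` is read once, and `dedupe_blobs` then catches identical payloads stored at different offsets.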

Important observations:

  • Fixed headers such as 02 02 02 01 … frequently represent logical structures (e.g. texture containers with multiple mip blocks)
  • Headers of the form size (4 bytes) + size × 20 bytes often describe lookup tables, not standalone assets
  • Many assets are shared, aliased, or referenced indirectly
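The two header observations above can be turned into a cheap pre-filter. This is a sketch under assumptions: the 4-byte count is read as little-endian, and a lookup table is taken to be exactly `4 + count × 20` bytes long. Neither detail is confirmed in this thread, and `classify` is a hypothetical helper name.

```python
import struct

TEXTURE_MAGIC = bytes.fromhex("02020201")  # "02 02 02 01" prefix from the observations
RECORD_SIZE = 20                           # bytes per lookup-table record


def looks_like_lookup_table(blob: bytes) -> bool:
    """True if the blob is a 4-byte count followed by exactly count * 20 bytes."""
    if len(blob) < 4:
        return False
    (count,) = struct.unpack_from("<I", blob, 0)  # ASSUMPTION: little-endian count
    return count > 0 and len(blob) == 4 + count * RECORD_SIZE


def classify(blob: bytes) -> str:
    """Coarse classification used to keep non-assets out of the output folder."""
    if blob.startswith(TEXTURE_MAGIC):
        return "texture-container"
    if looks_like_lookup_table(blob):
        return "lookup-table"
    return "unknown"
```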

Conclusion:

While Patch and Resource MPKs share the same container format, they do not share the same extraction semantics. Resource MPKs require strict validation, deduplication, and asset-type verification to avoid false positives.

 

4. Audio Extraction & Tooling

Initial testing with Ravioli Game Tools proved sub-optimal for this pipeline:

  • Inconsistent parsing of Patch audio assets
  • Unreliable WAV output in multiple cases

Switching to vgmstream yielded significantly better results:

  • Correct parsing of recovered audio data
  • Clean, fully functional WAV output
  • Proper handling of codec structures

Current result:

Audio extraction and conversion are fully verified and stable.
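For reference, batch conversion through vgmstream can be scripted roughly as follows. This sketch assumes `vgmstream-cli` is on the PATH and uses its `-o` output option; the helper names and the folder layout are hypothetical, and this is not the tooling used in the research.

```python
import subprocess
from pathlib import Path
from typing import List


def vgmstream_command(src: Path, dst: Path, exe: str = "vgmstream-cli") -> List[str]:
    """Build the command line: vgmstream-cli -o <out.wav> <input>."""
    return [exe, "-o", str(dst), str(src)]


def convert_to_wav(src: Path, out_dir: Path) -> Path:
    """Decode one recovered audio file to WAV; raises CalledProcessError on failure."""
    out_dir.mkdir(parents=True, exist_ok=True)
    dst = out_dir / (src.stem + ".wav")
    subprocess.run(vgmstream_command(src, dst), check=True)
    return dst
```

Using `check=True` means a file vgmstream cannot parse raises an error instead of silently leaving a broken WAV behind.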

 

5. Resource.mpk Extraction – Current Progress

Testing on Resources49.mpk (~740.9 MB) produced the following results:

Successfully extracted:

  • Audio assets → converted to valid .wav files
    • Over 8,500 audio files identified in a single Resource MPK
  • Video files (.mp4) → fully playable
  • Lua scripts (.lua) → extracted but still encrypted / obfuscated

Unresolved:

  • Files detected as images (.png, .jpg, .bmp)
    • Headers contain the string “messiah”
    • Strongly suggests custom packing, encryption, or misclassification
    • Currently not valid image data despite file extensions
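A simple way to separate real images from misclassified files is to compare the extension against the standard file signature. The PNG, JPEG, and BMP signatures below are the well-known ones; searching the first 64 bytes for "messiah" and the function name are assumptions made for this sketch.

```python
SIGNATURES = {
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".jpg": (b"\xff\xd8\xff",),
    ".bmp": (b"BM",),  # only two bytes, so this check is weak on its own
}
MESSIAH_TAG = b"messiah"


def check_image(name: str, head: bytes) -> str:
    """Compare a file's extension with the first bytes of its content."""
    dot = name.rfind(".")
    ext = name[dot:].lower() if dot != -1 else ""
    if any(head.startswith(sig) for sig in SIGNATURES.get(ext, ())):
        return "valid-signature"
    # ASSUMPTION: the "messiah" string appears within the first 64 bytes.
    if MESSIAH_TAG in head[:64].lower():
        return "messiah-container"
    return "mismatch"
```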

6. Current Status

  • ✔ All Patch*#.mpk, Resource*#.mpk, and lt*.mpk archives can be unpacked
  • ✔ Audio assets fully recovered and validated
  • ✔ MP4 video assets fully playable
  • ✔ Lua scripts extracted but not yet readable

 

7. Next Steps

Ongoing research will focus on:

  • Reversing additional transformation layers in Resource*#.mpk
  • Identifying:
    • secondary encryption stages
    • compression chaining
    • block or stream reordering
  • Investigating the custom image container format
  • Further analysis of Lua encryption / obfuscation
  • Automating Patch vs Resource handling as two distinct pipelines
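As a rough idea of how the last point could look, the sketch below picks a pipeline from the archive file name. The patterns are guesses based on the names mentioned in this thread (Patch3100.mpk, Resources49.mpk, lt*.mpk), and `pipeline_for` is a hypothetical helper.

```python
import re

# ASSUMPTION: archive families can be told apart by file-name prefix alone.
_PATTERNS = (
    ("patch", re.compile(r"^patch.*\.mpk$", re.IGNORECASE)),
    ("resource", re.compile(r"^resources?.*\.mpk$", re.IGNORECASE)),
    ("lt", re.compile(r"^lt.*\.mpk$", re.IGNORECASE)),
)


def pipeline_for(filename: str) -> str:
    """Return the name of the extraction pipeline for an archive, or 'unknown'."""
    for name, pattern in _PATTERNS:
        if pattern.match(filename):
            return name
    return "unknown"
```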

 

 

Notes on Tooling & Distribution

 

All tools used in this project are:

  • developed entirely from scratch
  • created specifically for WWM research
  • used strictly for educational purposes

At this stage:

  • no tools, binaries, or scripts will be released
  • no proprietary assets will be redistributed

 

Closing

This thread is intended solely as a technical research and documentation log, not as a release, redistribution, or exploitation guide.

In accordance with applicable intellectual property laws and forum policies, no tools, binaries, scripts, or extracted assets will be posted or shared, either publicly or privately. All game data, assets, file formats, and related materials discussed here remain the exclusive proprietary property of NetEase and the developers of Where Winds Meet.

Any references to assets or file structures are made strictly for educational, analytical, and research purposes, with no intent to enable misuse, circumvention, or redistribution of protected content.

Should any concerns arise regarding compliance or scope, I am fully open to cooperating with forum moderation and adjusting the visibility or content of this thread accordingly.

Thank you for your understanding and for supporting responsible technical research.

— Myro

 

Below are a few illustrative screenshots and log excerpts from the current stage of the research, provided solely to contextualize the findings outlined above.

 

 

[8 attached screenshots: image.png]

Posted

Regarding Asset Names
Asset names are not stored in plain text in *.mpkinfo files; instead, only their hashes are saved.
Not all Assets seem to use the same hashing algorithm. 

Found Hashing Algorithm:

  • Custom MurmurHash3 (32-bit, x86 variant)

Implementation Details:

  • Input strings are encoded in UTF-8 before hashing.
  • Block final addition changed from "0xe6546b64" to "0xFADDAF14"

 

I've provided a python script that can hash any strings using MurmurHash3 and output the result in hexadecimal.
With this, I was able to recover ~40k filenames.

I've verified the names by manually checking random results inside the given ResourcesX.mpk for certain .lua and .wem files.
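For readers without the attachment, the variant described above can be sketched as follows. This is an independent illustration, not the attached murmur3.py: it is the standard MurmurHash3 x86 32-bit routine with the per-block additive constant made configurable. The seed of 0 and the `recover_names` helper are assumptions; the thread does not state the seed or how candidate paths are normalized.

```python
from typing import Dict, Iterable

MASK = 0xFFFFFFFF
STANDARD_BLOCK_CONST = 0xE6546B64  # constant in reference MurmurHash3
CUSTOM_BLOCK_CONST = 0xFADDAF14    # replacement constant reported in the post


def _rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK


def _mix_k(k: int) -> int:
    k = (k * 0xCC9E2D51) & MASK
    k = _rotl32(k, 15)
    return (k * 0x1B873593) & MASK


def murmur3_32(data: bytes, seed: int = 0, block_const: int = CUSTOM_BLOCK_CONST) -> int:
    """MurmurHash3 x86_32 with a configurable per-block constant (seed 0 is assumed)."""
    h = seed & MASK
    n_blocks = len(data) // 4
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        h ^= _mix_k(k)
        h = _rotl32(h, 13)
        h = (h * 5 + block_const) & MASK  # the only step that differs from the reference
    tail = data[4 * n_blocks:]
    if tail:
        h ^= _mix_k(int.from_bytes(tail, "little"))
    h ^= len(data)
    # Final avalanche (unchanged from the reference implementation).
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK
    h ^= h >> 16
    return h


def recover_names(candidates: Iterable[str], known_hashes: Iterable[int]) -> Dict[int, str]:
    """Hash UTF-8 candidate paths and keep those whose hash appears in the index."""
    known = set(known_hashes)
    found: Dict[int, str] = {}
    for path in candidates:
        h = murmur3_32(path.encode("utf-8"))
        if h in known:
            found.setdefault(h, path)  # keep the first candidate if several collide
    return found
```

Because the changed constant is only applied inside the block loop, inputs shorter than 4 bytes hash the same as with reference MurmurHash3. A match is still only a candidate: a 32-bit hash will collide across a large enough word list.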

murmur3.py partial_recovered_filenames.zip

  • Members
Posted
18 hours ago, iKasu said:

Regarding Asset Names
Asset names are not stored in plain text in *.mpkinfo files; instead, only their hashes are saved.
Not all Assets seem to use the same hashing algorithm. 

Found Hashing Algorithm:

  • Custom MurmurHash3 (32-bit, x86 variant)

Implementation Details:

  • Input strings are encoded in UTF-8 before hashing.
  • Block final addition changed from "0xe6546b64" to "0xFADDAF14"

 

I've provided a python script that can hash any strings using MurmurHash3 and output the result in hexadecimal.
With this, I was able to recover ~40k filenames.

I've verified the names by manually checking random results inside the given ResourcesX.mpk for certain .lua and .wem files.

murmur3.py 1.36 kB · 5 downloads partial_recovered_filenames.zip 1.4 MB · 2 downloads

That's nice. Good job, and thanks for sharing. I knew that only the hashes of the asset names are stored, but I didn't think it was important enough to tackle right now. Thanks again for the script and info; I will adapt it further.

Posted

I wanted to ask if you found some things in the files related to the AI chat bots from the game that run on Qwen.
First if by chance the language model runs locally, but surely it doesn't.
What I want to find is the initial "context" that each NPC has with their name, story, personality, etc. that is sent to the language model to start the conversation, if by any chance these exist inside the local files or the game just sends an identifier to a server that runs everything there.

Posted
21 hours ago, WuxiaReti said:

I wanted to ask if you found some things in the files related to the AI chat bots from the game that run on Qwen.
First if by chance the language model runs locally, but surely it doesn't.
What I want to find is the initial "context" that each NPC has with their name, story, personality, etc. that is sent to the language model to start the conversation, if by any chance these exist inside the local files or the game just sends an identifier to a server that runs everything there.

There does appear to be a local model: LocalData/Patch/AILab/middle.mnn, which looks like an Alibaba MNN (Mobile Neural Network) model.
I haven't managed to get the local model to run, but I haven't spent much time on it either. I'll definitely take another look!

The Windtail FAQ, unfortunately, doesn't run locally. Each time you talk to the FAQ, a request is sent to a web server. The request contains a short-lived token
generated by the game server itself via RPC. Here is a code snippet from my test suite:

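import json
from dataclasses import dataclass
from typing import Any, Dict, Optional

# HTTPClient and convert_to_ansi are not defined in this snippet;
# they are presumably helpers from the same test suite.

@dataclass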
class Knowledge:
    id: str
    title: str
    recommendType: str
    evaluateType: str

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'Knowledge':
        return cls(
            id=data.get('id', ''),
            title=data.get('title', ''),
            recommendType=data.get('recommendType', ''),
            evaluateType=data.get('evaluateType', '')
        )

@dataclass
class FAQData:
    knowledge: Optional[Knowledge]
    score: float
    type: str
    answer: str
    hasSensitiveWords: bool
    answerSource: str
    asyncQues: bool

    @property
    def formatted_answer(self) -> str:
        if not self.answer:
            return ""
        return convert_to_ansi(self.answer)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'FAQData':
        return cls(
            knowledge=Knowledge.from_dict(data.get('knowledge', {})) if data.get('knowledge') else None,
            score=data.get('score', 0.0),
            type=data.get('type', ''),
            answer=data.get('answer', ''),
            hasSensitiveWords=data.get('hasSensitiveWords', False),
            answerSource=data.get('answerSource', ''),
            asyncQues=data.get('asyncQues', False)
        )

@dataclass
class FAQResponse:
    code: int
    message: str
    data: Optional[FAQData]

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'FAQResponse':
        return cls(
            code=data.get('code', 0),
            message=data.get('message', ''),
            data=FAQData.from_dict(data.get('data', {})) if data.get('data') else None
        )

class WWMFAQ:
    def __init__(self):
        self.http_client = HTTPClient(headers={
            "Host": "os-sprite.gameop.easebar.com",
            "Connection": "close",
            "Content-Type": "application/json",
            "token-type": "sprite",
            "Accept-Encoding": "identity",
            "User-Agent": "curl/7.29.0",
            "token": "<SHORT_TOKEN>",  # placeholder for the short-lived token obtained via RPC
            "Accept": "*/*"
        })

    async def get_faq(self, question: str) -> Optional[FAQResponse]:
        data = {
            "source": "sprite",
            "question": question,
            "ismanual": 1,
            "customInfo": {
                "mode_level": "<YOUR_LEVEL>",
                "tracing_task": "0",
                "area_tansuo": "{2: <?>, 1: <?>}",
                "player_position": "[<POS_X>, <POS_Y>, <POS_Z>]",
                "mode": "<SINGLE_MULTIPLAYER>",
                "wulinzaoyi": "<GEAR_SCORE>",
                "space_no": "<SPACE_ID>"
            },
            "loginFrom": "sprite",
            "method": None
        }

        response = await self.http_client.post("https://os-sprite.gameop.easebar.com/sprite/api/h72naxx2gb/knowledge/get", data=json.dumps(data))
        if response:
            return FAQResponse.from_dict(json.loads(response))
        return None

 

 

Posted

Yeah, the chance that the conversations run locally is almost zero. Interesting that the FAQ grabs your position when you ask something.
That middle.mnn might be the one used to transform videos into emotes for the character, a feature that is still not available on global. Nothing seems to break if I straight up delete it.

  • 2 weeks later...
Posted

I joined a bit late. I’m currently struggling through the encoding chaos, trying to extract the .lua files. I can see that there are some scripts, but I can’t access them at the moment. Could someone provide a Python script to extract Lua files from an MPK archive? I’d really appreciate it.

  • 4 weeks later...
  • Members
Posted
On 12/20/2025 at 11:59 AM, Myro said:

Unresolved:

  • Files detected as images (.png, .jpg, .bmp)
    • Headers contain the string “messiah”
    • Strongly suggests custom packing, encryption, or misclassification
    • Currently not valid image data despite file extensions

Hi, from what I know, files with a .MESSIAH header (2E4D455353494148) are the game's 3D models. This tutorial on how to extract the models and textures talks about it. I'm not sure whether every .MESSIAH file is a model, but every model I've seen starts with .MESSIAH. They are .mesh files and can be cut out with VGMToolbox.
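The "cutting out" step can also be approximated in a few lines of Python. This sketch just locates every occurrence of the signature and slices from one hit to the next. Treating the next signature (or the end of the data) as the end of a model is a naive assumption, and the function names are hypothetical.

```python
from typing import List

MESSIAH_MAGIC = bytes.fromhex("2E4D455353494148")  # ASCII ".MESSIAH"


def find_signatures(blob: bytes, magic: bytes = MESSIAH_MAGIC) -> List[int]:
    """Return every offset at which the signature occurs."""
    offsets: List[int] = []
    pos = blob.find(magic)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(magic, pos + 1)
    return offsets


def carve(blob: bytes, magic: bytes = MESSIAH_MAGIC) -> List[bytes]:
    """Slice the blob from each signature up to the next one (naive boundaries)."""
    starts = find_signatures(blob, magic)
    ends = starts[1:] + [len(blob)]
    return [blob[s:e] for s, e in zip(starts, ends)]
```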

  • Supporter
Posted
On 1/3/2026 at 7:15 AM, iKasu said:

Regarding Asset Names
Asset names are not stored in plain text in *.mpkinfo files; instead, only their hashes are saved.
Not all Assets seem to use the same hashing algorithm. 

Found Hashing Algorithm:

  • Custom MurmurHash3 (32-bit, x86 variant)

Implementation Details:

  • Input strings are encoded in UTF-8 before hashing.
  • Block final addition changed from "0xe6546b64" to "0xFADDAF14"

 

I've provided a python script that can hash any strings using MurmurHash3 and output the result in hexadecimal.
With this, I was able to recover ~40k filenames.

I've verified the names by manually checking random results inside the given ResourcesX.mpk for certain .lua and .wem files.

murmur3.py 1.36 kB · 28 downloads partial_recovered_filenames.zip 1.4 MB · 43 downloads

File names recovered with this method are not always final: some are still uuid-style names. Even so, it helps a lot in recovering the original file names from the resource library, and eventually each 32-bit uuid is mapped to another path. That path is extracted from the repository, and the directory hierarchy is rebuilt from the type index and folder index.

However, the library files for this game differ from those of other games, and I still need to resolve hash collisions, so I'm working on making the processing faster. I'm currently collecting sample criteria for validating matches.

For example, texture data must start with 02020201. If a matched file does not start with this magic number, that indicates a hash collision.
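The magic-number check described above could be sketched like this. Only the texture rule comes from this thread; tying it to the `.Texture2D` extension is an assumption, and `is_plausible_match` is a hypothetical helper that returns None when no rule is known for an extension.

```python
from typing import Dict, Optional

# Only the texture rule is taken from the thread; other types would be added as rules are found.
EXPECTED_MAGIC: Dict[str, bytes] = {
    ".texture2d": bytes.fromhex("02020201"),  # ASSUMPTION: applies to *.Texture2D names
}


def is_plausible_match(candidate_name: str, head: bytes) -> Optional[bool]:
    """True/False when a rule exists for the extension, None when it cannot be checked."""
    dot = candidate_name.rfind(".")
    ext = candidate_name[dot:].lower() if dot != -1 else ""
    magic = EXPECTED_MAGIC.get(ext)
    if magic is None:
        return None
    # A mismatch suggests the name's hash merely collided with this entry.
    return head.startswith(magic)
```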

I've recovered the real names of about 900,000 files this way and tried mapping them to local paths.

Attached is a mapping example for 50,000 files. There are approximately 1 million files in total, so the complete mapping is too large to upload.

uuid_full_mapping.zip

  • Supporter
Posted
8 hours ago, dsags said:

I'd like to ask which package the model with bones belongs to? Or what its prefix is?

 

I think it is in the same location as the basic mesh. If the mesh has the W4B_I4B attribute, it is a mesh with skinned data. Generally, in the directory with the original file name, there will be skinned files with similar names and different suffixes.

For example:

xianhe/

xianhe/Mesh_xianhe.Mesh

xianhe/Skin_xianhe.SkinSkeleton

xianhe/xianhe_m.Texture2D

Textures, meshes, and skins sit in the same folder.

Of course, some files may occasionally be missing.
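Given restored file names, the folder layout above makes it easy to list which meshes have a skin file next to them. A minimal sketch, assuming forward-slash paths and the `.Mesh` / `.SkinSkeleton` suffixes from the example (the function names are hypothetical):

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List


def group_by_folder(paths: Iterable[str]) -> Dict[str, Dict[str, List[str]]]:
    """Map folder -> lower-cased suffix -> file names found in that folder."""
    groups: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for p in paths:
        path = PurePosixPath(p)
        groups[str(path.parent)][path.suffix.lower()].append(path.name)
    return {folder: dict(kinds) for folder, kinds in groups.items()}


def skinned_candidates(paths: Iterable[str]) -> List[str]:
    """Folders that contain both a mesh and a skin file."""
    return sorted(
        folder
        for folder, kinds in group_by_folder(paths).items()
        if ".mesh" in kinds and ".skinskeleton" in kinds
    )
```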

  • Supporter
Posted
1 hour ago, dsags said:

I still can't figure it out. I still can't merge the model and the skeleton together. Here's an example; hopefully someone can help me.

mesh.rar 43.5 MB · 0 downloads SkinSkeleton.rar 365.38 kB · 0 downloads

It is difficult to find the matching skeleton in the files you provided because you did not restore the file names. The script locates associated files by looking for files with the same name but different endings, so with hashed file names it basically cannot match them.

Second, none of the files in the skinning folder you provided is actually a skinning file.

Files that start with ParticleSystem are particles.

Files that start with C1 59 41 0D are animation skeletons, which are completely different from skinned skeletons.

Okay, now comes the hardest part.

Since this game uses a special mesh format and may use compression, you have to find a way to match the index and vertex data described in the file header.

After that, you need to reverse engineer the skin file to find the correct matrices. The most important thing is the hierarchy level, which directly affects how you connect the skeleton.
