How do I extract .zda files?
#1
Hello everyone! I'm new here! Does anyone know how to extract .zda files? They are used in Giana's Return (a Giana Sisters fangame) and contain the sprites, sound effects, music, levels and other stuff. Can anyone help me?

Here's the file : http://www.mediafire.com/file/9s076w7hzp...a.zip/file
Sorry for my grammar, I'm French and have dysphasia.
#2
the files contain zlib compressed data
[Image: 6NVirY3.png]
Can't really say much else right now but it's doable

i'll get back to this
[Image: XezHFxV.gif]
Once there was a way to get back homeward
#3
(10-09-2018, 04:01 PM)Raccoon Sam Wrote: the files contain zlib compressed data
[Image: 6NVirY3.png]
Can't really say much else right now but it's doable

i'll get back to this
Thank you so much for helping me ^^ How did you extract the files? :)

Please tell me how you extracted the files :( I spent many hours searching without finding a way.
#4
So, getting back to this.
Quote:Thank you so much for helping me ^^ How did you extract the files?
I had no idea how ZDA files work, but the first thing one should do when working with any unknown file format is to open it with a hex editor. Let's check out one of the files:
[Image: H4bbi2L.png]
Right off the bat, you can make out leveldone.xm and getready.xm and some other strings. Let's try adjusting the width of the window so we might align the data in a more readable way:
[Image: PTVjUDq.png]
Much better. Now let's do some analysis.
• The archive is a .zda file. The file's first bytes are 'ZDA'. This could be a file identifier (all other files begin with 'ZDA' too.)
• It appears there are nine filenames listed. The long after the ZDA identifier is 09 00 00 00. Is this a coincidence?
• The long after the 09000000 and just before the first filename is E0 01 00 00. Address 0x1E0 in the file looks like a first instance of "real" data instead of just filename definitions. Is this a coincidence?
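
The three observations above can be checked in code instead of by eye. This is only a minimal sketch: the function name `read_zda_header` is made up for illustration, and it assumes the 'ZDA' identifier fills one 4-byte long (three letters plus one padding byte), which the post implies but does not state outright.

```python
import struct

def read_zda_header(data: bytes):
    """Return (file_count, data_start) from the first 12 bytes of a ZDA archive."""
    # Assumption: the identifier is one 4-byte field, 'ZDA' plus a padding byte.
    # '<' means little-endian, '4s' is four raw bytes, 'I' is an unsigned 32-bit long.
    magic, file_count, data_start = struct.unpack_from("<4sII", data, 0)
    if not magic.startswith(b"ZDA"):
        raise ValueError("not a ZDA archive")
    return file_count, data_start

# Synthetic header assembled from the byte values quoted in the post
sample = b"ZDA\x00" + bytes.fromhex("09000000") + bytes.fromhex("E0010000")
print(read_zda_header(sample))  # (9, 480), and 480 == 0x1E0
```

If the count matches the number of filenames you can see and the second value points at the first non-filename data, the "coincidences" are very likely the real header layout.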

Let's lay the bytes down in a bit nicer, neater manner:
[Image: Hz2bh0x.png]

That's good, but because we have a strong suspicion (because of the 09 00 00 00 being 0x9 and E0 01 00 00 being 0x1E0) that we should read Little Endian Longs instead of single bytes, let's shuffle the bytes around a bit:
[Image: 1eLJx4R.png]
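
For anyone unsure what "Little Endian" changes in practice, here is a small sketch using the same four bytes from the header. It only uses Python's built-in `int.from_bytes`; the variable names are illustrative.

```python
raw = bytes.fromhex("E0010000")  # the bytes E0 01 00 00 as they appear in the file

little = int.from_bytes(raw, "little")  # least significant byte first
big = int.from_bytes(raw, "big")        # most significant byte first

print(hex(little))  # 0x1e0
print(hex(big))     # 0xe0010000
```

Only the little-endian reading gives an address that actually falls inside the file, which is a good reason to prefer it.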

It appears that we've got the first three longs figured out. The last thing to figure out is the three unknown values: value1, value2 and value3. Some things to notice:
• value3 is always bigger than the previous one. This suggests it might be a pointer.
• value2 is always smaller than its accompanying value1. This suggests value2 might be 'compressed filesize' and value1 might be 'decompressed filesize'
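
A sketch of how the file table could be read under these hypotheses. Several things here are assumptions rather than facts from the post: that each entry is a fixed-width, null-padded filename followed by value1, value2 and value3, and that the name field is 40 bytes wide. The 40 comes from arithmetic: if nine entries fill the space between the 12-byte header and 0x1E0, each entry is (0x1E0 - 12) / 9 = 52 bytes, and three longs take 12 of those. The names `Entry` and `read_entries` are made up for illustration.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 12   # identifier + file count + data start
NAME_SIZE = 40     # assumption: 52 bytes per entry minus three 4-byte longs
ENTRY = struct.Struct(f"<{NAME_SIZE}sIII")

class Entry(NamedTuple):
    name: str
    value1: int  # suspected decompressed size
    value2: int  # suspected compressed size
    value3: int  # suspected pointer

def read_entries(data: bytes, file_count: int):
    """Read file_count fixed-size entries that follow the 12-byte header."""
    entries = []
    for i in range(file_count):
        raw_name, v1, v2, v3 = ENTRY.unpack_from(data, HEADER_SIZE + i * ENTRY.size)
        # Keep only the bytes before the first null terminator
        name = raw_name.split(b"\x00", 1)[0].decode("ascii", errors="replace")
        entries.append(Entry(name, v1, v2, v3))
    return entries

# Synthetic one-entry table. The name and value1 are placeholders, not real data;
# 0x49A9 is the first compressed size mentioned in the post.
name_field = b"example.bin".ljust(NAME_SIZE, b"\x00")
sample = bytes(HEADER_SIZE) + name_field + struct.pack("<III", 0x6000, 0x49A9, 0)
entries = read_entries(sample, 1)
print(entries[0])
```

Printing all entries of a real archive side by side makes patterns like "value3 keeps growing" easy to confirm or rule out.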

So let's check 0x1E0 again, because that's where our data starts. To test our hypothesis about value2 being the compressed filesize, let's select exactly 0x49A9 bytes:
[Image: 30DkVUw.png]
Very interesting. Let's try selecting 0x46CE bytes exactly after it, as instructed by the second header.
[Image: UBB8Bnj.png]

I think I get it now! Do you notice anything special?
...
That's right! Every piece of data we've looked at so far begins with $78 (or 'x' in ASCII). That's a telltale sign of zlib compressed data. I knew beforehand that zlib compressed data nearly always begins with $78 (the header byte for deflate with the default 32 KB window), but you would've come to the same conclusion had you googled enough.
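
The $78 check can be made a little stricter. In the zlib format the first byte holds the compression method (8 for deflate) and the window size, and the first two bytes read as a big-endian 16-bit number must be a multiple of 31. A small sketch follows; `looks_like_zlib` is a hypothetical helper name, not part of any library.

```python
import zlib

def looks_like_zlib(chunk: bytes) -> bool:
    """Heuristic test for a zlib stream header (it does not validate the whole stream)."""
    if len(chunk) < 2:
        return False
    cmf, flg = chunk[0], chunk[1]
    is_deflate = (cmf & 0x0F) == 8        # low nibble: compression method
    window_ok = (cmf >> 4) <= 7           # high nibble: window size exponent minus 8
    check_ok = (cmf * 256 + flg) % 31 == 0
    return is_deflate and window_ok and check_ok

packed = zlib.compress(b"Giana " * 100)
print(hex(packed[0]), looks_like_zlib(packed))  # 0x78 True
print(looks_like_zlib(b"BM" + bytes(10)))       # False
```

Running this on the bytes at each suspected file address is a quick way to confirm the pointers before trying to decompress anything.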

But yeah anyway now we know*
• How many files there are inside a ZDA package
• Their filenames
• Their filesizes
• Their addresses
• Their compression method
*) or at least have a pretty good hunch

So let's get to decompression. I had never worked with actually decompressing zlib data before so I googled a bit and found out that Python has zlib decompression methods built in. So all in all writing the decompressor was pretty easy - here's the script I wrote.
LINK
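
For readers who can't reach the link, here is a minimal sketch of the decompression step. It is not the script linked above, just an illustration under the same hypotheses. The helper name `extract_blob` is made up, and the archive is a synthetic stand-in with two blobs stored back to back after 0x1E0 bytes of header space. Whether value3 in a real archive is an absolute address or relative to the data start is something you would need to check against a real file.

```python
import zlib

def extract_blob(archive: bytes, offset: int, compressed_size: int, expected_size: int) -> bytes:
    """Decompress one zlib blob and check it against the size stored in the header."""
    blob = archive[offset:offset + compressed_size]
    payload = zlib.decompress(blob)
    if len(payload) != expected_size:
        raise ValueError(f"expected {expected_size} bytes, got {len(payload)}")
    return payload

# Synthetic stand-in for an archive: empty header area, then two compressed blobs
first, second = b"A" * 1000, b"B" * 2000
z1, z2 = zlib.compress(first), zlib.compress(second)
archive = bytes(0x1E0) + z1 + z2

out1 = extract_blob(archive, 0x1E0, len(z1), len(first))
out2 = extract_blob(archive, 0x1E0 + len(z1), len(z2), len(second))
print(len(out1), len(out2))  # 1000 2000
```

The size check doubles as a test of the "value1 is the decompressed size" hunch: if it never raises on a real archive, the hunch holds.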

After analysing the resulting files, I also concluded that my hunch about the "decompressed size" value was correct.

So, what next? There still appears to be some kind of compression/encryption in the files, as the XM and WAV files don't work. The BMP files aren't legitimate either. What's going on...?

tbh no clue, it looks like some kind of pseudo-BMP. Variable bit depths, reverse row order, all kinds of typical BMP stuff.
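
One way to pin down how far these files stray from a normal BMP is to read the fields a standard Windows BMP would have and see which ones make sense. The sketch below assumes the standard layout (a 14-byte file header followed by a 40-byte BITMAPINFOHEADER); the game's files may well differ, which is exactly what the output would reveal. `describe_bmp` is a hypothetical helper and the sample is synthetic.

```python
import struct

def describe_bmp(data: bytes) -> dict:
    """Summarise the header fields of a standard Windows BMP."""
    if len(data) < 34:
        raise ValueError("too short to hold a BMP header")
    # File header: 'BM', file size, two reserved shorts, offset of the pixel data
    magic, file_size, _r1, _r2, pixel_offset = struct.unpack_from("<2sIHHI", data, 0)
    # Start of the info header: its size, width, height, planes, bits per pixel, compression
    header_size, width, height, _planes, bpp, compression = struct.unpack_from("<IiiHHI", data, 14)
    return {
        "has_bm_magic": magic == b"BM",
        "size_field_matches": file_size == len(data),
        "pixel_offset": pixel_offset,
        "header_size": header_size,
        "width": width,
        "height": abs(height),
        "bottom_up": height > 0,  # positive height means rows are stored bottom to top
        "bits_per_pixel": bpp,
        "compression": compression,
    }

# Synthetic 4x4, 8-bit BMP with an all-black palette, just to exercise the parser
pixels = bytes(4 * 4)
palette = bytes(256 * 4)
info = struct.pack("<IiiHHIIiiII", 40, 4, 4, 1, 8, 0, len(pixels), 2835, 2835, 256, 0)
pixel_offset = 14 + len(info) + len(palette)
file_header = struct.pack("<2sIHHI", b"BM", pixel_offset + len(pixels), 0, 0, pixel_offset)
sample = file_header + info + palette + pixels

report = describe_bmp(sample)
print(report)
```

Comparing this report for a decompressed game file against one for a known-good BMP should show which fields are off.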

Here are all the decompressed files: https://www.dropbox.com/sh/o016vwmxl3foy...xqWPa?dl=0

If we've got some kind of in-house BMP expert, maybe they can take a look. I'm out of ideas, sorry.
Thanked by: GianaSistersFan64
#5
(10-14-2018, 10:59 AM)Raccoon Sam Wrote: So, getting back to this.

Well again, thank you so much :) I guess my dream came true ^^... Anyway, we can open the BMPs in GGD. Sadly the original palette isn't there, and it gives a bad palette that makes the picture look like it was saved as a JPG. I don't know what else to say about it, but I guess I must do it myself :( But don't worry, it's fine.

By the way, what is the .py for?


Thanked by: Raccoon Sam
#6
Quote:Well again, thank you so much. I guess my dream came true ^^... Anyway, we can open the BMPs in GGD.
It seems like the best choice as of now. If not for ripping, at least for analysis.
Quote:Sadly the original palette isn't there, and it gives a bad palette that makes the picture look like it was saved as a JPG. I don't know what else to say about it, but I guess I must do it myself :( But don't worry, it's fine.
The palettes have got to be _somewhere_, I figure. Some images probably don't have a palette at all, being direct color images, as in the color data is in the pixel itself.
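
For comparison, this is where the palette lives in a standard BMP: a color table of 4-byte blue, green, red, reserved entries placed right after the info header, present only when the bit depth is 8 or lower. The sketch assumes that standard layout with a BITMAPINFOHEADER of at least 40 bytes, so it may not apply to the game's files as they are. `read_palette` is a hypothetical helper and the sample is synthetic.

```python
import struct

def read_palette(data: bytes) -> list:
    """Return the color table of a standard BMP as (r, g, b) tuples."""
    (header_size,) = struct.unpack_from("<I", data, 14)
    if header_size < 40:
        raise ValueError("older BMP header variant, not handled in this sketch")
    (bpp,) = struct.unpack_from("<H", data, 28)
    (colors_used,) = struct.unpack_from("<I", data, 46)
    if bpp > 8:
        return []  # direct color: each pixel carries its own color, no table
    count = colors_used or (1 << bpp)  # 0 means "the full table for this bit depth"
    start = 14 + header_size
    palette = []
    for i in range(count):
        b, g, r, _reserved = struct.unpack_from("<4B", data, start + 4 * i)
        palette.append((r, g, b))
    return palette

# Synthetic 1x1, 8-bit BMP with a two-entry palette: red, then blue
table = bytes([0, 0, 255, 0, 255, 0, 0, 0])  # stored as B, G, R, reserved
pixels = bytes(4)                            # one row, padded to 4 bytes
info = struct.pack("<IiiHHIIiiII", 40, 1, 1, 1, 8, 0, len(pixels), 2835, 2835, 2, 0)
offset = 14 + len(info) + len(table)
sample = struct.pack("<2sIHHI", b"BM", offset + len(pixels), 0, 0, offset) + info + table + pixels

print(read_palette(sample))  # [(255, 0, 0), (0, 0, 255)]
```

If the colors read this way look wrong for a game file, the table is probably stored somewhere else or in a different byte order.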
Quote:By the way, what is the .py for? 
It is a Python script I wrote that handles the decompression of the ZDA files.
#7
(10-15-2018, 01:46 AM)Raccoon Sam Wrote:
Quote:Sadly the original palette isn't there, and it gives a bad palette that makes the picture look like it was saved as a JPG. I don't know what else to say about it, but I guess I must do it myself :( But don't worry, it's fine.
The palettes have got to be _somewhere_, I figure. Some images probably don't have a palette at all, being direct color images, as in the color data is in the pixel itself.

Most of them have palettes, but the pictures come out messed up, as if glitched. These BMPs are 8-bit.