Introduction

This is going to be a repository for my findings as I start to pull apart the original You Don’t Know Jack game engine (as first seen in 1995).

The games use a combination of audio and basic animation, but run on low-spec PCs and the early Power Macs. All the data is stored in a special archive format (SRF), which looks like it can embed the audio and video with a good compression rate.

So far, here’s what I know about the SRF setup (as posted to http://wiki.xentax.com/index.php?title=You_Dont_Know_Jack )

Format Specifications

char {4}     - Header (srf1)
uint32 {4}   - Archive Size

// for each file

uint32 {4}   - File Size (including these two 4-byte fields)
char {4}     - File Type/Extension (terminated with byte 32, i.e. a space)
byte {X}     - File Data

Essentially, each file has a header that points to all of the resources, which store the name, size and offset of the file in question, permitting separate streams to be recorded in the same file. It’s like a standard Mac resource file, but grouped by type, and shorn of attributes (since these are read only).
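To make the layout concrete, here is a minimal sketch of a reader that follows the field list above literally. Big-endian byte order is my assumption (the format comes from the Mac side), and the function name parse_srf is just for illustration; nothing here has been checked against every SRF out there.

```python
import struct

MAGIC = b"srf1"


def parse_srf(data: bytes):
    """Return (archive_size, [(type_tag, payload), ...]) for an SRF blob."""
    if data[:4] != MAGIC:
        raise ValueError("not an srf1 archive")
    # Assumption: sizes are big-endian, as on the classic Mac.
    (archive_size,) = struct.unpack_from(">I", data, 4)

    entries = []
    pos = 8
    while pos + 8 <= len(data):
        # File Size counts its own 4 bytes plus the 4-byte type tag.
        (size,) = struct.unpack_from(">I", data, pos)
        if size < 8 or pos + size > len(data):
            break  # malformed or truncated record: stop rather than guess
        tag = data[pos + 4:pos + 8].decode("ascii", "replace")
        entries.append((tag, data[pos + 8:pos + size]))
        pos += size
    return archive_size, entries
```

The type tag is returned untouched, trailing space included, so a sound entry comes back as 'snd ' rather than 'snd'.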

So far I’ve noticed three file types stored: a simple text string, a file format similar to the “snd ” or .sfil format for Macs, and something that looks vaguely like a QuickTime RLE animation. As a Windows user, I find .sfil hard to work with, so I’m currently looking into converters (I know the newer QuickTime for OS X no longer supports it). In my next post I’ll dig into more detail for these files.


Off3 – the original file format

Well, I sure neglected this, didn’t I? I have very good reasons for that, but none that I really want to share (suffice it to say, when the in-tray needs guy ropes to secure the contents, time spent looking at old multimedia titles has to be reduced).

I thought I’d jump back to the very first US release, which was one step up from the original HyperCard-based setup. As a result of its origins, it uses an early variant of the method used to store images and text scripts (off3 rather than off4, which actually represents 300 vs 400, but that’s by the by).

Just treating this as a standard off4 is immediately problematic: it’s quite clear that the image data is not present in the file at all, because the file size is way too small. Thinking I’d have to reverse engineer the whole thing, I got stuck at this point. Then I remembered an old trick: sometimes the demos have more debug info than the production builds, usually because they’ve been put together in a rush. Sure enough, in the marketing file that contains the purchase info, a mysterious TMPL file was present with this info:

DWRD   - Offset Version (300 for now)
DWRD   - Block Word Size (usually 6, can be 12 or 18)
OCNT   - Number of Art Strips
LSTC   - *****
  DWRD - Strip Type (1 = RLE)
  DWRD - Strip Res ID
  DWRD - *Low Cast Member (Maps To frame 0 of strip)
  DWRD - High cast member
LSTE
OCNT   - Number of Frames
LSTC   - *****
  UWRD - Block Offset of Frame Info (0 = empty frame)
LSTE
LSTB   - 6-byte blocks (1-based indexing)
  UWRD - Frame Offset, or Flags
  DWRD - Bounds, left
  DWRD - Bounds, top
  DWRD - Bounds, right
  DWRD - Bounds, bottom
  DWRD - Count, or Cast#
LSTE

This looks like a long and complex list, but we can take it apart step by step. The format has been designed to be more flexible than needed, and I can see why they cut a lot of this out in later versions. Each file can specify a slightly different block size, and can reference multiple ‘art strips’, which are pointers to another file in the RLEP resource list. Each strip can specify a different compression type, but the main one is the RLE we already know. The strip entry then gives the resource ID of the RLEP file holding the data, and the cast member range that maps onto its frames (the low cast member is frame 0 of the strip). The rest is reasonably self-explanatory: a list of where each frame’s info lives, followed by the blocks themselves, with the offset into the RLEP data, the flags and the bounds as before.
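As a first stab, here is a sketch of how the TMPL could be walked in code. It rests on several assumptions of mine: that the type codes follow the usual resource-template meanings (DWRD a signed 16-bit word, UWRD an unsigned one, OCNT a 16-bit count for the LSTC list that follows, LSTB a list that repeats to the end of the data), that everything is big-endian, and that each block is the usual 6 words (12 bytes). The function and key names are made up for illustration.

```python
import struct


def parse_off3(data: bytes):
    """Walk an off3 blob following the TMPL field order (untested sketch)."""
    pos = 0
    version, block_words, strip_count = struct.unpack_from(">hhH", data, pos)
    pos += 6

    strips = []
    for _ in range(strip_count):
        strip_type, res_id, low_cast, high_cast = struct.unpack_from(">4h", data, pos)
        strips.append({"type": strip_type, "res_id": res_id,
                       "low_cast": low_cast, "high_cast": high_cast})
        pos += 8

    (frame_count,) = struct.unpack_from(">H", data, pos)
    pos += 2
    # One block offset per frame; 0 marks an empty frame.
    frame_blocks = list(struct.unpack_from(f">{frame_count}H", data, pos))
    pos += 2 * frame_count

    blocks = []
    # Assumption: 6-word blocks, repeated until the data runs out.
    while pos + 12 <= len(data):
        offset_or_flags, left, top, right, bottom, count_or_cast = \
            struct.unpack_from(">H5h", data, pos)
        blocks.append({"offset_or_flags": offset_or_flags,
                       "left": left, "top": top,
                       "right": right, "bottom": bottom,
                       "count_or_cast": count_or_cast})
        pos += 12

    return {"version": version, "block_words": block_words,
            "strips": strips, "frame_blocks": frame_blocks, "blocks": blocks}
```

Files that declare a block word size of 12 or 18 would need the final loop adjusted; I haven't tried to guess what the extra words hold.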

I haven’t yet incorporated all this into code, because I haven’t begun to look at the RLEP, but it sounds very similar to the original setup, with lots of little loops. It should be fun to actually get this in; then we should be able to browse all YDKJ 1-4 assets, and start thinking about drawing them together into an engine.


Eye halve a spelling chequer

You may notice that, for fill-in-the-blank questions, spelling doesn’t really matter for the answers. This is all done behind the scenes by a fairly complex parser. Looking at the SRF, the data is all held in a list associated with the ‘Wrds’ resource.

The content of the Wrds is a standard string list, split at null characters (byte = 0). However, there are additional factors to take into account, and the description is complex. The first byte of the whole file represents the total number of words in the answer (i.e. the number of times we need to check the spellings list to be sure everything is there in a typed answer). Thereafter, different properties are required for each string. The first string in an entry has as its first byte the number of potential alternatives for a word, with the rest of the string being one of these alternatives.

After sorting this out, you can make direct comparisons – take each word in the typed answer and compare it to the list of alternatives for each entry – if you get a match for every one, job done!
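Here is a rough sketch of that check in code, under my reading of the layout: one count byte for the whole resource, then for each word a count byte followed by that many null-terminated alternatives. The Mac Roman decoding, the case-insensitive comparison and the function names are all assumptions on my part, not something I've confirmed the game does.

```python
def parse_wrds(data: bytes):
    """Return a list of entries, each a list of accepted spellings."""
    word_count = data[0]          # number of words in the answer
    pos = 1
    entries = []
    for _ in range(word_count):
        alt_count = data[pos]     # number of alternatives for this word
        pos += 1
        alternatives = []
        for _ in range(alt_count):
            end = data.index(0, pos)   # strings are split at null bytes
            alternatives.append(data[pos:end].decode("mac_roman").lower())
            pos = end + 1
        entries.append(alternatives)
    return entries


def answer_matches(entries, typed: str) -> bool:
    """True if every entry is satisfied by some word in the typed answer."""
    words = typed.lower().split()
    return all(any(w in alternatives for w in words) for alternatives in entries)
```

For example, a Wrds resource with two entries, the first accepting "colour" or "color" and the second accepting "tv", would pass a typed "Color TV" and reject a lone "color".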

Other ‘wrong answers’, the text field for typing in too early and the Secret Gibberish Response are just done as straight comparisons, so this isn’t used in those cases.


Bug in The Ride

This is a repost of something I wrote in response to something asked during the 20th Feb Jackbox Games stream. The Ride engine is the most complicated of all the available YDKJ games, and as far as I can tell is the only one with a resource issue.

So, for the benefit of those who care, here’s the problem:
Floors on The Ride are built in individual folders, containing all the questions they need and a header file that says which questions should play, all with appropriate unique IDs. The game engine is built to read all game folders if a file is missing from where it should be.
For some reason, the Games floor folder (hGA) is missing its Jack Attack, and the header states the file should be JBT. JBT is used as an ID for the Jack Attack on the Literature floor (hLI), so the game works, albeit with the wrong attack. The correct file must exist somewhere at Jackbox, but it doesn’t appear in the released resources, and it looks like it would have been a gamebreaker if it weren’t for the ability to read files outside of the designated area.
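To illustrate why the game survives this, here is a toy model of the fallback as I understand it. The folders are modelled as plain dictionaries rather than real directories, the Q01 entry is invented, and I'm only guessing at the search order the real engine uses.

```python
def find_resource(folders, floor, file_id):
    """Look in the floor's own folder first, then fall back to every other one."""
    own = folders.get(floor, {})
    if file_id in own:
        return floor, own[file_id]
    for name, contents in folders.items():
        if name != floor and file_id in contents:
            return name, contents[file_id]
    raise KeyError(file_id)


# Hypothetical contents: hGA has no JBT of its own, hLI does.
folders = {
    "hGA": {"Q01": b"a games question"},
    "hLI": {"JBT": b"the literature Jack Attack"},
}
source, data = find_resource(folders, "hGA", "JBT")
```

In this model the Games floor asks for JBT, finds nothing locally, and quietly picks up the copy from hLI, which is the wrong-attack behaviour described above.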

So, in summary, the correct file is missing, it’s dumb luck that it works at all, and I can’t fix it.


Entente Cordiale?

I’ve just found out that someone is doing what I’m doing, only better and in Pascal. Small problem: while, like all cosmopolitan Euro-types, I am conversant in a language that isn’t English, French is not it.

However, Yann of http://www.ydkj.fr speaks far better English than I do French, so hopefully we’ll be able to collaborate (I know he can extract animations with his tool, which of course we can’t).

For the post that I found (in French): http://www.mwyann.fr/posts/723


A guide to annoying the YDKJ hosts (or, the Secret Gibberish Response explained)

Everyone knows about the SGR (indeed I’ve made a post on this already), but you may be intrigued as to how the game processes this. The below refers to anything based on the original YDKJ engine or a derivative (so anything apart from The Ride, 5th Dementia and the titles post The Lost Gold in the US, or any of the first editions of the overseas releases, and the German volume 2).

For any other text comparison, this is done based on the content of the relevant SRF (for spell checking or alternative choices), but the SGR is actually hardcoded into the EXE near the gibberish routine (search for Gibber.srf in the exe for the original version, or the .jbg file in the rerelease).

In the US versions, don’t expect to see this in plain text: for obvious reasons it’s all encrypted, although you will see the list of names that the host can change yours to nearby.

Before the ‘This question has too many spelling variations’ text is a string something like “P_MU*cY_” (for YDKJV3).

The encryption for the SGR is a 10-position Caesar cipher (or shift cipher, if you will), with every character shifted back 10 places to decode, and a ‘*’ representing a space.

Use this as the string for conversion, and you should be fine:

ABCDEFGHIJKLMNOPQRSTUVWXYZ!"£$_%abcdefghijklmnopqrstuvwxyz

Without openly swearing here, you should see that P becomes F, _ becomes U and so on.
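For anyone who wants to try it, here is a small sketch of the decoder. The conversion string in the code reads the two characters after the exclamation mark as a double quote and a pound sign, which is the reading that makes _ land on U as described above. Wrapping around the end of the string for the first ten characters is my assumption, and decode_sgr is just a name I picked.

```python
ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ!"£$_%abcdefghijklmnopqrstuvwxyz'
SHIFT = 10


def decode_sgr(text: str) -> str:
    """Shift every known character back 10 places; '*' stands for a space."""
    out = []
    for ch in text:
        if ch == "*":
            out.append(" ")
        elif ch in ALPHABET:
            # Assumption: positions wrap around the conversion string.
            index = (ALPHABET.index(ch) - SHIFT) % len(ALPHABET)
            out.append(ALPHABET[index])
        else:
            out.append(ch)  # leave anything unexpected untouched
    return "".join(out)
```

Feeding it the string found in the EXE gives the phrase in plain text, so I'll leave running it as an exercise.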

This has finally enabled me to work out the SGR for the UK version – it’s F**k Off, rather than the usual.

For The Ride/Abwarts, there are so many, and they’re in plain text, so a simple search will make them obvious.

In The Ride, this is just a pure string comparison.


Soft launch

For those reading this right now, I have a Windows-compatible version of the extractor in source form at https://github.com/james-wallace-ghub/ydkjx. Currently this has only been tested with 64-bit Windows, but the plan is to make this scriptable for all platforms. It’s not particularly runnable, but at least people can see it.

It’s worth looking at the GitHub repository, as there’s a sample of the formats I currently don’t know anything about: the RLEP and the off3 formats. One’s clearly animation, one’s a generic object format.


SRF Extractor

As I now know enough about this to reliably decode most of the SRFs that are out there (no animations, but I will eventually work on that, I hope) – I’ve been working on a little tool that lets someone load a file, save out what’s relevant and play the sounds. I know there are a few tools out there, but I’m trying to make this one multiplatform through Java. Currently I have a proto build working on Windows that plays the PCM coded audio, but the finished version will play the compressed speech too, even if I have to decompress on the fly.

More on this as I get it together, but SWT isn’t exactly something that lends itself to multiplatform, and I want to get this running on as many systems as I can, including OS X.


Caution: contains strong language

So, having started to play with the early build of the YDKJ SRF extractor, I’ve started to notice a few things. Firstly, the very first version of the YDKJ game engine attempts to make it harder to use the SRF sound content for certain files by changing the Apple codec id to YDKJ (for the record, it should be ima4, as these are Apple IMA ADPCM files). Not too difficult to work around. The other thing is that the question pack ‘upgrade’ actually censors part of the game. With the standard warning that all of the files in question, and the explanation of what occurs, contain offensive material, you can make the comparison yourself at The Cutting Room Floor.

Many of the other titles have this easter egg in a clean form, but the code to trigger it is different for the foreign language editions, obviously.
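Going back to the codec id swap, the workaround can be as crude as the sketch below: look for the YDKJ tag near the start of the sound data and put ima4 back. The 128-byte search window (header_limit) is a guess on my part to avoid touching the audio payload, and I'm not claiming this is the exact offset logic my extractor uses.

```python
def restore_codec_id(sound: bytes, header_limit: int = 128) -> bytes:
    """Swap the first 'YDKJ' tag found in the header region back to 'ima4'."""
    index = sound.find(b"YDKJ", 0, header_limit)
    if index < 0:
        return sound  # nothing to patch: leave the data alone
    return sound[:index] + b"ima4" + sound[index + 4:]
```

Both tags are four bytes long, so the patched data keeps the same length and none of the offsets after it move.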


Sound (and text) on, vision… nah.

So, a game that revolves almost entirely around speech and text needs to have a method to run this stuff quickly off the CD. Since Berkeley Systems were mainly Mac guys at the time, they seem to have gone down the route of just calling the System 7 sounds/Pascal strings to play them.

Sure enough, if you find an audio file in an SRF (either under the name ‘snd ‘ or with an appropriate name tag for the engine script like ‘Mj19’), the offset and length just point to a perfectly standard SFIL. For the text, again it’s a perfect Pascal string with null termination.
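Reading the text back is about as simple as it gets. A minimal sketch, assuming the usual Pascal layout (one length byte, then that many characters) and Mac Roman as the character set, which is my guess given the platform:

```python
def read_pascal_string(data: bytes, offset: int = 0) -> str:
    """Read a length-prefixed string; the trailing null is simply ignored."""
    length = data[offset]
    raw = data[offset + 1:offset + 1 + length]
    return raw.decode("mac_roman")
```

Because the length byte says exactly how much to read, the null terminator never needs to be searched for.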

This means that implementing an extractor for this format is largely straightforward, just a matter of converting the SFIL to something playable. The main issue remaining is the graphics file, a format developed seemingly by Twitter UI factotum Tom Wuttke (http://schmail.com/hireme/). I would wager that this is run length encoded, with maybe some pure LZS/Doublespace compression in there too to paint images to screen.

If anyone knows enough about Apple era graphics, I can probably get you a sample together to play with. There are clearly two different encode methods, depending on whether it’s an animation or a static image.
