So, a game that revolves almost entirely around speech and text needs a way to pull this stuff off the CD quickly. Since Berkeley Systems were mainly Mac guys at the time, they seem to have gone down the route of just storing standard System 7 sounds and Pascal strings and calling them directly to play them.
Sure enough, if you find an audio file in an srf (either under the name ‘snd ’ or with an appropriate name tag for the engine script, like ‘Mj19’), the offset and length just point to a perfectly standard SFIL. For the text, again, it’s a perfectly ordinary Pascal string (length byte first), with null termination on the end.
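To make the above concrete, here is a minimal Python sketch of pulling an entry out by offset and length and decoding a Pascal string. It assumes you have already got the offset and length from the srf index (not covered here), and that the text is Mac Roman encoded, which is my guess for a Mac title of this era. The function names `slice_entry` and `read_pascal_string` are my own, not anything from the engine.

```python
def slice_entry(data: bytes, offset: int, length: int) -> bytes:
    """Return the raw bytes of one entry, given its offset and length."""
    if offset < 0 or length < 0 or offset + length > len(data):
        raise ValueError("entry runs outside the file")
    return data[offset:offset + length]


def read_pascal_string(data: bytes, offset: int = 0) -> str:
    """Decode a Pascal string: one length byte, then that many characters.

    Any trailing null after the characters is simply ignored.
    Assumption: the text is Mac Roman encoded.
    """
    length = data[offset]
    raw = data[offset + 1:offset + 1 + length]
    if len(raw) != length:
        raise ValueError("string runs past the end of the data")
    return raw.decode("mac_roman")


# Made-up sample: 4 bytes of padding, then a Pascal string with a trailing null.
sample = b"\xff\xff\xff\xff" + b"\x05Hello\x00"
entry = slice_entry(sample, 4, 7)
text = read_pascal_string(entry)
```

For an audio entry, the bytes returned by `slice_entry` would be the SFIL data itself, which still needs converting before anything modern will play it.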
This means that implementing an extractor for this format is largely straightforward: it’s just a matter of converting the SFIL to something playable. The main issue remaining is the graphics format, seemingly developed by Twitter UI factotum Tom Wuttke (http://schmail.com/hireme/). I would wager that it is run-length encoded, with maybe some pure LZS/DoubleSpace compression in there too, to paint images to the screen.
If anyone knows enough about Apple-era graphics, I can probably get a sample together for you to play with. There are clearly two different encoding methods, depending on whether it’s an animation or a static image.