Xanadu II Translation Development Blog

Started by elmer, August 31, 2015, 11:50:09 AM

elmer

Quote from: NightWolve on September 30, 2015, 01:59:28 PM
But yeah, it seemed a nightmare at first given you'd have to rebuild all of them and it took me a while to come up with a clever solution to handle it.
I'd be curious to see a snippet from one of those script files, if you'd like to post one.

***********************

I'm beginning the process of dumping the Xanadu 2 scripts now, and it's already showing that the script really is a fully-fledged programming language.

Much to my disappointment, they're also interleaving "script" code with "assembler" code.

That means that I may have to add a complete HuC6280 assembler/disassembler into the translation tool!  ](*,)

At this point I'm very curious as to how Falcom originally wrote this whole thing.  :-k

I'd guess that it was all done with a good macro-assembler ... but if so, it was either a custom-developed one in order to deal with their text-encoding system, or they did a lot of cut-n-paste with both SJIS and "encoded" text strings in the same file.

Anyway, here's just a small section of the very first script chunk in the game ...

$a6a3 .scriptA6A3:
$a6a3   _enable_8x12_font()
$a6a5   _set_pen_then_call_then_eol( orange, .scriptAE05 )
$a6a9   _disable_8x12_font()
$a6ab   _tst_2b03_x_bnz( $01, $02, .scriptA877 )
$a6b0   _tst_2b03_x_bnz( $01, $10, .scriptA748 )
$a6b5   _tst_2b03_x_bnz( $01, $20, .scriptA71B )
$a6ba   _tst_2b03_x_beq( $20, $01, .scriptA8AB )
$a6bf   _tst_2b03_x_beq( $20, $02, .scriptA8AB )
$a6c4   _tst_2b03_x_beq( $20, $04, .scriptA8AB )
$a6c9   _tst_2b03_x_beq( $20, $08, .scriptA8AB )
$a6ce   {アリオスさま、準備が整ったようですな。そういえば、航海長が}
$a6f1   _eol()
$a6f2   {お話があるとのことです。}
$a6ff   _wait_for_keypress_then_clear()
$a700   {後部甲板に行ってみてはいかがですか?}
$a717   _set_bits_2b03_x( $01, $20 )
$a71a   _wait_for_keypress_then_end()

NightWolve

#51
Quote from: elmer on September 30, 2015, 03:05:03 PM
Quote from: NightWolve on September 30, 2015, 01:59:28 PM
But yeah, it seemed a nightmare at first given you'd have to rebuild all of them and it took me a while to come up with a clever solution to handle it.
I'd be curious to see a snippet from one of those script files, if you'd like to post one.
Sure, let's pick one from Felghana. So the script was expanded out to 1,766 .XSO files when they used to use 1 or 2 files for Ys I & II Complete... Not every XSO has S-JIS text in it though, so you can eliminate a hundred or so that don't have it. But yeah, I always wondered why they did that, if it was intentional to possibly make the job of fan translation tougher...

[Image: hex-editor snapshot of the decompressed .XSO file showing S-JIS text]

So that's after it was ZLIB decoded/decompressed - the files had a .Z extension if compressed.

Here's the whole file:

http://www.mediafire.com/download/hqzh8xd9hedzmdu/TALKRANDOLF.XSO

I dunno if you have a hex editor with S-JIS-to-Unicode mapping to allow easy viewing so I took that little snapshot.

So, I took the easy route here in the aftermath, just scanned for S-JIS lead byte and 2nd byte pairs and loaded that as a string till null, repeat, etc. I escaped having to rebuild any of these files since I came up with the idea of intercepting the print function to crunch the current Japanese string to a CRC32, take that as a 4 byte index and match it to these but return the English replacement I would have next to it in a database record, etc.
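
The "scan for a lead byte and 2nd byte pair, then load until null" step can be sketched roughly as follows. This is only a minimal illustration in Python, not the tool that was actually used: the function names are made up for the example, the byte ranges are the standard Shift-JIS lead/trail ranges, and a real .XSO scan would need extra filtering for false positives in the non-text data.

```python
def is_sjis_lead(b: int) -> bool:
    # Standard Shift-JIS lead-byte ranges for two-byte characters.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def is_sjis_trail(b: int) -> bool:
    # Standard Shift-JIS trail-byte range (0x7F is excluded).
    return 0x40 <= b <= 0xFC and b != 0x7F


def scan_sjis_strings(data: bytes):
    """Return (offset, text) for each null-terminated run that starts
    with a plausible S-JIS lead/trail pair and decodes cleanly."""
    found = []
    i = 0
    n = len(data)
    while i + 1 < n:
        if is_sjis_lead(data[i]) and is_sjis_trail(data[i + 1]):
            end = data.find(b"\x00", i)
            if end == -1:
                end = n
            try:
                text = data[i:end].decode("shift_jis")
            except UnicodeDecodeError:
                i += 1          # false positive, keep scanning
                continue
            found.append((i, text))
            i = end + 1         # resume after the terminator
        else:
            i += 1
    return found
```

The offsets returned here are what would go into the "Offset" column of the database described in the next paragraph.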

So in the database, you'd store the FileID, Offset, CRC32, Japanese string, and English string (after your translator did his/her job), etc. and then output that data as arrays in a "C" header file for compilation/usage. One array for the CRC32 and another for the English string, both sorted by CRC32. So when you search for a CRC32 based on what the print function was about to do, the index you find it at is the same index that'll fetch the English string in the other array. Blah blah, you get the idea.

It was pretty cool how it all worked out like a charm. I wondered if there'd be a detectable slowdown in implementing this though, but you couldn't tell the difference in the slightest bit! I did sort the CRC32, English String pairs by the CRC32 so I could use binary search instead of linear 1-to-n max iteration searches as a novice would do, but I'd bet even if I cheaped out and did a basic for loop for a linear search, I still wouldn't have noticed any difference because something like that with only 4,000 to 6,000 4-byte elements shouldn't have been much of a big deal.
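
For anyone who wants to see the shape of that lookup, here is a small sketch in Python rather than the original C. It assumes zlib's CRC-32 as a stand-in (the exact CRC32 variant used in the patch isn't stated), and `build_tables` / `lookup` are hypothetical names: two parallel arrays sorted by CRC32, searched with a binary search, so the index found in one array fetches the English string from the other.

```python
import bisect
import zlib


def build_tables(pairs):
    """pairs: iterable of (japanese_bytes, english_text).
    Returns two parallel lists, both ordered by CRC32."""
    rows = sorted((zlib.crc32(jp) & 0xFFFFFFFF, en) for jp, en in pairs)
    crcs = [crc for crc, _ in rows]
    english = [en for _, en in rows]
    return crcs, english


def lookup(crcs, english, japanese_bytes):
    """What a hooked print function would do: hash the string that is
    about to be printed and binary-search for its replacement."""
    key = zlib.crc32(japanese_bytes) & 0xFFFFFFFF
    i = bisect.bisect_left(crcs, key)
    if i < len(crcs) and crcs[i] == key:
        return english[i]
    return None  # no translation stored: print the original text
```

With only a few thousand entries the binary search needs about a dozen comparisons per printed string, which fits with there being no noticeable slowdown.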

I DID cause a detectable slowdown in another area though for image replacement, which I never got to correct in a released patch! But that's another paragraph or so in your thread. ;)

elmer

Quote from: NightWolve on September 30, 2015, 03:54:38 PM
But yeah, I always wondered why they did that, if it was intentional to possibly make the job of fan translation tougher...
Hahaha ... they could care less!  :lol:

They'll have done it for their own reasons, because it made sense at the time.

Probably so that they could have multiple designers working on different parts of the game at the same time.


Quote: I dunno if you have a hex editor with S-JIS-to-Unicode mapping to allow easy viewing so I took that little snapshot.
Thanks, I took a quick look at that .xso file.

So there's a bunch of data (and script code?) at the start, and the whole thing ends with the string data.

The string data consists of a table of offsets to each string, and then the strings themselves in regular C format.

That seems like a very standard sort-of-thing for the 32-bit era when writing the game in "C".

It's nice that all of the string data is right at the end of the file ... that would have made it about as trivial to hack/replace as you can possibly get!

But your DLL hack is a really nice solution that avoids changing too many of the original files and bloating up the size of the patch.

Unfortunately, the Xanadu 2 patch is likely to be another "windows-executable" style patch, since almost all of the game data is going to get re-compressed.

I'm still curious what the PCE YsIV data looked like ... the 16-bit era was when developers were still coming up with "creative" solutions in order to fit things into the limited RAM/ROM.

TurboXray

Do any of the Xanadu games prime the LZ window/buffer before decompression? I can't remember if it was Dracula X or Gate of thunder, or some other PCECD game, but the game would prime the buffer with a series of values before running the decompression routine. Beginning/leading referencing strings would rely on the presence of these values (not just cleared or zero'd data in the buffer).

elmer

Quote from: TurboXray on October 02, 2015, 01:04:19 PM
Do any of the Xanadu games prime the LZ window/buffer before decompression?
Not unless I'm totally missing something!  :wink:

From what I'm seeing, the game loads up a complete 128KB META_BLOCK into RAM, and then when it wants to decompress an 8KB DATA_CHUNK, it maps the appropriate section of the META_BLOCK into $8000-$BFFF, and then decompresses it into $C000-$DFFF.

That memory layout pretty much stops them from using the preload trick.
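
As background for the address ranges above, here is a rough Python model of how the HuC6280 turns a 16-bit logical address into a physical one: the top 3 bits pick one of eight MPR registers, and each MPR selects an 8KB physical bank. The bank numbers in the example are purely illustrative and are not taken from the game.

```python
PAGE_SIZE = 0x2000  # each MPR maps one 8KB page of logical address space


def logical_to_physical(mpr, addr):
    """mpr: list of 8 bank numbers (MPR0..MPR7); addr: 16-bit logical address."""
    page = addr >> 13                   # top 3 bits select the MPR
    return (mpr[page] << 13) | (addr & (PAGE_SIZE - 1))


def page_range(page):
    """Logical address range covered by one MPR page."""
    start = page * PAGE_SIZE
    return start, start + PAGE_SIZE - 1


# Illustrative mapping only: two banks of compressed source data in
# pages 4-5 ($8000-$BFFF) and one destination bank in page 6 ($C000-$DFFF).
example_mpr = [0xFF, 0xF8, 0x68, 0x69, 0x70, 0x71, 0x80, 0x00]
```

In this model the source window at $8000-$BFFF and the 8KB destination at $C000-$DFFF are simply pages 4, 5 and 6.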

elmer

It's been a while, so time for an update.

The "script" code seems to be all extracted ... but that's not much use if it can't be modified and replaced.

The problem is that there's a lot of interleaved script code and assembly code ... there's even some bits of "dead" code and script in there!

That makes me absolutely certain that this was all created with a macro-assembler and not a "level editor".

I've written an HuC6280 disassembler and am now running that as part of the script-extraction.

It was actually quite fun to go "old-skool" with that and try to get it as small as possible so that I can have a version of it that runs in-game on the PCE, just like Chris Covell's excellent PCEmon. I think that it should fit into approx 1024 bytes (hopefully less) on a PCE, including instruction cycle counts.

***********************

AFAIK (and I'd love to know if I'm missing some other alternative), there are only 3 basic strategies for changing the text in a translation ...

  • Just overwrite the existing text and only allow strings the same size or shorter than the original.
  • Change the "pointer" to the string to point to your translated string that's somewhere else.
  • Reassemble the original code/script from "source" with the new translated strings, just like the original developers would have done.

Given the lack of free memory in the PCE, I've been thinking that option 3 is probably the best thing to do, especially since it imposes the least limits on the translator.

But the way that Falcom are mixing code and script makes this problematic ... for a start, I've actually got to reverse-engineer the script chunks back into a "source" format that I can either feed to PCEAS, or assemble/compile myself.

That's complicated by not knowing exactly where the code/script/data is in a chunk, and having to try to figure it out from various clues.

***********************

Which leads us on to this example from the very first script chunk.

We've got script that calls an assembly language function, that's next to other code that references a data table, and is followed by yet more script.

That's ugly ... but it's just about OK.

It does mean that I need to output this all in a format that some macro-assembler can handle.


$ad89   _set_pen_then_call_then_eol( orange, .scriptAE17 )
$ad8d   {アイアイサー!}
$ad94   _call_asm_from_script( .codeADFF )
$ad97   _wait_for_keypress_then_end()

.....

$ade6 .codeADE6:
$ade6   lda  .dataADFA,y
$ade9   sta  $2700,x
$adec   lda  #$20
$adee   jsr  $8a63
$adf1   iny 
$adf2   cpy  #$05
$adf4   bcc  .codeADE6
$adf6   jsr  $7feb
$adf9   rts 

$adfa .dataADFA:
$adfa   _byte( $08 )
$adfb   _byte( $00 )
$adfc   _byte( $0a )
$adfd   _byte( $00 )
$adfe   _byte( $0d )

$adff .codeADFF:
$adff   lda  #$01
$ae01   trb  $2c00
$ae04   rts 

$ae05 .scriptAE05: ; 8x12 font
$ae05   {ダイモス}
$ae09   _end()


***********************

Next up, here's some old-fashioned self-modifying code with a jump table.

This disassembly has been hand-tweaked, because it's something that I still need to write a specific disassembler-helper function to actually get it into a usable format.


$a442 .codeA442:
$a442   lda  $26c0,x
$a445   asl  a
$a446   tay 
$a447   lda  .tableA487+0,y
$a44a   sta  .dataA454
$a44d   lda  .tableA487+1,y
$a450   sta  .dataA455
$a453   jmp  $0000

$a487 .tableA487:
$a487   _eptr( .codeA456 )
$a489   _eptr( .codeA460 )
$a48b   _eptr( .codeA469 )
$a48d   _eptr( .codeA473 )


***********************

So ... there's definitely progress, but it's slow going.

NightWolve

#56
Yeah, games like Xak III had 16-bit pointers, sometimes a bit before the text block, sometimes after, so I would load the array after spotting it, and could then recompute each pointer to pack as much English text back into the text block, so you weren't limited by the original string size, you just had to mind the whole text block size and not go over it. In this way, you pretty much were able to fit accurate translations for every string in the block and not have to trim them to the point where loss of quality had to occur. (At least, that was the experience with Xak III.)
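
A minimal sketch of that repacking idea, in Python: rebuild the text block from the translated strings, recompute every 16-bit pointer, and refuse to go over the original block size. The little-endian pointer format, the null terminators, and the names `repack_text_block`, `base_addr` and `block_size` are assumptions made for the illustration, not details confirmed for Xak III.

```python
import struct


def repack_text_block(strings, base_addr, block_size):
    """strings: list of already-encoded strings (bytes, no terminator).
    Returns (pointer_table, text_block), padded to block_size."""
    pointers = []
    block = bytearray()
    for s in strings:
        pointers.append(base_addr + len(block))  # address of this string
        block += s + b"\x00"                     # strings packed back to back
    if len(block) > block_size:
        raise ValueError(
            "text block overflows by %d bytes" % (len(block) - block_size))
    block += b"\x00" * (block_size - len(block))  # keep the block size fixed
    table = struct.pack("<%dH" % len(pointers), *pointers)
    return table, bytes(block)
```

Because only the total matters, a long translation for one string can borrow the space saved by a short translation elsewhere in the same block.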

With a compressed text block, it's a whole other beast in how it operates. As far as I know, the way it works is the game code specifies an index based on the string that it wants at the time. So, if it wants the 5th string in the block, it specifies say 4 (if we're starting at 0) and so it keeps decompressing while counting the 0/null terminators, so when you've counted the 4th null terminator, that's the end of string 4, the start of string 5, and then it knows to finish off with that string and stop further decompression into the block. Something like that.
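
Roughly, in Python, with the decompressor modelled as any iterator that yields output bytes one at a time (the real decompression scheme isn't described here, so that part is left abstract and `fetch_string` is a hypothetical name):

```python
def fetch_string(decompressed_bytes, index):
    """Pull bytes from the decompressor until string number `index`
    (counting from 0) is complete, then stop.
    Returns (string_bytes, bytes_consumed)."""
    current = bytearray()
    nulls_seen = 0
    consumed = 0
    for b in decompressed_bytes:
        consumed += 1
        if b == 0:
            if nulls_seen == index:
                return bytes(current), consumed   # wanted string is done
            nulls_seen += 1                       # skipped one more string
        elif nulls_seen == index:
            current.append(b)                     # inside the wanted string
    raise IndexError("block holds fewer than %d strings" % (index + 1))
```

Counting null bytes is safe with S-JIS text because neither byte of a two-byte character can be $00. The cost is that fetching a late string means decompressing everything before it.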

EDIT:
Quote from: elmer on October 01, 2015, 12:29:05 PM
I'm still curious what the PCE YsIV data looked like ... the 16-bit era was when developers were still coming up with "creative" solutions in order to fit things into the limited RAM/ROM.
Oh right, about your Ys IV question, you basically saw it in that image of S-JIS. A decompressed text block was just null-terminated S-JIS text, that's it! No switching tricks with half-width characters, hiragana, etc. and what not. Just all S-JIS all the time... The game that uses switching tricks is Emerald Dragon and David did extensive work to decode it all to where what SamIAm and I see is S-JIS which was converted to Unicode for easier viewing on a Windows desktop.

elmer

Quote from: NightWolve on October 08, 2015, 07:48:14 PM
Yeah, games like Xak III had 16-bit pointers, sometimes a bit before the text block, sometimes after, so I would load the array after spotting it, and could then recompute each pointer to pack as much English text back into the text block, so you weren't limited by the original string size, you just had to mind the whole text block size and not go over it.
It's so nice when a developer uses a nice-and-simple scheme like that, it really makes a programmer's life so much easier.

The Zeroigar scripts were basically like that.


Quote: With a compressed text block, it's a whole other beast in how it operates. As far as I know, the way it works is the game code specifies an index based on the string that it wants at the time.
Now that's just plain slow and fugly!  :shock:

I've not seen that trick done before.

I can just-about imagine that being used for a HuCard game on the PCE (because of its limited memory), but it's horrible!

At least it should be fairly easy to translate since you've only got to worry about overall size of the complete block of compressed data.


Quote: Oh right, about your Ys IV question, you basically saw it in that image of S-JIS. A decompressed text block was just null-terminated S-JIS text, that's it! No switching tricks with half-width characters, hiragana, etc. and what not. Just all S-JIS all the time...
That was nice of them.


Quote: The game that uses switching tricks is Emerald Dragon and David did extensive work to decode it all to where what SamIAm and I see is S-JIS which was converted to Unicode for easier viewing on a Windows desktop.
Haha ... yes, you definitely want to hide the behind-the-scenes lunacy away from the poor translator.

Xanadu 2 uses a byte-to-sjis conversion table ... actually 2 of them, 1 for 12x12 glyphs and 1 for 8x12 glyphs.


***********************

Anyway ... back to Xanadu 2.

There are 8 really large script-chunks that I've been concerned about, because they're nearly 8KB big, but only seemed to contain about 1KB of "script".

That immediately made me concerned that I was missing something important.

Now that I've finally been able to disassemble the whole chunk, it turns out that I wasn't missing much, and that there really is a lot of (ugly) code in those particular chunks.

They're the ones that handle the 8 different Weapon Shops in the game.

The good news is that this all means that it's time to write the insertion tools and start testing a chunk with some real translated text.   :D

shawnji

Quote from: elmer on October 09, 2015, 01:33:37 PM
The good news is that this all means that it's time to write the insertion tools and start testing a chunk with some real translated text.   :D
Wow.  You guys are really making headway on this.  It'll be great to see it all finished; and it's fun to read through some of this and try to pretend like I know what you're talking about. XD  I wish I had a mind for this kind of thing, but I'm terrible at it.  The furthest I ever got was building a table for a Famicom game, and after that it all just got too confusing for me.

TurboXray

Elmer. So when are you going to do Cosmic Fantasy 4: Chapter 2 ??? The text is all uncompressed SJIS... just sayin ;)

elmer

Quote from: TurboXray on October 10, 2015, 12:11:24 AM
Elmer. So when are you going to do Cosmic Fantasy 4: Chapter 2 ??? The text is all uncompressed SJIS... just sayin ;)
I absolutely love the PCE ... but unless you can point me towards something of the quality of Ken Levine's System Shock 2, or System Shock, or Bio Shock ... or just get me into a party where I can finally meet him and maybe talk to Eric and Terri Brosius ...

Then I suspect that after Xanadu 2 and Xanadu 1, it'll be back to PC-FX development software and a "new" game (or 2) for me.

Despite the apparent lack of interest, I've already put my "mark" on trying to get Operation Thunderbolt, and maybe Osman ported over to the PCE. Surely that's enough!  :wink:

esteban

Elmer is working on a SCD upgrade of Blodia after The Xanadu.

Priorities are priorities. :)

shawnji

Quote from: TurboXray on October 10, 2015, 12:11:24 AM
Elmer. So when are you going to do Cosmic Fantasy 4: Chapter 2 ??? The text is all uncompressed SJIS... just sayin ;)
Oh really?  Hmm... that might be something fun for me to poke at if I wanted to try and teach myself a little bit.  A friend was trying to help walk me through some processes with Blood Gear, but the text being compressed was making it more difficult for me to understand at the time.

VenomMacbeth

XanaDUUUUDE!  EXCELLENT!

The only thing keeping me from playing this game is the language barrier, so I can't wait for this to be complete ^-^
Quote from: Gogan on August 01, 2013, 09:54:57 AM
Play Turbografx.
Play the Turbografx. PLAY
THE TURBOGRAFX!!!!!!

Buh buh buh, I have almost all teh games evar.  I R TEH BESTEST COLLECTR!!

TurboXray

Quote from: shawnji on October 10, 2015, 12:01:39 PM
Quote from: TurboXray on October 10, 2015, 12:11:24 AM
Elmer. So when are you going to do Cosmic Fantasy 4: Chapter 2 ??? The text is all uncompressed SJIS... just sayin ;)
Oh really?  Hmm... that might be something fun for me to poke at if I wanted to try and teach myself a little bit.  A friend was trying to help walk me through some processes with Blood Gear, but the text being compressed was making it more difficult for me to understand at the time.
I made an English print routine for it, as well as an ascii single byte read support, back in 2007/2008 for Animefx. He never finished translating it. If I still have it, it's yours if you want.

NightWolve

Quote from: TurboXray on October 10, 2015, 03:27:57 PM
Quote from: shawnji on October 10, 2015, 12:01:39 PM
Quote from: TurboXray on October 10, 2015, 12:11:24 AM
Elmer. So when are you going to do Cosmic Fantasy 4: Chapter 2 ??? The text is all uncompressed SJIS... just sayin ;)
Oh really?  Hmm... that might be something fun for me to poke at if I wanted to try and teach myself a little bit.  A friend was trying to help walk me through some processes with Blood Gear, but the text being compressed was making it more difficult for me to understand at the time.
I made an English print routine for it, as well as an ascii single byte read support, back in 2007/2008 for Animefx. He never finished translating it. If I still have it, it's yours if you want.
Well, I'll take it, that's good to have handy just in case the desire arises years from now. ;) Just PM it whenever you get the chance.

TurboXray

Quote from: guest on October 10, 2015, 03:37:49 PM
Wait, what about Chapter 1? You can't do Chapter 2 without Chapter 1!
I was told that Chapter 1 wasn't that good.

elmer

Quote from: shawnji on October 10, 2015, 12:01:39 PM
Quote from: TurboXray on October 10, 2015, 12:11:24 AM
Elmer. So when are you going to do Cosmic Fantasy 4: Chapter 2 ??? The text is all uncompressed SJIS... just sayin ;)
Oh really?  Hmm... that might be something fun for me to poke at if I wanted to try and teach myself a little bit.  A friend was trying to help walk me through some processes with Blood Gear, but the text being compressed was making it more difficult for me to understand at the time.
I took a quick look on PC Engine Bible, and it seems like it could be a fun game ... it would be great if you decided to have a go at translating it!  :D

As for me ... I'm definitely not saying that I'll never do a translation again ... I'm just going to want to take a break and do something different for a while after Xanadu 1 & 2.  :wink:

shawnji

#68
Quote from: TurboXray on October 10, 2015, 03:27:57 PM
Quote from: shawnji on October 10, 2015, 12:01:39 PM
Quote from: TurboXray on October 10, 2015, 12:11:24 AM
Elmer. So when are you going to do Cosmic Fantasy 4: Chapter 2 ??? The text is all uncompressed SJIS... just sayin ;)
Oh really?  Hmm... that might be something fun for me to poke at if I wanted to try and teach myself a little bit.  A friend was trying to help walk me through some processes with Blood Gear, but the text being compressed was making it more difficult for me to understand at the time.
I made an English print routine for it, as well as an ascii single byte read support, back in 2007/2008 for Animefx. He never finished translating it. If I still have it, it's yours if you want.
Sure, I'd like to take a look at that if you don't mind.  I can't promise I'll jump right on it, though; as I translate Japanese for a living already and I get burnt out occasionally.  I'm actually kind of surprised that I've been getting the itch to poke at a game translation again.

I can also understand why most people would want part two.  If you've never played the first Cosmic Fantasy, you wouldn't be terribly interested in Yuu as a main character instead of Van.  There's also the issue that the game is supposedly unbeatable due to some bug or something.  At least, I think that's what I heard.

As a side note, if it is the case that it's actually unbeatable, I bet it would make for a fun programming challenge to see if it could be fixed.

elmer

Quote from: elmer on October 09, 2015, 01:33:37 PM
Anyway ... back to Xanadu 2.

There are 8 really large script-chunks that I've been concerned about, because they're nearly 8KB big, but only seemed to contain about 1KB of "script".

That immediately made me concerned that I was missing something important.

Now that I've finally been able to disassemble the whole chunk, it turns out that I wasn't missing much, and that there really is a lot of (ugly) code in those particular chunks.

They're the ones that handle the 8 different Weapon Shops in the game.
Good news ... some of that "ugly" code in the Weapon Shop chunks put me on the trail to find another 12 DATA_CHUNKS that contain a lot of text strings in "script" format.

These aren't part of the regular in-game streaming system, but instead look to be chunks that get loaded "permanently" for each of the different game-overlays (which I think are for the Main-Menu, In-Game, and major boss fights, etc).

These newly discovered chunks finally show where all the missing "menu" text has been hiding!  :D

Finding this "missing" text was what SamIAm originally asked me to do, back before I looked at the state of the translation and decided that it needed some serious TLC in order to get it finished.

elmer

Just in case anyone is interested ... I took a break and had a look at the original Legend of Xanadu 1 again.

The scripting language is almost identical.

The code "markers" that identify where the scripts are located are all different, but it doesn't take too long to find them and to fix up the search algorithm.

I've found and extracted 143 script chunks from it so far. :D

SamIAm

I didn't want to say anything prematurely, but it looks like a translation of Xanadu I is really going to happen. :D

I've already got a script for the cutscenes far along, and if all goes well, I'll be working on the in game stuff within the next week or two!

We're going to need a LOT of people for a dub, though.

jtucci31

Oh man this is amazing news! The cutscenes in Xanadu I are so damn cool, especially after playing some long winded area where the fetch quests drag because you have no idea what's going on (*cough*area 8*cough*).

Currently on area 10, another one that feels long. I can't wait to replay this game in all its English glory! :D

shawnji

I'll be involved in the dub if you want.  I used to work on stage professionally, but the only problem is the lack of a good mic setup.  I have wanted to get a boom mic for a long time, though...

LentFilms

Quote from: SamIAm on October 12, 2015, 10:36:51 PM
We're going to need a LOT of people for a dub, though.
If you need me to reprise my role for "Sailor/Thug" I'd be happy to do so.

johnnykonami

I would love to try, but I might be a horrible actor.  If you need a small part or something, it would be cool to do though.

Gredler

Hahaha man I'd send a few voice samples if needed, that'd be fun :D

SamIAm

I appreciate everyone's interest, as I will surely need a lot of help. However, for the time being, you'll just have to sit tight. I'm a ways away from having a recording-ready script, and I'd rather deal with one thing at a time.

I'll be sure to post on this forum before recruiting anywhere else! :D

elmer

#78
Quote from: elmer on October 12, 2015, 07:40:46 PM
Just in case anyone is interested ... I took a break and had a look at the original Legend of Xanadu 1 again.

The scripting language is almost identical.

The code "markers" that identify where the scripts are located are all different, but it doesn't take too long to find them and to fix up the search algorithm.

I've found and extracted 143 script chunks from it so far. :D
Hmmm ... well Xanadu 1 is simultaneously both an "easier" and a "harder" game to hack than Xanadu 2.

The nice thing is that the architecture is basically the same as Xanadu 2, with a "permanent" set of boot/utility code from $2000-$3fff in memory, and then "overlay" code from $4000-$9fff, and "script" chunks that get decompressed as needed into $a000-$bfff.

The game "overlay" code is the same for every top-down level, and the game just loads in a different 176KB META-BLOCK compressed data file that contains the level's graphics and scripts.

So far, so good.

Common scripts, such as item-names are located in the game overlay to save memory.

But then they were still running out of memory ... a couple of the levels have only a few bytes free in the 176KB allowed for each level.

So they also mapped another block semi-permanently into $c000-$dfff and started putting some scripts into that area.

And they were still running out, so the last level splits the script area into two 4KB chunks and hacks the loading system to decompress 2 different scripts into $a000-$afff and $b000-$bfff. This let them get some more reuse out of the code in that level.

Yuk!

So finding all the scripts has been a bit nasty ... particularly the side-view Weapon Shop scripts which are done in a very different method to everything else. (BTW ... up to 181 scripts with text, now.)

Anyway ... the conclusion from all of this is that Xanadu 1 is pretty short on free memory for the translation.

***********************

Falcom already compresses the original SJIS text by encoding the 192 most-common katakana/kanji into a single byte, and this really works out well for them.

Xanadu 1 has 235,523 SJIS glyphs stored using 271,679 bytes.
Xanadu 2 has  96,139 SJIS glyphs stored using 116,346 bytes.


That's an approx 1.2 multiplier, much better than the 2.0 multiplier of pure SJIS.
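
To show where those multipliers come from, here is a quick Python check of the arithmetic, plus a rough size estimator for this kind of scheme. It assumes every glyph is stored as either 1 byte (one of the 192 table entries) or 2 bytes (plain SJIS) and ignores script opcodes and control codes, so treat the derived percentages as ballpark figures only. The function names are made up for the example.

```python
from collections import Counter


def bytes_per_glyph(glyphs, stored_bytes):
    return stored_bytes / glyphs


def single_byte_share(glyphs, stored_bytes):
    # If n1 glyphs take 1 byte and n2 take 2 bytes, then
    # n1 + n2 = glyphs and n1 + 2*n2 = stored_bytes,
    # which gives n1 = 2*glyphs - stored_bytes.
    return (2 * glyphs - stored_bytes) / glyphs


def estimate_encoded_size(text, table_size=192):
    """Size of `text` if its `table_size` most common glyphs cost 1 byte
    and every other glyph costs 2 bytes."""
    common = {g for g, _ in Counter(text).most_common(table_size)}
    return sum(1 if g in common else 2 for g in text)


xanadu1 = bytes_per_glyph(235523, 271679)   # about 1.15
xanadu2 = bytes_per_glyph(96139, 116346)    # about 1.21
```

Under that assumption, roughly 85% of the glyphs in Xanadu 1 and 79% in Xanadu 2 would be hitting the single-byte table.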

Apart from showing that Xanadu 2 really is a much shorter story than Xanadu 1, it shows that we've got a problem.

In order to get a good English translation, SamIAm estimates that we're going to need approx 1.5 to 2.0 times the amount of English characters as Kanji glyphs.

This gives me 2 problems ... how to fit all this English text into a level's compressed 176KB META-BLOCK that gets loaded into memory ... and then how to actually free up enough memory so that a large English script-chunk can be decompressed and accessed in the game.

***********************

Luckily, the first part is easy(ish) ... Xanadu 1 stores all its data compressed in what I've called the FALCOM1 data format.

If I recompress ALL the data in the game with SWD, it'll shrink each level's compressed META-BLOCK so that there will be enough memory to store all the extra English text.

The expectation that we were going to hit this sort of problem is one of the reasons for spending so much time messing around with compression earlier.

So here are the results of recompressing each of the 12 levels in different formats (the numbers in parentheses are the Falcom1 compressed and decompressed sizes).

Blk $00d9800  71 Chk (161,912 / 313,390), Fal2 135,530, Swd4 131,437, Swd5 130,601
Blk $0105800  69 Chk (150,455 / 304,021), Fal2 126,807, Swd4 123,600, Swd5 122,706
Blk $0131800  85 Chk (179,364 / 346,665), Fal2 150,903, Swd4 146,514, Swd5 145,484
Blk $015d800  83 Chk (177,218 / 344,515), Fal2 148,163, Swd4 143,686, Swd5 142,635
Blk $0189800  76 Chk (168,666 / 334,124), Fal2 141,733, Swd4 137,780, Swd5 136,851
Blk $01b5800  78 Chk (175,358 / 333,670), Fal2 146,457, Swd4 142,661, Swd5 141,742
Blk $01e1800  80 Chk (169,941 / 329,813), Fal2 142,395, Swd4 138,714, Swd5 137,754
Blk $020d800  79 Chk (179,178 / 334,334), Fal2 147,208, Swd4 143,309, Swd5 142,467
Blk $0239800  67 Chk (160,874 / 302,719), Fal2 136,316, Swd4 132,410, Swd5 131,443
Blk $0265800  84 Chk (178,110 / 338,782), Fal2 146,571, Swd4 142,236, Swd5 141,287
Blk $0291800  60 Chk (133,598 / 266,361), Fal2 113,042, Swd4 109,641, Swd5 108,871
Blk $02bd800 102 Chk (177,269 / 361,138), Fal2 144,124, Swd4 137,984, Swd5 137,007


The game allows for 180,224 bytes (176KB) for the compressed block.

Taking a look at the largest level ...

FAL1 compressed  179,364
SWD5 compressed  145,484


Which means we should be able to afford not only to have SamIAm do the best translation, but I should also be able to afford to leave 8KB of that space free to use for decompressing the text.
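
The headroom can be worked out directly from the table above. This Python snippet just redoes that arithmetic with the Falcom1 and Swd5 columns; the variable names are only for illustration.

```python
BLOCK_LIMIT = 176 * 1024   # 180,224 bytes allowed per compressed level
RESERVE = 8 * 1024         # bank set aside for decompressed text

# (Falcom1 compressed size, Swd5 compressed size) for the 12 levels.
LEVELS = [
    (161912, 130601), (150455, 122706), (179364, 145484),
    (177218, 142635), (168666, 136851), (175358, 141742),
    (169941, 137754), (179178, 142467), (160874, 131443),
    (178110, 141287), (133598, 108871), (177269, 137007),
]

worst_fal1 = max(fal1 for fal1, _ in LEVELS)
worst_swd5 = max(swd5 for _, swd5 in LEVELS)

free_before = BLOCK_LIMIT - worst_fal1        # slack in the original game
free_after = BLOCK_LIMIT - worst_swd5         # slack after recompression
free_for_text = free_after - RESERVE          # what is left for extra script

print(free_before, free_after, free_for_text)  # 860 34740 26548
```

So even in the largest level, recompression turns 860 bytes of slack into about 34KB, and roughly 26KB of that remains for extra compressed script data after giving up the 8KB bank.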

It's definitely going to be a pain to completely replace Xanadu 1's original compression code/data, but I really don't think that there's much of an alternative.

***********************

The second part, allowing decompressed scripts to be larger, is going to be tricky.

If the Xanadu games had just used a nice-and-simple text-printing routine, then it would have been easy to hack the code to switch in a new bank of text at the start of the routine, and then switch it back again at the end.

Unfortunately, since the text is contained within the game's scripting language, and those scripts are located in nearly every possible banked-region in memory, and the script itself is read from multiple different pieces of code ... I don't think that I can get away with anything that simple.

The only solution that I can come up with at the moment is going to mean switching out the CD BIOS code that's mapped into $e000-$ffff.

If I map the 8KB bank of RAM that I've freed up from the 176KB of compressed level data into that area, then I'm going to have a lot of extra space for the translation and for the English font code.

It'll mean hacking the loading code to decompress scripts into both $a000-$bfff and $f000-$ffef, but that's not too horrible.

The idea would be to map the CD BIOS vectors and interrupt vectors to new code that switches to the original BIOS and executes the original BIOS functions, and then switches back to my RAM bank afterwards.

IIRC, Bonknuts suggested doing something like this on his blog, but I'm not sure if anyone has done this yet in practice.

If they have, then I'd love to hear about it, and about what the potential problems are!

gekioh

Dude hell yes, this is awesome. I've always wanted to play this game and actually KNOW what was going on. woo hoo!!!

TurboXray

So if I'm reading this right, the problem (mentioned at the end) is not enough logical address space? Do you have a free 8k bank of ram to swap into page #7? If you do map another bank there, I would definitely disable interrupts the alt bank is mapped there. Another approach, to logical address issues hacking, is to swap out page #0. Though that means re-routing interrupts from the CD bios routine to your hook code, so it can map it back in for that interrupt call, and then pull the TAM value off the stack and put the page #0 correct on exit (your bank).

 But, if you need to swap out banks to have access to more free space for decompression - aren't there other pages you could swap out first? If original library mapping is a concern, maybe disable interrupts for that extended decompression part (writing into an extended buffer)? Same for when the game routine needs to read a character from the extended buffer?

elmer

#81
Quote from: TurboXray on October 19, 2015, 03:37:19 PMSo if I'm reading this right, the problem (mentioned at the end) is not enough logical address space? Do you have a free 8k bank of ram to swap into page #7? If you do map another bank there, I would definitely disable interrupts while the alt bank is mapped there. Another approach to logical-address issues when hacking is to swap out page #0. Though that means re-routing interrupts from the CD bios routine to your hook code, so it can map it back in for that interrupt call, and then pull the TAM value off the stack and put the correct page #0 bank (your bank) back on exit.
It's a logical-address-space issue after decompression, and during game execution.

I can deal with the amount of space that each level's data takes up before-decompression by using a better compressor.

That will also free up an 8KB physical RAM bank that I can then use however I wish.

The problem is that the new English script-language data that's been decompressed is going to be bigger than the original Japanese script-language data.

It's got to be stored somewhere that's accessible to the script-interpreter.

AFAIK there are 2 possible solutions to that ...

1) Put the overflow somewhere in physical memory and only map it into logical memory just for the instant that the interpreter reads a byte from the script.

2) Put the overflow somewhere in physical memory and have it permanently mapped into logical memory.

***********************

If I choose option 1, then I've got to be 100% certain that I know every location in code where the interpreter reads a byte from the current script-pc. I've also got to either find somewhere "safe" in logical address space to temporarily map the bank, or else I've got to disable interrupts while I map in the bank and read the byte.

If I disable interrupts, then anything that relies on the interrupt timing will glitch.

I don't know that anything does ... but it's not the kind of thing that I like to do, especially when it's not my code in the first place, so I don't know exactly how it was designed.

If I were the original developer ... this is probably the method that I'd choose to implement. It's the cleanest, and the most standards-compliant.
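
For anyone who'd rather see option 1 as something runnable than as prose, here's a toy Python model of it. It's only a sketch under stated assumptions: the `Memory` class, the bank number $87 for the overflow RAM, and the bank contents are all made up for illustration; only the idea (map the bank in, read one byte, map the old bank straight back) comes from the description above.

```python
# Toy model of the HuC6280's eight 8KB logical pages (MPR0-MPR7).
# Bank numbers and contents are hypothetical, chosen only for the demo.
BANK_SIZE = 0x2000


class Memory:
    def __init__(self):
        self.banks = {}       # physical bank number -> bytearray
        self.mpr = [0] * 8    # physical bank shown in each logical page

    def add_bank(self, number, data=b""):
        bank = bytearray(BANK_SIZE)
        bank[:len(data)] = data
        self.banks[number] = bank

    def read(self, addr):
        page, offset = divmod(addr, BANK_SIZE)
        return self.banks[self.mpr[page]][offset]


def read_script_byte(mem, script_pc, overflow_bank, page=7):
    """Option 1: map the overflow bank in for one read, then restore."""
    saved = mem.mpr[page]             # whatever was mapped (e.g. the BIOS)
    mem.mpr[page] = overflow_bank
    try:
        return mem.read(script_pc)
    finally:
        mem.mpr[page] = saved         # always put the original bank back


mem = Memory()
mem.add_bank(0x00, b"\x4c\xf3\xe0")   # stand-in for the BIOS bank
mem.add_bank(0x87, b"Hello")          # hypothetical overflow RAM bank
mem.mpr[7] = 0x00

value = read_script_byte(mem, 0xE001, 0x87)
print(hex(value), hex(mem.read(0xE001)))
```

The model has no interrupts, which is exactly the part that makes the real thing awkward: on hardware, the window between the two mapping changes is where an interrupt would find the wrong bank at $e000-$ffff.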

***********************

If I choose option 2, then I've got to decide whether to use bank 0 or bank 7, and bank 7 is the obvious choice.

If I do that, then I can write new interrupt vectors and new BIOS function vectors, and just map the CD BIOS bank into $e000-$ffff when it actually needs to be there to handle an interrupt or a function call.

The new vectors will waste a bit of space, but the flexibility gained is pretty tremendous.

Now, if I were the original developer, I could do all of this without any knowledge of the BIOS internals, but since I don't have the luxury of having the source, I can make things easier for myself by taking advantage of some inner knowledge of how the CD BIOS is written.

This would totally break any console manufacturer's standards ... but that's not a problem these days.

At this point, we know that the game requires SuperCD, and there are only the Japanese and English CD BIOS 3.0 cards to worry about.

That'll let me take advantage of knowing how the BIOS code operates in order to create the RAM stubs that I have to write.

Naughty ... but this is a hack, after all.

NightWolve

Oh wow, looks like the game was lucky enough to finally come into the hands of someone knowledgeable enough to do this! Ys IV was tight only as far as wanting to implement subtitles instead of dubbing according to Neill Corlett, I remember that. But uh, I hope it doesn't come down to using the idea that's now available with the Turbo Everdrive's writable RAM and limiting the fan translation to that...

SamIAm


NightWolve


elmer

#85
Quote from: NightWolve on October 19, 2015, 06:11:06 PMOh wow, looks like the game was lucky enough to finally come into the hands of someone knowledgeable enough to do this!
Thanks, but I like to think that anyone could do this with enough time and a sufficiently bloody-minded refusal to let the problem beat them!  :wink:


QuoteI hope it doesn't come down to using the idea that's now available with the Turbo Everdrive's writable RAM and limiting the fan translation to that...
I think that there are definitely going to be translations out there that are only going to be possible or practical with extra memory, but I think that we got lucky on these 2 and that I can free up the necessary space.

I'd certainly like to keep them running within the original hardware.


Quote from: SamIAm on October 19, 2015, 09:34:39 PMYou should make this into your avatar:
Haha ... thanks, but usually I feel more like one of the Infinite Number of Monkeys!

https://en.wikipedia.org/wiki/Infinite_monkey_theorem


Sometimes a good plan just doesn't work out as neatly as a programmer believes that it's going to when he/she has that oh-so-brilliant idea!  :roll:

Case-in-point ...


Quote from: elmer on October 19, 2015, 05:21:57 PMIf I choose option 2, then I've got to decide whether to use bank 0 or bank 7, and bank 7 is the obvious choice.

If I do that, then I can write new interrupt vectors and new BIOS function vectors, and just map the CD BIOS bank into $e000-$ffff when it actually needs to be there to handle an interrupt or a function call.
Well, this sounded like a good idea, and the thought of replacing every old ROM CD BIOS vector ...

e000: JMP $e0f3     ; CD_BOOT
e003: JMP $e8e3     ; CD_RESET
e006: JMP $eb8f     ; CD_BASE
....
e0f0: JMP $fddb     ; MA_CBASIS


with new vectors in the RAM bank ...

e000: JSR my_hack   ; CD_BOOT
e003: JSR my_hack   ; CD_RESET
e006: JSR my_hack   ; CD_BASE
....
e0f0: JSR my_hack   ; MA_CBASIS


... seemed really quite elegant.

The new "my_hack" routine would page in the original ROM bank, execute the original CD function, and then page the RAM bank back in.
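
One detail that such a stub depends on: every vector does a JSR to the same routine, so the routine has to work out which vector was called from the return address on the stack. On the 6502 family, JSR pushes the address of the last byte of the 3-byte JSR instruction, so subtracting 2 gives the address of the vector entry itself. A quick Python check of that arithmetic (the function name is just for illustration):

```python
def vector_from_return_address(lo, hi):
    """Pushed return address minus 2, as a 16-bit value."""
    pushed = (hi << 8) | lo
    return (pushed - 2) & 0xFFFF


# A JSR at $e003 occupies $e003-$e005, so the CPU pushes $e005.
print(hex(vector_from_return_address(0x05, 0xE0)))

# Borrow case: a JSR at $e0ff would push $e101.
print(hex(vector_from_return_address(0x01, 0xE1)))
```

The second case is why the subtraction has to carry the borrow into the high byte.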

The code isn't even very long ...

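my_hack:   sta   .lda1+1      ; save caller's A (self-modifying)
           pla                ; return address lo (JSR's last byte)
           sec
           sbc   #2           ; back up to the start of the vector entry
           sta   .callrom+1
           pla                ; return address hi
           sbc   #0           ; propagate the borrow
           sta   .callrom+2
           tma   #$80         ; read the bank currently in MPR7
           pha                ; ... and remember it
           lda   #$00
           tam   #$80         ; map the BIOS ROM (bank 0) into $e000-$ffff
.lda1:     lda   #$00         ; restore caller's A
.callrom:  jsr   $0000        ; call the original BIOS vector
           sta   .lda2+1      ; save the A that the BIOS returned
           pla
           tam   #$80         ; map the previous bank back into MPR7
.lda2:     lda   #$00         ; restore the returned A
           rts                ; back to whoever called the vector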


Unfortunately, once I took a good look at it, I realized that it's not re-entrant, and that if an interrupt occurs part way through that routine that also calls a CD function, then you're going to get a random bug/crash.   :shock:

Now this can be worked around by making sure that the interrupt handler also switches banks ...

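my_irq1:   pha                ; save A
           tma   #$80
           pha                ; save the bank currently in MPR7
           lda   #$00
           tam   #$80         ; map the BIOS ROM (bank 0) into $e000-$ffff
           lda   #>my_rti
           pha                ; fake an interrupt frame: return hi,
           lda   #<my_rti
           pha                ; ... return lo,
           php                ; ... and flags, so the BIOS's RTI lands on my_rti
           jmp   ($FFF8)      ; run the BIOS's own IRQ1 handler

my_rti:    pla
           tam   #$80         ; map the previous bank back into MPR7
           pla                ; restore A
           rti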


But now I've added a huge delay into the interrupt processing, which is exactly what I was trying to avoid by choosing "option 2"!  #-o

So it's back to "option 1", which turns out to both require less code and to also delay interrupts less.

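; script_pc is stored in $37,$38.

read_pc:   lda   #RAMBANK
           php                ; save the interrupt-disable state
           sei                ; no interrupts while the BIOS is mapped out
           tam   #$80         ; map the overflow RAM bank into $e000-$ffff
           lda   ($37)        ; read the script byte
           sta   .lda1+1      ; stash it (self-modifying)
           lda   #$00
           tam   #$80         ; map the BIOS ROM (bank 0) back in
           plp                ; restore the interrupt-disable state
           inc   $37          ; script_pc++
           bne   .lda1
           inc   $38
.lda1:     lda   #$nn         ; reload the script byte
           rts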


Whoops!  :oops:

dshadoff

Quote from: elmer on October 19, 2015, 02:36:00 PMFalcom already compresses the original SJIS text by encoding the 192 most-common katakana/kanji into a single byte, and this really works out well for them.

Xanadu 1 has 235,523 SJIS glyphs stored using 271,679 bytes.
Xanadu 2 has  96,139 SJIS glyphs stored using 116,346 bytes.


That's an approx 1.2 multiplier, much better than the 2.0 multiplier of pure SJIS.

Apart from showing that Xanadu 2 really is a much shorter story than Xanadu 1, it shows that we've got a problem.

In order to get a good English translation, SamIAm estimates that we're going to need approx 1.5 to 2.0 times the amount of English characters as Kanji glyphs.

This gives me 2 problems ... how to fit all this English text into a level's compressed 176KB META-BLOCK that gets loaded into memory ... and then how to actually free up enough memory so that a large English script-chunk can be decompressed and accessed in the game.
That's really interesting - that they can already use 192 bytes of the character set to expand into 2-byte SJIS (I guess using a lookup table).  I take it that this is separate from the compression scheme ?

The reason I ask is that I wonder what the most common words/phrases in English would be, and whether they could be single-byte-substituted separately from the window-based compression (for example, names, "the ", "and " - including spaces - etc).  The longer the word, the less likely that it would show up in the past window.

But even if that kind of substitution were easy to capitalize on, it could still be a combinatoric problem of processing the script and deciding which substitutions would yield the best net compression.

...Just a thought.

-Dave

elmer

Quote from: dshadoff on October 20, 2015, 10:22:22 PMThat's really interesting - that they can already use 192 bytes of the character set to expand into 2-byte SJIS (I guess using a lookup table).  I take it that this is separate from the compression scheme ?
Yep, they use the same lookup table for both Xanadu 1 and Xanadu 2. That's reversed back into a SJIS glyph only when each glyph is printed to the screen by the script interpreter.

So "yes", it's totally separate from the regular FALCOM1/FALCOM2 compression.
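
To put some numbers on how well that works, here's a small Python sketch. It doesn't use Falcom's real table or byte values (those aren't shown here); it just assumes that the 192 most-frequent glyphs cost 1 byte each and that everything else costs 2 bytes as plain SJIS, and then checks the bytes-per-glyph figures quoted above.

```python
from collections import Counter


def build_table(text, size=192):
    """Pick the `size` most frequent glyphs (assumed selection rule)."""
    return [glyph for glyph, _ in Counter(text).most_common(size)]


def encoded_size(text, table):
    """1 byte for a glyph in the table, 2 bytes (raw SJIS) otherwise."""
    common = set(table)
    return sum(1 if glyph in common else 2 for glyph in text)


# Tiny demo with a 1-entry table: 'あ' is the most common glyph.
sample = "ああいう"
table = build_table(sample, size=1)
print(table, encoded_size(sample, table))

# Bytes-per-glyph from the totals quoted in the thread.
xanadu1 = round(271679 / 235523, 2)
xanadu2 = round(116346 / 96139, 2)
print(xanadu1, xanadu2)
```

Both games come out at roughly 1.15 to 1.2 bytes per glyph, which is where the "approx 1.2 multiplier" comes from.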


QuoteThe reason I ask is that I wonder what the most common words/phrases in English would be, and whether they could be single-byte-substituted separately from the window-based compression (for example, names, "the ", "and " - including spaces - etc).  The longer the word, the less likely that it would show up in the past window.
If you're never going to "localize" the translation into any of the other EFIGS languages, then using a dictionary substitution like that is a perfect use for character codes $80-$FF.

In these "Internet" days it's easy to find lists of the most-common English words (like http://www.wordfrequency.info) that can give you a good starting point for any dictionary, without having to do the work of producing your own "optimal" list.

Now, writing the code to create that "optimal" dictionary may be the kind of challenge that some programmers would enjoy ... but I don't think that extra work would be overly useful in a case like this.

The big question when using such a static dictionary is "Do the benefits outweigh the costs?".

There isn't a huge CPU cost, but you are going to have to store that dictionary in memory somewhere, so there's definitely a cost there.

There's also a question of the effect that it's going to have on the LZSS compression.

With 2-byte (or less) LZSS codes already acting as a dynamic dictionary, you're reducing the "gain" that you're going to get from using a static dictionary as well.

In the case of the Xanadu games, Falcom already take advantage of the "dictionary" idea by only storing some strings (such as speaker's names) once, and then using a script "call" to print them out.

So ... you've got a good idea  :)

... but I'm going to hold it in "reserve", and only use it if I run out of memory later.  :wink:

dshadoff

Quote from: elmer on October 21, 2015, 12:14:05 PM
QuoteThe reason I ask is that I wonder what the most common words/phrases in English would be, and whether they could be single-byte-substituted separately from the window-based compression (for example, names, "the ", "and " - including spaces - etc).  The longer the word, the less likely that it would show up in the past window.
If you're never going to "localize" the translation into any of the other EFIGS languages, then using a dictionary substitution like that is a perfect use for character codes $80-$FF.

In these "Internet" days it's easy to find lists of the most-common English words (like http://www.wordfrequency.info) that can give you a good starting point for any dictionary, without having to do the work of producing your own "optimal" list.

Now, writing the code to create that "optimal" dictionary may be the kind of challenge that some programmers would enjoy ... but I don't think that extra work would be overly useful in a case like this.

The big question when using such a static dictionary is "Do the benefits outweigh the costs?".
That's exactly what I meant when I said that it would be a combinatoric exercise - it's not just whole words, but rather partial words which could benefit, and it would be highly dependent on the actual script as well.  For example, maybe simple letter-pairs such as "sh", "st", "th", "tr", and "ea" yield good benefit without going as far as creating additional code for full words, and maybe they don't interfere with compression either.

One could find out the "net benefit" for character groups by finding groups of "n" characters and number of occurrences, running for "n" groups from 2 to the maximum size of a substitute.   This would be a very long list which would need to be sorted to determine maximum benefit.  Then, re-run again - but this time under the presumption that substitute #1 has been implemented...  All the while, keeping in mind that if you take advantage of the #1 benefit, you may reduce the effectiveness of #2 and #3... which turns it into a combinatoric problem, identifying all of the possible permutations and their maximal benefit.

But hopefully, it isn't such a brute force problem.
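
A rough Python sketch of the greedy version of that loop, to make the idea concrete. The assumptions are all mine: the text is plain ASCII, codes from $80 upwards are free for dictionary entries, each entry costs its own length in dictionary storage, and substitutions are never nested. It picks the single best group, applies it, and re-counts, rather than exploring every permutation.

```python
FIRST_CODE = 0x80  # assumed: codes $80-$FF are free for dictionary entries


def pick_best(text, max_len):
    """Return the n-gram whose 1-byte substitution saves the most, or None."""
    best, best_saving = None, 0
    for n in range(2, max_len + 1):
        grams = {text[i:i + n] for i in range(len(text) - n + 1)}
        for gram in sorted(grams):              # sorted for repeatable results
            if any(ord(ch) >= FIRST_CODE for ch in gram):
                continue                        # don't nest substitutions
            # Each use saves n-1 bytes; storing the entry costs n bytes.
            saving = text.count(gram) * (n - 1) - n
            if saving > best_saving:
                best, best_saving = gram, saving
    return best


def build_dictionary(text, max_entries=128, max_len=6):
    """Greedy: apply the best substitution, then re-count on the result."""
    dictionary = []
    while len(dictionary) < max_entries:
        gram = pick_best(text, max_len)
        if gram is None:
            break
        text = text.replace(gram, chr(FIRST_CODE + len(dictionary)))
        dictionary.append(gram)
    return text, dictionary


def expand(encoded, dictionary):
    return "".join(
        dictionary[ord(ch) - FIRST_CODE] if ord(ch) >= FIRST_CODE else ch
        for ch in encoded
    )


script = "the cat and the hat and the bat and the rat "
encoded, dictionary = build_dictionary(script)
total = len(encoded) + sum(len(entry) for entry in dictionary)
print(dictionary, len(script), total)
```

Being greedy, it can miss the true optimum for exactly the reason given above (taking #1 changes the value of #2 and #3), and it says nothing about how the result interacts with a later LZSS pass.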

QuoteThere's also a question of the effect that it's going to have on the LZSS compression.

With 2-byte (or less) LZSS codes already acting as a dynamic dictionary, you're reducing the "gain" that you're going to get from using a static dictionary as well.
Very true.

QuoteSo ... you've got a good idea  :)

... but I'm going to hold it in "reserve", and only use it if I run out of memory later.  :wink:
Fair enough !  More complexity is more work, and more opportunity for bugs/issues.
Just thought I'd mention an idea that came to me while reading your reply.

-Dave

elmer

I think that "investigation" is over for a while, and it's time to get back to actually getting the text inserted into the game.

I've finally added all the programmer-garbage at the top of each extracted-script, so that I've got the information that I should need for the re-insertion, and now it's time to write that code.

When that's done, I'll test it by making sure that the existing scripts insert identically to the originals, and then it'll be time to try some test English in there.
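
The round-trip test itself is nothing fancy. Something along these lines would do, sketched in Python; the function names are hypothetical and the $a000 base is just the load address from the chunk shown below.

```python
CHUNK_BASE = 0xA000  # load address, as in @chunkdefn( $a000, ... )


def first_difference(original: bytes, rebuilt: bytes):
    """Return the offset of the first mismatch, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    if len(original) != len(rebuilt):
        return min(len(original), len(rebuilt))  # one is a prefix of the other
    return None


def check_round_trip(original, rebuilt, base=CHUNK_BASE):
    offset = first_difference(original, rebuilt)
    if offset is None:
        return "identical"
    return f"first mismatch at ${base + offset:04x}"


print(check_round_trip(b"\x01\x02\x03", b"\x01\x02\x03"))
print(check_round_trip(b"\x01\x02\x03", b"\x01\xff\x03"))
```

Reporting the mismatch as a logical address makes it easy to look up the offending spot in the extracted script listing.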

For anyone that's interested, here's an example of what SamIAm will be working with over the next few weeks/months.

It is the complete script for the very first chunk in the Xanadu 2 game, when Arios is on the ship.

If you're interested, you can actually run the game and see when these pieces of text pop up; and if you've done any simple programming, then you should be able to follow the basic idea of what's going on with all the branches and calls, even if the exact details aren't clear.

  @chunkdefn( $a000, $bfff, $000fb800, 15 )

  @memregion( $a695, $ad97 )
  @memregion( $ae05, $bfff )

  @extscript( $9eb7, .script9EB7 )

  @scriptref( $a174, TYPE_JSR90D8, .scriptA6A3 )
  @scriptref( $a188, TYPE_JSR90D8, .scriptA8EF )
  @scriptref( $a18a, TYPE_JSR90D8, .scriptA95F )
  @scriptref( $a18c, TYPE_JSR90D8, .scriptA9E8 )
  @scriptref( $a18e, TYPE_JSR90D8, .scriptAAEF )
  @scriptref( $a262, TYPE_IMM37,   .scriptA695 )
  @scriptref( $a35b, TYPE_JSR9298, .scriptAB88 )
  @scriptref( $a37f, TYPE_JSR9298, .scriptAC22 )

.scriptA695:
  {帆船ローランディア号}
  _end()

.scriptA6A3:
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .scriptAE05 )
  _disable_8x12_font()
  _tst_2b03_x_bnz( $01, $02, .scriptA877 )
  _tst_2b03_x_bnz( $01, $10, .scriptA748 )
  _tst_2b03_x_bnz( $01, $20, .scriptA71B )
  _tst_2b03_x_beq( $20, $01, .scriptA8AB )
  _tst_2b03_x_beq( $20, $02, .scriptA8AB )
  _tst_2b03_x_beq( $20, $04, .scriptA8AB )
  _tst_2b03_x_beq( $20, $08, .scriptA8AB )
  {アリオスさま、準備が整ったようですな。そういえば、航海長が}
  _eol()
  {お話があるとのことです。}
  _wait_for_keypress_then_clear()
  {後部甲板に行ってみてはいかがですか?}
  _set_bits_2b03_x( $01, $20 )
  _wait_for_keypress_then_end()

.scriptA71B:
  {航海長がお話があるとのことです。}
  _eol()
  {後部甲板に行ってみてはいかがですか?}
  _wait_for_keypress_then_end()

.scriptA748:
  _tst_2b03_x_bnz( $01, $20, .scriptA762 )
  {アリオスさま。どうかされましたか?}
  _wait_for_keypress_then_clear()
  _jump( .scriptA775 )

.scriptA762:
  {アリオスさま。航海長はなんと?}
  _wait_for_keypress_then_clear()

.scriptA775:
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .script9EB7 )
  _disable_8x12_font()
  {うん‥‥}
  _eol()
  {航海長から新しい海域に入ったので、}
  _eol()
  {海図を描いてくれと頼まれたんだ。}
  _wait_for_keypress_then_clear()
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .scriptAE05 )
  _disable_8x12_font()
  {そうですか。}
  _eol()
  {‥‥確かアリオスさまは、}
  _eol()
  {絵画などは苦手のはずでは‥‥}
  _wait_for_keypress_then_clear()
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .script9EB7 )
  _disable_8x12_font()
  {はは、仕方ないさ。}
  _eol()
  {ところで、ダイモス‥‥その‥‥}
  _wait_for_keypress_then_clear()
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .scriptAE05 )
  _disable_8x12_font()
  {なんでしょう? アリオスさま。}
  _wait_for_keypress_then_clear()
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .script9EB7 )
  _disable_8x12_font()
  {私はもう、百騎長ではないのだから、}
  _eol()
  {いいかげん『さま』はよせ。}
  _wait_for_keypress_then_clear()
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .scriptAE05 )
  _disable_8x12_font()
  {はい。アリオスさま。}
  _eol()
  {‥‥あッ!!}
  _wait_for_keypress_then_clear()
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .script9EB7 )
  _disable_8x12_font()
  {やっぱりダメか‥‥}
  _set_bits_2b03_x( $01, $02 )
  _wait_for_keypress_then_end()

.scriptA877:
  {アリオスさまは、}
  _eol()
  {私めにとって大事な主君。}
  _wait_for_keypress_then_clear()
  {『アリオスさま』とお呼びしては}
  _eol()
  {いけませんか?}
  _wait_for_keypress_then_end()

.scriptA8AB:
  {アリオスさま。}
  _eol()
  {船倉の方に装備品がそろえてあります。}
  _wait_for_keypress_then_clear()
  {お急ぎにならずとも結構ですから、}
  _eol()
  {準備をなさってください。}
  _wait_for_keypress_then_end()

.scriptA8EF:
  _set_pen_then_call_then_eol( orange, .scriptAE0A )
  _tst_2b03_x_beq( $20, $01, .scriptA934 )
  _tst_2b03_x_beq( $20, $02, .scriptA934 )
  _tst_2b03_x_beq( $20, $04, .scriptA934 )
  _tst_2b03_x_beq( $20, $08, .scriptA934 )
  {しっかし、ここいらへんの海は}
  _eol()
  {穏やかだねぇ。}
  _wait_for_keypress_then_clear()
  {オレたちの故郷たぁ大違いさね。}
  _wait_for_keypress_then_end()

.scriptA934:
  {アリオスさん、結構広い船だからって、}
  _eol()
  {迷子になんかならないでくだせぇよ。}
  _wait_for_keypress_then_end()

.scriptA95F:
  _set_pen_then_call_then_eol( orange, .scriptAE0A )
  _tst_2b03_x_bnz( $01, $10, .scriptA9B8 )
  _tst_2b03_x_beq( $20, $01, .scriptA990 )
  _tst_2b03_x_beq( $20, $02, .scriptA990 )
  _tst_2b03_x_beq( $20, $04, .scriptA990 )
  _tst_2b03_x_beq( $20, $08, .scriptA990 )
  {船長、ご用は済んだんですかい?}
  _wait_for_keypress_then_end()

.scriptA990:
  {おや、船長。}
  _eol()
  {船倉は階段を下りて右、}
  _eol()
  {厨房のとなりですぜ。}
  _wait_for_keypress_then_end()

.scriptA9B8:
  {もうすぐ、見張り番を交代しますぜ。}
  _eol()
  {でも、マストに登るのは}
  _eol()
  {おっかねぇからなぁ‥‥}
  _wait_for_keypress_then_end()

.scriptA9E8:
  _set_pen_then_call_then_eol( orange, .scriptAE0A )
  _tst_2b03_x_bnz( $01, $10, .scriptAABD )
  _tst_2b03_x_bnz( $01, $04, .scriptAA8A )
  _tst_2b03_x_beq( $20, $01, .scriptAA55 )
  _tst_2b03_x_beq( $20, $02, .scriptAA55 )
  _tst_2b03_x_beq( $20, $04, .scriptAA55 )
  _tst_2b03_x_beq( $20, $08, .scriptAA55 )
  _conditional_jump( $c9, $b0, $10, $2aa5, .scriptAA66 )
  _conditional_jump( $c9, $b0, $20, $2aa6, .scriptAA66 )
  _conditional_jump( $c9, $b0, $30, $2aa7, .scriptAA66 )
  {いや~ 武器を持つと}
  _eol()
  {船長は見違えるね~}
  _wait_for_keypress_then_clear()
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .script9EB7 )
  _disable_8x12_font()
  {よ、よしてくれないか。}
  _set_bits_2b03_x( $01, $04 )
  _wait_for_keypress_then_end()

.scriptAA55:
  {いや~ 男はやっぱ海だよね~}
  _wait_for_keypress_then_end()

.scriptAA66:
  {船長、武器とかは}
  _eol()
  {ちゃんと装備しておいた方がいいですぜ。}
  _wait_for_keypress_then_end()

.scriptAA8A:
  {しかし、思い出すね~}
  _wait_for_keypress_then_clear()
  {3年前、船長を乗せて}
  _eol()
  {イクティア島まで運んだときのことを‥‥}
  _wait_for_keypress_then_end()

.scriptAABD:
  {船長、武器や防具は、}
  _eol()
  {装備してこそ意味があるんです。}
  _eol()
  {忘れねぇでくだせぇよ。}
  _wait_for_keypress_then_end()

.scriptAAEF:
  _set_pen_then_call_then_eol( orange, .scriptAE0A )
  _tst_2b03_x_bnz( $01, $10, .scriptAB65 )
  _tst_2b03_x_beq( $20, $01, .scriptAB30 )
  _tst_2b03_x_beq( $20, $02, .scriptAB30 )
  _tst_2b03_x_beq( $20, $04, .scriptAB30 )
  _tst_2b03_x_beq( $20, $08, .scriptAB30 )
  {もうすぐで、久しぶりの陸ですぜ。}
  _eol()
  {あと少し、ガマンしてくだせえ。}
  _wait_for_keypress_then_end()

.scriptAB30:
  {3年前、船を使ってもらった先生に}
  _eol()
  {頼まれたとはいえ、}
  _eol()
  {まさかこんな所までくるとはなぁ‥‥}
  _wait_for_keypress_then_end()

.scriptAB65:
  {ようやく、陸が見えるように}
  _eol()
  {なってきやした。}
  _eol()
  {長かったなぁ‥‥}
  _wait_for_keypress_then_end()

.scriptAB88:
  _modify_script_variable( $01, $10 )
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .scriptAE05 )
  _disable_8x12_font()
  {アリオスさま。いよいよですな。}
  _wait_for_keypress_then_clear()
  _modify_script_variable( $00, $11 )
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .script9EB7 )
  _disable_8x12_font()
  {うん。}
  _eol()
  {もうすぐリュコスの向かった新大陸か。}
  _wait_for_keypress_then_clear()
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .scriptAE05 )
  _disable_8x12_font()
  {そうです。}
  _eol()
  {上陸にそなえてください。}
  _wait_for_keypress_then_clear()
  {船倉の方に装備品がそろえてあります。}
  _eol()
  {お急ぎにならずとも結構ですから、}
  _eol()
  {準備をなさってください。}
  _wait_for_keypress_then_end()

.scriptAC22:
  _set_pen_then_call_then_eol( orange, .scriptAE0F )
  {大変です! 船長!!}
  _wait_for_keypress_then_clear()
  _call_asm_from_script( .codeAD98 )
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .scriptAE05 )
  _disable_8x12_font()
  {どうした、なにがあった?}
  _wait_for_keypress_then_clear()
  _set_pen_then_call_then_eol( orange, .scriptAE0F )
  {お捜しの船を見つけたんです!}
  _wait_for_keypress_then_clear()
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .script9EB7 )
  _disable_8x12_font()
  {なんだって!?}
  _wait_for_keypress_then_clear()
  _set_pen_then_call_then_eol( orange, .scriptAE0F )
  {前方の島の暗礁地帯に}
  _eol()
  {乗り上げているのを確認しました。}
  _eol()
  {間違いありません。}
  _wait_for_keypress_then_clear()
  _modify_script_variable( $00, $11 )
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .script9EB7 )
  _disable_8x12_font()
  {リュコスの船だ!}
  _eol()
  {やはり遭難していたんだ。}
  _eol()
  {すぐに向かおう。}
  _wait_for_keypress_then_clear()
  _modify_script_variable( $01, $10 )
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .scriptAE05 )
  _disable_8x12_font()
  {しかし困りましたな‥‥}
  _eol()
  {あのような岩ばかりでは}
  _eol()
  {近寄ることも難しい。}
  _wait_for_keypress_then_clear()
  _call_asm_from_script( .codeADBE )
  _set_pen_then_call_then_eol( orange, .scriptAE0A )
  {なに、島をぐるりと回りゃ}
  _eol()
  {船の入れるところもあるでしょう。}
  _wait_for_keypress_then_clear()
  {いざとなったら、ボートを使えばいい。}
  _wait_for_keypress_then_clear()
  _call_asm_from_script( .codeADDC )
  _enable_8x12_font()
  _set_pen_then_call_then_eol( orange, .script9EB7 )
  _disable_8x12_font()
  {よし、右舷回頭、面舵一杯!}
  _eol()
  {暗礁地帯をさけて、島に近づくぞ。}
  _eol()
  {急いでくれ!!}
  _wait_for_keypress_then_clear()
  _set_pen_then_call_then_eol( orange, .scriptAE17 )
  {アイアイサー!}
  _call_asm_from_script( .codeADFF )
  _wait_for_keypress_then_end()

.scriptAE05: ; 8x12 font
  {ダイモス}
  _end()

.scriptAE0A:
  {船員}
  _end()

.scriptAE0F:
  {見張り番}
  _end()

.scriptAE17:
  {船員たち}
  _end()

.endAE1E:
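
If you'd like to poke at a listing like this with your own tools, the format is regular enough to split up with one regular expression. This is only a sketch of a reader for the text shown above, not the actual script-assembler; the `parse_line` and `branch_targets` names and the tuple layout are made up for the example.

```python
import re

# One line is a label, a {text} string, or a _command()/@directive() call,
# optionally followed by a ';' comment.
LINE_RE = re.compile(r"""
    ^\s*(?:
        (?P<label>\.\w+):                          |
        \{(?P<text>.*)\}                           |
        (?P<name>[@_]\w+)\(\s*(?P<args>.*?)\s*\)
    )\s*(?:;.*)?$
""", re.VERBOSE)


def parse_line(line):
    """Return ('label', name), ('text', string), ('call', name, args) or None."""
    match = LINE_RE.match(line)
    if match is None:
        return None                       # blank or unrecognised line
    if match.group("label") is not None:
        return ("label", match.group("label"))
    if match.group("text") is not None:
        return ("text", match.group("text"))
    args = match.group("args")
    arg_list = [a.strip() for a in args.split(",")] if args else []
    return ("call", match.group("name"), arg_list)


def branch_targets(parsed):
    """Labels referenced by a call, e.g. the destination of a branch."""
    if parsed is None or parsed[0] != "call":
        return []
    return [arg for arg in parsed[2] if arg.startswith(".")]


for sample in [
    ".scriptAE05: ; 8x12 font",
    "  _tst_2b03_x_bnz( $01, $02, .scriptA877 )",
    "  {帆船ローランディア号}",
    "  _end()",
]:
    print(parse_line(sample))
```

Collecting `branch_targets` for every call gives a quick picture of which text blocks each conversation can reach.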

NightWolve

Dang, that's a messy job, so they spread text around inbetween code... :/ Falcom does usually go with a text block model of null-terminated strings and indexing into the block, but I believe Zwei! was one other exception (plus here for Xanadu) where you had most of the script stored as in-lined strings inbetween code.

elmer

#91
Quote from: NightWolve on October 23, 2015, 06:14:28 PMDang, that's a messy job, so they spread text around inbetween code...
Hahaha ... that's the clean version!  :wink:

Back when I was still trying to reverse-engineer complete script-blocks back into editable-source, you'd also see assembly code and data interleaved in there, too!  ](*,)

It's a powerful script-language, and a very powerful technique for making an 8-bit machine's macro-assembler act as a high(ish)-level language.

I think that the whole scheme fell out of favor in the early years of the 5th generation when the "C" compiler replaced the macro-assembler ... but these kinds of languages are back-with-a-vengeance these days with Lua, Unreal Script, and every-other-developer's-proprietary-bytecode-interpreted-language.

At least most developers build games now with the knowledge that they need to think about localization into other languages.

elmer

As I'm sitting here, writing the script-assembler (it's too simple to call a compiler), and wondering if it's too early for a beer ... I'm dragging in various bits of code from old projects to help.

Since Bonknuts talked about writing a scripting language in another thread, I thought that I'd point out this article (and concept) for any programmer that's not seen it.

http://cowboyprogramming.com/2007/01/04/practical-hash-ids/

The first time that I saw it totally integrated into a game's engine/toolchain was in NeverSoft's Tony Hawk Engine.

It's an amazingly powerful method for solving a lot of game asset-management issues on 32-bit machines, and for dynamically linking script bytecode to "C" code.
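
The basic trick is small enough to show in a few lines of Python. Treat this as a sketch of the general idea rather than of any particular engine: I'm assuming CRC-32 of the lower-cased name as the hash, and the registry, decorator and function names are invented for the example.

```python
import zlib


def hash_id(name: str) -> int:
    """32-bit ID for a name (assumed scheme: CRC-32 of the lower-cased name)."""
    return zlib.crc32(name.lower().encode("ascii")) & 0xFFFFFFFF


NATIVE_FUNCTIONS = {}  # hash ID -> native function


def native(func):
    """Register a native function under the hash of its name."""
    NATIVE_FUNCTIONS[hash_id(func.__name__)] = func
    return func


@native
def wait_for_keypress():
    return "waiting"


def call_by_id(function_id, *args):
    """What a bytecode interpreter would do with an ID found in a script."""
    return NATIVE_FUNCTIONS[function_id](*args)


# The script compiler only has to emit the 32-bit number, never the string.
SCRIPT_CONSTANT = hash_id("Wait_For_Keypress")
print(hex(SCRIPT_CONSTANT), call_by_id(SCRIPT_CONSTANT))
```

Tools and runtime never need to share a table of names; they only need to agree on the hash function.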

I've been using my own version of the idea since the early 2000's ... and when I finally get back to working on a PC-FX toolchain/library, you'll probably see it as one of the "core concepts".

elmer

From the "System Card Dreams" thread ...

Quote from: elmer on October 25, 2015, 09:51:21 PMI could make the Xanadu translations require extra memory in order to create a market ... and it would definitely make my job easier. But that doesn't seem like a nice thing to do, even if it might slightly lower the chance of a PCEWorks boxed-set.
For anyone that's reading through both these threads, I thought that I'd explain how this would make things easier ...

If I had another 256KB RAM on a "Translation Card", then I'd just load Xanadu's META_BLOCKs directly into the extra RAM.

That would avoid the need to change the game's compressor to a new one, and SamIAm's translations could be as-big-as-they-need with little effort on my part.

The extra RAM would also make it easy to solve the space-in-the-game problem for the decompressed English scripts, and also give me plenty of room for English fonts and a new VFW font routine.

These are things that would save me a fair amount of time-and-effort in getting the translation done.

The downside would be that then the META_BLOCKs wouldn't fit onto the CD anymore.

That can be solved by increasing the size of the ISO ... but it does show that there's no quick-fix that doesn't have consequences, and we're just moving programmer-effort from one problem into a different one.

Now ... if I wasn't able to save memory by switching the compressor, then having this extra memory would be the only way that you'd get a decent translation.

Food-for-thought.  :-k

OldMan

Quotethere's no quick-fix that doesn't have consequences,
There never is <lol>. It's always a trade-off :)

A very silly question: Would it be possible to load the meta-blocks 'on the fly' from the cd? Or are they dependent on each other? (ie, block A calls block B such that both have to be in memory at once)

TurboXray

There's a lot of free space in the ISO track - you don't need to expand it. It might look full, but there are lots of segments of left over 16bit PCM data in there. If you open the ISO track in a viewer that can interpret all data as PCE sprite format, you'll see that 16bit wave files have a very unmistakably unique signature to them. I usually run TMOD2 under dosbox (runs in XP too, but not on my win7 setup), with dshadoff's PCE plugins for it.

elmer

#96
Quote from: TheOldMan on October 26, 2015, 01:14:07 PMThere never is <lol>. It's always a trade-off :)
Haha .. "yep", such is the life of a programmer ... or an electrical engineer, or most "design" jobs for that matter.   :wink:


QuoteA very silly question: Would it be possible to load the meta-blocks 'on the fly' from the cd? Or are they dependent on each other? (ie, block A calls block B such that both have to be in memory at once)
AFAIK Xanadu 1 just loads a single large META_BLOCK (176KB max size) for each game level.

That's a mix of graphics and code and script.

For instance, the 1st level's META_BLOCK is 160KB large and contains 71 individual DATA_CHUNKs, only 13 of which actually contain Falcom's script-language data.

In order to "split" that into 2 META_BLOCKs and dynamically switch between them, I'd have to totally understand exactly what's going on inside each of the 71 DATA_CHUNKs, and to then copy the ones that are needed for each of the 2 new META_BLOCKs, and also fix up the code/script in each to use the new DATA_CHUNK indexes.

Then I'd also need to understand the game well enough to hack in this dynamic-loading system, and to trigger it in places that won't break anything.

I suspect that it would just be easier to rewrite the game from scratch!

Now Xanadu 2 already implements a dynamic loading system ... that's its big improvement, and it's why it can afford to use much prettier graphics.

So it's somewhat more possible to split its META_BLOCKs into 2, but it would still be a major nightmare that would IMHO mean completely reverse-engineering the entire game.

So "possible"? Probably so with enough time and will, but not even slightly "practical" in any sane definition of the word.

Nope ... IMHO in order to get this actually done and not have it die-on-the-vine like the previous attempt, we're going to have to keep the basic structure of the game unaltered, and just change the details (such as the language of the text).

It's going to be the same when it comes to the font code.

While it would be "nice" to change things to use a full VWF ... the structure of the code is going to make that a little awkward.

The first step will be to use the 12x12 English SJIS glyphs in the ROM font, just like EsperKnight did.

It'll be ugly ... but it's just going to be a stop-gap measure to make sure that things don't break.

Then it'll be time to see what's the best way to make it all look nice, and to give SamIAm plenty of room on the screen for his translated text.

But that's really a problem that will get more serious attention when the English scripts are inserting correctly.

elmer

Quote from: TurboXray on October 26, 2015, 01:43:32 PMThere's a lot of free space in the ISO track - you don't need to expand it. ... I usually run TMOD2 under dosbox (runs in XP too, but not on my win7 setup), with dshadoff's PCE plugins for it.
Thanks for the suggestion!

I've never heard of those tools, and will have to check them out.  :D

It looks like the CD is pretty full on Xanadu 2, so that's great to hear.

They're definitely using fixed intervals for the data in the ISO track, and they didn't clean out all the old stuff in there ... so it'll be good to have some help in identifying what data is just left-over junk.

Xanadu 1 looks to have plenty of space left to just expand the ISO ... but then they could have hard-coded the CD audio locations, which would be annoying.

Either way ... I'm really hoping to avoid having to change any of the data locations at this point.

TurboXray

Here's a pic of what 16bit wave files look like in PCE sprite format:
IMG

It's very distinct.


About 99.99999% of PCE games have junk interleaved between the data in the ISO tracks. Sometimes it's other games, or wave files, or even dev stuff (one game has the full source code to it in there along with developers notes, etc). Basically, whatever was on the harddrive at the time got copied into the ISO track.

TurboXray

Here's the link: http://pcedev.net/utils/tmod2_pce_support.zip
If you run it under dosbox, you'll need to set the memory to 64megs. Under XP, I was able to open files as large as 500megs. Under dosbox though, I think my limit was 200-300 megs before the app reported that it ran out of memory. It can have up to 7 files open at once. There's no readme file, so you have to click the "?" icon for how to use the features.

 At some point, I'm going to remake this util in windows (using SDL, with my own internal GUI support). Once you get to know your way around it, it has some really useful functionality.  I'll add more view features though (and control of bitplane orders).