RIP to BT Garner of MindRec.com... He passed away early 2023 from health problems. BT was one of the top PCE homebrew developers and founder of the OG Turbo List, then PCECP.com. Condolences to family and friends.

Cyber Knight translation

Started by megatron-uk, 02/05/2014, 01:03 PM


NightWolve

Ah man, I wish we had that debugger (David and I) back when we worked on Ys IV! Looks pretty decent. I used a brute-force method to find a string pointer and I got lucky at about 3000 or so byte changes to the state file per reloading in the YAME emulator! Heh-heh. Did what I had to do, and it managed to work after enough patience!

Basically, I wrote a Perl script that opened the YAME state file at the start of RAM, and it would INC a byte, then pause till I'd hit return. I would ALT+TAB to YAME, press the quick key to load a state file (set to F5), load the menu, see if I changed the text in question, if not, ALT+TAB back to the Perl script now open in a CMD window, press enter, and so then it would DEC the byte that it just changed, move to the next one, INC it, pause, rinse and repeat for hundreds and hundreds and hundreds of times till you find the byte that you're looking for!!! So I found it after like 3 or 4 days.... If not, David would've had to hand trace some assembly for me, but it didn't come to that for this issue.
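The loop described above can be sketched roughly as follows. This is a Python rewrite for illustration, not the original Perl script; the function name `brute_force_probe`, its parameters, and the "answer y when the text changed" convention are all assumptions.

```python
from pathlib import Path


def brute_force_probe(state_path, start, count, wait=input):
    """Bump one byte at a time in a save-state file, pausing between tries.

    For each offset: increment the byte, save the file, wait while the user
    reloads the state in the emulator and checks the screen, then restore
    the original value before moving on. Returns the offset the user flags
    with 'y', or None if nothing in the range was flagged.
    """
    path = Path(state_path)
    data = bytearray(path.read_bytes())
    end = min(start + count, len(data))
    for offset in range(start, end):
        original = data[offset]
        data[offset] = (original + 1) & 0xFF   # the "INC" step
        path.write_bytes(data)
        answer = wait("offset 0x%06X bumped, reload the state; 'y' if the text changed: " % offset)
        data[offset] = original                # the "DEC" step (restore)
        path.write_bytes(data)
        if answer.strip().lower() == "y":
            return offset
    return None
```

Restoring each byte before touching the next one is what keeps the state file loadable, which is why the search can only move one byte at a time.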


TurboXray

Quote from: NightWolve on 02/09/2014, 02:26 PM
Ah man, I wish we had that debugger (David and I) back when we worked on Ys IV! Looks pretty decent. I used a brute-force method to find a string pointer and I got lucky at about 3000 or so byte changes to the state file per reloading in the YAME emulator! Heh-heh. Did what I had to do, and it managed to work after enough patience!

Basically, I wrote a Perl script that opened the YAME state file at the start of RAM, and it would INC a byte, then pause till I'd hit return. I would ALT+TAB to YAME, press the quick key to load a state file (set to F5), load the menu, see if I changed the text in question, if not, ALT+TAB back to the Perl script now open in a CMD window, press enter, and so then it would DEC the byte that it just changed, move to the next one, INC it, pause, rinse and repeat for hundreds and hundreds and hundreds of times till you find the byte that you're looking for!!! So I found it after like 3 or 4 days.... If not, David would've had to hand trace some assembly for me, but it didn't come to that for this issue.
That's hardcore old skool!

megatron-uk

Ok, I now have two reproducible dialogue examples and have found where the string is located in logical memory for each of them.

The first is the message from MICA which activates if you stray too far on initial landing:

ROM position: 0x01DEFE
Logical position: 5EFE
Hex sequence: 4D 49 43 41 A2 BB B8 BE DD 5C B4 D8 B1 5C B6 D7 20 CA BD DE DA C3 B2 CF BD A3 08

The second is a message from the guards in the first town, move to them and they will activate a sequence where they check whether you are allowed to pass them. After that, this phrase will trigger when you then talk to each guard alternately:

ROM position: 0x01423E
Logical position: 423E
Hex Sequence: C0 C0 DE B2 CF 20 B7 DD D1 C1 AD B3 C3 DE B1 D8 CF BD 08

Now, I can see the text actually mapped in via the logical location that I have the read breakpoints set to (and the hex code matches that I scan in the rom). But I can't work back from that to figure out where the pointer to the string is loaded from.
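The relationship between the two ROM positions and their logical addresses can be checked with a little arithmetic. The sketch below assumes a headerless ROM, 8 KB banks, and that the bank holding the text is mapped through MPR 2 (logical $4000-$5FFF); the helper name `rom_to_logical` is made up for this example.

```python
BANK_SIZE = 0x2000  # the HuC6280 maps memory in 8 KB pages


def rom_to_logical(rom_offset, mpr):
    """Return (bank number, logical address) for a headerless ROM offset.

    `mpr` is the mapping register (0-7) the bank is assumed to be paged
    into; each register covers one 8 KB window of the 64 KB logical space.
    """
    bank = rom_offset // BANK_SIZE
    offset_in_bank = rom_offset % BANK_SIZE
    return bank, mpr * BANK_SIZE + offset_in_bank


for rom_offset in (0x01DEFE, 0x01423E):
    bank, logical = rom_to_logical(rom_offset, mpr=2)
    # 0x01DEFE -> bank 0x0E, logical $5EFE
    # 0x01423E -> bank 0x0A, logical $423E
    print("ROM 0x%06X -> bank 0x%02X, logical $%04X" % (rom_offset, bank, logical))
```

Both strings land at their observed logical positions with MPR 2, which suggests the dialogue banks are swapped in and out of the $4000 window.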

OldMan

I assume the read breakpoints are at the actual text addresses.

When you hit the break, write down the pc (program counter) and which page is at that address.
Then set a breakpoint there, and remove the one on the text. If you start again, and go for the same message, you should break at the -code- where the address is accessed. From there, it's a matter of looking at the disassembly of the code in mednafen, and seeing what is being done before the text is printed. Most probably, the text is being loaded indirectly (ie, lda [ax], x) from some address in the zero page. Examine that address, and your text pointer table should be there.

Also keep in mind mednafen has a memory viewer, so you can set it to the zero page and look at the stack to see what routines have been called to get where you are.

Hope that helps.

megatron-uk

#55
Ok, so taking the guards example; the read breakpoint for the actual text string halts on the following instruction:

CD54: LDA ($30), Y (@ $423E = $C0 ; B1 30)
It appears that 0x00014000 is mapped in at $4000 at that point as the text is visible at $423E.
I clear the read breakpoint on *0001423E and set it for CD54. Stepping through it calls the same line with different offsets, the first one it breaks on is:

CD54: LDA ($30), Y (@ $400B = $00 ; B1 30)
After that it does

CD56: BEQ $CD9F ;F0 47
CD58: CMP #$20  ;C9 20

CD9F: INY ;C8
CDA0: JSR $C287;20 87 C2

C287: TYA ;98
C288: CLC ;18
C289: ADC $30 (@ $2030 = $0B) ;65 30
C28B: STA $30 ;85 30
C28D: BCC $C291 ;90 02

C291: RTS ;60

CDA3: DEC $34 (@ $2034 = $12) ;C6 34
CDA5: BNE $CD51 ;D0 AA

CD51: LDY #$FF ;A0 FF
CD53: INY ;C8
CD54: LDA ($30), Y (@ $400C = $1A) ;B1 30

This continues for hundreds and hundreds of cycles, incrementing the address used by the LDA @ PC CD54 by one each time - so many that I just had to hold 'r' until it popped up the dialogue box.

Edit: Looking back at both traces, in each case it seems the operand to load the text string address is sourced from $5E (as both $5E and $5F are being set earlier).

NightWolve

#56
Quote from: TurboXray on 02/09/2014, 05:13 PM
That's hardcore old skool!
Yeah, a "brute-force linear search," so the search takes anywhere from 1 to n tries. Worst case scenario with 256 KB RAM, the byte that you're looking for is the last one, so I would've had to press "Enter", "Alt+Tab", "F5", etc. each about 262,144 (x3-4) times!!! Heh-heh. No cheating either: you couldn't, say, INCrement/DECrement 10 bytes at a time in the state file to increase your search range and then test in-game, because you would likely damage computer instructions at the same time and crash it, so it was gonna have to be one byte at a time if you were gonna do it!

Anyway, there goes a "protip" for ya megatron, that is, if for some reason you had no hope of finding something any other way!  :lol: I wanted to find some way to usefully contribute to your thread, so this was the best that I could do! ;)

megatron-uk

Quote from: NightWolve on 02/10/2014, 04:55 AM
Anyway, there's a "protip" for ya megatron, that is, if for some reason you had no hope of finding something any way else! ;)
:lol: Thanks for that one

OldMan

It looks like $30 and $31 hold a pointer to the current character to print. The code posted appears to be part of the character-convert code (ie, the stuff that interprets the control characters.)
Note that bne $cd51 - that's probably the jump to re-start the loop with another character.
If you set a break there, you should be able to step and see one character at a time appear.

You say it sets $5e and $5f to the string address earlier. Set a break on that, and see what appears on screen when it breaks. If you see new strings each time, then whatever is being put into $5e and $5f may be your string pointer table.
In any case, you now know where the logic is to interpret the strings. From that you should be able to deduce all the control codes, and what information they use.

megatron-uk

Thanks again for the info - I'll take a look at that and see if I can find where $5e/$5f are getting set.

TurboXray

I looked at the code, when in the first town. You're right, there's no pointer table. Not in the traditional sense.

 The text block is parsed from the very first string of the block until it reaches 0x00 (I assume End of String), then it decrements the internal counter (which is the string offset mechanism). This is an old (and sloppy) method of text string handling. But on that note, it's easier for you: you don't need to worry about any pointer tables. Simply replace the strings. You'll have to reposition all the following strings, but that's cake.

 You'll just have to locate the block of text. Which, 1) you can do visually if you have a hex editor with custom table/font, 2) could probably do this as you play through the game (find out which bank is loaded at MPR #2 and 3 in the debugger).
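As a rough illustration of the "count terminators instead of using a pointer table" lookup described above, here is a small Python sketch. It only models the idea; the function name `find_string` and the 0-based `index` are assumptions, and the game's real routine (the loop around $CD51 with its counter in $34) may count differently.

```python
def find_string(block, index, terminator=0x00):
    """Return string number `index` (0-based) from a block of packed strings.

    There is no pointer table: the block is walked from its first byte and
    one terminator is skipped per string until the wanted one is reached.
    """
    pos = 0
    for _ in range(index):
        # Skip a whole string by jumping just past its terminator.
        pos = block.index(terminator, pos) + 1
    end = block.index(terminator, pos)
    return block[pos:end]


block = b"AB\x00CDE\x00F\x00"
print(find_string(block, 1))  # b'CDE'
```

The cost of a lookup grows with the string's position in the block, which fits the long run of repeated reads seen in the trace before the dialogue box appears.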

megatron-uk

Really? It's as straightforward as that?  #-o

I'll have to work backwards from that string I know and find the start - like you say, that won't be difficult (even with the two-char set font table). It looks like there is some extra space to play with (in case some strings need to be made longer) as there is a block of some 40-odd 0-bytes at the end of that big block of text. This seems to be the case with several other big blocks of text, very useful!

It shouldn't be too difficult to extract those strings now. What would be interesting is to find out what those pointer-table looking structures are though.

megatron-uk

Actually, it doesn't seem quite as simple. Certain text blocks appear to use 0x00 as delimiters, but others are missing 0x00 and instead use 0x08.... and (at least) the intro cinematic uses a pairing of 0x04 followed by 0x3C (guessing one is end of string, one is start).

I'll need to work out which block uses which delimiter method and dump it using the appropriate method. Not so bad really though.

I wonder if they all use the same 'scan N strings until you get to X' in order to display the correct text though?
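One way to dump blocks that use different delimiters is to make the terminator a parameter. The sketch below is only an illustration: `split_strings` is a made-up helper, and treating the 0x04 0x3C pair as a single end marker is an assumption based on the guess above.

```python
def split_strings(block, terminator):
    """Split a raw text block into strings, keeping each terminator.

    `terminator` is a bytes object, e.g. b"\x00" for the dialogue blocks or
    b"\x04\x3c" for the intro cinematic (assumed to act as one end marker).
    Trailing bytes with no terminator after them are ignored.
    """
    strings = []
    pos = 0
    while pos < len(block):
        end = block.find(terminator, pos)
        if end == -1:
            break
        end += len(terminator)
        strings.append(block[pos:end])
        pos = end
    return strings


dialogue = split_strings(b"AB\x00C\x00", b"\x00")
intro = split_strings(b"\x11AB\x04\x3c\x0cC\x04\x3c", b"\x04\x3c")
```

Keeping the terminator with each string makes it easy to check that a re-inserted translation still ends the same way as the original.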

TurboXray

Yes, the routine/code specifically looks for 0x00 or 0x08. I figured you knew what all the control codes do. Is 0x08 a redirection? That's common in games. Strings split and re-continue at other places. It's usually used as a sort of 'compression' for redundant text. The BubbleGum Crash game did this.

 If you run out of string space, you can write hook code that uses your own control codes to point to text that's in expanded rom area (again, did this with BGC on PCE).

megatron-uk

Quote from: TurboXray on 02/10/2014, 05:51 PM
Yes, the routine/code specifically looks for 0x00 or 0x08. I figured you knew what all the control codes do. Is 0x08 a redirection?
No, I don't think 0x08 is used like that, at least, I've not seen any examples of it used in that context. That said, I've not gone far into the game to see much text, so it may be feasible.

The control codes I've worked out so far are:

00 - Delimiter type 1
02 - Newline
03 - Newline
04 - Delimiter start type 2
3C - Delimiter end type 2
08 - Delimiter type 3
0D - Pad with N leading spaces

I'm going to spend some time playing with the various 2-byte control codes at the start of some of these blocks - they're a dialogue window property pairing.

TurboXray

#65
0x08 is wait for button to be pressed, for text to continue.
0x00 is string terminator (end of string).

 But there's a TST #$01, <$60 instruction right after fetching the byte from the string. Not sure what it does, but when the condition is met it uses a different set of compares. Looks like the game modifies this, probably from a control code. I put $01 in there and the game put a bunch of blocks on screen.
 

 It's possible the game uses more than one type of text format/routine. I'd concentrate on the town text first, since that'll be the bulk of the game text.


TurboXray

#66
All normal control codes appear to be 0x00-0x1f. A few seem to be in the upper range (above 0x7f), but seem to be directly text related (probably Japanese accent marks).

 I didn't see anything for 0x3c.

 Read char:
 if >0x20, do...
 else if ==0x5c, do..
 else if <0xDE, do..
else if >=0xE0, do...


 If <0xDE true, then
 if <0xA0, do...
else do...

megatron-uk

Yep, 0xDE/0xDF are the accent marks that appear over several characters - I've got them mapped in my table, so rather than printing two characters out when I dump the text, it will print the proper accented character.
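Merging an accent byte into the preceding character can also be done with Unicode normalization instead of listing every accented kana by hand. This is a sketch of one possible approach, not the method used in the actual table; it assumes 0xDE is the dakuten and 0xDF the handakuten, and `merge_accents` is a hypothetical helper.

```python
import unicodedata

# Assumed meaning of the two accent bytes: combining dakuten / handakuten.
COMBINING = {0xDE: "\u3099", 0xDF: "\u309A"}


def merge_accents(chars):
    """Fold accent bytes into the character before them.

    `chars` is a list of (byte_value, character) pairs in string order,
    i.e. the result of a plain one-byte-to-one-character table lookup.
    """
    out = []
    for value, char in chars:
        if value in COMBINING and out:
            # NFC composes e.g. U+3057 + U+3099 into the single kana U+3058.
            out[-1] = unicodedata.normalize("NFC", out[-1] + COMBINING[value])
        else:
            out.append(char)
    return "".join(out)


print(merge_accents([(0xCA, "は"), (0xBC, "し"), (0xDE, "゛"), (0xD2, "め")]))
```

If the previous character cannot take the mark, NFC leaves the combining mark in place after it, so nothing is silently dropped.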

megatron-uk

Quote from: TurboXray on 02/10/2014, 07:01 PM
0x08 is wait for button to be pressed, for text to continue.
0x00 is string terminator (end of string).
Ah, interesting. That would explain why certain strings (such as cinematics) don't have it, as they automatically close after a short period of time.
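The control codes worked out so far can be collected into a small annotator for eyeballing raw strings. The sketch below is an assumption-heavy illustration: the tag names (`<wait>`, `<pad>`, `<end>`, `<shift>`) are invented, 0x0D is assumed to take exactly one argument byte, and bytes outside printable ASCII are simply shown as hex rather than mapped to kana.

```python
# Codes as understood at this point in the thread; tag names are made up.
CONTROL_NAMES = {
    0x00: "<end>",    # string terminator
    0x02: "\n",       # newline
    0x03: "\n",       # newline
    0x08: "<wait>",   # wait for a button press
    0x5C: "<shift>",  # font switch byte (assumed to toggle the character set)
}
PAD = 0x0D            # "pad with N leading spaces", assumed one argument byte


def annotate(data):
    """Render raw string bytes with control codes shown as readable tags."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == PAD and i + 1 < len(data):
            out.append("<pad><%02X>" % data[i + 1])
            i += 2
            continue
        if b in CONTROL_NAMES:
            out.append(CONTROL_NAMES[b])
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))        # plain ASCII, e.g. "MICA"
        else:
            out.append("<%02X>" % b)  # unmapped byte, left as hex
        i += 1
    return "".join(out)
```

Running it over a string before and after editing makes it easier to spot a dropped wait or terminator byte.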

TurboXray

That's all for the in town text routine. Not sure about the other routines.

megatron-uk

Quote from: TurboXray on 02/06/2014, 12:52 PM
That aside, have you thought about upgrading any of the graphics? They're probably in 2bit/3bit format. If you expand the rom, you could hook the character/sprite upload routine to upload upgraded 4bit versions of the graphics. A LOT of early hucard games used simple 1/2/3bit graphic formats, because there is no decompression resource penalty (planar graphics) and can be uploaded to vram on the fly without decompressing to a buffer first.
I've been loading up the translated version for the SNES to compare things, and you know what? In terms of graphics it's not significantly better.

  • The ship screens are much more detailed and less blocky on the PCE.
  • The character portraits are more colourful on the SNES, but that gives them a more cartoonish appearance (debatable whether that is better or worse).
  • The mecha detail screens where you choose your weapons etc - the mecha designs on the PCE are much chunkier and more detailed. SNES designs are better coloured, but the designs are not as good and have a 'plastic' look to them.
  • Overworld map screen - it's pretty close between the two systems, neither of which is particularly great.
  • Town screen - brighter colours but not quite as much detail on the SNES.
  • Dialogue boxes - SNES has transparent windows, but text is much blockier than the PCE.
  • Battlefield screen - this is where the SNES clearly shines, the small non-animated mech/enemy sprites are more colourful and more detailed than on the PCE, which, imo, look horrible. Animations and the larger 'firing' sprites are actually fairly evenly matched, SNES edges it slightly, but the PCE is not significantly worse. One area the PCE sucks at is the ground/grass/sky texture. It's horrible and makes the whole screen look really messy.

Out of all of them, the one area the PCE could definitely do with improvement is the battle screen. The small sprites could do with being replaced and the ground/grass texture could actually do with having less detail, it's very 'noisy'.

megatron-uk

The mecha selection screen.

megatron-uk

Battle screen.

megatron-uk

Oh, and one area where I think the PCE stands head and shoulders above the SNES is the music score. The hard edged synth style music on the PCE really suits the game, whereas the SNES, although having more realistic sounds, just doesn't seem a right fit.

And the opening cinematic. The SNES tries to pull it off with some scaling routines, but it comes off looking corny compared to the anime-style panels showing the SS-Swordfish and the pursuing vessels on the PCE.

The sprites in the viewscreen before the jump drive look much better than the very blocky scaling used on the SNES intro.

Still, the battle map on the SNES (although smaller) is much nicer.

megatron-uk

#74
I've started work on writing an extractor for the script. It's not anywhere near finished yet, but it can pull sections out and writes a document with the actual characters as found in the font table.

It does basic substitution and uses the correct character based on whether it's in pre-font-shift or post-font-shift mode.

In the example I've given an arbitrary address range to translate. It would be useful to firm up the actual ranges used for the various dialogue sections at some point.

EDIT: There's a typo in the extractScript.py file as attached, the SWITCH_MODE byte should be defined simply as "5C", not as hex "\x5C". Change that and the font substitution will work correctly.

megatron-uk

This slight modification gives a better output.
Bear in mind that this extractor is only dealing with one type of string at the moment - that terminated by a 0x00. There are others (such as the intro text/cinematics and menus etc), and this will only work if you give the start and end address of a known block of text. That's the next key thing, IMO.

#!/usr/bin/env python

"""
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.


extractScript.py
================
A (basic, initial) script extractor for the PC-Engine game 'Cyber Knight'.

In order to run, it requires a headerless copy of the Cyber Knight ROM file
and a complete translation table (distributed with this program).

John Snowdon <john@target-earth.net>
"""

import struct
import binascii

ROM_NAME = "Cyber Knight (J).pce"
TABLE_NAME = "CyberKnightTranslation.csv"
OUT_NAME = "out.sjs"
METHOD_1 = 1
METHOD_2 = 2
METHOD_3 = 3
SWITCH_MODE = '5C'

def load_table():
    """
    load_table - load the translation table.
    The translation table is a tab delimited data file
    with the following columns:
    hex code, actual char pre-0x5c byte, char set type (A/K/H/S), post-0x5c byte, char set type, notes

    where A/S/H/K = ASCII, Symbol, Hiragana, Katakana
    pre-0x5c = the character shown if the byte comes before a 0x5c control byte
    post-0x5c = the character shown if the byte comes after a 0x5c control byte
    """
    trans_table = {}

    f = open(TABLE_NAME, "r")
    for line in f:
        columns = line.split('\t')
        byte_code = columns[0].replace('"', '')
        trans_table[byte_code] = {}
        trans_table[byte_code]["byte_code"] = byte_code
        trans_table[byte_code]["pre_shift"] = columns[1].replace('"', '')
        trans_table[byte_code]["pre_shift_type"] = columns[2].replace('"', '')
        trans_table[byte_code]["post_shift"] = columns[3].replace('"', '')
        trans_table[byte_code]["post_shift_type"] = columns[4].replace('"', '')
        trans_table[byte_code]["notes"] = columns[5].replace('"', '')
    f.close()
    return trans_table


def translate_string(byte_sequence, trans_table):
    """
    translate_string - construct the actual text, using multi-byte characters
    where appropriate, that represent the hex codes found in the rom.
    e.g. 0x1A 0x5F 0x76 0x61 0x62 0x63 0x64 0x65 0x00 = <control><control>vabcde<end>
    """

    # method1 has two leading control bytes and a null byte as terminator
    byte_sequence["text"] = []
    if (byte_sequence["method"] == METHOD_1):
        previous_b = ""
        switch_mode = False
        for i in range(2, len(byte_sequence["bytes"]) - 1):
            b = str(binascii.hexlify(byte_sequence["bytes"][i])).upper()
            if switch_mode:
                if b == SWITCH_MODE:
                    switch_mode = False
                else:
                    if b in trans_table.keys():
                        byte_sequence["text"].append(trans_table[b]["post_shift"])
                    else:
                        # warning - byte sequence not in table
                        print "WARNING: Untranslated byte <%s>" % b
                        byte_sequence["text"].append("<%s>" % b)
            else:
                if b == SWITCH_MODE:
                    switch_mode = True
                else:
                    if b in trans_table.keys():
                        byte_sequence["text"].append(trans_table[b]["pre_shift"])
                    else:
                        # warning - byte sequence not in table
                        print "WARNING: Untranslated byte <%s>" % b
                        byte_sequence["text"].append("<%s>" % b)
    return byte_sequence


def method1(ROM_NAME, rom_start_address, rom_end_address):
    """
    method1 - extract text from a given byte range using
    the notation of 2 control bytes, a variable number of
    text bytes and then a single null closing byte.
    e.g. 0x1A 0x2B 0x60 0x61 0x62 0x63 0x64 0x65 0x00
    """

    ttable = load_table()

    f = open(ROM_NAME, "rb")
    f.seek(rom_start_address, 0)
    rom_addr = rom_start_address

    byte_strings = []
    byte_sequence = {}
    byte_sequence["start_pos"] = rom_addr
    byte_sequence["bytes"] = []
    byte_sequence["size"] = 0
    byte_sequence["method"] = METHOD_1

    while (rom_addr <= rom_end_address):
        # Read a byte from the file at the current position
        try:
            byte = struct.unpack('c', f.read(1))[0]
            if byte != "\x00":
                # Add the byte
                byte_sequence["bytes"].append(byte)
            else:
                # Add the end byte and record the string
                byte_sequence["bytes"].append(byte)
                byte_sequence["size"] = len(byte_sequence["bytes"])

                # Generate the actual text string (which we will print for translation)
                byte_sequence = translate_string(byte_sequence, ttable)

                # Record the data
                byte_strings.append(byte_sequence)

                # Start a new byte sequence; it begins on the byte
                # after the terminator we have just read
                byte_sequence = {}
                byte_sequence["start_pos"] = rom_addr + 1
                byte_sequence["bytes"] = []
                byte_sequence["size"] = 0
                byte_sequence["method"] = METHOD_1

            # Increment position ID
            rom_addr += 1
        except struct.error:
            # Ran off the end of the file; stop rather than loop forever
            break
    f.close()
    return byte_strings

def method2(ROM_NAME, rom_start_address, rom_end_address):
    """
    method2 - extract text from a given byte range using
    the notation of each string being wrapped in a single
    control byte to start, and a single control byte to end.
    e.g. 0x3C 0x60 0x61 0x62 0x63 0x64 0x65 0x04
    """
    pass


byte_strings = method1(ROM_NAME, 0, 500000)
f = open(OUT_NAME, "w")
for b in byte_strings:
    f.write("Position: %s\n" % b["start_pos"])
    f.write("Method: %s\n" % b["method"])
    f.write("Length: %s\n" % b["size"])
    f.write("Raw:\n")
    for c in b["bytes"]:
        f.write(str(binascii.hexlify(c)))
        f.write(' ')
    f.write('\n')
    f.write("Text:\n")
    for c in b["text"]:
        f.write(c)
    f.write('\n\n')
f.close()

megatron-uk

Here's the latest version of the script extractor. I've made it neater, moved the definition of the dialogue ranges up to the configuration options at the top of the file and have a help screen for the various input/output file options.

Run in its current form, it should generate an output file like this:

[
    {
        "block_range" : "0x1c87e-0x1c90d",
        "block_description" : "Main menu text and configuration options, start, continue, load, stereo/mono etc",
        "position" : "0x1c87e",
        "method" : 3,
        "start_bytes" : [],
        "raw_size" : 39,
        "raw" : ["20","ca","bc","de","d2","b6","d7","02","20","c2","c2","de","b7","b6","d7","02","20","5c","b8","db","b0","dd","5c","bb","b2","be","b2","02","20","b6","dd","b7","ae","b3","be","af","c3","b2","00"],
        "raw_text" : " はじめから\n つづきから\n クロ‐ンさいせい\n かんきぉうせってい",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1c87e-0x1c90d",
        "block_description" : "Main menu text and configuration options, start, continue, load, stereo/mono etc",
        "position" : "0x1c8a5",
        "method" : 3,
        "raw_size" : 27,
        "raw" : ["20","5c","ca","de","af","b8","b1","af","cc","df","d2","d3","d8","02","20","ba","b0","c4","de","5c","c6","ad","b3","d8","ae","b8","00"],
        "raw_text" : " ックアッメモリ\n コ‐にうりぉく",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1c87e-0x1c90d",
        "block_description" : "Main menu text and configuration options, start, continue, load, stereo/mono etc",
        "position" : "0x1c8c0",
        "method" : 3,
        "raw_size" : 26,
        "raw" : ["20","5c","d2","af","be","b0","bc","de","5c","bf","b8","c4","de","02","20","5c","bb","b3","dd","c4","de","d3","b0","c4","de","00"],
        "raw_text" : " メッセ‐そくど\n サウンモ‐",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1c87e-0x1c90d",
        "block_description" : "Main menu text and configuration options, start, continue, load, stereo/mono etc",
        "position" : "0x1c8da",
        "method" : 3,
        "raw_size" : 23,
        "raw" : ["20","20","b5","bf","b2","20","20","cc","c2","b3","20","20","ca","d4","b2","20","20","5c","c0","b0","ce","de","00"],
        "raw_text" : "  おそい  ふつう  はやい  タ‐",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1c87e-0x1c90d",
        "block_description" : "Main menu text and configuration options, start, continue, load, stereo/mono etc",
        "position" : "0x1c8f1",
        "method" : 3,
        "raw_size" : 14,
        "raw" : ["20","20","5c","bd","c3","da","b5","20","20","d3","c9","d7","d9","00"],
        "raw_text" : "  ステレオ  モノラル",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1c87e-0x1c90d",
        "block_description" : "Main menu text and configuration options, start, continue, load, stereo/mono etc",
        "position" : "0x1c8ff",
        "method" : 3,
        "raw_size" : 15,
        "raw" : ["0d","06","43","59","42","45","52","20","4b","4e","49","47","48","54","00"],
        "raw_text" : "<pad><06>CYBER KNIGHT",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1defc",
        "method" : 1,
        "start_bytes" : ["1a","5f"],
        "raw_size" : 32,
        "raw" : ["1a","5f","4d","49","43","41","a2","bb","b8","be","dd","5c","b4","d8","b1","5c","b6","d7","20","ca","bd","de","da","c3","b2","cf","bd","a3","08","1b","01","00"],
        "raw_text" : "MICA『さくせんエリアから はずれています』<wait><1B><01>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1df1c",
        "method" : 1,
        "start_bytes" : ["10","34"],
        "raw_size" : 27,
        "raw" : ["10","34","03","a2","b6","b2","be","b7","20","bd","d9","d3","c9","ca","20","c5","c6","d3","20","c5","b2","bf","de","21","a3","08","00"],
        "raw_text" : "\n『かいせき するものは なにも ないぞ!』<wait>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1df37",
        "method" : 1,
        "start_bytes" : ["10","35"],
        "raw_size" : 37,
        "raw" : ["10","35","a2","b9","b6","de","ca","20","c5","b5","af","c0","dc","d6","a1","03","b1","cf","d8","20","d1","c1","ac","a6","20","bc","c1","ac","20","c0","de","d2","d6","21","a3","08","00"],
        "raw_text" : "『けがは なおったわよ゛\nおまり むちを しち だめよ!』<wait>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1df5c",
        "method" : 1,
        "start_bytes" : ["10","35"],
        "raw_size" : 29,
        "raw" : ["10","35","a2","ba","da","b6","de","20","b1","c5","c0","c0","c1","c9","20","5c","b8","db","b0","dd","ba","b0","c4","de","5c","d6","a1","a3","00"],
        "raw_text" : "『これが おなたたちの クロ‐ンコ‐よ゛』",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1df79",
        "method" : 1,
        "start_bytes" : ["10","35"],
        "raw_size" : 31,
        "raw" : ["10","35","a2","ba","b2","c9","d4","cf","b2","ca","20","dc","c0","bc","c9","20","c0","dd","c4","b3","bc","de","ac","03","c5","b2","c9","a1","a3","08","00"],
        "raw_text" : "『こいのやまいは わたしの たんとうじ\nないの゛』<wait>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1df98",
        "method" : 1,
        "start_bytes" : ["10","36"],
        "raw_size" : 34,
        "raw" : ["10","36","a2","bc","ad","b3","d8","20","b6","dd","d8","ae","b3","a1","03","0d","04","d3","b3","20","ba","dc","bc","c3","20","b8","da","d9","c5","d6","21","a3","08","00"],
        "raw_text" : "『しうり かんりぉう゛\n<pad><04>もう こわして くれるなよ!』<wait>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1dfba",
        "method" : 1,
        "start_bytes" : ["0c","10"],
        "raw_size" : 38,
        "raw" : ["0c","10","36","a2","5c","d8","cd","df","b1","b7","af","c4","a5","b7","ad","b1","b7","af","c4","5c","a6","20","ce","bc","de","ad","b3","03","bc","c4","b2","c0","be","de","a1","a3","08","00"],
        "raw_text" : "6『リアキット・キュアキットを ほじう\nしといたぜ゛』<wait>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1dfe0",
        "method" : 1,
        "start_bytes" : ["10","36"],
        "raw_size" : 45,
        "raw" : ["10","36","a2","c4","de","ba","d3","20","ba","dc","da","c3","c5","b2","be","de","a1","03","0d","04","5c","c1","ad","b0","dd","5c","ca","20","5c","b7","de","dd","b7","de","dd","5c","c0","de","be","de","21","21","a3","08","00"],
        "raw_text" : "『どこも こわれてないぜ゛\n<pad><04>チュ‐ンは ンンだぜ!!』<wait>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1e00d",
        "method" : 1,
        "start_bytes" : ["10","36"],
        "raw_size" : 27,
        "raw" : ["10","36","a2","bf","c9","cf","b4","c6","20","5c","d3","bc","de","ad","b0","d9","5c","a6","20","b7","c3","b8","da","a1","a3","08","00"],
        "raw_text" : "『そのまえに モュ‐ルを きてくれ゛』<wait>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1e028",
        "method" : 1,
        "start_bytes" : ["20","10"],
        "raw_size" : 7,
        "raw" : ["20","10","ff","09","05","02","00"],
        "raw_text" : "゜<09><05>\n",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1e02f",
        "method" : 1,
        "start_bytes" : ["10","36"],
        "raw_size" : 19,
        "raw" : ["10","36","a2","5c","b8","de","af","c4","de","20","d7","af","b8","21","21","a3","1c","01","00"],
        "raw_text" : "『ッ ラック!!』<1C><01>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1e042",
        "method" : 1,
        "start_bytes" : ["10","ff"],
        "raw_size" : 5,
        "raw" : ["10","ff","09","04","00"],
        "raw_text" : "<09><04>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1e047",
        "method" : 1,
        "start_bytes" : ["10","30"],
        "raw_size" : 38,
        "raw" : ["10","30","a2","b5","b6","b4","d8","c5","bb","b2","21","21","03","0d","04","5c","d0","af","bc","ae","dd","5c","20","ba","de","b8","db","b3","bb","cf","c3","de","bc","c0","a1","a3","02","00"],
        "raw_text" : "『おかえりなさい!!\n<pad><04>ミッション ごくろうさまでした゛』\n",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1e06d",
        "method" : 1,
        "start_bytes" : ["10","35"],
        "raw_size" : 35,
        "raw" : ["10","35","a2","cf","c0","bc","dd","c0","de","c9","a1","03","b2","bf","b2","c3","de","20","5c","b8","db","b0","dd","5c","bb","b2","be","b2","20","bd","d9","dc","a1","a3","00"],
        "raw_text" : "『またしんだの゛\nいそいで クロ‐ンさいせい するわ゛』",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x1defc-0x1e0a5",
        "block_description" : "Unknown, possible ship dialogue for first world",
        "position" : "0x1e090",
        "method" : 1,
        "start_bytes" : ["0c","10"],
        "raw_size" : 21,
        "raw" : ["0c","10","35","a2","c2","b7","de","ca","20","b7","a6","c2","b9","c5","bb","b2","d6","a1","a3","08","00"],
        "raw_text" : "5『つぎは きをつけなさいよ゛』<wait>",
        "trans_size" : 0,
        "trans_text" : ""
    }
]

It's valid JSON format, so if you want to import it back into anything, you should be able to in almost any programming language with minimal effort.
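As a rough illustration of reading the dump back in, here is a minimal Python sketch. The sample entry is copied from the dump above (with the description field left out); `check_entries` is a hypothetical helper name and is not part of the extract script.

```python
import json

# One entry copied from the dump above, wrapped in a list like the full file.
SAMPLE = '''
[
    {
        "block_range" : "0x1defc-0x1e0a5",
        "position" : "0x1e042",
        "method" : 1,
        "start_bytes" : ["10","ff"],
        "raw_size" : 5,
        "raw" : ["10","ff","09","04","00"],
        "raw_text" : "<09><04>",
        "trans_size" : 0,
        "trans_text" : ""
    }
]
'''

def check_entries(entries):
    """Return the positions of entries whose raw_size disagrees with len(raw)."""
    return [e["position"] for e in entries if e["raw_size"] != len(e["raw"])]

entries = json.loads(SAMPLE)
offset = int(entries[0]["position"], 16)            # hex string -> file offset
raw = bytes(int(b, 16) for b in entries[0]["raw"])  # hex strings -> raw bytes
bad = check_entries(entries)
```

The positions and byte lists are stored as hex strings, so they need converting with `int(..., 16)` before they can be used as offsets or written back to a file.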

megatron-uk

#77
OK, so here is what my extract script produces for the introductory cinematics:

[
    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29eff",
        "method" : 2,
        "start_bytes" : [],
        "raw_size" : 29,
        "raw" : ["11", "16", "1a", "72", "5c", "c5", "dd", "ca", "de", "b0", "34", "cc", "de", "db", "af", "b8", "5c", "c6", "a4", "5c", "c0", "de", "d2", "b0", "bc", "de", "21", "04", "3c"],
        "raw_text" : "<16><1A>rなんば―4ぶろっくニ、だめ―じ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29f1c",
        "method" : 2,
        "raw_size" : 19,
        "raw" : ["0c", "5c", "b7", "b1", "c2", "b6", "de", "20", "bb", "b6", "de", "af", "c3", "b2", "cf", "bd", "21", "04", "3c"],
        "raw_text" : "きおつが さがっています!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29f2f",
        "method" : 2,
        "raw_size" : 20,
        "raw" : ["1a", "73", "ba", "c1", "d7", "20", "5c", "d2", "c3", "de", "a8", "b6", "d9", "a5", "d9", "b0", "d1", "21", "04", "3c"],
        "raw_text" : "sコチラ めでぃかる・る―む!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29f43",
        "method" : 2,
        "raw_size" : 21,
        "raw" : ["0c", "bc", "bd", "c3", "d1", "5c", "c6", "20", "45", "4d", "50", "5c", "c0", "de", "d2", "b0", "bc", "de", "21", "04", "3c"],
        "raw_text" : "システムに EMPダメ‐ジ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29f58",
        "method" : 2,
        "raw_size" : 28,
        "raw" : ["1b", "00", "00", "11", "16", "1a", "73", "5c", "cc", "de", "d8", "af", "bc", "de", "5c", "c6", "20", "b6", "bb", "b2", "20", "ca", "af", "be", "b2", "21", "04", "3c"],
        "raw_text" : "<end><end><11><16><1A>sぶりっじニ カサイ ハッセイ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29f74",
        "method" : 2,
        "raw_size" : 37,
        "raw" : ["1a", "72", "5c", "ba", "cf", "dd", "c0", "de", "b0", "21", "20", "b7", "ac", "cc", "df", "c3", "dd", "5c", "b6", "de", "3b", "3b", "3b", "3b", "5c", "b7", "ac", "cc", "df", "c3", "dd", "5c", "b6", "de", "21", "04", "3c"],
        "raw_text" : "rこまんだ―! きぷてんガ‥‥‥‥きぷてんガ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29f99",
        "method" : 2,
        "raw_size" : 38,
        "raw" : ["1b", "00", "00", "11", "16", "1a", "72", "bc", "bc", "ae", "b3", "bc", "ac", "20", "c0", "bd", "b3", "21", "20", "ba", "c9", "cf", "cf", "c3", "de", "ca", "20", "be", "de", "dd", "d2", "c2", "c3", "de", "bd", "21", "04", "3c"],
        "raw_text" : "<end><end><11><16><1A>rシショウシ タスウ! コノママデハ ゼンメツデス!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29fbf",
        "method" : 2,
        "raw_size" : 17,
        "raw" : ["1a", "73", "b8", "bf", "a4", "b6", "b2", "bf", "de", "b8", "c4", "de", "d3", "d2", "21", "04", "3c"],
        "raw_text" : "sクソ、カイゾクドモメ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29fd0",
        "method" : 2,
        "raw_size" : 24,
        "raw" : ["0c", "b5", "da", "c0", "c1", "a6", "20", "c5", "cc", "de", "d8", "ba", "de", "db", "bc", "c6", "20", "bd", "d9", "b7", "b6", "21", "04", "3c"],
        "raw_text" : "オレタチヲ ナブリゴロシニ スルキカ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x29fe8",
        "method" : 2,
        "raw_size" : 28,
        "raw" : ["1b", "00", "00", "1a", "71", "ba", "b3", "c5", "af", "c3", "ca", "20", "c0", "de", "af", "bc", "ad", "c2", "bd", "d9", "20", "bc", "b6", "c5", "b2", "a1", "04", "3c"],
        "raw_text" : "<end><end><1A>qコウナッテハ ダッシュツスル シカナイ゛",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a004",
        "method" : 2,
        "raw_size" : 24,
        "raw" : ["0c", "5c", "bc", "de", "ac", "dd", "cc", "df", "a5", "c4", "de", "d7", "b2", "cc", "de", "5c", "a6", "20", "c2", "b6", "b3", "21", "04", "3c"],
        "raw_text" : "じんぷ・どらいぶヲ ツカウ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a01c",
        "method" : 2,
        "raw_size" : 17,
        "raw" : ["1a", "73", "bc", "b6", "bc", "a4", "5c", "ba", "cf", "dd", "c0", "de", "b0", "5c", "21", "04", "3c"],
        "raw_text" : "sシカシ、こまんだ―!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a02d",
        "method" : 2,
        "raw_size" : 37,
        "raw" : ["0c", "b2", "cf", "20", "5c", "bc", "de", "ac", "dd", "cc", "df", "5c", "bc", "c0", "d7", "20", "5c", "bc", "de", "ac", "dd", "cc", "df", "a5", "d0", "bd", "5c", "c9", "b7", "b9", "dd", "b6", "de", "3b", "3b", "04", "3c"],
        "raw_text" : "イマ じんぷシタラ じんぷ・みすノキケンガ‥‥",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a052",
        "method" : 2,
        "raw_size" : 33,
        "raw" : ["1a", "72", "bf", "b3", "c3", "de", "bd", "21", "20", "b3", "c1", "ad", "b3", "c9", "ca", "c3", "cf", "c3", "de", "20", "cc", "af", "c4", "dd", "c3", "de", "d5", "b8", "b6", "d3", "3f", "04", "3c"],
        "raw_text" : "rソウデス! ウチュウノハテマデ フットンデユクカモ?",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a073",
        "method" : 2,
        "raw_size" : 25,
        "raw" : ["19", "71", "c0", "de", "b6", "de", "20", "ba", "c9", "cf", "cf", "c3", "de", "ca", "20", "d4", "d7", "da", "c3", "bc", "cf", "b3", "a1", "04", "3c"],
        "raw_text" : "qダガ コノママデハ ヤラレテシマウ゛",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a08c",
        "method" : 2,
        "raw_size" : 20,
        "raw" : ["0c", "d0", "c1", "c9", "20", "b6", "c9", "b3", "be", "b2", "c6", "20", "b6", "b9", "c3", "d0", "d6", "b3", "04", "3c"],
        "raw_text" : "ミチノ カノウセイニ カケテミヨウ",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a0a0",
        "method" : 2,
        "raw_size" : 39,
        "raw" : ["0c", "5c", "ca", "df", "dc", "b0", "5c", "a6", "20", "bd", "cd", "de", "c3", "5c", "bc", "de", "ac", "dd", "cc", "df", "a5", "bc", "de", "aa", "c8", "da", "b0", "c0", "b0", "5c", "c6", "cf", "dc", "be", "21", "05", "1e", "04", "3c"],
        "raw_text" : "ぱわ―ヲ スベテじんぷ・じぇねれ―た―ニマワセ!<05><1E>",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a0c7",
        "method" : 2,
        "raw_size" : 22,
        "raw" : ["19", "73", "5c", "b4", "c8", "d9", "b7", "de", "b0", "a5", "b9", "de", "b2", "dd", "5c", "20", "4d", "41", "58", "21", "04", "3c"],
        "raw_text" : "sえねるぎ―・げいん MAX!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a0dd",
        "method" : 2,
        "raw_size" : 32,
        "raw" : ["05", "37", "19", "72", "5c", "b5", "b0", "d9", "a5", "bc", "bd", "c3", "d1", "20", "b4", "cf", "bc", "de", "aa", "dd", "bc", "b0", "a5", "d3", "b0", "c4", "de", "5c", "cd", "21", "04", "3c"],
        "raw_text" : "7<19>rお―る・しすてむ えまじぇんし―・も―どヘ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a0fd",
        "method" : 2,
        "raw_size" : 37,
        "raw" : ["00", "19", "73", "ce", "de", "b3", "b7", "de", "ae", "5c", "cc", "a8", "b0", "d9", "c4", "de", "5c", "c9", "20", "5c", "ca", "df", "dc", "b0", "5c", "b6", "de", "20", "b5", "c1", "c3", "b2", "cf", "bd", "21", "04", "3c"],
        "raw_text" : "<19>sボウギョふぃ―るどノ ぱわ―ガ オチテイマス!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a122",
        "method" : 2,
        "raw_size" : 22,
        "raw" : ["19", "71", "b1", "c4", "20", "31", "30", "cb", "de", "ae", "b3", "c0", "de", "b9", "20", "d3", "c0", "be", "db", "21", "04", "3c"],
        "raw_text" : "qアト 10ビョウダケ モタセロ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a138",
        "method" : 2,
        "raw_size" : 25,
        "raw" : ["19", "73", "5c", "bc", "bd", "c3", "d1", "a5", "ba", "dd", "c3", "de", "bc", "ae", "dd", "20", "b5", "da", "dd", "bc", "de", "21", "5c", "04", "3c"],
        "raw_text" : "sしすてむ・こんでしぉん おれんじ!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a151",
        "method" : 2,
        "raw_size" : 30,
        "raw" : ["19", "72", "5c", "bc", "de", "ac", "dd", "cc", "df", "5c", "bb", "de", "cb", "ae", "b3", "a6", "20", "be", "af", "c3", "b2", "c3", "de", "b7", "cf", "be", "dd", "21", "04", "3c"],
        "raw_text" : "rじんぷザヒョウヲ セッテイデキマセン!",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "block_range" : "0x29eff-0x2a1ad",
        "block_description" : "Introductory cinematics.",
        "position" : "0x2a16f",
        "method" : 2,
        "raw_size" : 63,
        "raw" : ["19", "71", "b6", "cf", "dc", "dd", "21", "20", "b2", "b8", "bf", "de", "21", "1a", "74", "3b", "3b", "3b", "3b", "33", "3b", "3b", "3b", "3b", "0c", "3b", "3b", "3b", "3b", "32", "3b", "3b", "3b", "3b", "0c", "3b", "3b", "3b", "3b", "31", "3b", "3b", "3b", "3b", "0c", "3b", "3b", "3b", "3b", "30", "0b", "0e", "05", "5c", "bc", "de", "ac", "dd", "cc", "df", "21", "04", "3c"],
        "raw_text" : "qカマワン! イクゾ!<1A>t‥‥‥‥3‥‥‥‥<0C>‥‥‥‥2‥‥‥‥<0C>‥‥‥‥1‥‥‥‥<0C>‥‥‥‥0<0B><0E><05>じんぷ!",
        "trans_size" : 0,
        "trans_text" : ""
    }
]

It looks reasonable until you start to examine a few of the strings. Some of them are fine, but others come out in the wrong font set (remember Cyber Knight uses 0x5c to indicate 'swap to alt font set' - Katakana to Hiragana).
The problem is that at least a couple of the strings I've analysed should have a 0x5c byte embedded in order to match the on-screen text, but they don't, whereas others do and match the on-screen text perfectly.

Depending on whether I choose the initial state to be Hiragana or Katakana, several of the strings break. No single initial state satisfies all of them, e.g.

Defaulting to switch mode on (Katakana enabled by default) shows the correct output for the 'EMP' string, but causes incorrect characters for the 'MAX' string. Defaulting to Hiragana fixes the 'MAX' output, but switches the characters for 'EMP'.

BTW, code is now on github: https://github.com/megatron-uk/cyberknight-pce

Edit: As a quick workaround, I've enabled dual-output for the affected text - both 'assume katakana' and 'assume hiragana' strings are printed. I'm hoping that this is not also the case elsewhere throughout the game, but this needs more investigation.
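For anyone following along, the dual-output idea can be sketched roughly like this in Python. The `KANA` table is a tiny hypothetical subset covering only the demo bytes (the pairings are taken from the dumps above), and `decode` / `decode_both` are made-up names, not the functions in the github scripts.

```python
# Hypothetical partial table: byte -> (katakana glyph, hiragana glyph).
KANA = {
    0xBC: ("シ", "し"),
    0xBD: ("ス", "す"),
    0xC3: ("テ", "て"),
    0xD1: ("ム", "む"),
    0xC6: ("ニ", "に"),
}
FONT_SWAP = 0x5C  # 'swap to alt font set' control byte

def decode(raw_hex, start_katakana):
    """Decode a list of hex strings, toggling the font set on every 0x5c."""
    katakana = start_katakana
    out = []
    for h in raw_hex:
        b = int(h, 16)
        if b == FONT_SWAP:
            katakana = not katakana
        elif b in KANA:
            out.append(KANA[b][0 if katakana else 1])
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))          # plain ASCII passes straight through
        else:
            out.append("<%02X>" % b)    # unknown or control byte
    return "".join(out)

def decode_both(raw_hex):
    """Return the text under both possible initial font states."""
    return {
        "assume_katakana": decode(raw_hex, True),
        "assume_hiragana": decode(raw_hex, False),
    }

# First part of the 'EMP' string from the dump, without its leading control byte.
sample = ["bc", "bd", "c3", "d1", "5c", "c6", "20", "45", "4d", "50"]
both = decode_both(sample)
```

Printing both variants side by side makes it easy to pick whichever one matches the on-screen text for a given string.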

megatron-uk

Ok, I now have a basic injector script that can take the output from the extractor, pick up those strings that have had a translation entered and reinject them back into the file.

This code is now up on github.

The injector only works with fixed length strings at the moment (e.g. main menu, hud elements, stat screens etc), but the variable length strings such as dialogue should actually be easier, as they're really only one very long string with the phrases delimited by 0x00 bytes.
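A minimal Python sketch of the fixed-length case, assuming the translation is plain ASCII and that unused bytes can be padded with spaces (0x20). Both are assumptions for illustration; `inject_fixed` is a hypothetical name and the sketch ignores any leading control bytes or terminators a real string slot may need.

```python
def inject_fixed(rom, position, raw_size, text, pad=0x20):
    """Overwrite raw_size bytes at position with ASCII text, padded to fit."""
    data = text.encode("ascii")
    if len(data) > raw_size:
        raise ValueError("translation is %d bytes, slot is %d" % (len(data), raw_size))
    data += bytes([pad]) * (raw_size - len(data))
    rom[position:position + raw_size] = data   # same length in, same length out
    return rom

# Toy 16-byte 'rom' standing in for the real file.
rom = bytearray(b"\xff" * 16)
inject_fixed(rom, 4, 8, "START")
```

Because the replacement is always exactly `raw_size` bytes, nothing after the slot moves, which is why this case is the safe one.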

The way I have it set up is to load multiple json patch files, so as the main menu text is done, I can forget about that one and move on to the next file - that way I don't have one massive translation file to do.
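The multi-file approach could look something like this in Python. It is only a sketch: the two patch documents are made-up stand-ins with hypothetical positions, and `translated_entries` is not a function from the github scripts.

```python
import json

def translated_entries(patch_documents):
    """Yield entries with a non-empty trans_text from several JSON patch texts."""
    for doc in patch_documents:
        for entry in json.loads(doc):
            if entry.get("trans_text"):
                yield entry

# Hypothetical patch files, inlined as strings so the sketch is self-contained.
menu_patch = '[{"position": "0x100", "raw_size": 5, "trans_text": "START"}]'
intro_patch = '[{"position": "0x200", "raw_size": 8, "trans_text": ""}]'

todo = list(translated_entries([menu_patch, intro_patch]))
```

Entries with an empty `trans_text` are simply skipped, so a half-finished file can sit alongside a completed one.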

Here's the result of extractScript and injectScript translating the main menu dialogue strings:

Screenshot-Cyber Knight (E).png Screenshot-Cyber Knight (E)-1.png
Screenshot-Cyber Knight (E)-2.png Screenshot from 2014-02-25 22:07:37.png   

megatron-uk

#79
Player character name entry screen now done (excluding Kanji symbols), also started translating the list of npc names (located at 0x1cbad-0x1ccac). Here's the first few (your ship mates) re-injected back into the rom.

I haven't yet written an intelligent inserter, so I'm just using the same space I've got from the original strings. In the case of the npc names, most of them give extra space when using ascii as they no longer need multiple 0x5c control bytes, which frees up some space for extra characters. In the case of 'Ki', this is a 4 character name (Kiri), but I can't currently fit it in as one of those bytes is reserved by 0x00 as a delimiter to the next string. What I need to do now is write some code that, given a number of strings (like the names, above) and an address range of a file, shuffles the starting positions of each reinserted string so that they are still in order (important for the seemingly dumb text lookup of Cyber Knight), but gives more space to those strings that need it, and less to those that don't.
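One possible shape for that repacking step, as a Python sketch. It assumes every string ends with a single 0x00 and that leftover space at the end of the range can be filled with 0x00 as well; the padding byte is a guess, and `repack` plus the second sample name are hypothetical.

```python
def repack(strings, start, end):
    """Lay out null-terminated strings back to back, in order, inside [start, end].

    Returns (new start offsets, block bytes padded to the size of the range).
    """
    size = end - start + 1
    block = bytearray()
    offsets = []
    for s in strings:
        offsets.append(start + len(block))
        block += s + b"\x00"
    if len(block) > size:
        raise ValueError("need %d bytes, range holds %d" % (len(block), size))
    block += b"\x00" * (size - len(block))   # assumed filler for unused space
    return offsets, bytes(block)

# Toy 10-byte range starting at the npc name table address.
names = [b"Kiri", b"Al"]
offsets, block = repack(names, 0x1cbad, 0x1cbad + 9)
```

The order of the strings is preserved, so a lookup that counts delimiters still lands on the same entry; only the total size of the range is fixed.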

TurboXray

So, you have most of it figured out?

megatron-uk

#81
Pretty much. Your realisation about the dialogue simply being null delimited, and the display routine crudely looping over those bytes until it gets to a particular count, pretty much broke the back of the thing. There are some exceptions (intro text, some fixed layout like the menus, player status screen etc - but these are easy to replace and in almost all cases only occur once in the rom), but knowing that, writing an extractor and inserter was pretty simple.
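The lookup described above can be modelled in a few lines of Python. This is only a model of the behaviour as described in this thread, not a trace of the game's actual routine, and `nth_string` is a hypothetical name.

```python
def nth_string(block, index):
    """Return the index-th (0-based) null-delimited string by counting 0x00 bytes."""
    pos = 0
    for _ in range(index):
        pos = block.index(b"\x00", pos) + 1   # skip past the next delimiter
    end = block.index(b"\x00", pos)
    return block[pos:end]

# Made-up block of three strings.
block = b"ONE\x00TWO\x00THREE\x00"
first = nth_string(block, 0)
third = nth_string(block, 2)

# Lengthening an earlier string does not change which entry index 2 finds.
longer = b"ONE\x00TWENTY-TWO\x00THREE\x00"
still_third = nth_string(longer, 2)
```

If the game really does find strings this way, individual strings can grow or shrink freely as long as the order and the overall block size are kept.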

The main things that need doing are identifying all of the text, which will probably only come with a playthrough, and working out how to do longer string insertion when I come to need it for the in-game dialogue.

Oh yeah, the one thing I haven't looked at yet is replacing the double height Kanji characters - there are not many of them (30-40), so it should be possible just to write some new tiles to replace them; I may need help reinserting them however.

I've asked the translator of the SNES version (http://agtp.romhack.net/) if he has the script left over from doing the work, this would *really* speed things up, so I'm hoping he can find it - though it was done almost 10 years ago....

Once that is done, I'd love to do an overhaul of some of the sprites, using SNES ones where appropriate

megatron-uk

Did a short little clip of the translated main menu and some of the introduction - it's not the greatest quality (only my first attempt at using the video record features of Mednafen), but it shows the changes made so far.

Cyber Knight PC-Engine initial translation test

peperocket

Vive la Supergrafx !!!

esteban

Comrade, good work. I know you will see this through to the end.

Keith Courage

Awesome!!! I love it when pc engine games get translated.

megatron-uk

The introductory cinematics are now fully translated!
I've also had the help of a couple of people over on Romhacking.net on tracking down the last few unmapped characters (some are only 4-5 pixels high) and now the table is complete (though not all control codes are known yet).

Full files and the scripts to patch your own version are on github, as normal.

whisper2053

IMG
My Retro Gaming Channel: https://www.youtube.com/user/whisper2053

jeffhlewis

Great thread - makes me feel horribly inadequate about my programming skills, haha. keep up the good work!

megatron-uk

The scrolling intro text that appears after the intro cinema is now done. Most of what is translated so far is based on the SNES version, with some slight changes where I thought it sounded better. The patches are all on github (not the scrolling text yet however), so you're free to change it yourself.

This text is expanded compared to the Japanese original - around 300 bytes more. Fortunately there's a big bank of 0xFF bytes (about 1k of it) that occurs just after the text.

Anyway, patch for this section is not up yet as I have one or two spacing issues to play with - there are a couple of words that run right up to the right hand screen edge.

Other than that, enjoy!

SuperPlay

Great work so far :-) looking forward to this when it is complete.

roflmao

To echo SuperPlay's sentiments, you are an inspiration.  I don't remember if I've posted to this thread yet, but wanted to share my enthusiasm for what you're doing. :D

megatron-uk

Thanks for all the kind words :D

I'm hopeful that when I start the main in-game text it will all be of a similar type (null delimited strings). The best thing that could happen is I get hold of a dump of the raw and converted SNES translation - I don't have any Japanese language skills - so this would speed up the conversion process immensely.

One thing I've found even with just the intro text is the amount of extra space needed for an English representation of the Japanese text - it's often anywhere up to 50% longer. That could be a sticking point (if there's not room to expand into), and I may need to seek others' help to modify the text loader code as I know virtually zero 6502 assembly.

Things may slow down for the next few weeks - my wife and I have just accepted an offer on our house, so it's all hands to packing and moving duties!

megatron-uk

#93
Hit a bit of a stumbling block with the next section of text that appears just before the game actually starts - it's just after the scrolling intro text that summarises what has happened so far.

It's translated, and inserted, but when I run the game it seems as if the previous text block insertion (the scrolling text, as posted earlier) breaks it. So rather than my nicely translated text, I get random hiragana/katakana characters.

The text inserted before this one was expanded into what seems to be an unused section of the rom (0x1bca7 - 0x1bff0, all set to 0xff). When I disable that particular patch, my new translation works correctly. I'm wondering if the game uses some offset from this last text block to jump to the next... in which case, any string expansion is going to fail, like this appears to.
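For finding candidate free space like that 0xff region, a simple scan for long runs of 0xFF is one option. The Python sketch below is an assumption about how one might do it, with a hypothetical `ff_runs` helper and a toy byte string instead of the real rom.

```python
def ff_runs(rom, min_len=64):
    """Return (start, length) for each run of 0xFF bytes at least min_len long."""
    runs = []
    start = None
    for i, b in enumerate(rom):
        if b == 0xFF:
            if start is None:
                start = i            # a new run begins here
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # Handle a run that reaches the end of the data.
    if start is not None and len(rom) - start >= min_len:
        runs.append((start, len(rom) - start))
    return runs

# Toy data: a 5-byte run in the middle and a 2-byte run at the end.
rom = b"\x01\x02" + b"\xff" * 5 + b"\x03" + b"\xff" * 2
long_runs = ff_runs(rom, min_len=3)
all_runs = ff_runs(rom, min_len=2)
```

A run of 0xFF only suggests unused space; it doesn't prove the game never reads those bytes, so anything found this way still needs testing in an emulator.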

megatron-uk

With expanded scrolling intro text patch applied and without. In both cases the translation is present for the displayed section, but in the case where the intro patch is applied, the output of this particular section is gibberish.

OldMan

Quote: The text inserted before this one was expanded into what seems to be an unused section of the rom (0x1bca7 - 0x1bff0, all set to 0xff)
Just out of curiosity, did you insert the text at the beginning or end of that block? Some of those xFF's may actually be termination codes....

megatron-uk

The beginning. It runs from its normal start location to around 200-300 bytes into that 0xff section. It's around 850 bytes in normal length; it's now about 1100. The remainder (and end) of the 0xff block is intact.

megatron-uk

Slightly OT, but interesting nonetheless; it appears as though Cyber Knight may well have a hidden sound/music test facility as I've found this in one of my latest dumps:

    {
        "string_description" : "unknown 1",
        "string_start" : "0x1c0c4",
        "method" : 3,
        "start_bytes" : [],
        "end_bytes" : ["00"],
        "raw_size" : 137,
        "raw" : ["0b", "05", "06", "2a", "2a", "2a", "20", "43", "79", "62", "65", "72", "20", "4b", "6e", "69", "67", "68", "74", "20", "2a", "2a", "2a", "0b", "05", "08", "2a", "2a", "2a", "20", "20", "53", "4f", "55", "4e", "44", "20", "54", "45", "53", "54", "20", "20", "2a", "2a", "2a", "0b", "08", "0c", "41", "75", "64", "69", "6f", "20", "6d", "6f", "64", "65", "20", "3a", "0b", "08", "0e", "42", "47", "4d", "20", "52", "65", "71", "20", "4e", "6f", "2e", "3a", "0b", "08", "10", "42", "47", "4d", "20", "52", "65", "71", "20", "4e", "6f", "2e", "3a", "0b", "08", "12", "45", "46", "43", "20", "52", "65", "71", "20", "4e", "6f", "2e", "3a", "0b", "08", "14", "45", "46", "43", "20", "52", "65", "71", "20", "4e", "6f", "2e", "3a", "0b", "08", "16", "46", "61", "64", "65", "2d", "4f", "75", "74", "20", "20", "20", "3a", "00"],
        "raw_text" : "<0B><05><06>*** Cyber Knight ***<0B><05><wait>***  SOUND TEST  ***<0B><wait><erase>Audio mode :<0B><wait><0E>BGM Req No.:<0B><wait><10>BGM Req No.:<0B><wait><12>EFC Req No.:<0B><wait><14>EFC Req No.:<0B><wait><16>Fade-Out   :",
        "alt_text" : "<0B><05><06>*** Cyber Knight ***<0B><05><wait>***  SOUND TEST  ***<0B><wait><erase>Audio mode :<0B><wait><0E>BGM Req No.:<0B><wait><10>BGM Req No.:<0B><wait><12>EFC Req No.:<0B><wait><14>EFC Req No.:<0B><wait><16>Fade-Out   :",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "string_description" : "unknown 1",
        "string_start" : "0x1c14d",
        "method" : 3,
        "start_bytes" : [],
        "end_bytes" : ["00"],
        "raw_size" : 12,
        "raw" : ["0b", "15", "0c", "4d", "6f", "6e", "61", "75", "72", "61", "6c", "00"],
        "raw_text" : "<0B><15><erase>Monaural",
        "alt_text" : "<0B><15><erase>Monaural",
        "trans_size" : 0,
        "trans_text" : ""
    },

    {
        "string_description" : "unknown 1",
        "string_start" : "0x1c159",
        "method" : 3,
        "start_bytes" : [],
        "end_bytes" : ["00"],
        "raw_size" : 12,
        "raw" : ["0b", "15", "0c", "53", "74", "65", "72", "65", "6f", "20", "20", "00"],
        "raw_text" : "<0B><15><erase>Stereo  ",
        "alt_text" : "<0B><15><erase>Stereo  ",
        "trans_size" : 0,
        "trans_text" : ""
    },

megatron-uk

For now I've backed out the translated scrolling text from the intro. Until I can work out how to get the longer english text in there without affecting later strings I've moved on with the in-game dialogue.

I'm now working on translating the load/save game screens as well as the player character/npc stats screens. The load/save game screen is done, except for the Kanji. The player stats screen is also mostly done, including the 4 skill categories (combat, tech/engineering, medic, science), player roles (commander, soldier, doctor, professor, tech/engineer) and sex.

I'm starting to try and figure out how the Kanji tiles are printed - I thought they may have been double-byte chars, but that doesn't appear to be the case, and I'm not sure yet how the text printing engine figures out if a byte is part of a kanji sequence or a single kana/hira/ascii tile....

esteban

Sound Test must be revealed.