Messages - dshadoff

#1
Quote from: gilbert on 01/02/2019, 03:25 AM
Or do you mean that the English fonts in the system cards suck (regardless of region), but it's only less annoying when most of the text are in Japanese?

I think for whatever reasons, the English font in Japanese system card 3.0 is a bit different from those in 1.x as well (don't know when it was changed, either from v1 -> v2, or v2 -> v3), as the screenshots in game manuals of non-super CD games usually show letters that are a bit different from playing the game with system 3.0.
Edit: If Wikipedia is to be trusted, the 12x12 and 16x16 fonts were changed in the Super System Card (i.e. since v3) to add more symbols and enhance readability. Maybe for unknown reasons the western system card v3 still uses the old fonts?

I should have been clearer about the scope of my change - I only changed:
1) Upper and lower-case letters from A-Z
2) Numbers from 0-9
3) The '?' and '!' punctuation marks

...And the American cards contain all the same font information as their corresponding Japanese cards (to the best of my knowledge).


System Card 3.0 probably does have more symbols, but that wasn't the focus of my studies.
If I can get the ZIP file uploaded to the board, there is a utility in there to browse the character sets.

I just checked what you mentioned about the updated font, and it appears that they did make a couple of changes:

System Card 1.0 -> 2.0:
- replace the 12x12 numeric characters with a slightly larger, less-bold font (but this doesn't exactly match the rest of the Western characters anymore)

System Card 2.x -> 3.0
- replace at least the 12x12 uppercase alphabetic characters with new ones.

It seems that the old ones might have been too 'blocky', so they may have thought that they were making them prettier - but the replacements don't use the same baseline, so capitals sit lower than lowercase characters (which is jarring).  This would have been obvious if they had written a sentence in English, so I can't accept that it enhances readability.

IMG IMG
IMG
#2
I'm putting together a few articles on the subject of translation patch creation; I'm hoping that the forum-post format (thread per major subject) will allow a lively supplemental discussion to follow each post and explore a little deeper into areas in which people are interested. In the event that an initial post on a topic is too large, I'll try to break it into sections and post consecutively to the same thread.

Part of the process of selecting a game is determining how difficult the technical portion will be.

HuCard games generally use the 8x8 tiles available in hardware – but can define their own custom character sets, often making it difficult to search for text (due to the custom encoding). I'm not going to focus on this type of game today, but I will mention that the best place to start is to locate the definition of the character set (held in VRAM), so the encoding can be determined. This would be done with Mednafen in the same way as any character graphics would be found/isolated, and worked back to the source location.

Today's focus, however, is on locating the print function and script from CDROM games (or at least trying to).

Before We Start
  • You should have a working version of Mednafen (any version in the past few years should be adequate).
  • Also, have a digital copy of a CDROM game you wish to locate text on (RPGs or digital comics are likely to be better examples). Make sure that your digital copy is ISO/WAV. That is to say, the CUE file should refer to the data track(s) as "MODE1/2048".
Note that I will be using hexadecimal a lot in this post; since the 6502 convention is to prefix values with the dollar sign and capitalize the letters, I will try to ensure that this is done for addresses and values that the processor uses (i.e. '$F8'). For offsets into the ISO file, I generally use the 'C' convention of the prefix '0x', often with lowercase alphabetic characters ('0xffff'). And for a string of bytes, I hope that just using the pattern of repeating 2 digits+space is adequate.  It'll make sense... (I hope).
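
If you'd rather not eyeball the CUE file, the "MODE1/2048" requirement can be checked with a few lines of Python. This is only a sketch: the function names and the sample sheet are made up for illustration, and it assumes the usual `TRACK <number> <mode>` layout of a CUE sheet.

```python
import re

# Hypothetical helper: matches lines such as "  TRACK 02 MODE1/2048".
TRACK_LINE = re.compile(r"^\s*TRACK\s+(\d+)\s+(\S+)", re.IGNORECASE | re.MULTILINE)

def cue_tracks(cue_text):
    """Return [(track_number, mode), ...] in the order they appear in the sheet."""
    return [(int(num), mode.upper()) for num, mode in TRACK_LINE.findall(cue_text)]

def data_tracks_are_iso(cue_text):
    """True if there is at least one data track and all of them are MODE1/2048."""
    data_modes = [mode for _, mode in cue_tracks(cue_text) if mode != "AUDIO"]
    return bool(data_modes) and all(mode == "MODE1/2048" for mode in data_modes)

# Made-up sample sheet, only here to exercise the functions above.
SAMPLE_CUE = """\
FILE "track01.wav" WAVE
  TRACK 01 AUDIO
    INDEX 01 00:00:00
FILE "track02.iso" BINARY
  TRACK 02 MODE1/2048
    INDEX 01 00:00:00
"""

print(cue_tracks(SAMPLE_CUE))
print(data_tracks_are_iso(SAMPLE_CUE))
```

If the second value comes back False, the image probably uses raw 2352-byte sectors, and the file offsets used later in this post won't line up.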


Where to Start ?

Like any sufficiently difficult puzzle, the key is to start at the most basic/simple/already familiar part, working outwards and solving the unknown at the edge of what is already known.

In this case, the key is the kanji graphics – NEC had the foresight to put a substantial kanji character set into the System Card, so that game developers wouldn't have to create their own character set definitions for the huge set of kanji in the Japanese language (effort which is better spent on other graphics). In order to make use of it, the game needs to make a system card call with a 2-byte SJIS value, getting the graphics data back in a buffer. This in turn means that the text to be printed is either stored directly in SJIS, or in a source format from which SJIS can be created easily.

The EX_GETFNT Call

The EX_GETFNT function is at location $E060, and the system card functions always expect parameters to be passed via the zero-page locations between $F8 and $FF (or in registers).

For EX_GETFNT, the parameters are passed as follows:
$F8/$F9 = Kanji code – note: this processor is little-endian, so $F8 holds the least-significant byte (LSB), and $F9 holds the most significant (MSB)
$FA/$FB = destination address for the graphics (32-byte buffer)
$FF = transfer mode ($00 for 16x16 size; $01 for 12x12 size)
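
To make the parameter layout concrete, here is a small Python sketch that decodes the eight zero-page bytes $F8..$FF the way EX_GETFNT reads them, as described above. The function name and the example bytes are invented for illustration; only the layout comes from the list above.

```python
def decode_getfnt_params(zp_f8_to_ff):
    """Decode the 8 zero-page bytes $F8..$FF as EX_GETFNT parameters."""
    zp = bytes(zp_f8_to_ff)
    if len(zp) != 8:
        raise ValueError("expected exactly 8 bytes ($F8..$FF)")
    kanji_code = zp[0] | (zp[1] << 8)    # $F8 = LSB, $F9 = MSB
    buffer_addr = zp[2] | (zp[3] << 8)   # $FA = LSB, $FB = MSB
    mode = zp[7]                         # $FF = transfer mode
    size = {0x00: "16x16", 0x01: "12x12"}.get(mode, "unknown")
    return {"kanji_code": kanji_code, "buffer_addr": buffer_addr,
            "mode": mode, "size": size}

# Invented example values; $FC..$FE are not among the parameters listed above.
example = bytes([0x60, 0x82, 0x00, 0x30, 0x00, 0x00, 0x00, 0x00])
params = decode_getfnt_params(example)
print(f"kanji=${params['kanji_code']:04X} "
      f"buffer=${params['buffer_addr']:04X} size={params['size']}")
```

The only real trick is the byte order: the low byte comes first in memory, so the two bytes are swapped when you read them as a 16-bit value.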


Mednafen's Debugger

If you've never used Mednafen's debugger before, it's indispensable for this kind of work. You should get accustomed to the debugging functions and features.
  • Start up your game, and the CDROM "Press Start" screen comes up.
  • Press 'ALT-D' to enter the debugger. Multicoloured information will appear, moving quite quickly. (Don't worry, you don't need to make sense of it yet.)
  • Press 'g', and a popup box will appear with the heading 'Disassembly Address'... back up over the existing address, and enter 'E060' (the EX_GETFNT address mentioned above)
  • A long list of 'JMP <address>' statements should appear in the disassembly list, with the 'E060' line highlighted. Press the spacebar to set a breakpoint, and a '#' will appear at that address.
  • Press 'ALT-D' again to make the debug screen disappear; now start the game. As soon as the EX_GETFNT function is called, the debug screen will appear again (and the game will stop executing)... if the game starts printing text without stopping, chances are that you've chosen a rare game which doesn't use the built-in font. Or, perhaps the game has stored some title-screen graphics as graphic data, and the game isn't actually trying to print anything yet... advance the game a little bit to confirm.

OK, The Debugger Returned... Now What ?

So now, the debugger appeared again, and the game stopped. The disassembly list shows the jump table, just like the last time you left it, with the E060 line highlighted. You might ask yourself... "now what ?"

Get your deerstalker cap out of the closet... the game is afoot !

I've attached a screenshot of this exact moment while playing Dead of the Brain 1:

IMG

If you look closely, you'll see that I've also put a few red rectangles around some key information:

The patchwork square(ish) block of coloured numbers is zero page memory, which you will frequently consult while debugging; I put boxes around each of the parameters which EX_GETFNT uses... so:
$F8/$F9 -> shows us that the SJIS character is $8352 (remember, LSB is stored first)
$FA/$FB -> shows us that the graphics buffer is at $3529
$FF -> shows us that the 12x12 version of the character is being requested
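
If you want to see which character a value like $8352 actually is, Python's built-in "shift_jis" codec can decode it. A minimal sketch (the helper names are mine, and it assumes the game passes plain SJIS, which is what EX_GETFNT expects):

```python
def zero_page_to_code(lsb, msb):
    """Combine the bytes seen at $F8 (LSB) and $F9 (MSB) into one 16-bit code."""
    return (msb << 8) | lsb

def sjis_code_to_char(code):
    """Turn a 16-bit SJIS code into a Unicode character."""
    raw = bytes([(code >> 8) & 0xFF, code & 0xFF])  # byte-stream order: MSB first
    return raw.decode("shift_jis")

code = zero_page_to_code(0x52, 0x83)   # the two bytes from the screenshot
print(f"${code:04X} -> {sjis_code_to_char(code)}")
```

This is handy for checking that the characters you note down match what appears on screen.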

I placed another box in a list area – this is a traceback queue, which tells us where the processor has been before it came here. If you hit 'g' then put 'C778' in as the address, Mednafen will display a disassembly of the most recently-executed section of the game's print function.


Suggested Clue Gathering

A short list of things I usually do next is as follows (but other people may have a different approach):
  • Note the location of the print function tidbit ($C778), as this will be the part of the print function from which the disassembly will start. This disassembly is needed in order to gain an understanding of how the function works internally (and the basis of how to patch it). Much of the traceback queue will be parts of the print function; try to understand its scope.
  • Write down a sequence of actual bytes from the routine (several bytes before and including the call to $E060). Later, search the ISO file to find the origin sector(s) of this program; usually it just needs a few bytes in order to find it definitively (although sometimes the same code may appear more than once, because it is repeated in different overlays, or implemented several times related to different parts of the game).
  • While the disassembly is still open, hit 'R' (run), and there should be a brief advancement in the game (about 1/60 of a second), before the next call to EX_GETFNT. Note down this SJIS value as well (on Dead of the Brain 1, this is $815B). And get one more SJIS value... (DotB1 = $838B). Using these values, search through the ISO to locate this group of bytes to find the string. Hopefully, the first few characters are more unique than the name of the main character (which may show up hundreds of times).
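
Searching the ISO for the bytes you wrote down doesn't need a special tool. Here is a rough Python sketch that reports every match along with its sector number. The names are my own, and the test data is a stand-in for a real image; the "20 60 E0" pattern is what a JSR to $E060 looks like in memory (opcode $20, then the address LSB first), assuming the game calls it with a plain JSR.

```python
SECTOR_SIZE = 2048  # MODE1/2048 data track, as required earlier in the post

def find_all(haystack, needle):
    """Yield every offset at which needle occurs in haystack (overlaps included)."""
    start = haystack.find(needle)
    while start != -1:
        yield start
        start = haystack.find(needle, start + 1)

def describe_hits(haystack, needle):
    """Return (offset, sector, offset_within_sector) for every match."""
    return [(off, off // SECTOR_SIZE, off % SECTOR_SIZE)
            for off in find_all(haystack, needle)]

# Stand-in for the ISO contents. With a real image you would read the file:
#   data = open("game.iso", "rb").read()   # "game.iso" is a placeholder name
pattern = bytes.fromhex("20 60 E0")
data = bytes(5000) + pattern + bytes(100) + pattern

for off, sector, within in describe_hits(data, pattern):
    print(f"0x{off:x}  sector {sector}  +0x{within:x}")
```

Three bytes will match in far too many places on a real disc, so include several bytes from before the call as well, as suggested above.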

If all of this works out, you are well on your way... but if it doesn't, here are a few possibilities:
  • If EX_GETFNT is never called, you'll need to find a completely different way to get at the text and the print function.
  • If EX_GETFNT is called, but you can't find your SJIS on the disc (the above example would search for the following sequence of bytes in hexadecimal: 83 52 81 5B 83 8B), make sure that you haven't transposed them... the script itself is MSB first (such as in a byte stream), but as a 2-byte value used by the processor, it's LSB first. Could be confusing the first time you see it.
    • If you still can't find it, the text could be stored in a different format – either as compressed blocks, or using a token system, or some other format. Or it may have some control characters interspersed... If it's not easy to locate the first string, that game may not make a good candidate for a first translation... (unless you were already using assembly in the 8-bit era, and enjoy a good challenge).
  • If EX_GETFNT is called, but you can't locate the print function on disc based on the bytes you grabbed from the disassembly, check again – you may have transcribed something improperly (it's happened to me). It's not very likely that the code itself would be compressed or self-modifying.
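
Since the byte-order mix-up is the easiest mistake to make here, this sketch builds the search pattern both ways from the values noted in the debugger. The helper names are hypothetical; the three codes are the ones from Dead of the Brain 1 above.

```python
def codes_to_script_bytes(codes):
    """SJIS codes -> bytes as stored in the script (MSB first for each character)."""
    return b"".join(code.to_bytes(2, "big") for code in codes)

def codes_to_swapped_bytes(codes):
    """The same codes with each pair transposed (LSB first): the likely mistake."""
    return b"".join(code.to_bytes(2, "little") for code in codes)

codes = [0x8352, 0x815B, 0x838B]  # values noted from the debugger
print("search for:", codes_to_script_bytes(codes).hex(" ").upper())
print("not this:  ", codes_to_swapped_bytes(codes).hex(" ").upper())
```

If neither ordering turns up on the disc, that points to one of the other storage formats mentioned above rather than a transcription slip.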

Next Steps (Still Early Days)

Next, you could continue in either of two places – the script, or the print function.

For the script:

You may want to make a small adjustment in the script (within the ISO, where you found it):
  • Change a Kanji character into – for example – SJIS 'A' (hex 8260), just to prove to yourself that you actually found it. (Run the game again to see the effect)
  • Then, try changing it into a couple of ASCII characters, just to see whether the print function can currently support regular ASCII (rare, but worth a try).
In order to really understand the script organization, though, you'll need to understand some more about the tokens, and the overall complexity of the strings. For that, you'll need at least some of the print function to be disassembled and understood.
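
For the small test edits described above, a hex editor is fine, but a script makes it harder to overwrite the wrong spot. The sketch below is one possible way to do it, not a finished tool: the function name is mine, the data is a stand-in for an ISO loaded into memory, and it deliberately refuses to change the length (shifting the rest of the image would break everything after it).

```python
def patch_bytes(image, offset, expected, replacement):
    """Overwrite bytes at offset in a bytearray, but only if the old bytes match."""
    if len(expected) != len(replacement):
        raise ValueError("replacement must be the same length as the original")
    current = bytes(image[offset:offset + len(expected)])
    if current != expected:
        raise ValueError(f"unexpected bytes at 0x{offset:x}: {current.hex()}")
    image[offset:offset + len(replacement)] = replacement

# Stand-in for a working copy of the ISO (always patch a copy, never the original).
image = bytearray(bytes(16) + bytes.fromhex("83 52 81 5B 83 8B"))

# Swap the first character for SJIS 'A' (82 60), as suggested above.
patch_bytes(image, 16, bytes.fromhex("83 52"), bytes.fromhex("82 60"))
print(image[16:].hex(" ").upper())
```

Checking the old bytes first means a mistyped offset fails loudly instead of silently corrupting the image.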

For the print function:

Use a disassembler, and read the code in order to distil meaning from it.
...I know, easier said than done - but as I mentioned at the beginning of the post, start with things that are obvious, and comment them until you reach the edge of what is obvious. Including the scratchpad RAM usage. And a 100% understanding isn't always needed in order to get what you need.

So, this will start with the part leading up to the call to EX_GETFNT; if you trace back enough, you'll find the loop where it fetches the string's characters, and checks token values. At some point, as you try to understand what the original programmer was doing, you may reach a dead end... at that point, look for other familiar things, such as accesses to the VRAM (another 'fixed truth' of the machine is the set of VDC hardware addresses), and look at how they manipulate data and so on.

It's not a trivial piece of work, so you will need patience and an inquisitive nature to accomplish this. Chances are, you will at some point find something that looks like a bug. Maybe it is a bug, but the programmer 'fixed' it with a countervailing bug elsewhere. Or the programmer had a strange way of viewing the problem and implemented the solution in a completely counter-intuitive and inefficient way. Ah, the joys of examining somebody else's code...

Reverse-engineering somebody else's program without source code is not easy (it's often difficult even with source code!), but – thinking of it as a puzzle – it can be incredibly satisfying to solve.

I'm going to repeat this, because I don't think I can stress it enough – while understanding the print function, I found the most important thing was determining what scratchpad memory was being used for, so whatever you do, don't skip documenting that.

Hopefully, you will eventually come up with something like the files I am posting here – but it will take some time. Mednafen's single step function ('S' in the debugger) is also helpful, and so is setting other breakpoints to go over the boring parts. With a debugging emulator, we now have the luxury of seeing what values are reasonable (by viewing them 'live'), where branches actually take us, and so on. Much easier than just using a paper disassembly.


Notes (follow-up on my 'clue gathering' suggestions above):
  • Based on where the call to EX_GETFNT takes place, the print function is anchored at 0x15f9e in the ISO file (corresponding to $C79E in memory)
  • It turns out that the first few characters of the first message in DotB1 aren't unique enough, being the main character's name (I mentioned this could happen). The actual location would be found at 0x70f8f7, if you took enough characters from that message to get a unique string. This corresponds with the in-memory address $40F7, which coincidentally is an address you can see in the screenshot above, in the list of zero page values, at <$72/<$73.
Attached are my commented disassemblies of the print function, for your perusal:

printfunc-disassembly-ramuse.asm
printfunc-disassembly.asm
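
A quick sanity check for ISO offset / memory address pairs like the two in the notes above: if the data was loaded in whole 2048-byte sectors to a 2048-aligned address (an assumption on my part, though both pairs in the notes fit it), then the position within the sector must be the same on both sides. The function names below are just for illustration.

```python
SECTOR_SIZE = 2048  # MODE1/2048 data sectors

def sector_and_offset(iso_offset):
    """Split an ISO file offset into (sector number, offset within that sector)."""
    return divmod(iso_offset, SECTOR_SIZE)

def consistent_with_address(iso_offset, mem_addr):
    """True if the offset and the address agree on the position within a sector.

    Assumes sector-aligned loading to a 2048-aligned address (not guaranteed).
    """
    return iso_offset % SECTOR_SIZE == mem_addr % SECTOR_SIZE

# The two pairs from the notes: print function and first message.
pairs = [(0x15F9E, 0xC79E), (0x70F8F7, 0x40F7)]
for iso_offset, mem_addr in pairs:
    sector, within = sector_and_offset(iso_offset)
    ok = consistent_with_address(iso_offset, mem_addr)
    print(f"0x{iso_offset:x} -> sector {sector} +0x{within:03x}, "
          f"${mem_addr:04X} consistent: {ok}")
```

When a search turns up several candidates for the same code or string, this check is a cheap way to discard the ones that can't be the copy you saw in the debugger.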


To Study/Consider in Advance of the Next Post
  • Take a look at the disassembly of the print function -- in particular, the main print loop
  • Think about where/how you might patch this print function in order to get western characters.  (Note: Like most surgeries, I always find that being minimally-invasive is a good policy to prevent failure.)
  • If you have the ISO of this game, study the block of text in the ISO file, and see if you can identify any patterns/structure behind the strings, which may need to be preserved/updated on extract/re-insert

Next post: the print function patch

Continued: Part III
#3
I'm trying to clean up a bit.

I'm not going to watch these again, so I thought I'd give them to somebody who wants them more than I do.

7 VHS tapes, including:
- Lords of Thunder (US)
- Den Den no (Kabuki) Den (JP)
- Valis 3 (JP)
- PC Engine Perfect Video (JP)
- Tengai Makyo Ziria Anime (2 tapes - JP)

(no subtitles)

Shipping from the Toronto area will vary by location and service level... rough estimates are:

At roughly 2.5kg, the cheapest shipping would be about $20US to the NorthEast part of the USA; about double that to Europe.  Better service (such as tracking) would increase the price.

First positive response can have it.  Please respond first to the thread with approximate location (so that others will know when it's gone).  I will PM with shipping options; paypal for payment.  I should be able to ship on Tuesday or Wednesday.

-Dave

PCE_Videos.JPG
#4
Quote from: elmer on 12/31/2015, 02:47 PM
It took a while to figure out how Falcom were setting up the positions of the text sprites, but here you go ...

IMG

There are a lot of different text messages that go into that "Attack +12" string area ... and the number itself can be up to 6 digits long.

I think that I'm going to need that 6x12 font in order to keep any abbreviated text messages understandable.  :-k
Wait... did you say "text sprites" ?
If they're using sprites, you need to know that beyond 16 sprites on the same scan line, there will be hardware flickering.  One more difficulty with longer text phrases.
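
To put rough numbers on that, here is a back-of-the-envelope Python sketch. It assumes 16-pixel-wide sprites with the glyphs packed tightly across them, which may not be how the game lays out its text sprites; the function names are made up, and the 16-per-line figure is the limit mentioned above.

```python
import math

MAX_SPRITES_PER_LINE = 16  # limit mentioned above

def sprites_needed(num_chars, glyph_width, sprite_width=16):
    """Sprites needed side by side for one row of text (assumes packed glyphs)."""
    return math.ceil(num_chars * glyph_width / sprite_width)

def fits_on_scanline(num_chars, glyph_width, other_sprites=0, sprite_width=16):
    """True if the text row plus any other sprites on the line stays within the limit."""
    total = sprites_needed(num_chars, glyph_width, sprite_width) + other_sprites
    return total <= MAX_SPRITES_PER_LINE

text = "Attack +123456"  # worst case from the quote: a 6-digit number
for width in (6, 12):
    print(f"{width}px glyphs: {sprites_needed(len(text), width)} sprites")
```

Under these assumptions a narrow font leaves far more headroom for whatever else shares those scan lines.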
#5
Quote from: kisaku on 06/22/2015, 12:27 PM
original nec! ;)

IMG
They still have those, and other "Bazaru de Gozaru" goods (bags, etc.).  I saw a whole bunch of them last time I was in Japan, but I can't clearly remember where I was (it was a storefront somewhere, probably Akihabara).

Although they created a PC Engine game for it, this mascot was used for a wide variety of NEC goods and services back around 1995, and for quite a while afterwards (perhaps even now, in some cases).

After some searching, I found this:
http://jpn.nec.com/bazar/

In the "history" tab, it says that the characters existed since 1991, and there is a section called "premium goods" with 6 pages of Bazaru items.
http://jpn.nec.com/bazar/library/history/cm.html

But your watch isn't in there.  Maybe it was a game-related limited-edition...

-Dave
#6
Quote from: guest on 02/10/2015, 01:05 PM
I pulled out my poster last night and took a picture:
IMG
It says PC Engine in the lower right corner, and it's the size of like a regular piece of paper when folded so I'm guessing it's from a magazine too.  I got it awhile back randomly with a bunch of other stuff I got from Japan.  If you know what issue though, I'd love to find that as well to match them up :)

Sorry for derailing too!
That doesn't just say "PC Engine", it says "Dengeki PC Engine" (the name of the magazine).
The lower-right corner also says "2月  付録2", which means "February, second pack-in/insert".
Below that is a copyright message for 1993, so I guess February 1993.... but that's WRONG.

I check my own copy of the magazine (it's the first issue of Dengeki PC Engine), and confirm that insert #2 was a poster for.... Sotsugyou (Graduation).

I check February 1994's issue, and it's a veritable Ys 4 festival, including the poster.
#7
Wow, max. attachment size = 128KB... not sure I can make any useful images that small.

I took some snaps of September 1995 Dengeki PC Engine (which I actually have 3 extra copies of, since I had to buy large lots in some cases in order to get individual rare issues).  The pics were several megabytes though.

So instead of posting actual images of the actual magazines (which I will figure out at some point, when I'm ready to deal with a hundred or so...), I'll post links to images I found on the internet, so you can see what they are.


Here's some specs on the magazines themselves:

IMG
Marukatsu PC Engine:
- Ran from January 1989 to March 1994
- staple-bound, roughly 120 pages to 160 pages
- included furigana over the kanji
- years 1989 and 1990 are super-rare; I'm still looking for several issues
- mascot on the front cover varied over the years, but was a bunny-girl for most of the run




IMG
Gekkan PC Engine:
- Ran from January 1989 to March 1994, but had 3 "special issues" published in 1998, in June, August, and October
- staple-bound, roughly 90 to 150 pages
- included furigana over the kanji
- less rare than Marukatsu, but 1989 and 1990 are still hard to find
- mascot on the front cover was various iterations of a gorilla, and slogan was "Monthly magazine for game freaks"




IMG
PC Engine Fan:
- Probably the most popular magazine outside of Japan, but not sure why
- ran from December 1988 to October 1996 consecutively
- "Super PC Engine Fan Deluxe" vol 1 was published in January 1997, and volume 2 was published in May 1997
- staple-bound from December 1988 to July 1996 inclusive; glue-spine-bound starting in August 1996
- roughly 90 to 150 pages
- no furigana
- issues from 1992 onward generally not so rare, but 1988 to 1991 can be as rare as early Marukatsu (I still don't have them all)




IMG
Dengeki PC Engine / G's Engine:
- published from February 1993 to May 1996 as "Dengeki PC Engine"
- name changed to "Dengeki G's Engine" for the run between June 1996 and July 1997
- they dropped the "Engine" part of the name starting in August 1997, and likely PC Engine coverage in general
- staple-bound from February 1993 to February 1994; glue-spine-bound from March 1994 onward
- roughly 120 to 200 pages
- no furigana
- individual issues can be rare, but generally not so much.
- bunny girl on the cover of virtually every issue; this is likely because MediaWorks was formed by a group of former Kadokawa employees (ie. Marukatsu)
- this magazine has considerably more "hentai" than the others
- later issues have a "cut the pages yourself for an 18+ picture guide" section


-Dave