Archive for the ‘Uncategorized’ Category

Patchmaker International

Monday, June 14th, 2010

Hi folks. Since I started at GD, I’ve been away from the blog, but here’s a format I’ve been working on for a while: IPS, or International Patching Specification.

My last entry about ROM hacking only discussed what a person does to make a hack. But with a mind for distribution, no hack would ever legally pass muster without some sort of patching scheme. This involves a format that facilitates patching, and software applications that can make and apply such patches.

A patch file contains only changes to a file. Suppose you have version A and version B of a file: the patch, once applied to A, will “transform” A into B, provided you have version A to begin with.

The most common type of patch file used in ROM hacks is IPS, and for good reason:

1) It’s common.
2) It’s simple.
3) It works well with small binaries.

A number of utilities exist to apply patches, such as IPSWin.exe. It applies patch files with IPS extensions to whole ROMs or other files. Generally speaking, you cannot distribute the whole ROMs legally to another person, but you can distribute the patch, which contains only changes you’ve made personally.

There is a lot of legal gray area here. But I’m not concerned with that–and neither should you. What concerned me was that I never found many utilities that could make IPS files in the way I wanted.

I went to work on BARfly and made a new I.F. called IPS.BAR. This file actually has three functions rolled into one:

1) It characterizes patch records of an existing IPS file.
2) It allows you to patch a file using version A of a file and an IPS file.
3) It allows you to make an IPS from version A and version B of a file.

It’s #3 that is most useful to me. I had to give it some thought, because there are some nuances to making a patch have a small size. Run-length encoding is supported in IPS, so I took advantage of that whenever possible.

When testing it out, I was shocked to discover that in some cases, IPS.BAR generated patch files nearly half the size of some of the originals I’ve found online!

Anyway, I’m satisfied with the utility, and I encourage anyone who wants to learn about IPS to check out the new file on my website. Download it here: http://www.chriskallen.com/barfly/IPS_Patch_File.bar.

I will note that the instructions are not that straightforward because they involve some copying, pasting, and deleting. This is because BARfly’s original charter was to characterize data formats. Its utility in performing more complex file operations is limited as a result. I will be changing this soon, of course. Having a schema have diverse file inputs and file outputs raises the “BAR” significantly–pun intended.

MDB.BAR: Making a database out of…a database

Wednesday, November 18th, 2009

Since mid-September, I’ve hosted a limited BAR I.F. of the Microsoft Jet database format, or MDB. This I.F. breaks the database records into appropriately-sized chunks, identifying various fields.

Of course, the first I.F. for MDB only allows one to look at data table format information. It does not support the “magic row-column” formula used to extract the data, which is to say, you can plug in a record number, a column number, and then you’ve got your data for that row-column combination.

I’m working to correct this, which will allow a person to get information out of any field in a data table. But even this seems like it’s too limited. So you get the information out of row 3, column 7. What does this mean? Most people reference records by a primary key, which is one or more of the columns. Oftentimes, non-key fields must be searched, with each matching record returned. And, of course, let’s not forget that columns are usually referred to in queries by name, and not by number.

BARfly would do wonders to extract entire data tables at once, given this modification. But what would it mean to use a BAR I.F. to manage the database file like you would expect an actual database to be managed? What would you have to do?

Well, I’m not going to discuss all the details, because it would take too long. So I’ll limit my focus to just a fundamental operation programmers and DBAs routinely perform on databases: the select query.

Queries can be written in many forms, but I’m just going to focus on MDB’s preferred format, which is called SQL. A SQL query has the following general form:

SELECT alpha,gamma,beta FROM greekdb.alphabetletters WHERE (caps=”true” AND highlite<>“yellow”) ORDER BY delta

Roughly translated, we want to extract a variable number of records, with three columns (“alpha”, “gamma”, and “beta”) per record returned, from the table “alphabetletters,” which belongs to database greekdb. Limit the records returned to those whose column “caps” has a true value, and whose column “highlite” has any value other than yellow. Finally, order the results, ascending, by the contents of the “delta” field in each returned record.

Any decent SQL processing system will take a database query and strip it down to the fundamental inputs that really matter in the query itself. Once all the text is read and understood, the database query looks more like this to the query handling logic:

Search database 2, table 5, for records where (column 10 is true and column 12 has a negative string comparison result with the text yellow). From these records, extract columns 4, 6, and 5 into the result dataset, in that order, ignoring the other columns. Finally, order the records of the result dataset using column 7 (ascending by numeric comparison).

Computers are nothing but numbers, and everything ultimately breaks down into a number. Once all the SQL has been translated, the real work is retrieving (or populating) all the pertinent data in the database for the user.

So, could a BAR I.F. do all that? Translate the query text, AND collect the necessary data, AND package the results in a format the user had requested? Absolutely.

BAR, remember, has decent regular expression processing capabilities, and can translate the SQL query as needed. Offset tweaking and length calculations are a fundamental part of the BAR deserialization procedure, allowing one to perform the “magic row-column” formula as many times as needed. Finally, the Deserialize functionality allows endless possibilities for data translation after nodes have already been characterized, yielding an appropriate dataset in the order the user had requested.

SQL gurus know that many queries are far more complex than the example I’ve just given. One-to-many relationships, multiple keys, internal consistency rules, etc. are not covered by the example. Clearly, I would be reinventing many wheels by making BARfly encompass nearly every possible creative query that a person could ever dream up, with few obvious returns. It’s far more worth it to use a genuine frontend, like Microsoft Access, if you need to use all the features of the database software.

But if you’re needing to perform data recovery, or if you want to perform easy translation of data without regards to internal consistency, or if you want to migrate the data into XML, UTD, or another format in preparation for a big move to a new format, or if you are writing a program that requires a light memory footprint with little need for all the advanced features of a database-access library of functions, consider the I.F. approach.

And don’t forget this–an I.F. is platform-independent.

I want to mention one more point. Microsoft has released 2.x of the Entity framework, which attempts to provide more universal access to different types of databases, with SQL Server being the preferred target. Query format is no longer important, nor is programmatic knowledge of database-specific features (Oracle has different features of SQL Server, for instance). Great stuff. But the framework does not, to my knowledge, allow anyone to customize what, exactly, the backend is. You’re locked into a predefined set of known database formats.

BAR represents the final piece of the puzzle. With BAR, there is the very real possibility of having a unique component that defines the format for a custom database backend. Having one or more implementation files act as “backend definitions” could, theoretically, allow for a framework with unfettered access to any type of data, regardless of whether or not the data accessed is even a database at all!

A New Beginning

Monday, June 8th, 2009

Okay, people; BARfly Bronze Demo is now posted. Feel free to download and comment here as needed.

I’ve given the general public some basic versions of common file formats to play with. The detailed versions of the same file formats I’m saving for those who want to purchase them. For example, a detailed version of GIF performs LZW to generate pixel data, and a detailed version of MIDI parses the individual MIDI commands in each track!

There is no limit to how “low level” BARfly can go when it comes to analyzing data. Even better is the fact that the deserialization process is kind of like a “Wonkavator:” it can parse forwards, backwards, sideways, slantways, you name it! How else can a schema handle the likes of nonlinear formats like TIFF, ICO, and WAD?