Note: You will need to have your speakers turned up to fully appreciate this entry.
I’m going to depart from the normal technical focus of my blog and talk about something more people can relate to: music.
That is, the format of music, as used in the context of computers. Most people can clearly see how enormous an influence technology has exerted on music in the last few decades. In the case of computers, there have been two main revolutions in the musical realm:
1) The ability to synthesize and create music by means not limited to the classical methods of rehearsal and performance.
2) The ability to script playback of digitized music in an automated fashion.
The first revolution came in the form of the synthesizer-loving 1980s, as well as in the various hardware and software that facilitated note-by-note recording and playback of wired instruments. The second revolution occurred at roughly the same time, perhaps a bit later, in the form of video games, CD-ROM players, and MP3 players.
Both of these revolutions needed digital storage formats to keep track of the musical data. Without them, you’d have no way to “store” the music for playback beyond straight analog recordings.
The foremost and best-known digital music format is Musical Instrument Digital Interface, or MIDI. I originally created the 14 musical tracks of Cruz as MIDI files. A MIDI file stores notes and instruments as metadata: C key octave 4 note down, E key octave 3 note down, C key octave 4 note up, etc. The sequence of “command” and “data” bytes in a MIDI file, when interpreted by software and hardware, can be used to play back a composition that a musician is working with.
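To make the “command” and “data” bytes a little more concrete, here is a minimal Python sketch of the three events listed above. The helper names (`note_number`, `note_message`) are made up for illustration, the velocity of 64 is an arbitrary choice, and the numbering assumes the common convention in which middle C (note 60) is called C4; some software labels that same note C3. A real MIDI file also stores timing information between events, which is left out here.

```python
# Status "command" values for channel voice messages (upper four bits).
NOTE_ON = 0x90   # "note down"
NOTE_OFF = 0x80  # "note up"

# Semitone offsets of the natural notes within one octave.
PITCH_CLASSES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_number(name, octave):
    """Map a note name and octave to a MIDI note number (assumes C4 = 60)."""
    return 12 * (octave + 1) + PITCH_CLASSES[name]


def note_message(status, channel, note, velocity):
    """Build one 3-byte message: a command byte followed by two data bytes."""
    return bytes([status | channel, note, velocity])


# The three events from the paragraph above, all on channel 0.
events = [
    note_message(NOTE_ON, 0, note_number("C", 4), 64),   # C4 note down
    note_message(NOTE_ON, 0, note_number("E", 3), 64),   # E3 note down
    note_message(NOTE_OFF, 0, note_number("C", 4), 0),   # C4 note up
]

for event in events:
    print(event.hex(" "))
```

Each printed line is one message: the first byte says what happened and on which channel, and the next two bytes say which note and how hard it was struck.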
In this respect, MIDI has immeasurably helped in the composition process. The ability to listen to how your music sounds, when not being performed, is critical when deciding which music is good and which music still needs work.
Take the “Cruz Puzzle Theme” track’s music, composed in Anvil Studio. The metadata exists as command and data bytes, but it can be rendered on the computer screen as a musical staff customarily seen in sheet music:

Anvil Studio lets the composer add effects and notations to the staff. I didn’t in this case, because I was more concerned with how the music sounded than how it would look printed.
MIDI sequencers like Anvil Studio are invaluable assets to people like me–I absolutely need to hear the music while composing it to determine if I’m on the right path or not. This is very much unlike programming, where so much of a program must be written before compiling it to see if the entire thing works at once.
With only a few tweaks, a composer can move notes, change the tempo, change keys, change instruments, and make other changes that might have required countless hours if hand corrections were used on paper!
So MIDI is great. But it’s not enough.
Anyone who appreciates good musical performances knows that music, as performed, is a lot more nuanced than what simply appears on a staff. Many effects are instrument-specific, and the need for custom instruments is very important. MIDI locks the composer into a relatively limited number of instruments.
But it gets worse. Each MIDI playback device has its own wavetable “soundfont” for the instrument set! This means that the instruments for a MIDI song composed in one MIDI context might sound very different for some instruments when played back in a different MIDI context. If a composer works hard on making a song sound a certain way, he or she will not want to see it “ruined” by the wrong soundfont.
Compare the two versions of the song, and see if they sound the same:
Cruz Puzzle Theme – MP3
Cruz Puzzle Theme – MIDI
If the MIDI theme sounds different (and probably worse), it’s because the MIDI playback device on your computer isn’t using the SoundMAX default soundfont. I converted the MIDI file to MP3, which is a purely digital waveform format, in order to head off the problem of inconsistent playback.
Which leads me to the next part of this topic: tracker software.
An early alternative to MIDI that allowed for direct wavetable configuration is MOD. The idea behind MOD, UMX, and other “tracker” formats is that both the note-by-note metadata and the wavetable samples used in playback are stored in the same file, resulting in consistent-sounding music regardless of playback device.
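As a rough illustration of that idea, the Python sketch below bundles note events and the samples they refer to into a single object. This is not the actual MOD or UMX file layout; the class names, fields, and the placeholder sample bytes are all hypothetical and exist only to show the notes and the sounds traveling together.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """A recorded instrument sound stored inside the module itself."""
    name: str
    pcm: bytes  # raw waveform data (placeholder bytes in this sketch)


@dataclass
class NoteEvent:
    """One note: when it plays, on which channel, and with which sample."""
    row: int
    channel: int
    sample_index: int
    pitch: str


@dataclass
class TrackerModule:
    """Notes and samples together, so playback needs nothing external."""
    title: str
    samples: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def samples_used(self):
        """Names of the samples that the note events actually reference."""
        return sorted({self.samples[e.sample_index].name for e in self.events})


module = TrackerModule(
    title="Example Theme",
    samples=[Sample("square lead", b"\x00\x7f"), Sample("noise drum", b"\x10\x20")],
    events=[NoteEvent(0, 0, 0, "C-4"), NoteEvent(4, 1, 1, "C-4"), NoteEvent(8, 0, 0, "E-4")],
)
print(module.samples_used())
```

Because every note points at a sample stored in the same object, a player never has to fall back on whatever instrument set happens to be installed.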
There are many brands of “tracker” software, but the idea behind them is essentially the same: you edit note metadata, applying effects, etc. This is much like Anvil Studio, although the editing display is often different from the “sheet music” view seen in Anvil Studio.
![]()
Above is a screenshot from FamiTracker, a utility that makes tracks compatible with the sound-generating capabilities from the 8-bit NES. I used it to make the “chiptune” variant of the Cruz Puzzle Theme. Have a listen:
Cruz Chiptune Puzzle Theme – MP3
The differences between MIDI and FamiTracker formats required some time to port over, but because I had already composed the original, it came together in only a single evening.
It certainly demonstrates the utility of FamiTracker to have allowed me to customize the note sampling periods. If you’re sharp, you noted from the Anvil Studio staff screenshot that I’ve hit you over the head with a very unusual feature in music: the 5/4 time signature!
The 5-count time signature is extremely rare. The vast majority of songs (and musicians, it would appear) never leave the comfortable realm of 4/4. But well-made software will accommodate nearly any “trick” the composer might have up his sleeve.
I made a 5-count song for two reasons: influence from Jesus Christ Superstar, which has 5-count and 7-count songs, and because I really wanted to challenge myself to make a quality 5-count song.
Of course, composing also requires audio editing, which is challenging no matter how difficult the source material was to create. This leads me to the final part of this topic: purely digital formats.
To represent digital waveform data, no matter what the source is (voice included), you must use a format like WAV or MP3. The WAV format contains lots (and I do mean lots) of individual sample points of the audio waveform for any one second of playback. For CD-quality audio, that’s 44,100 sample points per second for each channel!
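To get a feel for those numbers, here is a small Python sketch that generates one second of a plain sine tone and packs it into WAV data in memory using the standard `wave` module. The 440 Hz pitch, the half-volume amplitude, and the function names are arbitrary choices for illustration, and the sketch uses a single (mono) channel at 16 bits per sample.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100   # sample points per second
SAMPLE_WIDTH = 2      # bytes per sample point (16-bit audio)
CHANNELS = 1          # mono, to keep the sketch short


def sine_samples(freq_hz, seconds):
    """Return 16-bit sample points for a sine tone at half volume."""
    count = int(SAMPLE_RATE * seconds)
    return [
        int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(count)
    ]


def to_wav_bytes(samples):
    """Pack the sample points into an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        # "<h" means little-endian signed 16-bit, one per sample point.
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buffer.getvalue()


samples = sine_samples(440, 1.0)
wav_data = to_wav_bytes(samples)
print(len(samples), "sample points,", len(wav_data), "bytes")

# Stereo doubles the data: two channels of 16-bit sample points.
stereo_bytes_per_second = SAMPLE_RATE * SAMPLE_WIDTH * 2
print(stereo_bytes_per_second, "bytes per second in 16-bit stereo")
```

At 176,400 bytes per second for 16-bit stereo, a single minute of uncompressed audio comes to a little over 10 million bytes.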
Obviously, the large storage size required for waveform audio has created a need for compressed formats. The most common compression format is MP3, which sacrifices a bit of quality for reduced file size. There are a few others, such as Ogg Vorbis.
After music is composed, one must translate it to digital waveform data. But just one direct-translation operation is rarely adequate–filtering operations, volume control, fade-in and fade-out, making stereo tracks from mono, and a whole lot of other clipping, segueing, and tweaking operations must occur to make a song ready for use in the world of computing.
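A few of those operations are simple enough to sketch directly on a list of sample points. The Python below is only an illustration with hypothetical function names and a linear fade; an editor like Audacity offers far more than this, and nothing here describes how it works internally.

```python
def fade_in(samples, fade_len):
    """Scale the first fade_len sample points linearly from silence upward."""
    out = list(samples)
    n = min(fade_len, len(out))
    for i in range(n):
        out[i] = int(out[i] * i / n)
    return out


def fade_out(samples, fade_len):
    """Fade the end by reversing, fading in, and reversing back."""
    return fade_in(samples[::-1], fade_len)[::-1]


def mono_to_stereo(samples):
    """Duplicate each mono sample point into left/right interleaved pairs."""
    stereo = []
    for s in samples:
        stereo.extend((s, s))
    return stereo


tone = [100] * 8
print(fade_in(tone, 4))
print(fade_out(tone, 4))
print(mono_to_stereo([1, 2, 3]))
```

Real tracks run through many such passes, which is why a dedicated waveform editor is worth having.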
Several waveform-editing applications exist, such as Audacity. The screenshot below is of the waveform of the Cruz Puzzle Theme, as seen in Audacity.

In conclusion, I hope I haven’t blown anyone’s brains out. I’ve just summarized a huge topic, leaving out countless details, in the hopes of giving people a general idea of the “world” of musical composition that computer-minded composers live in.
The skill required to master all of these tools and formats? Hard to say. It definitely makes a difference how musically inclined you are in the first place, as well as how computer-savvy you already are.
Take me, for instance. I’ve used audio editing software for years, but I never composed anything (in Anvil Studio, OR FamiTracker, OR anything else) before 2011!
Only one thing is required for sure: you must love music.
