How to Copy a File From a 30-year-old Laptop
How do you transfer files off an Apple laptop from 1994? It's harder than it sounds!
Between a dock and a hard place
A family member presented me with an interesting problem: a PowerBook Duo 280c laptop from 1994 containing a few short audio recordings he wanted to preserve. A quick look at the problem revealed a couple of things working in our favor:
- The laptop still boots... after a few gentle taps on the hard drive to get it to spin up!
- The audio files play back locally through the laptop's internal speakers.
On the other hand:
- The laptop has no audio jacks, so we couldn't get good analog copies of the recordings even if we wanted to.
- The internal hard drive uses SCSI with an unusual connector. Adapting it didn't seem straightforward, and we weren't confident the old file system (HFS) would be easy to read from a modern system.
- An external floppy drive accompanied the laptop... but the computer would not run with it connected. We couldn't figure out the hardware fault causing this.
- The laptop has an AppleTalk port and a phone jack... but no networking software installed.
Remember, it comes from a time before the Internet was mainstream. HTTP, the protocol underpinning the modern Web, wasn't published as a specification (HTTP/1.0) until 1996. Software for connecting to a dial-up provider would have been available at the time, but was never installed. However, the laptop does have a phone dialer application, which gave us our first clue towards a potential solution.
Deus fax machina
While the laptop has no networking software, it does have fax software. We confirmed the modem could dial, so this might just be crazy enough to work.
The first question was how to turn the audio file into something faxable. The laptop contains a collection of games. Alongside them is a resource editor, called ResEdit, which had previously been used to inspect and modify the aforementioned games. Let's see what it can do.
ResEdit has probably the most adorable splash screen ever.
Luckily, it includes a hex editor which allows you to view the raw contents of files in hexadecimal form. This is exactly what we need! However, since ResEdit doesn't support printing, we'll have to copy the text into a different application that does.
This particular sound file is 37,928 bytes long. The hexadecimal representation is double that (75,856 characters), since each byte is represented by two hexadecimal digits (0-9, A-F).
Huh. I guess we'll have to do it piecemeal. (And yes, that message literally means 32000, not 32767. Some programmer must have really liked base-10 numbers and signed shorts.) We'll use batches of 12288 (0x3000 in hex) to make the offsets easier to remember.
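As a rough sketch of the bookkeeping involved, the following Python computes the copy ranges for a 37,928-byte file split into 0x3000-byte batches. It assumes the batch size counts bytes of the original file (each of which becomes two hex characters); the function and variable names are made up for illustration.

```python
FILE_SIZE = 37_928    # bytes, the sound file size quoted above
CHUNK_SIZE = 0x3000   # 12,288, the batch size quoted above (assumed to be in bytes)


def chunk_ranges(file_size, chunk_size):
    """Return (start, end) byte offsets, end exclusive, that cover the file."""
    return [
        (start, min(start + chunk_size, file_size))
        for start in range(0, file_size, chunk_size)
    ]


ranges = chunk_ranges(FILE_SIZE, CHUNK_SIZE)
for start, end in ranges:
    hex_chars = (end - start) * 2  # two hex characters per byte
    print(f"0x{start:04X}-0x{end:04X}: {end - start} bytes, {hex_chars} hex characters")
```

Under that assumption the file takes four copies, starting at 0x0000, 0x3000, 0x6000, and 0x9000, with a short final batch.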
Now that we have a chunk of the document in the clipboard, what do we do with it? Thankfully, Microsoft Office is also installed on this laptop, and it supports printing!
Pasting the contents of the clipboard into a Word document inserts the hexadecimal representation verbatim.
These sound files are all under 100KB, so copying them into Word doesn't take long. I'd estimate the transfer speed of manually copying the contents from ResEdit into Word via the clipboard averages about 316 bytes/second.
With the whole file copied into a text document, we can leverage the previously-discovered fax software from the print dialog. But who do we send the fax to?
Fax is the new file copy
My college laptop, a ThinkPad T60, is the only computer I still own with an internal dial-up modem. It's running Windows XP, which includes a fax application that can listen to the modem for incoming faxes and store them as multi-page TIF images.
There's a problem though. Simply plugging the two laptops together with a phone cable doesn't work. The PowerBook dials, but the ThinkPad doesn't answer. There is an "Ignore Dial Tone" option, but it makes no difference.
It turns out the voltage provided by an actual landline is important for the modem to function properly. Fortunately, a simple phone line simulator circuit can overcome this problem, and is very easy to build with some common electronic components and a 9V battery.
It's crude, but effective! (Side note: WAGO lever nuts are super handy for all sorts of electrical projects.) This schematic shows the circuit diagram of the mess of wires above.
With the two laptops now connected via a simulated phone line, we attempt to fax the document again.
Oops. The FCC says I need a cover page. Let's try to appease them.
There we go.
It's working! It takes a while for it to rasterize the fax pages. After that's done, it estimates the fax will take 24 minutes to send.
That's kind of long. What if we reduce the font size?
Down to 6 pages in 7 minutes. Much better. It chugs along at 14,400 bps until the fax is fully received by the ThinkPad.
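For a sense of scale, here is a back-of-the-envelope calculation in Python. It assumes the 6-page fax carries the 37,928-byte file mentioned earlier and that the whole 7 minutes is spent at the full 14,400 bps line rate; both are simplifications, and the variable names are illustrative only.

```python
PAYLOAD_BYTES = 37_928   # size of the sound file quoted earlier
FAX_SECONDS = 7 * 60     # estimated transmission time
LINE_RATE_BPS = 14_400   # modem speed during the fax

# Upper bound on the raw bytes the line could carry in that time.
line_capacity_bytes = LINE_RATE_BPS // 8 * FAX_SECONDS

# How fast the original file is effectively moving across the wire.
effective_rate = PAYLOAD_BYTES / FAX_SECONDS

# Fraction of the line capacity that ends up being useful payload.
efficiency = PAYLOAD_BYTES / line_capacity_bytes

print(f"line capacity:  {line_capacity_bytes} bytes")
print(f"effective rate: {effective_rate:.1f} bytes/second")
print(f"efficiency:     {efficiency:.1%}")
```

Under those assumptions the file moves at roughly 90 bytes/second, which is the price of sending every byte as two rasterized characters.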
It worked! Now the last step is figuring out how to turn this image back into a binary file.
The file transfer to end all file transfers
My first idea is to convert the TIF to a PDF and use optical character recognition (OCR) to translate it back to text. These images seem like the ideal candidate for an OCR algorithm:
- The text is computer-generated, so it's perfectly aligned and consistent.
- Only 16 different characters are used (0-9, A-F).
- It uses Courier, an extremely common fixed-width font.
After running the PDF through an OCR process, the text is now selectable. I can copy it into a hex editor and save it as a binary file.
The resulting file's size is close to, but not exactly, what I expected. Nonetheless, I import it into Audacity, which is able to deduce the audio format: unsigned 8-bit PCM, little-endian, 22050 Hz, mono. A waveform appears!
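The hex-editor step can also be scripted. This is a minimal Python sketch, not the method used here: it strips whitespace from the OCR text, rejects anything that isn't a hex digit, and decodes the rest. The function name is hypothetical.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def hex_text_to_bytes(text):
    """Strip whitespace, validate, and decode hex text into raw bytes."""
    digits = "".join(text.split())
    bad = sorted({ch for ch in digits if ch not in HEX_DIGITS})
    if bad:
        raise ValueError(f"non-hex characters found: {bad}")
    if len(digits) % 2:
        # An odd count means a character was dropped or added somewhere.
        raise ValueError("odd number of hex digits")
    return bytes.fromhex(digits)


sample = "00 7F\n80 FF"
data = hex_text_to_bytes(sample)
print(len(data), data.hex().upper())
```

Comparing `len(data)` against the known file size is a cheap first check: a mismatch means the OCR dropped or invented characters.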
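Outside Audacity, the same interpretation of the data can be sketched with Python's standard `wave` module. This assumes the bytes really are headerless unsigned 8-bit mono samples at 22050 Hz, as Audacity guessed; any header bytes in the original file would simply be played as noise. The function name is made up for illustration.

```python
import io
import wave

SAMPLE_RATE = 22050  # Hz, as deduced by Audacity


def raw_u8_to_wav_bytes(raw, sample_rate=SAMPLE_RATE):
    """Wrap raw unsigned 8-bit mono samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(1)          # 1 byte per sample; 8-bit WAV is unsigned
        wav.setframerate(sample_rate)
        wav.writeframes(raw)
    return buffer.getvalue()


# One second of silence: 128 is the midpoint of the unsigned 8-bit range.
raw = bytes([128] * SAMPLE_RATE)
wav_bytes = raw_u8_to_wav_bytes(raw)
print(len(wav_bytes), "bytes of WAV data")
```

At this sample rate, a 37,928-byte file works out to about 1.7 seconds of audio, which matches the description of a short recording.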
I play it back, and it sounds pretty good! Especially considering it was recorded from the internal microphone of a laptop from 1994. There's an audible problem though, and it's actually visible on the waveform itself. Zooming in reveals it more clearly.
All of those abrupt dips are the result of OCR transcription errors. They manifest as crackling or popping noises throughout the recording. I need a better way to convert this text to a file.
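To illustrate why a misread digit shows up as a pop, here is a small Python sketch that flags samples which jump away from both neighbors in the same direction. The threshold of 48 and the function name are arbitrary choices for this example, and the check only catches single-byte substitutions; a dropped or inserted character shifts every digit after it and corrupts the rest of the file.

```python
def find_spikes(samples, threshold=48):
    """Return indices of samples that jump away from both neighbors."""
    spikes = []
    for i in range(1, len(samples) - 1):
        prev_diff = samples[i] - samples[i - 1]
        next_diff = samples[i] - samples[i + 1]
        same_direction = (prev_diff > 0) == (next_diff > 0)
        if abs(prev_diff) > threshold and abs(next_diff) > threshold and same_direction:
            spikes.append(i)
    return spikes


# A smooth ramp where one byte (0x85 = 133) was misread as 0x35 = 53.
samples = bytes([128, 130, 0x35, 136, 138, 140])
print(find_spikes(samples))
```

A misread high digit moves the sample by a multiple of 16, which is why the errors stand out so clearly against an otherwise smooth waveform.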
I try a bunch of different OCR programs, but can't find any that can transcribe the document with 100% accuracy. They often confuse certain letters or numbers (like 0 and C, 9 and 4, 0 and D). Sometimes they omit characters, sometimes they introduce new ones. I try different font sizes and different fonts, but it doesn't matter.
I need to process these files with perfect accuracy, and OCR software is not going to cut it. With no sign of improvement a couple dozen faxes later, I decide to write my own.
The fax, the whole fax, and nothing but the fax
Knowing that the text in these images is tightly structured, I can make a lot of assumptions when processing them. They were generated with a fixed-width font, which means the whole document is essentially an evenly-spaced grid of characters. Once I figure out the correct starting point, character offsets, and line spacing, I can capture and analyze each character individually.
I can use the preview on the left to quickly confirm whether the size and offset parameters are correct. Once they are, there will be no drift in the preview when moving between characters or lines. Processing can commence.
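The grid idea can be sketched in a few lines of Python. This is not the actual program's code: the class, its field names, and the numbers in the example are all hypothetical. Each cell position is computed from the origin rather than by stepping from the previous cell, so rounding errors don't accumulate into drift when the character pitch isn't a whole number of pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    origin_x: float     # left edge of the first character, in pixels
    origin_y: float     # top edge of the first line, in pixels
    cell_width: float   # horizontal distance between characters
    line_height: float  # vertical distance between lines

    def cell_box(self, row, col, glyph_w, glyph_h):
        """Return (left, top, right, bottom) for the character at row, col."""
        left = round(self.origin_x + col * self.cell_width)
        top = round(self.origin_y + row * self.line_height)
        return (left, top, left + glyph_w, top + glyph_h)


# Hypothetical measurements for one fax page.
grid = Grid(origin_x=40, origin_y=60, cell_width=9.5, line_height=16)
print(grid.cell_box(0, 0, glyph_w=9, glyph_h=14))
print(grid.cell_box(2, 10, glyph_w=9, glyph_h=14))
```

If the four parameters are right, the box for any row and column lands squarely on a character; if they are slightly off, the error grows with distance from the origin, which is the drift the preview makes visible.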
The actual character recognition takes a quick-and-dirty approach. For each unique pattern encountered, the program simply asks you to specify what it is, and subsequently remembers the answer for the next time it finds an identical-looking character.
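In outline, that approach amounts to a lookup table keyed by the exact pixels of each character. The Python below is a minimal sketch of the idea rather than the real program: glyphs are represented as `bytes` so they can be used as dictionary keys, and the `ask` callback stands in for the person typing the answer.

```python
def recognize(glyphs, known, ask):
    """Map each glyph bitmap to a character, asking only about unseen patterns."""
    out = []
    for glyph in glyphs:
        if glyph not in known:
            known[glyph] = ask(glyph)  # remember the answer for next time
        out.append(known[glyph])
    return "".join(out)


# Two made-up 5-row bitmaps, one byte per row.
GLYPH_A = b"\x18\x24\x42\x7e\x42"
GLYPH_0 = b"\x3c\x42\x42\x42\x3c"

answers = {GLYPH_A: "A", GLYPH_0: "0"}  # simulated human answers
prompts = []


def ask(glyph):
    prompts.append(glyph)
    return answers[glyph]


known = {}
text = recognize([GLYPH_0, GLYPH_A, GLYPH_0, GLYPH_0, GLYPH_A], known, ask)
print(text, "after", len(prompts), "prompts")
```

Because matching is exact, a character that rasterizes even one pixel differently becomes a new pattern and triggers another prompt. That costs some extra training, but it never silently guesses wrong.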
(If for some strange reason anyone might find this program valuable, the source is available to download here.)
After a bit of manual training, it outputs the file. Success! The audio plays back smoothly without any popping. It's a perfect byte-for-byte copy of the original file!
With the most complicated file copy in history finally complete, there's only one thing left to do.
Nostalgia achievement unlocked.
(Of course it runs DOOM.)