Tape reel data recovery from the Polish MERA-400 computer

Friday February 17th, 2017 asbesto

Around May 2015, Andrea “Mancausoft” Milazzo got in touch with Jakub Filipowicz, a Polish guy involved in historical research on the MERA-400 computer. Jakub was writing an emulator for this machine, but its operating system was missing and copies were almost impossible to find. (Details are on the mera400.pl website.)

Jakub found 5 magnetic tapes at the Warsaw Museum of Technology that hopefully contained copies of the CROOK operating system. The Museum was not able to read them. After some months, he managed to obtain the tapes in order to attempt a data recovery and extract the operating system.

We offered our collaboration, so the tapes were shipped to our Museum. We also received a nice “good luck” sign, which now hangs in our library 🙂

The tapes were recorded around 1990; they had been stored well, in their original boxes, at a constant temperature, and well sealed. They seemed to be written using the Phase Encoding standard at 1600 BPI, but the header may have been recorded at 800 BPI NRZ. This could (and in fact did) create some problems for our tape reader, which can only read 1600 / 6250 BPI tapes.

At our Museum we have about 3000 magnetic tapes, of all brands and types. This means that we can find a tape matching the brand, type and age of virtually any other tape, and run any test we need without wearing out the original tapes. These tests concern the consistency and condition of the magnetic tape, and the treatments needed for a successful data extraction (heat settlement, chemical / heat / mechanical treatments).

We recorded a few test patterns on our test tapes so that we had something to test our recovery system with. We then tested the data extraction pipeline using our equipment.

Jakub’s suspicions that those tapes could not be read were well-grounded: our tape reader (an IBM 9348 tape drive) got stuck at the unrecognized 800 BPI NRZ header.

So the problem was to find another tape drive to recover the data. In our storage we have a Qualstar 1052 tape drive, but we don’t have its computer interface (a PERTEC). We decided to take it out and check it anyway: the idea was to build ad-hoc hardware able to read the signals directly from the magnetic head and translate them into data bits.

We opened the drive and found a nice circuit board with a visible pattern: 9 identical circuit sections, one for each track on the tape (8 data bits + 1 parity bit).

And here a crazy idea sprang to mind: reading the already converted digital signals directly from these circuits, instead of from the read head, in the hope that we would find clean bits there.

We performed a few preliminary tests using an Arduino Mega hooked to the tape drive at test points identified from the drive schematics, which are available online. The help of Enzo “Katolaz” Nicosia, who came from London, and of Mancausoft, working remotely from Krakow, was crucial at this stage 🙂

By putting the tape drive in TEST mode, we could wind and rewind the entire tape; we saw digital data flashing by, data that could be read by our Arduino Mega.

We started a discussion about this method on the Vintage Computer Federation forum; that discussion is available here.

We want to thank Al Kossow, Chuck Guzis, Gerardcjat and the other friends at VCF Forum for the invaluable help they gave us! 🙂

This system was able to read all the tapes, but not the header; moreover, we had some errors because the Arduino software didn’t do any kind of error checking or correction. But the results were really encouraging!

At this point, Jakub proposed a different approach: use a logic analyzer (Saleae, 16 channels) to dump the entire tape, then use software to decode the saved signals and extract the data, according to what is described in this document.

So we ordered a Chinese “Saleae” 16-channel logic analyzer; in the meantime, Jakub sent us his own analyzer so we could start testing. While we were waiting for the analyzers to be delivered, Jakub ran some additional tests by simulating the tape reader circuits in LTSpice.

The software must check data consistency, compensate for track skew (a misalignment between tracks that comes from how the read head is made), and correct data using the parity bits, if necessary.
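To make the parity step concrete, here is a minimal Python sketch. It is not taken from Jakub's software: the function names are hypothetical, each tape frame is assumed to be stored as a 9-bit integer with the parity track in bit 8, and the tapes are assumed to use odd parity across the nine tracks, as is usual for 9-track tapes.

```python
def frame_parity_ok(frame: int) -> bool:
    """Return True if a 9-bit frame (8 data bits + 1 parity bit) has odd parity."""
    # Assumption: odd parity, so the total number of 1 bits across
    # all nine tracks must be odd.
    return bin(frame & 0x1FF).count("1") % 2 == 1


def find_bad_frames(frames):
    """Return the indexes of the frames whose parity check fails."""
    return [i for i, f in enumerate(frames) if not frame_parity_ok(f)]


def repair_frame(frame: int, suspect_track: int) -> int:
    """Flip the bit of one suspect track (0-8) in a frame that failed parity."""
    # Parity only says that a frame is wrong. Which track to flip has to
    # come from other evidence, for example a weak or badly timed pulse.
    return (frame ^ (1 << suspect_track)) & 0x1FF
```

Parity on its own detects a single flipped bit per frame but cannot locate it, which is why `repair_frame` takes the suspect track as an input instead of guessing it.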

As Jakub says:

“The real problem I see is the processing power needed to do that sampling _reliably_. 1600BPI = 3200 flux changes per inch. At 20in/s (slowest drive speed I know of) it means 64kHz signal. Taking Nyquist frequency into account we need to sample at 128kS/s, but from my experience with such a type of signals (where clocking “floats” due to physical nature of media and drive) we need at least 5x-10x more than the signal frequency, so at least 320kS/s. Now, I don’t know how fast the Qualstar drive goes, but fastest speed for 1600BPI is 200in/s, which requires 3.2MS/s… And even 320kS/s can be a problem for an arduino – with 8MHz CPU you have ~25 instructions per sample to spend, in which you need to not only sample the data, store it in a buffer, do some maintenance work, but also send it to a host (not enough memory to store it locally). I really think that fast programmable logic analyzer is the way to go here. The key is to sample the data with precise intervals, otherwise later analysis becomes impossible.”
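The arithmetic in Jakub's estimate can be reproduced with a few lines of Python. This is only a sketch: the function and parameter names are made up for this illustration, the factor of 2 flux changes per bit is the one he uses for 1600 BPI, and the 5x oversampling factor is the lower bound he quotes.

```python
def flux_rate_hz(density_bpi, speed_ips, flux_per_bit=2):
    """Worst-case flux changes per second on one track."""
    # 1600 BPI with 2 flux changes per bit = 3200 flux changes per inch.
    return density_bpi * flux_per_bit * speed_ips


def required_sample_rate(density_bpi, speed_ips, oversampling=5, flux_per_bit=2):
    """Sample rate (samples/s) for a given oversampling factor."""
    return flux_rate_hz(density_bpi, speed_ips, flux_per_bit) * oversampling


def cycles_per_sample(cpu_hz, sample_rate):
    """CPU clock cycles available between two consecutive samples."""
    return cpu_hz / sample_rate


print(flux_rate_hz(1600, 20))                 # 64000   (64 kHz signal)
print(required_sample_rate(1600, 20))         # 320000  (320 kS/s)
print(required_sample_rate(1600, 200))        # 3200000 (3.2 MS/s)
print(cycles_per_sample(8_000_000, 320_000))  # 25.0
```

The last figure is the "~25 instructions per sample" budget from the quote, under the simplifying assumption of one instruction per clock cycle.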

Another problem was cleaning out some “noise”: magnetic flux fluctuations which could be misinterpreted as data.
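One simple way to suppress this kind of noise in sampled data is to ignore pulses that are shorter than a minimum number of samples. The sketch below illustrates that idea for a single track; it is an assumption on our side, not a description of how Jakub's software filters noise, and `remove_glitches` and `min_run` are hypothetical names.

```python
def remove_glitches(samples, min_run):
    """Suppress level changes shorter than min_run samples in a list of 0/1 values."""
    out = []
    level = samples[0] if samples else 0
    pending = 0  # consecutive samples that disagree with the current level
    for s in samples:
        if s == level:
            # The disagreement ended before reaching min_run: treat it as noise.
            out.extend([level] * pending)
            pending = 0
            out.append(level)
        else:
            pending += 1
            if pending >= min_run:
                # The new level lasted long enough: accept it as a real edge.
                level = s
                out.extend([level] * pending)
                pending = 0
    # Samples still pending at the end were never confirmed: keep the old level.
    out.extend([level] * pending)
    return out
```

With `min_run=2`, the one-sample spike in `[0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0]` is removed while the four-sample pulse survives. The right threshold depends on the sample rate and the tape speed, so it has to be tuned per recording.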

The software was written in Python by Jakub Filipowicz after a lot of read tests and debugging sessions; it’s a very beautiful piece of work and is available at this link.

The nice thing about this software is that it is able to read any kind of magnetic tape, recorded at any BPI, at any tape speed and in any writing standard (PE or NRZI).
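As an illustration of what such decoding involves, here is a minimal Python sketch for the NRZI case only, where a flux change on a track means a 1 and no change means a 0. It is not Jakub's code. It assumes that each logic analyzer sample is an integer whose bit n holds the level of track n, and that each track's signal toggles on every flux change; `decode_nrzi` and `window` are hypothetical names.

```python
def decode_nrzi(samples, window):
    """Decode 9-channel NRZI samples into a list of 9-bit frames.

    samples: list of ints, bit n = logic level of track n at that sample
    window:  number of samples over which edges on different tracks are
             still considered part of the same frame (skew tolerance)
    """
    frames = []
    prev = samples[0] if samples else 0
    frame = 0
    frame_start = None  # sample index of the first edge of the open frame
    for t, cur in enumerate(samples):
        # Close the open frame once the skew window has elapsed.
        if frame_start is not None and t - frame_start >= window:
            frames.append(frame)
            frame = 0
            frame_start = None
        # In NRZI a level change on a track is a 1 bit for that track.
        edges = (cur ^ prev) & 0x1FF
        if edges:
            if frame_start is None:
                frame_start = t
            frame |= edges
        prev = cur
    if frame_start is not None:
        frames.append(frame)
    return frames
```

Grouping edges within a window is what absorbs the skew between tracks: tracks 0 and 1 may toggle one sample before track 8 and still end up in the same frame. The window has to stay shorter than one bit cell. For example, if a tape were 800 BPI and read at 50 in/s with 1 MS/s sampling, a bit cell would span 25 samples.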

The fine-tuning of all the software parameters was not easy, but eventually we were able to recover all the data from all the magnetic tapes, restoring everything without errors!

We had several very nice surprises: not only did we recover a copy of the CROOK operating system (which was believed to be lost forever), but we also found many different versions of various operating systems, a dump of a MERA-400 System Disk, OS source code, and a lot of programs.

Jakub wrote:

“Tapes were written on two different systems: MERA-400 and K-202. K-202 is probably the most famous Polish computer system, a predecessor of MERA-400 (MERA-400 is often called a “production-ready” version of K-202).

Four of five tapes contain source code only, and these sources are absolutely golden: 260 files (dated 1973-1984) contain different versions of various operating systems. For most, if not all of them, there were no preserved copies up until now.

– SOK-1 (K-202 original operating system)

– SOWA (K-202)

– CROOK-1 (K-202)

– CROOK-2 (K-202)

– CROOK-3 (K-202 and MERA-400)

– CROOK-4 (MERA-400)

– SOM (MERA-400)

Huge number of versions allow for reconstruction of the whole CROOK development process.

Other sources (hundreds of files) include various OS tools and utilities as well as:

* K-202 command shell
* CEMMA analog circuit simulator (several versions)
* 8080 simulator
* ASSK assembler (for MERA-400 and K202, dozens of versions)
* Basic (for K-202 and MERA-400)

And finally, the fifth tape contain several backups of a live MERA-400 system with complete CROOK-3 OS installed. This means we have a binary CROOK-3 “distribution” ready to run in the MERA-400 emulator. There are also users’ home directories with loads of other software in various stages of development.”

And here is our final press release written by Jakub:

“Recent cooperation between Museo dell’Informatica Funzionante and mera400.pl, a Polish site specializing in preserving history of MERA-400 minicomputer family, proved to be a great success.

We’ve managed to restore data from 5 NRZ1-encoded 9-track tapes written between early 70’s and early 80’s, despite the fact that we didn’t have an NRZ1 compatible tape drive at our disposal.

Restoration was done in two phases. First, a low-level tape images were taken by tapping into 9 head signal paths with a logic analyzer, running all five tapes at a low speed of 50 in/s, and sampling the signals at a rate of 1MS/s. This eventually resulted in 2.2GB of raw head signal images ready for further processing.

To convert the signal data into actual files that were stored on a magnetic tape, custom software was needed. Second phase of the restoration process brought to life a software called “Nine Track Lab” – graphical tool that allows not only decoding NRZ1-encoded data, but also fixing problems related to tape age or drive’s read inaccuracies.

Development version of the software is available at: https://github.com/jakubfi/ninetracklab

Final result was a 100% success: 8340 data blocks were correctly read, yielding almost 1500 files written between 1973 and 1984. Among them were 260 files with source code of various K-202 and MERA-400 operating systems that were presumed lost until now: SOK-1, SOWA, CROOK-1, CROOK-2, CROOK-3 and CROOK-4. Other files contain sources of many utilities, as well as binary copies of a CROOK-3 installation taken from a running machine.

This collaboration made it possible to preserve for future generations an important part of Polish IT history. We are not only happy, but also proud that we could be a part of this process.”

An article related to this data recovery work was published on the MERA-400 Facebook page.

This project would not have been possible without what is, in our opinion, the true identity and characteristic of a Museum: scientific and technical collaboration among researchers and scientists from all around Europe (London, Krakow, Warsaw, Cosenza), and between different institutions / entities (our Museum, the MIAI in Cosenza, the Warsaw Museum of Technology, Dyne.org in Amsterdam). We thank all our MusIF / MIAI / Dyne.org members and everyone involved in this project 🙂

For further information about the MERA-400 system, its emulator, software and technical documentation, please check the MERA-400 website.
