In my experience with archived data, nothing is totally safe. Archives need to
be backed up, but then can you rely on the backup?
I recall desperately copying an archive of floppy disks to Syquest disks,
fitting 100 or so floppies to each disk. Then, a few years later, I copied the
Syquest data to hard disk when the Syquest disks started dying too. There were
scary moments when the now irreplaceable drive began failing as well.
At least on hard disk I could then regularly copy them to a DAT backup drive in
one hit. Those DAT drives used to last about two years before they became
unreliable, but at least the next model could read the previous model's tapes,
unlike the 1/4" tape drives that preceded them. These days LTO drives live much
longer and are backward compatible too.
I feel this regular copying is the key to safety, and these days it would be
copying to a rotated, regularly replaced set of removable hard drives. At least
then one knows immediately when failure begins, and one can replace the drives
before it's too late.
If random loss of bits occurs due to weakening of the tiny magnetic fields,
perhaps just running regular low level checks on the disk will rewrite them and
keep them at full strength? Not possible if the disk is a CD or DVD, of course.
But I've never been involved in audio archiving before. I'm interested in the
comparison of storing digital audio with storing analog tapes. Unless
physically damaged, a tape can always be placed in a player and you can at
least hear *something* even if it's deteriorating.
What can you do if a hard disk deteriorates? With digital data, your computer
could just tell you no, you can't even look at that file, or perhaps even the
whole disk.
Often one can still run a recovery program, which pulls off data anyway, even
if the computer says it's damaged and unreliable. (For hard disks at least. I'm
not sure about CDs, etc.)
These programs have to guess where each file begins and ends, and can give you
a pile of files with names like unknown00001.dat. It's then up to you to
examine each one with a hex viewer and try to guess the filetype from patterns
in the headers. For example, a JPEG file will usually have the letters JFIF (or
Exif) in the first few bytes.
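
The header check described above can be partly automated. Here is a minimal sketch, assuming the recovered files are readable from Python. The names guess_type and guess_file and the SIGNATURES table are my own inventions for illustration, and the table is only a small subset of known signatures:

```python
# Hypothetical helper: guess a recovered file's type from its first bytes.
# The signature table is a small illustrative subset, not a complete list.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),  # JPEG start-of-image marker
    (b"RIFF", "riff"),          # RIFF container (WAV, AVI, ...)
    (b"%PDF", "pdf"),
    (b"PK\x03\x04", "zip"),
]


def guess_type(header: bytes) -> str:
    """Return a rough type name for the given leading bytes."""
    # A WAV file is a RIFF container with "WAVE" at byte offset 8.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def guess_file(path: str) -> str:
    """Read the first 16 bytes of a file and guess its type."""
    with open(path, "rb") as f:
        return guess_type(f.read(16))
```

Running guess_file over a directory of unknown00001.dat style files would at least sort them into piles before any manual hex viewing.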
This makes me think your archived audio would be safer if you knew all the
files were, for example, wav files. I.e., it might help not to mix audio and
documents and programs on the same archive disk.
Sometimes the recovery program can't tell for sure where the files start and
end, and there might be several files in the one recovered file. Then knowing
the file format and what the headers mean will help. Is there an audio format
that records the file length at the start?
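
As far as I can tell, wav is such a format: a wav file is a RIFF container, and bytes 4 to 7 of the header hold a little-endian 32-bit size equal to the file length minus 8. Here is a rough sketch of using that to split several wav files that were recovered as one blob. The function names are hypothetical, and it assumes the size fields were written correctly, which may not hold for recordings that were interrupted:

```python
import struct


def riff_total_length(data: bytes, offset: int = 0):
    """Return the declared total length of the WAV file at offset, or None."""
    header = data[offset:offset + 12]
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        return None
    # Bytes 4..7: little-endian size of everything after the first 8 bytes.
    (size,) = struct.unpack("<I", header[4:8])
    return size + 8


def split_wavs(data: bytes):
    """Split a blob of concatenated WAV files using their declared lengths."""
    pieces = []
    offset = 0
    while True:
        offset = data.find(b"RIFF", offset)
        if offset == -1:
            break
        length = riff_total_length(data, offset)
        if length is None:
            offset += 4  # "RIFF" without "WAVE": keep searching
            continue
        pieces.append(data[offset:offset + length])
        offset += length
    return pieces
```

If the blob is truncated, the last piece will simply be shorter than its header claims, which is itself a useful warning sign.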
Sometimes the program can't even be sure what order the blocks of data were in,
and the contents of the recovered files could be disordered chunks, say, 4KB
long. What then? Is your audio file format structured in a way that lets you
get usable information from an isolated chunk? Or is it all useless without the
rest of the file? I suspect wav format might give you much more than, say, mp3,
but I don't know.
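
For uncompressed (PCM) wav data, an isolated chunk is just a run of raw samples, so it can be made playable by wrapping it in a fresh header. Here is a minimal sketch using Python's standard wave module. The function name is hypothetical, and the defaults (stereo, 16-bit, 44100 Hz) are assumptions that would have to match the original recording:

```python
import wave


def wrap_pcm_chunk(chunk: bytes, out_path: str, channels: int = 2,
                   sample_width: int = 2, rate: int = 44100) -> int:
    """Write a raw PCM chunk as a playable WAV file; return frames written."""
    frame_size = channels * sample_width
    # Drop any trailing partial frame so the data length is a whole number
    # of frames.
    usable = len(chunk) - (len(chunk) % frame_size)
    with wave.open(out_path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(rate)
        w.writeframes(chunk[:usable])
    return usable // frame_size
```

With these defaults a 4KB chunk holds 1024 frames, or about 23 milliseconds of sound. If the chunk happens to start in the middle of a frame, the result may be noise or swapped channels, so it could be worth retrying with the first 1 to 3 bytes trimmed off.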
Finally, the data might contain parts that are just wrong. Then I suppose the
above applies. It's better if the format lets you play the damaged file and
hear distortion in one part than not be playable at all even though most of the
file is intact.
My conclusion, not based on any experience, is that to match the recoverability
of analog tape, one's archived data should be:
- On a freshly formatted drive, with no deletions or replacements made after
copying. This makes it far more likely that the data is stored sequentially,
which assists recovery.
- Of known filetypes that can be identified by manual examination of the
headers. Preferably just the one filetype? Perhaps stuff like Audacity project
files should be archived elsewhere. There are hundreds of them for one audio
file that's being worked on.
- A filetype that allows recovery of parts of the file by manual examination.
I guess it would also help if you knew for sure what was on there, so a printed
index in the same order as the files are placed on the disk would be helpful.
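
Such an index could be generated rather than typed. Here is a rough sketch, assuming the files were copied to the archive disk in sorted name order (the script cannot see the physical order on disk). The function name is hypothetical, and the SHA-256 checksum column is my own addition, meant to help confirm whether a recovered file is intact:

```python
import hashlib
import os


def build_index(root: str):
    """Return one 'path<TAB>size<TAB>sha256' line per file, in sorted order."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make os.walk visit subdirectories in sorted order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in 1 MB blocks so large audio files need little memory.
                for block in iter(lambda: f.read(1 << 20), b""):
                    digest.update(block)
            rel = os.path.relpath(path, root)
            size = os.path.getsize(path)
            lines.append(f"{rel}\t{size}\t{digest.hexdigest()}")
    return lines
```

Printing the result of build_index on the archive folder gives a list that can be kept on paper with the disk.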
Any thoughts? Does wav format allow identification of its parts? Can isolated
chunks be played?
Peter Shute