naturerecordists

Re: Re: MD v. DAT

Subject: Re: Re: MD v. DAT
From: Walter Knapp <>
Date: Fri, 15 Aug 2003 18:23:26 -0400
Raimund Specht wrote:

> In fact, the compression is achieved by reducing the bit-depth in
> some frequency bands. This is an adaptive process that is controlled
> by the properties of the sound being recorded (and the properties of
> the human auditory system). If the bandwidth of the incoming signal
> is limited (let's say below 8 kHz), the compression will have little
> effect on the sound quality (even in more complex sounds). However,
> if there are larger signal bandwidths (or rapid frequency
> modulations over large frequency ranges), the compression will lead
> to some degeneration. This would explain why low-frequency impulsive
> frog calls are less affected than higher pitched bird or insect
> sounds. The example posted by Jeremy is a nice illustration, how
> rapid frequency attacks with larger bandwidths may lead to serious
> distortion.

I think we should all note that Raimund does not own an MD recorder and
has no experience recording with MD. His "tests" of ATRAC did not test
ATRAC, but another, entirely different compression type. In other words,
he's a complete novice in what MD does.

Raimund's account above is a deliberately oversimplified description of
what ATRAC compression does, to the point of being highly misleading. I
will not attempt to describe everything it does here, just offer a small
correction. I do not claim to know all the details of its internal
operation; I expect no one except the experts at Sony really knows
everything it does and why.

Bit depth reduction is not applied evenly in ATRAC; it is applied where
it won't show and won't produce artifacts. This can take place across
large stretches of a recording, because a digital sound recording
contains a great deal of redundant data.

Let's take, for instance, the case of the Marsh Frog recently brought
up. In the internal pulse structure of the call I described, there is
sound for about 1/10 of a second, then a gap of nearly equal length in
which no sound occurs, followed by another pulse. ATRAC's bit depth
reduction will occur mostly in the thousands of samples between pulses;
we don't need full bit depth to characterize silence. And that is before
we get to the space between calls, where there are thousands and
thousands of samples that don't differ from one to the next. Virtually
all of ATRAC's compression is achieved by such simple methods, not the
elaborations supposed by some. But it can also work where there is
sound: where the shape of the curve follows a simple path, it often
takes little to describe the difference from one sample to the next. And
it can handle a considerable load of highly complex sounds.
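
As a rough illustration of the point about silence between pulses, here
is a minimal Python sketch. It is not ATRAC's algorithm, whose internals
are not described in this post; it only counts how many bits each block
of samples would need if every block got just enough bit depth for its
own peak. The sample rate, pulse frequency, amplitude and block size are
all made-up values for the example.

```python
import math

def bits_needed(block):
    """Smallest signed bit width that can hold every sample in the block."""
    peak = max(abs(s) for s in block)
    return 1 if peak == 0 else peak.bit_length() + 1

def pulsed_call(rate=8000, pulse_s=0.1, gap_s=0.1, pulses=3,
                freq=500, amp=12000):
    """Hypothetical pulsed call: tone bursts separated by digital silence."""
    samples = []
    for _ in range(pulses):
        n = int(rate * pulse_s)
        samples += [int(amp * math.sin(2 * math.pi * freq * i / rate))
                    for i in range(n)]
        samples += [0] * int(rate * gap_s)
    return samples

def adaptive_total(samples, block=400):
    """Total bits if each block uses only the depth its own peak requires."""
    total = 0
    for start in range(0, len(samples), block):
        chunk = samples[start:start + block]
        total += bits_needed(chunk) * len(chunk)
    return total

signal = pulsed_call()
per_block = [bits_needed(signal[i:i + 400])
             for i in range(0, len(signal), 400)]
fixed_bits = 16 * len(signal)          # plain 16-bit storage
adaptive_bits = adaptive_total(signal)
print(per_block)
print(adaptive_bits / fixed_bits)
```

In this toy signal the blocks that fall in the gaps need almost no bits,
so the total drops well below plain 16-bit storage without touching the
pulses themselves.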

In natural sounds, as opposed to test samples deliberately constructed
to use up ATRAC's abilities, complex multi-frequency transitions are of
very short duration, not continuous; they occupy a minor, almost
insignificant part of the recording length. ATRAC incorporates special
discrimination to detect such short pieces and up its bit usage to
cover them. In other words, it uses more of its store of bits when it
needs to. For the vast majority of the time in the recording nothing
much is happening. That's where it saves bits and gets very tight on bit
depth. It's really constructed to preserve the sound information, not
bits. They are definitely not one and the same.
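
The idea of spending more of a fixed store of bits on the brief complex
moments can be sketched as follows. This is only an assumed, simplified
scheme for illustration, not how ATRAC actually decides: the function
`share_budget`, the per-frame "demand" scores and the budget of 48 bits
are all hypothetical.

```python
def share_budget(demands, total_bits, floor=1):
    """Split total_bits across frames in proportion to demand.

    Every frame gets at least `floor` bits; what is left over is shared
    out by demand, rounding down.
    """
    spare = total_bits - floor * len(demands)
    if spare < 0:
        raise ValueError("budget too small for the per-frame floor")
    weight = sum(demands)
    if weight == 0:
        return [floor] * len(demands)
    return [floor + spare * d // weight for d in demands]

# Hypothetical complexity scores: frame 2 holds a brief transient,
# frame 3 its tail, and nothing much happens in the other frames.
demands = [0, 0, 9, 1, 0, 0, 0, 0]
bits = share_budget(demands, total_bits=48)
print(bits)
```

The quiet frames sit at the floor while the one busy frame takes most of
the budget, which is the behaviour described above.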

> I agree, that ATRAC does not add 'filler' noise in order to mask
> artifacts. However, the encoder itself will introduce some kind of
> white noise. This is a (unwanted and system-inherent) result of the
> reduced bit-depths (also called quantization noise). This kind of
> noise is clearly visible in Jeremy's recording at t=6,3 sec:
> http://www.avisoft-saslab.com/compression/MDtest2MD.gif

It is well to remember that the analysis software being used does much
the same thing as compression; in fact it's really extremely severe and
crude compression of the sample, which is then displayed as a visual
image. It produces its own artifacts, some of which Raimund
misinterprets as being in the sample. It takes time and experience to
learn to interpret the fine details of sonograms.
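
One well-known analysis artifact of this kind is spectral leakage. The
sketch below assumes the sonogram is built from short Fourier-transformed
frames, which is common but is an assumption about the software used
here. It compares a pure tone that fits the analysis frame exactly with
one that does not; the frame length of 64 samples and the bin inspected
are arbitrary choices for the example.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude spectrum of one frame (plain DFT, no window applied)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))) / n
            for k in range(n // 2)]

def tone(cycles_per_frame, n=64):
    """Pure sine with the given number of cycles inside the frame."""
    return [math.sin(2 * math.pi * cycles_per_frame * i / n)
            for i in range(n)]

centered = dft_magnitudes(tone(8.0))    # fits the frame exactly
off_center = dft_magnitudes(tone(8.5))  # falls between two bins

far_bin = 20  # well away from the tone in both cases
print(centered[far_bin], off_center[far_bin])
```

Both inputs are clean single tones, yet the second shows energy at a
frequency where the signal has none. That energy comes from the
analysis, not from the sample.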

I see no consistent white noise generation in ATRAC encoded material,
and I have been using and analyzing it for many years. If it were system
inherent, it would be very consistent; we are talking about a fixed
piece of circuitry. We would always see such generation regardless of
the material, not just a spot here and there that does not necessarily
repeat. Scientific investigation demands reproducibility, and that's
what's been lacking. In one recording we may be able to demonstrate
something, then the next recording with material of the same type won't
have it. Try it: record for a few years, analyzing everything. The
longer you go, the less sure you are going to be of the effects of
ATRAC. That's why the folks most sure about the bad effects of ATRAC are
those with no experience using it in actual field recording.

Walt






The University of NSW School of Computer and Engineering takes no responsibility for the contents of this archive. It is purely a compilation of material sent by many people to the naturerecordists mailing list. It has not been checked for accuracy nor its content verified in any way. If you wish to get material removed from the archive or have other queries about the archive e-mail Andrew Taylor at this address: andrewt@cse.unsw.EDU.AU