
Subject: Re: Sound editing & sonogram software on PC
From: Walter Knapp <>
Date: Wed, 12 Mar 2003 22:33:02 -0500
Doug Von Gausig wrote:
> At 09:23 AM 3/11/2003, Walt wrote:
> 
> 
>>> It is enormously versatile. I've posted a series to illustrate a few of
>>> the combinations you can get at
>>> http://www.naturesongs.com/recordists/spectral1.pdf.
>>
>>What's the band limit? I normally use 4096 with Spark XL, and wish it
>>had higher. If I remember right the website ones were also 4096 ones.
> 
> 
> CoolEdit allows resolution to 16,384 bands, which takes forever to 
> calculate, of course. I usually stick to 256 or 512 bands, 150 dB - this 
> gives me the most useful spectrogram for editing purposes. I do all my 
> editing in spectrogram view, though you can also use the waveform view, 
> which some prefer.

What I'm asking about has nothing to do with dB range. It's the FFT 
blocksize. I may be mistaken about what CoolEdit means by bands. And the 
math behind a sonogram is something I have only a very basic understanding of.

When you do a sonogram, a Fast Fourier Transform (FFT) is done to convert 
your info from amplitude over time to amplitude at each frequency. This 
cannot be done on a single sample, so the FFT is done on a block of 
samples. The results you get will depend on just how many samples a block 
includes. If only a few, the frequency resolution will be poor; more will 
improve the resolution, up to a point. Note the FFT process is repeated 
over and over, moving by an increment through the samples. That increment 
does not have to be the same as the blocksize. It's usually a highly 
overlapped series.
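
To make the block-and-increment idea concrete, here is a minimal Python 
sketch. It is only an illustration under my own assumptions, not what 
Spark XL, Soundhack or CoolEdit actually do: the function names, the Hann 
window, the 8 kHz test tone and the slow naive DFT (standing in for a real 
FFT) are all made up to keep it short.

```python
import cmath
import math

def hann(n):
    """Hann window of length n, used to taper each block before the transform."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft_magnitudes(block):
    """Naive O(n^2) DFT; a real program would use an FFT for speed.
    Returns the magnitude of bins 0 .. n/2 (the non-redundant half)."""
    n = len(block)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(block[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def sonogram(samples, blocksize, hop):
    """One column of magnitudes per block; hop is the increment between blocks.
    A hop smaller than blocksize gives the overlapped series."""
    window = hann(blocksize)
    columns = []
    for start in range(0, len(samples) - blocksize + 1, hop):
        block = [s * w for s, w in zip(samples[start:start + blocksize], window)]
        columns.append(dft_magnitudes(block))
    return columns

# Hypothetical test signal: a 1000 Hz tone sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(256)]

# Blocks of 64 samples, advancing 16 samples at a time (75% overlap).
cols = sonogram(tone, blocksize=64, hop=16)
peak_bin = max(range(len(cols[0])), key=cols[0].__getitem__)
print(len(cols), "columns,", len(cols[0]), "bins each, peak at bin", peak_bin)
```

Each column is one vertical slice of the sonogram. Making the hop smaller 
than the blocksize gives more columns for the same audio without changing 
the number of frequency bins per column.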

And someone more up on the math will probably find something wrong with 
that explanation.

All sonograms use pretty much the same method, but how that's programmed 
varies a lot. Programmers limit how much we can set and how much they 
make fixed in the program. And just what instructions the program uses 
to do the calculations can vary, with different results. Even knowing 
the math perfectly won't tell you everything about what you will get 
from a sonogram program. You really do have to run them through their 
paces. And to make it harder, programmers are not even consistent in 
what they call the settings.

What I find with my sonogram programs is that a blocksize of 4096, which 
is the highest I have available, makes the best use of available screen 
resolution. And that's the setting I've used for the display sonograms. 
I'd like to try higher settings with some program to see what I'd get. 
What is most obvious is that the blurry state of the lower frequencies 
varies with blocksize. At lower blocksizes that blur will extend to a 
much higher frequency. At 4096 it is pretty well resolved down to 
somewhere in the 500 Hz range.
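
A rough way to see why a bigger blocksize sharpens the low end is to work 
out how wide each FFT frequency bin is. The sketch below is just that 
arithmetic. The 44.1 kHz sample rate is my assumption (it is not stated 
above), the function names are made up, and the list of blocksizes is 
illustrative rather than any program's actual settings.

```python
SAMPLE_RATE = 44100  # assumed; the sample rate of the recordings is not stated

def bin_width_hz(sample_rate, blocksize):
    """Spacing between adjacent FFT frequency bins, in Hz."""
    return sample_rate / blocksize

def block_duration_ms(sample_rate, blocksize):
    """Length of audio covered by one FFT block, in milliseconds."""
    return 1000.0 * blocksize / sample_rate

def bins_in_octave_above(freq_hz, sample_rate, blocksize):
    """How many FFT bins fall in the octave from freq_hz to 2 * freq_hz."""
    return freq_hz / bin_width_hz(sample_rate, blocksize)

for blocksize in (512, 1024, 2048, 4096):  # illustrative sizes only
    print(
        f"blocksize {blocksize:5d}: "
        f"{bin_width_hz(SAMPLE_RATE, blocksize):6.1f} Hz per bin, "
        f"{block_duration_ms(SAMPLE_RATE, blocksize):5.1f} ms per block, "
        f"{bins_in_octave_above(500, SAMPLE_RATE, blocksize):5.1f} bins in 500-1000 Hz"
    )
```

Under these assumptions a 4096 block gives bins about 10.8 Hz wide, 
against about 86 Hz at 512, so the low octaves get far more bins to work 
with. The price is that each block covers a longer stretch of time.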

Let's see if I can do a quick example, a picture will make it 
clearer.... Here tiz:
http://frogrecordist.home.mindspring.com/naturerecordists/FFT.size.jpg
This was done by running a single soundfile (the SavannahNWR one) and 
pausing and changing the fft size setting for each segment. I've labeled 
the 4 settings available in Spark XL.

If the bands in CoolEdit refer to the number of individual frequency 
bands to calculate, that's a different parameter. If I remember right, I 
used 512 bands for the website ones, which works out to less than a 
pixel width per band.

I'll have to hunt up the band info for Spark XL. My impression is it's 
using a fixed band count the same as the pixel count. Its sonogram 
window is a fixed size. Of course the range covered by each band will 
depend on the endpoint settings you are using and if it's log or linear. 
Spark XL's range max for frequency is 20 Hz to 22 kHz.
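
To show how the range of each band shifts with the endpoints and with log 
versus linear, here is a small Python sketch. It assumes bands are spaced 
evenly across the display, which is only my guess at how such a display 
might work. The function name and the 512 band count are hypothetical; 
only the 20 Hz and 22 kHz endpoints come from the figures above.

```python
def band_edges(low_hz, high_hz, bands, log_scale):
    """Return bands + 1 edge frequencies between low_hz and high_hz.
    Linear: every band has the same width in Hz.
    Log: every band covers the same frequency ratio."""
    if log_scale:
        ratio = high_hz / low_hz
        return [low_hz * ratio ** (i / bands) for i in range(bands + 1)]
    step = (high_hz - low_hz) / bands
    return [low_hz + step * i for i in range(bands + 1)]

LOW_HZ, HIGH_HZ = 20.0, 22000.0  # endpoints quoted above
BANDS = 512                      # hypothetical band (pixel) count

linear = band_edges(LOW_HZ, HIGH_HZ, BANDS, log_scale=False)
log = band_edges(LOW_HZ, HIGH_HZ, BANDS, log_scale=True)

for name, edges in (("linear", linear), ("log", log)):
    lowest = edges[1] - edges[0]
    highest = edges[-1] - edges[-2]
    print(f"{name:6s}: lowest band {lowest:8.2f} Hz wide, highest band {highest:8.2f} Hz wide")
```

On a log axis the lowest bands come out much narrower than an FFT bin at 
modest blocksizes, which fits with wanting a higher blocksize for the 
lower frequencies.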

Note that in Spark XL to get really good value out of analyzing a small 
frequency range you would use the highest FFT blocksize. And could 
definitely use higher blocksizes for the lower frequencies.

As far as dB scaling of the colors, Soundhack has a fixed correspondence 
between color and dB level, and covers 90 dB, though a bit of that is 
various shades of purple that are hard to distinguish. Spark XL has a 
maximum of 0 dB to 100 dB, but you can adjust the two endpoints to have 
more detailed analysis of a specific range. For the ones I've posted 
I've just left the defaults.
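
The endpoint adjustment can be pictured as a simple clamp and rescale. 
This Python sketch is an assumption about the general idea, not how 
either program is actually coded: the function names, the reference level 
and the 0.0 to 1.0 "position along the color scale" are all hypothetical.

```python
import math

def amplitude_to_db(amplitude, reference=1.0, floor_db=-200.0):
    """Convert a linear amplitude to dB relative to reference.
    Zero or negative input returns floor_db instead of failing on log10."""
    if amplitude <= 0:
        return floor_db
    return 20 * math.log10(amplitude / reference)

def color_position(level_db, low_db, high_db):
    """Map a dB level to 0.0 .. 1.0 along the color scale.
    Levels outside the two endpoints are clamped to the ends."""
    fraction = (level_db - low_db) / (high_db - low_db)
    return min(1.0, max(0.0, fraction))

# Full 0 to 100 dB range versus a narrowed 40 to 60 dB range.
for level in (45, 50, 55):
    wide = color_position(level, 0, 100)
    narrow = color_position(level, 40, 60)
    print(f"{level} dB: full range {wide:.2f}, narrowed range {narrow:.2f}")
```

Narrowing the endpoints spreads the same few dB over the whole color 
scale, which is where the extra detail comes from. Anything outside the 
endpoints just pins to the first or last color.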

For filtering I always have the sono on the output end of the filter 
stack. And maybe put another copy elsewhere, though I don't do that 
often. The realtime sono display while playing with filters puts quite a 
load on the CPU to keep up. My 400 MHz G4 cannot quite keep up if I max 
out all the quality settings, speed, etc. on the sono. I usually max the 
FFT, but not the speed and "quality" setting. The dual 1 GHz G4 can keep 
up with any setting.

For simpler recording, cut and paste, simple gain adjusts, etc., I use 
Peak and its waveform display. It does not have a sonogram display of 
its own. I could do all this too in Spark XL, but it's much more 
awkward for the simple stuff.

Soundhack is fairly slow doing the sono, which is just as well, as I have 
to pause the processing to do a screendump accurately. Of course it does 
not do realtime.

Walt



