It's about a year since I asked here about methods of producing scrolling
spectrogram movies for Youtube. At the time, the only solutions I was offered
were to use screen capture software to record the display of an audio program
such as Audacity, or to use Acousmographe, which can produce a Flash movie.
I've been experimenting, and I have problems with both these methods. Screen
capture always seems to result in scrolling that is at least a little bit
jerky, and Acousmographe's Flash movies can't be uploaded to Youtube. A year
ago they could be, but they didn't play properly, and they're jerky anyway.
So I've been experimenting with generating my own movies from dumped
spectrogram images using Avisynth, an open source video frame server. I've
uploaded a 3 minute test video here:
http://www.youtube.com/watch?v=mFP4WnMmqvU, and more recently a 3 hour video
test at http://www.youtube.com/watch?v=_KdYtECCMgU.
I'm wondering what others think of this as an alternative to Soundcloud, etc,
for sharing recordings. Soundcloud has very good commenting facilities, but
doesn't do spectrograms. Freesound's commenting isn't as good, but it can
display a spectrogram. Unfortunately the spectrogram can't scroll, so a long
nature recording is squeezed into a single image and the horizontal scale is
too coarse to distinguish much.
Youtube doesn't have the nice timed commenting that Soundcloud has, where you
just click on the waveform to insert a timed comment at that point, but I've
discovered that it does have timed comments. If you type a time into a comment,
e.g. "Sparrow at 2:18", the "2:18" becomes a link which takes you to the 2m18s
point in the video, or "2:18:00" would take you to 2h18m0s.
Times can also be appended to URLs, so one can email the list links to
particular points in a recording, e.g.
http://www.youtube.com/watch?v=_KdYtECCMgU#t=1h50m00s takes you straight to the
1h50m mark.
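For anyone wanting to generate such links for a list of sightings, here is a
minimal Python sketch. The function names are my own invention, and it assumes
only the two formats described above (the "m:ss" / "h:mm:ss" comment style and
the "#t=XhYmZs" URL suffix); it doesn't talk to Youtube at all.

```python
def comment_time_to_seconds(text):
    """Convert a comment-style time such as "2:18" or "2:18:00" to seconds."""
    total = 0
    for part in text.split(":"):
        total = total * 60 + int(part)
    return total


def youtube_time_link(video_id, seconds):
    """Build a watch URL with a #t=XhYmZs suffix for the given offset."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return "http://www.youtube.com/watch?v=%s#t=%dh%02dm%02ds" % (
        video_id, hours, minutes, secs)


print(comment_time_to_seconds("2:18"))   # 138
print(youtube_time_link("_KdYtECCMgU", comment_time_to_seconds("1:50:00")))
# http://www.youtube.com/watch?v=_KdYtECCMgU#t=1h50m00s
```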
Regarding the jerkiness of the scrolling, generating my own movies from scratch
has led me to discover why they're jerky, or rather, to inquire on a forum and
have it explained to me. The programs are simply trying to move the cursor (or
spectrogram) by a non-integer number of pixels per frame.
If the spectrogram is drawn at, say, 27 pixels per second, and the video is
30 frames per second, the cursor or spectrogram needs to move along by 0.9
pixels in every frame. Instead the program rounds that off to 1 pixel for
nine frames, then doesn't move at all for one frame, to stop it getting ahead
of the audio. This tiny variation is very noticeable.
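The arithmetic is easy to check with a few lines of Python. This is only a
sketch: I'm assuming the program truncates the ideal position to a whole pixel,
which may not be exactly what any particular program does, but ordinary
rounding gives the same pattern of one stalled frame in every ten.

```python
def rounded_steps(pixels_per_second, frames_per_second, frame_count):
    """Whole-pixel movement in each frame when the ideal position is truncated."""
    steps = []
    previous = 0
    for frame in range(1, frame_count + 1):
        # Integer arithmetic: ideal position is frame * pps / fps, truncated.
        position = (frame * pixels_per_second) // frames_per_second
        steps.append(position - previous)
        previous = position
    return steps


print(rounded_steps(27, 30, 10))
# [0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

Nine frames move by one pixel and one frame doesn't move, three times a
second, which is the stutter you see.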
My solution was initially to vary the horizontal scale of the spectrogram to
make it scroll at exactly one pixel per frame, but I found that a bit
restrictive, and ended up using a function that could interpolate the pixels to
simulate 0.9 pixels per frame.
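To illustrate the idea (this is not the Avisynth function I used, just a
Python sketch of plain linear interpolation on a single row of pixel values,
with made-up names), each output pixel is a blend of the two source pixels
either side of the fractional position:

```python
def frame_offset(frame, pixels_per_second, frames_per_second):
    """Ideal (fractional) scroll position in pixels for a given frame number."""
    return frame * pixels_per_second / float(frames_per_second)


def shifted_row(row, offset, width):
    """Sample `width` pixels from `row` starting at a fractional, non-negative
    `offset`, blending each pair of neighbouring pixels linearly.
    The caller must make sure int(offset) + width <= len(row)."""
    whole = int(offset)
    frac = offset - whole
    out = []
    for x in range(width):
        left = row[whole + x]
        # Clamp at the last pixel so the right-hand neighbour always exists.
        right = row[min(whole + x + 1, len(row) - 1)]
        out.append((1.0 - frac) * left + frac * right)
    return out


row = [0, 10, 20, 30, 40, 50, 60, 70]   # one row of brightness values
for frame in range(4):
    print(frame, shifted_row(row, frame_offset(frame, 27, 30), 4))
```

Every frame now shifts the picture by the same 0.9 pixels, at the cost of a
slight softening of the image from the blending.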
Anyway, it's a great pity that the various programs that can generate
spectrograms don't take this into account. It should be possible to choose a
horizontal scale that forces the right speed for capturing, but I found that
most of them aren't very good at setting precise scales. It's even more of a
pity that none can create a movie directly without all this rigmarole. If
anyone knows of one that does, please let me know.
Peter Shute