*** apologies for any cross-postings ***
Dear colleagues,
We are happy to announce the release of two new bioacoustic datasets to support
research on detection and classification of bird sounds:
BirdVox-full-night: https://wp.nyu.edu/birdvox/birdvox-full-night/
62 hours of continuous audio from 6 sensors, with 35,402 flight calls annotated
in time and frequency. 5.7 GB in FLAC format + 6 CSV tables for metadata.
BirdVox-70k: https://zenodo.org/record/1226427#.Wt46UWaZO8o
A derivative work of BirdVox-full-night, containing 70,804 audio clips of duration
500 ms. Half of the clips are positive (containing one flight call at the center of the clip); the other half are negative (containing background noise or a non-flight-call acoustic event). 1.26 GB in HDF5 format, containing both data and annotations.
Further details about these datasets, including experimental results, are provided
in the following paper:
BirdVox-full-night: a dataset and benchmark for avian flight call detection
V. Lostanlen, J. Salamon, A. Farnsworth, S. Kelling, and J. P. Bello
In IEEE International Conference on Acoustics, Speech, and Signal Processing
(ICASSP), Calgary, Canada, Apr. 2018.
The full list of datasets released by the BirdVox project to date, which currently
includes four datasets in addition to the two listed above, is available here: https://wp.nyu.edu/birdvox/resources/#datasets
A summary of all six datasets follows below my signature.
The BirdVox project team consists of Justin Salamon and Juan Pablo Bello from NYU,
together with Andrew Farnsworth, Steve Kelling, and myself from the Cornell Lab of Ornithology.
Sincerely,
Vincent Lostanlen, postdoctoral researcher at the Cornell Lab of Ornithology
and visiting scholar at New York University.
—
All BirdVox datasets are released under the Creative Commons Attribution 4.0
International License (CC BY 4.0). wp.nyu.edu/birdvox
CLO-43SD:
a dataset for multi-class species identification in avian flight calls. 5,428 labeled audio clips of flight calls from 43 different species of North American wood-warblers (family Parulidae). The clips come from a variety of recording conditions, including clean recordings obtained using highly directional shotgun microphones, noisier field recordings obtained using omnidirectional microphones, and recordings of birds in captivity. Please cite our PLOS ONE 2016 paper when using this dataset for research.
CLO-WTSP:
a dataset for species-specific flight call identification for the White-throated Sparrow. 16,703 labeled audio clips captured by remote acoustic sensors deployed in Ithaca, NY and NYC over the fall 2014 and spring 2015 migration seasons. Each clip is labeled to indicate whether it contains a flight call from the target species, White-throated Sparrow (WTSP), a flight call from a non-target species, or no flight call at all. Please cite our PLOS ONE 2016 paper when using this dataset for research.
CLO-SWTH:
a dataset for species-specific flight call identification for the Swainson’s Thrush. 179,111 labeled audio clips captured by remote acoustic sensors deployed in Ithaca, NY and NYC over the fall 2014 and spring 2015 migration seasons. Each clip is labeled to indicate whether it contains a flight call from the target species, Swainson’s Thrush (SWTH), a flight call from a non-target species, or no flight call at all. Please cite our PLOS ONE 2016 paper when using this dataset for research.
BirdVox-full-night:
a dataset for species-agnostic avian flight call detection in continuous recordings. 62 hours of continuous audio from 6 sensors, with 35,402 flight calls annotated in time and frequency. 5.7 GB in FLAC format + 6 CSV tables for metadata. Please cite our ICASSP 2018 paper when using this dataset for research.
BirdVox-70k:
a dataset for species-agnostic flight call detection. A derivative work of BirdVox-full-night, containing 70,804 audio clips of duration 500 ms. Half of the clips are positive (containing one flight call at the center of the clip); the other half are negative (containing background noise or a non-flight-call acoustic event). 1.26 GB in HDF5 format, containing both data and annotations. Please cite our ICASSP 2018 paper when using this dataset for research.
BirdVox-DCASE-20k:
a dataset for bird audio detection in 10-second clips. A derivative work of BirdVox-full-night, containing almost as much data but formatted into ten-second excerpts rather than ten-hour full-night recordings. Out of the 20,000 recordings, 10,017 (50.09%) contain at least one vocalization (song, call, or chatter) from a bird (not necessarily a passerine). In addition, the BirdVox-DCASE-20k dataset is provided as a development set in the context of the “Bird Audio Detection” challenge, organized by DCASE (Detection and Classification of Acoustic Scenes and Events) and the IEEE Signal Processing Society. 17.6 GB in WAV format + 1 CSV table for metadata. Please cite our ICASSP 2018 paper when using this dataset for research.
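
For readers who want to sanity-check the figures quoted in the summary above, here is a minimal Python sketch using only the numbers stated in this message. It is not part of any BirdVox tooling: the constant and function names are illustrative, and it assumes every clip has exactly its nominal duration (500 ms or 10 s). It shows that the positive half of BirdVox-70k has the same size as the number of flight calls annotated in BirdVox-full-night, and it gives the approximate total audio duration of the two derivative datasets.

```python
# Figures copied from the dataset summary above; names are illustrative only.
FULL_NIGHT_HOURS = 62        # BirdVox-full-night: hours of continuous audio
FULL_NIGHT_CALLS = 35_402    # BirdVox-full-night: annotated flight calls
CLIPS_70K = 70_804           # BirdVox-70k: number of clips
CLIP_70K_SECONDS = 0.5       # BirdVox-70k: nominal clip duration (500 ms)
CLIPS_DCASE = 20_000         # BirdVox-DCASE-20k: number of clips
CLIP_DCASE_SECONDS = 10      # BirdVox-DCASE-20k: nominal clip duration
DCASE_POSITIVE = 10_017      # BirdVox-DCASE-20k: clips with a bird vocalization


def total_hours(n_clips, clip_seconds):
    """Total audio duration in hours, assuming all clips have the same length."""
    return n_clips * clip_seconds / 3600


# Half of the BirdVox-70k clips are positive.
positives_70k = CLIPS_70K // 2

hours_70k = total_hours(CLIPS_70K, CLIP_70K_SECONDS)
hours_dcase = total_hours(CLIPS_DCASE, CLIP_DCASE_SECONDS)
dcase_positive_pct = 100 * DCASE_POSITIVE / CLIPS_DCASE

print(f"BirdVox-70k positives: {positives_70k} "
      f"(annotated calls in full-night: {FULL_NIGHT_CALLS})")
print(f"BirdVox-70k total audio: {hours_70k:.2f} h")
print(f"BirdVox-DCASE-20k total audio: {hours_dcase:.2f} h "
      f"(full-night: {FULL_NIGHT_HOURS} h)")
print(f"BirdVox-DCASE-20k positive share: {dcase_positive_pct:.3f} %")
```

Under these assumptions, the DCASE-20k total comes out a little below the 62 hours of the full-night recordings, which is consistent with the "almost as much data" wording above.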