canberrabirds

Canberra Bird Notes - new search facility

To: 'Julian Robinson'
Subject: Canberra Bird Notes - new search facility
From: Philip Veerman
Date: Mon, 5 Mar 2018 13:11:36 +0000

Yes, thanks for your answer. I was using “Debus” as an example; of course the name appears more than that once (in other issues). My typing “only” below was wrong. Actually, I have reviewed two of his books in CBN, years apart. Yes, there are many other mentions in other editions. I wonder whether it returns the first or the ‘most relevant’ appearance in each edition. How would a computer determine ‘most relevant’? I am guessing it finds the first. Anyway, it is a lot better than nothing.

 

About the 4-year indices, Julian wrote: “Their limitation is that you must separately examine each 4-year period index, but they may still be useful for tricky subjects.” Sure, but let’s face it, for most other journals there is one annual index. So for Emu, which has been going for over a century, I guess there are over 100 of them.

 

I also note the statement on the website that “The first issue of CBN every year is devoted to the Annual Bird Report.” That is only a relatively new protocol. From my quick look, volume 30 (2005) was the start of this arrangement. I am surprised it was that long ago. I don’t think this arrangement was ever the case prior to that year.

 

Of course I have those earlier indices and I can provide them if needed. I think I have the full set of CBN, or am maybe missing a couple from the 1960s, but better still, they should also be with the two official club copies of the full set of CBN. One complete set of the early CBN was formally bound in green covers. Hopefully this has continued.

I wonder why I don’t have a record of: Index to Selected Volumes

·         Vols. 29-32 (partial) why partial?

·         Vols. 25-28

·         Vols. 21-24

 

I was not aware of these. Were they printed? I don’t have these. Does anyone? I see they are on the website, so why aren’t the earlier printed ones there too? Anyway, V 32 was 2007. What since then? The 4-year index that I arranged was printed and issued within a month of the last issue included. Kay Hahne did most of the work. The process back then, before technology became so clever, was much more difficult than it is now.

 

Philip

 

 

 

From: Julian Robinson
Sent: Monday, 5 March, 2018 7:20 PM
To: 'Philip Veerman'
Cc: 'canberrabirds chatline'
Subject: RE: [canberrabirds] Canberra Bird Notes - new search facility

 

1)     I was unaware of these early indexes mentioned below by Philip.  In the website archive ( http://canberrabirds.org.au/publications/canberra-bird-notes/  under Index to Selected Volumes) we already have hand-generated indexes of Vols 21 to ~30 (1996-~2005), so combining that with the ones you mention below would be useful.  Their limitation is that you must separately examine each 4-year period index, but they may still be useful for tricky subjects.  If Philip or anyone can provide a copy of the older indexes I can add them to the website archive.

 

2)      Philip’s point in green below is the biggest limitation of the Google search approach. It only returns one result per entire issue.  If you search on a term that appears 20 times in one edition, it only gives you a single return with the context for the first or the ‘most relevant’ appearance and ignores the rest.  In Philip’s example, if you search on [debus] you’ll find 17 returns covering 17 issues but have no idea which issue includes the desired book review. We can improve the situation by being more specific in choosing our search terms.  In this case you could instead search on  [debus “book review”] (leaving out the square brackets), but this doesn’t help as much as you might think because many editions of CBN have a book review somewhere.  It does mean that the search is likely to pick, and provide the context notes for, the occurrence where ‘debus’ occurs close to ‘book review’.  Thus the correct issue can be picked by reading the returned context notes. Searching on [“book review” debus veerman] is even better, and the right issue is very easy to pick from the context notes.  If you know the title of the book, you hit the jackpot with only a single return from ["Australasian Eagles and Eagle - like Birds" debus]

 

3)      I don’t know how far back digital copies of CBN go. Perhaps someone knows which issues were scanned?

 

Julian

 

 

 

From: Philip Veerman
Sent: Monday, 5 March 2018 5:46 PM
To: 'Julian Robinson'
Cc: 'canberrabirds chatline'
Subject: RE: [canberrabirds] Canberra Bird Notes - new search facility

 

This is a very nice development and whilst I have no idea how much work was involved, thanks for doing it. I tried it and it looks good. I will point out also, although Julian’s note sort of indicates as much, it sources only individual issues in which a word was included regardless of articles or context. For example I searched on “Debus” and it finds only the mention of the article about the Little Eagle mentioning him (maybe because that is the first mention of the name, in that issue). However it does not mention him again for the several mentions of the name in my review of his book. Thus if you have in mind the book review, this search does not help a lot to locate that, unless you think that it was already listed for other reasons.  

 

I also noticed in a flip through, that there are blank pages included where I doubt the printed original was blank, although trying again I did not find these. Sorry that does not help a lot.

 

It is also worth pointing out that CBN did issue quite good printed indexes (indices) in blocks of 4 volumes, being V1 to 4, V 5 to 8, V 9 to 12, V 13 to 16 and V 17 to 20. (I did one of them.) These do give page references for species (in taxonomic sequence), to authors and to some general topics. These cover the years 1968 to 1995 but appear to not have been done since then. I don’t know whether these indices are also included in this electronic archive. Would be useful to do so.
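As a rough illustration of the 4-volume blocks described above, the following Python sketch works out which index block covers a given volume. The function name `index_block` is made up for this example, and it simply assumes that the blocks run in fours starting from Vol. 1; it says nothing about whether an index for a given block was actually produced.

```python
def index_block(volume: int) -> tuple[int, int]:
    """Return (first, last) volume of the 4-volume index block containing `volume`.

    Assumes blocks of four starting at Vol. 1: 1-4, 5-8, 9-12, and so on.
    """
    if volume < 1:
        raise ValueError("volume numbers start at 1")
    first = ((volume - 1) // 4) * 4 + 1
    return first, first + 3


if __name__ == "__main__":
    for vol in (1, 10, 20):
        first, last = index_block(vol)
        print(f"Vol. {vol} falls in the index for V {first} to {last}")
```

A topic that runs across many years still needs one lookup per block, which is the limitation of the printed indexes compared with a single search.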

 

Julian also wrote: The older editions of CBN were scanned from paper copies and used OCR (optical character recognition) to read the text.  I am curious as to what is the dividing line of “older”.

 

Philip

 

 

 

On 4 March 2018 at 22:14, Julian Robinson wrote:

It is now possible to search the entire CBN archive via a search box found at the top of the CBN website page – go to  http://canberrabirds.org.au/ > Publications > Canberra Bird Notes.

 

This is a beta version, meaning it is still being perfected.  To use it, enter your search term in the box and click the magnifying glass.  You will get a page of results, sometimes many results.  They don’t appear in any order so be guided by the date and/or the issue number (at the end of the green link). 

 

When you click one of the results you’ll download that whole issue of CBN.  Your search term is NOT highlighted in the resulting document, so you need to search again within the document – press Ctrl+F in Windows to ‘find’ and enter the same search term.

 

It seems to work well but is completely reliant on the cleverness of your search words to avoid getting thousands of results or none at all.  Sometimes using quotation marks “like this” around your words helps by restricting the search to that exact phrase, but be aware that it will then not find plurals or other forms of the words.  You’d need to search for “Fuscous honeyeater” and again for “fuscous honeyeaters” to get them all.  On the other hand, if you search for Fuscous honeyeater without quotes, it will find honeyeater or honeyeaters but may return items that contain just one of the words.  A technique to find ‘serious’ mentions of a particular bird is to use the scientific name – this generally works well without false alarms.
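The singular/plural point above can be sketched in a few lines of Python. This is only an illustration: the helper name `phrase_variants` is invented here, and the pluralisation is deliberately naive (it just adds or drops a trailing "s"), so it will not cope with irregular plurals. It produces the two quoted phrases you would then paste into the search box one at a time.

```python
def phrase_variants(phrase: str) -> list[str]:
    """Return quoted exact-phrase queries for a phrase and its simple plural/singular."""
    phrase = phrase.strip()
    # Naive assumption: the only variation is a trailing "s" on the last word.
    if phrase.lower().endswith("s"):
        other = phrase[:-1]
    else:
        other = phrase + "s"
    # Wrap each form in double quotes so the search treats it as an exact phrase.
    return [f'"{phrase}"', f'"{other}"']


if __name__ == "__main__":
    for query in phrase_variants("Fuscous honeyeater"):
        print(query)
```

Running both queries and merging the results by hand should cover the cases that a single quoted phrase misses.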

 

The older editions of CBN were scanned from paper copies and used OCR (optical character recognition) to read the text.  Errors in the OCR process (commonly where words are concatenated or misspelled) mean that sometimes you may miss an occurrence of something you are looking for.  Unfortunately this will be a permanent limitation, though it seems to be quite rare.
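The website search itself cannot be made tolerant of these errors, but if you copy the text out of a downloaded issue you can check it yourself. The Python sketch below is one possible approach, not how the site works: the function name `ocr_tolerant_hits` and the 0.8 similarity cutoff are assumptions chosen for illustration. It flags words where the search term has been run together with a neighbour, and words that are nearly but not exactly the term.

```python
import difflib
import re


def ocr_tolerant_hits(term: str, text: str, cutoff: float = 0.8) -> list[str]:
    """Return words in `text` that look like OCR-damaged forms of a single-word `term`."""
    term = term.lower()
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]
    hits = []
    for word in words:
        if word == term:
            continue  # exact matches are what an ordinary search already finds
        # Concatenation: the term sits inside a longer run-together word.
        run_together = term in word
        # Misspelling: the word is very similar to the term overall.
        similar = difflib.SequenceMatcher(None, term, word).ratio() >= cutoff
        if (run_together or similar) and word not in hits:
            hits.append(word)
    return hits


if __name__ == "__main__":
    sample = "Thehoneyeater was seen with a honeyeatcr near the dam."
    print(ocr_tolerant_hits("honeyeater", sample))
```

Note that the run-together test also picks up ordinary plurals such as "honeyeaters", so the results still need reading by eye.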

 

You may be asked to prove you’re not a robot at various times; be prepared to do what it asks and then “submit”.

 

As mentioned, this is a work in progress and not all issues of CBN have yet been indexed by Google – at the time of writing 108 of 188 have been indexed.

 

I would appreciate comments on any problems encountered, or any suggestions.

 

This has been a long time coming – hopefully it will prove to be useful.

 

Julian

Cog website

 

 

 


The University of NSW School of Computer and Engineering takes no responsibility for the contents of this archive. It is purely a compilation of material sent by many people to the Canberra Ornithologists Group mailing list. It has not been checked for accuracy nor its content verified in any way. If you wish to get material removed from the archive or have other queries about the list contact David McDonald, list manager, phone (02) 6231 8904 or email . If you can not contact David McDonald e-mail Andrew Taylor at this address: andrewt@cse.unsw.EDU.AU