naturerecordists
[Top] [All Lists]

Re: Relational Database Design

Subject: Re: Relational Database Design
From: Walter Knapp <>
Date: Thu, 21 Feb 2002 18:59:08 -0500
Marty Michener wrote:

> 
> This relational design is basic to all the big guys - using SQL queries,
> Ingres, Sun, Oracle, etc.
and so on....

Interesting to read, Marty. I'd guessed most all of it from hints you
have given from time to time. Here's a few rough notes comparing
Filemaker methods I use to yours:

> Remember, I said this "was different" but not hard.  Everyone, when
> starting, designs a database allowing "X" species per cut.  This is
> conceptually for the beginner much easier to deal with than a MANY-TO-MANY
> table.  But it has these flaws:
> 1.  It wastes file space much of the time, when only 1 or 2 are used;
> 2.  It cannot accept X+1 species, distorting reality in abundant cases.

You should note a couple things. This was a definite rule in traditional
relational databases. It's much more fuzzy with some of the newer ones.
For instance, Filemaker when you define a field allows you to set that
the field can have up to some number of entries, say like the 8 species
entries I have for my site database. This has been enough for several
years, but tomorrow I could end up with a site with 9. I literally only
have to go in and change the number in my species field definition to 9,
Filemaker handles the rest. And, it does not balloon the file to fit to
do it. It's automatically handling setting up a linked set of data
within one file. It can also be set so a field does link to a entirely
different file in the traditional way if I wished. Actually the hardest
part would be if I want to see the list of all 9 species on the field
display layout without using popups or such like, or, on just one of the
multiple layouts I could have. I'll have to go in and allot space for
that field to expand to 9 Again, it's just changing a repeating number
in the field layout and moving things to make room. The layouts are
drawn just like any graphics.

In the old days of database design it was really critical to not waste
space. I'm not so sure even if our software was allotting all that blank
space it would matter. At the rate my database's file size is growing
over time, I have a few years to even reach a meg. Compared to storing
sound files the database is almost microscopic. Yes, I can remember when
a meg was huge and programming where I had to specify exactly what data
space I would need. So, while we have to be reasonable in designing
databases, I don't think we have to go out of our way very far doing it.

> Obviously, one cannot do all this with Excel.   One needs Foxpro, Access,
> etc.  with the original dBase you had to write, for days and days, a
> program to keep all this relational structure intact through all data
> entry, changes in time, etc..  This is particularly critical: if you
> mistype a species code you "orphan" that record - that is - it is not
> related to any real record in the other table with any real value.  This is
> a huge BAD BAD in relational management.
> 
> My own software actually has each table a separate physical file, e.g.
> SPECIES.DBF.  It has an editor, called SELECT.EXE
> that opens, appends to or edits ANY DBF file, only it can easily be tamed
> without programming.

I'm always amazed anyone would use Excel for databases. Maybe little
ones that do a lot of calculation, but not open ended things.

Again, for Filemaker these lists like this can be within the file, or
called from external. I use lists as much as I can, species, common
names, abundance, all have a limited list of values. In my case, the
entry of species, common names, and abundance is all on popup menus done
with the mouse, faster than I could type it all, and always spelled the
same. I've thought of doing the same for county names, but Georgia has
180 plus counties, states would be doable, only 50 in the menu. For
counties I may use the same list system to feed the spell checking and
make that automatic.

Filemaker can also search for part of what's in a field as can most
databases. I have that problem in the county entry field, because
occasionally I record on the county boundaries and have multiple entries
for county name. A sort for a particular county will find the records
with multiple entries correctly too. As will a search of a field that
can have multiple independent entries, like the species field.

> You are editing and adding to the CUTSOUND table. Please note that if you
> mistype either CUTNO or GSCode, both KEY attributes (that uniquely must
> agree with a single record in another table) you produce a nonsense
> orphaned record.  In most data systems, you would have to write an
> elaborate program that compares each entry to all the valid values in the
> remote table, and then goes back and tells the user: "wrong value".  Set up
> correctly, the SELECT editor does all this and more.

That is why I like a system where it's all in one database, each type of
data is only entered once. This can be set up across multiple databases,
Filemaker can do that too, as can a awful lot of these database
programs. But the rule should always be that each hunk of data is
entered once. You can still mistype it, but it will be the same
everywhere. It's possible to do exclusion searches to find these type
errors. 

Ultimately it pays to learn to type as error free as possible. I got
that down when I was doing EIS work and typing in huge arrays of numbers
for input. One number off and you had wasted the timeshare mainframe
fee. It's more than not typing a error, it's developing a sense for when
you did and checking. My big problem now is that USB keyboards don't
seem to be able to keep up with my fingers always. And my spelling seems
to get worse each day.

> The ALLOWED for the SPECIES table lookups, would be:
> 
> FFSPECIES.DBF#GSCODE#GENUS#SPECIES#ENGLISH&&ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
> 
> Same with the GSCODE, you type in ZIMVIL and it looks in SPECIES.DBF, IN
> COLUMN GSCODE, and returns the line:
> "FOUND:  Zimmerius vilissima Paltry Tyrannulet"

I'd forgotten all this cryptic stuff. I try to keep all my stuff in
understandable english.

So, your allowed in species is just the alphabet? Not a specific list? I
suppose it would have to be as much ground as you have to cover with
what you have recorded. When I get to my next database (a database of
all species of frogs in the world, 4718 species) I'm going to have to
probably do something like that.

I got a copy of the then current Herp Atlas database, a dBase file about
a year ago. It took a month to just figure out all the abbreviations
used. Even the folks working on it seemed to have problems remembering.
And then a few days to suck that into a Filemaker file in english. It's
a little bigger, but not a lot. Goes back to this business we don't have
to use cryptic names anymore, we can handle much larger files. My copy
anyone can read without cheat sheets.

Still, all and all, sounds like you use pretty much what I do
functionally, though more elaborate. And I admire the amount of data you
keep. I'm still adding fields from time to time. Every time I realize I
need some info once in a while. Of course all my old records won't ever
have that data, regardless of how I organize the database it had to be
written into the field notes to be there. Deciding what to keep in the
field notes is the most critical part.

Walt



________________________________________________________________________
________________________________________________________________________

<Prev in Thread] Current Thread [Next in Thread>
Admin

The University of NSW School of Computer and Engineering takes no responsibility for the contents of this archive. It is purely a compilation of material sent by many people to the naturerecordists mailing list. It has not been checked for accuracy nor its content verified in any way. If you wish to get material removed from the archive or have other queries about the archive e-mail Andrew Taylor at this address: andrewt@cse.unsw.EDU.AU