1 of 8

Understanding the relational database

by Joseph Love

Many years ago, when I was a very young programmer, I was introduced to Microsoft Access. Although it isn't the most complicated database in the world, the concepts that I learned in the simple Access program changed my career and even my life. Today, being an Oracle DBA with over 20 years of programming experience, I can remember the several eureka moments where I first began understanding databases. Understanding these principles enabled my mind operate in a realm I never knew existed, and projected me to the next phase in my career. I am now a DBA for an enterprise application in one of the biggest companies on earth, thanks to a few, now simple, concepts.

One of the first things I learned is that a database consists of lists of information. Like a spreadsheet, there are rows and columns, and unlike a spreadsheet, each column is required to contain the same type of data. For instance, if you were creating a spreadsheet of all the Cd's you own, you might use one column for the album name, another for the artist, another for the year, another for the cost of the album, etc. To do all of this, you would simply put in the title (or label) of each column in the first row. Subsequent rows would probably contain the actual information about each album you own. In a database, however, things would be a little different. First, you would have to define a table to store this information. You would add columns to the table, each having a specific data type. For example, you would create a table named "albums", with columns named Album_Title, Artist, Year, and Cost. The Title and Artist columns would be set to a "text" datatype, because they contain letters, and possibly numbers and probably aren't going to be used in mathematical formulas. The Year column, you could set to a numeric data type because it is only going to contain numbers, however, you could potentially use "text" or even "Date" datatype. The cost would definitely be set to a numeric datatype, since you might want to know the average or total cost of all your items (or, you would want to use this data in a formula).

The second thing I learned is that a database allows you to relate relevant information. Now suppose that you now want to make lists of the songs on each album you stored. If you've ever used a spreadsheet, you know that there are several options of achieving this, but all of them come with problems. In a database, this is no problem whatsoever. In a database, we simply create another table called "songs". This table could contain the columns song_title, song_length to store the title of the song and the length (in minutes or seconds) of the track. Again, the song_title would be a text datatype, and the length would probably be numeric. This is where the explanation gets a little tricky (and fun).You'll want another column to refer back to what album the song is on. In our current example, we'll just create a column called "Album_title" in the songs table. We will set this column up to refer back to the album_title column in the albums table. We now have a very basic relational database.

I learned that these relationships can be used to maintain the data on a live system. Now that we set up a relationship between the two tables, we must under stand the ramifications of this. The relationship in this example is that the "album_title" column in the "songs" table refers back to the "album_title" column in the "albums" table. This allows the database to help you to keep your data "valid". It automatically creates rules that will not allow you to delete an album row, without first deleting all of the songs related to that album, thus keeping you from having songs that refer to an album that does not exist. Conversely, you cannot create song records until you have the album set up. For example, if I was going to add an album called "Lullabys" that had 3 tracks called "Go to sleep little baby", "Hush little baby", and "Rockabye baby". I would first create a row in the albums table by filling in a value for each column, and then I would create 3 rows in the songs table, one for each song.

I also learned that you can now easily query this information simply. Now lets suppose you have your little database loaded full of 20 albums and 250 songs. You now would like to know which Album the "Rockabye baby" song was on, as well as other songs on that album. You can see that you simply go to the songs table and search for the song "rockabye baby" and see what album it is on, as well as find all the other songs on that album. You can also use the album title to go back and see what year the album was released and how much you paid for the album.

Some of the more savvy of you may have noticed that I didn't talk about Keys, well, now is the time. In each table in the database, it is pretty important that it have a Primary Key. A Key is simply a unique identifier for a row within a specific table. For example, the Key to our albums table was the album title. The gotcha with keys is that each key has to be unique. So in this example, you can only have one "Lullabys" album. If another artist composed another album titled "Lullabys", you could not put it into the system because we set up the title as the Primary Key. The best way of getting around this table is to simply redesign the table structure (a little). Databases usually have a way of generating keys, or "numbering" each row of your table. For example, we could add a column to the "albums" table and create a field called album_id, set it as the primary key and set the datatype up so that new records automatically get a unique number within the table. For example, your first record might get automatically assigned the number "1" for the value of the "album_id" column. The database automatically handles this for you, and will show you the number it created, but you usually don't have to number the albums themselves. Now, you also change the column "album title" in the "songs" table and rename it to album_id. You simply change the column to album_id, the datatype changed over to numeric, and make sure it refers back to the album_id column of the albums table. We could have done all of this correct in the first place, but I feel the explanation is easier when done in this order- it also shows the pitfalls of what all has to be done when you design a database wrong the first time. Now when entering a new album, you create a record in the albums table, noting the album_id that the database assigns to the record. When you insert the songs, you will have to remember what the number was in order to know what to put into the album_id field in the songs table.

Playing around with databases will really open up your mind to the possibilities. With the simple database we created, you can easily create reports that can show how much you paid for each album, what the average cost per song in your collection is, how many songs you have in each year, etc. You can get really fancy and add new tables for "Genre", "format" (CD/Record/cassette, etc.), etc. Once you wrap your mind around the possibilities, you realize that they really are endless.

Helium, Inc.
200 Brickstone Square Andover, MA 01810 USA