Databases are used every day, sometimes without us realizing. They are everywhere. The most basic example is a telephone book or a library card index. They can store all sorts of information, from phone numbers to map grids and references, score cards from sports games, report cards, etc. They allow quick searching and are great for keeping historical data, for example weather history. Most websites run using a combination of a database and a CMS (content management system).
At the most basic level, databases contain tables. Database tables are used to hold information, for example you may have a database table about ‘authors’ or ‘news’.
Inside a table, you have fields. Table fields tells us what information is being kept in the database. In an authors table, for example, you may keep the persons name and email address.
The actual data is kept in rows.
The layout of a database table, fields and rows looks much like a spreadsheet in grid format.
A whole database is a number of spreadsheet pages all linked together in a particular way.
Databases allow adding, deleting, updating and retrieving of data quite easily, as well as sorting and searching.
Database design can be quite tricky and time consuming, so before you start anything you need to think about what you’re trying to do and exactly what you want to achieve.
Using a CMS as an example, we’ll look at some of the basic features and see how to put a database together. This process doesn’t just apply to a CMS design, it applies to any type of application that uses a database. Some things of course apply to any application you want to build, whether it uses a database or not.
The first step is to get a list of features together so you can see what you need to think about.
– Do you want many authors or contributors to your site (or is it just you writing everything)?
– Do you want just a news section, or do you want blogs as well?
– Do you want different categories?
– Do you want people to write comments?
Now that you have a basic feature list, you can look at how things fit together so you can design your database. Each step will affect other steps, so it’s important not to rush designing your database – it can be very difficult to “fix” it later and could end up being quite costly too!
In this example, you have different categories so you can group your content together (a sports site might separate popular sports like football, baseball, basketball and so on).
You also have a news area.
At the basic level, news will have a title and the actual content. Looking at extra features you might want to include:
– Will news items be in multiple categories or will they only be allowed in one particular category?
– Will multiple authors contribute to a news item or will it be just one?
– Will news need to be authorised before going live?
– Will news expire after a certain amount of time?
Comments will need a reference to the news item they are commenting on, the actual comment (of course!) and when the comment was made. You might want to keep the name of the person making the comment, their email address and a url for their website. For our example, only existing authors will be able to comment.
The answers will tell us exactly how your CMS will work and how the database will need to be designed, so it’s quite important to get as much information as you can at the start. Even though you haven’t started yet, you may also need to consider “feature requests”. If you hear a suggestion about putting news in multiple categories (you may hear “I’d like to be able to do this but I don’t need to right now”), build the database with this in mind. It may add a bit of extra work now, but it’s easier to do at the start than it is later on when you have to redesign everything!
How you design the database can affect speed as well. A poorly designed database is very difficult to use, add features to, or speed up. If you have to do complicated queries or have to perform complex calculations to work out searching, it’s going to be hard to speed this up.
As you can see, database design can get quite difficult, the key is not to rush it and try to think about everything you need to do before you start. Brainstorm with others about what you might need, get them to take a look at your feature lists and questions and see if you’re missing anything.
Once that has been decided, you can start looking at the table structure.
Using the news table as a starting point, we need to keep the following information:
– author (we’re only going to allow one author)
– category (we’re only going to allow news in one category)
The first thing to look at is how can I uniquely identify a particular row?
You need to be able to uniquely identify a particular row if you want to update or delete it, we don’t want to accidentally remove the wrong entry.
A ‘title’ for a news item doesn’t uniquely identify a row because you can have multiple items with “New Product Update” or “Special Discount” and so on.
Going through the news items, we can’t use any of these fields to uniquely identify a row. What next?
We can make one up!
In mysql we can use an ‘auto_increment’ field, in postgresql it’s a ‘serial’ field. This field will allow us to uniquely identify a row and will automatically change so we can basically “forget about it”. It is a number that gets updated each time a row is added to the database table.
This becomes our primary key (by definition a primary key simply allows us to uniquely identify a row in a table).
Other examples of primary keys are social security numbers and ISBN’s for books. Only one person has a particular SSN and only one book has a particular ISBN.
A phone number would be no good as a primary key – you could have multiple people listed under the same number. An email address could be used as a primary key but it depends on the rest of your database structure (for example, family members might share the same email address, so using this depends on what you are trying to achieve).
So now our table will have this information:
– newsid (our primary key)
Going through the same process for categories we end up with:
– categoryid (our primary key)
Our authors table structure looks like:
– authorid (our primary key)
– author’s name
– author’s email address
– author’s website url
Our comments table structure looks like:
– commentid (our primary key)
Looking at this graphically, we end up with something like this:
So we can see how everything links together (this is called a “relational database” since there are relationships between each of the tables).
As you can see, there is quite a lot to properly designing a database, and we haven’t even started creating any tables yet!