Why Use Databases?
Why use Databases?
Web-enabled databases provide the opportunity to fundamentally change the way web sites are produced, managed, and delivered. Currently, many people (especially small and medium-sized businesses) are trying to figure out how to effectively use the Web. In most cases, the prime focus is on using the Web to create, manage, find, and deliver information that is stored in a database. This has led to an explosion in the number of products for providing connections between the Web and existing database products, as well as completely new Web-oriented database tools.
The number of products for connecting databases and the web seem to be increasing faster than any other segment of the Internet software market and with good reason -- knowledge is power.
Databases provide a direct way of tapping the knowledge that already exists in an organization and providing it to interested parties with the click of a mouse. Sites that harness that knowledge and deliver it over the web are a powerful presence compared to their less sophisticated brethren.
Providing web access to a database
Before going to all of the trouble required to use databases on the web, lets take a deep breath and ask “Why do I want to use a database on the web?”. There are a number of good reasons for needing to use databases on the Web which rest on two basic premises:
- the web is a great medium for delivering information
- databases are the perfect medium for managing information.
So a database can be used to manage the "back-end" mechanics of an information delivery system while the web is used to handle the "front-end" user interface. There are a number of ways that databases can be used on the web, but I like to break it down into three types of applications: dynamic publishing, information transactions, and data storage and analysis. I think that these broad categories provide an excellent framework for thinking about web database applications. The rest of this section describes the three basic types of web database applications.
Dynamic publishing
For the traditional web designer, databases provide the perfect method for dynamically publishing web content. Normally, each web page of a site is an individual entity. It is individually designed and saved as a distinct file. In many sites, a large number of pages are based on a standard template which helps to simplify or automate the design process. This means that a lot of time is spent entering or cutting and pasting text and filenames into the same standard template and saving a number of files. A page from an on-line catalog is a good example of a template: it probably contains a picture of an item and its accompanying description, catalog number, and price.
Tools exist to make this editing process easier, but it is a much more elegant and over the long term, efficient solution to store the image filename and text description in a database and have the computer “fill in the blanks” for the image filename and description in the standard template. This is the essence of dynamic web publishing.
Information transactions
The transaction is an apt metaphor for a number of business-oriented functions. Nearly every interaction between a client and a business involves moving discrete chunks of information. Some typical transactions examples include:
- registering for a course
- ordering an item
- turning in an assignment
- taking an examination
In each of these examples, discrete chunks of information are provided by the client and another set of data chunks are given to the client by the server. At the heart of this information exchange is some sort of transaction, usually accomplished by filling out a form. It is a fairly simple operation to translate a paper form into an equivalent web document. Combined with a database, such forms make one step information transactions with the computer filling the role of operator eminently feasible.
Data storage and analysis
Business is driving a lot of the development on the web. Honestly, a lot of companies are having trouble figuring out how to make money on the web, but the core of any business is how it manages its information assets. Everything from inventory items to customer addresses to orders and suppliers is information that is probably stored in a database already. The web can provide an interface into many current business systems, especially for businesses using commercial databases as their “back-end”.
Data in this category is typically static, or changed very rarely. It exists mainly as a resource for analysis or historical purposes. Examples include:
- filled orders
- accounts paid
- closing stock prices
- images in a media archive
- address books
This data is normally accessed in a variety of ways to answer specific questions. In enterprise-level computing, these are often called data warehouses and the process of exploring them is called data mining. Regardless of scale however, this is the most common type of traditional database application.
Terminology
Database Terminology
To be able to talk about database design properly, its essential to learn the proper terminology. Like any specialty, database professionals have their own language. Its important to speak that language regardless of whether its your job to perform those tasks. An outside database consultant hopefully can translate what he or she is doing into language you can understand, but its much better for everyone involved to be on the same linguistic page. And in case you decide after reading this book and solving your database problem that you want to do your own web database consulting work, understanding and speaking the accepted terminology is a great benefit.
There are a number of terms that describe the parts of a database. These structural terms are standard regardless of whether the database is implemented in any particular piece of software and in general apply to paper databases as well.
A field is the most basic structural unit of a database. it is a container for a piece of data. In most cases, only a single logical piece of data fits in each field. Of course the definition of logical can vary from application to application. For example, a mailing address could be a field in a database of business contacts. In another application, the street address, city, state, and ZIP code may all be in different fields that can be combined to form the mailing address.
A column (or attribute) is a series of fields which are all of the same type. An example would be the First name of your customers.
A row (or record) is commonly defined as a set of fields describing a larger unit is normally called a record or a row. The fields in a record provide a complete description of each item in a collection. A record is a unique instance of data about an object or event.
A table (or relation) is the formal name given to the group of records that contain a set of rows. A table normally represents a distinct object (business clients or library books), or an event (product orders or stock prices).
A database is basically a collection of tables. It also often includes forms for entering data, rules for checking and validating data that has been entered, and the format for creating informative reports from the data in the database.
A key is simply a unique identifier for each row in a data table. Even though each individual record represents a separate piece of data, some of those records may look identical. A key provides a completelyunambiguous way to distinguish between distinct records, and more importantly, serves as a pointer to that one particular record of the entire table. The government uses social security numbers to uniquely identify each individual in the country. Names, addresses, or any other piece of data on a citizen may be identical, but no one has the same social security number so it serves as a unique identifier for any American.
A foreign key forms the basis of a 1:M relationship between two tables. The foreign key can be found within the M table, and maps to the primary key found in the 1 table. To illustrate, the primary key in the professor table (probably a unique identification number) would be introduced as the foreign key within the classes entity, since it would be necessary to map a particular professor to several classes.
A composite key is utilized when one column doesn't provide enough information to uniquely identify a record in a table. Instead of one column, 2 or more columns are utilized to create the key.
An index is sometimes confused with a key, but they are two completely different things. Indexes are used to speed up the performance of queries and are implemented in a variety of ways by different software packages. For example, in a database of employee information queries may often performed by Web Databases: last name, social security number, and phone number. Indexing essentially rearranges a field in consecutive order so records do not have to be searched sequentially. The index essentially relates an ordered field with the primary key for that record. This greatly enhances the speed of common queries.
A view is a virtual table – virtual in the sense that the table is not physically represented in the database as the viewer sees it. The view is dynamically constructed from the database as needed.This is a very essential point since the main use of data-backed web sites is to construct various views of the data in the database. In a number of common situations, multiple groups of users want to look at the same data table in different ways.
Relations
By understanding the various relationships both between the data items and how they relate to items in other tables you can more accurately relate information beter. There are three types of data relationships that you should be aware of:
one-to-one (1:1) - A one-to-one relationship signifies that each instance of a given entity relates to exactly one instance of another entity. For example, each student would have exactly one grade record, and each grade record would be specific to one student.
one-to-many (1:M) - A one-to-many relationship signifies that each instance of a given entity relates to one or more instances of another entity. For example, one professor entity could be found teaching several classes, and each class could in turn be mapped to one professor.
many-to-many (M:M) - A many-to-many relationship signifies that many instances of a given entity relate to many instances of another entity. To illustrate, a schedule could be comprised of many classes, and a class could be found within many schedules.
Introduction to Normalization
To become a true database power user, you need to understand a key database design concept called normalization. Normalization is the process of ensuring that your data conforms to sets of rules called (logically enough) normal forms. When you normalize your data, you eliminate redundant data and ensure that the information in each table is properly linked.
Normalizing enables your databases to work faster, it saves storage space (hey, disk space may be cheap, but it isn't free), and it also helps protect your data against corruption. If you're getting nervous about this, don't worry. You don't need to be a professional to normalize your data, and you can even try this exercise at home. We'll use Excel because prototyping your databases in Excel allows you to "make all your mistakes on paper," in a sense, before you go through the harder work of creating a new database, defining data types, and so on in Access.
This explains how to make your data conform to the first, second, and third normal forms.
- First normal form
- Places your data in columns and rows.
- Second normal form
- Adds a key—a column of values that uniquely identify each row in each table —to your data.
- Third normal form
- Ensures that each table contains unique data. In other words, it ensures that a table of customer identification data does not contain order data, and so on.
- It contains repeated and combined data
- For instance, the rows contain data on each Department over and over, thus wasting space. Also, let's imagine we know people will want to access this database by referring to some of this info individually, such as Floor number. It can't be in the same field as other info.
- It contains unnecessary information
- What do the Chicken prices mean?
- It has a structure that leads to bloated data
- As the years progress and the teachers teach different classes, you have to add new columns for each class, thus increasing file size. Generally, you want to add rows as a table expands, not columns.
- It saves space.
- It allows for fast retrieval of information.
- It describes the data clearly. In other words, you never have to wonder what you're seeing.
- It ensures data integrity. For example, it ensures that a field titled First_Name never contains an address or postal code.
- Database designers have used the normal forms for a long time, and they're a standard, proven technology. We'll start by placing the data in first normal form.
Now we can walk you through the process of using the normal forms.
A quick look at denormalized data
One of the best ways to understand normalized data is to look at denormalized data—data that does not adhere to the normal forms. Follow these steps:
You'll see several design flaws that denormalize the data in the first worksheet:
Normalize your data
You normalize your data for several reasons:
First Normal Form
When you put data in first normal form, you place all of it in columns and rows and you separate each piece of info into its own field, particularly if that info is likely to be searched for. For example, in the denormalized data that you saw in the previous section, the Dept fields each contain several bits of data - you have to add new columns for each new class as teachers keep teaching new classes. First normal form eliminates that kind of potential confusion.
So what's the procedure? Whenever you want to try putting any data into first normal form, first make sure it's in table form, with columns, headers, and rows.
Then think about what's valid information:
Only place relevant pieces of information together into one single table.
Second Normal Form
Each data record must be identifiable by a primary key. If there is no obvious key create one. Remember it might be a good idea to create a composite key.
Ensure every table has a primary key, even if it is a composite key.
Third Normal Form
Transitive dependencies -- columns that are directly dependent on another column (not the primary key) -- must be removed and put into their own table
Use foreign keys to relate the tables together.
Class or Relationship Diagram
A class diagram is a visual model of the data and associations of the tables in your database. These diagrams have many options, but the basic features that must be included are the table names in boxes and the associations (relationships) connecting them.
Below is an example of the data that we have been working on as a Class Diagram:
Lets separate out the information:
- Table Names are located in the gray boxes.
- The primary keys are indicated with a PK and in Bold The foreign keys are indicated with a FK an a number.
- The rest of the columns are indicated below the primary key. Arrows are used to indicate how each table connects to another table.
Creating Class Diagrams
Class diagrams can be created in an image editing program, however a program called VISIO from Microsoft is extremely intuitive and offers a lot of information if you choose to use it.
Instructions for Making Class Diagrams in Visio
- Open Visio and Select Database from the list of categories. Then click on the image for a Database Model Diagram.
- Drag a Entity shape from the left hand panel onto your stage for each table in your database.
- When you select a Entity on your stage, look at the bottom of your screen for the Database Properties options. Make sure that the Category for Database is "Definition" and then type the name of your table in the "Physical name" textbox. Continue this for the rest of the tables.
- Now select the Columns category in the Database Properties. Add in all of the columns in your specific tables and indicate primary keys. Once you understand more about database structure you can even add items such as required and data types.
- Now drag the relationship arrow onto the stage. Drag each end to a table that has a relationship with the other table. This will update you tables with foriegn keys.
- Now you can go to File - Save As - and select any image option - I suggest the .jpeg or .gif.