Table of Contents
Why Use a Database
The short answer: you have some data and you want to be able to store it in a way that makes it easier to search, sort and filter in meaningful ways. If you want to store external content (pictures, big chunks of text, etc) or more than a simple list, a database is your best choice.
What You Should Already Know
You should already know how to use your computer: how to use applications and files. You should also know what a spreadsheet is since we use Microsoft Excel as an example. And indeed, this article targets budding developers and assumes you have some interest in creating software!
What You Will Learn
You will have a conceptual understanding of: • what is a database • what is a Database Management System (DBMS) • what are Layers of Abstraction (and why should you care?)
Where to Go from Here
If after reading this you are still convinced that Excel is a database, you should learn more about what DBAs expect from a database.
What is a Database
A simple definition of a database is a single unit that stores information in an organized way. This unit can be a physical file on a computer, or it can be a unit of organized information that is stored in memory.
What about Excel?
You might be thinking how this definition could equally apply to a spreadsheet like Microsoft Excel – and you would be right! In fact, there are many business people using Excel files to store information in an organized way when a database is a much better solution for them.
What is a DBMS
A Database Management System is one or more software applications or systems created to work with databases. Every DBMS vendor has their own definition of what a DBMS is – a definition that matches the product they sell (and which often doesn’t match well the offerings of competing DBMS vendors). What Database are you using?
For some strange reason, many computer users use the terms database and DBMS interchangeably. That is like saying Microsoft Excel and .xls files are the same thing. Just remember that a DBMS is software and a database is a file.
What are Layers of Abstraction (and why should I care?)
If you came into development with the intention to create something particularly useful just for yourself and not to become a professional programmer, you are likely to miss out on the separation of fish from fowl.
Back in the 1970's a committee at ANSI (American National Standards Institute) didn't spend their days sitting around kidney shaped tables in turtleneck sweaters eating fondue, and instead gave serious consideration of keeping data separated and contained in a safe and logical way.
ANSI recognized that individual users may need to access a single store of data, but then view that data in a special way that is customized to the needs of the user. This method of customized viewing of data was cleverly called – user views.
Back to ANSI. ANSI came up with a concrete standard for separating databases into layers of abstraction. If you keep a database separated this way, you will reap untold riches and fame:
The Physical Layer
The unit we defined earlier can actually be more than one file on a hard disk. One advantage of doing this is that it is possible to segment portions of the database across multiple storage devices like hard drives. Then each file can be accessed in parallel with the others, increasing performance.
The Logical Layer
Now we are looking within a database file itself and how information is stored there. This is also often referred to as a database schema and how the bits of data are organized across the database.
To user, a database appears as a single structure, even if it`s physically separated into multiple files.
The External Layer
This is the level where users – user views live. Some external layer information can be stored within a database itself for reuse later, or it can be created on a temporary basis and then thrown away when it is no longer needed.
This Thing Called the Internet: When web pages first appeared on the World Wide Web, web designers happily jumbled together display rendering information with the information itself in HTML, the language of web pages. Many web designers today still do this; yet the trend is to separate information into presentation, behavior and data. Web pages and web browsers displaying them then become the user view to DBMS found on internet servers.
If you took the time to read This Thing Called the Internet, then you already have an idea what we mean by presentation, behavior and data. In the world of databases, data and some behaviors are stored inside a database. Presentation is not stored inside a database.
Microsoft and Apple More Alike Than Not
Microsoft Access is an extremely popular DBMS that, by default, stores presentation, behavior and data all within the same file. Apple Computer's FileMaker Pro is the same. However you can separate your data from presentation and behavior in Access and FileMaker Pro.
So Why Should I Care?
You might be thinking at this point – why not use an all-in-one, mix-it-up approach if it's good enough for Microsoft and Apple?
Mixing Makes More Work
So you have written an application with a million lines of code and the code is all in one convenient file. The downside is that you have to search this file to find errors as they crop up. If you store your data separately, it lets you concentrate on what is important at the time.
Simplify for Safety Sake
Computer crashes happen. Application crashes happen. Careless programming can result in damaged applications. If the presentation part of your application is separate from your data, then a crash is less likely to impact your data. Compare with Excel. If Excel crashes, it will take down Excel plus you may lose whatever data was loaded into memory. You can reinstall Excel, but it can be much more challenging to get back your data.
If a file is constantly being written to, especially by multiple users, its more vulnerable to the dangers of unprotected computing. Some files though, if damaged, can be repaired or replaced without serious loss of data that matters. For example, a database index may be rebuilt regularly to improve performance in locating data.
A Valentina database can consist of up to four physical files. If something happened to the index file, the meaningful data survives.
No really, what about Excel?
You just won`t let this one drop, right? Superficially and based on our simple definition above, Excel looks a lot like a DBMS. It is a tool for organizing data into neat rows and columns and then managing it in all sorts of imaginative ways with fancy reports, pivot tables and in-cell calculations. However Excel does not conform to many modern conventions in the world of databases – it’s a paper tiger in the face of even Microsoft Access and a soggy kitten before the data management muscle of Valentina (or Access's big brother, MS SQL Server).