In a previous job when I was responsible for hiring PHP developers, I always used to ask questions about database abstraction in my interviews. It's amazing how often candidates with the best-looking CV's, and impressive looking project lists, were tripped up by their lack of knowledge of what a database abstraction layer is, and were unfamiliar with any of the main database abstraction layers.
I can't really blame them. It's not obvious to newbies learning PHP development from the PHP website, many books, or other online resources, that abstraction layers even exist, or why one should use them. To many who've put in the time of learning the PHP functions, suddenly being faced with a whole new set of functions that don't (on the surface) provide much benefit can be daunting.
Needless to say, none of the candidates who were clueless about abstraction layers got the job. But, to make things easier for others in a similar situation, this month's article covers reasons for using abstraction layers, and takes a look at some of the more popular ones in this rapidily changing field.
Why use a Database Abstraction Layer
Let's start with the reason you'll usually hear trumpeted from the rooftops. It allows one to switch from one DBMS to another with a minimum of pain. That's not entirely true, and certainly not the main reason for using an abstraction layer. Using the PHP MySQL functions means that one would have to change each instance inside your code. If your code is neat, you shouldn't have these littered everywhere, but there still may be more to change than if you'd used an abstraction layer. However, it's highly unlikely that you'll be able to switch one line of code, switching mysql to oracle, for example, inside a config file. There're usually at least some queries that use features unique to a particular DBMS, or that differ from the alternative you're switching to, and there is also likely to be performance tuning unique to that DBMS. There's no other way to migrate these than to manually go through and rewrite each one.
Some claim that developers should never use non-standard SQL for this reason, but that's an unreasonable restriction. I was once involved in a highly painful project to move from MySQL to Informix. The code had been written with a degree of database abstraction (PHPLib), and it was still a painful process. Luckily I managed to convince everyone that MySQL hadn't been properly optimized, and was in fact suitable for the task. Nevertheless, using a database abstraction layer does help portability, just don't make the mistake of thinking it will be easy to switch to another DBMS. The choice of a DBMS should never be lightly taken, and once committed to, not easily backed away from.
The other advantage usually cited is that, by its very nature, abstraction simplifies a complex task. So, instead of having to learn an entire new set of functions, one can simply use the standard abstraction layer's code. However, this too is an overstated advantage. A competent developer should be comfortable with all the PHP database functions - they really don't differ that much. The many abstraction layers however do differ from each other quite substantially. Familiarity, though, is an advantage - it just makes a developer that much more efficient. Java developers coding in PHP may be more efficient using Creole (based on JDBC), as could Microsoft developers using ADOdb. Developers working in an environment with multiple DBMS' will appreciate the single API, and it undoubtedly improved developer productivity.
Metabase was one of the slowest abstraction layers, mainly because it was designed with maximum portability in mind. PDO, written in C and accessing the native API's, is one of the fastest. If you're concerned about performance, the best way is to perform your own benchmarks, set up to simulate your environment and needs as closely as possible.
Popular PHP database abstraction layers
There are three granddaddies out there: Metabase, PEAR:DB and AdoDB. Most code out there at the moment makes use of one of these, with PEAR being the most popular (being bundled with PHP), ADOdb perhaps next, followed by Metabase. However, there are some new ones worth looking at. PEAR:DB and Metabase are in the process of being superceded with other abstraction layers, PEAR:MDB being a merge of PEAR:DB and Metabase, and apparently having a better performance than both. Let's look at some of these.
In existence since 2001, part of the PEAR repository, and bundled with PHP, PEAR:DB is for these reasons the most popular database abstraction library. PEAR stands for PHP Extension and Application Repository, and describes itself as a framework and distribution system for reusable PHP components. It's not known as particularly zippy, and is also being usurped by more recent and better performing abstaction layers. PEAR:DB supports:
In 2003 I wrote an introductory tutorial to ADOdb (part 1 and part 2). Of course a lot has changed since then, but ADOdb is still a popular abstraction layer. It's particularly easy to learn for Microsoft developers, as it uses many of the ADO conventions. It's been around since 2000, and is well-supported, and is also used by numerous popular web applications, such as Mambo and eGroupWare.
ADOdb has quite a few methods that go beyond what's normally in an abstraction layer, such as formatting tables and caching and it supports the most DBMS'. The list currently includes:
Developed by Manuel Lemos, Metabase has a reputation as one of the slower abstraction layers (see the above benchmarks), but then it does support all the way back to PHP 3, as is also designed for maximum portability.
MDB is a merge between the PEAR:DB and Metabase database abstraction layers, as well as an attempt to improve the performance and portability of both. The first stable version was released in April 2004.
PEAR:MDB2 takes the Metabase/PEAR:DB merge that begun with PEAR:MDB further, as well as working on consolidating the resultant bloated API. It also aims to integrate easily with PDO. As of August 2005, PEAR:MDB2 is still in beta.
Based on Java's JDBC, as well as to some degree on PEAR:DB, PEAR:MDB and ADOdb, Creole is a fairly new abstraction layer for PHP5. Not burdened with PHP 4 compatibility issues, Creole offers fully object-oriented API and support for PHP5 Exceptions.
Still in beta, but one to watch for the future, PHP Data Objects will be part of the PHP 5.1 release. It's written in C, and interfaces with the native API's, so will be fast, though not as fully-featured as some of the others.
DBX is another, less talked about, database abstraction layer that, until PHP 5.1, was bundled with PHP. It's also fast, being written in C, but is lightweight, the primary benefit being it allows you to access all supported databases using a single calling convention (for example dbx_connect instead of mysql_connect). It was released around 2002, and now supports the following databases:
PHPLib is now obsolete, and I put it here only because it was popular in its day, and you may still come across references to it. It was never actually designed as a fully-fledged database abstraction layer, rather it just aimed to encapsulate the connection string, and handle errors.
Don't buy into any hype that tells you you're inferior if you don't use abstraction layer X, or none at all. Think carefully about whether you need to use a database abstraction layer, and if so, which one. What sort of functionality, ease-of-use and scalability do you need? There's more than enough choices out there, and no need to reinvent the wheel. As a novice programmer I suggest you try use one, just to see what you've been missing out on. You can't really make up your mind without knowledge of both sides. As an experienced developer, hopefully you have the tools to make a wise decision. Perhaps this article has introduced you to one that may be suitable to your needs. Good luck.