Ericson Smith
Following Tim Perdue's excellent article comparing MySQL and PostgreSQL, I decided to take a shot at installing and using this database. For most of our work I use MySQL and will continue to do so, because of its ease of use and unrivaled SELECT query speed, and because there is no point in messing around with production systems that already work fine.
But some new projects suffered greatly from MySQL's table-level locking when I needed to update data (which I do a lot). Here are my adventures in setting up a PostgreSQL database server.
Our dedicated PostgreSQL server ran Red Hat Linux 7.1.

Downloading and Installing

I downloaded and installed the 7.1.2 RPMs from http://postgres.org without any trouble. For a server installation, I installed only two packages: postgresql-server and the postgresql-7.1.2 base package.
I then started the server by executing:
/etc/init.d/postgresql start
A small database (three tables totaling about 5,000 records) was ported from MySQL. I created sufficient indexes for PostgreSQL's optimizer to use, and modified our C application to use the PostgreSQL C client interface, libpq, in a small CGI program that brutally queries these tables. This small CGI program receives thousands of queries per minute.
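For reference, a bare-bones libpq query looks something like the sketch below. This is a minimal illustration rather than our production code; the connection string, table, and column names are invented.

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGresult *res;
    int i;

    /* Connect; the connection parameters here are illustrative */
    PGconn *conn = PQconnectdb("host=localhost dbname=mydb user=www");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* Run a query against a hypothetical table */
    res = PQexec(conn, "SELECT name FROM customers WHERE id = 42");
    if (PQresultStatus(res) == PGRES_TUPLES_OK) {
        for (i = 0; i < PQntuples(res); i++)
            printf("%s\n", PQgetvalue(res, i, 0));
    } else {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
    }

    PQclear(res);
    PQfinish(conn);
    return 0;
}

Compile it against libpq (for example, gcc -o test test.c -lpq); header and library paths vary by installation.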

Optimizing

One of the first things I noticed after turning on the CGI program was that although queries came back almost as fast as from the previous MySQL-based system, the load on the server was much higher -- almost 90 percent! So I started digging into the nitty-gritty. I had optimized MySQL before by greatly increasing cache and buffer sizes and by throwing more RAM at the problem.
The single biggest thing you have to do before running PostgreSQL is to provide enough shared buffer space. Let me repeat: provide enough buffer space! If you have about 512MB of RAM on a dedicated database server, you should turn over about 75 percent of it to this shared buffer. PostgreSQL does best when it can load most -- or, even better, all -- of a table into its shared memory space. In our case, since our database was fairly small, I decided to allocate 128MB of RAM to the shared buffers.
The file /var/lib/pgsql/data/postgresql.conf contains settings for the database server. PostgreSQL uses System V shared memory for its buffers. On a Linux system, you can see the maximum shared memory segment size the kernel allows, in bytes, by running:
cat /proc/sys/kernel/shmmax
And to view shared memory use on the system:
ipcs
By default, Red Hat 7.1 allows only 32MB of shared memory, hardly enough for PostgreSQL. I raised the limit to 128MB with:
echo 128000000 > /proc/sys/kernel/shmmax
Be aware that this setting will disappear once you reboot the server. Place the line in your PostgreSQL startup file, or add an entry to /etc/sysctl.conf for a more permanent setting.
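The equivalent /etc/sysctl.conf entry, matching the echo command above, would be:
kernel.shmmax = 128000000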
Then, in our postgresql.conf, I set shared_buffers to 15200. PostgreSQL allocates its buffers in 8K blocks, so 15200 buffers come to 15200 x 8192 bytes, roughly 124MB, which leaves a little of the 128MB segment as overhead for PostgreSQL's other shared structures. I also set sort_mem to 32168 (about 32MB for the sort memory area; the value is in kilobytes). Since connection pooling was in effect, I set max_connections to 64. And fsync was set to false, trading crash safety for write speed.
shared_buffers = 15200
sort_mem = 32168
max_connections = 64
fsync = false
You can read the manual and tweak other settings, but I never needed to. Note that if you set shared_buffers higher than your shared memory limit allows, PostgreSQL will refuse to start. This confused us for a while, since no logging was taking place. You can tweak the startup file in /etc/init.d so the postmaster writes its output to a log file. Change the fragment from:
/postmaster start > /dev/null 2>&1
to
/postmaster start > /var/lib/pgsql.log 2>&1
(or wherever you want to store the log).
Tailing the log file clearly explained what the problem was.
All sorts of sexy debugging info will show up in this file: SQL syntax errors, the output of EXPLAIN statements, connection problems, authentication attempts, and so forth.
I restarted PostgreSQL and brought our CGI online. Our jaws collectively dropped to the floor as PostgreSQL literally flew once it started to use the buffer. Server load from PostgreSQL dropped to just under 10 percent.
One hitch I found with an early version of the system was that it built up and tore down a PostgreSQL connection with each request. This was intolerable, so I started using the connection pooling features of the C library, and server load dropped another few notches. With PHP you will want to use persistent connections (pg_pconnect instead of pg_connect) to take full advantage of the same effect; a sketch of the basic idea follows.
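Our pooling setup isn't shown here, but in its simplest form the idea is just to open the connection once per process and reuse it for every request, reviving it only if it drops. A minimal sketch, again with an invented connection string:

#include <libpq-fe.h>

static PGconn *conn = NULL;

/* Return a live connection, opening or reviving it as needed */
static PGconn *get_connection(void)
{
    if (conn == NULL)
        conn = PQconnectdb("host=localhost dbname=mydb user=www");
    else if (PQstatus(conn) != CONNECTION_OK)
        PQreset(conn);  /* re-establish a dropped connection */
    return conn;
}

Every request handler then calls get_connection() instead of PQconnectdb(), so the connection setup cost is paid once per process rather than once per query.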

Indexes

I cannot emphasize enough the need for proper indexing in PostgreSQL. One early mistake I made was to index BIGINT columns. The columns were indexed fine, but PostgreSQL refused to make use of them. After two days of tearing out my hair, it occurred to me that the system's architecture was 32-bit. Could it be that PostgreSQL refuses to use a 64-bit (BIGINT) index? Changing the type to INTEGER quickly solved that problem. Maybe if I had one of those new-fangled 64-bit Itanium processors.
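For what it's worth, the explanation usually given for this in the 7.x era was the type system rather than the CPU: a bare numeric literal in a query is typed INTEGER, and the planner would not match it against a BIGINT index. Quoting or casting the constant was the standard workaround. A hypothetical example (table and column names invented):
SELECT * FROM hits WHERE userid = 42;         -- BIGINT index ignored; 42 is typed INTEGER
SELECT * FROM hits WHERE userid = '42';       -- quoted literal: the index can be used
SELECT * FROM hits WHERE userid = 42::int8;   -- an explicit cast works too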

Conclusion

There are many things you can do with your SQL statements to improve query response further, but these are adequately covered in the interactive PostgreSQL documentation.
Ericson Smith is a web developer at http://did-it.com.