Relational Database Technologies: 2011

Saturday, November 19, 2011

Deadlocks in PostgreSQL

Before discussing deadlocks, let's look at the types of locks in PostgreSQL and how they are acquired.
Types of Locks:

Table-Level Locks and
Row-Level Locks

Table-Level Locks:

AccessShareLock: Acquired automatically by a SELECT statement on the table or tables it reads from. This mode conflicts only with AccessExclusiveLock, so it blocks ALTER TABLE, DROP TABLE, and VACUUM FULL on the same table.
RowShareLock: Acquired automatically by a SELECT...FOR UPDATE clause. It blocks concurrent ExclusiveLock and AccessExclusiveLock on the same table.
RowExclusiveLock: Acquired automatically by an UPDATE, INSERT, or DELETE command. It blocks ALTER TABLE, DROP TABLE, VACUUM FULL, and CREATE INDEX commands (ShareLock, ShareRowExclusiveLock, ExclusiveLock, and AccessExclusiveLock) on the same table.
ShareLock: Acquired automatically by a CREATE INDEX command. It blocks INSERT, UPDATE, DELETE, ALTER TABLE, DROP TABLE, and VACUUM commands (RowExclusiveLock, ShareUpdateExclusiveLock, ShareRowExclusiveLock, ExclusiveLock, and AccessExclusiveLock) on the same table.
ShareRowExclusiveLock: This lock mode is nearly identical to ExclusiveLock, except that it allows concurrent RowShareLock to be acquired.
ExclusiveLock: "Every transaction holds an exclusive lock on its transaction ID for its entire duration. If one transaction finds it necessary to wait specifically for another transaction, it does so by attempting to acquire share lock on the other transaction ID. That will succeed only when the other transaction terminates and releases its locks." (regards, tom lane). Best definition by Tom Lane; I believe every email from him is a lesson, he is Dr. PostgreSQL :) . Note that the quote describes the ExclusiveLock each transaction holds on its own transaction ID (locktype 'transactionid' in pg_locks), which is where row-level waits show up. As a table-level mode, ExclusiveLock blocks INSERT, UPDATE, DELETE, CREATE INDEX, ALTER TABLE, DROP TABLE, SELECT...FOR UPDATE and VACUUM commands on the table (RowShareLock, RowExclusiveLock, ShareUpdateExclusiveLock, ShareLock, ShareRowExclusiveLock, ExclusiveLock, and AccessExclusiveLock).
AccessExclusiveLock: Acquired automatically by an ALTER TABLE, DROP TABLE, or VACUUM FULL command on the table it modifies. It blocks every concurrent command and every other lock mode on the locked table.
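
The blocking rules in the list above can be written down as a small lookup table. Below is a minimal Python sketch, not PostgreSQL code: the names MODES, CONFLICTS and conflicts() are my own illustration, and the table follows the conflict matrix in the explicit-locking documentation linked at the end of this post. It also includes ShareUpdateExclusiveLock, the mode a plain VACUUM takes.

```python
# Table-level lock modes, spelled the way pg_locks reports them.
MODES = [
    "AccessShareLock", "RowShareLock", "RowExclusiveLock",
    "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
    "ExclusiveLock", "AccessExclusiveLock",
]

# For each held mode, the set of requested modes that must wait.
CONFLICTS = {
    "AccessShareLock": {"AccessExclusiveLock"},
    "RowShareLock": {"ExclusiveLock", "AccessExclusiveLock"},
    "RowExclusiveLock": {"ShareLock", "ShareRowExclusiveLock",
                         "ExclusiveLock", "AccessExclusiveLock"},
    "ShareUpdateExclusiveLock": {"ShareUpdateExclusiveLock", "ShareLock",
                                 "ShareRowExclusiveLock", "ExclusiveLock",
                                 "AccessExclusiveLock"},
    "ShareLock": {"RowExclusiveLock", "ShareUpdateExclusiveLock",
                  "ShareRowExclusiveLock", "ExclusiveLock",
                  "AccessExclusiveLock"},
    "ShareRowExclusiveLock": {"RowExclusiveLock", "ShareUpdateExclusiveLock",
                              "ShareLock", "ShareRowExclusiveLock",
                              "ExclusiveLock", "AccessExclusiveLock"},
    "ExclusiveLock": {"RowShareLock", "RowExclusiveLock",
                      "ShareUpdateExclusiveLock", "ShareLock",
                      "ShareRowExclusiveLock", "ExclusiveLock",
                      "AccessExclusiveLock"},
    # AccessExclusiveLock conflicts with every mode, including itself.
    "AccessExclusiveLock": set(MODES),
}


def conflicts(held, requested):
    """True if a session requesting `requested` must wait for `held`."""
    return requested in CONFLICTS[held]


# Two writers (RowExclusiveLock) do not block each other at table level,
# but CREATE INDEX (ShareLock) has to wait for a writer.
print(conflicts("RowExclusiveLock", "RowExclusiveLock"))
print(conflicts("RowExclusiveLock", "ShareLock"))
```

The table is symmetric: if mode A blocks mode B, then B also blocks A.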

Row-Level Locks:

There are two types of row-level locks: share and exclusive. Don't be confused by the lock naming; you can tell row-level waits from table-level locks by the 'locktype' column in pg_locks.

Exclusive lock: Acquired automatically when a row is hit by an UPDATE or DELETE. The lock is held until the transaction commits or rolls back. To acquire an exclusive row lock manually, use SELECT...FOR UPDATE.
Share lock: Acquired when a row is hit by a SELECT...FOR SHARE.

Note: In either case, row-level locks do not affect data retrieval at all. Row-level locks block only writers (i.e., a writer blocks another writer).

DeadLocks:

Now to deadlocks. You have seen the lock modes and how they are acquired; in some situations transactions end up in a deadlock. I believe application design is usually the culprit that forces transactions into deadlocks. Deadlocks are mostly caused by exclusive row-level locks, i.e., UPDATE or DELETE.

What is a deadlock?

Process A holds a lock on object X and waits for a lock on object Y. Process B holds a lock on object Y and waits for a lock on object X. At this point the two processes are in what is called a 'deadlock': each is trying to obtain a lock on something owned by the other. Left in this state, they would wait on each other forever, so one of them has to give up and release the locks it already holds. This is where the deadlock detector comes into the picture: it aborts one transaction with an error (so it rolls back) and lets the other succeed.
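
The A/B situation described here is a cycle in a "who waits for whom" graph. As a rough illustration only (this is not how PostgreSQL implements its detector; find_deadlock and the waits_for mapping are names I made up, and each process is assumed to wait for at most one other process), a cycle check can be sketched in Python like this:

```python
def find_deadlock(waits_for):
    """Return the processes that form a wait cycle, or None.

    waits_for maps a waiting process to the process holding the lock
    it needs, e.g. {"A": "B"} means A waits for B.
    """
    for start in waits_for:
        path = []          # processes visited from this start, in order
        seen = set()
        node = start
        # Follow the chain of waits until it ends or repeats.
        while node in waits_for and node not in seen:
            seen.add(node)
            path.append(node)
            node = waits_for[node]
        if node in seen:
            # We came back to a process already on the path: a cycle.
            return path[path.index(node):]
    return None


# A holds X and waits for Y (held by B); B holds Y and waits for X.
print(find_deadlock({"A": "B", "B": "A"}))
# A plain wait with no cycle is not a deadlock.
print(find_deadlock({"A": "B"}))
```

Once a cycle is found, one member of it has to be cancelled so the rest can proceed.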

To overcome deadlocks, design the application so that any transaction doing an UPDATE or DELETE first takes complete ownership of what it is going to change. Lock the table explicitly with LOCK TABLE ... IN SHARE UPDATE EXCLUSIVE MODE or IN ACCESS EXCLUSIVE MODE, or lock the rows with SELECT...FOR UPDATE, and then complete the transaction. Provided every writing transaction takes the same lock first, writers queue up one after another and the deadlock detector never reports a deadlock on exclusive locks. The price is reduced concurrency.

You can test the scenario given in the pic above with this resolution; you will see that the deadlock detector never throws an error.

Locking Query:

\set locks 'SELECT w.locktype AS waiting_locktype,w.relation::regclass AS waiting_table,w.transactionid, substr(w_stm.current_query,1,20) AS waiting_query,w.mode AS waiting_mode,w.pid AS waiting_pid,other.locktype AS other_locktype,other.relation::regclass AS other_table,other_stm.current_query AS other_query,other.mode AS other_mode,other.pid AS other_pid,other.granted AS other_granted FROM pg_catalog.pg_locks AS w JOIN pg_catalog.pg_stat_activity AS w_stm ON (w_stm.procpid = w.pid) JOIN pg_catalog.pg_locks AS other ON ((w.\"database\" = other.\"database\" AND w.relation  = other.relation) OR w.transactionid = other.transactionid) JOIN pg_catalog.pg_stat_activity AS other_stm ON (other_stm.procpid = other.pid) WHERE NOT w.granted AND w.pid <> other.pid;;'

Locking information Links

http://www.postgresql.org/docs/9.0/static/sql-lock.html

http://developer.postgresql.org/pgdocs/postgres/explicit-locking.html

Hope you got some idea on PostgreSQL Locks. See you all soon with another good blog.... :)

--Raghav

Thursday, November 10, 2011

~/.psqlrc file for DBA's

In regular DBA monitoring, we use many combinations of pg_catalog queries to retrieve information such as <IDLE> in transaction sessions, waiting queries, number of connections, etc. Most DBAs create views to shorten these big queries and keep them handy for later use.

PostgreSQL provides a startup file (.psqlrc) that the psql utility reads at startup, before it accepts your commands. In the .psqlrc file you can give each of your important queries a one-word alias with the '\set' command and run it in the psql terminal instead of typing the big query. If you don't see a .psqlrc file in the 'postgres' user's home directory, you can create it yourself. I tried it and found it very helpful.

Points on .psqlrc:

.psqlrc is a startup file; psql executes it when connecting to the cluster.
.psqlrc resides in the home directory of the OS user running psql (here, the 'postgres' user).
psql options -X or -c do not read the .psqlrc file.
.psqlrc applies to the complete psql session, not to a single database.

My terminal Screenshot:

Let's see how to implement this.

Syntax:

\set <alias-variable-name>  'query'

Note: if your query contains single quotes, write them as \' inside the alias (double quotes can also be written as \").
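
If you keep many aliases, the escaping can be done by a small script instead of by hand. The following Python helper is only a sketch under my own assumptions: psqlrc_alias is a made-up name, and it assumes that inside a single-quoted \set value a backslash quotes the next character, which is how the note above treats \'.

```python
def psqlrc_alias(name, query):
    """Build one '\\set' line for ~/.psqlrc from an alias name and a query."""
    # Double existing backslashes first, then escape single quotes.
    escaped = query.replace("\\", "\\\\").replace("'", "\\'")
    return "\\set %s '%s'" % (name, escaped)


# Print a line that can be pasted into ~/.psqlrc.
print(psqlrc_alias(
    "idle",
    "SELECT count(*) FROM pg_stat_activity WHERE current_query = '<IDLE>';",
))
```

Paste the printed line into ~/.psqlrc and run it in psql as :idle.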

Sample Queries to put in .psqlrc file with alias:

vi ~/.psqlrc

\pset pager off

\set waits 'SELECT pg_stat_activity.procpid, pg_stat_activity.current_query, pg_stat_activity.waiting, now() - pg_stat_activity.query_start  as "totaltime", pg_stat_activity.backend_start FROM pg_stat_activity WHERE pg_stat_activity.current_query NOT LIKE \'%IDLE%\' AND pg_stat_activity.waiting = true;;'

\set locks 'select pid,mode,current_query from pg_locks,pg_stat_activity where granted=false and locktype=\'transactionid\' and pid=procpid order by pid,granted;;'

:wq!

Usage:

postgres=# :waits
 procpid |         current_query         | waiting |    totaltime    |          backend_start
---------+-------------------------------+---------+-----------------+----------------------------------
    9223 | insert into locks VALUES (1); | t       | 00:00:18.901773 | 2011-10-08 00:29:10.065186+05:30
(1 row)

postgres=# :locks
 pid  |   mode    |         current_query
------+-----------+-------------------------------
 9223 | ShareLock | insert into locks VALUES (1);
(1 row)

Wasn't it helpful? Enjoy... :). Will be back with some more stuff.

--Raghav

Wednesday, October 19, 2011

High Availability Clustering with PostgreSQL

Firstly, I should thank my company for giving me the opportunity to work mostly on PostgreSQL HA. I have worked with very good clients who have implemented clustering with PostgreSQL. So, this article gives a little idea of how HA clustering works with PostgreSQL.

PostgreSQL has built-in High Availability functionality such as Warm Standby, Hot Standby and Streaming Replication. But it is missing a few features, such as switchover/switchback and failover automation with minimal downtime, which companies mostly demand. Postgres community members are working on these demands aggressively, and I hope we will soon see a new PostgreSQL with all these features bundled. For now, let's look at clustering with PostgreSQL.

There are many detailed clustering architecture diagrams at the links I have shared below; what I have written here is just an overview.

What is High Availability clustering?

High availability clustering (HAC) is a feature that provides redundancy and fault tolerance. It is a number of connected devices processing and providing a service. HAC involves employing both hardware and software technologies, such as server redundancy (including application failover and server clustering), storage redundancy (including RAID and I/O multipathing), network redundancy and power system redundancy.

Its goal is to ensure the service is always available, even in the event of a failure. If one server fails, the other servers continue processing and take on the processing load of the failed server. HA cluster implementations use redundancy of cluster components to eliminate single points of failure.

Currently Available HA Products

There are many competitive high-availability software products in the market today; deciding which one to purchase can be tough. Following is the list of features you need to look for in any HAC product.

Clustering capability ( How many servers can be clustered together?)
Load-balancing capability
Intelligent monitoring
Centralized management capability
Application monitoring
Cost (Most importantly though :) )
Customer support (Most of the products do this)

I have seen two of them: one is Red Hat Cluster Suite (a commonly used HA package for the Linux operating system) and the other is SteelEye LifeKeeper.

http://www.redhat.com/rhel/add-ons/high_availability.html

http://www.ha-cc.org/high_availability/components/application_availability/cluster/high_availability_cluster/steeleye_lifekeeper/

Who needs it?

High availability clusters are often used by websites that serve 24x7x365 and cannot afford any downtime (e.g., Amazon.com, music websites, customer service sites), or by companies with critical databases.

How does it work?

You need a minimum of two nodes to start with HA. HA clusters usually use a private heartbeat network connection to monitor the health and status of each node in the cluster. If a node in the cluster goes down for any serious reason, the other node attempts to start the services and provide the same service.

Types of HAC:

Active/Passive: In this mode, one node is active (i.e., primary) and processing the service, while the other node is passive, meaning it is a standby and will only become active if the primary node fails.
Active/Active: In this mode, both nodes are active and processing the service, and traffic is load balanced between them. If one node fails, the other node takes the full processing load until the failed node becomes active again.

Note: Active/Active mode is not supported with PostgreSQL.

Heartbeat:

A heartbeat is a sensing mechanism that sends a signal to the primary node; if the primary node stops responding to the heartbeat for a predefined amount of time, a failover occurs automatically.
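
To make the "predefined amount of time" idea concrete, here is a small Python sketch. It is only an illustration with made-up names (HeartbeatMonitor, record_heartbeat, should_fail_over) and has nothing to do with how RHCS or LifeKeeper are actually implemented; the caller passes the current time in, which keeps the example deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat reply seen from the primary node."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = None  # time of the last reply, None if never seen

    def record_heartbeat(self, now):
        """Call this whenever the primary answers a heartbeat."""
        self.last_seen = now

    def should_fail_over(self, now):
        """True once the primary has been silent longer than the timeout."""
        if self.last_seen is None:
            # Never heard from the primary yet: nothing to fail over from.
            return False
        return (now - self.last_seen) > self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=10)
monitor.record_heartbeat(now=100)
print(monitor.should_fail_over(now=105))  # still within the timeout
print(monitor.should_fail_over(now=111))  # silent for more than 10 seconds
```

A real cluster manager also has to make sure the old primary is really down before the standby takes over, which this sketch leaves out.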

Failover Automation:

Automatic failover is the process of moving active services from the primary node to the standby node when the primary node fails. Usually the standby node continues to provide the services until the primary node is back up and running. When one device fails and another device takes over its work, the process is referred to as a failover.

Failover automation is usually implemented on hardware firewalls over networks. You need to configure the firewalls so that the standby node takes over in case the primary fails.

HAC support with PostgreSQL

Currently, RHCS and LifeKeeper support Active/Passive clustering with PostgreSQL. There is no Active/Active support for PostgreSQL yet. As I said, PostgreSQL has no built-in failover automation, and neither do third-party replication tools like Slony-I, Londiste, etc. To achieve it you may need to do some OS-level scripting or take the help of clustering software.

The link below will help you understand more about PostgreSQL clustering with RHCS, by Devrim Gunduz (Postgres community member).

http://wiki.postgresql.org/images/5/58/06.5_-_Devrim_Gunduz_-_PostgreSQLClusteringWithRedHatClusterSuite--LT.pdf

Setup service from EnterpriseDB on RHCS:

http://www.enterprisedb.com/services/packaged-services/high-availability

Do post your comments..

--Raghav

Replication in PostgreSQL 9.0

The word "Replication" means the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.

Replication is a very interesting subject in any database. Among open source RDBMSs, PostgreSQL has its own unique strengths for high availability. The latest PostgreSQL, 9.1, has built-in support for synchronous and asynchronous replication. The built-in asynchronous replication options are Warm Standby, Hot Standby and Streaming Replication; third-party tools include Slony, Londiste, Mammoth, etc. The chart below will help you understand the available synchronous and asynchronous replication options.

WAL Shipping (Hot Standby and Warm Standby):

PostgreSQL has the ability to ship WAL files to another server, i.e., a standby. The standby server runs in recovery mode, with the pg_standby utility applying the WAL files. The primary server generates archives (copies of WAL segments, usually 16 MB files) and sends them to one or more slaves, where they are applied by the pg_standby utility.

Warm Standby: The primary generates archives and feeds them to the slave. This is WAL shipping to the slave. The slave is in continuous recovery and not accessible for reads.
Hot Standby: Hot Standby is the name for the capability to run queries on a database that is currently performing archive recovery. With Hot Standby, slaves can be used for read-only access.

Disadvantages:

The slave applies WAL periodically, not continuously: only completed XLOG segments become available to the slave as WAL archives and get applied. So the lag is the unfilled WAL segment that has not been archived yet. Data loss can be up to one WAL segment (16 MB).

Trigger Based Replication :

In trigger-based replication, tools like Slony, Londiste and Mammoth use ON INSERT, ON UPDATE and ON DELETE triggers on tables to maintain replication between master and slave. The slave holds consistent snapshots.

Streaming Replication :

It is also called binary replication. XLOG records generated on the primary are shipped to the standby over the network. Lag in streaming replication is minimal, about a single transaction, depending on network speed and Hot Standby settings. Multiple slaves can be configured. Streaming replication comes with an additional process on each side: 'WAL sender' on the primary and 'WAL receiver' on the standby.

Advantages:

On a primary crash, the standby can be brought up in very little time.
The standby can be opened in READ ONLY mode.
It can be used as a reporting server.
Load balancing can be configured using pgpool-II between primary and standby.

Disadvantages:

The standby server should have the same amount of memory, disk, CPU, etc., because if the primary crashes the slave acts as primary.
Minimal lag (i.e., one transaction behind the primary).

Slony Replication:

Slony is an asynchronous trigger-based replication system. It is a single-master to multiple-slave replication system for PostgreSQL. Every table or sequence on the master is replicated to the slave via triggers. Updates are committed to one database and applied to the slave later as EVENTs. With Slony, switchover and switchback are possible.

Limitations of Slony-I

Tables must have a primary key or a unique key.
Only tables and sequences can be replicated.
Slave databases cannot be modified.

Advantages:

Slony-I supports switchback.
Using Slony-I, we can upgrade PG from one version to another with very little downtime.

Disadvantages:

Slony cannot detect network failures, so all the EVENTs created on the primary are queued and released once the network catches up.
No DDL changes are allowed on replicated tables while the Slony daemons are running.

Do post your comments, they will be highly appreciated.

--Raghav

Sunday, September 25, 2011

How to get Database Creation Time in PostgreSQL 9.0 ?

In PostgreSQL, the database creation time is not stored in any of the pg_catalog tables. So the question arises: how do we know when a database was created?

For every database, a directory named after the database OID is created under $PGDATA/base, containing a PG_VERSION file along with a set of OID, OID_fsm and OID_vm files for each object (tables, indexes, etc.).

The OID, OID_fsm and OID_vm files get updated as changes are made in the database. However, the PG_VERSION file is not updated by changes made to the database. So we are going to use the timestamp of the PG_VERSION file as the database creation time. I believe there is a chance of the PG_VERSION timestamp changing, but I am not sure in which cases this happens.

To get the timestamp of PG_VERSION, I need something that executes an OS command from inside the PG instance. So I used a pl/perlu function created by one of my colleagues, Vibhor Kumar.

http://vibhork.blogspot.com/2011/04/plperl-functions-for-getting-number-of.html

pl/perlu Function

CREATE OR REPLACE FUNCTION execute_shell(text) returns setof text
as
$$
$output=`$_[0] 2>&1`;
@output=split(/[\n\r]+/,$output);
foreach $out (@output)
{ return_next($out);
}
return undef;
$$ language plperlu;

And one function that looks up the database OID and builds the stat command for its PG_VERSION file.

CREATE OR REPLACE FUNCTION public.get_pg_version_loc(dbname varchar) RETURNS text AS
$body$
DECLARE
       dbname ALIAS FOR $1;
       data_dir text;
       db_oid text;
       os_execute text;
BEGIN
     SELECT INTO db_oid oid from pg_database where datname = dbname;
     show data_directory into data_dir;
     os_execute := 'stat -c "%y" '||data_dir||'/base/'||db_oid||'/PG_VERSION';
     return os_execute;
END;
$body$
LANGUAGE 'plpgsql';

Output:

=# select datname,execute_shell(get_pg_version_loc(datname::text)) as "DB_Createion_Time"
-# from pg_database where datname not in ('template0','template1');
   datname    |          DB_Createion_Time
--------------+-------------------------------------
 postgres     | 2011-01-10 21:48:37.222016571 +0530
 provider     | 2011-05-26 11:40:14.253434477 +0530
 pgbench_test | 2011-08-14 16:52:21.689198728 +0530
 pgpool       | 2011-08-26 12:30:19.864134713 +0530
(4 rows)
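
If you have shell access to the server, the same timestamp can be read without any database function. Here is a minimal Python sketch of that idea; pg_version_path and creation_time are names I made up, and the data directory and OID in the usage note are placeholders you would take from "show data_directory" and pg_database.

```python
import os
from datetime import datetime


def pg_version_path(data_dir, db_oid):
    """Path of the PG_VERSION file for one database under the data directory."""
    return os.path.join(data_dir, "base", str(db_oid), "PG_VERSION")


def creation_time(data_dir, db_oid):
    """Modification time of PG_VERSION, the same value stat -c "%y" prints."""
    mtime = os.stat(pg_version_path(data_dir, db_oid)).st_mtime
    return datetime.fromtimestamp(mtime)
```

For example, creation_time("/opt/PostgreSQL/9.0/data", 16384) would return the timestamp for the database with OID 16384, assuming that is where your cluster lives. The script must run as a user that can read the data directory.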

Will be back with more stuff :). Do post your comments if any, they will be highly appreciated.

--Raghav

Tuesday, August 16, 2011

Connection Pooling with Pgbouncer on PostgreSQL 9.0

Why do we go for connection pooling in PostgreSQL? When your application demands a very large number of concurrent connections, you need to consider it. A connection pool sits between your application and the database.

The idea behind a connection pool is that you keep just enough connections open to make use of all the available resources, and incoming requests re-use those connections: a connection is not dropped after a request but kept ready for the next one.

pgbouncer is a lightweight connection pooler. pgbouncer runs as a single process rather than spawning a process per connection, and relies on a library named libevent.

pgbouncer setup on PostgreSQL 9.0 is very simple; however, there is a small change with the latest version: you need to create the auth file manually. pgbouncer uses this file for user authentication. In versions earlier than PostgreSQL 9.0 you could find the pg_auth file under $PGDATA/global/pg_auth; in 9.0 that file has been removed, and the information is available only in the system catalog (pg_authid, or the pg_shadow view).
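
The re-use idea can be shown in a few lines of Python. This is a toy sketch, not how pgbouncer works internally: ConnectionPool, acquire, release and the fake connect function are all names invented for the illustration, and a real pool would hand out real database connections.

```python
import queue


class ConnectionPool:
    """Opens a fixed number of connections once and hands them out on demand."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            # Pay the connection cost up front, once per pooled connection.
            self._idle.put(connect())

    def acquire(self, timeout=None):
        """Take an idle connection; waits (up to timeout) if all are busy."""
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        """Give the connection back instead of closing it."""
        self._idle.put(conn)


opened = []


def fake_connect():
    # Stand-in for a real database connect call.
    opened.append("conn-%d" % (len(opened) + 1))
    return opened[-1]


pool = ConnectionPool(fake_connect, size=2)
first = pool.acquire()
pool.release(first)
again = pool.acquire()
print(len(opened))  # number of connections ever opened stays at the pool size
```

However many requests come in, the database only ever sees the pooled connections.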

pgbouncer Setup:

1. First, download the libevent library for pgbouncer.
Download link for libevent:
http://www.monkey.org/~provos/libevent-2.0.12-stable.tar.gz

tar -xvf libevent-2.0.12-stable.tar.gz
cd libevent-2.0.12-stable
./configure 
make
make install

2. Download the latest pgbouncer tarball and configure it for your PostgreSQL 9.0.
http://pgfoundry.org/frs/download.php/2912/pgbouncer-1.4.tgz

tar -xvf pgbouncer-1.4.tgz
cd pgbouncer-1.4
./configure --prefix=/opt/PostgreSQL/9.0/bin
make
make install

3. Create a libevent-i386.conf file in /etc/ld.so.conf.d directory

vi /etc/ld.so.conf.d/libevent-i386.conf
/usr/local/lib
:wq!

4. Run ldconfig to apply the new changes.

#ldconfig

5. Change the ownership of the pgbouncer binary under the PostgreSQL bin directory to the postgres user.

chown -R postgres:postgres /opt/PostgreSQL/9.0/bin/bin/pgbouncer

6. Create the pgbouncer_auth file for user authentication.

7. Create pgbouncer.ini file with postgres user permission under /etc directory.

8. Start pgbouncer

-bash-4.1$ ./pgbouncer -d /etc/pgbouncer.ini
2011-08-14 11:42:00.925 1949 LOG File descriptor limit: 1024 (H:1024), max_client_conn: 1000, max fds possible: 1010

9. Connect to the databases using pgbouncer

10. Getting help: connect to the pgbouncer database and ask for help.

$ psql -p 6432 -U postgres pgbouncer
pgbouncer=# show help;

For a better understanding of pg_auth, see the link below by 'depesz'.
http://www.depesz.com/index.php/2010/12/04/auto-refreshing-password-file-for-pgbouncer/
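
Since the auth file now has to be created by hand, here is a small Python sketch that builds one line of it. It assumes the usual pgbouncer auth file layout of one "username" "password" pair per line, with the password in PostgreSQL's md5 form ("md5" followed by the md5 of password plus username); auth_line is my own name and the credentials are dummies.

```python
import hashlib


def auth_line(user, password):
    """One line for the pgbouncer auth file, with an md5-style password."""
    # Same scheme PostgreSQL uses for md5 passwords: md5(password || username).
    digest = hashlib.md5((password + user).encode("utf-8")).hexdigest()
    return '"%s" "md5%s"' % (user, digest)


# Dummy credentials, for illustration only.
print(auth_line("postgres", "secret"))
```

Redirect the output into the file named by auth_file in pgbouncer.ini. The depesz post above shows how to keep such a file refreshed automatically from the catalog.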

Do post your comments which are highly appreciated.

--Raghav

Wednesday, August 10, 2011

pgmemcache vs Infinite Cache

In my recent post on pgmemcache, a couple of really interesting questions were asked, which made me work on this. Thank you for them :)

Questions:
1. Is pgmemcache application transparent ?
2. Is there any synchronization between memcached and PostgreSQL Shared buffers ?

Answer:

pgmemcache (memcached) is not application transparent; you need to make changes in the application to push data into or retrieve data from the cache.

EnterpriseDB's product Postgres Plus Advanced Server includes a feature called Infinite Cache, which is based on memcached, the production-proven open source distributed object cache.

About EnterpriseDB: the Enterprise PostgreSQL Company provides enterprise-class PostgreSQL products based on the world's most advanced open source database. The company's Postgres Plus products are ideally suited for transaction-intensive applications requiring superior performance, massive scalability and compatibility with proprietary database products.

Overview

The diagram above helps to understand the architecture of pgmemcache vs Infinite Cache. With Infinite Cache, pages are first searched for in shared_buffers and then in Infinite Cache. Synchronization between the shared buffer cache and Infinite Cache is what makes it application transparent, which is not the case with pgmemcache.

Infinite Cache is faster and completely application transparent. No special code is needed from developers. It warms up the cache with multiple parallel processes and pre-loads the cache at startup, reducing warming time.

To use Infinite Cache you have to download Postgres Plus Advanced Server, the Oracle-compatible product that comes bundled with Infinite Cache.
Download Link:
http://www.enterprisedb.com/downloads/postgres-postgresql-downloads

Implementing Infinite Cache is as simple as memcached; the link below will help in setting it up.

http://www.enterprisedb.com/docs/en/8.4/perf/Postgres_Plus_Advanced_Server_Performance_Guide-04.htm

Very informative discussion on PostgreSQL Community Forum:-

http://archives.postgresql.org/pgsql-performance/2011-07/msg00001.php

--Raghav

Sunday, July 24, 2011

pgmemcache Setup and Usage

Preloading or caching a table in PostgreSQL is a tough task, because PostgreSQL doesn't offer a single big synchronized memory management layer. All the memory areas are independent. Caching is possible with third-party tools like memcached.

pgmemcache is a set of PostgreSQL user-defined functions (APIs) that provide an interface to memcached. pgmemcache requires libmemcached as a prerequisite; it is also recommended to install memcached along with it. This post covers installation, caching and monitoring using the pgmemcache APIs. As I am not a developer or hacker :), my implementation is a very simple one.

Points:

Values are stored in the cache as key/value pairs, so keeping tables with a primary key/unique key is recommended.
No data redundancy: if memcached goes down or runs out of space, new records and updates will be lost.
Supports all memcached commands (set/get(single/multi)/delete/replace/incr/stats).
If you drop a table from the backend after putting its data into memcached, memcached won't throw any errors. How you maintain it is entirely up to you.
No ability to iterate over the data or determine what keys have been stored.
You can never bring a memcached server down or add a new one to the pool while people are connected.
If the background updating process stops for any reason, updates do not occur and there is a possibility that the memcached server could fill up.
Every PostgreSQL backend has to bind to the memcached port before accessing the data.
Memcached runs on default port 11211.
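
To show why a primary key matters for key/value caching, here is a toy Python sketch. It uses a plain in-process dictionary as a stand-in for memcached and does not call pgmemcache at all; TinyCache, cache_key and fetch_row are names invented for the illustration.

```python
class TinyCache:
    """In-memory stand-in for a key/value cache such as memcached."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        # Returns None on a miss, like a cache lookup that finds nothing.
        return self._data.get(key)


def cache_key(table, pk):
    """Build a unique key from the table name and the row's primary key."""
    return "%s:%s" % (table, pk)


def fetch_row(cache, table, pk, load_from_db):
    """Look in the cache first; on a miss, load the row and cache it."""
    key = cache_key(table, pk)
    row = cache.get(key)
    if row is None:
        row = load_from_db(pk)
        cache.set(key, row)
    return row


cache = TinyCache()
print(fetch_row(cache, "emp", 1, lambda pk: {"id": pk, "name": "raghav"}))
```

Without a unique key per row there would be nothing stable to look the value up by, and since the cache cannot be iterated, a lost key means a lost entry.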

Pre-requisites:

PostgreSQL 8.4 or above
libevent
memcached
libmemcached
pgmemcache
Monitoring tools (memcached-tool, damemtop, etc.)

Installation:
Step 1. (libevent)

The libevent API is important when configuring pgmemcache, and I prefer to install libraries as the first step. So let's start by configuring the libevent library in its default location.

Download link for libevent:
http://www.monkey.org/~provos/libevent-2.0.12-stable.tar.gz
tar -xvf libevent-2.0.12-stable.tar.gz
cd libevent-2.0.12-stable
./configure 
make
make install

Step 2 (memcached)

Install memcached with libevent enabled.

Download link for memcached:
http://memcached.googlecode.com/files/memcached-1.4.6.tar.gz
cd /usr/local/src/memcached-1.4.6
------on 32-bit
export LD_LIBRARY_PATH=/usr/lib:/opt/PostgreSQL/9.0/lib:$LD_LIBRARY_PATH
./configure --prefix=/opt/PostgreSQL/9.0/bin/ --with-libevent=/usr/lib
------on 64-bit
export LD_LIBRARY_PATH=/usr/lib64:/opt/PostgreSQL/9.0/lib:$LD_LIBRARY_PATH
./configure --prefix=/opt/PostgreSQL/9.0/bin/ --with-libevent=/usr/lib64
make
make install

Step 3. (libmemcached)

pgmemcache is built on top of libmemcached. libmemcached looks for the memcached binary location, so set the path to the memcached binaries before proceeding.

export PATH=/opt/PostgreSQL/9.0/bin/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib:/opt/PostgreSQL/9.0/lib:$LD_LIBRARY_PATH
Download link:
http://launchpad.net/libmemcached/1.0/0.50/+download/libmemcached-0.50.tar.gz
cd libmemcached-0.50
./configure
make
make install

Step 4 (pgmemcache)

The pgmemcache API helps in interacting with memcached, i.e., caching and retrieving data.

Download link:
http://pgfoundry.org/frs/download.php/3018/pgmemcache_2.0.6.tar.bz2
cd pgmemcache
PATH=/opt/PostgreSQL/9.0/bin:$PATH make USE_PGXS=1 install
or 
make
make install

Installation creates a pgmemcache.sql file, containing all the APIs to interact with memcached, under the PG contrib location. To create the pgmemcache APIs, just execute the pgmemcache.sql file in every database where you need them.

psql -p PGPORT -d PGDATABASE -f /opt/PostgreSQL/9.0/share/postgresql/contrib/pgmemcache.sql

pgmemcache API's list:

Note: While executing the .sql file you may face an error like "FATAL: could not load library "/opt/PostgreSQL/9.0/lib/postgresql/pgmemcache.so": libmemcached.so.8: cannot open shared object file: No such file or directory". This means the PG instance was not started with the newly created library on its path. To resolve it, set PATH and LD_LIBRARY_PATH and restart the instance so it recognizes the libraries.

Eg:-
export PATH=/opt/PostgreSQL/9.0/bin/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib:/opt/PostgreSQL/9.0/lib:$LD_LIBRARY_PATH
$pg_ctl -D $PGDATA restart

If you want pgmemcache loaded by default in your PG instance, edit the postgresql.conf file, change the following parameters and restart the cluster.

shared_preload_libraries='pgmemcache'
custom_variable_classes='pgmemcache'

Configuration:
Step 1.

To cache data, you first need to initialize the memory; once the memory is allotted, it is the PG backends' responsibility to bind and push data into the cache. Here, I have started memcached on localhost with 512 MB on the default port 11211. -d means start as a daemon. All my exercises are on localhost.

$./memcached -d -m 512 -u postgres -l localhost -p 11211

Note: To retrieve data from the cache, every PostgreSQL backend must first bind to memcached.

Step 2.

Bind the instance to the running memcached port. After binding, check the memcached statistics.

Step 3.

Now it's time to cache data into memcached. Memcached uses key/value pairs to hold data in its memory, so make sure your table has a primary/unique key so that retrieval is easy. As mentioned, there are very good APIs to play around with for storing values and accessing them; in my example, I use memcache_set() to store a value and memcache_get() to retrieve it.

Once the value is set in memcached, it is your responsibility to bind your backend to memcached, and with the help of the pgmemcache APIs you can access the data. Each Postgres backend must bind before accessing it. Please find the example below.

Getting data from cache

Monitoring

If you are very good with Linux you can pull a lot of information on memcached memory yourself; however, a few tools come along with the memcached source package, such as memcached-tool, damemtop, etc. I am using the memcached-tool utility for monitoring memcached.

usage:-
memcached-tool localhost display
memcached-tool localhost dump
memcached-tool localhost stats | grep bytes

Example:

A small effort from my side to set up pgmemcache and understand the basics. Hope it was helpful. Keep posting your comments or suggestions, which are highly appreciated.

--Raghav

Wednesday, June 15, 2011

PostgreSQL Upgradation

Moving from one version to another on a new server is always a challenging task. This post shows how to upgrade the old PostgreSQL 8.3 to PostgreSQL 9.0.4 on a NEW SERVER. Basically, the upgrade steps are very simple, but you need to take some extra care when bouncing the new server before and after the restore. The latest PostgreSQL has a lot of fixes in it, so it is recommended to use the new binaries for the entire upgrade process.

Step 1 (On New Server PG 9.0.x):

The first step is to set SHMMAX/SHMALL at the OS level, because how large you can set shared_buffers depends on these settings. The script below tells you what SHMMAX/SHMALL should be on the basis of the server's memory. I have taken the script, written by Greg Smith, which is very handy for setting SHMMAX/SHMALL.

For a better understanding of shmmax/shmall, here is the link
http://www.puschitz.com/TuningLinuxForOracle.shtml#SettingSharedMemory

vi shmsetup.sh
#!/bin/bash
# Output lines suitable for sysctl configuration based 
# on total amount of RAM on the system.  The output 
# will allow up to 50% of physical memory to be allocated 
# into shared memory.
# On Linux, you can use it as follows (as root): 
# 
# ./shmsetup >> /etc/sysctl.conf 
# sysctl -p
#
# Early FreeBSD versions do not support the sysconf interface 
# used here.  The exact version where this works hasn't 
# been confirmed yet.

page_size=`getconf PAGE_SIZE` 
phys_pages=`getconf _PHYS_PAGES`

if [ -z "$page_size" ]; then 
  echo Error:  cannot determine page size 
  exit 1 
fi

if [ -z "$phys_pages" ]; then 
  echo Error:  cannot determine number of memory pages 
  exit 2 
fi

shmall=`expr $phys_pages / 2` 
shmmax=`expr $shmall \* $page_size` 

echo \# Maximum shared segment size in bytes 
echo kernel.shmmax = $shmmax 
echo \# Maximum total amount of shared memory in pages 
echo kernel.shmall = $shmall
:wq

Execute the script :
#chmod +x shmsetup.sh
#./shmsetup.sh

Step 2 (On New Server PG 9.0.x):

Install the latest version, PostgreSQL 9.0.4, on the new server. Links below for 32-bit and 64-bit:
http://get.enterprisedb.com/postgresql/postgresql-9.0.4-1-linux.bin
http://get.enterprisedb.com/postgresql/postgresql-9.0.4-1-linux-x64.bin

In most cases it is recommended to keep "pg_xlog" on a different mount point. You can create a new cluster with the initdb command and choose a different "pg_xlog" location:-

$initdb -D DATA_DIRECTORY -X PG_XLOG_LOCATION

Note: initdb command will not create the 'pg_log' directory under new cluster, you need to create it explicitly.

After installing and creating the cluster, set environment variables such as PGDATA, PATH, PGDATABASE, PGPORT and PGUSER in ".bash_profile" for the postgres user.

Step 3 (On Old Server PG 8.3.x):

As I said, use the new binaries for all the commands you run on this server. If you don't have the new binaries on this server, install a copy from source into a new location, without overwriting the existing binaries.

Download:-
http://wwwmaster.postgresql.org/redir/198/h/source/v9.0.4/postgresql-9.0.4.tar.gz

#tar xvf postgresql-9.0.4.tar.gz
#cd postgresql-9.0.4
#./configure --prefix=/usr/pg904
#make
#make install
New binaries location will be "/usr/pg904/"

Step 4 (On Old Server PG 8.3.x):

The initial step is to take a dump of the global objects, such as users and tablespaces, using pg_dumpall.

$ /usr/pg904/bin/pg_dumpall -p $PGPORT -g > /pg83_backups/global_dump.sql

Step 5 (On Old Server PG 8.3.x):

Take a dump of every database in the cluster using the command below. Also write a log for each dump, so that any issue with the dumps can be analyzed.

$ /usr/pg904/bin/pg_dump -Fc -v -U PGUSER -p PGPORT -f /pg83_backups/dbname.dmp DBNAME >> /pg83_backups/dbname.log 2>>/pg83_backups/dbname.log

If the database is big, run it under nohup:

$ nohup /usr/pg904/bin/pg_dump -Fc -v -U PGUSER -p PGPORT -f /pg83_backups/dbname.dmp DBNAME >> /pg83_backups/dbname.log 2>>/pg83_backups/dbname.log &
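With many databases, typing the Step 5 command once per database gets tedious. Here is a minimal sketch that only prints one pg_dump command per database name, so you can review the list before running anything. The function name dump_cmd and the variables PG_BIN and BACKUP_DIR are my own assumptions for illustration; the paths are the ones used in this post.

```shell
#!/bin/bash
# Sketch: build one pg_dump command per database, following Step 5.
# PG_BIN and BACKUP_DIR are illustrative names; defaults mirror this post.
PG_BIN=${PG_BIN:-/usr/pg904/bin}
BACKUP_DIR=${BACKUP_DIR:-/pg83_backups}

# Print the pg_dump command line for one database (does not run it).
dump_cmd() {
  local db="$1"
  echo "$PG_BIN/pg_dump -Fc -v -f $BACKUP_DIR/$db.dmp $db"
}

# Print a command, with its log redirection, for every database name
# passed as an argument to this script.
for db in "$@"; do
  echo "$(dump_cmd "$db") >> $BACKUP_DIR/$db.log 2>&1"
done
```

The database names can come from: psql -At -c "select datname from pg_database where not datistemplate". Once the printed commands look right, pipe them to sh to execute them.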

Step 6 (On Old Server PG 8.3.x):

Move all the dumps (/pg83_backups) to the new server.

Step 7 (On New Server PG 9.0.x):

As per Step 2, the new server has the latest PG 9.0.4 binaries and a cluster. To speed up the restoration we need to tune some settings in the $PGDATA/postgresql.conf file before the restore, and set them back afterwards.

Settings in postgresql.conf before restoration (memory settings may differ as per the available RAM on the box):-

Memory Settings:
---------------
shared_buffers= (as per the shmmax settings, Maximum 8 gigs on 64 bit, 4 gigs on 32 bit)
work_mem= (in between 40MB - 100 MB)
maintenance_work_mem = (in between 1GB - 5 GB)

Checkpoints Settings:
--------------------
checkpoint_segments=(in between 128 - 256)
checkpoint_timeout=(default is 5min, raise to 1h)

Autovacuum settings:
-------------------
autovacuum=off
track_counts=off

Sync to Disk:
------------
fsync=off
full_page_writes=off
synchronous_commit=off

Background Writer settings:
--------------------------
bgwriter_delay=(default 200ms, change to 50ms)

These changes require a restart of the cluster.

$pg_ctl -D $PGDATA restart
or
$pg_ctl -D $PGDATA stop -m f
$pg_ctl -D $PGDATA start

Step 8 (On New Server PG 9.0.x):

The first thing to restore is the global objects.

$PGPATH/psql -d DBNAME -p $PGPORT -U $PGUSER -f /pg83_backups/global_dump.sql

Step 9 (On New Server PG 9.0.x):

Databases can be restored in parallel: from PG 8.4 onwards, pg_restore has a -j option that opens multiple connections to PostgreSQL and speeds up the restoration.

http://www.postgresql.org/docs/current/static/app-pgrestore.html

The value for -j depends on the number of CPUs the NEW server has; for example, with 4 cores I can go with -j 4, so that each core serves one pg_restore process. Set this option according to your CPU cores; you can get the number of processors with this command:
$ grep -c ^processor /proc/cpuinfo
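To avoid hard-coding the -j value, it can be derived from the CPU count. A small sketch, assuming a system where getconf _NPROCESSORS_ONLN is available (it is on Linux); the function name restore_jobs is my own, and it falls back to 1 when the count cannot be read.

```shell
#!/bin/bash
# Sketch: choose a pg_restore -j value from the number of online CPUs.
restore_jobs() {
  local n
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null)
  case "$n" in
    ''|*[!0-9]*) n=1 ;;   # empty or not a number: be conservative
  esac
  [ "$n" -ge 1 ] || n=1   # never return zero jobs
  echo "$n"
}

# Show the option as it would appear on the pg_restore command line.
echo "pg_restore -j $(restore_jobs) ..."
```

Note that -j works only with custom format dumps (-Fc), which is what Step 5 produces.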

Start restoring each database from its dump on the new server; if the database is big, run it under nohup. Also write restore logs for further analysis of the restoration.

$PGPATH/pg_restore -d DBNAME -Fc -v -p $PGPORT -U PGUSER /pg83_backups/dbname.dmp >>/pg83_backups/restore_dbname.log 2>>/pg83_backups/restore_dbname.log 
or
nohup $PGPATH/pg_restore -d DBNAME -Fc -v -p $PGPORT -U PGUSER /pg83_backups/dbname.dmp >>/pg83_backups/restore_dbname.log 2>>/pg83_backups/restore_dbname.log &

While the restoration is in progress, you can monitor it in two ways: at the OS level using "top -cu postgres" or "ps -ef | grep postgres", and at the DB level using "select * from pg_stat_activity".

Step 10 (On New Server PG 9.0.x):

This is an important step: after a successful restoration, it is recommended to update the catalog statistics with the ANALYZE command.

$ $PGPATH/vacuumdb -p $PGPORT -a -Z -v >>/pg83_backups/analyze.log 2>>/pg83_backups/analyze.log 
or
$ nohup $PGPATH/vacuumdb -p $PGPORT -a -Z -v >>/pg83_backups/analyze.log 2>>/pg83_backups/analyze.log &

Step 11 (On New Server PG 9.0.x):

After ANALYZE, change the settings back to normal, or to whatever the application demands, by editing the $PGDATA/postgresql.conf file.

Memory Settings:
---------------
shared_buffers= (as per the shmmax settings, Maximum 8 gigs on 64 bit, 4 gigs on 32 bit)
work_mem= (in between 5MB - 40MB)
maintenance_work_mem = (in between 1GB - 2GB)

Checkpoints Settings:
--------------------
checkpoint_segments=(in between 64 - 128)
checkpoint_timeout=(default)

Autovacuum settings:
-------------------
autovacuum=on
track_counts=on

Sync to Disk:
------------
fsync=on
full_page_writes=on
synchronous_commit=on

Background Writer settings:
--------------------------
bgwriter_delay=(50ms)

Step 12 (On New Server PG 9.0.x):

After the above changes restart the cluster.

$pg_ctl -D $PGDATA restart
or
$pg_ctl -D $PGDATA stop -m f
$pg_ctl -D $PGDATA start

You also need to make some changes in the $PGDATA/pg_hba.conf file to allow application connections. Always keep a copy of the old $PGDATA/*.conf files (PG 8.3.x) on the new server for reference when changing the .conf files.

Do post your comments or suggestions, which are greatly appreciated.

Regards
Raghav

Monday, May 30, 2011

PostgreSQL 9.0 Streaming Replication on Windows

A major milestone in PostgreSQL 9.0 is Streaming Replication (including DDL). Many of you have configured SR on Linux, but here I present SR on the Windows platform. The PostgreSQL wiki is the best guide for setting up Streaming Replication.

To set up SR on Windows, I recommend following the PostgreSQL wiki steps with the minor changes needed for the Windows platform. In this post I show only the changes you have to look out for on Windows.
http://wiki.postgresql.org/wiki/Streaming_Replication

Step 1. (Before configuring SR, add the port)

On the primary, you need to open the port that accepts connections. The link below explains how to add a port.
http://support.microsoft.com/kb/842242

Note: The way to add a port differs between Windows platforms.

Step 2. (Before configuring SR, Create common mount point for Archives)

Create one common mount point where the primary and standby write/read the archives. The postgres user must have permissions on the mount point. My common mount point: '10.10.101.111'

Step 3.

On the primary, make these changes in postgresql.conf.

wal_level = hot_standby 
archive_mode = on 
archive_command = 'copy %p  \\\\10.10.101.111\\pg\\WAL_Archive\\%f'
max_wal_senders = 5
wal_keep_segments = 32

Step 4.

On Standby,

1) Edit the postgresql.conf file and change the below parameters.
        listen_addresses='*'
        hot_standby = on

2) Add the primary server entry in pg_hba.conf
        host    replication   postgres  primary.IP.address/22  trust

3) Create recovery.conf
        standby_mode = 'on'
        primary_conninfo = 'host=10.10.101.111 port=5432 user=postgres'
        trigger_file = 'C:\\stopreplication\\standby.txt'
        restore_command = 'copy \\\\10.10.101.111\\pg\\WAL_Archive\\%f %p'
     
Note: Create the recovery.conf file by copying any of the .conf files from the /data_directory.

The steps above are the only changes you need to take care of when setting up SR on Windows; for everything else, follow the procedure on the PostgreSQL wiki.

Regards
Raghav

Monday, May 2, 2011

PostgreSQL 9.0 Backup & Recovery

In PostgreSQL, Backup & Recovery are very user friendly compared with other databases. Many won't agree with this; OK, let's not get into a debate. Coming to backups, PostgreSQL doesn't support INCREMENTAL BACKUP; however, there are very consistent backup tools and OS-level workarounds to achieve this goal.

My pictorial presentation on PostgreSQL Backup and Recovery gives a complete conceptual idea. Looking at the diagram, you can make out which backups can be used to restore or recover.

Logical Backup

The pg_dump, pg_restore and pg_dumpall utilities are used for logical backups. pg_dump and pg_restore help with backups at the database, schema and table level. pg_dumpall is used for a cluster-level dump.

pg_dump supports three formats: plain SQL, custom and tar. Custom and tar format dumps are restored with the pg_restore utility, whereas plain SQL dumps are restored with the psql utility.

Below are examples for each backup format, with the related restore commands.

Note: Set the defaults for PGDATABASE, PGUSER, PGPASSWORD and PGPORT in .bash_profile(Environment Variables in Windows)

Plain SQL Format Dump and Restore

$ pg_dump -U username -Fp dbname  >  filename
or
$ pg_dump -U username dbname -f  filename
or
$ pg_dump -Fp -U username dbname -f  filename

For restoring use psql command

$ psql -U username -f filename dbname
or
postgres=# \i SQL-file-name     //in psql terminal with \i option

Custom Format

$ pg_dump -Fc dbname -f filename
$ pg_restore -Fc -U username -d dbname filename.dmp

Tar Format

$ pg_dump -Ft dbname -f filename
$ pg_restore -U username -d dbname filename
or
$ pg_restore tar-file.tar | psql -U username dbname

Note: Schema Level and Tables Level dumps can be performed in the same way by adding related options.

Cluster Level Dump:

$pg_dumpall -p portnumber > filename

For restoring use psql command

$ psql -f filename

There are very good methodologies for taking dumps and restoring them. In particular, the book "PostgreSQL 9 Administration Cookbook - 2010" by Simon Riggs and Hannu Krosing of 2ndQuadrant (www.2ndQuadrant.com) is a good way to start with PostgreSQL Backup and Recovery.

Physical Backup (File system Backup)

Cold Backup:

A cold backup is a simple file system backup of the data directory taken while the Postgres instance is down; that is, to get a self-consistent data directory backup, the database server should be shut down before copying. PostgreSQL gives the flexibility to keep pg_xlog and pg_tblspc on different mount points via soft links. When copying the data directory, make sure the data behind those soft links is copied as well. The basic copy commands are:

tar czf backup.tar.gz $PGDATA
or
cp -r $PGDATA /backup/
or
rsync -a $PGDATA /wherever/data

Hot Backup (Online Backup):

In a hot backup, the cluster is up and running and the database must be in archive-log mode. Two system functions, pg_start_backup() and pg_stop_backup(), notify the instance that the hot backup is starting and stopping. Before going forward with the online backup, let's discuss the database archive-log mode, which is mandatory for online backups.

Enabling WAL Archiving:

Upcoming posts of mine will cover PITR, tuning WAL and so on; for now we look at WAL archiving. In a PostgreSQL database system, every database write is first recorded in an additional file on disk called the write-ahead log (WAL). It contains a record of the writes made in the database system. In the case of a crash, the database can be repaired/recovered from these records.

Normally, at regular intervals (called checkpoints) the write-ahead log is matched against the database, and segments that are no longer required are then deleted. You can also use the WAL as a backup, because it holds a record of all writes made to the database.

Concept of WAL Archiving:

The write-ahead log is made up of files of 16 MB each, called segments. The WAL files reside in the pg_xlog directory, a subdirectory of the data directory. The PostgreSQL instance names the files numerically, in ascending order. To perform a backup on the basis of WAL, one needs a base backup, that is, a complete backup of the data directory, plus the WAL segments between the base backup and the current date.

Archiving of WAL segments is configured with two parameters in postgresql.conf, archive_mode and archive_command. In 9.0, wal_level must also be set to archive or hot_standby. Putting the cluster into archive-log mode requires a RESTART.

archive_mode = on/off (boolean parameter)
archive_command = 'cp -i %p /Archive/Location/%f'

Note: %p is replaced by the path of the file to archive, and %f by the file name only, without a directory.
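A plain cp in archive_command can silently overwrite a segment that is already archived. Here is a minimal sketch of a safer copy step, assuming you are free to wrap it in your own script; the function name archive_wal and its argument order are my own choices, not anything PostgreSQL ships.

```shell
#!/bin/bash
# Sketch of a safer archive step: copy a WAL segment only if the
# destination does not already hold a file of that name.
# Usage: archive_wal SRC_PATH DEST_DIR FILE_NAME  (roughly %p, a directory, %f)
archive_wal() {
  local src="$1" dest_dir="$2" name="$3"
  if [ -e "$dest_dir/$name" ]; then
    echo "refusing to overwrite $dest_dir/$name" >&2
    return 1            # non-zero status: PostgreSQL will retry later
  fi
  cp "$src" "$dest_dir/$name"
}
```

To use it from archive_command, save it as a script that ends with a call to archive_wal "$@" and point the parameter at it, for example '/path/to/archive_wal.sh %p /Archive/Location %f' (the script path is hypothetical). PostgreSQL treats a zero exit status as a successful archive, so the function returns non-zero whenever the copy did not happen.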

For further information about the Archiver Process, refer to the post PostgreSQL 9.0 Memory & Processess.

Online Backup :

To take online backup:

Step 1 : Issue pg_start_backup('label') in the psql terminal
postgres=# select pg_start_backup('fb');
Step 2 : OS level copy the $PGDATA directory to any Backup Location
$ cp -r $PGDATA  /anylocation
Step 3 : Issue pg_stop_backup() in psql terminal.
postgres=# select pg_stop_backup();

Note: It is not necessary that these two functions should run in the same database connection. The backup mode is global and persistent.
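The three steps above can be put into one small script. This is only a sketch under a few assumptions: psql connects using the PG* environment variables set earlier, $PGDATA points at the cluster, and the function name online_backup is mine. Its one design point is that pg_stop_backup() is still called when the copy fails, so the cluster is not left in backup mode.

```shell
#!/bin/bash
# Sketch: run the three online-backup steps in order, and always call
# pg_stop_backup() once pg_start_backup() has succeeded.
# Usage: online_backup DEST [LABEL]
online_backup() {
  local dest="$1" label="${2:-fb}" rc=0
  psql -c "select pg_start_backup('$label');" || return 1
  cp -r "$PGDATA" "$dest" || rc=$?       # remember a failed copy
  psql -c "select pg_stop_backup();" || rc=1
  return "$rc"
}
```

For example, online_backup /anylocation nightly copies the data directory to /anylocation with the label 'nightly'.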

In PostgreSQL, there is no catalog that stores the start and stop time of an online backup. However, while an online backup is in progress, a couple of files are created and deleted.

pg_start_backup('label') and pg_stop_backup() are the two system functions used to perform the online backup. pg_start_backup('label') creates a file named backup_label under the $PGDATA directory, and pg_stop_backup() creates a file named 'wal-segment-number.backup' under $PGDATA/pg_xlog. The backup_label file gives the start time and the checkpoint location of the WAL segment; it also tells the PostgreSQL instance that the cluster is in BACKUP MODE. The 'wal-segment-number.backup' file under the $PGDATA/pg_xlog directory describes the start and stop time and the checkpoint location, along with the WAL segment number.

Note: After pg_stop_backup(), backup_label file is deleted by the PostgreSQL instance.

Do post your comments, suggestions.

--Raghav

Friday, April 29, 2011

ERROR: could not load library "/opt/PostgreSQL/9.0/lib/postgresql/plperl.so": libperl.so:

The above error was thrown when I tried to create the language plperlu. It indicates that the library libperl.so is missing. Below are the steps I performed to overcome this issue and successfully create the plperlu language in PostgreSQL.

Method 1 (Finding the libperl.so and making softlink to that location)

postgres=# create LANGUAGE plperlu;
ERROR:  could not load library "/opt/PostgreSQL/9.0/lib/postgresql/plperl.so": libperl.so: cannot open shared object file: No such file or directory
postgres=# \q

Now, search for the file libperl.so on your box and add its location to your LD_LIBRARY_PATH. Don't forget to switch user to postgres.

[root@localhost /]# find -name libperl.so
./usr/lib/perl5/5.10.0/i386-linux-thread-multi/CORE/libperl.so
./usr/lib/perl5/CORE/libperl.so
./usr/lib/perl5/5.10.0/i386-linux-thread-multi/CORE

[root@localhost /]# su - postgres

Set the LD_LIBRARY_PATH

-bash-4.1$ export LD_LIBRARY_PATH=/usr/lib/perl5/5.10.0/i386-linux-thread-multi/CORE:$LD_LIBRARY_PATH

Restart the cluster to effect the changes.

-bash-4.1$ pg_ctl restart
waiting for server to shut down....... done
server stopped
server starting

You are now ready to create the language.

-bash-4.1$ psql
psql (9.0.1)
Type "help" for help.

postgres=# create LANGUAGE plperlu;
CREATE LANGUAGE

Method 2 (By installing the latest ActivePerl)

Step 1:
After installing ActivePerl, set the LD_LIBRARY_PATH.

postgres@localhost:/opt/ActivePerl-5.12/lib/CORE> export LD_LIBRARY_PATH=/opt/ActivePerl-5.12/lib/CORE:/opt/PostgreSQL/9.0/lib/postgresql

Step 2:

postgres@localhost:/opt/PostgreSQL/9.0/lib/postgresql> ldd plperl.so 
       linux-vdso.so.1 =>  (0x00007fff8cf79000)
       libperl.so => /opt/ActivePerl-5.12/lib/CORE/libperl.so (0x00007fbd3d654000)
       libnsl.so.1 => /lib64/libnsl.so.1 (0x00007fbd3d40a000)
       libdl.so.2 => /lib64/libdl.so.2 (0x00007fbd3d206000)
       libm.so.6 => /lib64/libm.so.6 (0x00007fbd3cfb0000)
       libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fbd3cd74000)
       libutil.so.1 => /lib64/libutil.so.1 (0x00007fbd3cb71000)
       libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fbd3c954000)
       libc.so.6 => /lib64/libc.so.6 (0x00007fbd3c5f5000)
       /lib64/ld-linux-x86-64.so.2 (0x00007fbd3dad8000)
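In the ldd output above every library is resolved. When one is missing, ldd prints "not found" for it instead of a path. A tiny sketch that picks out just those names (the function name missing_libs is my own); it reads ldd output on stdin, so it can be tried on saved output as well.

```shell
#!/bin/bash
# Sketch: print the names of libraries that ldd reports as "not found".
# Reads ldd output on stdin, one library per line.
missing_libs() {
  awk '/not found/ { print $1 }'
}
```

Usage: ldd /opt/PostgreSQL/9.0/lib/postgresql/plperl.so | missing_libs. Empty output means every dependency was resolved with the current LD_LIBRARY_PATH.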

Step 3:

postgres@localhost:/opt/PostgreSQL/9.0/lib/postgresql> pg_ctl restart

postgres@localhost:/opt/PostgreSQL/9.0/lib/postgresql> psql
psql (9.0.2)
Type "help" for help.

postgres=# create language plperlu;
CREATE LANGUAGE

Post your comments.

--Raghav

Wednesday, April 27, 2011

Size of Partition Table in PostgreSQL 9.0

In PostgreSQL, every table is an object, and pg_relation_size('object_name') gives the size of that object. If you pass a partitioned (parent) table as 'object_name', it gives only the size of that object, not the sizes of its child tables.

Check out the example given below.

postgres=# \dt+
                          List of relations
 Schema |     Name      | Type  |  Owner   |    Size    | Description
--------+---------------+-------+----------+------------+-------------
 public | child1        | table | postgres | 8192 bytes |
 public | child2        | table | postgres | 8192 bytes |
 public | parent        | table | postgres | 0 bytes    |
(3 rows)

pg_relation_size() on the parent table returns only the parent's own size, not the total.

postgres=# select pg_size_pretty(pg_relation_size('parent'));
 pg_size_pretty
----------------
 0 bytes
(1 row)

To get the size of a partitioned table, first find its child tables and their sizes. The pg_inherits catalog table gives the child tables; sum their sizes to get the total. I have tried writing a small function using pg_inherits to get it done.

CREATE OR REPLACE FUNCTION pg_partition_table_size(text) returns numeric as
$$
-- sum the raw byte sizes of all direct children of the given parent
select sum(pg_relation_size(inhrelid::regclass)) from pg_inherits where inhparent=$1::regclass;
$$ language sql;

Now, send the partition table to the function.

postgres=# select pg_partition_table_size('parent');
 pg_partition_table_size
-------------------------
                   16384
(1 row)

Isn't that useful? Do post your comments, they will be greatly appreciated.

--Raghav

Monday, April 25, 2011

PostgreSQL 9.0 Memory & Processes

Going forward with PostgreSQL architecture, here I discuss the utility processes and memory, with informative links. Many of the committers have already documented the processes and memory insightfully; links to those are provided here. This is a modest presentation from my end about the PostgreSQL utility processes.

At every PostgreSQL instance startup, a set of utility processes (mandatory and optional) and memory is started. There are two mandatory processes (BGWRITER and WAL Writer) and four optional processes (autovacuum launcher, stats collector, syslogger and archiver). You can check them with the command 'ps -ef | grep postgres', shown below in figure 10.1.

Figure 10.1
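To pull just the utility process names out of the ps listing, a small filter helps. This is a sketch that assumes the 9.0-style process titles, such as "postgres: writer process" and "postgres: wal writer process"; the function name utility_processes is my own. Client backends do not end in "process", so they are skipped.

```shell
#!/bin/bash
# Sketch: list utility process names from "ps -ef" output on stdin.
# Assumes titles of the form "postgres: <name> process".
utility_processes() {
  sed -n 's/.*postgres: \(.*\) process.*/\1/p'
}
```

Usage: ps -ef | grep postgres | utility_processes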

Overview of the Process and memory.

Figure 10.2

Above figure 10.2 shows the processes attached to the PostgreSQL Shared memory.

BGWriter/Writer Process:

The BGWRITER or WRITER process is a mandatory process.

All PostgreSQL server processes read data from disk and move it into the shared buffer pool. The pool uses an LRU-style (least-recently-used) clock-sweep mechanism to select the page it evicts. BGWRITER spends much of its time sleeping, but every time it wakes, it searches through the shared buffer pool looking for modified pages. After each search, the BGWRITER chooses some number of modified pages and writes them to disk, so that backends needing a free buffer find clean pages to reuse. In 9.0 the BGWRITER process is controlled by three parameters: bgwriter_delay, bgwriter_lru_maxpages and bgwriter_lru_multiplier.

http://www.enterprisedb.com/docs/en/9.0/pg/kernel-resources.html

http://www.enterprisedb.com/docs/en/8.4/pg/runtime-config-resource.html

WAL Writer Process:

WAL writer process is a mandatory process.

The WAL writer process writes and fsyncs WAL at convenient intervals. WAL buffers hold the changes made to the database before they reach the transaction log, in order to guarantee transaction safety. WAL buffers are written out to disk at every transaction commit, and the WAL writer process also flushes them to disk periodically. The wal_writer_delay parameter sets how often the WAL writer wakes up; other parameters also keep the WAL writer busy. Follow the link below.

http://www.enterprisedb.com/docs/en/8.4/pg/wal-configuration.html

Stats Collector Process:

Stats collector process is an optional process, default is ON.

The stats collector process collects information about server activity. It counts accesses to tables and indexes, in terms of both disk blocks and individual rows. It also tracks the total number of rows in each table, and information about VACUUM and ANALYZE actions for each table. Collecting statistics adds some overhead to query execution, so you can choose whether or not to collect them. Some parameters in the postgresql.conf file control the collection activity of the stats collector process. The following link says more about the stats collector process and its related parameters.

http://www.enterprisedb.com/docs/en/9.0/pg/monitoring-stats.html

Autovacuum Launcher Process:

Autovacuuming is an optional process, default is ON.

To automate the execution of the VACUUM and ANALYZE commands, PostgreSQL runs the autovacuum launcher, a daemon process that starts multiple processes called autovacuum workers. The launcher is in charge of starting autovacuum worker processes for all databases. It distributes the work across time, attempting to start one worker in each database every interval, set by the parameter autovacuum_naptime. The number of workers that may run at the same time is capped by the parameter autovacuum_max_workers. Each worker process checks each table within its database and executes VACUUM or ANALYZE as needed. The following link describes the autovacuum launcher parameters.

http://www.enterprisedb.com/docs/en/8.4/pg/runtime-config-autovacuum.html

Syslogger Process / Logger Process :

Figure 10.3

Logging is an optional process, default is OFF.

As figure 10.3 shows, all the utility processes, the user backends and the postmaster daemon are attached to the syslogger process, which logs information about their activities. Every process's information is logged under $PGDATA/pg_log in files with the .log extension.

Note: If the data directory is created with the INITDB command, there won't be a pg_log directory under it. Explicit creation is needed.

Logging more debug information about the processes causes overhead on the server. Minimal tuning is always recommended; increase the debug level only when required. The link below describes the logging parameters.

http://www.enterprisedb.com/docs/en/8.4/pg/runtime-config-logging.html

Archiver Process:

Figure 10.4

Archiver process is an optional process, default is OFF.

Figure 10.4 above is drawn from my observation of the archiving process in PostgreSQL. Putting the database in archive mode means capturing the WAL data of each segment file once it is filled, and saving that data somewhere before the segment file is recycled for reuse.

Explanation of the numbered tags in the diagram:

1. In archive-log mode, once a WAL segment is filled, the WAL writer creates a file named after that segment under $PGDATA/pg_xlog/archive_status with the suffix ".ready". The file name is "segment-filename.ready".

2. The archiver process is triggered when it finds files in the ".ready" state created by the WAL writer process. It picks the 'segment-file_number' of the .ready file and copies that segment from $PGDATA/pg_xlog to the archive destination given in the 'archive_command' parameter (postgresql.conf).

3. On successful completion of the copy from source to destination, the archiver process renames "segment-filename.ready" to "segment-filename.done". This completes the archiving process.

It follows that any files named "segment-filename.ready" found in $PGDATA/pg_xlog/archive_status are segments still waiting to be copied to the archive destination.
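Since pending segments show up as ".ready" files, counting them gives a quick view of the archiving backlog. A minimal sketch; the function name pending_wal_count is my own, and the directory is passed in rather than assumed.

```shell
#!/bin/bash
# Sketch: count WAL segments still waiting to be archived by counting
# "*.ready" files in an archive_status directory.
pending_wal_count() {
  local status_dir="$1" count=0 f
  for f in "$status_dir"/*.ready; do
    # When nothing matches, the glob stays literal; -e filters it out.
    if [ -e "$f" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}
```

Usage: pending_wal_count "$PGDATA/pg_xlog/archive_status". A number that keeps growing suggests that archive_command is failing or cannot keep up.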

For more information on parameters and Archiving, see the below link.

http://www.enterprisedb.com/docs/en/9.0/pg/continuous-archiving.html

Please do post your comments/suggestion on this article, they would be greatly appreciated.

Regards

Raghav

Friday, April 22, 2011

PostgreSQL 9.0 Architecture

It's my pleasure to be here, publishing my first blog post, on PostgreSQL Architecture.

For quite some time I have been working with and learning PostgreSQL, the vast and most happening database. As a beginner, I thought of trying to represent the PostgreSQL architecture in pictorial format. The architecture includes several things (memory, processes and the storage file system), and it is complex to show everything in one picture. My effort here is to give an overview of the PostgreSQL architecture.

Most of the design was made with the help of our PostgreSQL committers (Heikki, Robert Haas, Bruce), from whom I have learned a lot about PostgreSQL internals. I am very thankful for their cooperation in helping me understand PostgreSQL inside and out. I am not a hacker or an architectural designer; this is just an article for those who are new to PostgreSQL. Please post your comments and suggestions, or any mistakes you find, so that I can correct myself.

PostgreSQL 9.0 Architecture Overview

A PostgreSQL instance consists of a set of processes and memory. PostgreSQL uses a simple "process per-user" client/server model. The types of PostgreSQL processes are:

The 'postmaster', the supervisory daemon process; the 'postmaster' is attached to the shmmem segment but refrains from accessing it.
Utility processes (bgwriter, walwriter, syslogger, archiver, stats collector and autovacuum launcher) and
User backend processes (the postgres process itself, the server process)

When a client requests a connection to the database, the request first hits the postmaster daemon process. After performing authentication and authorization, it spawns a new backend server process (postgres). From that point on, the frontend process and the backend server communicate without intervention by the postmaster. Hence, the postmaster is always running, waiting for connection requests, whereas frontend and backend processes come and go. The libpq library allows a single frontend to make multiple connections to backend processes.

However, each backend process is a single-threaded process that can only execute one query at a time, so the communication over any one frontend-to-backend connection is single-threaded. The postmaster and postgres servers run with the user ID of the PostgreSQL "superuser".

One postgres process exists for every open database session. Once the user connection is authenticated, the process connects directly (with who and for what purpose) to shared memory.

Memory :

Shared Buffers

WAL Buffers

clog Buffers

Other Buffers

PostgreSQL shared memory is very big and the buffers are not kept in sync with each other; that is, they are all independent. Many experts/committers have shared a great deal of their PostgreSQL experience on the web. The PostgreSQL documentation together with this diagram will give a basic understanding of the architecture. The following links explain more.

http://www.postgresql.org/docs/9.0/interactive/runtime-config-resource.html

http://www.enterprisedb.com/docs/en/8.4/pg/runtime-config-resource.html

http://www.postgresql.org/files/documentation/books/aw_pgsql/hw_performance/0.html

Utility Process :

Mandatory processes: these processes have no option to enable/disable them.