Relational Database Technologies: April 2011

Friday, April 29, 2011

ERROR: could not load library "/opt/PostgreSQL/9.0/lib/postgresql/plperl.so": libperl.so:

The above error was thrown when I tried to create the language plperlu. It indicates that the shared library libperl.so could not be found. Here are the steps I performed to overcome the issue and successfully create the plperlu language in PostgreSQL.

Method 1 (Finding libperl.so and adding its location to LD_LIBRARY_PATH)

postgres=# create LANGUAGE plperlu;
ERROR:  could not load library "/opt/PostgreSQL/9.0/lib/postgresql/plperl.so": libperl.so: cannot open shared object file: No such file or directory
postgres=# \q

Now search for the file libperl.so on your box and add its location to LD_LIBRARY_PATH. Don't forget to switch to the postgres user first.

[root@localhost /]# find -name libperl.so
./usr/lib/perl5/5.10.0/i386-linux-thread-multi/CORE/libperl.so
./usr/lib/perl5/CORE/libperl.so
./usr/lib/perl5/5.10.0/i386-linux-thread-multi/CORE

[root@localhost /]# su - postgres

Set the LD_LIBRARY_PATH

-bash-4.1$ export LD_LIBRARY_PATH=/usr/lib/perl5/5.10.0/i386-linux-thread-multi/CORE:$LD_LIBRARY_PATH

Restart the cluster so that the change takes effect.

-bash-4.1$ pg_ctl restart
waiting for server to shut down....... done
server stopped
server starting

You are now ready to create the language.

-bash-4.1$ psql
psql (9.0.1)
Type "help" for help.

postgres=# create LANGUAGE plperlu;
CREATE LANGUAGE
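
The search-and-export steps of Method 1 could be wrapped in a small shell helper. This is only a sketch: find_libperl_dir is a hypothetical name of my own, and the default search root /usr/lib is an assumption taken from the find output above.

```shell
# Print the directory holding the first libperl.so found under a search root.
# find_libperl_dir is a hypothetical helper, not a PostgreSQL or Perl tool.
find_libperl_dir() {
    root="${1:-/usr/lib}"    # assumed default search root
    # Sort so the result is stable when several copies exist.
    lib=$(find "$root" -name libperl.so 2>/dev/null | sort | head -n 1)
    # Fail with a non-zero status when nothing was found.
    [ -n "$lib" ] || return 1
    dirname "$lib"
}

# Possible usage as the postgres user, before restarting the cluster:
#   export LD_LIBRARY_PATH="$(find_libperl_dir /usr/lib):$LD_LIBRARY_PATH"
```

If several Perl installations exist, the first match may not be the one plperl.so was built against, so it is worth checking the chosen directory by hand.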

Method 2 (By installing the latest ActivePerl)

Step 1:
After installing ActivePerl, set the LD_LIBRARY_PATH.

postgres@localhost:/opt/ActivePerl-5.12/lib/CORE> export LD_LIBRARY_PATH=/opt/ActivePerl-5.12/lib/CORE:/opt/PostgreSQL/9.0/lib/postgresql

Step 2:
Check with ldd that plperl.so now resolves libperl.so.

postgres@localhost:/opt/PostgreSQL/9.0/lib/postgresql> ldd plperl.so 
       linux-vdso.so.1 =>  (0x00007fff8cf79000)
       libperl.so => /opt/ActivePerl-5.12/lib/CORE/libperl.so (0x00007fbd3d654000)
       libnsl.so.1 => /lib64/libnsl.so.1 (0x00007fbd3d40a000)
       libdl.so.2 => /lib64/libdl.so.2 (0x00007fbd3d206000)
       libm.so.6 => /lib64/libm.so.6 (0x00007fbd3cfb0000)
       libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fbd3cd74000)
       libutil.so.1 => /lib64/libutil.so.1 (0x00007fbd3cb71000)
       libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fbd3c954000)
       libc.so.6 => /lib64/libc.so.6 (0x00007fbd3c5f5000)
       /lib64/ld-linux-x86-64.so.2 (0x00007fbd3dad8000)
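
When a library cannot be resolved, ldd prints "not found" for it instead of a path. A minimal sketch for pulling such entries out of ldd output follows; unresolved_libs is a hypothetical helper name, not part of ldd or PostgreSQL.

```shell
# Read ldd output on stdin and print the names of libraries that did not resolve.
# unresolved_libs is a hypothetical helper name.
unresolved_libs() {
    # ldd reports a missing dependency as "<name> => not found".
    awk '/=> not found/ { print $1 }'
}

# Possible usage from the PostgreSQL lib directory:
#   ldd plperl.so | unresolved_libs
```

An empty result means every dependency of plperl.so was resolved with the current LD_LIBRARY_PATH.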

Step 3:
Restart the cluster and create the language.

postgres@localhost:/opt/PostgreSQL/9.0/lib/postgresql> pg_ctl restart

postgres@localhost:/opt/PostgreSQL/9.0/lib/postgresql> psql
psql (9.0.2)
Type "help" for help.

postgres=# create language plperlu;
CREATE LANGUAGE

Post your comments.

--Raghav

Wednesday, April 27, 2011

Size of Partition Table in PostgreSQL 9.0

In PostgreSQL every table is an object, and pg_relation_size('object_name') gives the size of that object. If you pass a partitioned (parent) table as 'object_name', it returns only the size of the parent itself, not the sizes of its child tables.

Check out the example given below.

postgres=# \dt+
                          List of relations
 Schema |     Name      | Type  |  Owner   |    Size    | Description
--------+---------------+-------+----------+------------+-------------
 public | child1        | table | postgres | 8192 bytes |
 public | child2        | table | postgres | 8192 bytes |
 public | parent        | table | postgres | 0 bytes    |
(3 rows)

pg_relation_size() on the parent table reports only the parent's own size, not the size of the whole partition set.

postgres=# select pg_size_pretty(pg_relation_size('parent'));
 pg_size_pretty
----------------
 0 bytes
(1 row)

To get the size of a partitioned table, first find its child tables and their sizes. The pg_inherits catalog table lists the children of a parent, so their sizes can be looked up and summed to get the total. I have tried writing a small function using pg_inherits to get it done.

CREATE OR REPLACE FUNCTION pg_partition_table_size(text) returns numeric as
$$
-- Sum the raw byte counts. Parsing pg_size_pretty() output with to_number()
-- gives wrong totals once a child is large enough to be reported in kB or MB.
select sum(pg_relation_size(inhrelid::regclass)) from pg_inherits where inhparent=$1::regclass;
$$ language sql;

Now pass the parent table to the function.

postgres=# select pg_partition_table_size('parent');
 pg_partition_table_size
-------------------------
                   16384
(1 row)
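
For a quick check from the shell without creating a function, the same pg_inherits lookup can be sent through psql. This is only a sketch: partition_size_sql is a hypothetical helper name, and it does no quoting of the table name, so it is meant for trusted names only.

```shell
# Build a query that sums the sizes of all direct children of a parent table.
# partition_size_sql is a hypothetical helper; the table name is not escaped.
partition_size_sql() {
    parent="$1"
    # sum() over bigint yields numeric, so cast back before pg_size_pretty().
    printf "SELECT pg_size_pretty(sum(pg_relation_size(inhrelid::regclass))::bigint) FROM pg_inherits WHERE inhparent = '%s'::regclass;\n" "$parent"
}

# Possible usage (prints only the value, unaligned):
#   psql -At -c "$(partition_size_sql parent)"
```

Like the function above, this counts only direct children and only their main data files, not indexes or TOAST data.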

Isn't it useful? Do post your comments; they will be greatly appreciated.

--Raghav

Monday, April 25, 2011

PostgreSQL 9.0 Memory & Processes

Continuing with the PostgreSQL architecture, here I discuss the utility processes and memory, with informative links. Many of the committers have already documented the processes and memory insightfully, and links to those documents are provided here. This is a modest presentation from my end of the PostgreSQL utility processes.

At every PostgreSQL instance startup, a set of utility processes (mandatory and optional) is started and memory is allocated. There are two mandatory processes (BGWRITER and WAL writer) and four optional processes (autovacuum launcher, stats collector, syslogger and archiver). You can check them with the command 'ps -ef | grep postgres', shown below in figure 10.1.

Figure 10.1
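
To pick only the utility processes out of the ps listing, a filter along these lines may help. It is a sketch that assumes the 9.0 process titles ("postgres: writer process", "postgres: wal writer process" and so on); list_utility_processes is a hypothetical helper name.

```shell
# Keep only the PostgreSQL utility processes from a ps listing read on stdin.
# Assumes 9.0-style process titles; list_utility_processes is a hypothetical name.
list_utility_processes() {
    grep -E 'postgres: (logger|writer|wal writer|autovacuum launcher|archiver|stats collector) process'
}

# Possible usage:
#   ps -ef | list_utility_processes
```

Optional processes that are switched off, such as the archiver when archive mode is off, simply do not appear in the result.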

Overview of the processes and memory.

Figure 10.2

Figure 10.2 above shows the processes attached to the PostgreSQL shared memory.

BGWriter/Writer Process:

The BGWRITER or WRITER process is a mandatory process.

All PostgreSQL server processes read data from disk into the shared buffer pool. When a buffer has to be freed, PostgreSQL uses a clock-sweep algorithm (an approximation of LRU, least recently used) to select the page to evict. The BGWRITER spends much of its time sleeping, but every time it wakes it scans the shared buffer pool for modified (dirty) pages. After each scan it chooses some number of modified pages and writes them to disk, so that backends can reuse those buffers without having to write them out first. The BGWRITER can be controlled with three parameters: bgwriter_delay, bgwriter_lru_maxpages and bgwriter_lru_multiplier (which replaced bgwriter_lru_percent in 8.3).

http://www.enterprisedb.com/docs/en/9.0/pg/kernel-resources.html

http://www.enterprisedb.com/docs/en/8.4/pg/runtime-config-resource.html
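
To see what a parameter such as bgwriter_delay is set to in postgresql.conf, a small lookup like the following can be used. It is a rough sketch: conf_value is a hypothetical helper name, and it handles only simple "name = value" lines, so values containing a '#' or lines written without the '=' sign are not covered. Running SHOW in psql remains the reliable way to see the value the server actually uses.

```shell
# Print the value of one parameter from a postgresql.conf style file.
# conf_value is a hypothetical helper; commented-out lines are ignored.
conf_value() {
    conf_file="$1"
    name="$2"
    # Keep the text after "name =", drop any trailing comment and whitespace.
    # When a parameter appears more than once, the last entry is the one printed.
    sed -n "s/^[[:space:]]*$name[[:space:]]*=[[:space:]]*\([^#]*\).*/\1/p" "$conf_file" \
        | sed 's/[[:space:]]*$//' \
        | tail -n 1
}

# Possible usage:
#   conf_value "$PGDATA/postgresql.conf" bgwriter_delay
```

An empty result means the parameter is commented out or absent, in which case the server falls back to its built-in default.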

WAL Writer Process:

The WAL writer process is a mandatory process.

The WAL writer process writes and fsyncs WAL at convenient intervals. WAL buffers hold the changes made to the database until they are written to the transaction log, which is what guarantees transaction safety. WAL buffers are flushed to disk at every transaction commit, and the WAL writer process also writes them out to disk periodically in the background. The wal_writer_delay parameter sets the interval between WAL writer rounds; there are other parameters that also keep the WAL writer busy. Follow the link below.

http://www.enterprisedb.com/docs/en/8.4/pg/wal-configuration.html

Stats Collector Process:

The stats collector process is an optional process; the default is ON.

The stats collector process collects information about server activity. It counts accesses to tables and indexes, in terms of both disk blocks and individual rows. It also tracks the total number of rows in each table, and information about VACUUM and ANALYZE actions for each table. Collecting statistics adds some overhead to query execution, so you can choose whether or not to collect them. Some parameters in the postgresql.conf file control the collection activity of the stats collector process. The following link explains more about the stats collector process and its related parameters.

http://www.enterprisedb.com/docs/en/9.0/pg/monitoring-stats.html

Autovacuum Launcher Process:

Autovacuum is an optional process; the default is ON.

To automate the execution of the VACUUM and ANALYZE commands, autovacuum consists of a launcher daemon and multiple processes called autovacuum workers. The autovacuum launcher is in charge of starting autovacuum worker processes for all databases. The launcher distributes the work across time, attempting to start one worker in each database at every interval set by the parameter autovacuum_naptime. The number of workers that may run at the same time is limited by the parameter autovacuum_max_workers. Each worker process checks each table within its database and executes VACUUM or ANALYZE as needed. The following link describes the autovacuum launcher parameters.

http://www.enterprisedb.com/docs/en/8.4/pg/runtime-config-autovacuum.html

Syslogger Process / Logger Process :

Figure 10.3

Logging is an optional process; the default is OFF.

As figure 10.3 shows, all the utility processes, the user backends and the postmaster daemon are attached to the syslogger process, which logs information about their activities. Information from every process is logged under $PGDATA/pg_log in .log files.

Note: If the data directory is created with the initdb command, there won't be a pg_log directory under it. Explicit creation is needed.

Logging more detail about the processes adds overhead on the server. Minimal tuning is always recommended; increase the debug level only when required. The link below describes the logging parameters.

http://www.enterprisedb.com/docs/en/8.4/pg/runtime-config-logging.html

Archiver Process:

Figure 10.4

The archiver process is an optional process; the default is OFF.

Figure 10.4 above is drawn from my observation of the archiving process in PostgreSQL. Putting the database in archive mode means capturing the WAL data of each segment file once it is filled, and saving that data somewhere before the segment file is recycled for reuse.

Explanation of the numbered tags in the diagram:

1. In archive mode, once a WAL segment is filled, the WAL writer creates a file named after that segment, with the suffix ".ready", under $PGDATA/pg_xlog/archive_status. The file name is "segment-filename.ready".

2. The archiver process is triggered when it finds files in the ".ready" state created by the WAL writer process. It takes the 'segment-file_number' from the .ready file name and copies that segment from $PGDATA/pg_xlog to the archive destination given in the 'archive_command' parameter (postgresql.conf).

3. On successful completion of the copy from source to destination, the archiver process renames "segment-filename.ready" to "segment-filename.done". This completes the archiving process.

It follows that any files named "segment-filename.ready" found in $PGDATA/pg_xlog/archive_status are segments still waiting to be copied to the archive destination.

For more information on the parameters and on archiving, see the link below.
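
Going by the .ready/.done naming described in the steps above, the segments still pending for archiving can be listed with a short shell function. This is a sketch; pending_wal_segments is a hypothetical helper name of my own.

```shell
# List WAL segments that still have a ".ready" marker in an archive_status directory.
# pending_wal_segments is a hypothetical helper name.
pending_wal_segments() {
    status_dir="$1"
    for f in "$status_dir"/*.ready; do
        # When nothing matches, the pattern itself is returned; skip it.
        [ -e "$f" ] || continue
        # Strip the directory and the ".ready" suffix to get the segment name.
        basename "$f" .ready
    done
}

# Possible usage:
#   pending_wal_segments "$PGDATA/pg_xlog/archive_status"
```

A list that keeps growing usually means archive_command is failing or cannot keep up with WAL generation.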

http://www.enterprisedb.com/docs/en/9.0/pg/continuous-archiving.html

Please do post your comments and suggestions on this article; they would be greatly appreciated.

Regards

Raghav

Friday, April 22, 2011

PostgreSQL 9.0 Architecture

It's my pleasure to be here, publishing my first blog post on PostgreSQL architecture.

For quite some time I have been working with and learning PostgreSQL, a vast and very active database. As a beginner, I thought of trying to represent the PostgreSQL architecture in pictorial form. The architecture includes several things (memory, processes and the storage file system), and it is complex to show everything in one picture. My effort here is to give an overview of the PostgreSQL architecture.

Most of the design was made with the help of our PostgreSQL committers (Heikki, Robert Haas, Bruce), from whom I have learned a lot about PostgreSQL internals. I am very thankful for their cooperation in helping me understand PostgreSQL inside and out. I am not a hacker or an architectural designer; this is just an article for those who are new to PostgreSQL. Please post your comments and suggestions, and point out any mistakes so that I can correct them.

PostgreSQL 9.0 Architecture Overview

A PostgreSQL instance consists of a set of processes and memory. PostgreSQL uses a simple "process per-user" client/server model. The types of PostgreSQL process are:

The 'postmaster', the supervisory daemon process. The 'postmaster' is attached to the shared memory segment but refrains from accessing it.
Utility processes (bgwriter, walwriter, syslogger, archiver, stats collector and autovacuum launcher), and
User backend processes (the postgres process itself, the server process)

When a client requests a connection to the database, the request first reaches the postmaster daemon process, which spawns a new backend server process (postgres); the new backend then performs authentication and authorization. From that point on, the frontend process and the backend server communicate without intervention by the postmaster. Hence, the postmaster is always running, waiting for connection requests, whereas frontend and backend processes come and go. The libpq library allows a single frontend to make multiple connections to backend processes.

However, each backend process is a single-threaded process that can only execute one query at a time, so the communication over any one frontend-to-backend connection is single-threaded. The postmaster and the postgres servers run with the user ID of the PostgreSQL "superuser".

One postgres process exists for every open database session. Once the user connection is authenticated, the backend process works directly with shared memory.

Memory :

Shared Buffers

WAL Buffers

clog Buffers

Other Buffers

PostgreSQL shared memory is very large, and the buffers in it are not kept in sync with each other; they are all independent. Many experts and committers have shared a great deal of information on the web from their experience with PostgreSQL. The PostgreSQL documentation together with this diagram will give a basic understanding of the architecture. The following links explain more.

http://www.postgresql.org/docs/9.0/interactive/runtime-config-resource.html

http://www.enterprisedb.com/docs/en/8.4/pg/runtime-config-resource.html

http://www.postgresql.org/files/documentation/books/aw_pgsql/hw_performance/0.html

Utility Process :

Mandatory processes: These processes have no option to enable or disable them.