Tuesday, April 22, 2014

Faster statistics update after upgrade using "vacuumdb --analyze-in-stages" in PostgreSQL 9.4

As you all know, after upgrading a database server from one major version to another, the ANALYZE command should be executed to rebuild the statistics in the system catalogs for the newly populated data. On a large upgraded database, it is a challenge for the application to regain its performance until those statistics are updated. In PostgreSQL 9.4, the new "vacuumdb --analyze-in-stages" option speeds up producing usable statistics for the optimizer. It analyzes the database in three stages with different configuration settings (default_statistics_target/vacuum_cost_delay).

If the database was upgraded via any procedure such as pg_dump/pg_restore or pg_upgrade, then it's recommended to use "vacuumdb --analyze-in-stages".

Sample output:
bash-4.1$ /usr/local/pgpatch/pg/bin/vacuumdb -p 9999 --analyze-in-stages -d tester
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics
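For the curious, each stage amounts roughly to running ANALYZE with a temporarily adjusted statistics target, along these lines (a simplified sketch, not the exact implementation):

-- Stage 1: minimal statistics
SET default_statistics_target = 1;
SET vacuum_cost_delay = 0;
ANALYZE;
-- Stage 2: medium statistics
SET default_statistics_target = 10;
ANALYZE;
-- Stage 3: full (default) statistics
RESET default_statistics_target;
RESET vacuum_cost_delay;
ANALYZE;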

Sunday, April 6, 2014

While performing PITR, is it possible to pause/resume in PostgreSQL?

Yes, it is truly possible, and handled smartly by PostgreSQL. To demonstrate this, I first need to follow the standard Point-in-Time Recovery technique in PostgreSQL. Many books/articles/blogs by excellent authors have demonstrated it extremely well, so I am not going into the details of how to do it; instead, I'll head straight to the subject: how to pause while recovering with the same technique. Arguably, PITR can be expressed as: "PITR = Last Filesystem Backup (LFB) + WAL archives generated after LFB + un-archived WALs in the current $PGDATA/pg_xlog". For better understanding, I have put this into a graph, since that makes the idea clearer. (Sorry, this blog is a bit long; it happened unknowingly while going into the details of the concept.)


Here are the PITR steps I am going to follow, with slight changes that I'll talk about soon (a shell sketch of these steps follows the list):

Step 1. Restore the most recent file system-level backup (FSB) to the location where recovery is planned to be performed.
Step 2. If the FSB is a tar, untar it, and clean the pg_xlog directory, leaving archive_status in place. If the backup excluded this directory, create an empty pg_xlog directory in the FSB.
Step 3. Copy the un-archived WALs from the crashed cluster's $PGDATA/pg_xlog into $FSB/pg_xlog (Step 2).
Step 4. Delete postmaster.pid from the FSB directory.
Step 5. Create a recovery.conf file in the FSB directory.
Step 6. Start the cluster (FSB).
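A minimal shell sketch of the steps above, assuming a plain-format base backup like the one taken later in this demo (paths are illustrative):

# Steps 1 & 2: restore the FSB and clean pg_xlog, keeping archive_status
cp -r /opt/PostgreSQL/9.3/backup/data_pitr /opt/PostgreSQL/9.3/data_pitr
find /opt/PostgreSQL/9.3/data_pitr/pg_xlog -maxdepth 1 -type f -delete
# Step 3: copy un-archived WALs from the crashed cluster
cp /opt/PostgreSQL/9.3/data/pg_xlog/0* /opt/PostgreSQL/9.3/data_pitr/pg_xlog/
# Step 4: remove the stale pid file
rm -f /opt/PostgreSQL/9.3/data_pitr/postmaster.pid
# Step 5: create recovery.conf (contents shown later in this post)
vi /opt/PostgreSQL/9.3/data_pitr/recovery.conf
# Step 6: start the restored cluster
/opt/PostgreSQL/9.3/bin/pg_ctl -D /opt/PostgreSQL/9.3/data_pitr start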

We should ask: when is pausing the recovery required? Perhaps to avoid multiple base restorations during a roll-forward recovery by checking in between, to roll back a particular table's data, or simply out of interest to see how far it has recovered :). Remember, pausing the recovery means the cluster allows read-only connections while still recovering. To outline this, I have reproduced a situation in a chart showing the growth of a particular table's rows up to a mishap.


From the above diagram, it's clear the DEMO table had 1,000,000 rows when the file system-level backup ($PGDATA) was taken and 4,000,000 rows just before the crash. In my local VM, I have based the scenario on TIME instead of date.

Prerequisites:
1. File system-level backup taken when the DEMO table had 1,000,000 rows.
2. WAL archives from that point up to the crash, by which time the DEMO table had 4,000,000 rows.
3. WAL Archives Location: /opt/PostgreSQL/9.3/archives.
4. Data Directory : /opt/PostgreSQL/9.3/data (PGDATA)
5. Backup Location : /opt/PostgreSQL/9.3/backups

Keep in mind, working with pause recovery requires mandatory changes: on the main cluster ($PGDATA), "wal_level" must be set to "hot_standby", and on the recovery cluster (the file system-level backup), "hot_standby" must be set to "on" (see the snippet below). I made these changes on the main cluster, restarted it for them to take effect, and then initiated the backup. Please note this is just a demo, so my WAL archives are few in number rather than gigantic. I have also listed the WAL archives generated from the time of the backup up to the crash.
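The relevant settings, for reference:

# main cluster: $PGDATA/postgresql.conf
wal_level = hot_standby
# recovery cluster: $FSB/postgresql.conf
hot_standby = on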
-bash-4.1$ psql -c "select count(*), now() from demo;"
  count  |              now
---------+-------------------------------
 1000000 | 2014-04-04 15:06:04.036928-07
(1 row)

-bash-4.1$ pg_basebackup -D /opt/PostgreSQL/9.3/backup/data_pitr          -- I have my $PGDATA, $PGUSER, $PGPORT set, so it's a straightforward command in my case
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
Current state of WAL archives and $PGDATA/pg_xlog
-bash-4.1$ ls -lrth /opt/PostgreSQL/9.3/archives
-rw------- 1 postgres postgres 16M Apr  4 16:01 00000001000000000000001C
-rw------- 1 postgres postgres 16M Apr  4 16:01 00000001000000000000001D
-rw------- 1 postgres postgres 289 Apr  4 16:06 00000001000000000000001E.000000C8.backup
-rw------- 1 postgres postgres 16M Apr  4 16:06 00000001000000000000001E

-bash-4.1$ ls -lrth /opt/PostgreSQL/9.3/data/pg_xlog | tail -4
-rw------- 1 postgres postgres 289 Apr  4 16:06 00000001000000000000001E.000000C8.backup
-rw------- 1 postgres postgres 16M Apr  4 16:06 00000001000000000000001E
-rw------- 1 postgres postgres 16M Apr  4 16:06 00000001000000000000001F
drwx------ 2 postgres postgres 4.0K Apr  4 16:13 archive_status
Fine, now we have the backup copy. Let's INSERT a few records in three batches, noting the time of each, which will help with pausing the recovery; it will also show the WALs produced since the FSB.
-bash-4.1$ psql -c "insert into demo values (generate_series(1,1000000));"
INSERT 0 1000000
-bash-4.1$ psql -c "select count(*),now() from demo;"
  count  |              now
---------+-------------------------------
 2000000 | 2014-04-04 16:06:34.941615-07
(1 row)
-bash-4.1$ psql -c "insert into demo values (generate_series(1,1000000));"
INSERT 0 1000000
-bash-4.1$ psql -c "select count(*),now() from demo;"
  count  |              now
---------+-------------------------------
 3000000 | 2014-04-04 16:10:31.136725-07
(1 row)
-bash-4.1$ psql -c "insert into demo values (generate_series(1,1000000));"
INSERT 0 1000000
-bash-4.1$ psql -c "select count(*),now() from demo;"
  count  |              now
---------+-------------------------------
 4000000 | 2014-04-04 16:13:00.136725-07
(1 row)
Check the number of WALs produced during the INSERTs.
-bash-4.1$ ls -lrth /opt/PostgreSQL/9.3/archives
-rw------- 1 postgres postgres 289 Apr  4 16:06 00000001000000000000001E.000000C8.backup
-rw------- 1 postgres postgres 16M Apr  4 16:06 00000001000000000000001E
-rw------- 1 postgres postgres 16M Apr  4 16:06 00000001000000000000001F
-rw------- 1 postgres postgres 16M Apr  4 16:06 000000010000000000000020
-rw------- 1 postgres postgres 16M Apr  4 16:06 000000010000000000000021
-rw------- 1 postgres postgres 16M Apr  4 16:06 000000010000000000000022
-rw------- 1 postgres postgres 16M Apr  4 16:06 000000010000000000000023
-rw------- 1 postgres postgres 16M Apr  4 16:06 000000010000000000000024
-rw------- 1 postgres postgres 16M Apr  4 16:06 000000010000000000000025
-rw------- 1 postgres postgres 16M Apr  4 16:06 000000010000000000000026
-rw------- 1 postgres postgres 16M Apr  4 16:10 000000010000000000000027
-rw------- 1 postgres postgres 16M Apr  4 16:10 000000010000000000000028
-rw------- 1 postgres postgres 16M Apr  4 16:10 000000010000000000000029
-rw------- 1 postgres postgres 16M Apr  4 16:10 00000001000000000000002A
-rw------- 1 postgres postgres 16M Apr  4 16:13 00000001000000000000002B
Assume a mishap happened at this point, and you have to recover using FSB + WAL archives + un-archived WALs (if any). During recovery, I want to pause three times, to see the recovery at 2,000,000, 3,000,000 and 4,000,000 rows of the DEMO table by connecting to the database in READ-ONLY mode. Each resume of recovery needs a restart of the recovery cluster with the next recovery_target_time set in recovery.conf. Also, in $FSB/postgresql.conf, we have to set hot_standby=on. Here's my recovery.conf file:
-bash-4.1$ more recovery.conf
pause_at_recovery_target = true
#recovery_target_time = '2014-04-04 16:06:34'   # For 2,000,000 rows
#recovery_target_time = '2014-04-04 16:10:31'   # For 3,000,000 rows
#recovery_target_time = '2014-04-04 16:13:00'   # For 4,000,000 rows
restore_command = 'cp /opt/PostgreSQL/9.3/archives/%f %p'
Let's start the recovery to the 2,000,000-row point:
-bash-4.1$ /opt/PostgreSQL/9.3/bin/pg_ctl -D /opt/PostgreSQL/9.3/data_pitr/ start
server starting

Now in logs:

-bash-4.1$ more postgresql-2014-04-04_162524.log
2014-04-04 16:25:24 PDT-24187---[] LOG:  starting point-in-time recovery to 2014-02-06 18:48:56-08
2014-04-04 16:25:24 PDT-24187---[] LOG:  restored log file "00000001000000000000001E" from archive
2014-04-04 16:25:24 PDT-24187---[] LOG:  redo starts at 0/1E0000C8
2014-04-04 16:25:24 PDT-24187---[] LOG:  consistent recovery state reached at 0/1E000190
2014-04-04 16:25:24 PDT-24185---[] LOG:  database system is ready to accept read only connections
2014-04-04 16:25:24 PDT-24187---[] LOG:  restored log file "00000001000000000000001F" from archive
2014-04-04 16:25:24 PDT-24187---[] LOG:  restored log file "000000010000000000000020" from archive
2014-04-04 16:25:25 PDT-24187---[] LOG:  restored log file "000000010000000000000021" from archive
2014-04-04 16:25:25 PDT-24187---[] LOG:  restored log file "000000010000000000000022" from archive
2014-04-04 16:25:25 PDT-24187---[] LOG:  recovery stopping before commit of transaction 1833, time 2014-04-04 16:06:23.893487-07
2014-04-04 16:25:25 PDT-24187---[] LOG:  recovery has paused
2014-04-04 16:25:25 PDT-24187---[] HINT:  Execute pg_xlog_replay_resume() to continue
Cool, see in the logs that it has paused, with a smart HINT suggesting how to resume. Here, if the recovery is satisfactory, you can resume it by calling "select pg_xlog_replay_resume();" (you can check it out). Let's not resume now; instead, check the number of rows recovered by connecting to the server.
-bash-4.1$ psql -c "select count(*),pg_is_in_recovery() from demo;"
  count  | pg_is_in_recovery
---------+-------------------
 2000000 | t
(1 row)
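Incidentally, while connected you can also confirm the replay state; both functions below are standard in 9.3:

-- returns true while WAL replay is paused
select pg_is_xlog_replay_paused();
-- resumes the replay once you are satisfied
select pg_xlog_replay_resume();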
Good, it has reached the point I requested and paused there. Let's move one more step ahead and recover to 3,000,000 rows. Now, set the next recovery_target_time in recovery.conf and restart the cluster.
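The recovery.conf for the second pause point now reads (only the target time changes):

pause_at_recovery_target = true
recovery_target_time = '2014-04-04 16:10:31'   # For 3,000,000 rows
restore_command = 'cp /opt/PostgreSQL/9.3/archives/%f %p'

After the restart, the logs show: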
2014-04-04 16:28:40 PDT-24409---[] LOG:  restored log file "00000001000000000000002A" from archive
2014-04-04 16:28:40 PDT-24409---[] LOG:  recovery stopping before commit of transaction 1836, time 2014-04-04 16:10:40.141175-07
2014-04-04 16:28:40 PDT-24409---[] LOG:  recovery has paused
2014-04-04 16:28:40 PDT-24409---[] HINT:  Execute pg_xlog_replay_resume() to continue.

-bash-4.1$ psql -c "select count(*),pg_is_in_recovery() from demo;"
  count  | pg_is_in_recovery
---------+-------------------
 3000000 | t
(1 row)
Nice... let's make the last attempt, pausing at 4,000,000 rows.
2014-04-04 20:09:07 PDT-4723---[] LOG:  restored log file "00000001000000000000002B" from archive
cp: cannot stat `/opt/PostgreSQL/9.3/archives/00000001000000000000002C': No such file or directory
2014-04-04 20:09:07 PDT-4723---[] LOG:  redo done at 0/2B0059A0
2014-04-04 20:09:07 PDT-4723---[] LOG:  last completed transaction was at log time 2014-04-04 16:11:12.264512-07
2014-04-04 20:09:07 PDT-4723---[] LOG:  restored log file "00000001000000000000002B" from archive
2014-04-04 20:09:07 PDT-4723---[] LOG:  restored log file "00000002.history" from archive
2014-04-04 20:09:07 PDT-4723---[] LOG:  restored log file "00000003.history" from archive
2014-04-04 20:09:07 PDT-4723---[] LOG:  restored log file "00000004.history" from archive
cp: cannot stat `/opt/PostgreSQL/9.3/archives/00000005.history': No such file or directory
2014-04-04 20:09:07 PDT-4723---[] LOG:  selected new timeline ID: 5
cp: cannot stat `/opt/PostgreSQL/9.3/archives/00000001.history': No such file or directory
2014-04-04 20:09:07 PDT-4723---[] LOG:  archive recovery complete
2014-04-04 20:09:08 PDT-4721---[] LOG:  database system is ready to accept connections
2014-04-04 20:09:08 PDT-4764---[] LOG:  autovacuum launcher started

-bash-4.1$ psql -c "select count(*),pg_is_in_recovery() from demo;"
  count  | pg_is_in_recovery
---------+-------------------
 4000000 | f
(1 row)
Oops, what happened? Why has it not paused, and what is it complaining about? Keep in mind: if no WAL archives are present at the recovery_target_time, recovery won't pause; it assumes it has arrived at the last point and opens the database for READ/WRITE. From the logs, you can easily make out that it was hunting for the file "00000001000000000000002C", which is not available, because the cluster crashed at that time. Some may not appreciate this behaviour, but it is a fact, and it makes sense: when no WAL archives are present, there's no reason to pause the recovery. If you do need it to keep waiting even when no WAL archives are left, use standby_mode = 'on' (hot standby); in this mode it won't come out of recovery but will wait for WAL archives. An illustrative recovery.conf for that alternative follows.
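# recovery.conf sketch for the wait-for-WAL alternative (same restore_command as before):
standby_mode = 'on'
restore_command = 'cp /opt/PostgreSQL/9.3/archives/%f %p'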

Hope it was useful.

--Raghav

Saturday, August 24, 2013

How to change the ownership of all objects in a particular schema in PostgreSQL?

A few suggestions here (thanks!) inspired me to compose a bash script for changing the ownership of all objects (TABLES / SEQUENCES / VIEWS / FUNCTIONS / AGGREGATES / TYPES) in a particular schema in one go. There is no special code in the script; I basically picked the suggested technique and simplified the implementation via a script. Actually, the REASSIGN OWNED BY command does most of this work smoothly; however, it changes the ownership of objects database-wide, regardless of schema. There are two situations where you cannot use REASSIGN OWNED BY:

1. If a user has mistakenly created all his objects as the superuser (postgres) and later intends to change them to another user, then REASSIGN OWNED BY will not work; it merely errors out with:
postgres=# reassign owned by postgres to user1;
ERROR:  cannot reassign ownership of objects owned by role postgres because they are required by the database system
2. If a user wishes to change the ownership of objects in just one schema.

In either case, changing objects from the "postgres" user to another user, or changing only one schema's objects, we need to loop through each object, collecting object details from the pg_catalog tables & information_schema and calling ALTER TABLE / FUNCTION / AGGREGATE / TYPE, etc.
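For example, the tables part of such a loop can be driven by generated SQL like this (a minimal sketch; user1 and public are placeholder names):

SELECT 'ALTER TABLE '||quote_ident(schemaname)||'.'||quote_ident(tablename)||' OWNER TO user1;'
FROM pg_tables
WHERE schemaname = 'public';

Feeding its output back to psql executes the ALTERs; functions, sequences, etc. need similar queries against their own catalogs.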

I liked the technique of tweaking the pg_dump output using OS commands (sed/egrep), because it is known that pg_dump, by nature, writes an ALTER ... OWNER TO statement for every object (TABLES / SEQUENCES / VIEWS / FUNCTIONS / AGGREGATES / TYPES) in its output. Grepping those statements from the pg_dump stdout, replacing the new USER/SCHEMA NAME with sed, and then passing those statements back to the psql client fixes things even if the objects are owned by the postgres user. I used the same approach in the script, allowing the user to pass the NEW USER NAME and SCHEMA NAME to substitute into the ALTER ... OWNER TO ... statements; a simplified sketch of the pipeline is below.
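Stripped to its essence, the heart of the script is a pipeline along these lines (a simplified sketch, not the full script; the role, schema, and database names are placeholders):

#!/bin/bash
# Sketch: rewrite ownership of all objects in one schema using pg_dump output.
NEWOWNER="user1"
SCHEMA="public"
DB="mydb"

pg_dump --schema-only --schema="$SCHEMA" "$DB" \
  | egrep '^ALTER .* OWNER TO ' \
  | sed "s/OWNER TO .*;/OWNER TO \"$NEWOWNER\";/" \
  | psql -d "$DB"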

Script usage and output:
sh change_owner.sh  -n new_rolename -S schema_name

-bash-4.1$ sh change_owner.sh -n user1 -S public

 Summary:
        Tables/Sequences/Views : 16
        Functions              : 43
        Aggregates             : 1
        Type                   : 2

You can download the script from here; there's also a README to help you with the usage.

--Raghav

Wednesday, August 7, 2013

New Oracle-Compatibility features in PostgresPlus Advanced Server 9.3 Beta

In short, EnterpriseDB's product PostgresPlus Advanced Server is built upon the community version of PostgreSQL with Oracle-compatibility features. It can run existing Oracle applications without any re-coding (typically required in a migration), allowing enterprises to enjoy the benefits of an open-source database with PostgresPlus Advanced Server.

As PostgresPlus Advanced Server 9.3 Beta is built upon community PostgreSQL 9.3 Beta, various Oracle-compatibility features are introduced in this beta release, such as Oracle-style syntax, packages, SQL functions, etc. Below are a few examples executed in PPAS 9.3 Beta:

Materialized Views Syntax:
CREATE MATERIALIZED VIEW name [build_clause][create_mv_refresh] AS subquery
Where build_clause is:
BUILD {IMMEDIATE | DEFERRED}
Where create_mv_refresh is:
REFRESH [COMPLETE] [ON DEMAND]

  • BUILD DEFERRED - Data is not populated in the MV at creation time; instead, it is populated later using REFRESH MATERIALIZED VIEW.
  • BUILD IMMEDIATE - Data is populated in the MV at creation time from the underlying tables (the default).

edb=# create materialized view mymview1 
build immediate 
as select ename,sum(sal) from emp group by ename; 
SELECT 14 
edb=# create materialized view mymview2 
build deferred 
as select ename,sum(sal) from emp group by ename; 
SELECT 0
As you can figure out from the above output, "BUILD IMMEDIATE" populates all rows at creation time, whereas "BUILD DEFERRED" simply creates an empty object that must later be populated using REFRESH MATERIALIZED VIEW.
edb=# refresh materialized view mymview2 ;
REFRESH MATERIALIZED VIEW
edb=# select count(*) from mymview2;
 count
-------
    14
(1 row)
REFRESH [COMPLETE] [ON DEMAND] modes are also supported at Materialized View creation time, as illustrated below.
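An illustrative statement based on the syntax shown earlier (mymview3 is a hypothetical name; I have not reproduced its output here):

edb=# create materialized view mymview3
refresh complete on demand
as select ename,sum(sal) from emp group by ename;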

More Oracle SQL functions support:
REGEXP_SUBSTR()
edb=# SELECT REGEXP_SUBSTR('EDB documentation AT http://www.enterprisedb.com/','http://([[:alnum:]]+){3,4}/?') as RESULT FROM dual;
   result
------------
 http://www
(1 row)
REGEXP_COUNT()
edb=# select regexp_count('I am in PG-Mailing list','i',1,'i');
 regexp_count
--------------
            5
(1 row)
REGEXP_INSTR()
edb=# SELECT REGEXP_INSTR('PostgreSQL','P') "position" FROM dual;
 position
----------
        1
(1 row)
RAWTOHEX() / HEXTORAW()
edb=# CREATE TABLE test (raw_col RAW(10));
CREATE TABLE
edb=# INSERT INTO test VALUES (HEXTORAW('7D'));
INSERT 0 1
edb=# select * from test ;
 raw_col
---------
 \x7d
(1 row)

edb=# select rawtohex(raw_col) from test ;
 rawtohex
----------
 7d
(1 row)

More Oracle packages supported:

DBMS_SCHEDULER
DBMS_RANDOM
DBMS_CRYPTO
DBMS_MVIEW
DBMS_LOCK.sleep
UTL_HTTP
UTL_ENCODE
UTL_URL
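
As one quick illustration of the package support (DBMS_RANDOM.VALUE here follows the usual Oracle signature; treat the exact API as an assumption to confirm against the release notes):

edb=# select dbms_random.value(1,100) from dual;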

For more details, you can refer to the release notes:
http://get.enterprisedb.com/releasenotes/PgPlus_AS_93Beta1_Release_Notes_20130802.pdf

--Raghav

Monday, July 29, 2013

PostgreSQL Compressed Archive Logs in Windows

Many have blogged in detail on how to compress archive logs in PostgreSQL; I want to share a short version from my end, on Windows.

If archive storage is a concern, you can choose the compressed archive logging feature in PostgreSQL.

"archive_command(string)" in $PGDATA/postgresql.conf, is like a shell command to execute what's passed in string section to copy the completed source file (WAL file segment in $PGDATA/pg_xlog) to destination(ARCHIVE LOCATION). "string" can be anything like shell script(batch in Windows) itself, OS compression utilites, and a special tool pg_compresslog. In Windows, cmd.exe will execute the command passed in archive_command "string".

Since we are doing this on the Windows platform, the prerequisites are:
  1. The archive directory should be fully accessible to the postgres user. ("C:\Program Files\PostgreSQL\9.2\archives" in my case)
  2. A Windows build of the GZIP utility. Though there are numerous good Windows compression utilities, I chose gzip because it is supported on both Linux & Windows.
  3. gzip.exe should be accessible to the postgres user and also be in the PATH. ("C:\Program Files\GnuWin32\bin" in my case)
Assuming all prerequisites are in place, the next step is to edit $PGDATA/postgresql.conf, change the archiving-related parameters, and restart the cluster:
wal_level=archive
archive_mode=on
archive_command = '"C:\\Program Files\\GnuWin32\\bin\\gzip.exe -1 " < "%p" > "C:\\Program Files\\PostgreSQL\\9.2\\archives\\%f.gz"'

c:\Program Files\PostgreSQL\9.2\bin>pg_ctl.exe -D ..\data start      (You can also start from services.msc)
As per the PG documentation, the changes have been made and the cluster restarted, anticipating that from here on my archives will be compressed. Let's look at the logs:
2013-07-26 16:07:22 IST LOG:  archive command failed with exit code 1
2013-07-26 16:07:22 IST DETAIL:  The failed archive command was: """C:\Program Files\GnuWin32\bin\gzip.exe" -1 < "pg_xlog\000000010000000000000002" > "C:\Program Files\PostgreSQL\9.2\archives\000000010000000000000002.gz"
'""C:\Program' is not recognized as an internal or external command,
operable program or batch file
Hmmm, OK, the archiver process has failed with an interesting error, "'""C:\Program' is not recognized as an internal or external command", which signifies that cmd.exe, the utility used to parse the string, failed to execute the string we passed. First, let's see how it parses the string from the command line, and then fix the archive_command (string).
c:\Program Files\PostgreSQL\9.2\bin>cmd.exe /C "C:/Program Files/PostgreSQL/9.2/bin/psql.exe --version"
'C:/Program' is not recognized as an internal or external command,
operable program or batch file.

"/C" mean, Carry out the command specified by the string and then terminate.
Check the above simple command passed in string to find the "psql version" and the error it has thrown which's similar to the one I have in logs. Lets check what "cmd.exe /?" say about parsing the string.
If /C or /K is specified, then the remainder of the command line after
the switch is processed as a command line, where the following logic is
used to process quote (") characters:

    1.  If all of the following conditions are met, then quote characters
        on the command line are preserved:

        - no /S switch
        - exactly two quote characters
        - no special characters between the two quote characters,
          where special is one of: &<>()@^|
        - there are one or more whitespace characters between the
          two quote characters
        - the string between the two quote characters is the name
          of an executable file.

    2.  Otherwise, old behavior is to see if the first character is
        a quote character and if so, strip the leading character and
        remove the last quote character on the command line, preserving
        any text after the last quote character.
As per the cmd.exe help, it's clear that we should not have special characters (which I do have in the gzip command: < and >) between the two quote characters, so to fix it, you should enclose the entire string in another set of double quotes. Let's retry our previous command with the extra double quotes.

c:\Program Files\PostgreSQL\9.2\bin>cmd.exe /C ""C:/Program Files/PostgreSQL/9.2/bin/psql.exe" --version"

psql (PostgreSQL) 9.2.2

Fine, it's fixed now. Let me put it all in points so you can avoid errors while setting archive_command:
  1. The archive_command (string) must be in single quotes.
  2. Each part of the command should be double-quoted: the command, the source path, and the destination path.
  3. Command options should NOT be double-quoted.
  4. Wrap the entire string in one extra pair of double quotes, just inside the starting/ending single quotes.
Now, let me correct my archive_command....
archive_command = '""C:\\Program Files\\GnuWin32\\bin\\gzip.exe" -1 < "%p" > "C:\\Program Files\\PostgreSQL\\9.2\\archives\\%f.gz""'
Though a RELOAD will put the change into effect, I recommend a RESTART. Now you should have all .gz archive files:
c:\Program Files\PostgreSQL\9.2\archives>dir
 Volume in drive C has no label.
 Volume Serial Number is 4ABE-037A

 Directory of c:\Program Files\PostgreSQL\9.2\archives

07/27/2013  09:05 PM         3,613,153 00000001000000000000005A.gz
07/27/2013  09:05 PM         3,611,096 00000001000000000000005B.gz
07/27/2013  09:05 PM         3,612,856 00000001000000000000005C.gz
              89 File(s)    367,755,965 bytes
               2 Dir(s)  45,557,706,752 bytes free
Nice....
Similarly, you can use gzip.exe -d to decompress in the "restore_command" while doing PITR on compressed archives.
restore_command='""C:\\Program Files\\GnuWin32\\bin\\gzip.exe" -d < "C:\\Program Files\\PostgreSQL\\9.2\\archives\\%f.gz" > "%p""'
Comments & corrections are welcome.

--Raghav