Monday, July 29, 2013

PostgreSQL Compressed Archive Logs in Windows

Many have blogged and offered glaringly on how to compress archive logs in PostgreSQL, I desire to share one of a short version from my end which's on Windows.

If archive storage is concern, then you can choose compressed archive logging feature in PostgreSQL.

"archive_command(string)" in $PGDATA/postgresql.conf, is like a shell command to execute what's passed in string section to copy the completed source file (WAL file segment in $PGDATA/pg_xlog) to destination(ARCHIVE LOCATION). "string" can be anything like shell script(batch in Windows) itself, OS compression utilites, and a special tool pg_compresslog. In Windows, cmd.exe will execute the command passed in archive_command "string".

Since we are applying on Windows platform, pre-requesites are:
  1. Archive directory should have full postgres user access. ("C:\Program Files\PostgreSQL\9.2\archives" in my case)
  2. Window version GZIP utility. Though there are numerious good windows variant compression utilities, I choosed gzip because its supported both on Linux & Windows.
  3. Gzip.exe should have access to Postgres User and also in PATH. ("C:\Program Files\GnuWin32\bin" in my case).
Assuming all pre-requisites are in place and next step should be editing the $PGDATA/postgresql.conf file and changing the archiving related parameters and restart the cluster:
wal_level=archive
archive_mode=on
archive_command = '"C:\\Program Files\\GnuWin32\\bin\\gzip.exe -1 " < "%p" > "C:\\Program Files\\PostgreSQL\\9.2\\archives\\%f.gz"'

c:\Program Files\PostgreSQL\9.2\bin>pg_ctl.exe -D ..\data start      (You can also start from services.msc)
As per PG documentation, changes has been made and restarted the cluster, anticipating from hereon my archives will be compressed one. Lets look at the logs:
2013-07-26 16:07:22 IST LOG:  archive command failed with exit code 1
2013-07-26 16:07:22 IST DETAIL:  The failed archive command was: """C:\Program Files\GnuWin32\bin\gzip.exe" -1 < "pg_xlog\000000010000000000000002" > "C:\Program Files\PostgreSQL\9.2\archives\000000010000000000000002.gz"
'""C:\Program' is not recognized as an internal or external command,
operable program or batch file
Hmmm Ok, archiver process has failed with an interesting error '""C:\Program' is not recognized as an internal or external command ", which signify cmd.exe utility utilized to parse the string has failed to execute the string we passed. First, lets know how its parsing the string from command line and then fix the archive_command(string).
c:\Program Files\PostgreSQL\9.2\bin>cmd.exe /C "C:/Program Files/PostgreSQL/9.2/bin/psql.exe --version"
'C:/Program' is not recognized as an internal or external command,
operable program or batch file.

"/C" mean, Carry out the command specified by the string and then terminate.
Check the above simple command passed in string to find the "psql version" and the error it has thrown which's similar to the one I have in logs. Lets check what "cmd.exe /?" say about parsing the string.
If /C or /K is specified, then the remainder of the command line after
the switch is processed as a command line, where the following logic is
used to process quote (") characters:

    1.  If all of the following conditions are met, then quote characters
        on the command line are preserved:

        - no /S switch
        - exactly two quote characters
        - no special characters between the two quote characters,
          where special is one of: &<>()@^|
        - there are one or more whitespace characters between the
          two quote characters
        - the string between the two quote characters is the name
          of an executable file.

    2.  Otherwise, old behavior is to see if the first character is
        a quote character and if so, strip the leading character and
        remove the last quote character on the command line, preserving
        any text after the last quote character.
As per cmd.exe help, its clear that we should not have special character (which I have in gzip.exe > , < , -1) between the two quoted characters, so to fix, you should close the entire string again in another set of double quotes. Lets retake our previous command with extra double quoted.

c:\Program Files\PostgreSQL\9.2\bin>cmd.exe /C ""C:/Program Files/PostgreSQL/9.2/bin/psql.exe" --version"

psql (PostgreSQL) 9.2.2

Fine, its fixed now. Let me put all in points where you can bypass errors while setting archive_command:
  1. archive_command(string) must be in single quotes
  2. Each part of the command should be double quoted like command,source path, and destination path.
  3. Command options SHOULD NOT be double quoted
  4. Maintain one extra double quote on entire string within start/end of single quote.
Now, let me correct my archive_command....
archive_command = '""C:\\Program Files\\GnuWin32\\bin\\gzip.exe" -1 < "%p" > "C:\\Program Files\\PostgreSQL\\9.2\\archives\\%f.gz""'
Though RELOAD will effect the changes, but I recommend to take RESTART. Now you should have all .gz archive files:
c:\Program Files\PostgreSQL\9.2\archives>dir
 Volume in drive C has no label.
 Volume Serial Number is 4ABE-037A

 Directory of c:\Program Files\PostgreSQL\9.2\archives

07/27/2013  09:05 PM         3,613,153 00000001000000000000005A.gz
07/27/2013  09:05 PM         3,611,096 00000001000000000000005B.gz
07/27/2013  09:05 PM         3,612,856 00000001000000000000005C.gz
              89 File(s)    367,755,965 bytes
               2 Dir(s)  45,557,706,752 bytes free
Nice....
Similarly, you can furthermore use the GZIP.EXE -d for decompressing in "restore_command" while doing PITR on compressed archives.
restore_command='""C:\\Program Files\\GnuWin32\\bin\\gzip.exe" -d < "C:\\Program Files\\PostgreSQL\\9.2\\archives\\%f.gz" > "%p""'
Comments & Corrections are welcomed.

--Raghav

3 comments :

Chaitany Kulkarni said...

Nice. I will test it on fedora.

Unknown said...

Hi ,
Thank you for the info.
1) You dont speal about archive_timeout. what is the effect of timeout when used with gzip?
2) I do the same thing but the archived zip file is allways zero byte.
Thank you

SAKET NAIDU said...

It's very helpful for beginner's....

Post a Comment

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License