pg.lisp -- socket level interface to the PostgreSQL RDBMS for Common Lisp

 Author: Eric Marsden <eric.marsden@free.fr>
 Version: 0.24

     Copyright (C) 1999,2000,2001,2002,2003,2004,2005,2006  Eric Marsden

     This library is free software; you can redistribute it and/or
     modify it under the terms of the GNU Library General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later version.

     This library is distributed in the hope that it will be useful,
     but WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
     Library General Public License for more details.

     You should have received a copy of the GNU Library General Public
     License along with this library; if not, write to the Free
     Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

 For download information, mailing lists for suggestions and bug
 reports, see 

     <URL:http://www.common-lisp.net/project/pg/>


== Overview =========================================================

 This module lets you access the PostgreSQL object-relational DBMS
 from Common Lisp. The code implements the client part of the
 socket-level frontend/backend protocol, rather than providing a
 wrapper around the libpq library. The module is capable of type
 coercions from a range of SQL types to the equivalent Lisp type.
 The only non portable code is the use of 'socket-connect' and
 (optional) some way of accessing the Unix crypt() function.

 Works with CMUCL, SBCL, CLISP, OpenMCL, ABCL, ACL, Lispworks, and MCL.
 CormanLisp has socket support but not for binary I/O.


== Entry points =======================================================

 (with-pg-connection ((con &rest open-args) &body body)
     A macro which opens a connection to database DBNAME, executes the
     BODY forms then disconnects. See function `pg-connect' for details
     of the connection arguments OPEN-ARGS.

 (with-pg-transaction con &body body)
     A macro which executes the BODY forms wrapped in an SQL transaction.
     CON is a connection to the database. If an error occurs during the
     execution of the forms, a ROLLBACK instruction is executed.

 (pg-connect dbname user &key password host port) -> connection
     Connect to the database DBNAME on HOST (defaults to localhost) at
     PORT (defaults to 5432), and log in as USER. If HOST designates
     an absolute pathname (its first character is #\/), attempt to
     connect to the localhost using a Unix domain socket that resides
     in that directory (for example "/var/run/postgresql/"); otherwise
     HOST designates a hostname and the connection is established
     using TCP/IP. Connections to unix sockets are not supported on
     all implementations. If the database requires a password, send
     PASSWORD (as clear text unless the backend demands crypt()
     authentication). Set the output date type to 'ISO', and
     initialize our type parser tables.

 (pg-exec connection &rest sql) -> pgresult
     Concatenate the SQL strings and send to the backend. Retrieve
     all the information returned by the database and return it in
     an opaque record PGRESULT.

 (pg-result pgresult what &rest args) -> info
     Extract information from the PGRESULT. WHAT can be one of
          * :connection
          * :status
          * :attributes
          * :tuples
          * :tuple tupleNumber
          * :oid
     `connection' allows you to retrieve the database connection.
     `status' is a string returned by the backend to indicate the
     status of the command; it is normally "SELECT" for a select
     command, "DELETE 1" if the deletion affected a single row, etc.
     `attributes' is a list of tuples providing metadata: the first
     component of each tuple is the attribute's name as a string,
     the second an integer representing its PostgreSQL type, and the
     third an integer representing the size of that type. `tuples'
     returns all the data retrieved from the database, as a list of
     lists, each list corresponding to one row of data returned by
     the backend. `tuple num' can be used to extract a specific
     tuple. `oid' allows you to retrieve the OID returned by the
     backend if the command was an insertion; the OID is a unique
     identifier for that row in the database (this is
     PostgreSQL-specific, please refer to the documentation for more
     details).

 (pg-for-each connection select-form callback)
     Calls CALLBACK on each tuple returned by SELECT-FORM. Declares
     a cursor for SELECT-FORM, then fetches tuples using repeated
     executions of FETCH 1, until no results are left. The cursor is
     then closed. The work is performed within a transaction. When
     you have a large amount of data to handle, this usage is more
     efficient than fetching all the tuples in one go.

 (pg-disconnect connection &key abort) -> nil
     Close the database connection. If the keyword argument ABORT is
     non-NIL, the database connection is closed immediately, without
     first attempting to send a disconnect packet to the PostgreSQL
     backend. 


=== Support for prepared statements ====================================

 (pg-supports-pbe conn) -> boolean
     Returns T iff the connection to the database is able to support
     prepared statements. This is only true of connections using
     version 3 of the frontend/backend protocol. 

 (pg-prepare conn statement-name sql &optional parameter-types)
     Prepares an execution plan for a query (a prepared statement).
     The prepared statement may contain arguments that are refered to
     as $1, $2 etc; if arguments are present their types must be
     declared via the list PARAMETER-TYPES. Each element of
     PARAMETER-TYPES should be a string that defines the type of its
     corresponding parameter (see PG::*TYPE-PARSERS* for examples of
     type names used by PostgreSQL).

     Using execution plans is more efficient than multiple calls to
     PG-EXEC, since the parsing and query optimizing phase only occurs
     once, at preparation time. It also helps to protect against "SQL
     injection" attacks, by ensuring that arguments to an SQL query
     cannot be interpreted as a part of the SQL request.

 (pg-bind conn portal-name statement-name typed-arguments)
     Binds the execution plan that was previously prepared as
     STATEMENT-NAME to PORTAL-NAME, with TYPED-ARGUMENTS.
     TYPED-ARGUMENTS is a list of tuples of the form '(type value),
     where TYPE is one of :char, :byte, :int16, :int32, :string. 

 (pg-execute conn portal-name &optional maximal-return-rows)
     Executes the execution plan that was previously bound to
     PORTAL-NAME. Optionally returns up to MAXIMAL-RETURN-ROWS rows
     (0 means an unlimited number of rows).  

 (pg-close-statement conn statement-name)
     Releases the command execution plan (prepared statement)
     STATEMENT-NAME. This also releases any open portals for that
     prepared statement.

 (pg-close-portal conn portal-name)
     Releases the portal PORTAL-NAME.

Example using prepared statements:

    (defun delete-item (db-connection int-value string-value)
      (pg-prepare db-connection "delete-statement"
    		  "DELETE FROM items WHERE int_column = $1 AND string_column = $2"
    		  `("int4" "varchar"))
      (unwind-protect
           (progn (pg-bind db-connection "delete-portal" "delete-statement"
    		       `((:int32 ,int-value) (:string ,string-value)))
    	      (pg-execute db-connection "delete-portal"))
        ;; NB: portal is closed automatically when statement is closed
        (pg-close-statement db-connection "delete-statement")))


=== Introspection support ==============================================

 (pg-databases connection) -> list of strings
     Return a list of the databases available at this site (a
     database is a set of tables; in a virgin PostgreSQL
     installation there is a single database named "template1").

 (pg-tables connection) -> list of strings
     Return a list of the tables present in the database to which we
     are currently connected. Only include user tables: system
     tables are excluded.

 (pg-columns connection table) -> list of strings
     Return a list of the columns (or attributes) in TABLE, which
     must be a table in the database to which we are currently
     connected. We only include the column names; if you want more
     detailed information (attribute types, for example), it can be
     obtained from `pg-result' on a SELECT statement for that table.


=== Support for large objects (BLOBs) =================================

 (pglo-create conn . args) -> oid
     Create a new large object (BLOB, or binary large object in
     other DBMSes parlance) in the database to which we are
     connected via CONN. Returns an OID (which is represented as an
     integer) which will allow you to use the large object.
     Optional ARGS are a Unix-style mode string which determines the
     permissions of the newly created large object, one of "r" for
     read-only permission, "w" for write-only, "rw" for read+write.
     Default is "r".

     Large-object functions MUST be used within a transaction (see
     the macro `with-pg-transaction').

 (pglo-open conn oid . args) -> fd
     Open a large object whose unique identifier is OID (an integer)
     in the database to which we are connected via CONN. Optional ARGS
     is a Unix-style mode string as for pglo-create; which defaults to
     "r" read-only permissions. Returns a file descriptor (an integer)
     which can be used in other large-object functions.

 (pglo-close conn fd)
     Close the file descriptor FD which was associated with a large
     object. Note that this does not delete the large object; use
     PGLO-UNLINK for that.

 (pglo-read conn fd bytes) -> string
     Read BYTES from the file descriptor FD which is associated with a
     large object. Return a string which should be BYTES characters
     long.

 (pglo-write connection fd buf)
     Write the bytes contained in the string BUF to the large object
     associated with the file descriptor FD.

 (pglo-lseek conn fd offset whence)
     Do the equivalent of a lseek(2) on the file descriptor FD which
     is associated with a large object; ie reposition the read/write
     file offset for that large object to OFFSET (an integer). WHENCE
     has the same significance as in lseek(); it should be one of
     SEEK_SET (set the offset to the absolute position), SEEK_CUR (set
     the offset relative to the current offset) or SEEK_END (set the
     offset relative to the end of the file). WHENCE should be an
     integer whose values can be obtained from the header file
     <unistd.h> (probably 0, 1 and 2 respectively).

 (pglo-tell conn oid) -> integer
     Do the equivalent of an ftell(3) on the file associated with
     the large object whose unique identifier is OID. Returns the
     current position of the file offset for the object's associated
     file descriptor, as an integer.

 (pglo-unlink conn oid)
     Remove the large object whose unique identifier is OID from the
     system (in the current implementation of large objects in
     PostgreSQL, each large object is associated with an object in
     the filesystem).

 (pglo-import conn filename) -> oid
     Create a new large object and initialize it to the data contained
     in the file whose name is FILENAME. Returns an OID (as an
     integer). Note that is operation is only syntactic sugar around
     the basic large-object operations listed above.

 (pglo-export conn oid filename)
     Create a new file named FILENAME and fill it with the contents
     of the large object whose unique identifier is OID. This
     operation is also syntactic sugar.

 Boolean variable `*PG-DISABLE-TYPE-COERCION*' which can be set to
 non-nil (before initiating a connection) to disable the library's
 type coercion facility. Default is t.



 SECURITY NOTE: please note that your postmaster has to be started
 with the `-i' option in order for it to accept TCP/IP connections
 (typically this is not the default setting). See the PostgreSQL
 documentation at <URL:http://www.PostgreSQL.org/> for more
 information. 

 Setting up PostgreSQL to accept TCP/IP connections has security
 implications; please consult the documentation for details. You can
 connect to the database using Unix domain sockets if you wish to
 avoid setting up PostgreSQL to listen on a TCP socket.

 pg.lisp is able to use the crypt authentication method to avoid
 sending the password in cleartext over the wire (this assumes access
 to the `crypt' function via the FFI -- see sysdep.lisp). It can also
 use md5 passwords (which are used with the WITH ENCRYPTED PASSWORD
 form of the CREATE USER command), thanks to Pierre Mai's portable md5
 library. It does not support the Kerberos authentication method, nor
 OpenSSL connections (though this should not be difficult if your
 Common Lisp implementation is able to open SSL streams).

 It is also possible to use the port forwarding capabilities of ssh to
 establish a connection to the backend over TCP/IP, which provides
 both a secure authentication mechanism and encryption (and optionally
 compression) of data passing through the tunnel. Here's how to do it
 (thanks to Gene Selkov, Jr. <selkovjr@mcs.anl.gov> for the
 description):

 1. Establish a tunnel to the backend machine, like this:

 	ssh -L 3333:backend.dom:5432 postgres@backend.dom

    The first number in the -L argument, 3333, is the port number of
    your end of the tunnel. The second number, 5432, is the remote
    end of the tunnel -- the port number your backend is using. The
    name or the address in between the port numbers belongs to the
    server machine, as does the last argument to ssh that also includes
    the optional user name. Without the user name, ssh will try the
    name you are currently logged on as on the client machine. You can
    use any user name the server machine will accept, not necessarily
    those related to postgres.

 2. Now that you have a running ssh session, you can point pg.lisp to
    the local host at the port number which you specified in step 1.
    For example,

         (pg-connect "dbname" "user" :port 3333)

    You can omit the port argument if you chose 5432 as the local
    end of the tunnel, since pg.lisp defaults to this value.


At various times, this code has been tested or reported to work with

   * CMUCL 18d, 18e, 19a, 19c on Solaris/SPARC and Linux/x86
   * SBCL 0.9.2 to 0.9.16 on Linux/x86
   * CLISP 2.30 on LinuxPPC and SPARC
   * OpenMCL 0.13.x and 0.14.x on LinuxPPC
   * Armed Bear Common Lisp 
   * ACL 6.1 trial/x86
   * Lispworks 4.3 on Linux and Windows
   * PostgreSQL versions 6.5, 7.0, 7.1.2, 7.2, 7.3, 7.4, 8.0, 8.1


You may be interested in using "pg-psql" by Harley Gorrell, which
provides a psql-like listener interface to PostgreSQL (together with
tabulated output), on top of this library. See

  <URL:http://www.mahalito.net/~harley/cl/pg-psql.lisp>