;; Copyright (c) 2011 nklein software ;; MIT License. See included LICENSE.txt file for licensing details.
Patrick Stein (email@example.com)
The USerial library is a general purpose library for serializing items into byte buffers and unserializing items back out of byte buffers. The "Buffer Handling" section below describes the various ways one can manipulate USerial buffers. The "Serializing and Unserializing" section below describes the different tools the USerial library provides for creating serializers and the serializers that the library provides out of the box. The "Sample Application: Game Protocol" section below describes how one might put all of these functions to use in preparing network packets for a simple game.
To serialize, one needs a place to put the data. To unserialize, one needs a place from which to fetch the data. Some libraries choose to implement such things as streams. The USerial library serializes to and unserializes from memory buffers because the primary goal for this library is to facilitate assembly and disassembly of datagram packets.
The USerial library uses adjustable arrays of unsigned bytes with fill pointers. The fill pointer is used to track the current position in the buffer for serializing or unserializing. The buffers are automatically resized to accommodate the serialized data.
The basic types and constants used for buffer-related operations are described in the "Buffer-related Types and Constants" section below.
The USerial library provides a function for allocating a new buffer. This function is described in the "Creating Buffers" section below.
Many USerial library routines that use a buffer declare the buffer parameter as a key parameter. There is a macro one can use to execute a body of statements with a particular buffer as the default for calls in which the buffer parameter is omitted. This macro is described in the "Using a Buffer" section below.
The only exported USerial library routines that take a buffer
parameter directly (as opposed to with a key parameter) are the
with-buffer macro and the unserialize-let* macro.
There are a variety of functions provided to allow one to query and manipulate the size of USerial buffers. These functions are described in the "Manipulating and Querying Buffer Sizes" section below.
There are some basic functions for adding an unsigned byte to a buffer and retrieving an unsigned byte from a buffer. Those functions are described below in the "Adding and Retrieving Bytes" section.
The USerial buffers are adjustable arrays of unsigned bytes with fill pointers. The array is adjustable so that it can be easily grown as needed to accommodate serialized data. It has a fill pointer that is used to track the current length of serialized data (as distinguished from the current allocated capacity of the array) or the current point from which data will be unserialized.
(deftype buffer () '(array (unsigned-byte 8) (*)))
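As a rough illustration of what such an array looks like in plain Common Lisp, the sketch below builds one with make-array and fills it with vector-push-extend. It does not use the USerial library; the variable name *demo-buf* is hypothetical and chosen only for this example.

```lisp
;; A sketch using only standard Common Lisp; *DEMO-BUF* is a hypothetical name.
(defparameter *demo-buf*
  (make-array 4
              :element-type '(unsigned-byte 8)
              :adjustable t
              :fill-pointer 0))

;; VECTOR-PUSH-EXTEND stores at the fill pointer, advances it, and
;; grows the array when the allocated capacity is exhausted.
(dotimes (i 6)
  (vector-push-extend i *demo-buf*))

(length *demo-buf*)             ; => 6, the fill pointer (bytes stored so far)
(array-dimension *demo-buf* 0)  ; allocated capacity, at least 6
```

The distinction between the two final calls is the same one the text draws between the length of the serialized data and the allocated capacity of the buffer.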
When one creates a USerial buffer, one can provide the initial capacity for the buffer. If no initial capacity is given for the buffer, a default size is used.
(defconstant +default-buffer-capacity+ 32768)
When one is adding bytes to a buffer, it would be very inefficient to
reallocate the buffer each time an additional byte of space is needed.
To this end, when the USerial library needs to increase the size of a
buffer, it adds at least the minimum of the current buffer size and
+default-buffer-expand+.
(defconstant +default-buffer-expand+ 8192)
For example, if the buffer were currently 256 bytes when the buffer needed to grow by a byte, it would be expanded to 512 bytes. If the buffer were currently 10,000 bytes when the buffer needed to grow by a byte, it would be expanded to 18,192 bytes.
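The arithmetic behind those two figures can be checked with a small helper. The function next-capacity below is hypothetical (it is not exported by USerial) and simply restates the growth rule described above, with 8192 standing in for +default-buffer-expand+.

```lisp
;; Hypothetical helper, not part of USerial: the smallest capacity the
;; growth rule above would produce for a buffer of CURRENT bytes.
(defun next-capacity (current &optional (expand 8192))
  (+ current (min current expand)))

(next-capacity 256)    ; => 512
(next-capacity 10000)  ; => 18192
```

Small buffers therefore double, while large buffers grow by a fixed 8192 bytes at a time.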
The buffer allocator itself is the
make-buffer function. It takes
an optional parameter specifying the initial capacity of the buffer.
(ftype (function (&optional (integer 1 *)) buffer) make-buffer)
(defun make-buffer (&optional initial-capacity) ...)
As the buffer will be resized as needed, this parameter need not be set high enough to accommodate any and all serializations. It is provided merely to keep from having to reallocate the buffer several times if one can provide a decent, probable upper bound on the serialized size of the contents.
Most of the buffer manipulation and serialization functions take the buffer as an optional key parameter. The following macro allows one to specify the buffer to use for these functions when the buffer parameter is omitted.
(defmacro with-buffer (buffer &body body))
This macro binds the dynamic variable
*buffer* to
the given buffer for the duration of the body.
(declaim (special *buffer*))
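The mechanism can be sketched in a few lines of standard Common Lisp. The names *demo-buffer*, with-demo-buffer, and demo-length below are hypothetical stand-ins invented for this example; the library's own definitions may differ in detail.

```lisp
;; Hypothetical stand-ins for *BUFFER* and WITH-BUFFER.
(defvar *demo-buffer* nil)

(defmacro with-demo-buffer (buffer &body body)
  ;; Rebind the special variable for the dynamic extent of BODY.
  `(let ((*demo-buffer* ,buffer))
     ,@body))

;; A function whose BUFFER key parameter defaults to the dynamic variable.
(defun demo-length (&key (buffer *demo-buffer*))
  (length buffer))

(with-demo-buffer (vector 1 2 3)
  (demo-length))   ; => 3
```

Because the default is evaluated at call time, any function called inside the body sees the rebound value, and the previous binding is restored once the body exits.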
When serializing a buffer, the
buffer-length function returns
the current length of the serialized data within the buffer. When
unserializing a buffer, the
buffer-length function returns the
current length of the serialized data which has already been
unserialized from the buffer.
(ftype (function (&key (:buffer buffer)) (integer 0 *)) buffer-length)
(defun buffer-length (&key (buffer *buffer*)) ...)
The current allocated size of a buffer can be queried with the
buffer-capacity function. One can
(setf ...) the
buffer-capacity if needed to explicitly modify the amount of
buffer space allocated.
(ftype (function (&key (:buffer buffer)) (integer 0 *)) buffer-capacity)
(defun buffer-capacity (&key (buffer *buffer*)) ...)
(setf (buffer-capacity &key (buffer *buffer*)) (integer 0 *))
One can advance the current position within the buffer either to save space for later serialization or to skip over bytes during unserialization.
(ftype (function (&key (:amount (integer 0 *)) (:buffer buffer)) buffer) buffer-advance)
(defun buffer-advance (&key (amount 1) (buffer *buffer*)) ...)
If the amount is not specified, the
buffer-advance function advances by a single byte.
One can reset the current position within the buffer back to the beginning to begin unserializing a serialized buffer, to fill in places that one skipped during the first stage of serialization, or to re-use the same buffer for the next serialization.
(ftype (function (&key (:buffer buffer)) buffer) buffer-rewind)
(defun buffer-rewind (&key (buffer *buffer*)) ...)
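The "reserve space, then come back and fill it in" pattern can be sketched with a bare fill-pointer array. This is an illustration in standard Common Lisp only; how buffer-advance and buffer-rewind map onto fill pointer manipulation is an assumption, and the function name reserve-then-fill is hypothetical.

```lisp
;; A sketch of reserving a count byte, writing a payload, and then
;; going back to fill in the count. Standard Common Lisp only.
(defun reserve-then-fill ()
  (let ((buf (make-array 8 :element-type '(unsigned-byte 8)
                           :adjustable t :fill-pointer 0)))
    (setf (fill-pointer buf) 1)          ; "advance": skip one byte for a count
    (vector-push-extend 10 buf)          ; payload
    (vector-push-extend 20 buf)
    (let ((end (fill-pointer buf)))
      (setf (fill-pointer buf) 0)        ; "rewind" to the beginning
      (vector-push-extend (- end 1) buf) ; fill in the reserved count byte
      (setf (fill-pointer buf) end)      ; move back to the end of the data
      (coerce buf 'list))))

(reserve-then-fill)   ; => (2 10 20)
```

Lowering the fill pointer does not erase the bytes beyond it, which is what makes this two-pass style of serialization possible.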
At its base, the buffer class is an adjustable array of unsigned bytes. To add a byte to a buffer, one can use the following function. This function will expand the buffer if needed, place the given byte at the current fill pointer and advance the fill pointer.
(ftype (function (uchar &key (:buffer buffer)) buffer) buffer-add-byte)
(defun buffer-add-byte (byte &key (buffer *buffer*)) ...)
Similarly, to retrieve an unsigned byte from a buffer, one can use the following function. This function will retrieve the byte at the current fill pointer and advance the fill pointer.
(ftype (function (&key (:buffer buffer)) (values uchar buffer)) buffer-get-byte)
(defun buffer-get-byte (&key (buffer *buffer*)) ...)
The ultimate purpose of the USerial library is to allow one to serialize and unserialize data. To this end, the library defines two generic functions that dispatch on a keyword parameter. These generic functions are described in the "Serializing and Unserializing" section below.
There are some macros that facilitate serializing and unserializing sequences of items. These macros are described in the "Serializing and Unserializing Multiple Items" section below.
There are other macros which facilitate defining new serialize and unserialize methods for common situations. These macros are described in the "Defining New Serializers" section below.
There are a variety of pre-defined serializers and unserializers.
These are described in the "Pre-defined Serializers" section below.
The generic function used to serialize items takes a keyword as its first parameter, a value as its second parameter, and an optional buffer. The keyword is used to dispatch the appropriate implementation of the function for the given value. The serialize methods serialize the value into the buffer and return the buffer.
(ftype (function (symbol t &key (:buffer buffer) &allow-other-keys) buffer) serialize)
(defgeneric serialize (keyword value &key (buffer *buffer*) &allow-other-keys))
The generic function used to unserialize items takes a keyword as its first parameter and an optional buffer. The keyword is used to dispatch the appropriate implementation of the function. The unserialize methods unserialize a value from the buffer and return the value and buffer.
(ftype (function (symbol &key (:buffer buffer) &allow-other-keys) (values t buffer)) unserialize)
(defgeneric unserialize (keyword &key (buffer *buffer*) &allow-other-keys))
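Dispatching on a keyword is typically done in CLOS with eql specializers. The following is a minimal sketch of that technique under stated assumptions: demo-serialize, the keys :u8 and :u16, and the byte order are all hypothetical and are not taken from the USerial source.

```lisp
;; Hypothetical generic function illustrating keyword dispatch.
(defgeneric demo-serialize (key value buffer)
  (:documentation "Append VALUE to BUFFER in the format named by KEY."))

(defmethod demo-serialize ((key (eql :u8)) value buffer)
  (vector-push-extend (ldb (byte 8 0) value) buffer)
  buffer)

(defmethod demo-serialize ((key (eql :u16)) value buffer)
  ;; Most significant byte first; this byte order is an assumption.
  (vector-push-extend (ldb (byte 8 8) value) buffer)
  (vector-push-extend (ldb (byte 8 0) value) buffer)
  buffer)

(defun demo-serialize-example ()
  (let ((buf (make-array 0 :element-type '(unsigned-byte 8)
                           :adjustable t :fill-pointer 0)))
    (demo-serialize :u8 7 buf)
    (demo-serialize :u16 #x0102 buf)
    (coerce buf 'list)))

(demo-serialize-example)   ; => (7 1 2)
```

Adding a new wire format then amounts to defining one more method specialized on a new keyword, which is the extension point the macros in later sections build on.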
For most purposes, one wants to serialize more than one thing into a given buffer. The USerial library provides some convenience macros so that one is not forced to explicitly call serialize or unserialize for each item. Here is an example of explicitly calling the serialize method for each item.
(with-buffer (make-buffer 1024)
  (serialize :opcode :login)
  (serialize :string login-name)
  (serialize :string password)
  (serialize :login-flags '(:hidden)))
The first such macro is
serialize*. With this macro, one specifies
a keyword-value list and an optional buffer. With it, the above example
could be serialized as follows.
(serialize* (:opcode :login
             :string login-name
             :string password
             :login-flags '(:hidden))
            :buffer (make-buffer 1024))
To unserialize from the resulting buffer, one could explicitly call unserialize for each item in the buffer storing each item explicitly into a place.
(let (opcode login-name password flags)
  (with-buffer buffer
    (setf opcode (unserialize :opcode)
          login-name (unserialize :string)
          password (unserialize :string)
          flags (unserialize :login-flags)))
  ...)
To do the same sort of thing more directly, one can use the
unserialize* macro. This macro allows one to unserialize
from a given buffer into given places using given keywords
on which to dispatch.
(let (opcode login-name password flags)
  (unserialize* (:opcode opcode
                 :string login-name
                 :string password
                 :login-flags flags)
                :buffer buffer)
  ...)
Another way one might have used explicit calls to unserialize
is to replace the
let construct in the above with a
let* and unserialize each variable as it is created.
(with-buffer buffer
  (let* ((opcode (unserialize :opcode))
         (login-name (unserialize :string))
         (password (unserialize :string))
         (flags (unserialize :login-flags)))
    ...))
To condense the above, one can use the
unserialize-let* macro. It
takes a list of keyword/variable-name pairs, a buffer, and a body of
statements to execute while the named variables are in scope.
Note that the buffer argument is required here, not optional.
(unserialize-let* (:opcode opcode
                   :string login-name
                   :string password
                   :login-flags flags)
    buffer
  ...)
Suppose one wanted to unserialize into a list (as this is Lisp after all). One could explicitly call unserialize for each item in the list.
(with-buffer buffer
  (list (unserialize :opcode)
        (unserialize :string)
        (unserialize :string)
        (unserialize :login-flags)))
To eliminate a great deal of typing the word
unserialize, one can use the
unserialize-list* macro. The macro takes a list of keywords
and an optional buffer. It returns a list as the first value and the
buffer as the second value.
(unserialize-list* (:opcode :string :string :login-flags) :buffer buffer)
Almost every protocol requires the encoding and decoding of integer values. To make it easy to create as many of these types as one's application requires, the USerial library defines a macro which creates a serialize and unserialize method for an integer that is a given number of bytes long. The macro takes two arguments: the key used to specify the method and an integer number of bytes.
(defmacro make-int-serializer (key bytes))
For example, to make serialize and unserialize methods for signed bytes and signed quadwords, one could simply call:
(make-int-serializer :signed-byte 1)
(make-int-serializer :signed-quadword 8)
Similarly, if one wanted to create serialize and unserialize methods for unsigned bytes and unsigned doublewords, one could use the following macro:
(defmacro make-uint-serializer (key bytes))
(make-uint-serializer :unsigned-byte 1)
(make-uint-serializer :unsigned-doubleword 4)
The bytes argument to the make-int-serializer and
make-uint-serializer macros must be a constant value available
at the time the macro is expanded.
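To make the idea of "an integer that is a given number of bytes long" concrete, here is a sketch of packing a signed integer into a fixed number of bytes and recovering it. The helpers int->bytes and bytes->int are hypothetical, and both the two's-complement representation and the most-significant-byte-first order are assumptions about the wire format rather than statements about what the generated methods do.

```lisp
;; Hypothetical helpers; representation and byte order are assumptions.
(defun int->bytes (value nbytes)
  "Two's-complement bytes of VALUE, most significant byte first."
  (loop for i from (1- nbytes) downto 0
        collect (ldb (byte 8 (* 8 i)) value)))

(defun bytes->int (bytes)
  "Inverse of INT->BYTES for a non-empty list of bytes."
  (let* ((nbits (* 8 (length bytes)))
         (unsigned (reduce (lambda (acc b) (+ (* acc 256) b))
                           bytes :initial-value 0)))
    ;; A set top bit means the value was negative.
    (if (logbitp (1- nbits) unsigned)
        (- unsigned (ash 1 nbits))
        unsigned)))

(int->bytes 300 2)                 ; => (1 44)
(int->bytes -2 2)                  ; => (255 254)
(bytes->int (int->bytes -2 2))     ; => -2
```

An unsigned variant would simply skip the sign check in bytes->int.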
To serialize floating point numbers, one must have a function
that encodes floating point numbers into an integer representation
and a function that decodes the integer representation back into
a floating point number. Then, one can use the make-float-serializer
macro, which takes a key used to specify the method, a Lisp type for the
floating point number, a constant number of bytes for the encoded
values, an encoder, and a decoder.
(defmacro make-float-serializer (key type bytes encoder decoder))
For example, the following would create serializers that encode rational numbers (technically not floating point, I know) as 48-bit fixed point numbers with 16 bits devoted to the fractional portion and 32 bits devoted to the integer portion.
(make-float-serializer :fixed-32/16 rational 6
  #'(lambda (rr) (round (* rr 65536)))
  #'(lambda (ii) (/ ii 65536)))
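The encoder and decoder in that example can be tried on their own, without the library. The names encode-fixed and decode-fixed are hypothetical wrappers around the same two lambda bodies.

```lisp
;; The same arithmetic as the two lambdas above, given names for testing.
(defun encode-fixed (rr) (round (* rr 65536)))
(defun decode-fixed (ii) (/ ii 65536))

(encode-fixed 3/2)                 ; => 98304
(decode-fixed (encode-fixed 3/2))  ; => 3/2
(decode-fixed (encode-fixed 1/3))  ; => 21845/65536, close to but not exactly 1/3
```

Values whose denominators divide 65536 survive the round trip exactly; anything else is rounded to the nearest multiple of 1/65536.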
The USerial library defines macros for helping one encode bit fields (to represent choices where more than one possibility at a time is acceptable) and enumerations (to represent choices where only a single selection can be made). These macros take a keyword used to specify the method and a list of choices.
(make-bitfield-serializer :wants (:coffee :tea :sega))
(make-enum-serializer :direction (:left :right :up :down))
With the bit field serializer, one can specify a single option or a list of zero or more options. With the enumeration serializer, one must specify a single option.
(serialize :wants :tea)
(serialize :wants nil)
(serialize :wants '(:tea :sega))
(serialize :direction :up)
When unserializing, the bit field will always return a list even when there
is a single item in it as in the
:tea example above.
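One plausible way to picture a bit field encoding is to give each choice one bit of an integer. The sketch below is an assumption-laden illustration, not the library's implementation: the helper names and the assignment of bit 0 to the first choice are hypothetical.

```lisp
;; Hypothetical helpers; the bit assigned to each choice is an assumption.
(defparameter *wants-choices* '(:coffee :tea :sega))

(defun choices->bits (selected choices)
  "Accept a single keyword or a list of keywords, as described above."
  (let ((selected (if (listp selected) selected (list selected))))
    (loop for choice in choices
          for bit from 0
          when (member choice selected)
            sum (ash 1 bit))))

(defun bits->choices (bits choices)
  "Always returns a list, even for a single selected choice."
  (loop for choice in choices
        for bit from 0
        when (logbitp bit bits)
          collect choice))

(choices->bits :tea *wants-choices*)            ; => 2
(choices->bits nil *wants-choices*)             ; => 0
(choices->bits '(:tea :sega) *wants-choices*)   ; => 6
(bits->choices 2 *wants-choices*)               ; => (:TEA)
```

The decoder has no way to tell whether :tea was originally passed alone or as a one-element list, which is why a list always comes back.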
To facilitate serializing and unserializing classes and structs, the USerial library provides macros which create serializers and unserializers for items based on slots or accessors. These macros take a key used to specify the methods, a factory form used by the unserialize method to create a new instance of the class or struct, and a plist of key/name pairs. In each pair, the name is a slot name (for the slot serializers) or an accessor name (for the accessor serializers), and the key specifies how to serialize the value in that slot.
An example will help to clarify the previous paragraph. Suppose one had a simple struct listing a person's name, age, and favorite color.
(defstruct person name age color)
One could create the following serialize and unserialize pairs to allow encoding the data for internal use (where all data is available) or for public use (where the age is kept secret).
(make-slot-serializer :person-internal
  (make-person)
  (:string name :uint8 age :string color))
(make-accessor-serializer :person-public
  (make-person :age :unknown)
  (:string person-name :string person-color))
Here is a simple session showing the above in action. The following code first defines a function which serializes a value using a given key to a new buffer, rewinds the buffer, and unserializes from the buffer using the key.
CL-USER> (defun u-s (key value)
           (with-buffer (make-buffer)
             (serialize key value)
             (buffer-rewind)
             (nth-value 0 (unserialize key))))
U-S
CL-USER> (defvar *p* (make-person :name "Patrick" :age 40 :color "Green"))
*P*
CL-USER> (u-s :person-internal *p*)
#S(PERSON :NAME "Patrick" :AGE 40 :COLOR "Green")
CL-USER> (u-s :person-public *p*)
#S(PERSON :NAME "Patrick" :AGE :UNKNOWN :COLOR "Green")
The USerial library defines some commonly required serializers.
For signed integers, the USerial library defines the following
serializers (and unserializers):
:int8 for signed bytes,
:int16 for signed 16-bit integers,
:int32 for signed 32-bit integers, and
:int64 for signed 64-bit integers.
For unsigned integers, the USerial library defines the following
serializers (and unserializers):
:uint8 for unsigned bytes,
:uint16 for unsigned 16-bit integers,
:uint24 for unsigned 24-bit integers,
:uint32 for unsigned 32-bit integers,
:uint48 for unsigned 48-bit integers, and
:uint64 for unsigned 64-bit integers.
For floating point numbers, the USerial library defines the
:float32 serializer for encoding
single-float values as 32-bit IEEE floating
point numbers and the
:float64 serializer for encoding
double-float values as 64-bit IEEE floating point numbers. The
USerial library uses the ieee-float library to encode
and decode floating point numbers.
For arbitrary byte sequences, the USerial library defines the
For strings, the USerial library defines the
:string serializer for
encoding strings as UTF-8 encoded sequences of arbitrary length. The
USerial library uses the trivial-utf-8 library to
encode and decode UTF-8 strings.
For enumerated types, the USerial library defines the :boolean
serializer for encoding an option that will be either nil or t.
For various applications, it may be useful to log serialized messages.
The USerial library provides a simple way to do that with the
serialize-log macro. The
serialize-log macro takes a
category and arguments for
serialize* and invokes a logger
(if one is available) with the serialized information. [Currently,
cl-log is the only supported logging system since it is the only
one that I am certain will accept binary messages.]
For example, if one wanted to log an integer and two strings in the
:packet category, one might do the following:
(serialize-log :packet :int32 the-int :string s1 :string s2)
This example shows how one might use the tools above to serialize the data that would need to be exchanged between a client and server to implement a two-player game similar to Milton-Bradley's Battleship game.
For this game, there will be a server and two clients. Each client will begin the game by placing his ships on a (2K+1)x(2K+1) board. The board will have coordinates ranging from -K through +K along both the X and Y axes. Ships will have to be placed either horizontally or vertically at integer coordinates. All ships are three units in length. It takes only one missile shot to sink a ship.
Once the ships are placed, regular play begins. On his turn, a client can either ping or fire. Each client begins with a defined amount of energy available with which to ping and a defined number of missiles.
If the client chooses to ping, the client chooses the radius of the ping and its center of origin. The server will calculate the distance from the center of origin to each enemy ship within the specified radius from the origin, round those distances to the nearest integer, and reply to the client with that list.
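The server-side calculation described here might look something like the following sketch. The function name ping-distances and the representation of ships as (x . y) conses are hypothetical choices made for this example; the document itself does not specify either.

```lisp
;; Hypothetical sketch of the ping calculation. SHIPS is assumed to be
;; a list of (x . y) ship centers.
(defun ping-distances (cx cy radius ships)
  (loop for (sx . sy) in ships
        for dist = (sqrt (+ (expt (- sx cx) 2)
                            (expt (- sy cy) 2)))
        when (<= dist radius)
          ;; ROUND rounds to the nearest integer (ties go to the even one).
          collect (round dist)))

(ping-distances 0 0 6 '((3 . 4) (10 . 0) (0 . -2)))   ; => (5 2)
```

The ship at (10 . 0) lies outside the radius of 6, so it contributes nothing to the reply.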
If the client chooses to fire, the client chooses the location upon which to fire. The server will respond to the client to tell him whether the shot was a hit or a miss.
To facilitate handling of received messages, each message will begin with an opcode identifying the message type. Some messages will be sent only from the client to the server. Others will be sent only from the server to the client.
(make-enum-serializer :client-opcodes (:login :place-ship :ping :fire))
(make-enum-serializer :server-opcodes (:welcome :ack :sunk :shot-results))
The message-receiving portion on the server side could then do something like this:
(defun handle-message-from-client (message)
  (ecase (unserialize :client-opcodes :buffer message)
    (:login (handle-login-message message))
    (:place-ship (handle-place-ship-message message))
    (:ping (handle-ping-message message))
    (:fire (handle-fire-message message))))
To begin a game, the client sends a message to the server with
the :login opcode. The message declares the player's name,
which board sizes the client will play, and an optional name of
an opponent that the client is waiting to play.
(make-bitfield-serializer :playable-board-sizes
                          (:small :medium :large :huge))
(defun make-login-message (name &key opponent small medium large huge)
  (let ((sizes (append (when small '(:small))
                       (when medium '(:medium))
                       (when large '(:large))
                       (when huge '(:huge))))
        (message (make-buffer)))
    (with-buffer message
      ;; The opcode key matches the :client-opcodes enum defined earlier.
      (serialize* (:client-opcodes :login
                   :string name
                   :playable-board-sizes sizes
                   :boolean (if opponent t nil)))
      ;; The opponent name is only present when the flag above is true.
      (when opponent
        (serialize :string opponent)))
    message))
On the receiving side, the server might do something like the following (given that it already read the opcode from the message as it had in the previous section).
(defun handle-login-message (message)
  (unserialize-let* (:string name
                     :playable-board-sizes sizes
                     :boolean has-opponent)
      message
    (assert (plusp (length name)))
    (assert (plusp (length sizes)))
    (cond
      (has-opponent
       (unserialize-let* (:string opponent) message
         (match-or-queue name sizes opponent)))
      (t
       (match-or-queue name sizes)))))
When the server finds a match for the requested game, it composes welcome messages to each client. The welcome message contains the size of the board in squares, the number of ships each player has, the amount of ping energy each player has, the number of missiles each player has, and the name of the opponent.
(defun make-welcome-message (squares ships energy missiles opponent)
  (serialize* (:server-opcodes :welcome
               :uint8 squares
               :uint8 ships
               :float32 energy
               :uint16 missiles
               :string opponent)
              :buffer (make-buffer)))
Suppose the client had a class it was using to track the current state of the game. The client could then use a slot-serializer or accessor-serializer to parse the incoming welcome message.
(make-accessor-serializer :game-state-from-welcome
  (make-game-state)
  (:uint8 game-state-board-size
   :uint8 game-state-ships
   :float32 game-state-energy
   :uint16 game-state-missiles
   :string game-state-opponent))
The client could then handle the welcome message as follows (assuming the opcode has already been unserialized from the message buffer):
(defun handle-welcome-message (message)
  (unserialize-let* (:game-state-from-welcome game-state) message
    ;; do anything with this game state here
    game-state))
To place ships, a client specifies the center coordinate of the ship and whether the ship is oriented horizontally or vertically.
(make-enum-serializer :orientation (:horizontal :vertical))
(defun make-place-ship-message (x y orientation)
  (serialize* (:client-opcodes :place-ship
               :int8 x
               :int8 y
               :orientation orientation)
              :buffer (make-buffer)))
The server could read the coordinates and orientation into local variables before calling a method to add the ship to the map.
(defun handle-place-ship-message (message)
  (let (x y orientation)
    (unserialize* (:int8 x :int8 y :orientation orientation)
                  :buffer message)
    (add-ship-to-map x y :is-vertical (eql orientation :vertical))))
To perform a ping move, a client encodes a radius and a center for the ping.
(serialize* (:client-opcodes :ping :float32 radius :int8 x :int8 y)
            :buffer (make-buffer))
Here, the server will decode the ping request into a list to send to its routine to calculate the reply.
(apply #'calculate-ping-response (unserialize-list* (:float32 :int8 :int8) :buffer message))
Supposing the return from
calculate-ping-response is a list of
distances to ships, the ack message could be encoded like this:
(with-buffer ack-message
  (serialize* (:server-opcodes :ack
               :float32 remaining-ping-energy
               :uint16 (length hits)))
  ;; Append each rounded distance after the count.
  (dolist (d hits)
    (serialize :uint8 d)))
To send a fire message, the client just sends the coordinates of the location upon which to fire.
(serialize* (:client-opcodes :fire :int8 x :int8 y) :buffer (make-buffer))
If the server determines the shot was a hit, it must send a sunk message to the opponent. Either way, a shot results message must be sent to the client.
(make-enum-serializer :shot-result (:hit :miss))
(defun make-sunk-message (x y)
  (serialize* (:server-opcodes :sunk :int8 x :int8 y)
              :buffer (make-buffer)))
(defun make-shot-results-message (hit)
  (serialize* (:server-opcodes :shot-results
               :shot-result (if hit :hit :miss))
              :buffer (make-buffer)))