Calispel

J.P. Larocque


Table of Contents

1. Introduction
1.1. Obtaining Calispel
1.2. Copying
1.3. Contact Information
2. Examples
3. Reference
3.1. The CHANNEL Class
3.2. ? and !: Basic I/O Functions
3.3. PRI-ALT and FAIR-ALT: Alternation Among Several Operations
3.4. Dynamic Alternation
4. Design Decisions
4.1. Omitted Features
4.1.1. Thread Management Utilities
4.1.2. Pattern Matching
4.2. Internal Warts
4.2.1. The Global Lock
4.2.2. Outsmarting the Garbage Collector

1. Introduction

Calispel is a Common Lisp library for thread-safe message-passing channels, in the style of the occam programming language.

Calispel channels let one thread communicate with another, facilitating unidirectional communication of any Lisp object. Channels may be unbuffered, where a sender waits for a receiver (or vice versa) before either operation can continue, or channels may be buffered with flexible policy options.

Because sending and receiving on a channel may block, either operation can time out after a specified amount of time.

A syntax for alternation is provided (like ALT in occam, or Unix select()): given a sequence of operations, any or all of which may block, alternation selects the first operation that doesn't block and executes associated code. Alternation can also time out, executing an "otherwise" clause if no operation becomes available within a set amount of time.

Many CSP- and occam-style channel libraries offer features like parallel execution (i.e. occam PAR). Calispel is a message-passing library, and as such leaves the role of threading abstractions and utilities left to be filled by perfectly good, complementary libraries such as Bordeaux-Threads and Eager Future.

1.1. Obtaining Calispel

The latest version of Calispel, with accompanying documentation, can be found at: http://www.thoughtcrime.us/software/calispel/

The most recent release is 0.1, released 2009-10-19. It depends on: jpl-queues 0.1, cl-jpl-util 0.2, Eager Future 0.1, Bordeaux Threads

I sign all my software with OpenPGP, key ID 0xE979FFBEA002D20F, fingerprint A87B 1C5A 28C4 03BD 54BA CE8E E979 FFBE A002 D20F. (Releases were previously signed with 0x80555CED7394F948, which has been revoked, but not compromised. See my OpenPGP transition statement.)

1.2. Copying

The software and this document are licensed under permissive, BSD-like terms, copied from the ISC license:

Copyright © 2009, Jean-Paul Guy Larocque

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

This software was directly derived from csp.tgz, dated 2006-07-03, and published by Roger Peppe. No copyright notice giving attribution to Roger Peppe or any specific licensing terms seem to have been included in that version.

That software was derived from channel.c of Plan 9 libthread:

Copyright © 2005 Russ Cox, Massachusetts Institute of Technology

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

That software contains parts derived from an earlier library by Rob Pike, Sape Mullender, and Russ Cox:

Copyright © 2003 by Lucent Technologies.

Permission to use, copy, modify, and distribute this software for any purpose without fee is hereby granted, provided that this entire notice is included in all copies of any software which is or includes a copy or modification of this software and in all copies of the supporting documentation for such software.

THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED WARRANTY. IN PARTICULAR, NEITHER THE AUTHORS NOR LUCENT TECHNOLOGIES MAKE ANY REPRESENTATION OR WARRANTY OF ANY KIND CONCERNING THE MERCHANTABILITY OF THIS SOFTWARE OR ITS FITNESS FOR ANY PARTICULAR PURPOSE.

1.3. Contact Information

The author welcomes feedback, questions, help requests, and bug reports via e-mail: J.P. Larocque <jpl-software at thoughtcrime.us>

2. Examples

Create a channel with no buffering:

(defparameter *chan*
  (make-instance 'calispel:channel))

In another thread, sleep for 1 second, then send the number 42 to the channel. In the current thread, receive from the channel. At first, there will be no value available, so ? will wait until the other thread sends the value.

(progn
  (eager-future:pexec
    (sleep 1)
    (calispel:! *chan* 42))
  (calispel:? *chan*))
=> 42
   T

(42 is the value received, and T indicates that the receive was successful—it did not time out.)

Sending to the channel will also block without a waiting receiver, because channels are unbuffered by default. This will attempt to send to the channel, then time out after 2 seconds:

(calispel:! *chan* 'foo 2)
=> NIL

(NIL indicates that the send was not successful—it timed out.)

Create a new channel that is buffered:

(defparameter *chan*
  (make-instance 'calispel:channel
                 :buffer (make-instance 'jpl-queues:bounded-fifo-queue :capacity 2)))

This channel will accept up to two values that have not yet been received before sends will block:

(loop for i from 1
      while (calispel:! *chan* i 0)
      finally (format t "~&Stopped before ~:R value.~&" i))
>> Stopped before third value.

Now let's print them back out:

(loop
  (multiple-value-bind (value success?)
      (calispel:? *chan* 0)
    (when success?
      (format t "~&Value: ~S~&" value))
    (unless success?
      (return))))
>> Value: 1
Value: 2

Suppose that we have many channels that we're interested in receiving from or sending to. We can use alternation to select the first operation that is available, and then perform some action associated with the operation:

(let ((chan1 (make-instance 'calispel:channel)) ; chan1 goes unused
      (chan2 (make-instance 'calispel:channel)))
  (eager-future:pexec
    (calispel:! chan2 42))
  (calispel:pri-alt
    ((calispel:? chan1)
     ;; Nothing is sent to CHAN1, so it can't be ready.
     (format t "~&Got a value from CHAN1, but that should never happen.~&"))
    ((calispel:? chan2 value)
     ;; CHAN2 has either had something sent to it, or it soon will,
     ;; so this will execute.
     (format t "~&Got value from CHAN2: ~S~&" value))))
>> Got value from CHAN2: 42

What if there's more than one operation that is immediately possible? PRI-ALT chooses the first one available...

(let ((chan1 (make-instance 'calispel:channel))
      (chan2 (make-instance 'calispel:channel)))
  (eager-future:pexec
    (calispel:! chan1 'foo))
  (eager-future:pexec
    (calispel:! chan2 'bar))
  (sleep 1) ; Wait for both CHAN1 and CHAN2 to become ready.
  (calispel:pri-alt
    ((calispel:? chan1 value)
     (format t "~&Got value from CHAN1: ~S~&" value))
    ((calispel:? chan2 value)
     (format t "~&Got value from CHAN2: ~S~&" value))))
>> Got value from CHAN1: FOO

...whereas FAIR-ALT chooses any of the available operations:

(let ((chan1 (make-instance 'calispel:channel))
      (chan2 (make-instance 'calispel:channel)))
  (eager-future:pexec
    (calispel:! chan1 'foo))
  (eager-future:pexec
    (calispel:! chan2 'bar))
  (sleep 1) ; Wait for both CHAN1 and CHAN2 to become ready.
  (calispel:fair-alt
    ((calispel:? chan1 value)
     (format t "~&Got value from CHAN1: ~S~&" value))
    ((calispel:? chan2 value)
     (format t "~&Got value from CHAN2: ~S~&" value))))
>> Got value from CHAN1: FOO
(or, determined randomly)
>> Got value from CHAN2: BAR

Just like ? and !, PRI-ALT and FAIR-ALT allow time outs to be specified. An OTHERWISE clause is executed if no operation can be immediately performed, effectively putting a time out of 0 on all the operations:

(let ((chan1 (make-instance 'calispel:channel))
      (chan2 (make-instance 'calispel:channel)))
  (eager-future:pexec
    (sleep 1)
    (calispel:! chan1 'foo))
  (calispel:pri-alt
    ((calispel:? chan1 value)
     (format t "~&Got value from CHAN1: ~S~&" value))
    ((calispel:? chan2 value)
     (format t "~&Got value from CHAN2: ~S~&" value))
    (otherwise (format t "~&Timed-out.~&"))))
>> Timed-out.

You can also wait up to a certain amount of time before executing the OTHERWISE clause:

(let ((chan1 (make-instance 'calispel:channel))
      (chan2 (make-instance 'calispel:channel)))
  (eager-future:pexec
    (sleep 1)
    (calispel:! chan1 'foo))
  (calispel:pri-alt
    ((calispel:? chan1 value)
     (format t "~&Got value from CHAN1: ~S~&" value))
    ((calispel:? chan2 value)
     (format t "~&Got value from CHAN2: ~S~&" value))
    ((otherwise :timeout 5)
     (format t "~&Timed-out.~&"))))
>> Got value from CHAN1: FOO

(Try increasing the SLEEP delay to 6 to see that the PRI-ALT will still time out.)

3. Reference

3.1. The CHANNEL Class

Syntax. 

(MAKE-INSTANCE 'CHANNEL &key BUFFER)
=> (A CHANNEL instance.)

A channel is a medium that communicates messages from one thread to another.

All channels have a buffer. The default buffer doesn't do anything—it's always full and always empty. It has no storage.

BUFFER specifies the jpl-queues queue to buffer messages with.

Sending to a channel blocks when there is no other thread waiting to receive from it and there is no room in the buffer (i.e. JPL-QUEUES:FULL? returns true). Receiving from a channel blocks when there is no other thread waiting to send to it and there are no objects in the buffer (i.e. JPL-QUEUES:EMPTY? returns true).

To improve throughput with better parallelism, a meaningful buffer is recommended so that threads can perform useful work instead of waiting on other threads. Any jpl-queues queue may be used, but note:

  • The queue need not be "synchronized" (an instance of JPL-QUEUES:SYNCHRONIZED-QUEUE): Calispel has its own synchronization, so external synchronization will only add overhead.
  • The queue may not be shared with any other channels or be used for anything else, even if it's "synchronized." (Pedantic exception: if the queue strictly has no state, then it doesn't matter if it's shared. The default "null" queue has no state, and it is shared.)

3.2. ? and !: Basic I/O Functions

Syntax. 

(? CHANNEL &optional TIMEOUT)
=> VALUE
   RECEIVED-OK?
(! CHANNEL VALUE &optional TIMEOUT)
=> SENT-OK?

? receives a value from CHANNEL, waiting up to TIMEOUT seconds (a non-negative REAL number; or indefinitely if unspecified or NIL). If a value can be received before the time out, the value and T (indicating success) are returned. Otherwise, NIL and NIL (indicating failure) are returned.

! sends VALUE to CHANNEL, waiting up to TIMEOUT seconds (a non-negative REAL number; or indefinitely if unspecified or NIL). If the value can be sent before the time out, T (indicating success) is returned. Otherwise, NIL (indicating failure) is returned.

3.3. PRI-ALT and FAIR-ALT: Alternation Among Several Operations

Syntax. 

(PRI-ALT operation-clause* [otherwise-clause])
(FAIR-ALT operation-clause* [otherwise-clause])
=> (For either macro: the result of the final evaluated form,
    or no values if no clause was executed.)

operation-clause ::= (operation form*)
otherwise-clause ::= ({otherwise | (otherwise [:timeout timeout])} form*)
operation        ::= (? channel [lambda-list [condition]]) ; receive
                   | (! channel value [condition])         ; send

Performs one of the given channel operations, choosing one from the set of operations that first becomes available, then evaluates each of the forms associated with the selected operation. If no operation can immediately be made, waits until an operation is available (optionally up to a given timeout).

When there are multiple operations that can be immediately carried-out, PRI-ALT selects the first one listed, whereas FAIR-ALT chooses one at random.

channel

Evaluated to produce a CHANNEL to send to or receive from. The channel forms associated with operations that do not pass the condition are not evaluated.

lambda-list

Either a symbol naming a variable to be bound to the value received from the channel, or a destructuring lambda list[1] naming a set of variables to be bound to the destructured value received from the channel. The bindings are visible to the associated forms. If the value cannot be destructured according to the lambda list, an error is signalled. Note that multiple receive clauses for the same channel with different destructuring lambda-lists cannot be used for pattern matching.

value

An expression whose primary value is used as the message to send to the channel. All value expressions are evaluated before selecting an operation, except for those associated with operations that do not pass the condition.

condition

Evaluated to produce a generalized boolean indicating whether the associated operation-clause should receive further consideration. When condition is not given or its resulting value is true, the associated operation is kept for consideration. When the resulting value is false, the operation is removed from consideration (as if its associated channel never becomes ready for sending/receiving).

form

Evaluated in sequence when the associated clause is executed. The values of the evaluation of the last form of the effective clause become the result of the macro.

timeout

Evaluated to produce the duration, as a non-negative REAL number of seconds, to wait for an effective operation to become available before resorting to the otherwise-clause. The result may also be NIL to specify no time out. When an otherwise-clause exists, the default time out is 0, meaning that if none of the channels in the operation-clauses are immediately available, the otherwise-clause forms are executed immediately. When there is no otherwise-clause, the default time out is NIL.

It is useful to specify a timeout expression that conditionally evaluates to NIL, in order to disable the time out and inhibit the execution of the otherwise-clause (provided that there are channel operations to wait for that haven't been excluded by false conditions).

If there are no effective operations (because all the conditions evaluated to false, or because no operations were specified), then the otherwise-clause (if any) is executed immediately (even if the specified time out is NIL).

Stylistically and for future compatibility, avoid side-effects in channel, value, condition, and timeout expressions.

3.4. Dynamic Alternation

It is possible to dynamically construct a set of operations to alternate upon.

The general procedure is to instantiate OPERATION for each kind of operation you wish to perform. For sending operations, you will need to give the value to send with :VALUE. Pass the OPERATION instances, as a list, to OPERATION-ALTERNATE. OPERATION-ALTERNATE will either immediately execute one of the OPERATION instances, or block until another thread executes an operation which allows one of the given OPERATION instances to execute. The selected OPERATION instance, after having been executed, is returned. If the selected operation was a receive operation, the value received will be available with the VALUE accessor.

Please see the documentation in core.lisp for OPERATION-ALTERNATE and the OPERATION class.

4. Design Decisions

Calispel is derived from Roger Peppe's csp.tgz, which in-turn was derived from channel.c of Plan 9 libthread. The core design of Calispel was inherited from these libraries.

This section mostly documents some of the shortcomings of Calispel. Some of those shortcomings are internal warts, and others are features not included in the library.

4.1. Omitted Features

4.1.1. Thread Management Utilities

A lot of Lisp libraries for CSP- and occam-style channels, and message passing in general, seem to prescribe their own way of creating and managing threads. Nevermind that if they're using threads portably, they're probably using Bordeaux Threads.

That doesn't make sense to me. Calispel leaves threading utilities and abstractions to their own libraries. Calispel doesn't prescribe the one right way to manage threads (so long as the threads behave in the way described in the Bordeaux Threads specification), just as Bordeaux Threads and Eager Future don't prescribe how to communicate between threads (although Eager Future does make it darn easy to communicate the one-time result of a thread back to its parent). Calispel and these libraries are complementary.

4.1.2. Pattern Matching

It would be convenient for pattern matching to be supported in PRI-ALT and FAIR-ALT. Pattern matching would try matching an object received on a channel against several patterns in turn, executing the clause forms with a matching pattern, with variables of the pattern bound for the execution. It would look a little something like this:

(calispel:pri-alt
  ((? chan (:foo))
   #| got a message: (:FOO) |#)
  ((? chan (:bar baz quux))
   #| got the above message; BAZ and QUUX are bound variables |#))

Calispel does not support pattern matching:

  • Pattern-matching with run-time clause grouping by channel is expensive: ~25% slower without stupid tricks and ~15% slower with.

  • PM can be made faster by using syntax to group together clauses on the same channel, enabling static macroexpansion-time optimization. But doing so requires non-trivial changes to the syntax of the ALT macros. e.g.:

    (calispel:pri-alt
      (? chan
       ((:foo)
        ...)
       ((:bar baz quux)
        ...))
      (! other-chan value)
      ...)
  • PM a contentious issue. As of 2009-10-25, I count seven PM libraries: arnesi, cl-match, cl-unification, fare-matcher, mcpat, toadstool, and pcond. By implementing PM, I have to choose one of these, and if I do, I'll drive away people that disagree with the style of that particular PMer. And even then, I find them all ugly: do any of them support destructuring lambda-list syntax, the destructuring language built-into Common Lisp (but sadly lacking a reasonable interface)?

  • Of the libraries that provide standalone (not embedded in source code) documentation, they are either too confusing (cl-unification) or have non-trivial licensing (cl-match: LLGPL). Good documentation is a critical goal of the Calispel project, and easy-to-understand licensing matters a lot, too.

Like all things: I'm seeking your input! If pattern matching matters to you, please drop me a line. I'm interested in which pattern matchers you prefer, why you prefer them, and whether there are certain approaches to pattern matching you really don't like. Changing the Calispel syntax (for the next major release) and the lack of documentation on the part of PM libraries can both be overcome, but I don't want potential users become disillusioned when they look at Calispel and see that it is tied to a pattern matching library that they really don't like.

4.2. Internal Warts

4.2.1. The Global Lock

Calispel has a global lock which is used to serialize all operations on channels. Specifically, the union of these activities is prevented from occurring in parallel:

  • Calls to ? and !.
  • The beginning and end of the execution of PRI-ALT and FAIR-ALT. (Note that the lock is not held while evaluating anything given by the client (e.g. a value to send), or for the act of waiting itself.)

A consequence of this is that, on a (say) 4-core machine with threads A, B, C, and D each individually getting a core to themselves, with A and B communicating on one channel and C and D communicating on a different channel, the communication between A and B will slow down communication between C and D, and vice-versa.

It sounds like a good idea to do away with it, enabling better parallelism. But this is hard.

It's very simple to put locks on each channel individually as long as there aren't atomic operations on sets of channels. But the ALT macros do require such atomic operations: given a set of operations, perform exactly one of them—the first one that becomes available.

A comment in channel.c of Plan 9 libthread has this to say about the lock (edited for Calispel terminology):

One can go through a lot of effort to avoid this global lock. You have to put locks in all the CHANNELs and all the OPERATIONs. At the beginning of OPERATION-ALTERNATE you have to lock all the CHANNELs, but then to try to actually execute an OPERATION you have to lock the other guy's OPERATIONs, so that other people aren't trying to use him in some other alternation at the same time.

It's just not worth the extra effort.

At least one special case has to be handled (as alluded to above):

  • Thread 1 is blocking on an ALTERNATION with two OPERATIONs. Operation 1 is to receive from Channel 1. Operation 2 is to receive from Channel 2.
  • Thread 2 is in OPERATION-ALTERNATE with one operation (Operation 3): send to Channel 1.
  • Thread 3 is in OPERATION-ALTERNATE with one operation (Operation 4): send to Channel 2.
  • Thread 2 and Thread 3 are running simultaneously.

It must be guaranteed that only either O1 or O2 is selected for the alternation in T1. What's going to happen if we're not careful?

  • At the same time, T2 dequeues O1 from C1's receiving operation queue, and T3 dequeues O2 from C2's receiving operation queue.

So, we need some magical (and probably quite complicated) way of locking just enough of the data structures to prevent the conflict in the last bullet point, or we need a way for one thread to see there's a conflict and "retry" back to OPERATION-ALTERNATE, looking for another ready operation (also probably quite complicated).

No matter how you slice it, the solution will be hard, and proving its correctness even harder. What performance gain (if any, with all that fine-grained locking) do you get? Is it worth it?

And finally, you'd also have to give consideration to *RANDOM-STATE*, to ensure the current RANDOM-STATE (shared among threads) won't be accessed and updated by RANDOM concurrently. RANDOM is used directly in this library, and in jpl-queues.

4.2.2. Outsmarting the Garbage Collector

For any call to ? or !, Calispel allocates (conses) an OPERATION instance (among other things). For each enabled send/receive operation, PRI-ALT and FAIR-ALT will also cons an OPERATION (among other things).

There are a couple of apparent opportunities to avoid consing.

  • Could we recycle OPERATION instances? We could grab one from a per-thread pool when needed and return it once we're done with it.

    This is a little bit like writing our own memory allocator and performing manual memory management, but it's tied specifically to this use case.

    I tried it. It worked, but the speedup couldn't be measured.

    Consing in SBCL went down by 45%. Time spend in GC was reduced by 24%. But then I noticed that, for a 20-second test, only 0.244 seconds were spent in GC.

    The only opportunities to save time with this optimization could have been time spent in GC or time spent initializing instances. In any case, this proved to be a premature optimization with significantly increased code complexity for no measurable gain, so it was aborted.

  • When an ALT macro is used by only a single thread, couldn't the macro preallocate the required OPERATIONs? Wouldn't that especially help when that ALT macro form is in a (or is the entirety of a) main loop?

    I don't remember just how bad the result was, but I think it ran a little more slowly the the original code.

Lesson learned: don't try to beat the garbage collector at its own game (unless you really do have a good reason to).



[1] See: Common Lisp HyperSpec, sec. 3.4.5 Destructuring Lambda Lists