ToC

DocOverview

CGDoc

RelNotes

FAQ

Index

PermutedIndex

Allegro CL version 11.0

Symmetric Multiprocessing in Allegro CL

1.0 Symmetric Multiprocessing introduction

This document describes the Symmetric Multiprocessing (SMP) implementation in Allegro CL on certain platforms. See also the multiprocessing.html document, which describes the older Lisp-based uniprocessor implementation but most of the functionality applies to SMP as well.

Starting in Allegro CL 9.0, Allegro CL on certain platforms will have images that will allow using multiple independent processors. This may result in dramatic speedups in applications as the additional processor resources are used. There is also a separate non-SMP distribution which, like earlier versions of Allegro CL, runs on one hardware processor only.

Note that the SMP version works on a machine that has only one processor and using the SMP version may be more efficient on such machines even though (obviously) only one process can run at a time.

But there is a downside. Using multiple processors simultaneously can make programs that run correctly on a single processor fail, either by producing incorrect results or by signaling unexpected errors or even by failing altogether. Here are three areas where such failures can occur:

There is no automatic way to make code being executed in one process run to completion while all other processes do nothing. In a Lisp using one processor only to execute Lisp code, wrapping a section of code with without-interrupts meant that when that code was run, it would run to completion while all other processes waited. Using multiple processors means that other processes run at the same time. (The code will still not process interrupts, but that no longer means other Lisp code is not running.) See An example of the difference between SMP Lisp and non-SMP Lisp for more information on this point.
When an object is modified by one process and simultaneously accessed by another, the accessing process can get an incorrect result or can even fail, perhaps causing Lisp itself to fail. By 'incorrect result' we do not just mean an out of date result; we mean there is a possibility of getting a result that was never correct. See Thread safety for more information.
Certain operators that are supposed to guarantee certain results can no longer be guaranteed to work as advertised. pushnew, for example, if called without some process-locking protection in two separate processes, can push the same value onto a list from each process resulting in a duplicate in the list. (Each process completes the test for whether the object is new before the other has added the item to the list, and so each then adds the item.)

Some of these problems (but not number 1 so long as the Lisp heap was not modified by foreign code) actually already exist in a non-SMP multiprocessing Lisp, as Allegro CL has been for many years. But because the system-initiated process-switch latency was so slow (on the order of a second) compared to the execution speed, the errors either never occurred or were so rare that the problem was never properly identified or even noticed. But in an SMP Lisp, where processes are just running simultaneously and there is no system process switch latency, the rare or never happening events are suddenly much, much more likely and subtle coding errors which could be ignored before become unignorable.

Please realize first that essentially all problems arise from object modification, although the failure can come from the process reading a value (because the modification process can result in inconsistencies); and second that tools such as locks (see Process locks in multiprocessing.html) and gates (see Gates in multiprocessing.html) are available to ensure that processes do not interfere with each other. If you use SMP, it is your programs responsibility to ensure that modifications are consistent and do not cause problems.

The important point is that programs which ran without error (or had such a small chance of error that they for all practical purposes ran without error) may fail in an SMP environment, either because of mysterious and unexpected errors or (and this is likely worse) by not signaling an error and instead producing very incorrect results.

This document describes the problems and also the new tools which will assist in making your program SMP-safe. It discusses the general problem and the macros and data elements introduced in Allegro CL 9.0 to control concurrent access. Many of the macros have keyword arguments that can be used to control how the macro expands in a non-SMP Lisp. This makes it easier to write source code that works efficiently whether it is compiled for running in an SMP Lisp or a non-SMP Lisp.

In a non-SMP Lisp only one thread of control at a time is ever running Lisp code. Lisp execution is gated by a mutex and low-level runtime code selects one lisp process to execute at a time. A Lisp process can voluntarily yield control, either explicitly or by waiting on some event. A running Lisp process can also be preempted, temporarily ceding control to another process in response to some event.

In a non-SMP Lisp interleaved execution of multiple processes is coarse-grained. A process can not lose control at arbitrary points in its execution. Running Lisp code checks periodically, at what might be called safe points, to see if an interruption in the normal flow of sequence has been requested. If so, the state of the running process is saved, the state of the preempting process restored, and that second process allowed to execute.

Lisp-level handling of asynchronous signals is restricted to happen at these same safe points. This is true in SMP-enabled Lisps as well as in non-SMP Lisps. Low-level, non-Lisp support code keeps a record of signals seen but not yet processed. This code responds to asynchronous signals like SIGALRM by recording the event in an unprocessed-signals buffer and setting a global flag to show that a signal has been seen, and then returning to the lisp processing that was happening when the signal was recognized. At the next safe point, the state of the global flag will be seen and Lisp-level processing of the signal will occur. In a non-SMP Lisp this could result in a process switch.

In a non-SMP Lisp, controlling concurrent access to shared objects often reduces to controlling when these process-switch and signal-handling events can occur. By ensuring that a block of Lisp code will execute without interruption, an application running in a non-SMP Lisp can provide arbitrarily complex operations that will look atomic to multiple processes. This is obviously inadequate in an SMP Lisp, where multiple threads of control are running Lisp code simultaneously.

Most symbols naming SMP functionality are in the multiprocessing (nickname mp) package or in the excl package. Some functionality is loaded into all images and other is in the :smputil module. You should load this module with a form like

(eval-when (compile load eval) (require :smputil))

(Just (require :smputil) is sufficient when typed to the top-level.)

1.2 An example of the difference between SMP Lisp and non-SMP Lisp

Consider the following example. We have a variable VAR whose initial value is the fixnum 8. Various processes are running in Lisp and independently incrementing or decrementing the value of VAR with calls to incf and decf. Our first attempt is:

;; In process-1:
(incf var)

;; In process-2:
(incf var)

;; In process-3:
(decf var)

But this might cause us trouble, because (incf var) macroexpands to (setq var (+ var 1)). It is possible that (+ 1 var) is evaluated and then there is a process switch, before that value is stored as the value of VAR, so when the process gets control again, it might store what has become the wrong value.

But in a non-SMP Lisp, we can prevent a process switch with without-interrupts, so we recode as follows:

;; In process-1:
(without-interrupts (incf var))

;; In process-2:
(without-interrupts (incf var))

;; In process-3:
(without-interrupts (decf var))

In non-SMP Lisp, once process-1, process-2, and process-3 have completed (or at least executed the forms shown), the value of VAR will be 9 (8 + 1 + 1 - 1). We do not actually know in what order the additions and subtraction will occur, but the without-interrupts guarantees that each will complete once it has started without any process switch and so we can be confident of the final result.

In an SMP Lisp, the without-interrupts macro does not have the effect of preventing a process switch because all processes can be running simultaneously on different processors, and so the whole notion of 'process switch' is problematic. So the following can happen:

1. process-1 reads the value 8 from VAR and places that value in one
of its registers.

2. Before process-1 does its calculation and stores a new value in
VAR, process-2 reads the value (still 8) from VAR and places it in one
of its registers.

3. process-1 adds 1 to 8 and stores the result (9) in VAR.

4. After process-1 stores its new value but before process-2 stores a
new value, process-3 reads the value (9) from VAR and places it in one
of its registers.

5. process-2 adds 1 to 8 and stores the result (9) as the value of
VAR.

6. process-3 subtracts 1 from 9 and stores the result (8) as the value
of VAR.

When all this completes, the value of VAR is 8, not 9. The value could indeed end up as 7 (process-3 reads first and stores last), 8 (as shown), 9 or 10 (left as an exercise to the reader).

So code which depended on without-interrupts to ensure that certain operations will not be interfered with and that therefore certain final results could be depended upon will no longer work as expected.

The fundamental issue is that on a non-SMP Lisp, process locking (ensuring that code in a single Lisp process is guaranteed to execute to completion without interruption) gave you object locking (ensuring that only one process could read or set a value) as a side effect. In an SMP Lisp, that is no longer true.

New macros (see New macros and related functionality) provide object locking to allow the example here and related examples to run as desired. The macros incf-atomic and decf-atomic are specific to this example and others to related examples. Of course, programmers can also use more complex mechanisms to lock objects and processes and avoid undesired concurrency, although these techniques often have a significant overhead.

Here is the code that guarantees that all the incf's and decf's will run as desired (but in an undefined order). This works in SMP and non-SMP Lisps.

;; In process-1:
(incf-atomic var)

;; In process-2:
(incf-atomic var)

;; In process-3:
(decf-atomic var)

1.3 Non-SMP images on platforms that support SMP

All platforms that will support SMP Lisp will also have non-SMP images available. Current programs that depend on the fact that Lisp code is run on a single processors will thus continue to work using the non-SMP images without any changes relating to SMP (beyond trivial suppression of compiler warnings, done by evaluating (setq excl::*warn-smp-usage* nil)).

Therefore there is no need for users who do not wish to use SMP to make any changes to their programs. The rest of this document is directed at users who do wish to use SMP. They will have to determine whether their code must be modified (usually, some changes must be made) using the information in this document.

2.0 Deprecated macros

Three macros are used in non-SMP Lisps to provide concurrency control: excl::atomically, without-interrupts, and without-scheduling. (The symbol excl::atomically was not exported and the associated macro never documented, but some users were aware of it; if you are unfamiliar with excl::atomically, you need not worry about it as its use is now deprecated. You just have less code to modify).

These macros are all very light-weight in processing overhead and are much cheaper than using a process-lock. They have the additional advantage that they do not require the multiprocessing package be loaded. This allows a programmer to write code that runs correctly in a multi-process application, but still runs efficiently in a single-process application. All three rely on the coarse-grained scheduling of non-SMP multiprocessing. All three represent problem situations for multiprocessing under SMP. They are all deprecated in 8.2 and later Lisps, although they are still available and still perform the same functions in a non-SMP lisp as they did before 8.2.

New macros provide the same functions as the deprecated macros, in ways that are compatible with an smp environment. The new macros are described in New macros and related functionality. The deprecated macros generate compile-time warnings (which can be globally muffled by evaluating (setq excl::*warn-smp-usage* nil)).

2.1 Deprecated macro: excl::atomically

The excl::atomically macro was never documented and the symbol naming it was never exported. However, some users made use of it. Users who never used atomically can skip this section.

excl::atomically is wrapped around a sequence of forms:

(excl::atomically . BODY)

This acted exactly like

(progn . BODY)

as far as code generation was concerned, but the compiler would produce a diagnostic if BODY involved any operations that might result in a process switch or a garbage collection. Such opportunities are inherent in almost any out-of-line call, in object allocation, in anything that might signal an error, and in several other low-level actions. If BODY contained none of these, then the compiler would accept the form and the programmer could be confident that BODY would execute atomically as far as Lisp processes were concerned.

A major use of excl::atomically was to wrap a section of code in which it was required that garbage collection not relocate some object or objects. This requirement was independent of multiprocessing concerns, and happened most often in low level code that was trying to process an array's elements very efficiently.

In some cases, the excl::atomically form was used not to assure atomicity, or to guarantee gc-free running, but to wrap a section of code that needed to be as fast as possible. If some infelicitous change to the compiler caused it to start generating out-of-line calls where in-line code was expected, the presence of the excl::atomically wrapper made sure there would be a compile-time warning.

The new macro fast-and-clean replaces excl::atomically. In an SMP Lisp, (fast-and-clean ...), which is the functional equivalent of (excl::atomically ...), cannot be guaranteed to be atomic because of the nature of SMP and atomicity. It will, however, prevent a gc from happening while the form is executing, because gc's can only happen when a process allocates something or at one of the same "safe-point"s that allow interrupts (described below in Deprecated macro: without-interrupts. The fast-and-clean form assures us we have neither of these. Even so, if control of garbage collection is the issue, using the with-pinned-objects macro is preferred.

2.2 Deprecated macro: without-interrupts

The without-interrupts macro works as follows:

(without-interrupts . BODY)

acts almost exactly like

(let ((excl::*without-interrupts* t)) BODY)

excl::*without-interrupts* is a special variable (named by an unexported symbol and so not documented) which gates the processing of asynchronous signals. When an asynchronous signal triggers Lisp's low-level signal handler, the signal is queued and a flag is set to indicate the situation. This happens without any Lisp code executing, and does not interrupt the running Lisp code. While executing, Lisp code polls the flag periodically, checking for the signal-has-happened situation.

These checks are made at safe-points in the code, where the Lisp system is ready to allow execution of arbitrary signal-handling code in an interrupt-the-application mode; handling the signals at the Lisp level could result in a process-interrupt, or cause a process switch in a non-SMP Lisp. If excl::*without-interrupts* is nil, then the safe-point check happens normally, and any queued signals get handled. If excl::*without-interrupts* is non-nil, then queued signals are ignored. No process-interrupt will be handled, and in a non-SMP Lisp, no automatic process switch can happen when excl::*without-interrupts* is non-nil, although an explicit process-wait or other scheduling operation would break through this barrier.

Ignored signals are not discarded, and signals caught while excl::*without-interrupts* is non-nil will still be added to the queue of pending signals, to be processed at some safe-point after leaving the without-interrupts regime. The way in which without-interrupts differs from the simple let form shown above has to do with details of safe and efficient processing, ensuring that we pay as little as possible in computation overhead to ignore the pending signals and that when excl::*without-interrupts* returns to a nil value, we handle any pending signals at the next safe-point.

However, even though interrupts are delayed and process switches from the code running within a without-interrupts are also inhibited, object locking, a side effect in a non-SMP Lisp is no longer guaranteed and gc's may occur even if the code within the without-interrupts form does no consing and so does not itself trigger a gc.

The new macro with-delayed-interrupts replaces without-interrupts. In a non-SMP Lisp, with-delayed-interrupts expands to the same code as without-interrupts (but does not produce a compiler warning). In an SMP Lisp, interrupts are delayed, so the code within the macro runs to completion once it is started, but objects are not locked as other processes can run.

2.3 Deprecated macro: sys:without-scheduling

without-scheduling worked as follows:

(without-scheduling . BODY)

worked like

(let ((*disallow-scheduling* t))
  (progn . BODY))

*disallow-scheduling* is a special variable that gates automatic process switches in non-SMP Lisps. It is checked when other factors indicate it is time to stop the currently running process and let another one have control for a while. If it is nil, then the switch will happen; if non-nil, the switch will be prohibited and the current process will continue running. *disallow-scheduling* finds some specialized uses, as when it is important to honor with-timeout requests (which require handling timer signals, which would be ignored by without-interrupts) without allowing a process switch to occur.

There is no exact replacement for sys:without-scheduling since the model for processes is changed and the concept of process switching is nearly without meaning. If you use sys:without-scheduling, the question is why? If the purpose was object locking (ensuring that values are not read or written by other processes while code runs), you have to use the new object locking routines. If the purpose was to ensure that the process ran to completion but also allowed for signal processing (for sys:with-timeout, for example), that is no longer supported.

We provide new macros for efficient object-level synchronization. Some of these involve locking objects, and others are atomic read-modify-write primitives. We also provide a set of macros to perform those functions of the deprecated macros that did not serve to synchronize access to specific objects.

A Lisp that actually supports SMP has :smp on its *features* list. A Lisp that supports these new macros has :smp-macros on its *features* list. Thus a 9.0 smp-enabled Lisp has both :smp and :smp-macros on its *features* list, while an 8.2 Lisp, a patched 8.1 Lisp, or a non-SMP 9.0 Lisp have the feature :smp-macros but not :smp. See Reader macros and cl:*features* in implementation.html for more information on features.

All the symbols naming the functions, macros, classes and structure types added as part of SMP are exported from the excl package.

3.1 Non-synchronizing usages

The first three new macros provide the non-concurrent-access-related uses of the old excl::atomically and without-interrupts macros.

fast-and-clean: replaces excl::atomically. It provides compile-time information about the code it wraps.
with-pinned-objects: this replaces another use of excl::atomically, which was to ensure that objects did not move because of a gc. excl::atomically could ensure a gc did not happen. Gc's only happen when all processes permit it, and the process running with-pinned-objects will not permit a gc and will not fail because it does not allocate.
with-delayed-interrupts: this replaces without-interrupts. It evaluates and compiles exactly as without-interrupts did. In an SMP Lisp it provides no cross-process protection of shared objects. Its entire purpose in that environment is to protect blocks of code that need to run to completion even in the face of timeouts and possible process-interrupt calls from other processes.

3.2 Atomic read-modify-write primitives

When an atomic operation fits the pattern

(setf [place] (foo [place]))

and [place] is one of several inline-accessible 'slots' like (car x), the compiler can generate code to ensure that the value in [place] doesn't change between the initial read and the subsequent write. There are several macros that provide special cases of this operation, and a general conditional-update form on which all of them are based. [place] can be any of the following setf-places:

(global-symbol-value symbol-form)
symbol
(car cons-form)
(cdr cons-form)
(memref object-form offset-form pos-form size-form)
(memref-int address-form offset-form pos-form size-form)
(svref array-form index-form)
(structure-slot-ref structure-form)
(slot-value x y) (As of 11.0, if there are any added slot-value methods the atomic expansion for this form's inverse now calls slot-atomic-update-using-class, which can be extended by the subclass)
(symbol-plist x)
Local or special bound variables. (Note that atomic setting of these is unusual because local places are not shared between multiple processes.)

These places are a subset of places legal for setf.

The relevant macros are:

incf-atomic and decf-atomic: these expand into code that acts like incf or decf with the same arguments except that the update is atomic with respect to any other processes looking at the same [place]. That is, if several processes simultaneously try to incf-atomic and/or decf-atomic the same location, all operations occur sequentially in some unspecified order.
push-atomic and pop-atomic: these expand into code that acts like push or pop with the same arguments except that the update to [place] is atomic with respect to all other processes looking at the same [place]. That is, any number of simultaneous push-atomic and pop-atomic operations on the same location will act as if they occurred sequentially in some unspecified order.
atomic-conditional-setf: this is the primitive form on which the previous atomic operations are based. It acts very much like (when (eq [place] oldval-form) (setf [place] newval-form) t) except that the subforms of [place] are evaluated just once, the comparison and store, if there is one, are an atomic sequence.
atomic-conditional-setq: this special operator returns a boolean of success or failure to store the value into the symbol's value location. It is intended to be a synchronizing mechanism for symbols which represent variables shared by multiple threads.
update-atomic: this macro expands into code that does a read-modify-write operation that is atomic with respect to the value in a place.
globalq: (globalq x) is a macro shorthand for (system:global-symbol-value 'x).
defvar-nonbindable: this acts like a defvar, with the additional function of flagging symbol so that it cannot be lambda-bound.
get-atomic-modify-expansion: this function is called by the atomic-modify macro expanders to find out how to treat place-form. It returns six values.

3.3 Concurrency control for shared objects

None of the deprecated macros provide effective concurrency control in an smp lisp. New macros are provided to give the programmer tools for controlling concurrent access. There is no way to make the change to smp automatic and painless. The crux of the problem is that in the absence of smp, one-sided concurrency control works. That is, a process modifying a shared object could wrap a series of modifications with pre-8.2 forms like those using without-interrupts, and be assured that all processes would see those changes as an atomic transaction. The forms themselves assured that no other process would see the object until all the changes were in place.

A confounding factor is that pre-8.2 concurrency was very coarse-grained. Processes were preempted at a very low frequency, typically once in 2 seconds, unless they explicitly waited on some condition. Even with no attempt at concurrency control, it was rare to see an inconveniently-timed process switch break a transaction that should have been atomic.

In an SMP Lisp, concurrency is as fine-grained as it could possibly be. Writers and readers must all cooperate to ensure that access to shared structures is legitimate.

Allegro CL has a set of object-locking macros. These are lighter weight and less flexible than the mp:process-lock features, and do not rely on the mp package. They must be used with care to avoid deadlocks.

The macros and related functionality are:

with-locked-structure: expands into code that waits until the executing process acquires control of its argument structure object, then evaluates the forms in the body as an implicit progn. It then releases control of the object and returns whatever values the last form in the body returned.
synchronizing-structure: this structure class is used as a base structure when instances of a structure class are to be locked with the with-locked-structure macro.
basic-lock: this structure class is a simple extension of synchronizing-structure, adding a name slot.
with-locked-object: this macro expands into code that waits until the executing process acquires control of the object, then evaluates the forms in the body as an implicit progn. It then releases control of the object and returns whatever values the last form in the body returned.
with-object-lock-released: this macro can only appear lexically inside the body of a with-locked-object macro. The lock obtained in the immediately enclosing with-locked-object macro is released, body is executed, and then the lock is re-obtained.
lockable-object: this is a mixin class that allows subclasses to be used in the with-locked-object macro.
with-locked-stream: like with-locked-object, but applies to objects that are streams.
critical-section: this macro expands into code that uses a unique basic-lock associated with the expansion to control execution of the body code. Only one process at a time can acquire ownership of the lock and execute the forms in body. Ownership of the lock is relinquished when control leaves body by falling through or by non-local exit. The result of evaluating the critical-section code is the result of the last form in body.

3.4 Sharable locks

A sharable lock allows multiple user threads to access a resource simultaneously when it is safe and appropriate to share it. When exclusive use of the resource is required, a thread can ask for exclusive access. During exclusive access, all shared access is blocked. A typical use case is a data structure that can be safely accessed in read-only mode by many threads but requires exclusive access while the data structure is modified and may be temporarily in an inconsistent state.

A sharable-lock instance is created with make-sharable-lock. Locking contexts are provided with the macros with-sharable-lock. An application can manage locking explicitly with sharable-lock-lock and sharable-lock-unlock.

An important parameter in the definition of Allegro CL sharable locks is the max-shared count that specifies the maximum number of simultaneous shared users of a lock. When the maximum number of simultaneous shared users is reached additional attempts to share the lock must wait until some shared threads release control.

Once a thread requests exclusive access, subsequent request for shared access must wait until the exclusive request is honored and completed. Because there is a limit on the number of shared users, there is an upper bound on the delay that the exclusive request can endure. While an exclusive locker is running, shared requests are accepted until the max-shared limit is reached. These shared requests will be honored as soon as the exclusive user releases the lock. If another exclusive request arrives before the first is done, it will wait until the first exclusive request and up to max-shared shared requests are honored.

If several threads compete for exclusive access, it is possible to create a situation where an exclusive request is delayed indefinitely. The only solution in this case is to reduce the frequency of exclusive requests or to speed up their handling. It is also possible to delay shared requests indefinitely if too many threads compete for shared access. Such a situation can be alleviated by increasing the max-shared value. The size of a sharable lock is proportional to the max-shared value, but values in the thousands are still reasonable.

In the current implementation of sharable-lock, the fairness policy is strong writer preference. When an exclusive lock is requested, only the current shared users are allowed to continue. New requests for a shared (reader) lock must wait until the exclusive (writer) request is honored and completed. While an exclusive lock is in effect, shared requests up to the max-shared limit are allowed but remain pending until the exclusive lock is released.

Sharable-lock functionality

sharable-lock: the class of a sharable-lock object.
exclusive-locker-count, Generic Function
make-sharable-lock, Function
sharable-lock-idle-p, Generic Function
sharable-lock-lock, Generic Function
sharable-lock-unlock, Generic Function
shared-locker-count, Generic Function
with-sharable-lock, Macro
with-exclusive-lock, Macro (Use of this macro is deprecated)
with-shared-lock, Macro (Use of this macro is deprecated)

Errors signaled from sharable-lock operators:

3.5 The barrier API

A barrier instance allows several threads to share progress information or to synchronize their behavior. A newly created barrier is initialized in the enabled state with an expected count and an arriver count of zero. As each thread arrives at or passes through the barrier, the arriver count is incremented. All barrier-wait calls will block until the arriver count reaches the specified expcted count.

The state of a barrier can set to disabled. A disabled barrier is simply ignored -- passing through a disabled barrier has no effect, and call to barrier-wait returns immediately.

A typical application of barriers is to synchronize several threads to start some computation simultaneously. The program creates a barrier with a count of N; each of the N threads calls barrier-wait and blocks. When the slowest thread reaches the barrier, all the threads are released.

Another application of barriers is to allow one thread to detect when several threads have completed some task. The control thread creates a barrier with a count of (N+1) and calls barrier-wait; each of the N worker threads calls barrier-pass-through when it completes the task. When the Nth thread calls barrier-pass-through, the barrier-wait in the control thread returns.

Barrier API functionality

barrier, class
barrier-name, generic-function
barrier-arriver-count, generic-function
barrier-change-count, generic-function
barrier-enable, generic-function
barrier-disable, generic-function
barrier-pass-through, generic-function
barrier-unblock, generic function
barrier-wait, generic-function
make-barrier, function

3.6 Queues

The queue class and the functions for using it are thread-safe, but relatively heavy-weight. Because they use process-lock to control access to the queue, they use more resources than something based on with-locked-object.

A reasonable development approach would be to use an instance of mp:queue for transmitting information from producers to consumers, with the option of later coding a lighter-weight application-specific mechanism if performance tests showed the queue to be a significant cost factor.

3.7 Condition Variables

Condition variables are another way to synchronize the behavior of multiple threads. They are normally used in conjunction with a lock that controls the modification of some data structure.

The Allegro CL implementation of condition-variable is based on the definition of POSIX condition variables. The relevant functionality is implemented with the following class and functions.

condition-variable: class.
make-condition-variable: function.
condition-variable-wait: function.
condition-variable-wait-count: function.
condition-variable-signal: function.
condition-variable-broadcast: function.

4.0 Thread-safety

An operator which accesses or sets the value stored in an object is thread-safe when it can be called in a number of processes, otherwise unsynchronized by user code, and the result is the same as if the operations were done sequentially in a single process though in an unspecified order. Therefore, thread-safe operations will not error and will not produce bogus results due to a race condition (results that could never have occurred no matter in what order the operations were done). But thread-safe operations may produce data races, i.e. they will not necessarily produce the same results in repeated runs. See Race conditions vs data races in implementation.html for more information about this distinction.

Thread-unsafe operators are those which invite race-conditions; they could unexpectedly error or could produce results which are bogus according to the definition above.

For operators that do not make any changes (i.e. do not set values), thread-safety means that the result will be meaningful and correct for some instantaneous state of the data structure. Each thread will see a data structure in a consistent state. Operators that modify data are thread-safe if they see data structures in some instantaneous consistent state, and leave the data structure in a consistent state.

It is important to emphasize, so we will say it again, that thread-safe operators are not guaranteed to produce the same result in repeated runs, just a result that is consistent with some possible sequence of evaluation, produced by some interleaving of the operations from the multiple processes.

Thread-safe operators and operations

gethash, (setf gethash), and remhash: these are thread-safe operations. You will always get a result that could have arisen from the operations done in some possible sequence in a single process. (A possible sequence is one that interleaves the sequences from each individual process, preserving the order of calls in each specific process.) The result from the combination of sequences constitutes a Data Race, but not a Race Condition.
Property list functions get, (setf get), getf, remf, remprop: these are thread-safe so long as all processes modifying the property list use these operators. If a process modifies the property list using some other operator (such a (setf car) or (setf nth)), then the listed functions are not thread-safe.

(setf getf) is thread-safe only for places that are thread-safe for incf-atomic. When the otherwise undocumented excl::*warn-smp-propertylist-limitation* is true, the lisp will warn about thread-unsafe calls when the code is macroexpanded.

Thread-unsafe operators

Basically any operator that modifies a value is thread-unsafe except those listed above and the atomic operators described elsewhere in this document. In particular sequence functions and their setf's are thread-unsafe. Operators like pushnew, which is supposed to guarantee that duplicates are not added is thread-unsafe and may produce results that have unexpected duplicates.

Thread-safe modules

The jLinker (see jlinker.html) and the dbi (or aodbc) modules (see aodbc.html) are thread-safe in SMP or non-SMP Lisps.

4.1 Thread-unsafe standard Lisp operations: features, modules, require/provide, external-format loading, etc.

Certain standard Common Lisp operations and some standard Allegro CL operations cannot conveniently be made thread-safe. Here we summarize many of these issues.

Modification of global variables. This must be handled by each application in its own way. There cannot be an automatic built-in procedure. For example, pushnew to *features* or *modules* is not thread-safe.
defvar is not atomic and therefore not thread-safe. Since defvar occurs mostly in files, thread-safety is managed with the same strategy as require/provide below.
sys:*all-processes*: the variable sys:*all-processes* must be treated with the caution described on the documentation page.
The activities of require/provide/load cannot be made thread-safe. Applications must use their own protocols or locks to make sure files are loaded sequentially. Autoloading is safe because in Allegro CL there are internal locks and constraints on the content of autoloaded files. Unfortunately, the locks and constraints cannot be safely exported or generalized.
External format loading is not thread-safe. Applications must arrange for the sequential loading of external formats.
All definition operators (defun, defmacro, etc.) are non-atomic and unsafe.
ALL MOP operations that modify class structure or method dispatch are non-atomic and unsafe.
Some make-instance calls are unsafe. See the note just below.

Note on thread-safety of make-instance

Consider a compiled call to make-instance where the class and all the keywords are constants. If such a call is compiled or loaded into a Lisp image where the class is not yet defined, then the first execution of this call is not thread-safe. The workaround is to avoid such calls. The following fwrapper can be used to warn if such code is loaded (see fwrappers-and-advice.html for information on the fwrap facility):

   (def-fwrapper warn-cons (class-or-name inits)
     (when (and (symbolp class-or-name)
                (null (find-class class-or-name nil)))
       (warn "Delayed constructor call for ~S" class-or-name))
       (call-next-fwrapper))

   (fwrap 'excl::ensure-constructor :warn-cons 'warn-cons)

4.2 SMP and the MOP

The MetaObject Protocol (MOP) implementation in Allegro CL makes no effort to support multithreaded use of objects and generic functions while a class definition is being changed. Dispatch functions work with multi-threaded accesses that might update their caches, but keeping things straight when a class is being redefined is not manageable.

An application that updates a class definition while it is running should be designed to ensure that none of the objects are accessed, even for generic-function dispatch, until the redefinition is complete.

Note however that changing a class definition is not necessarily a direct result of a call to change-class. Because of lazy-evaluation techniques implied by CLOS and the MOP, something as simple as evaluating make-instance can cause finalizations on classes in the new instance's hierarchy (which in turn constitute changes to these classes).

So the rule-of-thumb for mixing smp and MOP changes is: Don't Do It! Be sure all CLOS class have become stable before allowing multiple threads to access them.

4.3 Stream thread safety

Streams are complex objects; operations on them are not generally thread safe. However, a mechanism has been provided that allows stream output to be interleaved, rather than haphazardly mixed character by character, and without destroying the integrity of the stream. The with-staged-output macro provides a way for multiple threads to write to a single stream. The stream must have been marked as synchronized-output-stream-p

4.4 Suspending all processes

It is not possible to suspend all threads programmatically. (All processes are suspended in a garbage collection but this cannot be generalized.)

5.0 Ensuring code is not loaded into an SMP Lisp

Some code you may have may not be SMP safe. However, this code will typically load and run in an SMP Lisp (probably producing incorrect results, perhaps silently). If you have a source file which should not be run in an SMP lisp, add this form to the top:

   (eval-when (compile load eval)
     (when (featurep :smp)
       (error "This cannot run in an SMP lisp")))

6.0 Memlog: A Facility for Minimally Intrusive Monitoring of Complex Application Behavior

The Lisp trace facility is a powerful and effective tool for monitoring and analyzing the behavior of a complex application. But when the application uses many threads, the trace output suffers from several drawbacks:

The generation of output to a stream introduces significant timing disruptions in the application.
Unsynchronized trace output can be garbled and unreadable.
If trace output is synchronized, the timing disruption is even more severe.

The memlog module implements a facility for making log entries into a memory data structure. By avoiding i/o operations a memory logging facility can have much less impact on the timing of the monitored application.

The run-time operations in the memlog module are carefully designed and implemented to be thread-safe and effective in both SMP and non-SMP environments.

The trace definition operations, memtrace and memuntrace, like many other operations that alter the state of the running Lisp system (MOP, fwrap, require, provide, ...), cannot be made thread-safe at any reasonable cost.

The operators associated with this facility are the following. Evaluate (require :smputil) to ensure the definitions are loaded into Lisp. All symbols naming these operators are in the multiprocessing package.

memlog (Function)
memtrace (Macro)
memtrace-def (Macro)
memtrace-do (Macro)
memuntrace (Macro)
memlog-init (Function)
memlog-start (Function)
memlog-stop (Function)
memlog-copy (Function)
memlog-show (Function)
memlog-disable (Function)
memlog-state (Function)
memlog-entry (Function)

Example

;; We define FACT which calculates the factorial of its (positive integer)
;; argument. We first use traditional tracing. We are in the IDE, and tracing
;; output (by default) goes to the listener for operators called in the Listener
;; and to the console for operatiors called in other thread. 
;; Then we stop tracing and initiate memlogging. Now output goes to a memlog object
;; and we can choose to see all output or just that relating to a particular
;; thread.
;;
cg-user(4): (defun fact (n) (if (zerop n) 1 (* n (fact (1- n)))))
fact
cg-user(5): (trace fact)
(fact)
cg-user(6): (fact 3)
 0[4]: (fact 3)
   1[4]: (fact 2)
     2[4]: (fact 1)
       3[4]: (fact 0)
       3[4]: returned 1
     2[4]: returned 1
   1[4]: returned 2
 0[4]: returned 6
6
cg-user(8): (mp:process-run-function "ftest" #'fact 3)
#<multiprocessing:process ftest(5) @ #x2028bbd82>

;; In the IDE, trace output (by default) goes to the console when operators
;; are not run in the Listener thread (behavior when not running the IDE is
;; different). Looking at the console, we see:
;;
;; cl-user(1): 
;; 0[5]: (common-graphics-user::fact 3)
;;   1[5]: (common-graphics-user::fact 2)
;;     2[5]: (common-graphics-user::fact 1)
;;       3[5]: (common-graphics-user::fact 0)
;;       3[5]: returned 1
;;     2[5]: returned 1
;;   1[5]: returned 2
;; 0[5]: returned 6
;;
;; We stop all tracing:

cg-user(10): (untrace)
nil

;; And we start MEMLOGGING:
;; First we load the module:
;;
cg-user(11): (require :smputil)
; Fast loading C:\acl100b2.64\code\smputil.fasl
t

;; We create a mlog object:

cg-user(12): (setq mlog (mp:memlog-init))
:stopped

;; We memtrace FACT (like TRACE, MEMTRACE arguments are not evaluated):
;;
cg-user(13): (mp:memtrace fact)
#<Interpreted Function fact>

;; We start memtracing:

cg-user(14): (mp:memlog-start)
:started
cg-user(15): (fact 3)
6

;; Now we examine the data collected. Note the format is different but
;; all the information is there.
;;
cg-user(17): (mp:memlog-show)
    4 Listener 1        35907 fact 1 3 nil nil 
    4 Listener 1        35907 fact 2 2 nil nil 
    4 Listener 1        35907 fact 3 1 nil nil 
    4 Listener 1        35907 fact 4 0 nil nil 
    4 Listener 1        35907 fact -4 1 nil nil 
    4 Listener 1        35907 fact -3 1 nil nil 
    4 Listener 1        35907 fact -2 2 nil nil 
    4 Listener 1        35907 fact -1 6 nil nil 
#S(multiprocessing::memory-log :state :stopped :gate #(549756110504 549756110504 0) :yield-rate 5 :yield t :skipped ...)

;; We run FACT in another process:

cg-user(19): (mp:process-run-function "ftest1" #'fact 3)
#<multiprocessing:process ftest1(5) @ #x2028dea52>

;; MEMLOG-SHOW with no arguments shows all data collected (and still present --
;; when the max size is reached, old data drops out but that has not happened
;; yet):
;;
cg-user(20): (mp:memlog-show)
    4 Listener 1        35907 fact 1 3 nil nil 
    4 Listener 1        35907 fact 2 2 nil nil 
    4 Listener 1        35907 fact 3 1 nil nil 
    4 Listener 1        35907 fact 4 0 nil nil 
    4 Listener 1        35907 fact -4 1 nil nil 
    4 Listener 1        35907 fact -3 1 nil nil 
    4 Listener 1        35907 fact -2 2 nil nil 
    4 Listener 1        35907 fact -1 6 nil nil 
    5 ftest1           146090 fact 1 3 nil nil 
    5 ftest1           146090 fact 2 2 nil nil 
    5 ftest1           146090 fact 3 1 nil nil 
    5 ftest1           146090 fact 4 0 nil nil 
    5 ftest1           146090 fact -4 1 nil nil 
    5 ftest1           146090 fact -3 1 nil nil 
    5 ftest1           146090 fact -2 2 nil nil 
    5 ftest1           146090 fact -1 6 nil nil 
#S(multiprocessing::memory-log :state :stopped :gate #(549756117928 549756117928 0) :yield-rate 5 :yield t :skipped ...)

;; We can restrict what is shown to a specific thread:
;;
cg-user(21): (mp:memlog-show :thread "ftest1")
    5 ftest1           146090 fact 1 3 nil nil 
    5 ftest1           146090 fact 2 2 nil nil 
    5 ftest1           146090 fact 3 1 nil nil 
    5 ftest1           146090 fact 4 0 nil nil 
    5 ftest1           146090 fact -4 1 nil nil 
    5 ftest1           146090 fact -3 1 nil nil 
    5 ftest1           146090 fact -2 2 nil nil 
    5 ftest1           146090 fact -1 6 nil nil 
#S(multiprocessing::memory-log :state :stopped :gate #(549756161960 549756161960 0) :yield-rate 5 :yield t :skipped ...)

;; We stop memlogging:
cg-user(22): (mp:memlog-stop)
:stopped
cg-user(23):