Function, Package: system
Allegro CL version 10.1
Minimal update since the initial 10.1 release.
10.0 version

reap-os-subprocess

Arguments: &key (pid -1) (wait t)

This function replaces os-wait, which has the same functionality. This function has a better interface (using keyword rather than optional arguments).

If a process is started by run-shell-command with the wait keyword argument to that function nil, then the process will remain in the system after it completes until either Lisp exits or Lisp executes reap-os-subprocess to inquire about the exit status. To prevent the system from becoming clogged with processes, a program that spawns a number of processes with :wait nil must be sure to call reap-os-subprocess after each process finishes.
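The cleanup described above can be sketched as a loop that keeps reaping with :wait nil until nothing more is ready. This is only an illustrative sketch: the function name reap-finished-subprocesses and its reaper parameter are hypothetical, introduced here so the loop can be tried with any function that accepts :pid and :wait and returns the three values described on this page.

```lisp
;; Sketch only. REAPER is a hypothetical parameter, not part of Allegro CL:
;; any function accepting :pid and :wait that returns
;; (values status pid signal), as reap-os-subprocess is documented to do.
(defun reap-finished-subprocesses (reaper)
  "Call REAPER with :wait nil until it reports that nothing was reaped.
Returns a list of (status pid signal) entries in the order reaped."
  (let ((reaped '()))
    (loop
      (multiple-value-bind (status pid signal)
          (funcall reaper :pid -1 :wait nil)
        ;; A nil first value means no process was reaped on this call,
        ;; whatever the second value is.
        (when (null status)
          (return (nreverse reaped)))
        (push (list status pid signal) reaped)))))

;; In Allegro CL the call would presumably be:
;;   (reap-finished-subprocesses #'sys:reap-os-subprocess)
```

Because :wait nil is used, the loop never blocks on processes that are still running; it must be called again later to collect those.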

The arguments, as described below, specify what reap-os-subprocess does and what processes it looks for but the function always returns three values:

  1. The exit status of the reaped process or nil, if no process was reaped. The cases when no process is reaped are described below (basically, there are no suitable processes to be reaped).
  2. The pid of the reaped process, or, when no process was reaped, the argument pid in some cases and nil in other cases. When a process is reaped, this second returned value is its process id. When there are suitable processes running but none to reap, and the wait keyword argument is nil, the pid argument is returned as the second value. When there are no suitable processes running or suitable to reap, the second returned value is nil. (The first returned value tells you whether or not a process was reaped. If the first returned value is nil, no process was reaped regardless of the second returned value.)
  3. On UNIX/Linux platforms, the number of the signal that caused the reaped process to exit, or nil if no signal caused the exit or if no process was reaped. On Windows, that information is not available and nil is always returned as the third value. This third returned value was added in release 7.0. It allows you to ensure the process exited normally: a status of 0 alone does not guarantee that, so you must also examine the signal number.
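As a minimal sketch of how the first and third values combine, the following predicate treats an exit as normal only when the status is 0 and no signal number was reported. The name normal-exit-p is hypothetical and not part of Allegro CL.

```lisp
;; Sketch only: NORMAL-EXIT-P is a hypothetical helper, not an Allegro CL function.
(defun normal-exit-p (status signal)
  "True when STATUS is 0 and SIGNAL is nil, that is, a process was reaped,
it reported status 0, and no signal number was returned."
  (and (eql status 0)
       (null signal)))

;; Typical use with the three returned values (Allegro CL only):
;;   (multiple-value-bind (status pid signal)
;;       (sys:reap-os-subprocess :pid some-pid)
;;     (declare (ignorable pid))
;;     (normal-exit-p status signal))
```

On Windows the third value is always nil, so there the predicate reduces to a check of the status alone.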

Exactly what reap-os-subprocess does depends on the status of spawned processes and the keyword arguments. The pid argument controls which processes are considered by reap-os-subprocess. If pid is -1 (the default), all processes are considered. If pid is 0, only processes in the same process group as the executing Lisp image are considered. If pid is a positive integer, only the process with that process id is considered. In the rest of this description, "processes" means "processes considered by reap-os-subprocess". For more detail, see the Unix system documentation of the waitpid() system call (you can usually see this by typing "man waitpid" at a Unix prompt).

If there are any processes started by run-shell-command with the argument :wait nil which have exited but for which reap-os-subprocess has not been run, one of them is selected by the operating system and is reaped, with reap-os-subprocess returning three values: (1) the exit status of the reaped process; (2) the process id of the reaped process; and (3) the number of the signal that caused the reaped process to exit on UNIX/Linux platforms, and nil on Windows.

If there are no such processes which have exited but there are processes which are still running, then the behavior of reap-os-subprocess depends on the wait keyword argument. If it is t (the default), reap-os-subprocess will wait (disabling multiprocessing, if necessary) until one of the running processes exits. Then reap-os-subprocess returns that process's exit status, its process id, and the relevant signal number on UNIX/Linux (nil on Windows).

If wait is nil (the default is t), reap-os-subprocess returns three values immediately. If there are no processes to clean up, the values are nil, nil, and nil. If a process is cleaned up, the values are its exit status, its process id, and the signal number or nil. If processes are still running and none has yet exited, the values are nil, the pid argument to reap-os-subprocess, and nil.

If there are no running processes, reap-os-subprocess returns immediately with the three values nil, nil, nil.
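One way to use the :wait nil behavior is to poll for a particular process with a time limit instead of blocking indefinitely. The following is a sketch under stated assumptions: poll-for-exit, its reaper parameter, and the timeout and interval keywords are all hypothetical names introduced for illustration, and reaper stands for any function with the calling convention and return values described on this page.

```lisp
;; Sketch only. POLL-FOR-EXIT and its parameters are hypothetical.
;; REAPER must accept :pid and :wait and return (values status pid signal).
(defun poll-for-exit (reaper pid &key (timeout 5) (interval 0.1))
  "Poll REAPER for PID with :wait nil until a process is reaped, no suitable
process exists, or TIMEOUT seconds pass. On timeout returns nil, PID, nil."
  (let ((deadline (+ (get-internal-real-time)
                     (* timeout internal-time-units-per-second))))
    (loop
      (multiple-value-bind (status reaped-pid signal)
          (funcall reaper :pid pid :wait nil)
        (cond (status
               ;; A process was reaped: pass its three values through.
               (return (values status reaped-pid signal)))
              ((null reaped-pid)
               ;; nil nil nil: nothing suitable exists, so stop polling.
               (return (values nil nil nil)))
              ((>= (get-internal-real-time) deadline)
               ;; Still running when the time limit was reached.
               (return (values nil pid nil)))))
      (sleep interval))))

;; In Allegro CL the call would presumably be:
;;   (poll-for-exit #'sys:reap-os-subprocess some-pid :timeout 10)
```

The sleep between polls keeps the loop from spinning; the interval is a trade-off between responsiveness and wasted calls.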

This function simply calls the Unix system function waitpid with the pid and nohang flags. Its behavior is determined by the behavior of that function.

Here are some examples of return values:

  1. nil nil nil: there are no processes, running or not, suitable to reap (based on the value of the pid argument). The wait keyword argument can be true or nil.
  2. nil -1 nil: there are processes that are suitable to reap but they are all still running, so none reaped; pid was -1 (the default, so unspecified or specified -1) and it is returned as the second value, as specified when there are suitable processes but none ready to reap. wait must be nil.
  3. nil 0 nil: same as 2 (processes suitable to reap but all still running). pid was specified as 0 (meaning only look for processes in the same group as the Lisp process), so 0 is the second returned value. wait must be nil.
  4. nil 12345 nil: as in 2 and 3, there is a process suitable to reap (the process with pid 12345) but it is still running. pid was specified as 12345 (meaning consider the process with that pid, if it exists), so 12345 is the second returned value. wait must be nil.
  5. 0 12345 nil: process with pid 12345 reaped, it exited with status 0, no signal caused it to exit (or Lisp is on Windows). The wait keyword argument can be true or nil.
  6. 0 12345 9: process with pid 12345 reaped, it exited with status 0, signal 9 caused it to exit (and Lisp is not on Windows). The wait keyword argument can be true or nil.
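The cases listed above can be told apart mechanically from the three values. The following classifier is a sketch for illustration only; classify-reap-result and the keywords it returns are hypothetical names, not part of Allegro CL.

```lisp
;; Sketch only: CLASSIFY-REAP-RESULT and its keywords are hypothetical.
(defun classify-reap-result (status pid signal)
  "Map the three values returned by reap-os-subprocess to a keyword."
  (cond ((and (null status) (null pid))
         ;; nil nil nil: nothing suitable to reap, running or not.
         :no-suitable-process)
        ((null status)
         ;; nil <pid argument> nil: suitable processes exist but all
         ;; are still running (only possible when wait is nil).
         :still-running)
        ((null signal)
         ;; A process was reaped and no signal number was reported.
         :reaped)
        (t
         ;; A process was reaped and a signal caused it to exit.
         :reaped-after-signal)))
```

Applied to the six examples, cases 2 through 4 all classify as :still-running because the first value alone decides whether anything was reaped.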

A note on the order of execution when reading from a program to be reaped in a non-multiprocessing environment

Code similar to the following skeleton may hang:

(multiple-value-bind (shell-stream error-stream process)
    (excl:run-shell-command cmd
       :input :stream :output :stream :error-output :stream
       :wait nil)

  ;; the process is reaped here, before any of its output has been read
  (when process
    (loop (when (sys:reap-os-subprocess :pid process :wait nil)
            (return))))

  ;; now read from shell-stream and then close the streams

  )

In the code sample, the process is reaped prior to reading the process output. While this often works, because many programs don't bother to wait for all of their writes to complete before exiting, it may cause hanging if the pipe to which the data is sent fills up and thus not all data can be written until some reading is done, or the child program waits until each input has been read before writing more data. Some operating systems will cause select() to not return ready status on the output descriptor if any data at all is in that pipe, regardless of whether a call to write() would have succeeded.

The correct outline for the code is:

(multiple-value-bind (shell-stream error-stream process)
    (excl:run-shell-command cmd
       :input :stream :output :stream :error-output :stream
       :wait nil)

  ;; do all the reading from shell-stream

  ;; only after the output has been read, reap the process
  (when process
    (loop (when (sys:reap-os-subprocess :pid process :wait nil)
            (return))))

  ;; close the streams
  )

When you are using multiprocessing, you can use multiprocessing tools such as process-wait to ensure that no hanging occurs.

See os-interface.htm for information on running shell programs.


Copyright (c) 1998-2022, Franz Inc. Lafayette, CA., USA. All rights reserved.
This page was not revised from the 10.0 page.
Created 2019.8.20.
