Allegro CL version 10.1
Unrevised from 10.0 to 10.1.

Streams in Allegro CL

This document contains the following sections:

1.0 Simple-stream introduction
2.0 Simple-stream background
   2.1 Problems with Gray streams
   2.2 A new stream hierarchy
3.0 The programming model
   3.1 How to get a simple-stream and how to get a Gray stream
   3.2 Trivial Stream Dispatch
   3.3 Simple-stream Description
4.0 Device Level Functionality
   4.1 Device Interface
5.0 Implementation of Standard Interface Functionality for Simple-Streams
   5.1 Implementation of Common Lisp Functions for simple-streams
   5.2 Extended Interface Functionality
      5.2.1 Blocking behavior in simple-streams
      5.2.2 The endian-swap keyword argument to read-vector and write-vector
   5.3 Force-output and finish-output policy
6.0 Higher Level functions
7.0 Simple-stream Class Hierarchy
8.0 Implementation Strategies
9.0 Control-character Processing
10.0 Device-writing Tips
   10.1 Defining new stream classes
   10.2 Device-open
   10.3 From-scratch device-open
   10.4 Implementation Helpers for device-read and device-write
   10.5 Other Stream Implementation Functions and Macros
   10.6 Details of stream-line-column and charpos
11.0 The simple-stream class hierarchy illustrated
12.0 Encapsulating Streams
   12.1 Encapsulation terminology
   12.2 Strategy descriptions necessary for encapsulation
   12.3 Valid connections between octet-oriented and character-oriented streams
   12.4 Examples of stream encapsulations
      12.4.1 Rot13b: An Example of Bidirectional Stream Encapsulation
      12.4.2 Base64: an example of binary stream encapsulation
   12.5 Encapsulating composing external-formats
Appendix A. Built-in stream methods and their uses
   Appendix A.1. The print-object built-in stream method
Appendix B. peek-byte and unread-byte

The Allegro CL streams model uses simple-streams, which formally have no element-type but act in general as if they have element type (unsigned-byte 8). The implementation is described in this document. An older stream implementation called Gray streams is still supported. It is described in gray-streams.htm.

A simple-stream is created whenever a file is opened (with open) without an element-type specified. If open is called with an element-type specified, a Gray stream is created, except for string output streams, as noted next.

String output streams are always simple-streams even though both make-string-output-stream and with-output-to-string accept an element-type keyword argument. Both operators create simple-streams regardless of whether a value is specified for the element-type keyword argument or not.

The transition from Gray streams to simple-streams should be easy and transparent unless you have done extensive stream customization.

It is unlikely that users who are not concerned with stream details will have to concern themselves with this document. Common stream usages, such as opening files for reading and writing, will simply work as expected.

Symbols naming stream functionality that are not standard Common Lisp symbols are generally in the :excl package.

1.0 Simple-stream introduction

The standard Allegro CL stream implementation uses simple-streams. Gray streams are also supported (see gray-streams.htm). Both kinds of stream may co-exist in a Lisp, and compatibility is maintained for the standard Common Lisp streams interface. Where Gray streams have been used, the Gray streams implementation remains compatible with previous Gray versions with minimal source intervention.

The Allegro CL simple-stream is the next generation evolving from the stream of the same name in Common Graphics (the windowing systems used by Allegro CL on Windows). The Common Graphics simple-stream was created as a basis for window-based input and output. The new generation of simple-stream is designed to include this window-based I/O and at the same time to retain the speed always expected for non-window based streams. It is also designed to promote further advances in technology requirements such as International Character sets and server-based I/O.

2.0 Simple-stream background

We felt a new stream implementation was needed because of problems we found with the Gray stream design. In this section, we describe these problems and describe the new implementation.

2.1 Problems with Gray streams

Gray streams distinguish input and output directions per class, forcing combination and mixins in order to model the 3 different modes (input only, output only, and input/output) for various stream classes.
Gray streams, in accordance with CL, distinguish streams by element-type. We have found this to be an unfortunate limitation, since it makes it hard to transfer varying-width elements in the same stream. Allegro CL 5.0.1 introduced bivalent streams, which allow both varying-width elements and character elements to be transferred. This was a step in the right direction, and enables web servers to be written more easily and efficiently.
Gray streams methods, which define the specific streams implementation, are defined immediately below the level of the CL streams interface, which causes a couple of problems:
   1. The CLOS dispatch is performed at a higher level than is necessary, thus creating inefficient instruction execution paths that are not easily optimizable.
   2. The implementation interface of Gray streams, which is specified using CLOS, overlaps in its behavior, thus causing confusion as to what specializations are needed. For example, the obvious implementation for stream-read-char-no-hang is to call stream-read-char after a call to stream-listen. However, since subclassing a stream can result in a version which does not perform this listen/read combination, further subclassing is not possible without having the source to this version, since it is otherwise not possible to know whether to define a method for stream-read-char-no-hang, or for stream-read-char and stream-listen, or perhaps for all three, in which case it would be unknown which code would be actually executed.
Gray streams force the duplication of a large amount of code, for the implementation of the basic functionalities such as stream-read-char. In a sense, this is due to the fact that the implementation level is too high, and this forces the duplication of effort in the implementation.

2.2 A new stream hierarchy

A new class hierarchy of streams with a base-class of simple-stream removes the problems inherent in Gray streams. Simple-streams are more efficient, and are simpler in concept, making it easier to extend the streams interface by object-oriented means.

A major simplification in the simple-stream hierarchy over Gray streams is the collapsing of many class distinctions into one:

Gray streams distinguish input and output directions per class, whereas simple-streams make this distinction with flags.
Gray streams distinguish streams by element-type. Simple-streams have no element-type, but (with the exception of string streams) always act as if transferring octets (8-bit bytes).

The Gray streams class hierarchy distinguishes between different kinds of stream usage. The simple-stream hierarchy distinguishes between different kinds of external I/O device or pseudo-device.

This device-level interface is CLOS oriented, but is at a much lower level than the Gray streams implementation level, making it much more efficient in both execution speed and in space.

3.0 The programming model

3.1 How to get a simple-stream and how to get a Gray stream

The Gray streams implementation is supported along with simple-streams. Programs using customized Gray streams will, therefore, continue to work as in earlier releases with only minor changes. (Users of Gray streams must ensure, as described below, that calls to open have the element-type keyword argument specified -- if it is unspecified, give it the value character. Users must also be careful not to assume a stream they did not create is a Gray stream; such streams can be handled using synonym streams or by loading a compatibility module, again as described below in this section. Gray streams are described in gray-streams.htm.)

The normal mechanism used to specify a simple-stream is to call open (and thus any callers of open, such as with-open-file) to open the stream without specifying the element-type. This will cause a simple-stream to be created, instead of a Gray stream. A Gray stream will be created if an element-type argument of any kind is given to the open call.

A potential problem arises when legacy code requiring Gray streams calls open with no element-type argument. Under the CL specification, this kind of open causes the element-type to default to character. All CL functionality will be compatible between Gray streams and simple-streams, but if the user was counting on specific Gray stream functionality in character streams, then the open call must be changed to include :element-type 'character as arguments, which will force a Gray stream.
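The effect of the element-type argument can be checked with a short sketch such as the following. The file name and the helper names stream-kind and open-kinds are hypothetical. The class tests are guarded with #+allegro because excl:simple-stream and excl:fundamental-stream (the top-level Gray stream class) exist only in Allegro CL; other implementations simply report :other.

```lisp
;; Hypothetical scratch file used only for this illustration.
(defparameter *demo-path* "simple-stream-demo.txt")

(defun stream-kind (stream)
  "Return :simple, :gray, or :other for STREAM.
The class names are Allegro CL specific; elsewhere the answer is :other."
  (declare (ignorable stream))
  (or #+allegro (cond ((typep stream 'excl:simple-stream) :simple)
                      ((typep stream 'excl:fundamental-stream) :gray))
      :other))

(defun open-kinds ()
  "Open the same file without and with :element-type and classify each stream."
  (with-open-file (out *demo-path* :direction :output :if-exists :supersede)
    (write-line "hello" out))
  (list (with-open-file (in *demo-path*)
          (stream-kind in))
        (with-open-file (in *demo-path* :element-type 'character)
          (stream-kind in))))
```

Following the rules described in this section, (open-kinds) would be expected to classify the first stream as :simple and the second as :gray in Allegro CL.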

Another problem arises when Gray streams application code assumes that the stream it is handed is a Gray stream, and thus tries to call stream- methods on it. (Allegro CL names the Gray streams CLOS substrate stream-[cl-function-name], e.g. stream-read-char for read-char.) Allegro CL with the new stream implementation solves this problem by defining its synonym-stream implementation as a Gray stream, but in such a way that all calls via the synonym-stream-symbol are non-Gray-specific. For example, a call to stream-read-char to a synonym-stream will result in a call to read-char on the synonym-stream-symbol. This is slightly slower than dispatching on stream-read-char, but it does provide for compatibility with legacy code. A programmer who doesn't want to rewrite a subsystem to use simple-streams can simply ensure that any stream passed to that subsystem is a synonym stream.
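A minimal sketch of the synonym-stream approach follows. The names *legacy-target*, legacy-first-char, and call-with-synonym-stream are hypothetical; legacy-first-char stands in for an existing Gray-only subsystem. make-synonym-stream is standard Common Lisp, and the excl:stream-read-char call is guarded so that the sketch also loads outside Allegro CL.

```lisp
(defvar *legacy-target* nil
  "Special variable that the synonym stream forwards to (hypothetical name).")

(defun legacy-first-char (stream)
  "Stand-in for legacy code that insists on the Gray stream protocol."
  #+allegro (excl:stream-read-char stream)
  #-allegro (read-char stream))

(defun call-with-synonym-stream (stream function)
  "Call FUNCTION on a synonym stream that forwards to STREAM."
  (let ((*legacy-target* stream))
    (funcall function (make-synonym-stream '*legacy-target*))))
```

The synonym stream is only usable while *legacy-target* is bound, so the legacy code should finish its work inside the call.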

If the stream being manipulated is one that is not easily wrapped as a synonym-stream (e.g. *terminal-io*), a second approach is provided in the form of the module :gray-compat. This module contains methods on simple-streams for generic functions normally associated with Gray streams. If, for example, the call

(stream-read-char *standard-input*)

exists in legacy code, then requiring the :gray-compat module (by evaluating (require :gray-compat)) defines a method on stream-read-char for simple-streams, which simply wraps some argument and return-value processing around a call to read-char. This again is slower than calling read-char directly, but provides compatibility for legacy code.
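In code, the :gray-compat approach might look like the sketch below. The function name legacy-read-from is hypothetical. The require form and the stream-read-char call are the ones described above; both are guarded with #+allegro, and other implementations fall back to plain read-char.

```lisp
;; Load the compatibility methods once, at load time (Allegro CL only).
#+allegro (require :gray-compat)

(defun legacy-read-from (stream)
  "Read one character from STREAM through the Gray entry point where available."
  #+allegro (excl:stream-read-char stream)
  #-allegro (read-char stream))
```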

3.2 Trivial Stream Dispatch

CL stream-functions read-byte, write-byte, read-char, etc., all distinguish in a trivial manner whether the stream is a Gray or simple stream. If a Gray stream is detected, the associated Gray generic function is called for the stream, so that for example, read-char calls stream-read-char, write-char calls stream-write-char, etc. However, if the stream is determined to be a simple-stream, then the specified lower level functionality for the function is called, which may involve calls to specific device-level functionality. This is described in Section 5.1 Implementation of Common Lisp Functions for simple-streams below.

Note that this trivial dispatch does not use any CLOS dispatch mechanism, and the functionality that is called for a simple-stream may be inlined in the function. Thus, for example, all write-byte operations for a simple-stream are performed without any function calls, unless the buffer fills up and device-write must be called.
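The idea of trivial dispatch can be modeled with a toy example. Everything below is hypothetical and is not how Allegro CL is actually implemented: a structure stands in for a simple-stream with its buffer, a CLOS class stands in for a Gray stream, and toy-read-byte chooses between them with a cheap type test so that only the Gray branch pays for generic function dispatch.

```lisp
;; Toy model only: every name below is hypothetical, not an Allegro CL API.
(defstruct toy-simple-stream
  (buffer (make-array 0 :element-type '(unsigned-byte 8)))
  (pos 0)   ; index of the next octet to hand out
  (end 0))  ; index one past the last valid octet

(defclass toy-gray-stream ()
  ((bytes :initarg :bytes :accessor toy-bytes)))

(defgeneric toy-stream-read-byte (stream)
  (:documentation "Stand-in for a Gray generic function such as stream-read-byte."))

(defmethod toy-stream-read-byte ((stream toy-gray-stream))
  (if (toy-bytes stream)
      (pop (toy-bytes stream))
      :eof))

(defun toy-read-byte (stream)
  "Branch on a type test; the simple-stream path touches only the buffer."
  (if (toy-simple-stream-p stream)
      (if (< (toy-simple-stream-pos stream) (toy-simple-stream-end stream))
          (prog1 (aref (toy-simple-stream-buffer stream)
                       (toy-simple-stream-pos stream))
            (incf (toy-simple-stream-pos stream)))
          ;; A real stream would try to refill the buffer at this point.
          :eof)
      (toy-stream-read-byte stream)))
```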

3.3 Simple-stream Description

A simple-stream has no specific element-type associated with it. Instead, the fundamental unit of transfer for a simple stream that is not a string stream is the octet (an 8-bit byte), and all transfers are made at the lowest level with respect to octets. It is up to the implementation to decide how to optimize data transfers for particular situations where data paths are either wider or narrower than 8 bits.
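Because the unit of transfer is the octet rather than a declared element-type, character and octet reads can be mixed on one stream. The sketch below assumes a file whose first two octets are ASCII; the function names are hypothetical. Outside Allegro CL a stream has a fixed element-type, so the fallback branch decodes the first octet by hand as a stand-in.

```lisp
(defun write-demo-file (path)
  "Write the two ASCII characters AB to PATH and return PATH."
  (with-open-file (out path :direction :output :if-exists :supersede)
    (write-string "AB" out))
  path)

(defun read-char-then-octet (path)
  "Read one character, then one raw octet, from the same stream."
  #+allegro
  (with-open-file (in path)            ; no :element-type, so a simple-stream
    (list (read-char in) (read-byte in)))
  #-allegro
  (with-open-file (in path :element-type '(unsigned-byte 8))
    ;; Portable stand-in: treat the first octet as an ASCII character code.
    (list (code-char (read-byte in)) (read-byte in))))
```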

A simple-stream is always buffered. Whereas support is provided for buffering for CL and Gray streams, buffering is not explicitly required in these stream specifications. However, the explicit requirement that simple-streams are buffered allows a simpler and potentially more efficient model. Note that there is no direct interface to simple-stream buffers. The buffering layer resides just below the CL interface level, and the device layer is just below the buffering layer.

In the following diagram, we show the function call hierarchy (top to bottom, as higher level units have function calls to lower level units, eventually reaching the lowest layer, which is the device layer) and the data flow (output from Lisp left to right, input to Lisp right to left).

          --------->   output direction   --------->

          <=========   input direction    <=========
                                                                    functional
     User Level              Strategy Level         Device-level      call
                        |                      |                    hierarchy
  --------------------                                                  |
 |  CL functionality  | |                      |                        |
  --------------------                                                  |
   |    |         |     |                      |                        v
   |    |          -------------------------------------              
   |    |               |                      |         |            
   |    |                                                v            
   |    |               |                      |  ------------------- 
   |    |                                        | Control-character |
   |    |               |                      | |   processing      |
   |    |                                         ------------------- 
   |    |               |                      |    |    |   |        
   |    |                           .---------------     |   |        
   |    |               |           v          |         |   |        
   |    |                  ------------------            |   |        
   |    |               | | external-format  | |         |   |        
   |    |                 |   processing     |           |   |        
   |    |               |  ------------------  |         |   |        
   |    |                        |                       |   |        
   |     -------------------.    |    .------------------    |        
   |                    |   v    v    v        |             |        
   |                       ---------------                   |        
   |                    | |   Buffering   |    |             |        
   |                       ---------------                   |        
   |                    |        |             |             |        
   |                              -----------------------.   |        
    --------------------------------------------------.  |   |        
                        |                      |      v  v   v        
                                                  -----------------   
                        |                      | |   Device layer  |  
                                                  -----------------

String streams bypass the external-format (that is, External File Format, as defined in Common Lisp) processing, since their destinations are not really external.

Programmers will work with simple-streams at various levels, wearing one of three different hats at any one time:

As an applications programmer (or as a user), who calls the standard interface functionality (including standard Common Lisp and related functions described in Section 5.1 Implementation of Common Lisp Functions for simple-streams below).
As a device-level programmer, who extends the stream interface by writing device-level methods for subclassed streams.
As a strategy-level programmer, who implements the standard interface functionality (which calls the device-level functionality). Strategy is an advanced level and most programmers will not need to program at this level (using instead, the tools already provided). Strategy is discussed in this document but explicit strategy-level programming rules and tips are outside its scope.

There is also a set of functions provided which aid in the implementation of the device layer, and which at the same time are themselves User Level functions. These functions allow the design of encapsulating streams, where the encapsulating stream's device level becomes the encapsulated stream's user level. These functions are discussed in Section 10.4 Implementation Helpers for device-read and device-write.

The intended interaction by the user or applications programmer is to work above the buffer level. The user does so by calling standard CL functions. The device-level programmer may define new classes and device-level methods for "drivers" (we are using the word analogously to device drivers that are implemented in operating systems), but even then it is not intended that the user call the device-level methods directly. But note that it is possible to call device-level methods if all of the rules are followed. The strategy-level programmer may design an alternate API that calls the device-level, but it must conform to the requirements that allow the device-level to work properly. The intended role of the API is that it is a thin layer which manipulates its buffer and thus deals with the device layer as little as possible. Such APIs are intended to be very fast.

4.0 Device Level Functionality

The device level is called that because it provides an underlying implementation that can be specialized to suit particular kinds of stream connections, in a similar manner to a device driver in an operating system.

Only simple-streams provide a device layer; Gray streams do not. The device layer puts the object implementation of a simple-stream at a lower level than the object layer of Gray streams.

The goals of the device layer are:

to be used with buffering at higher levels
to provide a small functional interface that has no overlap
to use methods that are called as few times as possible (a call to the device interface should be thought of as a call to the operating-system, and thus expensive, so the number of such calls must be minimized).
to use fast method dispatch like the current Allegro CL Gray streams implementation.
to allow encapsulating streams by allowing streams to implement device level functionality.

The device layer is not intended to be called directly, except by strategies for higher-level API interfaces that conform to strategy rules. Such APIs should be very lightweight and fast so that there is no need or temptation to call the device-layer directly. Creators of such higher-level APIs must be especially careful to understand the buffering issues involved, including those described in device-read and device-write.

Note that the device layer can implement whatever kind of connection it is set up to do. Usually this means that it will talk directly to a file handle or file descriptor number. However, the connection can be made to a stream of a different type instead of directly to an operating-system level file. By this means, Java style stream encapsulations can be created by the device-level programmer.

Such encapsulation functionality is done automatically by some functions provided as implementation helpers (see Section 10.4 Implementation Helpers for device-read and device-write).

4.1 Device Interface

Simple-streams are normally opened with device-open and closed with device-close. device-buffer-length returns the desired length of buffers to be allocated for the stream, if any. device-file-position returns a positive integer that is the current octet (8-bit byte) position of the device represented by its argument stream. device-file-length returns the number of octets (8-bit bytes) in the argument stream if possible.

device-read fills a buffer (if possible) with data from its argument stream. device-clear-input clears any pending input on the device connected to its argument stream. device-write writes from the buffer to the argument stream. device-clear-output clears pending output.

A method that doesn't fall under the strict buffer-unaware read-write device methods is device-finish-record. Unlike device-read and device-write, that method may manipulate stream slots, allocate new work spaces, or call out recursively to higher level stream functions. The intention here is to separate the pure fill and flush aspect of device-read and device-write from the more complex aspects of mapping and record-orientation.

The one exception to the buffer-unaware separation in device-read and device-write is when they receive a null buffer argument from the strategy layer, and their start and end arguments are not the same. This will occur if the buffer that would have been passed is the actual buffer of the stream.

Under this circumstance the device-read/device-write method has a little leeway; it must assume that the null buffer argument refers to the appropriate buffer in the actual stream, and must retrieve that argument for use. However, it is free to detach and/or replace the buffer with another of the same size. Also, in the case of device-read, the length of the buffer must be used as the end argument, which will also be nil if the buffer argument is nil (unless end is also eql to start). This flagging of the stream's buffer enables device-read and device-write methods to be written that perform advanced buffer-management and asynchronous read-write operations.

5.0 Implementation of Standard Interface Functionality for Simple-Streams

The first subsection describes the implementation of standard Common Lisp functions that deal with streams. Note that the behavior is usually different for Gray streams (where the CL function usually calls an associated Allegro-CL-specific generic function) and for simple-streams (on which the CL function usually operates directly).

The second subsection describes additional functions that operate on streams, but are specific to Allegro CL.

5.1 Implementation of Common Lisp Functions for simple-streams

Given the device interface, we can now describe how standard Common Lisp functions and some related Allegro CL functions are implemented in terms of these driver functions. Because the intention of this section is to provide implementation information, but not to describe how to use the functions, usage details such as argument lists are not provided.

open

Function

Package: common-lisp

See open for the ANSI description.

For both Gray and simple streams, open effectively turns into a call to make-instance of a stream class. Additionally, for simple-streams, a shared-initialize after method calls device-open to actually establish the connection with the external device or file. If the device-open call then fails and thus returns nil, then device-close is called immediately with a true abort argument.
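The open sequence for simple-streams can be modeled in miniature. Everything in this sketch is hypothetical (toy-stream, toy-device-open, toy-device-close, toy-open); it only mirrors the order of events described above, with an after method on shared-initialize that tries to open the device and, on failure, closes it at once with a true abort argument.

```lisp
;; Toy model only: every name below is hypothetical, not an Allegro CL API.
(defclass toy-stream ()
  ((path  :initarg :path :initform "" :reader toy-path)
   (state :initform :unopened :accessor toy-state)))

(defgeneric toy-device-open (stream options)
  (:documentation "Return true if the connection was established."))

(defgeneric toy-device-close (stream abort)
  (:documentation "Tear down the connection; a true ABORT discards buffers."))

(defmethod toy-device-open ((stream toy-stream) options)
  (declare (ignore options))
  ;; Pretend that only a non-empty path can be opened.
  (when (plusp (length (toy-path stream)))
    (setf (toy-state stream) :open)
    t))

(defmethod toy-device-close ((stream toy-stream) abort)
  (setf (toy-state stream) (if abort :aborted :closed)))

(defmethod shared-initialize :after ((stream toy-stream) slot-names
                                     &rest initargs)
  (declare (ignore slot-names))
  ;; Try to open; on failure, close immediately with a true abort argument.
  (unless (toy-device-open stream initargs)
    (toy-device-close stream t)))

(defun toy-open (path)
  "Like open in miniature: it turns into a call to make-instance."
  (make-instance 'toy-stream :path path))
```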

A call to open creates a simple-stream when the element-type keyword argument is not specified. A Gray stream is created when the element-type keyword argument is specified.

open has an &allow-other-keys specification, and an &rest argument. This &rest argument forms the basis of the make-instance initargs when it is called via apply.

A special case exists for an open with :direction :probe: this case is not a normal open and does not actually result in a connection of any kind being made. Instead, make-instance is called to make an instance of probe-simple-stream.
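From the caller's side, a probe open behaves as standard Common Lisp specifies: the result is either nil (when the file does not exist) or a stream that is already closed. The sketch below uses only standard functions; the name probe-demo is hypothetical.

```lisp
(defun probe-demo (path)
  "Return two values: whether PATH exists, and whether the probe stream is open."
  (let ((stream (open path :direction :probe)))
    (values (and stream t)
            (and stream (open-stream-p stream)))))
```

Since no connection is made, the second value is false even when the file exists.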

close

Generic Function

Package: common-lisp

See close for the ANSI description.

The Gray stream system in Allegro CL implements close as a generic function, which is perfectly legal according to CL, which defines close as a function (i.e. a generic function is indeed a function). However, a generic function implies a specialization capability that does not exist for simple-streams; simple-stream specializations should be on device-close. Besides Gray streams, close can be specialized on streams that are neither Gray nor simple-streams. One example of this is Allegro CL's passive socket connection. Because of this, close remains a generic function, but for simple-streams it is treated as if non-generic; that is, simple-streams should not specialize on close, but should specialize on device-close instead. The method for simple-streams simply calls device-close precisely once, and a method for fundamental-stream (the top-level Gray stream class) breaks the connection and sets a closed-flag in the stream.

If the abort keyword argument is true, any buffers are cleared without being flushed. If abort is false, then any unflushed buffers are forced out to the device before closing.

read-byte

Generic Function

Package: common-lisp

See read-byte for the ANSI description.

For a Gray stream, read-byte calls stream-read-byte.

Otherwise: If the stream's buffer is empty, an attempt is made to fill the buffer by calling device-read with the blocking argument set to true. If device-read returns -1, then we are at eof; either eof-value is returned or else an end-of-file error occurs.

If the stream's buffer is now not empty, the next octet (8-bit byte) is extracted from the buffer and returned.

This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.
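As a usage sketch, the loop below counts the octets in a file by calling read-byte with an eof-value until end of file is reached. The name count-octets is hypothetical. In Allegro CL the file is opened without an element-type so that a simple-stream is used; the reader conditionals add a binary element-type elsewhere, where read-byte requires one.

```lisp
(defun count-octets (path)
  "Count the octets in the file at PATH using read-byte with an eof-value."
  ;; In Allegro CL, omitting :element-type yields a simple-stream, on which
  ;; read-byte works directly. Other implementations need a binary stream.
  (with-open-file (in path
                      #-allegro :element-type
                      #-allegro '(unsigned-byte 8))
    (loop for octet = (read-byte in nil :eof)
          until (eq octet :eof)
          count t)))
```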

read-char

Function

Package: common-lisp

See read-char for the ANSI description.

For a Gray stream, read-char calls stream-read-char.

Otherwise: The external format is called to accumulate (as if using read-byte) as many octets (8-bit bytes) as is necessary to form a character. If an end-of-file is generated by any of the read-bytes, eof processing is done depending on the eof arguments.

If the character that results is a control character (one whose char-code is less than 32) and the control-in table has a function for that character, then it is a function with two arguments (the stream and the character) which is called to interpret the control character at this time. If the control-in function returns, it returns either a character which is processed normally, or nil, which is interpreted as an eof and eof processing is done. Note that the control-in handler must not try to do any reading from the stream at all; the intention for the control-in handler is to translate an already-received character to another, or to perform an operation and return a character. For ligatures and other multiple-character inputs, a composing external-format should be used or created, or else an encapsulation created for such translations.

If we got this far, the character length is recorded for unread-char and the character is returned.

Note that if eof occurs while reading a character, the actions taken by read-char depend on the external-format. The default action, and by far the most common, is to do eof processing. However, the external format may decide to return a character (saved from a previous read-char) or to generate an error.

This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.

unread-char

Function

Package: common-lisp

See unread-char for the ANSI description.

For a Gray stream, unread-char calls stream-unread-char.

Otherwise: if the unread-char character-length is set, then place the buffer and file position back to that and unset the unread-char length. Error if the unread-char character-length is not set.

read-char-no-hang

Function

Package: common-lisp

See read-char-no-hang for the ANSI description.

For a Gray stream, read-char-no-hang calls stream-read-char-no-hang.

Otherwise: The external format is called to accumulate (as if using excl::read-byte-no-hang) as many octets (8-bit bytes) as is necessary to form a character. If an end-of-file is generated by any of the byte reads, eof processing is done depending on the eof arguments.

If the character is a control character (one whose char-code is less than 32) and the stream-class specifies interpretation of such characters, it is performed at this time, which may include eof-processing for a control-D.

If we got this far, the character length is recorded for unread-char and the character is returned.

Note that if it is not possible to complete the build of the character, the actions taken by read-char-no-hang depend on the situation:

If there is no data currently available, enough octets are unread to put the stream into a state similar to the one it was in before the operation was started, and nil is returned. (Similar means that the buffer may have been read during this operation, but that the pointers are set so that the next octet read will be the same as the first octet read for this operation.)
If an eof condition exists, it is up to the external-format to decide whether or not to do normal eof processing, generate an error, or return a character. The default action, and by far the most common, is to do eof processing.

This is a non-blocking function. See Section 5.2.1 Blocking behavior in simple-streams.
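A typical use of read-char-no-hang is to collect whatever input is already available without waiting for more. The sketch below uses only standard Common Lisp; the name drain-available is hypothetical. It stops both when no character is ready (nil) and at end of file (the :eof marker).

```lisp
(defun drain-available (stream)
  "Return a string of the characters that can be read from STREAM without blocking."
  (with-output-to-string (out)
    (loop for ch = (read-char-no-hang stream nil :eof)
          while (characterp ch)
          do (write-char ch out))))
```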

peek-char

Function

Package: common-lisp

See peek-char for the ANSI description.

For a Gray stream, peek-char calls stream-peek-char. Otherwise: a read-char equivalent is done, followed by an unread-char.
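The read-then-unread equivalence can be written out by hand in standard Common Lisp. The name peek-by-hand is hypothetical, and the sketch ignores the peek-type argument of peek-char.

```lisp
(defun peek-by-hand (stream)
  "Return the next character without consuming it, or nil at end of file."
  (let ((ch (read-char stream nil nil)))
    (when ch
      (unread-char ch stream))
    ch))
```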

listen

Function

Package: common-lisp

See listen for the ANSI description.

An extra second optional argument, width, is added to listen (the first optional argument is the specified stream argument). width specifies the number of octets to read before returning true, or the value character. Currently, any other value than 1 will be treated as if it were specified as 'character.

For a Gray stream, listen calls stream-listen.

Otherwise: If a character-oriented listen is specified (i.e. width is character), then an attempt is made to build the complete character, as if with read-char-no-hang. If successful, the equivalent of an unread-char is then done and true is returned; otherwise nil is returned. If 1 octet is being listened for, then if the buffer is not empty, true is returned. Otherwise device-read is called with a null blocking argument. If that returns 0, then nil is returned, otherwise true is returned.

If the added optional argument is 1 or not specified, only an octet (8-bit byte) is looked for, otherwise external-format processing is used to attempt to build a character in a non-blocking way; if it is determined that the character can definitely be built, then t is returned. However, the state of the stream is left in such a way that an unread-char can be done even after the listen (as is appropriate).
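A sketch of a character-oriented listen follows. The name char-ready-p is hypothetical. The second argument to listen is the Allegro CL extension described above, so it is guarded with #+allegro; other implementations fall back to the standard one-argument call.

```lisp
(defun char-ready-p (stream)
  "True if a complete character can be read from STREAM without blocking."
  #+allegro (listen stream 'character)
  #-allegro (listen stream))
```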

read-line

Function

Package: common-lisp

See read-line for the ANSI description.

For a Gray stream, read-line calls stream-read-line and processes the return values according to eof-error-p processing.

Otherwise: String buffers are allocated as necessary and read-char equivalent is performed until either a #\Newline or eof is seen. A new string is allocated of the proper length and filled with the copied data from the temporary buffer(s) and then returned along with the missing newline flag.

Note: The read-line functionality can be optimized in the following way: A string buffer is allocated (this first one presumably on the stack) and read-char equivalent is performed until the next #\Newline or eof is seen (or until the buffer is full, at which time new buffers are allocated as necessary). A new string of the proper length is then constructed and filled with the copied data from the temporary buffer(s) and then returned along with the missing newline flag.
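The chunked buffering described in the note can be sketched in portable Common Lisp as follows. This is an illustration of the technique, not the Allegro CL implementation: sketch-read-line and its chunk-size argument are hypothetical, stack allocation is not attempted, and eof-error-p handling is omitted (an immediate end of file yields an empty string and a true second value).

```lisp
(defun sketch-read-line (stream &optional (chunk-size 80))
  "Return the next line of STREAM and a missing-newline flag."
  (let ((chunks '())                       ; full buffers, most recent first
        (buffer (make-string chunk-size))  ; buffer currently being filled
        (fill 0))                          ; number of characters in BUFFER
    (flet ((finish ()
             ;; Copy the temporary buffers into one string of the proper length.
             (let ((result (make-string (+ (* chunk-size (length chunks)) fill)))
                   (offset 0))
               (dolist (chunk (reverse chunks))
                 (replace result chunk :start1 offset)
                 (incf offset chunk-size))
               (replace result buffer :start1 offset :end2 fill)
               result)))
      (loop
        (let ((ch (read-char stream nil nil)))
          (cond ((null ch)
                 (return (values (finish) t)))
                ((char= ch #\Newline)
                 (return (values (finish) nil)))
                (t
                 ;; Allocate a new buffer only when the current one is full.
                 (when (= fill chunk-size)
                   (push buffer chunks)
                   (setf buffer (make-string chunk-size)
                         fill 0))
                 (setf (schar buffer fill) ch)
                 (incf fill))))))))
```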

Space-efficient variants of read-line

The functions read-line-into and simple-stream-read-line are similar to read-line but also take result string arguments to hold the line which is read, thereby causing little or no consing.

read-sequence

Function

Package: common-lisp

Arguments: sequence stream &key start end partial-fill

See read-sequence for the ANSI description. Note that Allegro CL uses the additional partial-fill keyword argument, which is not specified in ANSI CL.

For a Gray stream, read-sequence calls stream-read-sequence.

Otherwise: If the sequence is a string, then for every element of the string, a read-char equivalent is performed. Following the last read-character, the unread-char length is set (instead of at every character read). If partial-fill is true, then a read-char-no-hang equivalent is used instead of read-char, after the first character is read with read-char.

If the sequence is an octet vector (i.e. a vector of (signed-byte 8) or (unsigned-byte 8) elements), then the equivalent of read-vector is performed (but possibly blocking if partial-fill is false - see discussion below).

Any other sequence type generates an error (for a simple-stream).

The partial-fill keyword argument

This argument controls the blocking behavior. See Section 5.2.1 Blocking behavior in simple-streams for a general discussion of blocking.

This argument controls the behavior when there are not enough objects (of whatever is being read) on stream to fill the sequence passed as the first argument (at least as far as end, if given) and no EOF is seen. The ANSI specification for read-sequence requires it to block until the sequence is filled or an EOF is seen. In the Allegro CL implementation, the ANSI behavior (blocking) is observed if partial-fill is nil (the default).

If partial-fill is true, however, read-sequence will block for the first element, but will not block for any elements after the first, and so may return prior to the request being completed.

In all cases, read-sequence returns the index in the sequence of the next element not read.
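The return value can be seen with a small portable sketch. It reads from a string input stream, which never blocks, so the Allegro CL partial-fill argument is deliberately not used; the helper name fill-from-string is hypothetical.

```lisp
(defun fill-from-string (source size &key (start 0) end)
  "Fill a fresh string of SIZE dash characters from SOURCE.
Returns the string and the index returned by read-sequence."
  (let ((buffer (make-string size :initial-element #\-)))
    (with-input-from-string (s source)
      (let ((next (read-sequence buffer s :start start :end end)))
        (values buffer next)))))

;; EOF is seen before the buffer is full: three elements were stored.
(fill-from-string "abc" 6)                       ; => "abc---", 3
;; Only the region from start to end is filled.
(fill-from-string "abcdefgh" 6 :start 2 :end 5)  ; => "--abc-", 5
```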

clear-input

Function

Package: common-lisp

See clear-input for the ANSI description.

For a Gray stream, clear-input calls stream-clear-input. Otherwise: if there is any input buffering in the stream, it is thrown away. Then device-clear-input is called. An additional optional buffer-only argument is added above and beyond the ANSI CL spec which allows only the buffer to be cleared, without necessarily performing any other operations on encapsulations of the stream. This argument is passed to device-clear-input.

write-byte

Function

Package: common-lisp

See write-byte for the ANSI description.

For a Gray stream, write-byte calls stream-write-byte.

Otherwise: If the buffer is full, device-write is called to first write the buffer out, so that the buffer is made empty. An octet (8-bit byte) is expected as input. It is then stored into the stream's (non-full) buffer.

This is the lowest level functionality in the output portion of the CL API functions. Higher level functions which may call this function are: write-char, write-sequence, write-vector. Whether or not these functions actually call write-byte, call an internal but similar function, or expand all of write-byte's functionality inline is not specified.

This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.

write-char

Function

Package: common-lisp

See write-char for the ANSI description.

For a Gray stream, write-char calls stream-write-char.

Otherwise: If the character to be output is a control character (one whose char-code is less than 32), the control-out table is consulted for a control-out function for that character. If one exists it is assumed to be a function of two arguments (the stream and the character), and is called for device-level processing. If the control-out function exists and returns non-nil, then no further action is taken for this character since it was handled successfully in the control-out function. If the control-out function does not exist or exists and returns nil, then normal processing continues for that character. Normal processing means that the character is treated as itself, to be sent uninterpreted to the stream.

The external-format functionality currently in effect is called for the character, which may result in any number of octets (8-bit bytes) being generated. These octets are then treated as if write-byte were called for each one, in the order they were received from the external-format processing.

This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.
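The way one character can turn into several octets may be sketched as follows. This is not the Allegro CL external-format machinery; it is a hand-written UTF-8 encoder used as a stand-in, and the names utf8-octets and write-char-as-octets are hypothetical. The octet-sink function plays the role of the write-byte equivalent.

```lisp
(defun utf8-octets (char)
  "Return the list of octets a UTF-8 encoder would emit for CHAR."
  (let ((code (char-code char)))
    (cond ((< code #x80) (list code))
          ((< code #x800)
           (list (logior #xC0 (ldb (byte 5 6) code))
                 (logior #x80 (ldb (byte 6 0) code))))
          ((< code #x10000)
           (list (logior #xE0 (ldb (byte 4 12) code))
                 (logior #x80 (ldb (byte 6 6) code))
                 (logior #x80 (ldb (byte 6 0) code))))
          (t
           (list (logior #xF0 (ldb (byte 3 18) code))
                 (logior #x80 (ldb (byte 6 12) code))
                 (logior #x80 (ldb (byte 6 6) code))
                 (logior #x80 (ldb (byte 6 0) code)))))))

(defun write-char-as-octets (char octet-sink)
  "Call OCTET-SINK (a stand-in for write-byte) once per octet, in order."
  (dolist (octet (utf8-octets char))
    (funcall octet-sink octet)))

(utf8-octets #\A)                 ; => (65)
(utf8-octets (code-char #x20AC))  ; => (226 130 172)
```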

write-string

Function

Package: common-lisp

See write-string for the ANSI description.

For a Gray stream, write-string calls stream-write-string.

Otherwise: For each character in the specified range in the string, the equivalent of a write-char is performed.

write-sequence

Function

Package: common-lisp

See write-sequence for the ANSI description.

For a Gray stream, write-sequence calls stream-write-sequence.

Otherwise: If the sequence is a string, then the equivalent of write-string is performed.

If the sequence is an octet vector (i.e. a vector of (signed-byte 8) or (unsigned-byte 8) elements), then the equivalent of write-vector is performed. Any other sequence type generates an error (for a simple-stream).

This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.

terpri

Function

Package: common-lisp

See terpri for the ANSI description.

For a Gray stream, terpri calls stream-terpri. Otherwise: the equivalent of a write-char of #\Newline is performed.

fresh-line

Function

Package: common-lisp

See fresh-line for the ANSI description.

For a Gray stream, fresh-line calls stream-fresh-line. Otherwise: if the stream can be determined to be at the start of a line, then nothing is done and nil is returned, otherwise the equivalent of a write-char of #\Newline is performed.
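A small portable sketch of this behavior, assuming an implementation that tracks the column of a string output stream (the function name fresh-line-demo is hypothetical):

```lisp
(defun fresh-line-demo ()
  "Return the text produced when fresh-line is called at and away from column 0."
  (with-output-to-string (s)
    (fresh-line s)            ; at the start of a line: nothing is written
    (write-string "abc" s)
    (fresh-line s)            ; not at the start of a line: a newline is written
    (fresh-line s)))          ; at the start of a line again: nothing is written
```

Under that assumption the result is expected to be "abc" followed by a single newline.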

finish-output

Function

Package: common-lisp

See finish-output for the ANSI description.

For a Gray stream, finish-output calls stream-finish-output. Otherwise: if there is any output in the stream's output buffer, it is written via device-write with a non-nil blocking argument.

Note that since Allegro CL does not queue writes, and since device-write calls are not required to write all of the requested bytes, the current implementation of finish-output loops on device-write calls until all of the unprocessed data are transferred.

See Section 5.3 Force-output and finish-output policy for a discussion of force-output/finish-output policy in Allegro CL.

force-output

Function

Package: common-lisp

See force-output for the ANSI description.

For a Gray stream, force-output calls stream-force-output. Otherwise: if there is any output in the stream's output buffer, it is written via device-write with a blocking argument of nil.

Note that since Allegro CL does not queue writes, and since device-write calls are not required to write all of the requested bytes, the current implementation of force-output is similar to finish-output, in that it loops on device-write calls until all of the unprocessed data are transferred.

See Section 5.3 Force-output and finish-output policy for a discussion of force-output/finish-output policy in Allegro CL.
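The looping described in the notes above can be sketched with a simulated device. Nothing here calls the real device-write; make-limited-writer and flush-all are hypothetical names, and the writer is assumed to accept at least one octet per call and never to signal an error.

```lisp
(defun make-limited-writer (sink limit)
  "Return a stand-in for device-write that appends at most LIMIT octets of
BUFFER between START and END to the adjustable vector SINK and returns the
number of octets it accepted."
  (lambda (buffer start end)
    (let ((count (min limit (- end start))))
      (loop for i from start below (+ start count)
            do (vector-push-extend (aref buffer i) sink))
      count)))

(defun flush-all (writer buffer start end)
  "Call WRITER repeatedly until every octet between START and END is written.
Returns the number of calls that were needed."
  (loop with calls = 0
        while (< start end)
        do (incf start (funcall writer buffer start end))
           (incf calls)
        finally (return calls)))
```

With a writer limited to 3 octets per call, flushing a 7-octet buffer takes three calls.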

clear-output

Function

Package: common-lisp

See clear-output for the ANSI description.

For a Gray stream, clear-output calls stream-clear-output. Otherwise: the stream's output buffer is cleared and device-clear-output is called on the stream.

file-position

Function

Package: common-lisp

See file-position for the ANSI description.

For a Gray stream, file-position calls excl::stream-file-position.

Otherwise: For simple-streams that are not string simple-streams, file-positions are always specified as a number of octets (8-bit bytes). For string simple-streams, file-positions are specified as number of characters.

If the position-spec argument is not given, the file position is calculated, possibly involving a call to device-file-position, and returned. Note that the file position may be precached in the stream, and device-file-position may have been called by some other CL functions.

If the position-spec argument is given, then the new file position is calculated and stored. This may involve a call to (setf device-file-position), if the position is outside of the buffer range.

stream-element-type

Function

Package: common-lisp

See stream-element-type for the ANSI description.

For a Gray stream, stream-element-type returns the appropriate value.

For a simple-stream, stream-element-type always returns (unsigned-byte 8).

5.2 Extended Interface Functionality

These additional functions are provided in Allegro CL for operating on streams. Each is described on its own documentation page.

5.2.1 Blocking behavior in simple-streams

There are three modes of blocking behavior when writing items in a sequence to a stream or filling a sequence with items read from a stream. The issue is what to do when (for writing) the entire sequence cannot be written and (for reading) the entire sequence cannot be filled, but no EOF is encountered. (By "the entire sequence", we mean that part specified by start and end if those are supplied.) Here are the modes and a description of what happens in each mode if the whole operation does not complete and no EOF is encountered.

Blocking mode: the system blocks (waits or hangs) until the operation can be completed.
Blocking/Non-Blocking (B/NB) mode: the system only blocks on the first element of the sequence: if it cannot be written or read, the system blocks. If the first element is successfully written or read, and a subsequent element cannot be written or read, the function doing the writing/reading returns, typically with the return value(s) indicating exactly what was accomplished.
Non-Blocking mode: the system never blocks. If an element cannot be written or read, the function doing the writing/reading returns, typically with the return value(s) indicating exactly what was accomplished. (read-char-no-hang is an example of a non-blocking reading function.)

Here is the blocking behavior of the various ANSI CL and Allegro CL functions having to do with writing and reading sequences and individual items.

cl:read-char: blocking. See the discussion in Section 5.1 Implementation of Common Lisp Functions for simple-streams.
cl:write-char: blocking. See the discussion in Section 5.1 Implementation of Common Lisp Functions for simple-streams.
cl:read-byte: blocking. See the discussion in Section 5.1 Implementation of Common Lisp Functions for simple-streams.
cl:write-byte: blocking. See the discussion in Section 5.1 Implementation of Common Lisp Functions for simple-streams.
cl:read-char-no-hang: non-blocking. See the discussion in Section 5.1 Implementation of Common Lisp Functions for simple-streams.
cl:write-sequence: blocking. See the discussion in Section 5.1 Implementation of Common Lisp Functions for simple-streams.
cl:read-sequence: behavior is controlled by the non-standard partial-fill argument. If partial-fill is nil, cl:read-sequence is blocking. If partial-fill is non-nil cl:read-sequence is B/NB. See the discussion in Section 5.1 Implementation of Common Lisp Functions for simple-streams.
write-octets: after the existing contents of the buffer are cleared (the function will block until this is done), B/NB if its blocking argument is true, non-blocking if its blocking argument is nil.
read-octets: B/NB if its blocking argument is true, non-blocking if its blocking argument is nil.
write-vector: after the existing contents of the buffer are cleared (the function will block until this is done), this function is B/NB.
read-vector: B/NB.
device-write: after the existing contents of the buffer are cleared (the function will block until this is done), B/NB if its blocking argument is true, non-blocking if its blocking argument is nil.
device-read: B/NB if its blocking argument is true, non-blocking if its blocking argument is nil.

Note on putting a B/NB function in a loop: the result can be blocking behavior until as many characters are written or read as there are iterations of the loop. B/NB behavior guarantees that one element at least is written or read (or the system blocks until that happens). In a loop, with each pass guaranteeing one element, you can guarantee as many elements as desired.
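The loop idiom in the note above can be sketched with a simulated B/NB reader. The reader below is not read-vector or device-read; it is a hypothetical stand-in that delivers data one chunk at a time and returns the index after the last element stored (returning its start index unchanged when the data is exhausted).

```lisp
(defun make-chunked-reader (chunks)
  "Return a B/NB-style reader over CHUNKS, a list of strings. Each call stores
at most one chunk into BUFFER from START and returns the next free index."
  (let ((pending (remove "" chunks :test #'string=)))
    (lambda (buffer start end)
      (let* ((chunk (first pending))
             (count (min (length chunk) (- end start))))
        (replace buffer chunk :start1 start :end2 count)
        ;; Keep any unread tail of the chunk for the next call.
        (setf pending
              (if (< count (length chunk))
                  (cons (subseq chunk count) (rest pending))
                  (rest pending)))
        (+ start count)))))

(defun read-fully (reader buffer)
  "Loop on READER until BUFFER is full or READER makes no progress.
Returns the number of characters stored."
  (loop with pos = 0
        while (< pos (length buffer))
        do (let ((next (funcall reader buffer pos (length buffer))))
             (when (= next pos) (return pos))
             (setf pos next))
        finally (return pos)))
```

Each pass through the loop obtains at least one element, so the loop as a whole behaves like a blocking read of the full buffer.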

5.2.2 The endian-swap keyword argument to read-vector and write-vector

The endian-swap keyword argument to read-vector and write-vector allows the byte-ordering to be controlled so as to allow big-endian and little-endian machines to communicate with each other. Each version of Allegro CL has either :big-endian or :little-endian on its *features* list to identify it appropriately. The endian-swap argument is effective only in reads into and writes from vectors that are not strings, and is silently ignored if given when a string is being passed to read-vector or write-vector.

There are three kinds of values that can be given to the endian-swap argument:

A number, which designates a value to logxor into the byte index of the vector being accessed. See the table and discussion below for more information.
The values :byte-8, :byte-16, :byte-32, :byte-64, or :byte-128 to indicate the width of the element whose bytes to reverse. For example, :byte-16 swaps every pair of bytes, and :byte-32 swaps every group of 4 bytes. Note that :byte-8 does nothing to the byte-ordering, and is included only for symmetry.
The value :network-order. This value does nothing on big-endian machines, and on little-endian machines, causes bytes to be swapped based on the element-size of the vector being read into or written from. For example, a double-float vector, which has an element width of 64 bits, is swapped on a :byte-64 basis on a little-endian machine.

The byte-swapping mechanism relies on the fact that objects are always aligned on 8 or 16 byte boundaries, depending on whether the lisp is a 32-bit or 64-bit lisp. Therefore, it is inadvisable to specify a numeric value greater than 7 (in a 32-bit lisp) or 15 (in a 64-bit lisp).

The byte swapping mechanism is in fact implemented by performing a logxor on the current index of the next byte to get out of the vector. The resultant xor'ed index is used as the true byte index into the array. No attempt is made to ensure that the index is valid (within range): it is the user's responsibility to ensure that. This is always ensured if the endian-swap specification matches the element width of the vector (e.g. an (unsigned-byte 16) vector is given an endian-swap value of :byte-16, or a double-float vector is given an endian-swap value of :byte-64).

The following table shows how the bytes actually appear after swapping. Since the swapping is symmetrical, it can be used in either direction, for both reading and writing. Given the natural byte order of bytes A, B, C, D, E, F, G, H to start, the table shows the byte order of the resultant bytes for some example cases:

name           value           order

:byte-8           0    A  B  C  D  E  F  G  H
:byte-16          1    B  A  D  C  F  E  H  G
  ----            2    C  D  A  B  G  H  E  F
:byte-32          3    D  C  B  A  H  G  F  E
  ----            4    E  F  G  H  A  B  C  D
  ----            5    F  E  H  G  B  A  D  C
  ----            6    G  H  E  F  C  D  A  B
:byte-64          7    H  G  F  E  D  C  B  A
                  ...
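The rows of the table can be reproduced with a few lines of portable Common Lisp that apply logxor to each byte index. This is only a model of the index calculation, not the read-vector or write-vector implementation; the function name xor-swap-order is hypothetical. As in the real mechanism, no range check is made, so the number of bytes should be a multiple of the swap width.

```lisp
(defun xor-swap-order (bytes swap)
  "Return the order in which BYTES (a list) are accessed when each byte index
is combined with the integer SWAP using logxor."
  (let ((vec (coerce bytes 'vector)))
    (loop for i from 0 below (length vec)
          collect (aref vec (logxor i swap)))))

(xor-swap-order '(a b c d e f g h) 1)   ; => (B A D C F E H G), the :byte-16 row
(xor-swap-order '(a b c d e f g h) 3)   ; => (D C B A H G F E), the :byte-32 row
(xor-swap-order '(a b c d e f g h) 7)   ; => (H G F E D C B A), the :byte-64 row
```

Applying the same swap value twice restores the original order, which is why the same value serves for both reading and writing.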

5.3 Force-output and finish-output policy

There is no explicit requirement by the ANSI CL Spec for an implementation to provide force-output or finish-output calls automatically; the programmer is always responsible for providing enough of these calls to ensure that output is seen at its final destination in a timely manner. However, it is also counter to good stream optimization techniques to call force-output or finish-output; the positive effects of buffering are reduced when flushed by these functions.

Allegro CL provides minimal force-output calls when certain operations are performed, especially on interactive streams. But not all operations cause force-output calls. It might be confusing to observe that write-line will flush the output to an interactive stream, but that write-string will not do so. The division is simple, though, and can be explained easily.

Interactive Streams:

An interactive stream is specifically defined in Allegro CL as one for which interactive-stream-p returns true. In Allegro CL interactive-stream-p is setf'able, so any stream can in fact be interactive. Usually, it makes most sense to set this attribute on a stream that will act as a listener.

Interactive force-output:

An interactive force-output is an operation that calls force-output on a stream if and only if the stream is interactive. If the stream is not interactive, no force output is done.
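In user code the same idea might be expressed as below. This is a sketch only: interactive-force-output is a hypothetical helper name, not an Allegro CL operator, and it relies solely on the ANSI functions interactive-stream-p and force-output.

```lisp
(defun interactive-force-output (stream)
  "Call force-output on STREAM only when it is interactive.
Returns true if a flush was requested, nil otherwise."
  (when (interactive-stream-p stream)
    (force-output stream)
    t))

;; Typical use after writing a prompt without a trailing newline.
(defun show-prompt (stream)
  (write-string "> " stream)
  (interactive-force-output stream))
```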

Output-forcing activity:

Allegro CL attempts to force-output as seldom as possible for minimal acceptable buffer-flushing but with maximal buffer performance.

In functions described in Section 5.1 Implementation of Common Lisp Functions for simple-streams, generally no force-output operations are done. This excludes close, which by virtue of the device-close after method performs a force-output if the abort argument is false. It also obviously excludes force-output and finish-output, which start or perform the buffer flushing explicitly.
All functions listed in Section 6.0 Higher Level functions will perform an interactive force-output, that is, force-output will be called if and only if the stream is interactive, after the outermost such function is finished. If recursive printing is done using print-object methods, no flushing is done unless explicitly requested, except when the outermost print-object method is done (or when the higher-level caller of print-object has finished).
Whenever format is called on an interactive-stream, a force-output is performed after the format string is processed.
Some top-level functions which aid the read-eval-print-loop functionality will call force-output regardless of whether the stream is interactive or not (usually it is the case that the stream is interactive).

No other calls to force-output are guaranteed by Allegro CL, although such force-output calls might be inserted when deemed necessary. However, the general guideline that Allegro CL follows is that in most cases a redundant force-output call is not good, and is thus usually avoided.

6.0 Higher Level functions

These functions are all written as if they call lower level CL functions, and do not necessarily call device-level functionality directly.

7.0 Simple-stream Class Hierarchy

The class hierarchy for streams starts with stream at the head, and implementations which include other stream classes such as Gray streams will place those stream classes as subclasses of stream. In Allegro CL, the only subclasses of stream are simple-stream and fundamental-stream. fundamental-stream denotes a Gray stream.

The simple-stream class hierarchy is divided into three fundamental simple-stream classes (which in turn have subclasses not listed in the diagram), based on the kinds of buffering they do:

          --> fundamental-stream ...
         |      (Gray streams)
         |
stream --+
         |                     --> single-channel-simple-stream ...
         |                    | 
          --> simple-stream --+--> dual-channel-simple-stream ...
                              | 
                               --> string-simple-stream ...

These simple-stream subclasses cannot be mixed. They are intended to implement three styles of input/output in fundamentally different ways.

single-channel-simple-stream
dual-channel-simple-stream
string-simple-stream

8.0 Implementation Strategies

The basic behavior of the Common Lisp functions is described in Section 5.1 Implementation of Common Lisp Functions for simple-streams. That description should be taken on an as-if basis, which means that the specific functions described may not actually be called at all, or else they might be implemented using compiler-macros to call lower-level functions after type inferencing proofs have been established (in other words, the implementation works as if it was implemented as described). However, the device-level interface does not have this freedom; those methods applicable for the stream class must be called in the way specified. This is to guarantee to the device-writer that methods that are written for a particular purpose will indeed be called.

However, the selection of methods to call when appropriate depends on the strategy used. Listed below are various sets of functions that are called for various stream types.

All streams: device-buffer-length, device-clear-input, device-close, device-open
Single-channel: device-clear-output, device-file-length, device-file-position
Direct: device-write (for synchronizing memory)
Non-mapped: device-read, device-write
Dual-channel: device-clear-output, device-finish-record (input only), device-read, device-write
String: device-file-length, device-file-position, device-finish-record (output only), device-read, device-write
Composing: device-clear-output, device-file-length, device-file-position

9.0 Control-character Processing

Whenever a control character (one whose char-code is less than 32) is seen when reading or writing on a stream, a decision must be made as to what to do with it. In a "raw" environment, such characters are processed as themselves; when writing they are inserted into the buffer (possibly after translation to octet form) and when reading they are simply returned as Lisp characters (possibly after having been assembled from octet form). In a "cooked" environment, at least some control characters turn into instructions at the device level, and are not inserted into or extracted from the stream as characters.

An example of this is terpri, which is simply a write-char of a #\Newline. On a terminal stream, a terpri simply sends the #\Newline as a character (though its sending may require a column indicator in the stream to be set to 0 as well). However, a window stream should not see a #\Newline at the device level; instead, the action should be to "move the cursor down one line and to the far left side of the window".

The simple-streams design allows for both of these kinds of action. Each stream has two slots, a control-in slot and a control-out slot, which may contain tables of functions that are consulted when the character being read or written is determined to be a control character. The actions taken are as follows:

If the control-out table has a function entry for the control-character being written, that function is called with two arguments: the stream and the character. The control-out function should perform whatever work is required and return either non-nil, meaning that it has finished processing that character, or nil, meaning that the normal character processing action, which inserts the character into the stream, is taken.

If the control-in table has a function entry for the control-character that has just been read out of the stream, that function is called with two arguments: the stream and the character. The required actions are taken, and the function returns a new character to substitute (or the old character), or it may return nil to indicate end-of-file. Note that the control-in handler must not try to do any reading from the stream at all; the intention for the control-in handler is to translate an already-received character to another, or to perform an operation and return a character. For ligatures and other multiple-character inputs, a composing external-format should be used or created, or else an encapsulation created for such translations.

Control-tables are built with make-control-table and are stored into the appropriate control-in or control-out slots by device-open. The following standard control handlers and tables are examples of such but are not intended for programmer use.

std-dc-newline-in-handler: takes a stream and a character as arguments and returns the character argument, after side-effecting the stream.
std-newline-out-handler: takes a stream and a character as arguments and returns nil (indicating that further character processing should be done) while side-effecting a slot of the stream.
std-tab-out-handler: takes a stream and a character as arguments and returns nil (indicating that further character processing should be done) while side-effecting a slot of the stream.
*std-control-out-table*: value is a control-table which contains std-newline-out-handler and std-tab-out-handler in their appropriate locations. Users must not modify this table.
*terminal-control-in-table*: value is a control-table which contains std-dc-newline-in-handler in its appropriate locations. Users must not modify this table.
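The control-out protocol described in this section can be modeled in portable Common Lisp. The sketch below does not use make-control-table or real stream slots; the table is a plain 32-element vector, an adjustable string stands in for the stream, and all function names are hypothetical.

```lisp
(defun make-sketch-control-table ()
  "A stand-in for a control table: one entry per control character code (0-31)."
  (make-array 32 :initial-element nil))

(defun sketch-write-char (char sink table)
  "Write CHAR to the adjustable string SINK, consulting TABLE first when CHAR
is a control character. A handler returning true suppresses normal processing."
  (let* ((code (char-code char))
         (handler (and (< code 32) (aref table code))))
    (unless (and handler (funcall handler sink char))
      (vector-push-extend char sink))
    char))

(defun control-out-demo ()
  (let ((table (make-sketch-control-table))
        (sink (make-array 0 :element-type 'character
                            :adjustable t :fill-pointer 0)))
    ;; "Cooked" newline: the handler does all the work and returns true,
    ;; so the #\Newline itself is never inserted.
    (setf (aref table (char-code #\Newline))
          (lambda (stream char)
            (declare (ignore char))
            (vector-push-extend #\| stream)
            t))
    ;; The tab handler returns nil, so the tab is inserted as itself.
    (setf (aref table (char-code #\Tab))
          (lambda (stream char) (declare (ignore stream char)) nil))
    (map nil (lambda (c) (sketch-write-char c sink table))
         (format nil "a~%b~Cc" #\Tab))
    sink))
```

Calling control-out-demo yields the characters a, |, b, a tab, and c: the newline was replaced by the handler's action, while the tab passed through.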

10.0 Device-writing Tips

This section gives some tips for device-writing. It is not comprehensive, and some of the functions and macros it refers to may or may not be documented. The section is Allegro CL specific, but may be taken as a guide for other implementations as well.

10.1 Defining new stream classes

New stream classes may be created which subclass existing classes. If the superclass chosen is a currently instantiable class, such as terminal-simple-stream, file-simple-stream, etc., then the device methods may be used as they are, or they may be called by call-next-method by the more specialized method. If the superclass chosen is one of the three major streams (single-channel-simple-stream, dual-channel-simple-stream, or string-simple-stream) then much of the device functionality will have to be written from scratch. There may be some methods that exist to provide defaults (for example, the default device-buffer-length method specializes on simple-stream to provide a default for all simple-streams). Other methods, such as device-open, have no appropriate default action, and are thus not supplied.

To define a new stream class in Allegro CL, the iodefs module must be required to provide some defining macros. The class may then be defined using def-stream-class:

(require :iodefs) 

(def-stream-class blarg (terminal-simple-stream)
  ((slot1 :initform nil)
   (slot2 :initform nil :accessor blarg-slot2))
  (:default-initargs :input-handle
        (error "blarg stream must have a :input-handle arg")))

10.2 Device-open

Each primary method to device-open returns a stream that is fully connected to its device; it can perform all operations intended on that device. When a primary method performs a call-next-method to do a device-open on a less-specific device, that functionality is complete when the call-next-method returns.

For example, suppose a whiz-bang is a type of file which has a header line associated with it, to be internalized and then ignored as data. The whiz-bang stream might be defined as

(def-stream-class whiz-bang 
                 (file-simple-stream) 
  ((header :initform nil :accessor whiz-bang-header)))

The device-open for whiz-bang might call the primary-method for the file, and then do its own work afterward:

(defmethod device-open ((stream whiz-bang) slot-names initargs)
  (declare (ignore initargs slot-names))
  (let ((success (call-next-method)))
    (when success
      ;; Read and internalize the header.
      (setf (whiz-bang-header stream) (read-line stream))
      t)))

Note that:

The stream is fully operational as a file-simple-stream after the call-next-method, unless it returns nil, which indicates that the device-open failed. File operations may thus be performed on the stream.
Whiz-bang operations may be performed on the stream after successful return from device-open at this level. Presumably this might include querying the whiz-bang-header slot for its content or for a print-method.
Before, around, and after methods should not be used to perform initializations that might be used by any more-specific device in its device-open call.

10.3 From-scratch device-open

A device-open that does not call-next-method must perform the following:

1. It must make the connection with its device, perhaps using an operating system call or other low-level mechanism. This includes setting the input-handle and/or output-handle slot, if appropriate.
2. It must install the buffer(s) into the stream. Any buffers to be installed are obtained either by finding them in the options list or by allocating them after calling device-buffer-length to determine what length of buffer to allocate. Buffers that already exist in a resourced stream may be reused, if appropriate. For dual-channel streams doing output, the max-out-pos slot must be initialized.
3. It must install, if appropriate, any control-tables that will be used by the stream (see Section 9.0 Control-character Processing).
4. It must set the instance flags as appropriate. The instance flags byte is not the same as the flags slot in the stream. The instance-flags can only be seen by inspecting the stream in "raw" mode (see inspector.htm for information on raw mode). The flag bits are accessed very quickly to determine what kind of stream it is: gray or simple, single/dual/string, input and/or output, and possibly xp (i.e. pretty-printing string). A stream that does not have these flags set will not be streamp, even if it is of stream class. The add-stream-instance-flags macro is provided to add appropriate flag bits. The actual format of the flags is not discussed in this document.
5. It must ensure that the external-format encapsulation shape of the stream is consistent by calling compose-encapsulating-streams. This step is not needed for string streams, or if (setf stream-external-format) is used to perform step #6.
6. It must set the external-format, based on the options given. Non-generic functions provided are install-single-channel-character-strategy, install-dual-channel-character-strategy, install-string-input-character-strategy, and install-string-output-character-strategy. (The deprecated install-string-character-strategy also does this.) (setf stream-external-format) may be used to accomplish this in a generic way. If the non-generic approach is used, then (sm excl::melded-stream stream) should be passed into these functions, instead of the stream (see sm). The encapsulation shape assurance in step #5 will guarantee that the melded-stream slot holds the correct stream, even if there is no encapsulation (and thus the melded-stream of the stream is itself).

10.4 Implementation Helpers for device-read and device-write

The following two sets of functions allow device-read and device-write methods to be implemented.

Note that the supplied device-read and device-write functions do not generate errors themselves, but pass them back to the higher level for processing. This allows read-octets and write-octets to pass errors back as well, as the implementation of a higher level (encapsulating) device-read and device-write.

The first group of functions do only minimal checking on their arguments. Further, they act as implementation helpers for device level methods and their behaviors are thus not intuitive except at the device level. For those reasons, they should never be used at any level other than the device-level.

The second group of functions can be called at any level.

10.5 Other Stream Implementation Functions and Macros

The following operators are named by symbols exported from the excl package. They are loaded with

(require :iodefs)

They are intended for implementing device-level functionality and should not be used except for that purpose.

add-stream-instance-flags
compose-encapsulating-streams
def-stream-class
funcall-stm-handler
funcall-stm-handler-2
remove-stream-instance-flags
with-stream-class
sm
install-dual-channel-character-strategy
install-single-channel-character-strategy
install-string-input-character-strategy
install-string-output-character-strategy
install-string-character-strategy (use is deprecated in favor of the previous two functions)

10.6 Details of stream-line-column and charpos

A charpos slot exists in every simple-stream. Accessors are implemented for this slot via stream-line-column (which is setfable) and the initarg normally sets the slot to 0.

The intention of this slot is for use as a column indicator, when possible. When the slot is nil, the column is unknown.

When the charpos slot is non-nil, character output functionalities have the effect of incrementing charpos. In many streams, a newline control-out handler will reset the charpos to 0. It is always set to nil when non-character write operations are performed on the stream.

Streams that need to support pretty printing must support an accurate charpos in order to generate correct pretty output. Most streams have control-out handlers which keep charpos accurate when newlines and/or tabs are processed.

For fastest write operations, charpos should be set to nil by device-open, and no control-out handlers which set charpos should be installed into the stream (otherwise the writing of (for example) a newline will cause charpos counting to resume).
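
The charpos rules above can be modeled in a few lines of portable Common Lisp. This is only a sketch under stated assumptions: column-tracker and the note-... functions are hypothetical names invented for illustration, not Allegro CL operators, and tab handling is left out.

```lisp
;; Hypothetical model of charpos bookkeeping; none of these names are
;; Allegro CL APIs.
(defstruct column-tracker
  (charpos 0))                          ; nil means "column unknown"

(defun note-char-written (tracker char)
  "Update TRACKER as if CHAR had just been written."
  (cond ((char= char #\Newline)
         ;; A newline handler resets the column, which also resumes
         ;; counting if charpos had been nil.
         (setf (column-tracker-charpos tracker) 0))
        ((column-tracker-charpos tracker)
         (incf (column-tracker-charpos tracker))))
  tracker)

(defun note-string-written (tracker string)
  "Update TRACKER as if every character of STRING had been written."
  (loop for char across string
        do (note-char-written tracker char))
  tracker)

(defun note-octets-written (tracker)
  "A non-character write makes the column unknown."
  (setf (column-tracker-charpos tracker) nil)
  tracker)
```

In this model, a stream that wants the fastest writes would start with charpos nil and never call the newline branch, which corresponds to not installing a control-out handler that sets charpos.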

11.0 The simple-stream class hierarchy illustrated

The following diagram shows the simple-stream hierarchy, somewhat simplified. To read this listing, note that it is in a simple tree structure. Every node has a list of subclasses immediately indented two spaces to the right and below it. Nodes with the labels sN (that is, s1, s2, etc., for streams), ssN (that is, ss1, ss2, etc., for simple-streams), or gN (that is, g1, g2, etc., for gray streams) are nodes with multiple inheritance, with class names defined somewhere below their first usage. Nodes with labels followed by class names provide the actual definition of those labels. Class names marked by (A) are autoloadable classes.

The Gray stream hierarchy is also illustrated.

Only streams named by exported symbols are included.

For simple-streams, the device-open options are listed for each class which is normally instantiatable. Simple-stream classes which do not have device-open options listed should not normally be subclassed. The [simple-open-options] are listed at the bottom of the diagram.

;;;   The Simple Stream Hierarchy

stream
  ;; major mixins for dpAns:
  file-stream
    s1
    s2   
  string-stream
    s3
    s4
  ;; Simple-streams:
  simple-stream
    probe-simple-stream                    options: filename
    single-channel-simple-stream
      direct-simple-stream
        buffer-input-simple-stream         options: buffer external-format start end
        buffer-output-simple-stream        options: buffer external-format
        ss1
      null-simple-stream                   options: external-format
      s1 file-simple-stream                options: [simple-open-options]  [filename is required]
        ss1 mapped-file-simple-stream (A)  options: [simple-open-options, release-handle]  [filename is required]
    dual-channel-simple-stream
      terminal-simple-stream               options: [simple-open-options]
      socket-simple-stream                 options: [simple-open-options]
      socket-base-simple-stream            options: [simple-open-options]
      excl::hiper-simple-stream
    s3 string-simple-stream
      composing-stream                     options: [none]
      string-input-simple-stream           options: string start end
        ss2
      string-output-simple-stream          options: (string (make-string (device-buffer-length stream)))
        fill-pointer-output-simple-stream  options: (string (error ...))
        excl::limited-string-output-simple-stream
        xp-simple-stream                   options: [none]
        annotation-output-simple-stream
        ss2 bidirectional-character-encapsulating-stream  options: (base-stream (error ...))
  ;; Gray streams:
  fundamental-stream (A)
    fundamental-input-stream (A)
      g1
      g3
      g13
    fundamental-output-stream (A)
      g2
      g4
      g14
    fundamental-character-stream (A)
      g1 fundamental-character-input-stream (A)
        g4a
        g7
        g16
        g25
        g26
        g28
        g29
        g30
        g31
      g2 fundamental-character-output-stream (A)
        g4a
        g8
        g17
        g22
        g26
        g27
        g28
        g29 echo-stream (A)
        g30
        g32
    fundamental-binary-stream (A)
      g3 fundamental-binary-input-stream (A)
        g4a
        g10
        g19
        g26
        g28
        g30
        g31 concatenated-stream
      g4 fundamental-binary-output-stream (A)
        g4a excl::null-stream (A)
        g11
        g20
        g26 excl::bdbv-socket-stream (A)
        g28 synonym-stream (A)
        g30 two-way-stream (A)
        g32 broadcast-stream (A)
      excl::binary-socket-stream (A)
        g10
        g11
    excl::socket-stream (A)
      g5 excl::input-socket-stream (A)
        g7 input-terminal-stream (A)
          g9
        g10 input-binary-socket-stream (A)
          g12
      g6 excl::output-socket-stream (A)
        g8 output-terminal-stream (A)
          g9 bidirectional-terminal-stream (A)
        g11 output-binary-socket-stream (A)
          g12 bidirectional-binary-socket-stream (A)
      excl::terminal-stream (A)
    s2 excl::file-gray-stream (A)
      g13 excl::input-file-stream (A)
        g15
        g16 excl::character-input-file-stream (A)
          g18
        g19 excl::binary-input-file-stream (A)
          g21
      g14 excl::output-file-stream (A)
        g15 excl::bidirectional-file-stream (A)
          g18
          g21
        g17 excl::character-output-file-stream (A)
          g18 excl::character-bidirectional-file-stream (A)
        g20 excl::binary-output-file-stream (A)
          g21 excl::binary-bidirectional-file-stream (A)
    s4 excl::string-gray-stream (A)
      g22 excl::string-output-stream (A)
        g23
        g24 ??
        excl::stream-output-stream-circular
      g24 excl::fill-pointer-output-stream (A)
      g25 excl::string-input-stream (A)
    excl::annotation-encapsulation-mixin (A)
      g23 excl::string-output-with-encapsulated-annotation-stream (A)
      g27 excl::xp-stream (A)

;;  simple-open-options:

   filename
   (direction :input)
   if-exists
   if-does-not-exist
   external-format -- always defaults to :default
   input-handle (see device-open)
   output-handle (see device-open)
   mapped (see device-open)
   fn-in (same as input-handle, for compatibility only)
   fn-out (same as output-handle, for compatibility only)

12.0 Encapsulating Streams

Much of this document, streams.htm, discusses CLOS techniques for customizing streams. Allegro CL supports another way to customize streams: encapsulation. Encapsulation is a kind of filtering approach; data flows through various streams which have been attached end-to-end, and those data are processed and possibly transformed at each stream stage. Encapsulation is a more modular approach. Each component stream can be simpler and, therefore, more widely applicable.

Both encapsulation and CLOS specialization techniques are compatible with each other, and can be used in combination to greatly enhance the data processing capabilities in Allegro CL.

In the following subsections we describe stream encapsulation and provide some examples.

12.1 Encapsulation terminology

We define the terms used in this section:

Encapsulation, encapsulator, encapsulate, encapsulatee: An encapsulation is the attachment of more than one stream of any class in a chain, beginning with the i/o device and ending with the program. A stream X is said to encapsulate a stream Y when stream Y appears as stream X's input-handle or output-handle. In this case, stream X is the encapsulator and stream Y is the encapsulated stream or the encapsulatee.
Input direction and output direction: The input direction has data moving from the i/o device to the program. The output direction has data moving from the program to the i/o device. These are obvious definitions, but are needed as reminders of the definitions of inner and outer, given next, which may be less intuitive.
Inner and outer: The innermost encapsulator is the stream which is closest to the program. The outermost encapsulator is the stream which encapsulates the closest stream to the i/o device (note that the closest stream to the i/o device is not an encapsulator). This naming may seem counterintuitive, since an encapsulator is inside (inner with respect to) the stream it encapsulates. Normal usage would have an encapsulator outside its encapsulatee, but viewing the whole process, inner more usefully describes closer to the program and further from the i/o device.

12.2 Strategy descriptions necessary for encapsulation

There are several slots in Allegro CL simple-streams which must be exposed in order for encapsulations to be allowed. The names of these slots are not (typically) exported. Here we describe those slots and their accessors, if they are defined.

excl::input-handle, excl::output-handle: These slots hold either fixnum values representing operating system file numbers, or else streams which are the encapsulated stream, or else nil. The handle slot for each direction must match the capability of the stream; i.e. if a stream is not open for input, input-handle must be nil. If the stream changes states while open, the handles must follow that state; e.g. if one direction of a socket stream is shut down, that corresponding handle must be set to nil. Accessors excl::stream-input-handle and excl::stream-output-handle are provided for these slots.

excl::melded-stream, excl::melding-base: These slots always hold streams, and allow for the special encapsulation style of composing external-formats. See Section 12.5 Encapsulating composing external-formats for further explanations.

excl::buffer, excl::out-buffer: These are the buffer slots. The out-buffer slot is used when two buffers might be used, to hold output data. The buffer slot is not called in-buffer because it sometimes acts as a bidirectional buffer; when there is only one buffer in the stream the buffer slot is the one that is used. The exception to this rule is for string-output-simple-stream which does use out-buffer. This allows streams to be subclassed on both string-input-simple-stream and string-output-simple-stream, without the strategies clashing for the two directions. No accessors are exported for these slots.

excl::buffer-ptr, excl::max-out-pos: These are the buffer maximums. When a device-read returns a number of octets read, the strategy usually adds the value to the start value it gave to device-read and sets buffer-ptr to that value. And for a dual-channel stream, max-out-pos is usually set at device-open time to the length of the buffer.

There are a couple of differences between buffer-ptr and max-out-pos, which might suggest the reason for the different names given them:

max-out-pos is usually set once, and not touched again. buffer-ptr is generally set before and/or after a device-read.
max-out-pos always holds one greater than the current maximum index, whereas buffer-ptr might hold a -1 (representing an eof).

excl::buffpos, excl::outpos: These slots hold indices into buffer and out-buffer vectors, respectively. Ignoring overflows for the moment, the basic read-byte operation consists simply of doing an aref of the buffer at buffpos position, followed by an increment of buffpos. Likewise, the basic write-byte operation (on a dual-channel stream) consists of a setf of the aref of out-buffer at outpos, followed by an increment of outpos.

[Note: excl::buffer, excl::buffpos, and excl::buffer-ptr are defined for all simple-streams. excl::out-buffer, excl::outpos, and excl::max-out-pos are only defined for streams for which it makes sense, i.e. dual-channel and string-output simple-streams.]

To make the octet and character strategy work as efficiently as possible, the buffpos/buffer-ptr and outpos/max-out-pos pairs must always retain numeric values, and if there is any data remaining in buffer to be read or if there is any room in out-buffer for writing, the position slot will be less than the max slot. The complete basic strategy for reading and writing an octet takes advantage of this fact to provide a single-test instruction sequence for determining if the next character can be read (for which the answer should normally be "yes"). Thus, for example, a template for a strategy to read an octet has the following form (in unoptimized lisp pseudocode):

 (when (>= (sm buffpos stream) (sm buffer-ptr stream))
   ;; Buffer needs filling
   [various operations which may include device-read])
 (prog1
     (aref (sm buffer stream) (sm buffpos stream))
   (incf (sm buffpos stream)))
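
The template above can be tried out with a self-contained model in portable Common Lisp. This is a sketch only: octet-source, refill and read-octet are hypothetical names, the data slot stands in for the device, and refill stands in for device-read. The model signals end of data with a count of 0 rather than the -1 that buffer-ptr may hold in a real stream.

```lisp
;; Hypothetical model: OCTET-SOURCE and its accessors are illustrative
;; names, not the excl:: slots themselves.
(defstruct octet-source
  (buffer (make-array 4 :element-type '(unsigned-byte 8)))
  (buffpos 0)
  (buffer-ptr 0)
  (data #() :type vector)               ; stands in for the device
  (data-pos 0))

(defun refill (src)
  "Stand-in for device-read: copy up to one buffer's worth of octets
from DATA. Returns the number of octets copied (0 at end of data)."
  (let* ((buffer (octet-source-buffer src))
         (data (octet-source-data src))
         (start (octet-source-data-pos src))
         (count (min (length buffer) (- (length data) start))))
    (replace buffer data :start2 start :end2 (+ start count))
    (incf (octet-source-data-pos src) count)
    (setf (octet-source-buffpos src) 0
          (octet-source-buffer-ptr src) count)
    count))

(defun read-octet (src &optional (eof-value :eof))
  "Single-test fast path: refill only when buffpos has reached buffer-ptr."
  (when (>= (octet-source-buffpos src) (octet-source-buffer-ptr src))
    ;; Buffer needs filling
    (when (zerop (refill src))
      (return-from read-octet eof-value)))
  (prog1 (aref (octet-source-buffer src) (octet-source-buffpos src))
    (incf (octet-source-buffpos src))))
```

For example, (read-octet (make-octet-source :data #(1 2 3 4 5 6))) fills the four-octet buffer once and returns 1; only the fifth call triggers a second refill.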

excl::last-char-read-size, excl::encapsulated-char-read-size: These two slots determine how much of the octet buffer constitutes a character for the purposes of unreading that character, and how much to copy back of the buffer contents to the beginning of the buffer before reading (via device-read) into the rest of the buffer. These slots always hold numeric values, and are reset to 0 by many synchronizing operations.

The strategy function slots

The slots listed below hold character-strategy functions, which can be built in any way desired, but which conform to the requirements of their respective functionality descriptions. Allegro CL internally uses the char-to-octets and octets-to-char macros to incorporate external-format processing as part of the stream's character-strategy functionality.

excl::j-listen

Holds a function which expects one argument which is the stream. The function returns nil if a read-char would hang, and non-nil if read-char would not hang. Note that the name j-listen does not imply Common Lisp functionality, which tends to be inconsistent in its definition for listening, especially in the face of multi-octet characters. Instead, j-listen is precisely defined to determine whether a complete character can be read by read-char (or if an error or an EOF might occur) and would probably better have been named j-no-hang-p instead. This also implies that in a multibyte character situation where only part of the character has become available, the j-listen function must return nil even if some of the octets were readable.

The j-listen function must always leave the stream in the same state as it started, including the unread-char character-length, so that an unread-char could be performed after a listen.
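
The "complete character" requirement can be illustrated with a small portable sketch. It assumes the external format is UTF-8, uses the hypothetical names utf8-sequence-length and complete-char-available-p, and ignores the error and EOF cases that a real j-listen function also has to report.

```lisp
(defun utf8-sequence-length (lead-octet)
  "Number of octets in the UTF-8 sequence introduced by LEAD-OCTET.
Invalid lead octets are treated as length 1 so a reader can report them."
  (cond ((< lead-octet #x80) 1)
        ((= (logand lead-octet #xE0) #xC0) 2)
        ((= (logand lead-octet #xF0) #xE0) 3)
        ((= (logand lead-octet #xF8) #xF0) 4)
        (t 1)))

(defun complete-char-available-p (buffer buffpos buffer-ptr)
  "True if the octets from BUFFPOS up to (but not including) BUFFER-PTR
hold at least one whole character. Only the arguments are inspected,
so the caller's positions are left untouched."
  (and (< buffpos buffer-ptr)
       (<= (+ buffpos (utf8-sequence-length (aref buffer buffpos)))
           buffer-ptr)))
```

With the three octets of a euro sign (#xE2 #x82 #xAC) in the buffer, the predicate is false while only the first two have arrived, even though those octets are readable.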

excl::j-read-char

Holds a function which receives the following arguments: stream eof-error-p eof-value blocking.

The stream, eof-error-p, and eof-value arguments have the same semantics as the similarly named arguments to read-char. The blocking argument determines whether the operation may block; a nil value causes behavior equivalent to read-char-no-hang, and a non-nil value causes read-char behavior. Either enough octets are read to be converted to a character, or a character is read directly; in both cases the character is returned.

excl::j-write-char

Holds a function which receives the following arguments: character stream.

The character and stream arguments have the same semantics as similarly named arguments to write-char. The j-write-char function is always assumed to be a blocking write (i.e. if writing cannot be done temporarily due to resource limitations, the function will wait for the resources to become freed). The character is written to the stream's buffer, after being converted to octets via the external-format, if the stream is octet-oriented. The character is returned from the function.

excl::j-read-chars

Holds a function which receives the following arguments: stream string search start end blocking.

The stream, start, and end arguments have semantics similar to those of the similarly named arguments to read-sequence. The string argument can be any string, and is the string which will be filled with the characters formed either by reading the next character, or by reading of octets and converting to characters via the external format of the stream.

If search is nil, then only as many characters as can be read are in fact read, and 1 plus the index of the last character read is returned as the only value. (This document used to say the number of characters read was returned, but that was incorrect.)

If search is given, it should be a character. As each character is read, it is compared to the search value. If the character read doesn't match, it is transferred to the string, but if it matches, the reading stops with the matching character being consumed but not transferred to the string. At that time, the index of the next character to read is returned as the first value, and the second value returned is based on the success of the search; the three possible second values are nil (search character was not found), t (search character found), or :eof (end-of-file encountered). (This document used to say the number of characters transferred was returned, but that was incorrect.)

The blocking argument determines what kind of blocking to perform. A value of nil causes no blocking. A value of t causes blocking to always occur, and the operation will either complete or get an error. A value of :bnb causes blocking for the first character, but non-blocking for subsequent characters.
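
The return-value conventions described above can be modeled on an ordinary character input stream. This is a sketch, not the strategy function itself: read-chars-model is a hypothetical name, it always blocks (the blocking argument is omitted), and it assumes that "the index of the next character to read" means the index in string at which the next character would be stored.

```lisp
(defun read-chars-model (stream string search start end)
  "Illustrative, always-blocking model of the documented return values.
STREAM is any character input stream."
  (loop with index = start
        while (< index end)
        do (let ((char (read-char stream nil nil)))
             (cond ((null char)
                    ;; End of file: one value without SEARCH, two with it.
                    (return (if search (values index :eof) index)))
                   ((and search (char= char search))
                    ;; The matching character is consumed but not stored.
                    (return (values index t)))
                   (t
                    (setf (char string index) char)
                    (incf index))))
        finally (return (if search (values index nil) index))))
```

Passing #\Newline as search gives read-line-like behavior: the first value tells the caller where the line ends in string, and the second value distinguishes a found newline, a full buffer, and end of file.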

excl::j-write-chars

Holds a function which receives the following arguments: string stream start end.

The stream, start, and end arguments have semantics similar to those of the similarly named arguments to write-sequence. The string argument can be any string, and is the string from which characters will be supplied to either be written through the stream or to be converted via the external-format of the stream and then stored as octets into the stream's buffer.

The j-write-chars function is always a blocking operation, and will always complete before returning or will error.

excl::j-unread-char

Holds a function which receives the following arguments: stream relaxed.

The stream argument has similar semantics as the stream argument to unread-char. The relaxed argument allows the unread to be performed without error, even if the last-char-read-size is 0. Normally, the only time that this will be necessary is when a hard or soft eof has been encountered during a multi-character composition, in which case the eof must be unread.

The j-unread-char function backs up one character in the stream. If the stream is an encapsulator, then this may mean backing up several characters in the stream it encapsulates.

For an octet-based stream, the last-char-read-size of the stream will determine how many octets to back up in the buffer in order to unread the character. That slot is set in one of two ways:

it is incremented during the building of a character.
it is set by an encapsulating stream which has read one or more characters in order to compose a character at a higher level.

The protocols for setting and incrementing this slot ensure that the two methods do not conflict with each other.

Note that there is no character argument to this unread-char implementation. Due to the buffering of Allegro CL streams, and due to the difficulty in specifying an eof condition as a unique character (for the purposes of unreading), the Allegro CL implementation of unread-char does not actually use the character argument, but instead checks it for validity and then ignores it, because the character is always available in the stream's buffer.
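
The role of last-char-read-size for an octet-based stream can be sketched with a small portable model. The names char-source, model-read-char and model-unread-char are hypothetical, the input is assumed to be well-formed UTF-8, and the relaxed case is reduced to "do nothing instead of signaling an error", which is a simplification of unreading an eof.

```lisp
;; Hypothetical model; CHAR-SOURCE is not an Allegro CL class.
(defstruct char-source
  (buffer #() :type vector)
  (buffpos 0)
  (last-char-read-size 0))

(defun model-read-char (src)
  "Decode one well-formed UTF-8 character and remember its octet count."
  (let* ((buffer (char-source-buffer src))
         (pos (char-source-buffpos src))
         (lead (aref buffer pos))
         (size (cond ((< lead #x80) 1)
                     ((< lead #xE0) 2)
                     ((< lead #xF0) 3)
                     (t 4)))
         ;; Keep only the payload bits of the lead octet.
         (code (if (= size 1)
                   lead
                   (logand lead (ash #xFF (- (1+ size)))))))
    (loop for i from 1 below size
          do (setf code (logior (ash code 6)
                                (logand (aref buffer (+ pos i)) #x3F))))
    (setf (char-source-buffpos src) (+ pos size)
          (char-source-last-char-read-size src) size)
    (code-char code)))

(defun model-unread-char (src &optional relaxed)
  "Back up by the size of the last character read. No character argument
is needed because the octets are still in the buffer."
  (let ((size (char-source-last-char-read-size src)))
    (when (and (zerop size) (not relaxed))
      (error "No character has been read since the last unread."))
    (decf (char-source-buffpos src) size)
    (setf (char-source-last-char-read-size src) 0)
    src))
```

After reading a three-octet character, model-unread-char moves buffpos back by three, and the same character is decoded again by the next read.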

excl::j-read-byte

Holds a function which receives the following arguments: stream eof-error-p eof-value.

The stream, eof-error-p, and eof-value arguments have the same semantics as the similarly named arguments to read-byte. Either enough octets are read to be converted to a byte, or a byte is read directly; in both cases the byte is returned.

excl::j-read-byte is not really a character strategy. It was added after the other slots (in release 7.0) in order to handle bivalent encapsulating streams, which may not have a buffer associated with them.

excl::j-write-byte

Holds a function which receives the following arguments: byte stream.

The byte and stream arguments have the same semantics as similarly named arguments to write-byte. The j-write-byte function is always assumed to be a blocking write (i.e. if writing cannot be done temporarily due to resource limitations, the function will wait for the resources to become freed). The byte is written to the stream's buffer, after being converted to octets, if the stream is octet-oriented. The byte is returned from the function.

The j-write-byte function is always a blocking operation, and will always complete before returning or will error.

excl::j-write-byte is not really a character strategy. It was added after the other slots (in release 7.0) in order to handle bivalent encapsulating streams, which may not have a buffer associated with them.

12.3 Valid connections between octet-oriented and character-oriented streams

Both single-channel-simple-streams and dual-channel-simple-streams are octet-oriented, that is, they employ octet buffers internally. String streams are character-oriented streams, which means that their internal buffers are strings. Various implications can be made from these definitions, and these implications affect how encapsulations can be made:

An octet-oriented stream can read or write both binary data and character data, but a character-oriented stream cannot read or write binary data. So, in effect, read-char works on an octet-oriented stream (via the stream's installed character strategy) but read-byte does not work on a string stream.
A character-oriented stream is never attached to an external device. This is because a Lisp string has internal representations that are not necessarily translatable to external foreign string formats.
The previous implication further implies that a character-oriented stream generally does not use an external-format. The external-format slot of the stream exists in a string-stream, but is not used for any of the defined character-strategies; the point of string streams is to bypass overhead needed to transform characters into octets and back again.

Encapsulations can be set up in a couple of ways. If we label character-oriented streams with the letter C, and octet-oriented streams with the letter O, then the following kinds of encapsulation configurations can be done:

Simple unencapsulated string stream:

  prog - C - string

Simple unencapsulated octet stream, externally connected:

  prog - O - i/o

Simple unencapsulated octet stream, internally connected:

  prog - O - octet-buffer

Encapsulations on internally connected string streams:

  prog - C - C - ... - C - string

Encapsulations on internally connected octet streams:

  prog - C - C - ... - C - O - ... - O - buffer

Encapsulations on externally connected octet streams:

  prog - C - C - ... - C - O - ... - O - i/o

In other words, a character oriented stream may encapsulate any kind of stream, and an octet-oriented stream may be encapsulated by any kind of stream, but an octet-oriented stream cannot encapsulate a character-oriented stream.
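
The rule can be written as a small predicate over the C/O notation used above. This is an illustrative sketch in portable Common Lisp; valid-encapsulation-chain-p is a hypothetical name, and the chain is given as a list of the keywords :c and :o, ordered from the program side to the device side.

```lisp
(defun valid-encapsulation-chain-p (chain)
  "CHAIN lists stream kinds from the program side to the device side,
using :c for character-oriented and :o for octet-oriented streams."
  (and chain
       (every (lambda (kind) (member kind '(:c :o))) chain)
       ;; Once an octet-oriented stream appears, every stream nearer the
       ;; device must also be octet-oriented.
       (every (lambda (kind) (eq kind :o))
              (member :o chain))))
```

For example, (valid-encapsulation-chain-p '(:c :c :o :o)) is true, while (valid-encapsulation-chain-p '(:o :c)) is false because an octet-oriented stream would be encapsulating a character-oriented one.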

12.4 Examples of stream encapsulations

Three examples are provided, which show how to encapsulate streams in different styles. (Only two of the examples are fully worked out.) The first shows a string-stream which uses two buffers and which thus allows bidirectional communication on dual-channel encapsulatees. The second is a single-buffer string stream which nevertheless allows bidirectional data to and from a single-channel stream like a file. The third example demonstrates an octet-based encapsulator, and it also demonstrates the ability to pick off bits in a stream and divide or combine them for presentation as octets in a stream at a higher level.

All of these examples share some common features:

The code is set in the :user package. In real user-defined encapsulations, any package can be created and used for the implementation. Do not use the :excl package, since that is our Allegro CL implementation package.
The :iodefs module is needed at compile-time, to define several of the macros used in the code. Note that since it is a compile-time require, the module is not necessary for normal runtime. However, you can load the module by evaluating (require :iodefs).
The encapsulatee is specified by the :base-stream argument. That choice of name is arbitrary. The base-stream is obtained from the options argument to each device-open method. It is up to device-open to do any keyword checking that it wants to do. In the examples, the existence of the base-stream keyword is the only thing checked; any other keywords are ignored, as if by :allow-other-keys.
The setting of one or both handle slots to the base-stream is what establishes the connection to the "device" (in this case, another stream). Because the encapsulatee is an open stream, the connection made by the setting of handles is what satisfies device-open's requirement to establish the connection.
One common error that a device-level programmer might make is to forget to return a non-nil value from the device-open method. This will always result in a closed stream, since the shared-initialize :after method on simple-streams always calls device-close if the device-open method returns nil.

12.4.1 Rot13b: An Example of Bidirectional Stream Encapsulation

Rot13 is a simple translation technique often used to shield internet readers from potentially offensive text. The simple rule is that alphabetic characters are shifted by exactly 13 characters in the alphabet, whichever way is possible. Because there are exactly 26 characters in the English alphabet, two such shifts will reproduce the original text. Thus the original text is easy to get, but it takes a conscious act on the part of the reader to read such text.

This encapsulation example uses a bidirectional string stream with two buffers, and is intended to encapsulate another dual-buffer stream (either another encapsulating dual-buffer string stream or a dual-channel stream).

First, an example run:

cl-user(1): :cl examples/streams/rot13b
; Fast loading examples/streams/rot13b.fasl
cl-user(2): (setq xxx (make-instance 
                       'rot13-bidirectional-stream :base-stream *terminal-io*))
#<rot13-bidirectional-stream "^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@" @ #x716fa842>
cl-user(3): (format xxx "hello")
uryyb
nil
cl-user(4): (format xxx "uryyb")
hello
nil
cl-user(5): (read-line xxx)
The quick brown fox jumped over the lazy dog.
"Gur dhvpx oebja sbk whzcrq bire gur ynml qbt."
nil
cl-user(6): (read-line xxx)
Gur dhvpx oebja sbk whzcrq bire gur ynml qbt.
"The quick brown fox jumped over the lazy dog."
nil
cl-user(7):

Note that the encapsulated stream being used above is *terminal-io*, which is the same as *standard-input* in this example transcript. This naturally causes the listener to wait after each (read-line xxx) for the input, just as (read-line *terminal-io*) or (read-line *standard-input*) would. Such behavior would not be seen if the stream being encapsulated were a socket or some other terminal stream.

Source code and discussion

The source code is available in [Allegro directory]/examples/streams/rot13b.cl. The important definitions are described below:

The class definition for rot13-bidirectional-stream has the new (in 8.0) bidirectional-character-encapsulating-stream as its superclass. A bidirectional stream has both the input and the output slots necessary for dual-buffer operation, and thus the standard string strategy functions will work. No other new slots are necessary in this class.

Device-read and device-write:

The code defines a device-read method to read the text to be transformed and a device-write method to write it out. Note that because this is a string-stream, it will not be dealing with external-formats and stream-external-format will return :default.

rotate-char:

This is the basic workhorse routine for rot13. The input character is rotated within the alphabet if it is an alphabetic character, and left alone otherwise. Due to the nature of the rot13 algorithm, given any character char, (rotate-char (rotate-char char)) will always return char.
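
A function with the behavior described for rotate-char can be sketched in portable Common Lisp as follows. This is not the code shipped in rot13b.cl, only an illustration; it assumes that the codes of a through z and of A through Z are contiguous, as they are in ASCII and Unicode.

```lisp
(defun rotate-char (char)
  "Rotate CHAR by 13 places within its own case; other characters are
returned unchanged."
  (let ((code (char-code char)))
    (flet ((rotate (base)
             (code-char (+ base (mod (+ 13 (- code base)) 26)))))
      (cond ((char<= #\a char #\z) (rotate (char-code #\a)))
            ((char<= #\A char #\Z) (rotate (char-code #\A)))
            (t char)))))
```

Mapping it over a string, as in (map 'string #'rotate-char "hello"), gives "uryyb", matching the example run above, and mapping it again restores "hello".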

12.4.2 Base64: an example of binary stream encapsulation

Base64 encoding is part of the MIME specification as rfc1521 (see http://www.faqs.org/rfcs/rfc1521.html). It uses a limited character set to textually encode data of any kind, even if the transmission line is as few as six bits in width. This contributes to the universal usability of base64. This encoding is definitely not a compression technique; in fact data are expanded by 33% when encoded.

This encapsulation example demonstrates the capability of simple-streams to work with sub-octet sizes, and for encapsulations to pass character data through octet streams. The example only implements the decoder on the input side of the stream. An encoder/write side could easily be written as well, but would complicate the example.
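
The size relationship follows from base64 mapping every group of 3 octets to 4 characters, with the final group padded by = characters. A one-line sketch (base64-encoded-length is a hypothetical helper name, and line breaks inserted by the encoder are not counted):

```lisp
(defun base64-encoded-length (octet-count)
  "Number of base64 characters, including = padding but excluding line
breaks, needed to encode OCTET-COUNT octets."
  (* 4 (ceiling octet-count 3)))
```

The country.txt file used in this section's example run holds 73 octets if each of its four lines ends in a single newline, and (base64-encoded-length 73) is 100, which agrees with the 72 and 28 characters on the two encoded lines of country.mime.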

First, an example run:

;;  Assume the file country.txt exists with the following contents:
cl-user(1): (shell "cat country.txt")
Now is the time
for all good people
to come to the aid
of their country.
0
cl-user(2): (shell "mpack -o country.mime -s test country.txt")
0
cl-user(3): (shell "cat country.mime")
Message-ID: <27716.1004255130@killer>
Mime-Version: 1.0
Subject: test
Content-Type: multipart/mixed; boundary="-"

This is a MIME encoded message.  Decode it with "munpack"
or any other MIME reading software.  Mpack/munpack is available
via anonymous FTP in ftp.andrew.cmu.edu:pub/mpack/
---
Content-Type: application/octet-stream; name="country.txt"
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename="country.txt"
Content-MD5: 9wLn2T6WejxfbCKZsS6ALw==

Tm93IGlzIHRoZSB0aW1lCmZvciBhbGwgZ29vZCBwZW9wbGUKdG8gY29tZSB0byB0aGUgYWlk
Cm9mIHRoZWlyIGNvdW50cnkuCg==

-----
0
cl-user(4): :cl examples/streams/base64
; Fast loading examples/streams/base64.fasl
cl-user(5): (setq yyy (open "country.mime"))
#<file-simple-stream #p"country.mime" for input pos 0 @ #x716fd2fa>
cl-user(6): (setq xxx (make-instance 'base64-reader-stream :base-stream yyy))
#<base64-reader-stream
   for input fd #<file-simple-stream #p"country.mime" for input pos 0>
  @ #x716ff572>
cl-user(7): (read-line xxx)
"Now is the time"
nil
cl-user(8): (read-line xxx)
"for all good people"
nil
cl-user(9): (read-line xxx)
"to come to the aid"
nil
cl-user(10): (read-line xxx)
"of their country."
nil
cl-user(11): (read-line xxx)
Error: eof encountered on stream
       #<base64-reader-stream
          for input fd #<file-simple-stream
                         #p"country.mime" for input pos 585>
         @ #x716ff572>
  [condition type: end-of-file]

Restart actions (select using :continue):
 0: Return to Top Level (an "abort" restart).
 1: Abort entirely from this process.
[1] cl-user(12):

The algorithm of this encapsulator is a simplistic one; decoding doesn't start until just after the first blank line after a line which starts "Content-Transfer-Encoding: base64". Decoding stops again when a line starts with a space or a dash.

Source code and explanation:

The source code is available in [Allegro directory]/examples/streams/base64.cl. The important definitions are described below:

The class definition for base64-reader-stream adds four new slots: two are used for unconverted raw data, and two are used to prime the stream to start decoding the base64. The raw-data slot will contain a small string, and the raw-count slot tracks how much of this string has been moved to the stream's buffer. The primed slot will be set to nil (for unprimed), to 'primed, which means that the priming string has been seen, or to 'ready, which means that the blank line after the priming string has already been seen.

Device-open:

The device-open method follows the standard from-scratch style, as described in Section 10.3 From-scratch device-open. In addition, the four new slots for this class are initialized to a starting point, and with a raw string that will be used.

Device-buffer-length:

The device-buffer-length method defined for this stream class allows a smaller raw string buffer to be used, since rfc1521 limits the length of a valid line to 76 characters.

base64-decode:

This variable holds a table, built algorithmically, which allows the conversion from base64 characters to their respective 6-bit codes.
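As a rough illustration of what building such a table might look like, here is a minimal sketch in portable Common Lisp. It is not the code from base64.cl; the names *base64-alphabet* and make-base64-decode-table are hypothetical, and the sketch assumes that char-code returns ASCII values for these characters.

```lisp
;; Hypothetical names; a sketch only, not the definitions in base64.cl.
;; Assumes CHAR-CODE yields ASCII codes for the base64 alphabet.
(defparameter *base64-alphabet*
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")

(defun make-base64-decode-table ()
  "Return a 256-element vector mapping a character code to its 6-bit
value, or NIL for codes that are not base64 characters."
  (let ((table (make-array 256 :initial-element nil)))
    ;; The position of each character in the alphabet is its 6-bit code.
    (loop for ch across *base64-alphabet*
          for value from 0
          do (setf (aref table (char-code ch)) value))
    table))
```

A lookup is then a single aref on the character code, which is cheaper than searching the alphabet string for every input character.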

Device-read:

The device-read method returns as much converted data as possible, up to the requested amount. Portions of unconverted raw data are retrieved by calling get-some-raw-data, and if that succeeds, the raw data is decoded: each set of 4 octets of unconverted data is changed to 3 octets of converted data. Any extra octets modulo 4 that have been read are moved to the beginning of the raw buffer and are not decoded until further reads complete the 4-octet group.
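The 4-to-3 conversion and the carrying over of leftover octets can be sketched as follows. This is an illustrative sketch in portable Common Lisp, not the device-read method itself: b64-value and decode-quads are hypothetical names, the raw data is modeled as a string, and the input is assumed to contain only valid base64 characters and "=" padding.

```lisp
;; Hypothetical helpers; a sketch of the decoding step, not the real method.
(defun b64-value (ch)
  "Return the 6-bit value of base64 character CH, or NIL if CH is not
in the alphabet."
  (position ch "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"))

(defun decode-quads (raw)
  "Decode every complete 4-character group in the string RAW.
Return two values: a list of decoded octets, and the undecoded
remainder (0 to 3 characters) to be kept for the next read."
  (let ((full (* 4 (floor (length raw) 4)))
        (octets '()))
    (loop for i from 0 below full by 4
          do (let ((bits 0) (pad 0))
               ;; Pack four 6-bit values into one 24-bit integer.
               (dotimes (j 4)
                 (let ((ch (char raw (+ i j))))
                   (if (char= ch #\=)
                       (setf pad (1+ pad)
                             bits (ash bits 6))
                       (setf bits (logior (ash bits 6) (b64-value ch))))))
               ;; Split the 24 bits into up to three octets; padding
               ;; characters reduce the number of octets produced.
               (push (ldb (byte 8 16) bits) octets)
               (when (< pad 2) (push (ldb (byte 8 8) bits) octets))
               (when (< pad 1) (push (ldb (byte 8 0) bits) octets))))
    (values (nreverse octets) (subseq raw full))))
```

For example, (decode-quads "Tm93IG") decodes the first group to the octets of "Now" and returns "IG" as the remainder, which would be decoded only once two more characters arrive.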

get-some-raw-data:

This function calls j-read-chars to get as much data as possible, up to the count (but not more than the raw-buffer size). It loops until it can't read any more, or until it has placed at least some data in the raw-data buffer.

Any data that don't represent the actual base64 text are discarded. This is done via a state machine, which must be primed and triggered before the buffer is actually filled. The primed slot is normally nil when no base64 encoding is being done, and is primed to the value 'primed by the function prime-it (described below). Once the stream is primed, a blank line is needed to actually trigger the state machine and to put it into the 'ready state, after which the buffer can be filled. Once the state is 'ready, characters are read into the raw-buffer until either a #\Newline is encountered, or until the state is changed again to nil by the occurrence of either another blank line or a dash character (#\-).

The state remains unchanged from call to call, in case no data (or even no header) are read.
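The transitions described above (nil, then 'primed, then 'ready, then back to nil) can be sketched at the granularity of whole lines. This is a simplified model in portable Common Lisp, not the character-level code in get-some-raw-data; starts-with-p, next-primed-state, and collect-base64-lines are hypothetical names, and the stop condition follows the description in this subsection (a blank line or a leading dash).

```lisp
;; Hypothetical line-level model of the priming state machine.
(defun starts-with-p (prefix line)
  "True if the string LINE begins with the string PREFIX."
  (let ((n (length prefix)))
    (and (>= (length line) n)
         (string= prefix line :end2 n))))

(defun next-primed-state (state line)
  "Return the state that follows STATE after LINE has been seen."
  (case state
    ;; Unprimed: only the priming string changes the state.
    ((nil) (if (starts-with-p "Content-Transfer-Encoding: base64" line)
               'primed
               nil))
    ;; Primed: a blank line triggers the machine.
    (primed (if (zerop (length line)) 'ready 'primed))
    ;; Ready: a blank line or a leading dash ends the base64 data.
    (ready (if (or (zerop (length line))
                   (char= (char line 0) #\-))
               nil
               'ready))))

(defun collect-base64-lines (lines)
  "Return the lines of LINES that would be passed on for decoding."
  (let ((state nil) (data '()))
    (dolist (line lines (nreverse data))
      (let ((new (next-primed-state state line)))
        ;; A line is data only if the machine was and remains ready.
        (when (and (eq state 'ready) (eq new 'ready))
          (push line data))
        (setf state new)))))
```

Because the state is an explicit value carried from one call to the next, a read that returns no data simply leaves it unchanged.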

prime-it:

This function's purpose is to match arbitrary input to the string "Content-Transfer-Encoding: base64". Such matching can be done one character at a time or from multiple characters, using the prime-count slot which keeps track of the current number of matched characters from previous calls. New data in the buffer argument always starts at the 0th element.
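A minimal sketch of this kind of resumable matching, in portable Common Lisp, is shown below. It is not the prime-it function from base64.cl: match-prime and *priming-string* are hypothetical names, and the match count is passed and returned explicitly rather than stored in a slot. The simple restart-on-mismatch logic is adequate here only because the first character of the priming string, an uppercase C, does not occur again within it.

```lisp
;; Hypothetical names; a sketch of incremental matching across calls.
(defparameter *priming-string* "Content-Transfer-Encoding: base64")

(defun match-prime (buffer prime-count &optional (end (length buffer)))
  "Feed the first END characters of BUFFER to the matcher.
PRIME-COUNT is the number of characters matched by previous calls.
Return two values: the new match count, and T once the whole priming
string has been matched. Callers should stop calling after a match."
  (loop for i from 0 below end
        for ch = (char buffer i)
        do (cond ((char= ch (char *priming-string* prime-count))
                  (incf prime-count)
                  (when (= prime-count (length *priming-string*))
                    (return-from match-prime (values prime-count t))))
                 ;; Mismatch, but this character may start a new match.
                 ((char= ch (char *priming-string* 0))
                  (setf prime-count 1))
                 ;; Mismatch: start over.
                 (t (setf prime-count 0))))
  (values prime-count nil))
```

Feeding "Content-Trans" and then "fer-Encoding: base64" in two calls, passing the count returned by the first call into the second, reports a complete match on the second call.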

As it exists, the base64 example is a toy only. There are a number of things that might be done to make it less of a toy:

It should better parse headers, and not pass over non-base64 encodings (when other encodings are encountered, errors should be generated). Currently, the string "Content-Transfer-Encoding: base64" is the only thing that primes the stream for parsing.
It should make attempts to convert partial groupings of 4 octets, if possible. If only two octets are read out of the 4 octets required for a 3-octet result, at least one result octet should be decodable. Also, if 3 of the 4 octets are read, 2 result octets should be decodable. As it currently stands, this stream class waits until all 4 octets are read before decoding them, and will thus cause listen to return nil in such a circumstance, even though at least one of the result octets should have been decodable.
The file-position method and its setf method are not specialized on this stream class. It is not clear how file-position should be defined. One simplistic approach is to treat the file-position as transparent; the device-file-position simply returns the file-position of the encapsulatee. However, (setf file-position) would pose a problem for the priming state of the stream, since it would not be known whether or not the new position is somewhere in the base64 data, or in header or other non-base64 areas.
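The partial-group improvement suggested in the list above can be sketched as follows. This is an illustration in portable Common Lisp of the arithmetic only (2 base64 characters carry 12 bits, enough for 1 octet; 3 characters carry 18 bits, enough for 2 octets), not a change to the example code; b64-value and decodable-octets are hypothetical names, and the input is assumed to contain only valid base64 characters.

```lisp
;; Hypothetical helpers; a sketch of decoding an incomplete group.
(defun b64-value (ch)
  "Return the 6-bit value of base64 character CH, or NIL if CH is not
in the alphabet."
  (position ch "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"))

(defun decodable-octets (chars)
  "Return the list of octets that can already be decoded from CHARS,
a string holding the first 2 or 3 characters of a 4-character group."
  (let ((bits 0))
    ;; Accumulate 6 bits per character, most significant first.
    (loop for ch across chars
          do (setf bits (logior (ash bits 6) (b64-value ch))))
    (case (length chars)
      ;; 12 bits available: the top 8 form one complete octet.
      (2 (list (ldb (byte 8 4) bits)))
      ;; 18 bits available: the top 16 form two complete octets.
      (3 (list (ldb (byte 8 10) bits) (ldb (byte 8 2) bits)))
      ;; 0 or 1 characters: nothing is decodable yet.
      (t '()))))
```

A real implementation would also have to remember which octets of the group had already been delivered, so that they are not returned again when the rest of the group arrives.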

12.5 Encapsulating composing external-formats

It is possible to create composed external-formats using encapsulating streams. (See Composed External-Formats in iacl.htm for a description of composed external-formats.) Encapsulated-based composed external-formats operate by melding two or more streams together. This is as opposed to macro-based composed external-formats, which operate by combining the composer and composee external-format conversion macros to create a single new external-format conversion macro. Macro-based composed external-formats are defined using compose-external-formats. At this time, functions similar to compose-external-formats which define encapsulated-based composed external-formats are not available, but are planned for future Allegro CL releases.

Allegro CL includes an encapsulated-based composing external-format for translating Ascii return/linefeed octet codes into Common Lisp #\Newline characters. This external-format is called :e-crlf. Its implementation is described in detail in this section.

find-external-format accepts a two-element list argument. The first element names an encapsulated-based composer external-format, and the second element names an external-format. Note that the second element itself can be a list, thus recursively denoting an inner composition.

Thus, (find-external-format '(:e-crlf :foo-base-ef)) returns an encapsulated-based external-format which is the same as the :foo-base-ef external-format except that the Common Lisp #\Newline character is converted to Ascii return/linefeed codes and vice-versa. (Such external-formats are useful/necessary for text files native to DOS/Windows which use the two Ascii octet codes to terminate lines.)

Because a stream's external-format can be switched dynamically, the style of stream encapsulation for external-formats is much different than the normal encapsulation style of attaching a stream handle. One reason for the difference is that normal encapsulations can cause identity confusion. If stream xxx is opened and then encapsulated by stream yyy, then any variables that once referred to xxx would have to be changed to refer to yyy instead. If a user does

(setf (stream-external-format xxx)
      (find-external-format '(:e-crlf  :foo-base-ef)))

or, equivalently,

(setf (stream-external-format xxx)
      '(:e-crlf  :foo-base-ef))

then the identity and class of the stream which xxx holds must not change.

In external-format encapsulation, the melded-stream slot, rather than the handle slots, contains the next stream in a composition. In addition, the melding-base slot always contains the base stream of the composition (which is the stream that retains its identity no matter what the composition looks like).

Composed External-Format Description:

Composing stream class:

The composing-stream class is introduced to implement the external-format composition. It has string-stream as superclass, and has no extra slots. As an exception to the general rule that all streams are buffered, composing-streams are not buffered themselves, but read and write one character at a time. A specialization on a composing-stream might have tables to work with, such as translation tables, in the same way that encapsulating streams use them, but any buffering is done vicariously through its base stream.

Straw encapsulation model:

It is easiest to explain the intricacies of the composing-external-format model by demonstrating what the model would have looked like if the identity problem had not in fact been a problem. Note that this entire section below is for demonstration purposes only, and does not describe the actual composition structure.

A composing-external-format like (:e-crlf :foo-base-ef) would have been represented as two streams: the base stream, and a composing-external-format stream whose melded-stream and melding-base both contain the base stream. The composing-external-format stream would contain the composing external-format character strategy functions (in this case, for example, the j-read-char slot would contain #'crlf-read-char) and the base stream would contain its own strategy functions.

Reading a character would consist of funcalling the j-read-char function of the composing-stream. That function, crlf-read-char, would read and/or unread individual characters from the melded-stream (i.e. the file stream) by funcalling the j-read-char and j-unread-char functions of that stream as appropriate, and by then combining those characters (this composing-external-format will combine #\Return followed by #\Linefeed into a single #\Newline character). The j-read-char function of the file stream would get its characters in the usual way, by operating on its own buffer (and possibly thus calling one of the device functions).

Although Allegro CL includes only two composing-formats, :e-crlf and the less commonly used :e-crcrlf (see #\newline discussion in iacl.htm for more about the crcrlf external-format), it is possible that new composing-formats will be defined in the future. For example, if some kind of ligature combining external-format called, say, :e-ligature were created (though such an external-format currently does not exist) then it could be combined with others via

 (setf (stream-external-format xxx) '(:e-crlf (:e-ligature  :foo-base-ef)))

In our hypothetical architecture, we would have set this up as three streams; if xxx is the stream with the :foo-base-ef strategies and the buffer, and yyy is the composing stream with crlf strategies, and if zzz is the composing stream with the (hypothetical) ligature strategies, then the above setf would attach the streams as follows:

  yyy -> zzz -> xxx

where the arrow represents the connection made by the melded-stream slot. A read-char on yyy would funcall yyy's j-read-char, which would combine characters obtained by calling j-read-char on its melded-stream zzz, which in turn would return characters formed by combining or expanding ligatures recognized by reading characters from its melded-stream xxx, which finally obtains its characters from its buffer (filling it if necessary).

Actual encapsulation model:

In reality, the above straw model using a hypothetical architecture will not work, because the setting of external-formats should never change the identity of a stream. The above hypothetical architecture requires that the read-char pass yyy as the stream, but identity requirements need xxx to be the stream to be passed to read-char. Thus, the real model is somewhat convoluted; if we still keep our hypothetical :e-ligature external-format, then if

 (setf (stream-external-format xxx) '(:e-crlf (:e-ligature  :foo-base-ef)))

is done, with xxx as the same base stream as before, and if we then say

 (with-stream-class (stream)
    (setq yyy (sm melded-stream xxx))
    (setq zzz (sm melded-stream yyy)))

then the following picture would apply:

 xxx -> yyy -> zzz

xxx will still be a file-simple-stream whose melding-base is itself, and whose melded-stream is yyy. Also, xxx's character-strategy slots will be filled with :e-crlf strategies (e.g. #'excl::crlf-read-char, etc.).
yyy will be the crlf composing-stream, whose melding-base will be xxx and whose melded-stream will be zzz. Its character-strategy functions will be whatever were defined for the ligature external-format (presumably named something like ligature-read-char, etc).
zzz will be the (still hypothetical) ligature composing-stream, whose melding-base will be xxx and whose melded-stream will also be xxx. Its character-strategy functions will be the strategies from the file-simple-stream [in this case the read-char strategy is #'(efft sc-read-char :foo-base-ef)].

Finally, xxx's external-format slot will contain the results of

(find-external-format '(:e-crlf (:e-ligature  :foo-base-ef)))

and yyy's external-format slot will contain the results of

(find-external-format '(:e-ligature  :foo-base-ef))

and zzz's external-format slot will contain the results of

  (find-external-format  :foo-base-ef)
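The way a nested specification distributes over the three streams can be sketched with a small helper. This is an illustration in portable Common Lisp that manipulates only the list structure; it does not call find-external-format, and composition-formats is a hypothetical name. The :e-ligature and :foo-base-ef names are the hypothetical ones used in the text above.

```lisp
;; Hypothetical helper; operates on the designator lists only.
(defun composition-formats (spec)
  "Return the external-format designators held by each stream in the
melded chain, starting with the base stream (xxx in the text)."
  (if (consp spec)
      ;; This stream holds the whole spec; the next stream down holds
      ;; the inner spec, which is the second element of the list.
      (cons spec (composition-formats (second spec)))
      (list spec)))
```

Applied to '(:e-crlf (:e-ligature :foo-base-ef)), the helper returns three designators, corresponding in order to the external-format slots of xxx, yyy, and zzz.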

The rotation of the strategy functions implies that all strategy functions must be aware of the rotation and must consider where the respective slots are in these cycles; all external-format related slots (such as those for holding last character octet size and external-format state information) are in the current stream, whereas stream-related slots (such as buffers, pointers, etc.) are in the melded-stream (the next stream down in the cycle). Highest-level functionality such as dribble, control-handlers, etc., always remains in the base stream.

Note also that since this kind of "melding" encapsulation is in a different direction than regular encapsulation, the handles of the base-stream are not modified and point to the next "real" encapsulation outward.

So all strategies everywhere take an indirection through the melded-stream slot, to get to the next stream down for its real operation. A standard pattern of coding in use is shown in this example:

(defun crlf-read-char (e-stream eof-error-p eof-value block)
    (with-stream-class (stream e-stream)
       (let ((stream (sm melded-stream e-stream))
    ...

So in fact the j-read-char in xxx is going to have xxx as e-stream, and yyy as stream.

As an example, consider the crlf algorithm:

Read a char
Then
1. If the character is not #\Return, return the char and done.
2. If eof was encountered, perform the requested eof processing.
Char is #\Return; read char (call it char2).
Then
1. If char2 is eof, then return char.
2. If char2 is #\Linefeed, return #\Newline
Char is #\Return, but char2 is something else. unread-char without disturbing the ability to unread the #\Return.
Return char, which is #\Return.

The crlf-read-char strategy does this by calling j-read-char and j-unread-char on stream (rather than on e-stream). Note that if there is a third stage, yyy's strategies will be ligature strategies, but will end up operating by calling j-read-char on zzz, whose strategies are in fact the base-stream strategies. Now, remembering that all strategies take the indirection, the base-stream strategies will take the melded-stream slot of zzz, which is xxx, which has the buffers which these strategies expect.
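The crlf algorithm itself can be sketched against an ordinary Common Lisp character stream. This is a simplified, portable illustration and not the excl::crlf-read-char strategy: crlf-read-char-sketch is a hypothetical name, the standard read-char and unread-char stand in for the j-read-char and j-unread-char strategy functions, there is no melded-stream indirection, and the sketch makes no attempt to preserve the ability to unread the #\Return.

```lisp
;; Hypothetical function; a portable sketch of the crlf read algorithm.
(defun crlf-read-char-sketch (stream &optional (eof-error-p t) eof-value)
  "Read one character from STREAM, combining #\\Return followed by
#\\Linefeed into a single #\\Newline."
  (let ((char (read-char stream nil :eof)))
    (cond ((eq char :eof)
           ;; Perform the requested eof processing.
           (if eof-error-p
               (error 'end-of-file :stream stream)
               eof-value))
          ;; Any character other than #\Return is returned as-is.
          ((char/= char #\Return) char)
          (t
           (let ((char2 (read-char stream nil :eof)))
             (cond ((eq char2 :eof) char)        ; lone #\Return at eof
                   ((char= char2 #\Linefeed) #\Newline)
                   (t
                    ;; Not a crlf pair: push char2 back, return #\Return.
                    (unread-char char2 stream)
                    char)))))))
```

Reading the characters a, Return, Linefeed, b, Return, c through this function yields a, #\Newline, b, #\Return, c: only the Return that is immediately followed by a Linefeed is combined.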

Appendix A: Built-in stream methods and their uses

There are built-in methods for stream classes. Some are described in the following subsections.

Appendix A.1 The print-object built-in stream method

The print-object method affects all Common Lisp objects, and streams are no different. Of the two kinds of streams, Gray and simple, the simple-stream usage of print-object is the more involved.

Some of the general items that might be printed in a stream print-object method are listed by name in each individual description, and might be:

printing status: if the stream has been manufactured specifically for printing through another stream, then "printing for <name>" is printed, where <name> is the name of the stream for which this stream is gathering output. A pretty-printing (i.e. an xp) stream is an example of a printing stream.
open status: if the file is open, no special status is printed, but if the file is only open for input or output, or else is closed, that status is included in the printing of the stream.
position: if file-position information is appropriate (even if the stream does not represent an actual file) then "pos: <n>" is printed where <n> is a file position.

The action of the method on different stream classes is as follows:

Gray streams (see gray-streams.htm): most Gray streams use the default print-object method, which prints the class of the stream (unreadably, with its address identity). For streams which are also instances of excl::file-gray-stream or excl::socket-stream, the general items (listed above -- printing status, open status and position) are also printed.
Simple-streams: there is a method on simple-stream which is the default simple-stream method; it prints the stream unreadably with the text "[not completely built]". This method should always be overridden; its presence indicates a missing print-object method on the simple-stream class. Most simple-streams have their own print-object methods, which will only include the "[not completely built]" text if the stream does not yet have the correct state after the device-open method has fully completed its tasks. If there is not a print-object method for that simple-stream class, however, this default method indicates that the stream is not (and will never be) completely built - it represents a design error in the simple-stream class.

The three major subclasses of simple-stream treat print-object slightly differently:

dual-channel-simple-stream: this class is listed first because it has a print-object method directly on it; no print-object methods need be provided for subclasses of dual-channel simple-streams. This print-object method prints the relevant general items listed above: printing status and open status (position is never printed in a dual-channel stream because dual-channel streams don't tend to have positions). In addition, if the stream has an input or output handle (or both) they are printed as "fd: <handle(s)>" where <handle(s)> is one or both handles separated by /.
single-channel-simple-stream: because single-channel streams are so diverse, there is not one style of printing that can handle all situations, so of the single-channel-simple-streams listed in Section 11.0 The simple-stream class hierarchy illustrated, only the following (and their subclasses) have print-object methods; if subclassing on any other single-channel stream is desired, a print-object method may need to be provided as part of the stream implementation:
- file-simple-stream: this method provides the general items: printing status, open status, position, as well as the filename and "mapped" if the file is a mapped file.
- direct-simple-stream: those direct-simple-streams which are not also mapped-files tend to be buffer streams. The print-object method for these streams is simplistic, providing only printing-status and position if applicable. More complex direct simple-streams should provide their own print-object methods.
- probe-simple-stream: although this class is a subclass of file-simple-stream, it only prints the filename.
string-simple-stream: although it might seem that string-simple-streams should be consistent, they are not; the class is the only class which provides completely transparent transfer of characters between the API and the device. So only those streams which are listed below (and their subclasses) have print-object methods:
- string-input-simple-stream: provides only the position general item (of the general items listed above). In addition, a sample of the current contents of the string buffer is printed as a string; if the string is 15 characters or longer, ... is printed at the end.
- string-output-simple-stream: provides only the position general item (of the general items listed above). In addition, the printing status general item is provided, but instead of showing an underlying handle, the actual string being built is shown, with an ellipsis for strings 15 characters or more.
- composing-stream: provides only the position general item (of the general items listed above). Also, if the stream has a handle it prints "encapsulated by <encapsulator>" where <encapsulator> is the stream encapsulating this one. Also, if the encapsulating stream is a file or socket, the name is included.

Appendix B: peek-byte and unread-byte

Ever since the inception of simple-streams, there has been a small need for the capability to unread individual octets from a stream. When the Ansi CL spec was being developed around 1990, multi-octet characters were a relatively new concept, and so Common Lisp still held to the concept that the character was the basic unit of data transfer, and that the "byte" was a variable size. But nowadays, the roles have reversed, and we tend to use the term "octet" (8-bit byte) to refer to what most people think of as bytes, and characters have become the larger unit, sometimes taking 3, 4, or even more octets to encode in various encoding systems, which are handled in Allegro CL by external-formats.

We have always had the peek-char and unread-char functions, and there is at least one external-format (named :latin1-base, also nicknamed :octets) which upon reading draws a one-to-one correspondence between characters and octets, so peek-char and unread-char can be used indirectly to peek at and unread bytes. However, in situations where external-formats other than :latin1-base are being used by default, this poses the inconvenience of having to switch external-formats to :latin1-base and then back again every time it is desired to unread a single octet. So two functions, peek-byte and unread-byte, are provided which act on octets only and which do not require an external-format change.
