Allegro CL version 10.1. Unrevised from 10.0 to 10.1.
This document contains the following sections:
1.0 Simple-stream introduction
The Allegro CL streams model uses simple-streams, which formally have no element-type but act in general as if they have element type (unsigned-byte 8). The implementation is described in this document. An older stream implementation called Gray streams is still supported. It is described in gray-streams.htm.
A simple-stream is created whenever a file is opened (with open) without an element-type specified. If open is called with an element-type specified, a Gray stream is created, except for string output streams, as noted next.
String output streams are always simple-streams even though both make-string-output-stream and with-output-to-string accept an element-type keyword argument. Both operators create simple-streams regardless of whether a value is specified for the element-type keyword argument or not.
The transition from Gray streams to simple-streams should be easy and transparent unless you have done extensive stream customization.
It is unlikely that users who are not concerned with stream details will have to concern themselves with this document. Common stream usages, such as opening files for reading and writing, will simply work as expected.
Symbols naming stream functionality that are not standard Common Lisp symbols are generally in the :excl package.
The standard Allegro CL stream implementation uses simple-streams. Gray streams are also supported (see gray-streams.htm). Both kinds of stream may co-exist in a Lisp, and compatibility is maintained for the standard Common Lisp streams interface. The Gray streams implementation remains compatible with previous Gray versions, so code that used them needs only minimal source changes.
The Allegro CL simple-stream is the next generation evolving from the stream of the same name in Common Graphics (the windowing systems used by Allegro CL on Windows). The Common Graphics simple-stream was created as a basis for window-based input and output. The new generation of simple-stream is designed to include this window-based I/O and at the same time to retain the speed always expected for non-window based streams. It is also designed to promote further advances in technology requirements such as International Character sets and server-based I/O.
We felt a new stream implementation was needed because of problems we found with the Gray stream design. In this section, we describe these problems and describe the new implementation.
A new class hierarchy of streams with a base-class of simple-stream removes the problems inherent in Gray streams. Simple-streams are more efficient, and are simpler in concept, making it easier to extend the streams interface by object-oriented means.
A major simplification in the simple-stream hierarchy over Gray streams is the collapsing of many class distinctions into one:
The Gray streams class hierarchy distinguishes between different kinds of stream usage. The simple-stream hierarchy distinguishes between different kinds of external I/O device or pseudo-device.
This device-level interface is CLOS oriented, but is at a much lower level than the Gray streams implementation level, making it much more efficient in both execution speed and in space.
The Gray streams implementation is supported along with simple-streams. Programs using customized Gray streams will, therefore, continue to work as in earlier releases with only minor changes. (Users of Gray streams must ensure, as described below, that calls to open have the element-type keyword argument specified -- if it is unspecified, give it the value character. Users must also be careful not to assume that a stream they did not create is a Gray stream; code that makes this assumption can be accommodated by using synonym streams or by loading a compatibility package, again as described below in this section. Gray streams are described in gray-streams.htm.)
The normal mechanism used to specify a simple-stream is to call open (and thus any callers of open, such as with-open-file) to open the stream without specifying the element-type. This will cause a simple-stream to be created, instead of a Gray stream. A Gray stream will be created if an element-type argument of any kind is given to the open call.
A potential problem arises when legacy code requiring Gray streams calls open with no element-type argument. Under the CL specification, this kind of open causes the element-type to default to character. All CL functionality will be compatible between Gray streams and simple-streams, but if the user was counting on specific Gray stream functionality in character streams, then the open call must be changed to include :element-type 'character as arguments, which will force a Gray stream.
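The effect of the element-type argument can be inspected with a small sketch like the one below. It uses only standard Common Lisp; the scratch file name and the helper describe-open are hypothetical names chosen for illustration. According to the text above, in Allegro CL the first call should report a simple-stream class and the second a Gray stream class, but the exact class names are not assumed here.

```lisp
;; Hypothetical scratch file used only for this illustration.
(defparameter *demo-path* "simple-stream-demo.txt")

(defun describe-open (path &rest open-args)
  "Open PATH for input with OPEN-ARGS and return a list holding the
class name of the resulting stream and its STREAM-ELEMENT-TYPE."
  (with-open-stream (s (apply #'open path :direction :input open-args))
    (list (class-name (class-of s)) (stream-element-type s))))

;; Create the scratch file so that both opens below succeed.
(with-open-file (out *demo-path* :direction :output :if-exists :supersede)
  (write-line "hello" out))

;; No :element-type given: per the text, Allegro CL creates a simple-stream.
(defparameter *default-open* (describe-open *demo-path*))

;; :element-type given explicitly: per the text, this forces a Gray stream.
(defparameter *character-open*
  (describe-open *demo-path* :element-type 'character))
```

Printing *default-open* and *character-open* side by side shows which class each call produced in the running Lisp.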
Another problem arises when Gray streams application code assumes that the stream it is handed is a Gray stream, and thus tries to call stream- methods on it. (Allegro CL names the Gray streams CLOS substrate stream-[cl-function-name], e.g. stream-read-char for read-char.) Allegro CL with the new stream implementation solves this problem by defining its synonym-stream implementation as a Gray stream, but in such a way that all calls via the synonym-stream-symbol are non-Gray-specific. For example, a call to stream-read-char to a synonym-stream will result in a call to read-char on the synonym-stream-symbol. This is slightly slower than dispatching on stream-read-char, but it does provide for compatibility with legacy code. A programmer who doesn't want to rewrite a subsystem to use simple-streams can simply ensure that any stream passed to that subsystem is a synonym stream.
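The synonym-stream approach can be sketched as follows. The names *real-input* and legacy-read-first-char are hypothetical. The Allegro CL branch assumes that stream-read-char is accessible as excl:stream-read-char, which is an assumption based on the remark that non-standard stream symbols are generally in the :excl package; in any other Lisp the sketch falls back to read-char so that it still runs.

```lisp
;; The stream the legacy code should really read from.  It is held in a
;; special variable so that a synonym stream can refer to it by symbol.
(defvar *real-input* (make-string-input-stream "abc"))

(defun legacy-read-first-char (stream)
  "Stand-in for a legacy subsystem that assumes STREAM is a Gray stream."
  ;; Assumption: the Gray generic function is exported from :excl.
  #+allegro (excl:stream-read-char stream)
  #-allegro (read-char stream))

;; Wrap the real stream before handing it to the legacy code.
(defparameter *wrapped* (make-synonym-stream '*real-input*))

(defparameter *first-char* (legacy-read-first-char *wrapped*))
```

Because the synonym stream forwards to whatever *real-input* currently holds, rebinding that variable redirects the legacy code without touching it.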
If the stream being manipulated is one that is not easily wrapped as a synonym-stream (e.g. *terminal-io*), a second approach is provided in the form of the module :gray-compat. This module contains methods on simple-streams for generic functions normally associated with Gray streams. If, for example, the call (stream-read-char *standard-input*) exists in legacy code, then requiring the :gray-compat module (by evaluating (require :gray-compat)) defines a method on stream-read-char for simple-streams, which simply wraps some argument and return-value processing around a call to read-char. This again is slower than calling read-char directly, but provides compatibility for legacy code.
CL stream-functions read-byte, write-byte, read-char, etc., all distinguish in a trivial manner whether the stream is a Gray or simple stream. If a Gray stream is detected, the associated Gray generic function is called for the stream, so that for example, read-char calls stream-read-char, write-char calls stream-write-char, etc. However, if the stream is determined to be a simple-stream, then the specified lower level functionality for the function is called, which may involve calls to specific device-level functionality. This is described in Section 5.1 Implementation of Common Lisp Functions for simple-streams below.
Note that this trivial dispatch does not use any CLOS dispatch mechanism, and the functionality that is called for a simple-stream may be inlined in the function. Thus, for example, all write-byte operations for a simple-stream are performed without any function calls, unless the buffer fills up and device-write must be called.
A simple-stream has no specific element-type associated with it. Instead, the fundamental unit of transfer for a simple stream that is not a string stream is the octet (an 8-bit byte), and all transfers are made at the lowest level with respect to octets. It is up to the implementation to decide how to optimize data transfers for particular situations where data paths are either wider or narrower than 8 bits.
A simple-stream is always buffered. Although CL and Gray streams support buffering, those stream specifications do not explicitly require it. The explicit requirement that simple-streams be buffered allows a simpler and potentially more efficient model. Note that there is no direct interface to simple-stream buffers. The buffering layer resides just below the CL interface level, and the device layer is just below the buffering layer.
In the following diagram, we show the function call hierarchy (top to bottom, as higher level units have function calls to lower level units, eventually reaching the lowest layer, which is the device layer) and the data flow (output from Lisp left to right, input to Lisp right to left).
---------> output direction --------->
<========= input direction <=========

Function call hierarchy, top to bottom, across the User Level, Strategy Level, and Device-level:

CL functionality
  -> Control-character processing
  -> external-format processing
  -> Buffering
  -> Device layer

Arrows from several higher layers converge on Buffering and on the Device layer, so a call can reach them without passing through every intermediate layer.
String streams bypass the external-format (that is, External File Format, as defined in Common Lisp) processing, since their destinations are not really external.
Programmers will work with simple-streams at various levels, wearing one of three different hats at any one time:
There is also a set of functions provided which aid in the implementation of the device layer, and which at the same time are themselves User Level functions. These functions allow the design of encapsulating streams, where the encapsulating stream's device level becomes the encapsulated stream's user level. These functions are discussed in Section 10.4 Implementation Helpers for device-read and device-write.
The intended interaction by the user or applications programmer is to work above the buffer level. The user does so by calling standard CL functions. The device-level programmer may define new classes and device-level methods for "drivers" (we are using the word analogously to device drivers that are implemented in operating systems), but even then it is not intended that the user call the device-level methods directly. Note, however, that it is possible to call device-level methods if all of the rules are followed. The strategy-level programmer may design an alternate API that calls the device-level, but it must conform to the requirements that allow the device-level to work properly. The intended role of the API is that it is a thin layer which manipulates its buffer and thus deals with the device layer as little as possible. Such APIs are intended to be very fast.
The device level is called that because it provides an underlying implementation that can be specialized to suit particular kinds of stream connections, in a similar manner to a device driver in an operating system.
Only simple-streams provide a device layer; Gray streams do not. The device layer puts the object implementation of a simple-stream at a lower level than the object layer of Gray streams.
The goals of the device layer are:
The device layer is not intended to be called directly, except by strategies for higher-level API interfaces that conform to strategy rules. Such APIs should be very lightweight and fast so that there is no need or temptation to call the device-layer directly. Creators of such higher-level APIs must be especially careful to understand the buffering issues involved, including those described in device-read and device-write.
Note that the device layer can implement whatever kind of connection it is set up to do. Usually this means that it will talk directly to a file handle or file descriptor number. However, the connection can be made to a stream of a different type instead of directly to an operating-system level file. By this means, Java style stream encapsulations can be created by the device-level programmer.
Such encapsulation functionality is done automatically by some functions provided as implementation helpers (see Section 10.4 Implementation Helpers for device-read and device-write).
Simple-streams are normally opened with device-open and closed with device-close. device-buffer-length returns the desired length of buffers to be allocated for the stream, if any. device-file-position returns a positive integer that is the current octet (8-bit byte) position of the device represented by its argument stream. device-file-length returns the number of octets (8-bit bytes) in the argument stream if possible.
device-read fills a buffer (if possible) with data from its argument stream. device-clear-input clears any pending input on the device connected to its argument stream. device-write writes from the buffer to the argument stream. device-clear-output clears pending output.
A method that doesn't fall under the strict buffer-unaware read-write device methods is device-finish-record. Unlike device-read and device-write, that method may manipulate stream slots, allocate new work spaces, or call out recursively to higher level stream functions. The intention here is to separate the pure fill and flush aspect of device-read and device-write from the more complex aspects of mapping and record-orientation.
The one exception to the buffer-unaware separation in device-read and device-write is when they receive a null buffer argument from the strategy layer, and their start and end arguments are not the same. This will occur if the buffer that would have been passed is the actual buffer of the stream.
Under this circumstance the device-read/device-write method has a little leeway; it must assume that the null buffer argument refers to the appropriate buffer in the actual stream, and must retrieve that buffer for use. However, it is free to detach and/or replace the buffer with another of the same size. Also, in the case of device-read, the length of the buffer must be used as the end argument, which will also be nil if the buffer argument is nil (unless end is also eql to start). This flagging of the stream's buffer enables device-read and device-write methods to be written that perform advanced buffer-management and asynchronous read-write operations.
The first subsection describes the implementation of standard Common Lisp functions that deal with streams. Note that the behavior is usually different for Gray streams (where the CL function usually calls an Allegro-CL-specific associated generic functions) and for simple-streams (on which the CL function usually operates directly).
The second subsection describes additional functions that operate on streams, but are specific to Allegro CL.
Given the device interface, we can now describe how standard Common Lisp functions and some related Allegro CL functions are implemented in terms of these driver functions. Because the intention of this section is to provide implementation information, but not to describe how to use the functions, usage details such as argument lists are not provided.
See open for the ANSI description.
For both Gray and simple streams, open effectively turns into a call to make-instance of a stream class. Additionally, for simple-streams, a shared-initialize after method calls device-open to actually establish the connection with the external device or file. If the device-open call then fails and thus returns nil, then device-close is called immediately with a true abort argument.
A call to open creates a simple-stream when the element-type keyword argument is not specified. A Gray stream is created when the element-type keyword argument is specified.
open has an &allow-other-keys specification, and an &rest argument. This &rest argument forms the basis of the make-instance initargs when it is called via apply.
A special case exists for an open with :direction :probe: this case is not a normal open and does not actually result in a connection of any kind being made. Instead, make-instance is called to make an instance of probe-simple-stream.
See close for the ANSI description.
The Gray stream system in Allegro CL implements close as a generic function, which is perfectly legal according to CL, which defines close as a function (i.e. a generic function is indeed a function). However, a generic function implies a specialization capability that does not exist for simple-streams; simple-stream specializations should be on device-close. Besides Gray streams, close can be specialized on streams that are neither Gray streams nor simple-streams. One example of this is Allegro CL's passive socket connection. Because of this, close remains a generic function, but for simple-streams it is treated as if it were non-generic: simple-streams should not specialize on close, but should specialize on device-close instead. The method for simple-streams simply calls device-close precisely once, and a method for fundamental-stream (the top-level Gray stream class) breaks the connection and sets a closed-flag in the stream.
If the abort keyword argument is true, any buffers are cleared without being flushed. If abort is false, then any unflushed buffers are forced out to the device before closing.
See read-byte for the ANSI description.
For a Gray stream, read-byte calls stream-read-byte.
Otherwise: If the stream's buffer is empty, an attempt is made to fill the buffer by calling device-read with the blocking argument set to true. If device-read returns -1, then we are at eof; either eof-value is returned or else an end-of-file error occurs.
If the stream's buffer is now not empty, the next octet (8-bit byte) is extracted from the buffer and returned.
This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.
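The refill-then-extract logic described above for read-byte can be modeled in portable Common Lisp. This is only a model and not the Allegro CL implementation: the structure octet-source, the function model-read-byte, and the fill function standing in for device-read are all hypothetical names, and the fill function is assumed to return either a positive count of octets placed in the buffer or -1 at eof.

```lisp
(defstruct octet-source
  (buffer (make-array 4 :element-type '(unsigned-byte 8)))
  (pos 0)    ; index of the next unread octet in BUFFER
  (end 0)    ; number of valid octets currently in BUFFER
  fill-fn)   ; hypothetical stand-in for device-read

(defun model-read-byte (src &optional (eof-error-p t) eof-value)
  "Return the next octet from SRC, refilling the buffer when it is empty."
  (when (= (octet-source-pos src) (octet-source-end src))
    ;; Buffer empty: ask the stand-in device to fill it (blocking model,
    ;; so the fill function is assumed never to return 0).
    (let ((count (funcall (octet-source-fill-fn src)
                          (octet-source-buffer src))))
      (when (minusp count)
        (if eof-error-p
            (error "End of file on ~S" src)
            (return-from model-read-byte eof-value)))
      (setf (octet-source-pos src) 0
            (octet-source-end src) count)))
  ;; Buffer is now non-empty: extract the next octet.
  (prog1 (aref (octet-source-buffer src) (octet-source-pos src))
    (incf (octet-source-pos src))))

(defun make-list-source (octets)
  "Build an OCTET-SOURCE whose stand-in device hands out OCTETS in order."
  (let ((remaining octets))
    (make-octet-source
     :fill-fn (lambda (buffer)
                (if (null remaining)
                    -1
                    (let ((count 0))
                      (loop while (and remaining (< count (length buffer)))
                            do (setf (aref buffer count) (pop remaining))
                               (incf count))
                      count))))))
```

With a four-octet buffer, reading five octets from (make-list-source (list 10 20 30 40 50)) calls the fill function twice, and a further read sees -1 and performs eof processing.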
See read-char for the ANSI description.
For a Gray stream, read-char calls stream-read-char.
Otherwise: The external format is called to accumulate (as if using read-byte) as many octets (8-bit bytes) as is necessary to form a character. If an end-of-file is generated by any of the read-bytes, eof processing is done depending on the eof arguments.
If the character that results is a control character (one whose char-code is less than 32) and the control-in table has a function for that character, then it is a function with two arguments (the stream and the character) which is called to interpret the control character at this time. If the control-in function returns, it returns either a character which is processed normally, or nil, which is interpreted as an eof and eof processing is done. Note that the control-in handler must not try to do any reading from the stream at all; the intention for the control-in handler is to translate an already-received character to another, or to perform an operation and return a character. For ligatures and other multiple-character inputs, a composing external-format should be used or created, or else an encapsulation created for such translations.
If we got this far, the character length is recorded for unread-char and the character is returned.
Note that if eof occurs while reading a character, the actions taken by read-char depend on the external-format. The default action, and by far the most common, is to do eof processing. However, the external format may decide to return a character (saved from a previous read-char) or to generate an error.
This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.
See unread-char for the ANSI description.
For a Gray stream, unread-char calls stream-unread-char.
Otherwise: if the unread-char character-length is set, then place the buffer and file position back to that and unset the unread-char length. Error if the unread-char character-length is not set.
See read-char-no-hang for the ANSI description.
For a Gray stream, read-char-no-hang calls stream-read-char-no-hang.
Otherwise: The external format is called to accumulate (as if using excl::read-byte-no-hang) as many octets (8-bit bytes) as is necessary to form a character. If an end-of-file is generated by any of the byte reads, eof processing is done depending on the eof arguments.
If the character is a control character (one whose char-code is less than 32) and the stream-class specifies interpretation of such characters, it is performed at this time, which may include eof-processing for a control-D.
If we got this far, the character length is recorded for unread-char and the character is returned.
Note that if it is not possible to complete the build of the character, the actions taken by read-char-no-hang depend on the situation:
nil is returned. (similar means that the buffer may have been read during this operation, but that the pointers are set so that the next octet read will be the same as the first octet read for this operation).
This is a non-blocking function. See Section 5.2.1 Blocking behavior in simple-streams.
See peek-char for the ANSI description.
For a Gray stream, peek-char calls stream-peek-char. Otherwise: a read-char equivalent is done, followed by an unread-char.
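The read-then-unread equivalence described for simple-streams can be illustrated with standard functions only. The function name peek-via-read-and-unread is hypothetical, and the sketch models just the simplest case of peek-char (a null peek-type).

```lisp
(defun peek-via-read-and-unread (stream &optional eof-value)
  "Return the next character of STREAM without consuming it,
or EOF-VALUE at end of file."
  (let ((c (read-char stream nil eof-value)))
    ;; Only a character that was really read can be pushed back.
    (unless (eq c eof-value)
      (unread-char c stream))
    c))

;; Peeking leaves the character in place, so the following read-char
;; sees the same character again.
(defparameter *peek-demo*
  (with-input-from-string (s "xy")
    (list (peek-via-read-and-unread s)
          (read-char s)
          (read-char s)
          (peek-via-read-and-unread s))))
```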
See listen for the ANSI description.
An extra second optional argument, width, is added to listen (the first optional argument is the specified stream argument). width specifies the number of octets to read before returning true, or the value character. Currently, any other value than 1 will be treated as if it were specified as 'character.
For a Gray stream, listen calls stream-listen.
Otherwise: If a character-oriented listen is specified (i.e. width is character), then an attempt is made to build the complete character, as if with read-char-no-hang. If successful, the equivalent of an unread-char is then done and true is returned; otherwise nil is returned. If 1 octet is being listened for, then if the buffer is not empty, true is returned. Otherwise device-read is called with a null blocking argument. If that returns 0, then nil is returned, otherwise true is returned.
If the added optional argument is 1 or not specified, only an octet (8-bit byte) is looked for, otherwise external-format processing is used to attempt to build a character in a non-blocking way; if it is determined that the character can definitely be built, then t is returned. However, the state of the stream is left in such a way that an unread-char can be done even after the listen (as is appropriate).
See read-line for the ANSI description.
For a Gray stream, read-line calls stream-read-line and processes the return values according to eof-error-p processing.
Otherwise: String buffers are allocated as necessary and read-char equivalent is performed until either a #\Newline or eof is seen. A new string is allocated of the proper length and filled with the copied data from the temporary buffer(s) and then returned along with the missing newline flag.
Note: The read-line functionality can be optimized in the following way: A string buffer is allocated (this first one presumably on the stack) and read-char equivalent is performed until the next #\Newline or eof is seen (or until the buffer is full, at which time new buffers are allocated as necessary). A new string of the proper length is then constructed and filled with the copied data from the temporary buffer(s) and then returned along with the missing newline flag.
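The second return value of read-line (the missing newline flag mentioned above) is easy to overlook. The following sketch, using only standard Common Lisp and a hypothetical helper named read-all-lines, collects each line together with that flag.

```lisp
(defun read-all-lines (stream)
  "Return a list of (line missing-newline-p) pairs, one per line of STREAM."
  (let ((result '()))
    (loop
      (multiple-value-bind (line missing-newline-p)
          (read-line stream nil nil)
        ;; A NIL line means eof was reached before any character was read.
        (when (null line)
          (return (nreverse result)))
        (push (list line missing-newline-p) result)))))

;; The last line has no terminating newline, so its flag is true.
(defparameter *lines-demo*
  (with-input-from-string (s (format nil "first~%second"))
    (read-all-lines s)))
```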
The functions read-line-into and simple-stream-read-line are similar to read-line but also take a result string argument that receives the line which is read, thereby causing little or no consing.
Arguments: sequence stream &key start end partial-fill
See read-sequence for the ANSI description. Note that Allegro CL uses the additional partial-fill keyword argument, which is not specified in ANSI CL.
For a Gray stream, read-sequence calls stream-read-sequence.
Otherwise: If the sequence is a string, then for every element of the string, a read-char equivalent is performed. Following the last read-character, the unread-char length is set (instead of at every character read). If partial-fill is true, then a read-char-no-hang equivalent is used instead of read-char, after the first character is read with read-char.
If the sequence is an octet vector (i.e. a vector of (signed-byte 8) or (unsigned-byte 8) elements), then the equivalent of read-vector is performed (but possibly blocking if partial-fill is false - see discussion below).
Any other sequence type generates an error (for a simple-stream).
This argument controls the blocking behavior. See Section 5.2.1 Blocking behavior in simple-streams for a general discussion of blocking.
This argument controls the behavior when there are not enough objects (of whatever is being read) on stream to fill the sequence passed as the first argument (at least as far as end, if given) and no EOF is seen. The ANSI specification for read-sequence requires it to block until the sequence is filled or an EOF is seen. In the Allegro CL implementation, the ANSI behavior (blocking) is observed if partial-fill is nil (the default).
If partial-fill is true, however, read-sequence will block for the first element, but will not block for any elements after the first, and so may return prior to the request being completed.
In all cases, read-sequence returns the index in the sequence of the next element not read.
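The partial-fill behavior for strings (block for the first element, then take only what is immediately available) can be modeled with standard functions. This is a portable sketch of the described behavior, not the Allegro CL implementation; read-chars-partial is a hypothetical name, and for simplicity the model treats eof and "no character available yet" the same way after the first character.

```lisp
(defun read-chars-partial (string stream &key (start 0) (end (length string)))
  "Fill STRING from STREAM, blocking only for the first character.
Return the index of the first element of STRING that was not filled."
  (let ((index start))
    (when (< index end)
      ;; The first character is read with a blocking read-char.
      (let ((first-char (read-char stream nil nil)))
        (when first-char
          (setf (char string index) first-char)
          (incf index)
          ;; Later characters are taken only if they are already available.
          (loop
            (when (>= index end) (return))
            (let ((c (read-char-no-hang stream nil nil)))
              (unless c (return))
              (setf (char string index) c)
              (incf index))))))
    index))
```

As with read-sequence, the returned index tells the caller how much of the string holds new data, so a caller can process a short read and call again later.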
See clear-input for the ANSI description.
For a Gray stream, clear-input calls stream-clear-input. Otherwise: if there is any input buffering in the stream, it is thrown away. Then device-clear-input is called. An additional optional buffer-only argument is added above and beyond the ANSI CL spec which allows only the buffer to be cleared, without necessarily performing any other operations on encapsulations of the stream. This argument is passed to device-clear-input.
See write-byte for the ANSI description.
For a Gray stream, write-byte calls stream-write-byte.
Otherwise: If the buffer is full, device-write is called to first write the buffer out, so that the buffer is made empty. An octet (8-bit byte) is expected as input. It is now stored into the stream's (non-full) buffer.
This is the lowest level functionality in the output portion of the CL API functions. Higher level functions which may call this function are: write-char, write-sequence, write-vector. Whether or not these functions actually call write-byte, call an internal but similar function, or expand all of write-byte's functionality inline is not specified.
This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.
See write-char for the ANSI description.
For a Gray stream, write-char calls stream-write-char.
Otherwise: If the character to be output is a control character (one
whose char-code is less than
32), the control-out table is consulted for a control-out function for
that character. If one exists it is assumed to be a function of two
arguments (the stream and the character), and is called for
device-level processing. If the control-out function exists and
returns non-nil, then no further action is taken for this character
since it was handled successfully in the control-out function. If the
control-out function does not exist or exists and returns nil
, then normal processing continues for that
character. Normal processing means that the character is treated as
itself, to be sent uninterpreted to the stream.
The external-format functionality currently in effect is called for the character, which may result in any number of octets (8-bit bytes) being generated. These octets are then treated as if write-byte were called for each one, in the order they were received from the external-format processing.
This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.
See write-string for the ANSI description.
For a Gray stream, write-string calls stream-write-string.
Otherwise: For each character in the specified range in the string, the equivalent of a write-char is performed.
See write-sequence for the ANSI description.
For a Gray stream, write-sequence calls stream-write-sequence.
Otherwise: If the sequence is a string, then the equivalent of write-string is performed.
If the sequence is an octet vector (i.e. a vector of (signed-byte 8) or (unsigned-byte 8) elements), then the equivalent of write-vector is performed. Any other sequence type generates an error (for a simple-stream).
This is a blocking function. See Section 5.2.1 Blocking behavior in simple-streams.
See terpri for the ANSI description.
For a Gray stream, terpri calls stream-terpri. Otherwise: the equivalent of a write-char of #\Newline is performed.
See fresh-line for the ANSI description.
For a Gray stream, fresh-line calls stream-fresh-line. Otherwise: if the stream can be determined to be at the start of a line, then nothing is done and nil is returned, otherwise the equivalent of a write-char of #\Newline is performed.
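A short sketch using a string output stream shows both outcomes of fresh-line. The function name fresh-line-demo is hypothetical. The second call normally returns nil because the stream is known to be at the start of a line, but ANSI CL permits a newline to be written when the position cannot be determined, so that value is reported rather than assumed.

```lisp
(defun fresh-line-demo ()
  "Return the text written and the values of two successive FRESH-LINE calls."
  (let (after-text after-newline)
    (let ((text (with-output-to-string (s)
                  (write-string "abc" s)
                  ;; Not at the start of a line: a newline is written.
                  (setf after-text (fresh-line s))
                  ;; Now at the start of a line: normally nothing is written.
                  (setf after-newline (fresh-line s)))))
      (values text after-text after-newline))))
```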
See finish-output for the ANSI description.
For a Gray stream, finish-output calls stream-finish-output. Otherwise: if there is any output in the stream's output buffer, it is written via device-write with a non-nil blocking argument.
Note that since Allegro CL does not queue writes, and since device-write calls are not required to write all of the requested bytes, the current implementation of finish-output loops on device-write calls until all of the unprocessed data are transferred.
See Section 5.3 Force-output and finish-output policy for a discussion of force-output/finish-output policy in Allegro CL.
See force-output for the ANSI description.
For a Gray stream, force-output calls stream-force-output. Otherwise: if there is any output in the stream's output buffer, it is written via device-write with a blocking argument of nil.
Note that since Allegro CL does not queue writes, and since device-write calls are not required to write all of the requested bytes, the current implementation of force-output is similar to finish-output, in that it loops on device-write calls until all of the unprocessed data are transferred.
See Section 5.3 Force-output and finish-output policy for a discussion of force-output/finish-output policy in Allegro CL.
See clear-output for the ANSI description.
For a Gray stream, clear-output calls stream-clear-output. Otherwise: the stream's output buffer is cleared and device-clear-output is called on the stream.
See file-position for the ANSI description.
For a Gray stream, file-position calls excl::stream-file-position.
Otherwise: For simple-streams that are not string simple-streams, file-positions are always specified as a number of octets (8-bit bytes). For string simple-streams, file-positions are specified as number of characters.
If the position-spec argument is not given, the file position is calculated, possibly involving a call to device-file-position, and returned. Note that the file position may be precached in the stream, and device-file-position may have been called by some other CL functions.
If the position-spec argument is given, then the new file position is calculated and stored. This may involve a call to (setf device-file-position), if the position is outside of the buffer range.
See stream-element-type for the ANSI description.
For a Gray stream, stream-element-type returns the appropriate value.
For a simple-stream, stream-element-type always returns (unsigned-byte 8).
These additional functions are provided in Allegro CL for operating on streams. Each is described on its own documentation page.
There are three modes of blocking behavior when writing items in a sequence to a stream or filling a sequence with items read from a stream. The issue is what to do when (for writing) the entire sequence cannot be written and (for reading) the entire sequence cannot be filled, but no EOF is encountered. (By `the entire sequence', we mean that part specified by start and end if those are supplied.) Here are the modes and a description of what happens in that mode if the whole operation does not complete and no EOF is encountered.
Here is the blocking behavior of the various ANSI CL and Allegro CL functions having to do with writing and reading sequences and individual items.
If partial-fill is nil, cl:read-sequence is blocking. If partial-fill is non-nil, cl:read-sequence is B/NB. See the discussion in Section 5.1 Implementation of Common Lisp Functions for simple-streams.
nil
.
nil
.
nil
.
nil
.
Note on putting a B/NB function in a loop: the result can be blocking behavior until as many characters are written or read as there are iterations of the loop. B/NB behavior guarantees that one element at least is written or read (or the system blocks until that happens). In a loop, with each pass guaranteeing one element, you can guarantee as many elements as desired.
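The looping idea in the note above can be sketched in portable Common Lisp. This is a minimal sketch, not Allegro CL code: fill-completely is a hypothetical helper, and its read-some argument stands in for any B/NB reader that takes a buffer, a start index and an end index, stores at least one element unless it hits EOF, and returns the index just past the last element stored.

```lisp
;; Hypothetical helper (not part of Allegro CL): turns a B/NB style reader
;; into a blocking one by calling it until the requested range is full.
(defun fill-completely (buffer read-some &key (start 0) (end (length buffer)))
  "Call READ-SOME until BUFFER is filled from START to END.
READ-SOME is called with (buffer start end) and must return the index just
past the last element it stored. Returns the final index reached."
  (loop with pos = start
        while (< pos end)
        do (let ((next (funcall read-some buffer pos end)))
             ;; No progress is treated here as EOF, so stop looping.
             (when (<= next pos)
               (return pos))
             (setf pos next))
        finally (return pos)))
```

Each pass is guaranteed to advance by at least one element, so the loop as a whole behaves like a blocking read of the full range.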
The endian-swap keyword argument to read-vector and write-vector allows the byte-ordering to be controlled so as to allow big-endian and little-endian machines to communicate with each other. Each version of Allegro CL has either :big-endian or :little-endian on its *features* list to identify it appropriately. The endian-swap argument is effective only in reads into and writes from vectors that are not strings, and is silently ignored if given when a string is being passed to read-vector or write-vector.
There are three kinds of values that can be given to the endian-swap argument:
- :byte-8, :byte-16, :byte-32, :byte-64, or :byte-128 to indicate the width of the element whose bytes to reverse. For example, :byte-16 swaps every pair of bytes, and :byte-32 swaps every group of 4 bytes. Note that :byte-8 does nothing to the byte-ordering, and is included only for symmetry.
- :network-order. This value does nothing on big-endian machines, and on little-endian machines, causes bytes to be swapped based on the element-size of the vector being read into or written from. For example, a double-float vector, which has an element width of 64 bits, is swapped on a :byte-64 basis on a little-endian machine.
The byte-swapping mechanism relies on the fact that objects are always aligned on 8 or 16 byte boundaries, depending on whether the lisp is a 32-bit or 64-bit lisp. Therefore, it is inadvisable to specify a numeric value of greater than 7 (in a 32-bit lisp) or 15 (in a 64-bit lisp).
The byte swapping mechanism is in fact implemented by performing a logxor on the current index of the next byte to get out of the vector. The resultant xor'ed index is used as the true byte index into the array. No attempt is made to ensure that the index is valid (within range): it is the user's responsibility to ensure that. This is always ensured if the endian-swap specification matches the element width of the vector (e.g. an (unsigned-byte 16) vector is given an endian-swap value of :byte-16, or a double-float vector is given an endian-swap value of :byte-64).
The following table shows how the bytes actually appear after swapping. Since the swapping is symmetrical, it can be used in either direction, for both reading and writing. Given the natural byte order of bytes A, B, C, D, E, F, G, H to start, the table shows the byte order of the resultant bytes for some example cases:
  name       value   order
  :byte-8      0     A B C D E F G H
  :byte-16     1     B A D C F E H G
  ----         2     C D A B G H E F
  :byte-32     3     D C B A H G F E
  ----         4     E F G H A B C D
  ----         5     F E H G B A D C
  ----         6     G H E F C D A B
  :byte-64     7     H G F E D C B A
  ...
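The rows of the table can be reproduced with a small portable sketch of the logxor indexing described above. The names swap-view and *natural-order* are hypothetical and are not Allegro CL operators; the sketch models only the index remapping, not read-vector or write-vector themselves.

```lisp
;; Hypothetical helper, not an Allegro CL function: models the index
;; remapping that a numeric endian-swap value performs.
(defun swap-view (bytes swap)
  "Return a fresh list whose element I is element (LOGXOR I SWAP) of BYTES."
  (let ((v (coerce bytes 'vector)))
    (loop for i from 0 below (length v)
          ;; As the text warns, nothing checks that the xor'ed index is in
          ;; range; the caller must pick SWAP to suit the data length.
          collect (aref v (logxor i swap)))))

(defparameter *natural-order* '(a b c d e f g h))
```

For example, (swap-view *natural-order* 1) yields the :byte-16 row and (swap-view *natural-order* 7) the :byte-64 row. Applying the same swap value twice restores the natural order, which is why the same value serves for both reading and writing.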
There is no explicit requirement by the ANSI CL Spec for an implementation to provide force-output or finish-output calls automatically; the programmer is always responsible for providing enough of these calls to ensure that output is seen at its final destination in a timely manner. However, it is also counter to good stream optimization techniques to call force-output or finish-output; the positive effects of buffering are reduced when flushed by these functions.
Allegro CL provides minimal force-output calls when certain operations are performed, especially on interactive streams. But not all operations cause force-output calls. It might be confusing to observe that write-line will flush the output to an interactive stream, but that write-string will not do so. The division is simple, though, and can be explained easily.
An interactive stream is specifically defined in Allegro CL as one for which interactive-stream-p returns true. In Allegro CL interactive-stream-p is setf'able, so any stream can in fact be interactive. Usually, it makes most sense to set this attribute on a stream that will act as a listener.
An interactive force-output is an operation that calls force-output on a stream if and only if the stream is interactive. If the stream is not interactive, no force output is done.
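The definition above amounts to a one-line guard. A minimal sketch in standard Common Lisp follows; the function name interactive-force-output is hypothetical (the text uses the phrase only as a description), and returning true after a flush is an illustrative choice.

```lisp
;; Hypothetical name; sketches the policy, not the Allegro CL internals.
(defun interactive-force-output (stream)
  "Call FORCE-OUTPUT on STREAM only when STREAM is interactive.
Returns true if a flush was requested, NIL otherwise."
  (when (interactive-stream-p stream)
    (force-output stream)
    t))
```

On a non-interactive stream such as a string output stream the function does nothing, which keeps buffered output in the buffer until the program flushes it explicitly.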
Allegro CL attempts to force-output as seldom as possible for minimal acceptable buffer-flushing but with maximal buffer performance.
No other calls to force-output are guaranteed by Allegro CL, although such force-output calls might be inserted when deemed necessary. However, the general guideline that Allegro CL follows is that in most cases a redundant force-output call is not good, and is thus usually avoided.
These functions are all written as if they call lower level CL functions, and do not necessarily call device-level functionality directly.
The class hierarchy for streams starts with stream at the head, and implementations which include other stream classes such as Gray streams will place those stream classes as subclasses of stream. In Allegro CL, the only subclasses of stream are simple-stream and fundamental-stream. fundamental-stream denotes a Gray stream.
The simple-stream class hierarchy is divided into three fundamental simple-stream classes (which in turn have subclasses not listed in the diagram), based on the kinds of buffering they do:
           --> fundamental-stream ...
          |    (Gray streams)
          |
stream --+
          |                     --> single-channel-simple-stream ...
          |                    |
           --> simple-stream --+--> dual-channel-simple-stream ...
                               |
                                --> string-simple-stream ...
These simple-stream subclasses cannot be mixed. They are intended to implement three styles of input/output in fundamentally different ways.
single-channel-simple-stream
dual-channel-simple-stream
string-simple-stream
The basic behavior of the Common Lisp functions is described in Section 5.1 Implementation of Common Lisp Functions for simple-streams. That description should be taken on an as-if basis, which means that the specific functions described may not actually be called at all, or else they might be implemented using compiler-macros to call lower-level functions after type inferencing proofs have been established (in other words, the implementation works as if it was implemented as described). However, the device-level interface does not have this freedom; those methods applicable for the stream class must be called in the way specified. This is to guarantee to the device-writer that methods that are written for a particular purpose will indeed be called.
However, the selection of methods to call when appropriate depends on the strategy used. Listed below are various sets of functions that are called for various stream types.
Whenever a control character (one whose char-code is less than 32) is seen when reading or writing on a stream, a decision must be made as to what to do with these characters. In a "raw" environment, the characters are processed as themselves; when writing they are inserted into the buffer (possibly after translation to octet form) and when reading they are simply returned as Lisp characters (possibly after having been assembled from octet form). In a "cooked" environment, at least some control characters turn into instructions at the device level, and are not inserted into or extracted from the stream as characters.
An example of this is terpri, which is simply a write-char of a #\Newline. On a terminal stream, a terpri simply sends the #\Newline as a character (though its sending may require a column indicator in the stream to be set to 0 as well). However, a window stream should not see a #\Newline at the device-level; instead, the action should be to "move the cursor down one line and to the far left side of the window".
The simple-streams design allows for both of these kinds of action. Each stream has two slots, a control-in slot and a control-out slot, which may contain tables of functions that are consulted when the character being read or written is determined to be a control character. The actions taken are as follows:
- If the control-out table has a function entry for the control-character being written, that function is called with two arguments: the stream and the character. The control-out function should perform whatever work that it is required to do, and return non-nil, meaning that it is finished processing that character, or else nil, which means that the normal character processing action is taken which inserts the character into the stream.
- If the control-in table has a function entry for the control-character that has just been read out of the stream, that function is called with two arguments: the stream and the character. The required actions are taken, and the function returns a new character to substitute (or the old character), or it may return nil to indicate end-of-file. Note that the control-in handler must not try to do any reading from the stream at all; the intention for the control-in handler is to translate an already-received character to another, or to perform an operation and return a character. For ligatures and other multiple-character inputs, a composing external-format should be used or created, or else an encapsulation created for such translations.
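The control-out rule can be illustrated with a toy model in portable Common Lisp. Everything here is an assumption made for illustration: the table is a plain 32-element vector rather than the object built by make-control-table, an adjustable string stands in for the stream, and make-toy-control-table and toy-write-char are hypothetical names.

```lisp
;; A toy model only; this is not how Allegro CL represents control tables.
(defun make-toy-control-table ()
  "Return a table with one slot per control character code (0 to 31)."
  (make-array 32 :initial-element nil))

(defun toy-write-char (char table buffer)
  "Write CHAR to BUFFER, an adjustable string with a fill pointer.
If TABLE holds a handler for CHAR, the handler is called with BUFFER and
CHAR; a non-nil result means the handler finished processing CHAR, while
NIL means the character is inserted in the normal way."
  (let* ((code (char-code char))
         (handler (and (< code 32) (aref table code))))
    (unless (and handler (funcall handler buffer char))
      (vector-push-extend char buffer))
    char))
```

A handler that returns nil after a side effect, such as updating a column counter, leaves the character in the output, which is the behavior described for the first case in the list above.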
Control-tables are built with make-control-table and are stored into the appropriate control-in or control-out slots by device-open. The following standard control handlers and tables are examples of such but are not intended for programmer use.
nil (indicating that further character processing should be done) while side-effecting a slot of the stream.
nil (indicating that further character processing should be done) while side-effecting a slot of the stream.
- *std-control-out-table*: value is a control-table which contains std-newline-out-handler and std-tab-out-handler in their appropriate locations. Users must not modify this table.
- *terminal-control-in-table*: value is a control-table which contains std-dc-newline-in-handler in its appropriate locations. Users must not modify this table.
This section gives some tips for device-writing. It is not comprehensive, and some of the functions and macros it refers to may or may not be documented. The section is Allegro CL specific, but may be taken as a guide for other implementations as well.
New stream classes may be created which subclass existing classes. If the superclass chosen is a currently instantiable class, such as terminal-simple-stream, file-simple-stream, etc., then the device methods may be used as they are, or they may be called by call-next-method by the more specialized method. If the superclass chosen is one of the three major streams (single-channel-simple-stream, dual-channel-simple-stream, or string-simple-stream) then much of the device functionality will have to be written from scratch. There may be some methods that exist to provide defaults (for example, the default device-buffer-length method specializes on simple-stream to provide a default for all simple-streams). Other methods, such as device-open, have no appropriate default action, and are thus not supplied.
To define a new stream class in Allegro CL, the iodefs module must be required to provide some defining macros. The class may then be defined using def-stream-class:
  (require :iodefs)

  (def-stream-class blarg (terminal-simple-stream)
    ((slot1 :initform nil)
     (slot2 :initform nil :accessor blarg-slot2))
    (:default-initargs
     :input-handle (error "blarg stream must have a :input-handle arg")))
Each primary method to device-open returns a stream that is fully connected to its device; it can perform all operations intended on that device. When a primary method performs a call-next-method to do a device-open on a less-specific device, that functionality is complete when the call-next-method returns.
For example, suppose a whiz-bang is a type of file which has a header line associated with it, to be internalized and then ignored as data. The whiz-bang stream might be defined as
  (def-stream-class whiz-bang (file-simple-stream)
    ((header :initform nil :accessor whiz-bang-header)))
The device-open for whiz-bang might call the primary-method for the file, and then do its own work afterward:
  (defmethod device-open ((stream whiz-bang) slot-names initargs)
    (declare (ignore initargs slot-names))
    (let ((success (call-next-method)))
      (when success
        ;; read and internalize the header
        (setf (whiz-bang-header stream) (read-line stream))
        t)))
Note that:
nil, which indicates that the device-open failed. File operations may thus be performed on the stream.
A device-open that does not call-next-method must perform the following:
max-out-pos slot must be initialized.
(sm excl::melded-stream stream) should be passed into these functions, instead of the stream (see sm). The encapsulation shape assurance in step #5 will guarantee that the melded-stream slot holds the correct stream, even if there is no encapsulation (and thus the melded-stream of the stream is itself).
The following two sets of functions allow device-read and device-write methods to be implemented.
Note that the supplied device-read and device-write functions do not generate errors themselves, but pass them back to the higher level for processing. This allows read-octets and write-octets to pass errors back as well, as the implementation of a higher level (encapsulating) device-read and device-write.
The first group of functions do only minimal checking on their arguments. Further, they act as implementation helpers for device level methods and their behaviors are thus not intuitive except at the device level. For those reasons, they should never be used at any level other than the device-level.
The second group of functions can be called at any level.
The following operators are named by symbols exported from the excl package. They are loaded with
  (require :iodefs)
They are intended for implementing device-level functionality and should not be used except for that purpose.
A charpos slot exists in every simple-stream. Accessors are implemented for this slot via stream-line-column (which is setfable) and the initarg normally sets the slot to 0.
The intention of this slot is for use as a column indicator, when possible.
When the slot is nil, the column is unknown.
When the charpos slot is non-nil, character output functionalities have the effect of incrementing charpos. In many streams, a newline control-out handler will reset the charpos to 0. It is always set to nil when non-character write operations are performed on the stream.
Streams that need to support pretty printing must support an accurate charpos in order to generate correct pretty output. Most streams have control-out handlers which keep charpos accurate when newlines and/or tabs are processed.
For fastest write operations, charpos should be set to nil by device-open, and no control-out handlers which set charpos should be installed into the stream (otherwise the writing of (for example) a newline will cause charpos counting to resume).
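The charpos rules described in this section can be modeled in a few lines of portable Common Lisp. This is a toy model under stated assumptions: advance-charpos and column-after are hypothetical names, the newline-handler flag stands in for whether a newline control-out handler is installed, and tab handling is left out.

```lisp
;; Toy model of the column indicator; not the Allegro CL implementation.
(defun advance-charpos (charpos char &key (newline-handler t))
  "Return the column indicator after CHAR is written.
NIL means the column is unknown."
  (cond ((and newline-handler (char= char #\Newline)) 0) ; handler resets column
        ((null charpos) nil)                             ; unknown stays unknown
        (t (1+ charpos))))                               ; ordinary character

(defun column-after (string &key (charpos 0) (newline-handler t))
  "Return the column indicator after every character of STRING is written."
  (let ((col charpos))
    (loop for ch across string
          do (setf col (advance-charpos col ch :newline-handler newline-handler)))
    col))
```

With charpos initially nil and a newline handler installed, the first newline sets the column to 0 and counting resumes, which is the effect the last paragraph warns about. With no handler installed the column stays unknown.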
The following diagram shows the simple-stream hierarchy, somewhat simplified. To read this listing, note that it is in a simple tree structure. Every node has a list of subclasses immediately indented two spaces to the right and below it. Nodes with the labels sN (that is, s1, s2, etc., for streams), ssN (that is, ss1, ss2, etc., for simple-streams), or gN (that is, g1, g2, etc., for gray streams) are nodes with multiple inheritance, with class names defined somewhere below their first usage. Nodes with labels followed by class names provide the actual definition of those labels. Class names marked by (A) are autoloadable classes.
The Gray stream hierarchy is also illustrated.
Only streams named by exported symbols are included.
For simple-streams, the device-open options are listed for each class which is normally instantiatable. Simple-stream classes which do not have device-open options listed should not normally be subclassed. The [simple-open-options] are listed at the bottom of the diagram.
;;; The Simple Stream Hierarchystream
;; major mixins for dpAns:file-stream
s1 s2string-stream
s3 s4 ;; Simple-streams:simple-stream
probe-simple-stream
options: filenamesingle-channel-simple-stream
direct-simple-stream
buffer-input-simple-stream
options: buffer external-format start endbuffer-output-simple-stream
options: buffer external-format ss1null-simple-stream
options: external-format s1file-simple-stream
options: [simple-open-options] [filename is required] ss1mapped-file-simple-stream
(A) options: [simple-open-options, release-handle] [filename is required]dual-channel-simple-stream
terminal-simple-stream
options: [simple-open-options]socket-simple-stream
options: [simple-open-options]socket-base-simple-stream
options: [simple-open-options] excl::hiper-simple-stream s3string-simple-stream
composing-stream
options: [none]string-input-simple-stream
options: string start end ss2string-output-simple-stream
options: (string (make-string (device-buffer-length stream)))fill-pointer-output-simple-stream
options: (string (error ...)) excl::limited-string-output-simple-streamxp-simple-stream
options [none]annotation-output-simple-stream
ss2bidirectional-character-encapsulating-stream
options: (base-stream (error ...)) ;; Gray streams:fundamental-stream
(A)fundamental-input-stream
(A) g1 g3 g13fundamental-output-stream
(A) g2 g4 g14fundamental-character-stream
(A) g1fundamental-character-input-stream
(A) g4a g7 g16 g25 g26 g28 g29 g30 g31 g2fundamental-character-output-stream
(A) g4a g8 g17 g22 g26 g27 g28 g29echo-stream
(A) g30 g32fundamental-binary-stream
(A) g3fundamental-binary-input-stream
(A) g4a g10 g19 g26 g28 g30 g31concatenated-stream
g4fundamental-binary-output-stream
(A) g4a excl::null-stream (A) g11 g20 g26 excl::bdbv-socket-stream (A) g28synonym-stream
(A) g30two-way-stream
(A) g32broadcast-stream
(A) excl::binary-socket-stream (A) g10 g11 excl::socket-stream (A) g5 excl::input-socket-stream (A) g7input-terminal-stream
(A) g9 g10input-binary-socket-stream
(A) g12 g6 excl::output-socket-stream (A) g8output-terminal-stream
(A) g9bidirectional-terminal-stream
(A) g11output-binary-socket-stream
(A) g12bidirectional-binary-socket-stream
(A) excl::terminal-stream (A) s2 excl::file-gray-stream (A) g13 excl::input-file-stream (A) g15 g16 excl::character-input-file-stream (A) g18 g19 excl::binary-input-file-stream (A) g21 g14 excl::output-file-stream (A) g15 excl::bidirectional-file-stream (A) g18 g21 g17 excl::character-output-file-stream (A) g18 excl::character-bidirectional-file-stream (A) g20 excl::binary-output-file-stream (A) g21 excl::binary-bidirectional-file-stream (A) s4 excl::string-gray-stream (A) g22 excl::string-output-stream (A) g23 g24 ?? excl::stream-output-stream-circular g24 excl::fill-pointer-output-stream (A) g25 excl::string-input-stream (A) excl::annotation-encapsulation-mixin (A) g23 excl::string-output-with-encapsulated-annotation-stream (A) g27 excl::xp-stream (A)

;; simple-open-options:
  filename
  (direction :input)
  if-exists
  if-does-not-exist
  external-format -- always defaults to :default
  input-handle (see device-open)
  output-handle (see device-open)
  mapped (see device-open)
  fn-in (same as input-handle, for compatibility only)
  fn-out (same as output-handle, for compatibility only)
Much of this document, streams.htm, discusses CLOS techniques for customizing streams. Allegro CL supports another way to customize streams: encapsulation. Encapsulation is a kind of filtering approach; data flows through various streams which have been attached end-to-end, and those data are processed and possibly transformed at each stream stage. Encapsulation is a more modular approach. Each component stream can be simpler and, therefore, more widely applicable.
Both encapsulation and CLOS specialization techniques are compatible with each other, and can be used in combination to greatly enhance the data processing capabilities in Allegro CL.
In the following subsections we describe stream encapsulation and provide some examples.
We define the terms used in this section:
There are several slots in Allegro CL simple-streams which must be exposed in order for encapsulations to be allowed. The names of these slots are not (typically) exported. Here we describe those slots and their accessors, if they are defined.
excl::input-handle, excl::output-handle: These slots hold either fixnum values representing operating system file numbers, or else streams which are the encapsulated stream, or else nil. The handle slot for each direction must match the capability of the stream; i.e. if a stream is not open for input, input-handle must be nil. If the stream changes states while open, the handles must follow that state; e.g. if one direction of a socket stream is shut down, that corresponding handle must be set to nil. Accessors excl::stream-input-handle and excl::stream-output-handle are provided for these slots.
excl::melded-stream, excl::melding-base: These slots always hold streams, and allow for the special encapsulation style of composing external-formats. See Section 12.5 Encapsulating composing external-formats for further explanations.
excl::buffer, excl::out-buffer: These are the buffer slots. The out-buffer slot is used when two buffers might be used, to hold output data. The buffer slot is not called in-buffer because it sometimes acts as a bidirectional buffer; when there is only one buffer in the stream the buffer slot is the one that is used. The exception to this rule is for string-output-simple-stream which does use out-buffer. This allows streams to be subclassed on both string-input-simple-stream and string-output-simple-stream, without the strategies clashing for the two directions. No accessors are exported for these slots.
excl::buffer-ptr, excl::max-out-pos: These are the buffer maximums. When a device-read returns a number of octets read, the strategy usually adds the value to the start value it gave to device-read and sets buffer-ptr to that value. And for a dual-channel stream, max-out-pos is usually set at device-open time to the length of the buffer. There are a couple of differences between buffer-ptr and max-out-pos, which might suggest the reason for the different names given them:
- max-out-pos is usually set once, and not touched again. buffer-ptr is generally set before and/or after a device-read.
- max-out-pos always holds one greater than the current maximum index, whereas buffer-ptr might hold a -1 (representing an eof).
excl::buffpos, excl::outpos: These slots hold indices into buffer and out-buffer vectors, respectively. Ignoring overflows for the moment, the basic read-byte operation consists simply of doing an aref of the buffer at buffpos position, followed by an increment of buffpos. Likewise, the basic write-byte operation (on a dual-channel stream) consists of a setf of the aref of out-buffer at outpos, followed by an increment of outpos.
[Note: excl::buffer, excl::buffpos, and excl::buffer-ptr are defined for all simple-streams. excl::out-buffer, excl::outpos, and excl::max-out-pos are only defined for streams for which it makes sense, i.e. dual-channel and string-output simple-streams.]
To make the octet and character strategy work as efficiently as possible, the buffpos/buffer-ptr and outpos/max-out-pos pairs must always retain numeric values, and if there is any data remaining in buffer to be read or if there is any room in out-buffer for writing, the position slot will be less than the max slot. The complete basic strategy for reading and writing an octet takes advantage of this fact to provide a single-test instruction sequence for determining if the next character can be read (for which the answer should normally be "yes"). Thus, for example, a template for a strategy to read an octet has the following form (in unoptimized lisp pseudocode):
  (when (>= (sm buffpos stream) (sm buffer-ptr stream))
    ;; Buffer needs filling
    [various operations which may include device-read])
  (prog1 (aref (sm buffer stream) (sm buffpos stream))
    (incf (sm buffpos stream)))
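The template can be exercised with a self-contained toy model in portable Common Lisp. The structure toy-buffer and the functions toy-fill and toy-read-octet are hypothetical names for illustration only: toy-fill stands in for the buffer-filling step (where a real strategy might call device-read), and returning :eof is a simplification of the real end-of-file handling.

```lisp
;; Toy model of the single-test read strategy; not Allegro CL internals.
(defstruct toy-buffer
  (buffer (make-array 4 :element-type '(unsigned-byte 8)))
  (buffpos 0)       ; index of the next octet to return
  (buffer-ptr 0)    ; one past the last valid octet in BUFFER
  (source '()))     ; pending octets, standing in for the device

(defun toy-fill (tb)
  "Refill the buffer from the source and return the number of octets stored."
  (let ((n 0)
        (buf (toy-buffer-buffer tb)))
    (loop while (and (< n (length buf)) (toy-buffer-source tb))
          do (setf (aref buf n) (pop (toy-buffer-source tb)))
             (incf n))
    (setf (toy-buffer-buffpos tb) 0
          (toy-buffer-buffer-ptr tb) n)
    n))

(defun toy-read-octet (tb)
  "Return the next octet, or :EOF when the source is exhausted."
  ;; The common case costs a single comparison of the two index slots.
  (when (>= (toy-buffer-buffpos tb) (toy-buffer-buffer-ptr tb))
    (when (zerop (toy-fill tb))
      (return-from toy-read-octet :eof)))
  (prog1 (aref (toy-buffer-buffer tb) (toy-buffer-buffpos tb))
    (incf (toy-buffer-buffpos tb))))
```

Because both index slots always hold numbers, the test at the top of toy-read-octet is the only check made on the fast path, mirroring the single-test sequence described above.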
excl::last-char-read-size, excl::encapsulated-char-read-size: These two slots determine how much of the octet buffer constitutes a character for the purposes of unreading that character, and how much to copy back of the buffer contents to the beginning of the buffer before reading (via device-read) into the rest of the buffer. These slots always hold numeric values, and are reset to 0 by many synchronizing operations.
The slots listed below hold character-strategy functions, which can be built in any way desired, but which conform to the requirements of their respective functionality descriptions. Allegro CL internally uses the char-to-octets and octets-to-char macros to incorporate external-format processing as part of the stream's character-strategy functionality.
Holds a function which expects one argument which is the stream. The function returns nil if a read-char would hang, and non-nil if read-char would not hang. Note that the name j-listen does not imply Common Lisp functionality, which tends to be inconsistent in its definition for listening, especially in the face of multi-octet characters. Instead, j-listen is precisely defined to determine whether a complete character can be read by read-char (or if an error or an EOF might occur) and would probably better have been named j-no-hang-p instead. This also implies that in a multibyte character situation where only part of the character has become available, the j-listen function must return nil even if some of the octets were readable.
The j-listen function must always leave the stream in the same state as it started, including the unread-char character-length, so that an unread-char could be performed after a listen.
Holds a function which receives the following arguments: stream eof-error-p eof-value blocking.
The stream, eof-error-p, and eof-value arguments have the same semantics as the similarly named arguments to read-char. The blocking argument determines whether the operation may block; a nil value causes behavior equivalent to read-char-no-hang, and a non-nil value causes read-char behavior. Either enough octets are read to be converted to a character, or a character is read, and returned.
Holds a function which receives the following arguments: character stream.
The character and stream arguments have the same semantics as similarly named arguments to write-char. The j-write-char function is always assumed to be a blocking write (i.e. if writing cannot be done temporarily due to resource limitations, the function will wait for the resources to become freed). The character is written to the stream's buffer, after being converted to octets via the external-format, if the stream is octet-oriented. The character is returned from the function.
Holds a function which receives the following arguments: stream string search start end blocking.
The stream, start, and end arguments have similar semantics as similarly named arguments to read-sequence. The string argument can be any string, and is the string which will be filled with the characters formed either by reading the next character, or by reading of octets and converting to characters via the external format of the stream.
If search is nil, then only as many characters as can be read are in fact read, and 1 plus the index of the last character read is returned as the only value. (This document used to say the number of characters read was returned, but that was incorrect.)
If search is given, it should be a character. As each character is read, it is compared to the search value. If the character read doesn't match, it is transferred to the string, but if it matches, the reading stops with the matching character being consumed but not transferred to the string. At that time, the index of the next character to read is returned as the first value, and the second value returned is based on the success of the search; the three possible second values are nil (search character was not found), t (search character found), or :eof (end-of-file encountered). (This document used to say the number of characters transferred was returned, but that was incorrect.)
The blocking argument determines what kind of blocking to perform. A value of nil causes no blocking. A value of t causes blocking to always occur, and the operation will either complete or get an error. A value of :bnb causes blocking for the first character, but non-blocking for subsequent characters.
Holds a function which receives the following arguments: string stream start end.
The stream, start, and end arguments have similar semantics as similarly named arguments to read-sequence. The string argument can be any string, and is the string from which characters will be supplied to either be written through the stream or to be converted via the external-format of the stream and then stored as octets into the stream's buffer.
The j-write-chars function is always a blocking operation, and will always complete before returning or will error.
Holds a function which receives the following arguments: stream relaxed.
The stream argument has similar semantics as the stream argument to unread-char. The relaxed argument allows the unread to be performed without error, even if the last-char-read-size is 0. Normally, the only time that this will be necessary is when a hard or soft eof has been encountered during a multi-character composition, in which case the eof must be unread.
The j-unread-char function backs up one character in the stream. If the stream is an encapsulator, then this may mean backing up several characters in the stream it encapsulates.
For an octet-based stream, the last-char-read-size of the stream will determine how many octets to back up in the buffer in order to unread the character. That slot is set in one of two ways:
The protocols for setting and incrementing this slot ensure that the two methods do not conflict with each other.
Note that there is no character argument to this unread-char implementation. Due to the buffering of Allegro CL streams, and due to the difficulty in specifying an eof condition as a unique character (for the purposes of unreading), the Allegro CL implementation of unread-char does not actually use the character argument, but instead checks it for validity and then ignores it, because the character is always available in the stream's buffer.
Holds a function which receives the following arguments: stream eof-error-p eof-value.
The stream, eof-error-p, and eof-value arguments have the same semantics of similarly named arguments to read-byte. Either enough octets are read to be converted to a byte, or a byte is read, and returned.
excl::j-read-byte is not really a character strategy. It was added after the other slots (in release 7.0) in order to handle bivalent encapsulating streams, which may not have a buffer associated with them.
Holds a function which receives the following arguments: byte stream.
The byte and stream arguments have the same semantics as similarly named arguments to write-byte. The j-write-byte function is always assumed to be a blocking write (i.e. if writing cannot be done temporarily due to resource limitations, the function will wait for the resources to become freed). The byte is written to the stream's buffer, after being converted to octets, if the stream is octet-oriented. The byte is returned from the function.
The j-write-byte function is always a blocking operation, and will always complete before returning or will error.
excl::j-write-byte is not really a character strategy. It was added after the other slots (in release 7.0) in order to handle bivalent encapsulating streams, which may not have a buffer associated with them.
Both single-channel-simple-streams and dual-channel-simple-streams are octet-oriented, that is, they employ octet buffers internally. String streams are character-oriented streams, which means that their internal buffers are strings. Various implications can be made from these definitions, and these implications affect how encapsulations can be made:
string-stream, but is not used for any of the defined character-strategies; the point of string streams is to bypass overhead needed to transform characters into octets and back again.
Encapsulations can be set up in a couple of ways. If we label character-oriented streams with the letter C, and octet-oriented streams with the letter O, then the following kinds of encapsulation configurations can be done:
Simple unencapsulated string stream: prog - C - string
Simple unencapsulated octet stream, externally connected: prog - O - i/o
Simple unencapsulated octet stream, internally connected: prog - O - octet-buffer
Encapsulations on internally connected string streams: prog - C - C - ... - C - string
Encapsulations on internally connected octet streams: prog - C - C - ... - C - O - ... - O - buffer
Encapsulations on externally connected octet streams: prog - C - C - ... - C - O - ... - O - i/o
In other words, a character-oriented stream may encapsulate any kind of stream, and an octet-oriented stream may be encapsulated by any kind of stream, but an octet-oriented stream cannot encapsulate a character-oriented stream.
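The rule above can be checked mechanically. The following is a minimal sketch in portable Common Lisp, not part of Allegro CL: the function name and the representation of a chain as a list of :c and :o keywords (written from the program side outward, as in the diagrams above) are assumptions made for illustration.

```lisp
;; Hypothetical helper, not an Allegro CL function.  A chain is listed from
;; the program side outward, e.g. (:c :c :o :o) for prog - C - C - O - O - i/o.
(defun valid-encapsulation-chain-p (chain)
  "Return true if no octet-oriented (:o) stream encapsulates a
character-oriented (:c) stream, i.e. no :c appears after an :o."
  (let ((seen-octet nil))
    (dolist (kind chain t)
      (ecase kind
        (:o (setf seen-octet t))
        (:c (when seen-octet (return nil)))))))
```

For example, (valid-encapsulation-chain-p '(:c :c :o :o)) is true, while a chain such as (:c :o :c) is rejected because its octet-oriented stream would encapsulate a character-oriented one.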
Three examples are provided, which show how to encapsulate streams in different styles. (Only two of the examples are fully worked out.) The first shows a string-stream which uses two buffers and which thus allows bidirectional communication on dual-channel encapsulatees. The second is a single-buffer string stream which nevertheless allows bidirectional data to and from a single-channel stream like a file. The third example demonstrates an octet-based encapsulator, and it also demonstrates the ability to pick off bits in a stream and divide or combine them for presentation as octets in a stream at a higher level.
All of these examples share some common features:
(require :iodefs).
nil value from the device-open method. This will always result in a closed stream, since the shared-initialize :after method on simple-streams always calls device-close if the device-open method returns nil.
Rot13 is a simple translation technique often used to shield internet readers from potentially offensive text. The simple rule is that alphabetic characters are shifted by exactly 13 positions in the alphabet, wrapping around at the end. Because there are exactly 26 letters in the English alphabet, two such shifts reproduce the original text. Thus the original text is easy to get, but it takes a conscious act on the part of the reader to read such text.
This encapsulation example uses a bidirectional string stream with two buffers, and is intended to encapsulate another dual-buffer stream (either another encapsulating dual-buffer string stream or a dual-channel stream).
First, an example run:
cl-user(1): :cl examples/streams/rot13b
; Fast loading examples/streams/rot13b.fasl
cl-user(2): (setq xxx (make-instance 'rot13-bidirectional-stream :base-stream *terminal-io*))
#<rot13-bidirectional-stream "^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@" @ #x716fa842>
cl-user(3): (format xxx "hello")
uryyb
nil
cl-user(4): (format xxx "uryyb")
hello
nil
cl-user(5): (read-line xxx)
The quick brown fox jumped over the lazy dog.
"Gur dhvpx oebja sbk whzcrq bire gur ynml qbt."
nil
cl-user(6): (read-line xxx)
Gur dhvpx oebja sbk whzcrq bire gur ynml qbt.
"The quick brown fox jumped over the lazy dog."
nil
cl-user(7):
Note that the encapsulated stream being used above is *terminal-io*, which is the same as *standard-input* in this example transcript. This naturally causes the listener to wait for input after each (read-line xxx), just as (read-line *terminal-io*) or (read-line *standard-input*) would. Such behavior would not be seen if the stream being encapsulated were a socket or some other terminal stream.
The source code is available in [Allegro directory]/examples/streams/rot13b.cl. The important definitions are described below:
The class definition for rot13-bidirectional-stream has the new (in 8.0) bidirectional-character-encapsulating-stream as its superclass. A bidirectional stream has both the input and the output slots necessary for dual-buffer operation, and thus the standard string strategy functions will work. No other new slots are necessary in this class.
The code defines a device-read method to read the text to be transformed and a device-write method to write it out. Note that because this is a string-stream, it will not be dealing with external-formats and stream-external-format will return :default.
This is the basic workhorse routine for rot13. The input character is rotated within the alphabet if it is an alphabetic character, and left alone otherwise. Due to the nature of the rot13 algorithm, given any character char, (rotate-char (rotate-char char)) will always return char.
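The rotation described above can be sketched as follows. This is not the definition from rot13b.cl; it is a minimal version written for illustration, and it assumes the letters a-z and A-Z have contiguous character codes, as they do in ASCII-compatible implementations.

```lisp
;; Illustrative sketch only; the shipped rotate-char in rot13b.cl may differ.
(defun rotate-char (char)
  "Shift an alphabetic character 13 places, wrapping within its case.
Non-alphabetic characters are returned unchanged."
  (flet ((rot (base)
           ;; BASE is the code of #\a or #\A; rotate the offset modulo 26.
           (code-char (+ base (mod (+ 13 (- (char-code char) base)) 26)))))
    (cond ((char<= #\a char #\z) (rot (char-code #\a)))
          ((char<= #\A char #\Z) (rot (char-code #\A)))
          (t char))))
```

Mapping this function over "hello" gives "uryyb", matching the transcript above, and applying it twice to any character returns that character.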
Base64 encoding is part of the MIME specification as rfc1521 (see http://www.faqs.org/rfcs/rfc1521.html). It uses a limited character set to textually encode data of any kind, even if the transmission line is as narrow as six bits. This contributes to the universal usability of base64. This encoding is definitely not a compression technique; in fact data are expanded by about 33% when encoded, since every 3 octets of input become 4 characters of output.
This encapsulation example demonstrates the capability of simple-streams to work with sub-octet sizes, and for encapsulations to pass character data through octet streams. The example only implements the decoder on the input side of the stream. An encoder/write side could easily be written as well, but would complicate the example.
First, an example run:
;; Assume the file country.txt exists with the following contents:
cl-user(1): (shell "cat country.txt")
Now is the time
for all good people
to come to the aid
of their country.
0
cl-user(2): (shell "mpack -o country.mime -s test country.txt")
0
cl-user(3): (shell "cat country.mime")
Message-ID: <27716.1004255130@killer>
Mime-Version: 1.0
Subject: test
Content-Type: multipart/mixed; boundary="-"

This is a MIME encoded message. Decode it with "munpack" or any other MIME reading software. Mpack/munpack is available via anonymous FTP in ftp.andrew.cmu.edu:pub/mpack/
---
Content-Type: application/octet-stream; name="country.txt"
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename="country.txt"
Content-MD5: 9wLn2T6WejxfbCKZsS6ALw==

Tm93IGlzIHRoZSB0aW1lCmZvciBhbGwgZ29vZCBwZW9wbGUKdG8gY29tZSB0byB0aGUgYWlk
Cm9mIHRoZWlyIGNvdW50cnkuCg==

-----
0
cl-user(4): :cl examples/streams/base64
; Fast loading examples/streams/base64.fasl
cl-user(5): (setq yyy (open "country.mime"))
#<file-simple-stream #p"country.mime" for input pos 0 @ #x716fd2fa>
cl-user(6): (setq xxx (make-instance 'base64-reader-stream :base-stream yyy))
#<base64-reader-stream for input fd #<file-simple-stream #p"country.mime" for input pos 0> @ #x716ff572>
cl-user(7): (read-line xxx)
"Now is the time"
nil
cl-user(8): (read-line xxx)
"for all good people"
nil
cl-user(9): (read-line xxx)
"to come to the aid"
nil
cl-user(10): (read-line xxx)
"of their country."
nil
cl-user(11): (read-line xxx)
Error: eof encountered on stream #<base64-reader-stream for input fd #<file-simple-stream #p"country.mime" for input pos 585> @ #x716ff572>
  [condition type: end-of-file]

Restart actions (select using :continue):
 0: Return to Top Level (an "abort" restart).
 1: Abort entirely from this process.
[1] cl-user(12):
The algorithm of this encapsulator is a simplistic one; decoding doesn't start until just after the first blank line after a line which starts "Content-Transfer-Encoding: base64". Decoding stops again when a line starts with a space or a dash.
The source code is available in [Allegro directory]/examples/streams/base64.cl. The important definitions are described below:
The class definition for base64-reader-stream adds four new slots: two are used for unconverted raw data, and two are used to prime the stream to start decoding the base64. The raw-data slot will contain a small string, and the raw-count slot tracks how much of this string has been moved to the stream's buffer. The primed slot will be set to nil (for unprimed), to 'primed, which means that the priming string has been seen, or to 'ready, which means that the blank line after the priming string has already been seen.
The device-open method follows the standard from-scratch style, as described in Section 10.3 From-scratch device-open. In addition, the four new slots for this class are initialized to a starting point, and with a raw string that will be used.
The device-buffer-length method defined for this stream class allows a smaller raw string buffer to be used, since rfc1521 limits the length of a valid line to 76 characters.
This variable holds a table, built algorithmically, which allows the conversion from base64 characters to their respective 6-bit codes.
The device-read method returns as much converted data as possible, up to the requested amount. Portions of unconverted raw data are retrieved by calling get-some-raw-data, and if successful, an algorithm is performed to decode that raw data - each 4 octet set of unconverted data is changed to 3 octets of converted data. Any extra octets modulo 4 that have been read are moved to the beginning of the raw buffer and are not decoded until any further reads obtain the full 4-octet package.
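The table and the 4-to-3 conversion described above can be sketched in portable Common Lisp. This is not the code from base64.cl: the names *base64-alphabet*, *decode-table*, decode-quad and decode-available are hypothetical, and the sketch assumes its input has already been reduced to valid base64 characters with ASCII codes.

```lisp
(defparameter *base64-alphabet*
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")

;; Hypothetical name: a reverse table built algorithmically, mapping a
;; character code to its 6-bit value (NIL for non-base64 characters).
(defparameter *decode-table*
  (let ((table (make-array 128 :initial-element nil)))
    (dotimes (i 64 table)
      (setf (aref table (char-code (char *base64-alphabet* i))) i))))

(defun decode-quad (string start)
  "Decode the 4 base64 characters at START into a list of 1 to 3 octets."
  (let ((bits 0) (pad 0))
    (dotimes (i 4)
      (let ((ch (char string (+ start i))))
        (setf bits (ash bits 6))            ; make room for the next 6 bits
        (if (char= ch #\=)
            (incf pad)                      ; padding contributes no data
            (setf bits (logior bits (aref *decode-table* (char-code ch)))))))
    ;; 24 bits hold up to 3 octets; each pad character drops one octet.
    (subseq (list (ldb (byte 8 16) bits)
                  (ldb (byte 8 8) bits)
                  (ldb (byte 8 0) bits))
            0 (- 3 pad))))

(defun decode-available (string)
  "Decode every complete quad in STRING.  Returns two values: the decoded
octets, and the leftover (fewer than 4) characters to keep for the next read."
  (let* ((usable (* 4 (floor (length string) 4)))
         (octets (loop for i from 0 below usable by 4
                       append (decode-quad string i))))
    (values octets (subseq string usable))))
```

With the input "Tm93IGlzIH", decode-available returns the octets of "Now is" and the leftover string "IH", which corresponds to the extra characters that the text says are moved to the beginning of the raw buffer.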
This function calls j-read-chars to get as much data as possible, up to the count (but not more than the raw-buffer size). It loops until it can't read anymore, or until it has filled at least some data into the raw-data buffer.
Any data that don't represent the actual base64 text are discarded. This is done via a state machine, which must be primed and triggered before the buffer is actually filled. The primed slot is normally nil when no base64 encoding is being done, and is primed to the value 'primed by the function prime-it (described below). Once the stream is primed, a blank line is needed to actually trigger the state machine and to put it into 'ready state, after which the buffer can be filled. Once the state is ready, characters are read into the raw-buffer until either a #\Newline is encountered, or until the state is changed again to nil by the occurrence either of another blank line or a dash character (#\-).
The state remains unchanged from call to call, in case no data (or even no header) are read.
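The three states can be modeled with a small transition function. This is a simplified, line-at-a-time sketch rather than the character-level code in base64.cl; the names next-state and base64-lines are hypothetical, and the stop condition follows the description above (a blank line or a line starting with a dash).

```lisp
(defun next-state (state line)
  "Hypothetical transition function for the primed slot.  STATE is NIL,
PRIMED or READY; LINE is one input line without its newline."
  (let ((header "Content-Transfer-Encoding: base64"))
    (ecase state
      ((nil) (if (and (>= (length line) (length header))
                      (string= header line :end2 (length header)))
                 'primed
                 nil))
      ;; Once primed, only a blank line triggers decoding.
      (primed (if (zerop (length line)) 'ready 'primed))
      ;; While ready, a blank line or a leading dash ends decoding.
      (ready (if (or (zerop (length line))
                     (char= (char line 0) #\-))
                 nil
                 'ready)))))

(defun base64-lines (lines)
  "Return the lines that would be decoded, i.e. those read in the READY state."
  (let ((state nil) (result '()))
    (dolist (line lines (nreverse result))
      (let ((new (next-state state line)))
        (when (and (eq state 'ready) (eq new 'ready))
          (push line result))
        (setf state new)))))
```

Fed the lines of a message like the one in the transcript, base64-lines keeps only the encoded lines between the blank line after the header and the terminating blank or dash line.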
This function's purpose is to match arbitrary input to the string "Content-Transfer-Encoding: base64". Such matching can be done one character at a time or from multiple characters, using the prime-count slot which keeps track of the current number of matched characters from previous calls. New data in the buffer argument always starts at the 0th element.
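Matching across calls with a running count can be sketched as below. This is not the prime-it source; the names *prime-string* and advance-match are hypothetical. The simple restart on mismatch is adequate here only because the first character of the target string, an upper-case C, does not occur again inside it; a general pattern would need a more careful fallback.

```lisp
(defparameter *prime-string* "Content-Transfer-Encoding: base64")

(defun advance-match (buffer end prime-count)
  "Feed the first END characters of BUFFER to the matcher.  PRIME-COUNT is
the number of characters of *PRIME-STRING* matched by previous calls.
Returns the new count; (length *prime-string*) means a full match."
  (dotimes (i end prime-count)
    (when (= prime-count (length *prime-string*))
      (return prime-count))                 ; already fully matched
    (let ((ch (char buffer i)))
      (setf prime-count
            (cond ((char= ch (char *prime-string* prime-count))
                   (1+ prime-count))
                  ;; A mismatching character may itself start a new match.
                  ((char= ch (char *prime-string* 0)) 1)
                  (t 0))))))
```

A caller would store the returned count (in the prime-count slot, in the terms of the text) and pass it back on the next call, so that a header split across two reads is still recognized.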
As it exists, the base64 example is a toy only. There are a number of things that might be done to make it less of a toy:
nil in such a circumstance, even though at least one of the result octets should have been decodable.
It is possible to create composed external-formats using encapsulating streams. (See Composed External-Formats in iacl.htm for a description of composed external-formats.) Encapsulated-based composed external-formats operate by melding two or more streams together. This is as opposed to macro-based composed external-formats, which operate by combining the composer and composee external-format conversion macros to create a single new external-format conversion macro. Macro-based composed external-formats are defined using compose-external-formats. At this time, functions similar to compose-external-formats which define encapsulated-based composed external-formats are not available, but are planned for future Allegro CL releases.
Allegro CL includes an encapsulated-based composing external-format for translating Ascii return/linefeed octet codes into Common Lisp #\Newline characters. This external-format is called :e-crlf. Its implementation is described in detail in this section.
find-external-format accepts a two-element list argument. The first element names an encapsulated-based composer external-format, and the second element names an external-format. Note that the second element itself can be a list, thus recursively denoting an inner composition.
Thus, (find-external-format '(:e-crlf :foo-base-ef)) returns an encapsulated-based external-format which is the same as the :foo-base-ef external-format except that the Common Lisp #\Newline character is converted to Ascii return/linefeed codes and vice-versa. (Such external-formats are useful/necessary for text files native to DOS/Windows which use the two Ascii octet codes to terminate lines.)
Because a stream's external-format can be switched dynamically, the style of stream encapsulation for external-formats is much different than the normal encapsulation style of attaching a stream handle. One reason for the difference is that normal encapsulations can cause identity confusion. If stream xxx is opened and then encapsulated by stream yyy, then any variables that once referred to xxx would have to be changed to refer to yyy instead. If a user does
(setf (stream-external-format xxx) (find-external-format '(:e-crlf :foo-base-ef)))
or, equivalently,
(setf (stream-external-format xxx) '(:e-crlf :foo-base-ef))
then the identity and class of the stream which xxx holds must not change.
In external-format encapsulation, the melded-stream slot, rather than the handle slots, contains the next stream in a composition. In addition, the melding-base slot always contains the base stream of the composition (which is the stream that retains its identity no matter what the composition looks like).
The composing-stream class is introduced to implement the external-format composition. It has string-stream as superclass, and has no extra slots. As an exception to the general rule that all streams are buffered, composing-streams are not buffered themselves, but read and write one character at a time. A specialization on a composing-stream might have tables to work with, such as translation tables, in the same way that encapsulating streams use them, but any buffering is done vicariously through its base stream.
It is easiest to explain the intricacies of the composing-external-format model by demonstrating what the model would have looked like if the identity problem had not in fact been a problem. Note that this entire section below is for demonstration purposes only, and does not describe the actual composition structure.
A composing-external-format like (:e-crlf :foo-base-ef) would have been represented as two streams: the base stream, and a composing-external-format stream whose melded-stream and melding-base both contain the base stream. The composing-external-format stream would contain the composing external-format character strategy functions (in this case, for example, the j-read-char slot would contain #'crlf-read-char) and the base stream would contain its own strategy functions.
Reading a character would consist of funcall'ing the j-read-char function of the composing-stream. That function, crlf-read-char, would read and/or unread individual characters from the melded-stream (i.e. the file stream) by funcalling the j-read-char and j-unread-char functions of that stream as appropriate, and by then combining those characters (this composing-external-format will combine #\Return followed by #\Linefeed into a single #\Newline character). The j-read-char function of the file stream would get its characters in the usual way, by operating on its own buffer (and possibly thus calling one of the device functions).
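The read-and-combine behavior can be illustrated with the standard Common Lisp stream functions. This is only a sketch of the combining logic, not excl::crlf-read-char: the function name is hypothetical, it uses read-char and unread-char on an ordinary stream in place of the j-read-char and j-unread-char strategy functions, and it returns :eof instead of handling the eof-error-p and eof-value arguments.

```lisp
(defun crlf-read-char-sketch (stream)
  "Read one character from STREAM, folding a Return/Linefeed pair into a
single Newline.  Returns :EOF at end of file."
  (let ((ch (read-char stream nil :eof)))
    (if (eql ch #\Return)
        (let ((next (read-char stream nil :eof)))
          (cond ((eql next #\Linefeed) #\Newline) ; pair found: combine
                ((eq next :eof) ch)               ; lone Return at eof
                (t (unread-char next stream)      ; not a pair: put it back
                   ch)))
        ch)))
```

A lone #\Return that is not followed by #\Linefeed is passed through unchanged, and the character read after it is pushed back so that the next call sees it.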
Although Allegro CL includes only two composing-formats, :e-crlf and the less commonly used :e-crcrlf (see #\newline discussion in iacl.htm for more about the crcrlf external-format), it is possible that new composing-formats will be defined in the future. For example, if some kind of ligature-combining external-format called, say, :e-ligature were created (though such an external-format currently does not exist) then it could be combined with others via
(setf (stream-external-format xxx) '(:e-crlf (:e-ligature :foo-base-ef)))
In our hypothetical architecture, we would have set this up as three streams; if xxx is the stream with the :foo-base-ef strategies and the buffer, and yyy is the composing stream with crlf strategies, and if zzz is the composing stream with the (hypothetical) ligature strategies, then the above setf would attach the streams as follows:
yyy -> zzz -> xxx
where the arrow represents the connection made by the melded-stream slot. A read-char on yyy would funcall yyy's j-read-char, which would combine characters obtained by calling j-read-char on its melded-stream zzz, which in turn would return characters formed by combining or expanding ligatures recognized by reading characters from its melded-stream xxx, which finally obtains its characters from its buffer (filling it if necessary).
In reality, the above straw model using a hypothetical architecture will not work, because the setting of external-formats should never change the identity of a stream. The above hypothetical architecture requires that the read-char pass yyy as the stream, but identity requirements need xxx to be the stream to be passed to read-char. Thus, the real model is somewhat convoluted; if we still keep our hypothetical :e-ligature external-format, then if
(setf (stream-external-format xxx) '(:e-crlf (:e-ligature :foo-base-ef)))
is done, with xxx as the same base stream as before, and if we then say
(with-stream-class (stream)
  (setq yyy (sm melded-stream xxx))
  (setq zzz (sm melded-stream yyy)))
then the following picture would apply:
xxx -> yyy -> zzz
file-simple-stream whose melding-base is itself, and whose melded-stream is yyy. Also, xxx's character-strategy slots will be filled with :e-crlf strategies (e.g. #'excl::crlf-read-char, etc.)
#'(efft sc-read-char :foo-base-ef)].
Finally, xxx's external-format slot will contain the results of
(find-external-format '(:e-crlf (:e-ligature :foo-base-ef)))
and yyy's external-format slot will contain the results of
(find-external-format '(:e-ligature :foo-base-ef))
and zzz's external-format slot will contain the results of
(find-external-format :foo-base-ef)
The rotation of the strategy functions implies that all strategy functions must be aware of the rotation and must consider where the respective slots are in these cycles; all external-format related slots (such as those for holding last character octet size and external-format state information) are in the current stream, whereas stream-related slots (such as buffers, pointers, etc) are in the melded-stream (the next stream down in the cycle). Highest-level functionality such as dribble, control-handlers, etc, always remains in the base stream.
Note also that since this kind of "melding" encapsulation is in a different direction than regular encapsulation, the handles of the base-stream are not modified and point to the next "real" encapsulation outward.
So all strategies everywhere take an indirection through the melded-stream slot, to get to the next stream down for its real operation. A standard pattern of coding in use is shown in this example:
(defun crlf-read-char (e-stream eof-error-p eof-value block)
  (with-stream-class (stream e-stream)
    (let ((stream (sm melded-stream e-stream))
      ...
So in fact the j-read-char in xxx is going to have xxx as e-stream, and yyy as stream.
As an example, consider the crlf algorithm:
The crlf-read-char strategy does this by calling j-read-char and j-unread-char on stream (rather than on e-stream). Note that if there is a third stage, yyy's strategies will be ligature strategies, but will end up operating by calling j-read-char on zzz, which are in fact the base-stream strategies. Now, remembering that all strategies take the indirection, the base-stream strategies will take the melded-stream slot of zzz which is xxx, which has the buffers which these strategies expect.
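The indirection can be seen in a toy model of the two-stage case (:e-crlf over a base external-format). Everything here is a hypothetical stand-in: the node structure, its slots and the toy- functions only imitate the slots and strategies named in the text, and eof handling is reduced to returning :eof. The base stream xxx holds the buffer and the crlf strategies, yyy holds the base strategies, and yyy's melded-stream points back at xxx.

```lisp
;; Toy model only, not Allegro CL definitions.
(defstruct node name j-read-char j-unread-char melded-stream buffer (pos 0))

(defun toy-base-read-char (e-stream)
  ;; Base strategies take the indirection too: the buffer they use lives
  ;; in the node reached through melded-stream (here, the base stream).
  (let ((stream (node-melded-stream e-stream)))
    (if (< (node-pos stream) (length (node-buffer stream)))
        (prog1 (char (node-buffer stream) (node-pos stream))
          (incf (node-pos stream)))
        :eof)))

(defun toy-base-unread-char (e-stream)
  (decf (node-pos (node-melded-stream e-stream))))

(defun toy-crlf-read-char (e-stream)
  ;; Operates on STREAM (the next node down), never on E-STREAM itself.
  (let* ((stream (node-melded-stream e-stream))
         (ch (funcall (node-j-read-char stream) stream)))
    (if (eql ch #\Return)
        (let ((next (funcall (node-j-read-char stream) stream)))
          (cond ((eql next #\Linefeed) #\Newline)
                ((eq next :eof) ch)
                (t (funcall (node-j-unread-char stream) stream) ch)))
        ch)))

(defun make-melded-pair (string)
  "Build the cycle xxx -> yyy -> xxx and return xxx, the base stream.
The result is circular: bind *print-circle* to T before printing it."
  (let* ((xxx (make-node :name 'xxx :buffer string
                         :j-read-char #'toy-crlf-read-char))
         (yyy (make-node :name 'yyy
                         :j-read-char #'toy-base-read-char
                         :j-unread-char #'toy-base-unread-char
                         :melded-stream xxx)))
    (setf (node-melded-stream xxx) yyy)
    xxx))
```

Reading is always started on xxx, as in (funcall (node-j-read-char xxx) xxx), so the caller never needs to know that yyy exists; this mirrors the identity requirement described above.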
There are built-in methods for stream classes. Some are described in the following subsections.
The print-object method affects all Common Lisp objects, and streams are no different. Of the two kinds of streams, Gray and Simple, the simple-streams usage of print-object is the most involved.
Some of the general items that might be printed in a stream print-object method are listed by name in each individual description, and might be:
The action of the method on different stream classes is as follows:
excl::file-gray-stream or excl::socket-stream, the general items (listed above -- printing status, open status and position) are also printed.
simple-stream which is the default simple-stream method; it prints the stream unreadably with the text "[not completely built]". This method should always be overridden; its presence indicates a missing print-object method on the simple-stream class. Most simple-streams have their own print-object methods, which will only include the "[not completely built]" text if the stream does not yet have the correct state after the device-open method has fully completed its tasks. If there is not a print-object method for that simple-stream class, however, this default method indicates that the stream is not (and will never be) completely built - it represents a design error in the simple-stream class.
The three major subclasses of simple-stream treat print-object slightly differently:
dual-channel-simple-stream: this class is listed first because it has a print-object method directly on it; no print-object methods need be provided for subclasses of dual-channel simple-streams. This print-object method prints the relevant general items listed above: printing status and open status (position is never printed in a dual-channel stream because dual-channel streams don't tend to have positions). In addition, if the stream has an input or output handle (or both) they are printed as "fd: <handle(s)>" where <handle(s)> is one or both handles separated by /.
single-channel-simple-stream: because single-channel streams are so diverse, there is not one style of printing that can handle all situations, so of the single-channel-simple-streams listed in Section 11.0 The simple-stream class hierarchy illustrated, only the following (and their subclasses) have print-object methods; if subclassing on any other single-channel stream is desired, a print-object method may need to be provided as part of the stream implementation:
file-simple-stream: this method provides the general items: printing status, open status, position, as well as the filename and "mapped" if the file is a mapped file.
direct-simple-stream: those direct-simple-streams which are not also mapped-files tend to be buffer streams. The print-object method for these streams is simplistic, providing only printing-status and position if applicable. More complex direct simple-streams should provide their own print-object methods.
probe-simple-stream: although this class is a subclass of file-simple-stream, it only prints the filename.
string-simple-stream: although it might seem that string-simple-streams should be consistent, they are not; the class is the only class which provides completely transparent transfer of characters between the API and the device. So only those streams which are listed below (and their subclasses) have print-object methods:
string-input-simple-stream: provides only the position general item (of the general items listed above). In addition, a sample of the current contents of the string buffer is printed as a string; if the string is 15 characters or longer, ... is printed at the end.
string-output-simple-stream: provides only the position general item (of the general items listed above). In addition, the printing status general item is provided, but instead of showing an underlying handle, the actual string being built is shown, with ellipses for strings 15 characters or more.
composing-stream: provides only the position general item (of the general items listed above). Also, if the stream has a handle it prints "encapsulated by <encapsulator>" where <encapsulator> is the stream encapsulating this one. Also, if the encapsulating stream is a file or socket, the name is included.
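Where a stream subclass needs its own print-object method, as noted above for single-channel streams, the method can follow the usual Common Lisp pattern built on print-unreadable-object. The sketch below uses a hypothetical stand-in class, toy-string-stream, rather than a real simple-stream class, and the truncation to the first 15 characters is an assumption modeled on the description of string streams above.

```lisp
;; TOY-STRING-STREAM is a hypothetical stand-in, not an Allegro CL stream class.
(defclass toy-string-stream ()
  ((buffer :initarg :buffer :reader toy-buffer)
   (pos :initarg :pos :initform 0 :reader toy-pos)))

(defmethod print-object ((stream toy-string-stream) out)
  ;; Prints unreadably as #<type "sample" pos N @ identity>.
  (print-unreadable-object (stream out :type t :identity t)
    (let ((text (toy-buffer stream)))
      (format out "~S pos ~D"
              (if (>= (length text) 15)
                  ;; Long contents: show a sample followed by an ellipsis.
                  (concatenate 'string (subseq text 0 15) "...")
                  text)
              (toy-pos stream)))))
```

A method for a real stream class would also report whichever of the general items (printing status, open status, position) apply to that class.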
Ever since the inception of simple-streams, there has been a small need for the capability to unread individual octets from a stream. When the ANSI CL spec was being developed around 1990, multi-octet characters were a relatively new concept, and so Common Lisp still held to the concept that the character was the basic unit of data transfer, and that the "byte" was a variable size. But nowadays, the roles have reversed, and we tend to use the term "octet" (8-bit byte) to refer to what most people think of as bytes, and characters have become the larger unit, sometimes taking 3, 4, or even more octets to encode in various encoding systems, which are handled in Allegro CL by external-formats.
We have always had the peek-char and unread-char functions, and because at least one external-format (named :latin1-base, also nicknamed :octets) draws a one-to-one correspondence between characters and octets upon reading, peek-char and unread-char can be used indirectly to peek at and unread bytes. However, in situations where external-formats other than :latin1-base are being used by default, this poses the inconvenience of having to switch external-formats to :latin1-base and then back again every time it is desired to unread a single octet. So these two functions are provided which act on octets only and which do not require an external-format change:
Copyright (c) 1998-2022, Franz Inc. Lafayette, CA., USA. All rights reserved.
This page was not revised from the 10.0 page.
Created 2019.8.20.