MacroPackage: exclToCDocOverviewCGDocRelNotesFAQIndexPermutedIndex
Allegro CL version 10.1
Unrevised from 10.0 to 10.1.
10.0 version

def-octets-to-char-macro

Arguments: name arglist &body body

name should be a symbol, arglist the macro lambda list (state-loc &key get-next-octet octets-count-loc unget-octets external-format) (this argument is present to make a def-octets-to-char-macro form similar to a defmacro form, the required argument may have any name of course.). body should be a list of forms.

def-octets-to-char-macro is just like defmacro except that name names an external-format, and the actual macro name (a gensym) for the macro being defined is stored in the external-format's octets-to-char-macro slot. Also, if no external-format with name name exists, then a new external-format with name name is created.

The macro being defined must accept one required argument (which we call state-loc but may, of course, have any name), and the following keyword arguments:

Note external-format is used only if the external-format is a wrapper external-format (that is, a composing external-format).

The state-loc should be a place argument acceptable as the first argument to setf. It is used by the convertor for holding state information. The state should be immutable so it can be copied. In other words, one should not use a structure for a state and then change the state using setf.

The get-next-octet argument to the macro being defined is an expression that is invoked in the macro's expansion at the point or points where the translation procedure requests the next external element (octet) to be input.

The octets-count-loc argument should be a place argument acceptable as the first argument to setf. It is set by the convertor to hold the number of octets consumed in order to create the character being returned. The octets-to-char caller must initialize this argument to 1 (the number one). Thus, if the get-next-octet routine is invoked exactly once by the octets-to-char macro, the octets-count-loc argument is left unaltered. If get-next-octets is invoked more than once, octets-count-loc must be incremented before each call to get-next-octets.

If the external-format convertor does not use all the octets retrieved by get-next-octet (ie, it was peeking ahead), then octets-count-loc must be decremented so that its value reflects the actual count of octets used.

In addition to decrementing the octets-count-loc when not all acquired octets are used, the unget-octets argument is also used to inform the octets-to-char caller to place the unused octets back into the sequence for future calls to octets-to-char. unget-octets is a single-argument expression that is invoked in the macro's expansion at the point or points where the translation procedure requests that recently received octets be "un-received". The argument to unget-octets is the count of the most recently received octets to be considered un-received.

If the external-format named by name is a wrapper external-format, then the external-format argument to the macro being defined is the name of the external-format being wrapped.

The defined macro returns a single value, a character. If an error occurs during the conversion, a condition object is returned instead of the values. It is intended that the macro not signal errors directly but that the caller of the chars-to-octets macro (where this macro definition is used) be responsible for signaling any errors.

Examples:

;; Defines an octets -> char procedure for latin1 characters.
;;
;; Note: There's no change to octets-count-loc since it is initialized to 
;; 1 (the number one) and that's as many octets as we take.
;;
(def-octets-to-char-macro :latin1-base (state-loc
					&key get-next-octet external-format
					     octets-count-loc unget-octets)
  (declare (ignore external-format state-loc octets-count-loc unget-octets))
  `(code-char ,get-next-octet))


;; Defines an octets -> char procedure for utf8.
;;
(def-octets-to-char-macro :utf8-base (state-loc
				      &key get-next-octet external-format
				           octets-count-loc unget-octets)
  (declare (ignore external-format state-loc unget-octets))
  (let ((code-var (gensym)))
    `(let ((,code-var ,get-next-octet))
       (if* (< ,code-var #x7f)
	  then (code-char ,code-var)
	elseif (eql #xf0 (logand #xf0 ,code-var))
	  then ;; 32-bit unicode value, which we don't support.
	       (dotimes (i 3)
		 (incf ,octets-count-loc)
		 ,get-next-octet)
	       #\?
	elseif (eql #xe0 (logand #xe0 ,code-var))
	  then (sys::fixnum-to-char
		(logior (ash (logand #x0f ,code-var) 12)
			(progn (incf ,octets-count-loc)
			       (ash (logand #x3f ,get-next-octet) 6))
			(progn (incf ,octets-count-loc)
			       (logand #x3f ,get-next-octet))))
	  else (sys::fixnum-to-char
		(logior (ash (logand #x1f ,code-var) 6)
			(progn (incf ,octets-count-loc)
			       (logand #x3f ,get-next-octet))))))))

;; Defines a wrapper external-format that looks for ASCII <CR> followed by
;; <LF> to combine into #\newline.
;;
(def-octets-to-char-macro :crlf (state
				 &key get-next-octet external-format
				 octets-count-loc unget-octets)
  (let ((char1-var (gensym))		; first char.
	(char2-var (gensym))		; second char.
	(state0-var (gensym))		; original state.
	(state1-var (gensym))		; after first char state.
	(state2-var (gensym))		; after second char state.
	(char1-size-var (gensym))	; first char size.
	(char2-size-var (gensym)))	; second char size.
    `(let* ((,state0-var ,state)
	    (,state1-var ,state0-var))
       (let ((,char1-var (octets-to-char ,external-format ,state1-var
					 :octets-count-loc ,octets-count-loc
					 :unget-octets ,unget-octets
					 :get-next-octet ,get-next-octet
					 ;; Pass eof through
					 :oc-eof-macro nil)))
	 (if* (eq #\return ,char1-var)
	    then (let* ((,char1-size-var ,octets-count-loc)
			(,state2-var ,state1-var)
			(,char2-var
			 (catch ':crlf-eof-catch-tag
			   (incf ,octets-count-loc) ; pre-increment
			   (octets-to-char
			    ,external-format ,state2-var
			    :octets-count-loc ,octets-count-loc
			    :unget-octets ,unget-octets
			    :get-next-octet ,get-next-octet
			    :oc-eof-macro ((hard-eof)
					   `(throw ':crlf-eof-catch-tag
					      ,hard-eof))))))
		   (declare (type adim ,char1-size-var))
		   (if* (eq #\linefeed ,char2-var)
		      then (unless (eq ,state2-var ,state0-var)
			     (setf ,state ,state2-var))
			   #\newline
		    elseif ,char2-var
		      then ;; char2 is a character.  Peel back to char1, which
			   ;; is #\return.
			   (let ((,char2-size-var (- ,octets-count-loc
						     ,char1-size-var)))
			     (declare (type adim ,char2-size-var))
			     (,unget-octets ,char2-size-var)
			     (decf ,octets-count-loc ,char2-size-var)
			     (unless (eq ,state1-var ,state0-var)
			       (setf ,state ,state1-var))
			     #\return)
		      else ;; char2 indicates soft eof.  Peel back everything
			   ;; and pass on the soft eof.
			   (,unget-octets ,octets-count-loc)
			   (setf ,octets-count-loc 0)
			   (oc-eof-macro nil)
			   (error "
 The crlf external-format was not terminated by a call to oc-eof-macro")))
	    else (unless (eq ,state1-var ,state0-var)
		   (setf ,state ,state1-var))
		 ,char1-var)))))

;; The following uses the above defined wrapper external-format to create a
;; octets -> char transform of latin1 to crlf-latin1.
;;
;; Note: Using (compose-external-formats :crlf :latin1) achieves as one of its
;; side-effects the same effect as evaluating the following:
;;
(def-octets-to-char-macro :crlf-latin1-base (char state
                                             &key get-next-octet
					          external-format
						  octets-count-loc
						  unget-octets)
  (declare (ignore external-format))
  `(octets-to-char :crlf ,char ,state
                   :get-next-octet ,get-next-octet
		   :external-format :latin1-base))

See also octets-to-char.

See iacl.htm for more information on international character support in Allegro CL.


Copyright (c) 1998-2022, Franz Inc. Lafayette, CA., USA. All rights reserved.
This page was not revised from the 10.0 page.
Created 2019.8.20.

ToCDocOverviewCGDocRelNotesFAQIndexPermutedIndex
Allegro CL version 10.1
Unrevised from 10.0 to 10.1.
10.0 version