2000Oct31 Update to version 1.6.1 - ACL 6.0
2004Jun12 Update to 1.6.3 - ACL 7.0 and 8.0
The purpose of this tool is to facilitate the creation of an interface between Allegro CL and a library of C functions and subprograms.
The C library is typically defined by one or more header files that declare the functions and variables that make up the interface and define named constants significant at the interface. This tool parses the header files and generates appropriate Lisp and C code to create the interface. In some cases, warning messages are generated to point out areas that may need additional programmer intervention.
The Binder tools assume that the (header file) input is a complete and accurate representation of the _interface_ to an application library.
The Binder tools are not designed to examine a complete application and to somehow abstract out the interface definition. We assume that this abstraction step has already been taken by the time the input is presented to the tools.
In the case of third party libraries, this is typically the case. The library is defined by a set of header files that are needed to write correct C code that calls these libraries. The same headers are necessary and usually sufficient to generate the Lisp interface to the library.
In the case of user written applications or libraries, it is necessary for the user to create appropriate header files defining the interface. The binder tools will fail miserably if simply presented with the application code files. For example, the body of function definitions is entirely skipped by the source code analyzer; thus, if some type declaration occurs only in the scope of a function definition, it will never be seen.
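As a minimal sketch of the point above, the fragment below shows a hypothetical library (the names counter, counter_next and scratch are invented for illustration) whose interface is declared at file scope, as a header would declare it. The struct declared inside the function body is the kind of declaration that a header-driven analysis never sees.

```c
/* counter.h (hypothetical): everything the Lisp side needs is declared
   at file scope, where a header-driven tool can see it. */
struct counter {
    int value;
    int step;
};

int counter_next(struct counter *c);

/* counter.c (hypothetical): the implementation. */
int counter_next(struct counter *c)
{
    /* A type declared only inside a function body, like this one, is
       the kind of declaration that is skipped along with the body. */
    struct scratch { int old; } s;

    s.old = c->value;
    c->value += c->step;
    return s.old;   /* return the value before the increment */
}
```

If struct scratch were part of the interface, it would have to be moved to file scope in the header.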
The binding process is normally invoked with the function ff:build-c-binding with the following arguments:
    ff:build-c-binding c-or-h-file                              [Function]
        &key (c-args "") (lisp-out t) (c-out t) include exclude
             (package nil)
             (case ff:*decode-intern-case*)
             (hyphen ff:*decode-intern-hyphen*)
             (dash ff:*decode-intern-dash*)
             (res ff:*decode-intern-res*)
             (verbose nil)
The first and only required argument, c-or-h-file, is the pathname to a file of C code. This file, and any files that it includes, will be parsed in order to determine what interface components need to be generated. This is normally a header file (.h file type or extension), but may also be a C source file.
The keyword argument c-args is very important to the correct functioning of the interface generator. It is a string containing all the -D and -I switches required for a complete and correct compilation of the file specified in the c-or-h-file argument. If this argument is missing or incorrect, the effect is usually to print many pre-processor and parser warnings and error messages; but it may also silently result in an incorrect interface definition.
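The following sketch illustrates how a missing -D switch can silently change an interface. The macro SAMPLE_WIDE and the names sample_t and sample_scale are hypothetical; they are not part of any real library.

```c
/* Hypothetical header fragment: the declared interface depends on
   whether the made-up macro SAMPLE_WIDE is defined, for example by
   passing -DSAMPLE_WIDE to the pre-processor. */
#ifdef SAMPLE_WIDE
typedef long sample_t;
#else
typedef short sample_t;
#endif

/* The argument and return types seen by any parser of this header
   follow from the typedef selected above. */
sample_t sample_scale(sample_t x, int factor);

sample_t sample_scale(sample_t x, int factor)
{
    return (sample_t)(x * factor);
}
```

If the library was compiled with -DSAMPLE_WIDE but the header is parsed without it, no error is reported, yet every use of sample_t in the parsed interface has the wrong type.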
The keyword argument lisp-out is the pathname of a file where the generated Lisp code is placed. If this argument is NIL, nothing is generated. If this argument is t, the output is printed to the current value of *standard-output*.
The keyword argument c-out is the pathname of a file where the generated C code is placed. If this argument is NIL, nothing is generated. If this argument is t, the output is printed to the current value of *standard-output*.
C program text is generated when wrapper functions are needed to transmit data correctly
between Lisp and C; one situation is when we need to pass structure arguments by value
between Lisp and C.
The keyword arguments include and exclude determine which C files are used to generate Lisp interface components. Only one of these arguments may be used in any call to ff:build-c-binding. When specified, the value must be a pathname or a list of pathnames. The effect of the include keyword is to generate Lisp interface components only from the specified file or files; any other C files included in the compilation are ignored by the Lisp binder but may be essential to the C parsing stage of the process. The effect of the exclude keyword is to ignore the specified files in the Lisp binder. For large libraries using complex collections of include files, it may be necessary to make several passes through the binding process in order to sort out the files needed in the Lisp interface. The file selection arguments apply only to interface components generated from C statements. The current implementation cannot determine the origin of a C macro and therefore all constant definitions are always included in the generated output.
The keyword arguments case, hyphen, dash, and res control how foreign symbol strings are translated into Lisp symbols. The default values are taken from corresponding special variables described in the sections on Lisp output and customization. The default behavior is to convert foreign name strings according to the current readtable case and to signal an error if a conflict occurs. A conflict occurs when two different foreign strings map to the same Lisp symbol or when the Lisp symbol is already bound or defined as a function or macro.
The keyword argument package is the name of a package where foreign symbols are interned. The default is NIL. The effect is to use the value of the global variable ff:*default-foreign-symbol-package*. If the value of this variable is NIL, the effect is to use a package defined as follows:
    (defpackage "C" (:use common-lisp foreign-functions))
It is a good idea to place all the foreign symbols in a Lisp package that does not use any other Lisp packages. This is the only definitive way to avoid symbol name conflicts.
The keyword argument verbose controls the amount of information printed when a parse or translation error is encountered. When T, a fragment of the parse tree is printed.
The generated Lisp output consists of:
    ff:bind-c-function or ff:bind-c-alternate forms for declared C functions
    ff:bind-c-type or ff:bind-c-typedef forms for declared C types
    ff:bind-c-constant forms for C constants defined with #define
The purpose of these macros is to emit an ff:bind-c-export form to cause the conditional export of Lisp symbols corresponding to foreign identifiers, and to allow further customization of the generated code when so desired by the user.
When the C file contains a function declaration, a Lisp form such as the one below is generated:
    ;; c-ex01.h:7 <2>
    ;;   struct hostent* gethostbyname( const char*, struct hostent*, char*, int,
    ;;                                  int* h_errnop);
    (bind-c-function gethostbyname
       :unconverted-entry-name "gethostbyname"
       :c-return-type ("struct" "hostent" "*")
       :return-type (* hostent)
       :c-arg-types (("const" "char" "*") ("struct" "hostent" "*") ("char" "*")
                     ("int") ("int" "*"))
       :c-arg-names (Arg0 Arg1 Arg2 Arg3 h_errnop)
       :arguments ((* :char) (* hostent) (* :char) :int (* :int))
       :prototype t
       )
When compiled, this form macroexpands into a suitable ff:defforeign form.
The leading comment identifies the C statement number in the source file and shows a reconstruction of the original C source declaration. The Lisp type specified for each argument in the :arguments list is the most specific Lisp type that includes all the possible Lisp argument types.
When the C file contains a struct declaration or a typedef, a corresponding ff:bind-c-type or ff:bind-c-typedef form is generated:
    ;; c-ex01.h:10 <3>
    ;;   struct servent {
    ;;     char* s_name; char** s_aliases; int s_port; char* s_proto; };
    (bind-c-type servent
       (:struct
          (s_name (* :char))      ;; char* s_name
          (s_aliases (* :char))   ;; char** s_aliases
          (s_port :int)           ;; int s_port
          (s_proto (* :char))     ;; char* s_proto
          )) ;; bind-c-type servent
    ;; c-ex01.h:17 <4> typedef unsigned long ulong;
    (bind-c-type ulong :unsigned-long)
Constants defined in the C source files are translated to ff:bind-c-constant forms.
    ;; #define BAR 17
    (BIND-C-CONSTANT BAR 17) ;; 0x11
    ;; #define XST_DATARCVD 6
    (BIND-C-CONSTANT XST_DATARCVD 6) ;; 0x6
    ;; #define SETRGBSTRINGA "commdlg_SetRGBColor"
    (BIND-C-CONSTANT SETRGBSTRINGA "commdlg_SetRGBColor")
    ;; #define WM_DDE_FIRST 992
    (bind-c-constant WM_DDE_FIRST 992) ;; 0x3e0
    (bind-c-constant WM_DDE_TERMINATE (+ WM_DDE_FIRST 1))
    (bind-c-constant WM_DDE_ADVISE (+ WM_DDE_FIRST 2))
When the C file contains a macro definition that defines an alternate name for a declared function, the additional names appear in a :all-names keyword argument in the ff:bind-c-function form. The value is an alist of a symbol and foreign string pair for each name of the foreign function.
By default this list is sorted on the length of the foreign string. The first item in the list is used for the name of the primary lisp function defined by defforeign. The other names are used to define alternate macros with a bind-c-alternate form such as
    ;; #define AddAtom AddAtomA
    (BIND-C-ALTERNATE ADDATOMA (&rest args) `(ADDATOM ,@args))
When a C function is declared to receive a struct by value, we need to generate some new C program code because ACL only allows struct pointers as arguments. We create an intermediate C function that receives a pointer argument and passes the dereferenced pointer to the intended function. When this situation is encountered in the C source file, the following comment and definition are generated.
    ;; c-ex02.h:7 <5> void passHostEnt( struct hostent x);
    ;;NOTE: C wrapper needed to pass structure or union type
    ;;          hostent
    ;;      as argument.
    (bind-c-function passHostEnt
       :unconverted-entry-name "ACL_passHostEnt"
       :c-return-type ("void")
       :return-type :void
       :c-arg-types (("struct" "hostent" "*"))
       :c-arg-names (x)
       :arguments ((* hostent))
       :prototype t
       )
Note how the foreign name in this case is not identical to the Lisp name of the C function.
The additional C definition is generated in the C output file as follows:
    /* Wrapper function to dereference pointers to structure arguments. */
    void ACL_passHostEnt( struct hostent * x)
    {
       passHostEnt(*x);
    }
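The same pattern can be tried in isolation. The sketch below is self-contained and uses a hypothetical struct point and function point_sum in place of the hostent example; only the ACL_ prefix and the shape of the wrapper are taken from the output shown above.

```c
/* Hypothetical library type and a function that takes it by value. */
struct point {
    int x;
    int y;
};

int point_sum(struct point p)
{
    return p.x + p.y;
}

/* Wrapper in the style of the generated ACL_ functions: it receives a
   pointer and passes the dereferenced structure to the real function. */
int ACL_point_sum(struct point *p)
{
    return point_sum(*p);
}
```

The Lisp side then only ever passes a pointer, and the copy into the by-value argument happens in compiled C code.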
When a C function returns a struct by value, a similar wrapper is generated to return a pointer to the structure in freshly allocated malloc memory.
    ;; c-ex02.h:9 <6> struct hostent ReturnHostEnt( int);
    ;;NOTE: C wrapper needed to return structure or union type
    ;;          hostent.
    (bind-c-function ReturnHostEnt
       :unconverted-entry-name "ACL_ReturnHostEnt"
       :c-return-type ("struct" "hostent" "*")
       :return-type (* hostent)
       :c-arg-types (("int"))
       :c-arg-names (Arg0)
       :arguments (:int)
       :prototype t
       )
Generated C wrapper:
    /* Wrapper function to return pointer to structure. */
    int ACL_ReturnHostEnt( int Arg0)
    {
       int ptr = (int)malloc(sizeof(struct hostent ));
       *((struct hostent *)ptr) = ReturnHostEnt(Arg0);
       return(ptr);
    }
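The generated wrapper above stores the malloc result in an int, which is only safe where a pointer fits in an int. The following is a hand-written sketch, not output of the binder, of the same idea with a pointer-typed return value. The names struct pair, make_pair and ACL_make_pair are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical library type and a function that returns it by value. */
struct pair {
    int first;
    int second;
};

struct pair make_pair(int n)
{
    struct pair p;
    p.first = n;
    p.second = n * 2;
    return p;
}

/* Hand-written variant of the wrapper pattern: copy the returned
   structure into freshly allocated memory and return its address.
   Returns NULL if the allocation fails. */
struct pair *ACL_make_pair(int n)
{
    struct pair *ptr = malloc(sizeof(struct pair));
    if (ptr != NULL) {
        *ptr = make_pair(n);
    }
    return ptr;
}
```

Because each call allocates new memory, the caller is responsible for releasing the result with free when it is no longer needed.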
When the C function returns an unsigned value, we generate a Lisp wrapper to extract the value correctly.
The macros ff:bind-c-function and friends are generated in the Lisp output file in order to allow convenient user customization of the actual foreign interface. The built-in definition in file cdbind.cl emits a ff:defforeign form formed by simply selecting the appropriate components of the ff:def-c-binding form. Users with their own foreign type layer may use other components of that form to generate a more specific foreign interface call.
The function ff::decode-intern is used exclusively to convert strings to symbols in the binder. It is a function of one argument, a string or symbol. If the argument is a symbol, the function assumes it is already converted and does nothing.
If the argument is a string, this function uses the following special variables (re-bound by build-c-binding) to determine how the conversion is done:
    *decode-intern-hyphen*
        NIL - no effect
        T   - insert a hyphen at every lower-to-upper transition and every
              case-sensitive-to-insensitive transition:
                  MenuItemFromPoint ==> Menu-Item-From-Point
                  Menu3             ==> Menu-3
    *decode-intern-dash*
        NIL - no effect
        T   - translate underscore characters to hyphens
    *decode-intern-case*
        a readtable - use the value of readtable-case for that readtable
        :READER     - use the value of readtable-case for *readtable*
        :PRESERVE   - keep the case of the string unchanged
        :DOWNCASE   - convert the string to lowercase letters
        :UPCASE     - convert the string to uppercase letters
        :INVERT     - if all the case-sensitive letters are of one case,
                      switch them all to the other case; otherwise leave
                      the string unchanged
    *decode-intern-res*
        :error - signal an error if a name conflict occurs
        :index - try adding 0, 1, 2, ... to the end of the string until
                 the conflict is resolved
        a list - take suffix strings in order from the list and append
                 them to the foreign string until the conflict is
                 resolved. If the list runs out, signal an error.
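To make the hyphen and dash rules concrete, here is a small sketch of them in C. It is not the binder's implementation: the function name translate_name and its signature are invented, case conversion and conflict resolution are left out, and the hyphen rule is a literal reading of the description above (a hyphen before an uppercase letter that follows a lowercase one, and before a non-letter that follows a letter). How the real function treats a name when both options are enabled is not specified in this text.

```c
#include <ctype.h>
#include <stddef.h>

/* Sketch of the hyphen and dash rules only. Writes the translated,
   NUL-terminated name to out and returns its length, or -1 if the
   output buffer is too small. */
int translate_name(const char *in, char *out, size_t outsize,
                   int hyphen, int dash)
{
    size_t n = 0;
    size_t i;

    for (i = 0; in[i] != '\0'; i++) {
        unsigned char cur = (unsigned char)in[i];

        if (hyphen && i > 0) {
            unsigned char prev = (unsigned char)in[i - 1];
            int lower_to_upper = islower(prev) && isupper(cur);
            int cased_to_uncased = isalpha(prev) && !isalpha(cur);

            if (lower_to_upper || cased_to_uncased) {
                if (n + 1 >= outsize) return -1;
                out[n++] = '-';
            }
        }
        if (n + 1 >= outsize) return -1;
        /* Dash rule: an underscore becomes a hyphen. */
        out[n++] = (dash && cur == '_') ? '-' : (char)cur;
    }
    if (n >= outsize) return -1;
    out[n] = '\0';
    return (int)n;
}
```

With the hyphen rule alone, the two documented inputs MenuItemFromPoint and Menu3 come out as Menu-Item-From-Point and Menu-3. With the dash rule alone, a name such as WM_DDE_FIRST becomes WM-DDE-FIRST.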
If a different name translation scheme is desired, the function decode-intern must be redefined as required.
Messages of the form
... ;;WARNING: ...
are emitted when the binder detects a situation where the generated code may function incorrectly or when the binder is unable to generate any correct code at all.
Messages of the form
... ;;NOTE: ...
are emitted when the binder generates additional Lisp or C code that may be needed to use the interface effectively.
In many cases, inspection of the generated code and specific application knowledge will reveal that the generated code is adequate. In some cases it may be necessary to modify the generated code.
The binder generates possibly incorrect code with a warning to prevent a possible cascade of subsequent warnings that might be caused by generating nothing. This might be the case if a type cannot be generated correctly.
    (ff:bind-c-constant "foo" "StrData")
    ;;WARNING: C code expects wide string"
This warning is emitted when the C string is defined with the L modifier.
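The difference matters because a wide string literal has elements of type wchar_t rather than char, so its in-memory layout differs from that of an ordinary string with the same characters. A minimal sketch follows; the names narrow_data, wide_data, narrow_bytes and wide_bytes are hypothetical.

```c
#include <stddef.h>

/* The same seven characters as an ordinary and as a wide literal. */
static const char    narrow_data[] = "StrData";
static const wchar_t wide_data[]   = L"StrData";

/* Storage size of each array, including the terminating null element. */
size_t narrow_bytes(void)
{
    return sizeof narrow_data;
}

size_t wide_bytes(void)
{
    return sizeof wide_data;
}
```

Both arrays hold eight elements, but each element of the wide array occupies sizeof(wchar_t) bytes, so C code that expects the wide form cannot be handed the bytes of an ordinary string unchanged.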
    ;; c-ex03.h:6 <7> typedef long long LongLong;
    ;;WARNING: 'long long' is implemented as a struct of 2 long!
    (bind-c-type LongLong long-long)
The binder defines and emits the following 64 and 128-bit types:
    long long          -> FF:LONG-LONG
    unsigned long long -> FF:UNSIGNED-LONG-LONG
    long double        -> FF:LONG-DOUBLE
These are defined as structs of two long or two double values. This definition has the correct storage size but may or may not have the correct alignment. In addition, none of these definitions have a numeric equivalent in ACL, and thus do not behave as numbers in Lisp code.
    ;;WARNING: Eval of above Lisp form resulted in error:
    ;;     Lisp error condition
This message is usually emitted following a foreign type definition. This result is likely to produce a cascade of subsequent error messages.
    ;; c-ex04.h:7 <9> extern int exec1( const char*, const char*, ELLIPSIS);
    ;;NOTE: Lisp args to this function will get default conversions only.
    (bind-c-function exec1
       :unconverted-entry-name "exec1"
       :c-return-type ("int")
       :return-type :int
       :c-arg-types (("const" "char" "*") ("const" "char" "*") "...")
       :c-arg-names (Arg0 Arg1)
       :arguments nil
       )
This foreign function may be called with any number of arguments, but all the necessary type conversions must be implicit in the Lisp type of each argument.
When the C function requires a wrapper, only a fixed number of arguments may be passed through the wrapper. In that case we generate a warning of the form:
    ;; c-ex04.h:9 <10> extern int exec2( LongLong, const char*, ELLIPSIS);
    ;;NOTE: C wrapper needed to pass structure or union type
    ;;          LongLong
    ;;      as argument.
    /* Wrapper function to dereference pointers to structure arguments. */
    int ACL_exec2( LongLong * Arg0, const char * Arg1)
    {
       return(exec2(*Arg0, Arg1));
    }
    ;;WARNING: This wrapper function will only pass exactly 2 arguments
    (bind-c-function exec2
       :unconverted-entry-name "ACL_exec2"
       :c-return-type ("int")
       :return-type :int
       :c-arg-types (("LongLong" "*") ("const" "char" "*"))
       :c-arg-names (Arg0 Arg1)
       :arguments ((* LongLong) (* :char))
       :prototype t
       )
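The limitation can be reproduced with a small self-contained sketch. The names struct tag, tag_total and ACL_tag_total are hypothetical; the wrapper follows the same shape as the ACL_exec2 wrapper above and forwards only the named arguments.

```c
#include <stdarg.h>

/* Hypothetical library type passed by value. */
struct tag {
    int id;
};

/* Hypothetical variadic function: returns t.id plus the sum of the
   `extra` int arguments that follow. */
int tag_total(struct tag t, int extra, ...)
{
    va_list ap;
    int total = t.id;
    int i;

    va_start(ap, extra);
    for (i = 0; i < extra; i++) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}

/* Fixed-arity wrapper: dereferences the structure pointer and passes
   exactly the two named arguments. No variable arguments can reach
   tag_total through it, so only calls with extra == 0 are meaningful. */
int ACL_tag_total(struct tag *t, int extra)
{
    return tag_total(*t, extra);
}
```

A call through the wrapper with extra greater than zero would make tag_total read arguments that were never passed, which is why the binder flags such wrappers with a warning.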
The C parser and parse-tree decoder we use are far from complete from the point of view of C language semantics. We have attempted to handle a large collection of commonly found cases, but many others are still possible. The structure of the parse-tree decoder is modular and extensible so that new cases may be added easily when necessary.
When a new unhandled case is encountered, and the :verbose argument to build-c-binding is T, we print a warning message of the form
    ;;WARNING Unknown STATEMENT
    #|
    :tag1 parse-tree-dump-1
    :tag2 parse-tree-dump-2
    ...
    |#
and continue. In most cases, if you send us
we may be able to create an extension to the parse-tree decoder in short order.
The parser used in this tool is derived from the GNU C compiler. It uses the Bison grammar for C included with the distribution of GCC. The specific version of the grammar and parser can be obtained in the usual manner by calling our modified version of cpp with the -version switch.
C macros are processed from the output of the GNU C pre-processor called with the -dM switch. This produces a list of #define lines that does not reflect the input order of the definitions. When dependencies between definitions can be determined, we order the corresponding Lisp definitions accordingly. Otherwise, the Lisp definitions are ordered alphabetically by name.
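Ordering is a concern on the Lisp side only. In C, a macro body is expanded where the macro is used, so a dependent #define may legally precede its base, whereas a Lisp constant form such as (+ WM_DDE_FIRST 1) can only be evaluated after WM_DDE_FIRST has a value. The sketch below shows the C half of that statement; the function dde_terminate_value is hypothetical and the macro names are modeled on the constants shown earlier.

```c
/* The dependent macro is defined before its base. This is valid C,
   because WM_DDE_FIRST is looked up only when WM_DDE_TERMINATE is
   expanded at a point of use. */
#define WM_DDE_TERMINATE (WM_DDE_FIRST + 1)
#define WM_DDE_FIRST     992

int dde_terminate_value(void)
{
    return WM_DDE_TERMINATE;   /* expands to (992 + 1) */
}
```

A list of #define lines in this order therefore compiles as C, but the corresponding Lisp definitions must be emitted with the base constant first.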
Macro definitions are parsed by our own ad hoc parser that recognizes the following patterns:
;;WARNING: (args) #define funmac(x) foo(x)
123 123L 0x1a3 0x1a3L -num (num)
and many expressions consisting entirely of explicit constants such as
(1<<12) (3+5) (2+3+4+5) (3*100)
( ident + const )
    ;;WARNING: (expr) #define WM_DDE_UNDEF1 (17+WM_DDE_FIRST)
    ;;WARNING: (expr) #define WM_DDE_UNDEF2 WM_DDE_FIRST+11
;;WARNING: Undefined base constant UNDEF (bind-c-constant FOO1 (+ UNDEF 1))
;;WARNING: (undef) ...
Some other possible error messages are:
    ;;WARNING: Multiple definitions of symbol
    ;;WARNING: Ill-formed macro
    ;;WARNING: SysValErr: sys-info-dump
The tool consists of two Unix executable files and three Lisp source files. The Lisp files are loaded in the following sequence:
:ld loadcb
All three Lisp files are required in order to run the Binder.
Only one file, cdbind, is needed to compile and/or load the output of the binder. This file may also need to be modified if the output of the binder needs to be customized.
The following two variables must be initialized in loadcb.cl to the correct pathname for the two Unix executables:
ff:*c-parser-cpp*
ff:*c-parser-cc1*
Other variables:
ff:*default-foreign-symbol-package*
The setting of this variable is discussed above.
ff:*export-foreign-symbols*
When this variable is set to a non-NIL value, all symbols created in the above package are exported. When this variable is non-NIL, foreign symbols are referenced with single-colon notation from other Lisp packages. When this variable is set to NIL, double-colon notation must be used for all symbols in the foreign-symbol package.
NOTE: in the current implementation, only symbols passed to ff:bind-c-export are exported.
ff:*c-compiler-macro-names*
The value of this variable is a list of strings naming variables that should be ignored by the interface generator. These are typically state variables for the compilation and do not affect the usage of the interface.