The most common Green Card directive is a procedure specification. It describes the interface to a C procedure. A procedure specification has four parts:
%fun (Section sec:type-sig). The %fun statement starts a new procedure specification, giving the name and Haskell type of the function.
%call (Section sec:call). The %call statement tells Green Card how to translate the Haskell parameters into their C representations.
%code, %end (Section sec:body). The %code statement gives the body, which can contain arbitrary C code. Sometimes the body consists of a simple procedure call, but it may also include variable declarations, multiple calls, loops, and so on. The %end statement specifies C code that should be run after the code in %code.
%result, %fail (Section sec:result). The result-marshalling statements tell Green Card how to translate the result(s) of the call back into Haskell values.
Any of these parts may be omitted except the type signature. If any
part is missing, Green Card will fill in a suitable statement based on
the type signature given in the %fun
statement. For example,
consider the sin
procedure specification again:
%fun sin :: Float -> Float
Green Card fills in the missing statements like this (the details of the filled-in statements will make more sense after reading the rest of Section sec:proc-spec):
%fun sin :: Float -> Float
%call (float arg1)
%code res1 = sin(arg1);
%result (float res1)
The rules that guide this automatic fill-in are described in Section sec:fill-in .
A procedure specification can define a procedure with no input parameters, or even a constant (a ``procedure'' with no input parameters and no side effects). In the following example, printBang is an example of the former, while grey is an example of the latter:
%fun printBang :: IO ()
%code printf( "!" );
%fun grey :: Colour
%code r = GREY;
%result (colour r)
When there are no parameters, the %call line can be omitted. The second example can also be shortened by writing a C expression in the %result statement; see Section sec:result.
All the C variables bound in the %call statement or mentioned in the %result statement are declared by Green Card and are in scope throughout the body. In the examples above, Green Card would have declared arg1, res1, and r.
The %fun
statement starts a new procedure specification.
Green Card supports two sorts of C procedures: ones that may cause
side effects (including I/O), and ones that are guaranteed to be pure
functions. The two are distinguished by their type signatures.
Side-effecting functions have the result type IO t
for some type
t
. If the programmer specifies any result type other than IO t
,
Green Card takes this as a promise that the C function is indeed pure,
and will generate code that calls unsafePerformIO
.
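The distinction can be pictured with a small Haskell sketch. It is not Green Card's actual output: callSinIO is a hypothetical stand-in for a generated stub that performs the C call, written here in plain Haskell so that the example is self-contained.

```haskell
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-in for a generated stub that calls the C procedure.
-- A real stub would marshal the argument, call C, and unmarshal the result.
callSinIO :: Float -> IO Float
callSinIO x = return (sin x)

-- Side-effecting signature (result type IO t): the action is exposed as is.
sinIO :: Float -> IO Float
sinIO = callSinIO

-- Pure signature (result type not IO t): the programmer has promised that
-- the C function is pure, so the action is wrapped with unsafePerformIO.
sinPure :: Float -> Float
sinPure x = unsafePerformIO (callSinIO x)
```

The promise matters: if the wrapped action did have side effects, the compiler would be free to duplicate, reorder, or drop calls to sinPure.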
The procedure specification expands to the definition of a Haskell function whose name is that given in the %fun statement, with two changes: the longest matching prefix specified with a %prefix statement (Section sec:prefix) is removed from the name, and the first letter of the remaining name is changed to lower case. Haskell requires all function names to start with a lower-case letter (an upper-case initial would indicate a data constructor), but when the C procedure name begins with an upper-case letter it is convenient to still be able to use Green Card's automatic fill-in facilities. For example:
%fun OpenWindow :: Int -> IO Window
would expand to a Haskell function openWindow
that is implemented by
calling the C procedure OpenWindow
.
%prefix Win32
%fun Win32OpenWindow :: Int -> IO Window
would also expand to a Haskell function openWindow
, but is
implemented by calling the C procedure Win32OpenWindow
.
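The renaming rule can be sketched as a small Haskell function. This is an illustration of the rule as described above, not Green Card's own code; haskellName and longestMatch are hypothetical helper names.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf)

-- Length of the longest declared prefix that matches the name (0 if none).
longestMatch :: [String] -> String -> Int
longestMatch prefixes cName =
  maximum (0 : [length p | p <- prefixes, p `isPrefixOf` cName])

-- Derive the Haskell name from a C procedure name: drop the longest
-- matching prefix, then lower-case the first remaining letter.
haskellName :: [String] -> String -> String
haskellName prefixes cName =
  lowerFirst (drop (longestMatch prefixes cName) cName)
  where
    lowerFirst (c:cs) = toLower c : cs
    lowerFirst []     = []
```

With no prefixes declared, haskellName [] "OpenWindow" gives "openWindow"; with the prefix Win32 declared, haskellName ["Win32"] "Win32OpenWindow" gives the same result.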
The %call
statement tells Green Card how to translate the
Haskell parameters into C values. Its syntax is designed to look
rather like Haskell pattern matching, and consists of a sequence of
zero or more Data Interface Schemes (DISs), one for each
(curried) argument in the type signature. For example:
%fun foo :: Float -> (Int,Int) -> String -> IO ()
%call (float x) (int y, int z) (string s)
...
This %call statement binds the C variables x, y, z, and s, in much the same way that Haskell's pattern matching binds variables to (parts of) a function's arguments. These bindings are in scope throughout the body and result-marshalling statements.
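The analogy with pattern matching can be made concrete with an ordinary Haskell function of the same argument shape. This is only an illustration; describe is a hypothetical function, not something Green Card generates.

```haskell
-- A Haskell function with the same argument shape as foo's type signature.
-- The patterns x, (y, z), and s bind Haskell variables much as the DISs in
--   %call (float x) (int y, int z) (string s)
-- bind the C variables x, y, z, and s.
describe :: Float -> (Int, Int) -> String -> String
describe x (y, z) s = s ++ ": " ++ show x ++ ", " ++ show (y + z)
```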
In the %call
statement, ``float
'', ``int
'', and
``string
'' are the names of the DISs that are used to translate
between Haskell and C. The names of these DISs are deliberately
chosen to be the same as the corresponding Haskell types (apart from
changing the initial letter to lower case) so that in many cases,
including foo
above, Green Card can generate the %call
line
by itself (Section
sec:fill-in
).
In fact there is a fourth DIS hiding in this example, the (_,_)
pairing DIS. DISs are discussed in detail in Section
sec:dis
.
The body consists of arbitrary C code, beginning with %code
.
The reason for allowing arbitrary C is that C procedures sometimes have
complicated interfaces. They may return results through parameters
passed by address, deposit error codes in global variables, require
#include
'd constants to be passed as parameters, and so on. The
body of a Green Card procedure specification allows the programmer to
say exactly how to call the procedure, in its native language.
The C code starts a block, and may thus start with declarations that create local variables. For example:
%code int x, y;
% x = foo( &y, GREY );
Here, x
and y
are declared as local variables. The local C
variables declared at the start of the block scope over the rest of the
body and the result-marshalling statements.
The C code may also mention constants from C header files, such as
GREY
above. Green Card's %#include
directive tells it which
header files to include (Section
sec:import
).
GHC specific: The GHC backend distinguishes between safe and unsafe external calls. If the external call will cause a garbage collection, it has to be called safely. (A good example of where this is likely to occur is when the external call invokes a Haskell callback.) Green Card supports safe calls in two ways:
The --safe-code option will cause the generated code to call all the %code snippets safely.
The %safecode declaration is identical to %code except that, when generating code for GHC, the code snippet will be called safely.
Performing a safe call involves saving away all abstract machine state before performing the call, so it is recommended to use %safecode rather than --safe-code, since it offers more fine-grained control.
Functions return their results using a %result
statement.
Side-effecting functions, ones whose result type is IO t
,
can also use %fail
to specify the failure value.
The %result
statement takes a single DIS that describes how to
translate one or more C values back into a single Haskell value. For
example:
%fun sin :: Float -> Float
%call (float x)
%code ans = sin(x);
%result (float ans)
As in the case of the %call
statement, the ``float
'' in the
%result
statement is the name of a DIS, chosen as before to
coincide with the name of the type. A single DIS, ``float
'', is
used to denote both the translation from Haskell to C and that from C
to Haskell, just as a data constructor can be used both to construct a
value and to take one apart (in pattern matching).
All the C variables bound in the %call
statement, and all those
bound in declarations at the start of the body, scope over all
the result-marshalling statements (i.e. %result
and %fail
).
In a result-marshalling statement an almost arbitrary C expression, enclosed in braces, can be used in place of a C variable name. The above example could be written more briefly like this:
%fun sin :: Float -> Float
%call (float x)
%result (float {sin(x)})
It can be written more briefly still by using automatic fill-in (Section sec:fill-in).
The C expression may contain neither assignments nor nested braces, since either could give rise to syntactic ambiguity (see Section sec:record-dis).
A side-effecting function returns a result of type IO t for some type t. The IO monad supports exceptions, so Green Card allows them to be raised.
The result-marshalling statements for a side-effecting call consist of zero or more %fail statements, each of which conditionally raises an exception in the IO monad, followed by a single %result statement that returns successfully in the IO monad.
Just as in Section
sec:result
, the %result
statement
gives a single DIS that describes how to construct the result Haskell
value, following successful completion of a side-effecting operation.
For example:
%fun windowSize :: Window -> IO (Int,Int)
%call (window w)
%code struct WindowInfo wi;
% GetWindowInfo( w, &wi );
%result (int {wi.x}, int {wi.y})
Here, a pairing DIS is used, with two int
DISs inside it. The
arguments to the int
DISs are C record selections, enclosed in
braces; they extract the relevant information from the
WindowInfo
structure that was filled in by the GetWindowInfo
call.
This example also shows one way to interface to C procedures that manipulate structures.
The %fail
statement has two fields, each of which is either a C
variable, or a C expression enclosed in braces. The first field is a
boolean-valued expression that indicates when the call should fail;
the second is a (char *)
value that indicates what sort of failure
occurred. If the boolean is true (i.e. non zero) then the call fails
with a userError
in the IO
monad containing the specified string.
For example:
%fun fopen :: String -> IO FileHandle
%call (string s)
%code f = fopen( s, "r" );
%fail {f == NULL} {errstring(errno)}
%result (fileHandle f)
The assumption here is that fopen
puts its error code in the global
variable errno
, and errstring
converts that error number to
a string.
UserError
s can be caught with catch
, but exactly which error
occurred must be encoded in the string, and parsed by the
error-handling code. This is rather slow, but errors are meant to be
exceptional.
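The following Haskell sketch shows what catching such a failure and inspecting its string might look like. It makes several assumptions: openFailing is a hypothetical stand-in for a generated stub whose %fail condition fired, the "ENOENT: " message format is invented for the example, and catchIOError from System.IO.Error is used in place of the catch mentioned above.

```haskell
import System.IO.Error (catchIOError, ioeGetErrorString, isUserError)

-- Hypothetical stand-in for a generated stub whose %fail condition fired:
-- it raises a userError carrying the failure string.
openFailing :: String -> IO Int
openFailing path = ioError (userError ("ENOENT: " ++ path))

-- Run an action and, if it raised a userError, return its string instead.
-- Any other IO error is re-raised unchanged.
tryUser :: IO a -> IO (Either String a)
tryUser action = fmap Right action `catchIOError` handler
  where
    handler e
      | isUserError e = return (Left (ioeGetErrorString e))
      | otherwise     = ioError e

-- The handler has to parse the string to find out which error occurred.
isNoEntry :: String -> Bool
isNoEntry msg = take 6 msg == "ENOENT"
```

Because the error kind travels only as text, the caller and the %fail statement have to agree on the string format.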
Any or all of the parameter-marshalling, body, and result-marshalling statements may be omitted. If they are omitted, Green Card will ``fill in'' plausible statements instead, guided by the function's type signature. The rules by which Green Card does this filling in are as follows:
A missing %call statement is filled in with a DIS for each curried argument. Each DIS is constructed from the corresponding argument type, and the argument variables are named arg1, arg2, arg3, and so on.
A missing body is filled in with a statement of the form
%code r = f ( a_1, ... , a_n );
where f is the function name given in the %fun statement, a_1 ... a_n are the argument variables bound in the %call statement, and r is the variable used in the %result statement. (There should only be one such variable if the body is automatically filled in.)
A missing %result statement is filled in by a %result with a DIS constructed from the result type in the same way as for a %call. The result variables are named res1, res2, res3, and so on.
Green Card never fills in %fail statements.
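For the simplest case these rules can be sketched in Haskell. The sketch assumes that every argument and the result are simple, non-tuple, non-IO types, so that each DIS is just the type name with a lower-case initial letter; fillIn and disName are hypothetical names and this is not Green Card's implementation.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- Name of the DIS for a simple type: the type name with its initial
-- letter in lower case, e.g. "Float" becomes "float".
disName :: String -> String
disName (c:cs) = toLower c : cs
disName []     = []

-- Fill in %call, %code, and %result for a procedure f, given the names of
-- its argument types and result type. Tuples and IO results are not handled.
fillIn :: String -> [String] -> String -> [String]
fillIn f argTypes resType =
  [ "%call " ++ unwords [ "(" ++ disName t ++ " " ++ a ++ ")"
                        | (t, a) <- zip argTypes args ]
  , "%code res1 = " ++ f ++ "(" ++ intercalate ", " args ++ ");"
  , "%result (" ++ disName resType ++ " res1)"
  ]
  where
    -- Argument variables are named arg1, arg2, ... from left to right.
    args = [ "arg" ++ show i | i <- [1 .. length argTypes] ]
```

Applied to the sin example, fillIn "sin" ["Float"] "Float" yields the three filled-in statements shown for sin earlier in this section.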
Some C header files define a large number of constants of a particular
type. The %const
statement provides a convenient abbreviation
to allow these constants to be imported into Haskell.
For example:
%const PosixError [EACCES, ENOENT]
This statement is equivalent to the following %fun
statements:
%fun EACCES :: PosixError
%fun ENOENT :: PosixError
After the automatic fill-in has taken place we would obtain:
%fun EACCES :: PosixError
%result (posixError { EACCES })
%fun ENOENT :: PosixError
%result (posixError { ENOENT })
Each constant is made available as a Haskell value of the specified
type, converted into Haskell by the DIS function for that type.
(It is up to the programmer to write a %dis definition for the function; see Section sec:dis-macro.)
There is also a variant form of the %const directive, in which you specify the Haskell name that each C constant maps to:
%const PosixError [
% errAccess = {EACCES},
% errNoEnt = {ENOENT}
% ]
The %const declaration allows you to map external names/constants onto Haskell Ints. A more typesafe mapping of constants is provided by the %enum declaration, which maps a set of constants to a Haskell type.
The support for %enum
is closely based on a design suggested by
Sven Panne.
For example, here's PosixError
expressed using %enum
:
%enum PosixError Int [EACCES, ENOENT]
This creates the following data type plus marshalling functions:
data PosixError = EACCES | ENOENT
marshall_PosixError :: PosixError -> Int
marshall_PosixError = ...
unmarshall_PosixError :: Int -> PosixError
unmarshall_PosixError = ...
Additionally, it also implicitly creates a DIS:
%dis posixError x = int x
If you want the Haskell compiler to derive instances for the generated enumeration type, %enum supports this too:
%enum PosixError (Eq,Show) Int [EACCES, ENOENT]
will generate a data type with the derived instances Eq
and
Show
:
data PosixError = EACCES | ENOENT deriving (Eq, Show)
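As a rough picture of what the generated data type and marshalling functions amount to, here is a hand-written Haskell sketch. The numeric values 13 and 2 are assumptions (the usual Linux errno values); real generated code would take them from the C header. The behaviour on an unknown integer is also an assumption made for the example.

```haskell
data PosixError = EACCES | ENOENT deriving (Eq, Show)

-- Haskell constructor to C integer. The values are assumed, not taken
-- from a real header.
marshall_PosixError :: PosixError -> Int
marshall_PosixError EACCES = 13
marshall_PosixError ENOENT = 2

-- C integer back to Haskell constructor. Failing on an unknown value is
-- a choice made for this sketch.
unmarshall_PosixError :: Int -> PosixError
unmarshall_PosixError 13 = EACCES
unmarshall_PosixError 2  = ENOENT
unmarshall_PosixError n  =
  error ("unmarshall_PosixError: unknown value " ++ show n)
```

The derived Eq and Show instances make it easy to check that unmarshalling a marshalled value gives back the original constructor.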
In C, some libraries give all their exported names the same prefix, thereby minimizing the impact on the shared namespace. In Haskell we use qualified imports to achieve the same result. To simplify the conversion of C-style namespace management to Haskell, the %prefix statement specifies which prefixes to remove from the Haskell function names.
module OpenGL where
%prefix OpenGL
%prefix gl
%fun OpenGLInit :: Int -> IO Window
%fun glSphere :: Coord -> Int -> IO Object
This would define the two functions init
and sphere
which
would be implemented by calling OpenGLInit
and glSphere
respectively.
It is sometimes useful to be able to write arbitrary lines of C code
outside any procedure specification, for instance to include a helper
C function or define a C structure. The `%C
' statement is
provided for this purpose:
%C typedef struct _point {
% int x;
% int y;
% } point;
The C code is added directly to the generated C file.
Dis      -> DisFun [ arg_1 ... arg_n ]              Application
          | Cons [arg_1 ... arg_n]                  Constructor n >= 0
          | Cons { field_1 = dis_1 ,
                   ... ,
                   field_n = dis_n }                Record n >= 1
          | < Var / Var > [arg_1, ... , arg_n]      User defined marshalling, n >= 0
          | 'declare' [cexp Var] 'in' [Dis]
          | Adis

Adis     -> '(' Dis ')'
          | TypeCast Cexp                           Result only
          | TypeCast Var
          | Var                                     Bound by '%dis'
          | '(' [Dis_1, ... , Dis_n] ')'            Tuple n >= 0

Arg      -> Adis
          | Cexp
          | Var

DisFun   -> Var
TypeCast -> Cexp                                    C Expression
Var      -> Var                                     Initial letter lower case