[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1. Introduction to SAML


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1.1 What is this document about?

The Simple Algebraic Math Library (SAML) is a low-level library to handle the classic objects of symbolic calculus: arbitrary big integers, reals of arbitrary precision, rationals, polynomials, tensors, et cetera. This document describes the internals of the library and other implementation details.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1.2 Copyright

The SAML package is free; this means that everyone can use it and redistribute it on a free basis. Some files in the SAML archive were not written by me, and have their own copyright notice -- they are included only for the convenience of the installer. Some files are so trivial that they've been explicitly released in the Public Domain. All other files, including this documentation, are copyrighted by me (Thierry Bousch) and redistributable according to the GNU Public Licence. Read the file `COPYING' for the details of this license.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1.3 Our first SAML program

The following program creates two variables, stores 123 and 456 into them, adds the second variable to the first one and displays the result.

 
/* File example.c */
#include <stdio.h>
#include "saml.h"

int main (void)
{
        mref_t v1, v2;

        /* Create two mrefs */
        v1 = mref_new();
        v2 = mref_new();
        /* Store the initial values */
        mref_build(v1, ST_INTEGER, "123");
        mref_build(v2, ST_INTEGER, "456");
        /* v1 := v1 + v2 */
        mref_add(v1, v1, v2);
        printf("The sum is %s\n", mref_string(v1));
        /* Free the mrefs */
        mref_free(v1); mref_free(v2);
        return 0;
}

The two mref_free() calls at the end of the program could have been omitted, since the process will terminate and release all resources. They would be important if this routine was part of a larger program.

Now compile and run the program:

 
$ gcc -o example example.c -lsaml
$ ./example
The sum is 579
$

You'll easily build more impressive examples, for instance by using rationals instead of integers:

 
    mref_build(v1, ST_RATIONAL, "-777777777777777");
    mref_build(v2, ST_RATIONAL, "123/456");

Note that mref_build() with type ST_RATIONAL will automatically create rationals over the integers, whereas the ST_RATIONAL type allows rationals over any ring.

For polynomials, the situation is not so nice; it is not clear whether an expression like 3*xy+1 is a polynomial with integer or rational coefficients (or worse), and it is not clear either whether xy is a literal or the product of the two literals x and y. Thus, the "build" operation is not supported by the ST_POLY type, because it is not very well defined. You have to build polynomials "by hand", first creating literals and then promoting them to polynomials. This is obviously painful, fortunately SAML provides a simple parser for algebraic expressions (in files `simple-lexers.c' and `simple-parser.c') which can be used for polynomials. You still need to create an object of the right type (a polynomial with integer coefficients, say) before calling the parser. Our example would be:

 
    v1 = mref_new(); v2 = mref_new();
    mref_build(v2, ST_INTEGER, "0");
    mref_cast(v2, ST_APOLY);
    /* Now v2 contains the null polynomial */
    saml_init_lexer_mem("3*xy+1", 6);
    saml_parse(v1, v2);
    /* Now v1 contains 3*xy+1, v2 is unchanged */
    saml_init_lexer_mem("x-y", 3);
    saml_parse(v2, v2);
    /* And v2 contains x-y */

The parser assumes C-style identifiers (although FORM-style identifiers like [foo-bar] are also accepted) and all operators (sum, product) must be explicit. The string length argument is mandatory, because this parser should also be able to work with strings which are not zero-terminated, or could contain embedded zeros.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2. Terminology and data types


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.1 Math Nodes (mnodes)

There are two basic data types used by SAML. The most fundamental one is the mnode (math node). Just likes inodes on the UNIX system, these objects are invisible to the user, and are only used internally.

Mnodes are described in the header file `mnode.h'. Each mnode consists of a type (represented by an integer), a reference counter, and type-specific data. These data can include pointers to other mnodes. There must be no cyclic references; an mnode should only reference mnodes which have been created before.

The type field is used for dispatching; when you want to add two polynomials, SAML checks that both operands have the same type, and calls the type-specific operation, in this case poly_add(). This function will, in turn, invoke mono_add() to group similar monomials, which will in turn call the generic function mnode_add() to add the coefficients.

When you add two polynomials, there will be many common monomials between the two operands and their sum. This is why a reference counter has been added, to avoid unnecessary copies of objects. When a mnode is no longer needed, the library decrements the counter, and frees the object when the counter reaches zero (once again, it is very similar to hard links on a UNIX filesystem). If this mnode refers to other mnodes, they should have their reference counters decremented as well, and so on, recursively.

Error conditions (as the result of a calculation), and uninitialized objects, are also implemented as mnodes of the special type ST_VOID. The advantage of this scheme is that all type-specific operations on mnodes, like poly_add(), can always assume that their operands are valid, since the dispatcher will never call this function if one of the operands doesn't have the type ST_POLY.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.2 Math References (mrefs)

Mnodes must be allocated and freed explicitly. This is error-prone and can easily lead to crashes or memory leaks. Therefore, it is important to use mnodes only internally, and to expose a higher-level interface which takes care of allocation details and guarantees that user code can't crash the library.

This is done with mrefs (math references). If mnodes are similar to inodes, then mrefs are similar to file descriptors on UNIX. They are identified with a small integer, and must be created and deleted explicitly. But unlike mnodes, which represent a constant object, mrefs are really variables: you can assign to them, do operations on them, without any reference-counting hassle. Also, the library is able to check that the mrefs are valid, whereas it wouldn't be possible if objects were referred to by pointers. Thus, mrefs provide some simple form of garbage collecting: there can be no dangling pointers (since you don't have access to pointers) and mnodes can leak only if you don't keep track of your mrefs.

It is strongly recommended to declare mrefs as objects of type mref_t (which is only a typedef for int), especially in function prototypes. Otherwise you have annoying ambiguities -- see for instance the prototype of mref_power() below. It also avoids the horrible "hungarian notation" where one encodes the type into the variable names, and calls some variable `mra', say, instead of the simpler `a'.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.3 Simple Math Types

These are math types whose mnodes don't hold references to other mnodes. In other terms, they are completely self-contained. For now, the following simple math types are implemented: ST_VOID, ST_LITERAL, ST_MINT, ST_INTEGER, ST_CYCLIC and ST_FLOAT.

`ST_VOID'
Objects of type ST_VOID represent either errors, or uninitialized objects. They are couples (errno,where) where errno is a positive error number (or 0 for an uninitialized object) and where is a short string indicating in which routine the error originated. Obviously, there are almost no operations possible on these objects, and the dispatcher treats them specially anyway.

`ST_LITERAL'
These are simply strings. They have the nice property that two different mnodes will always contain two different strings. Thus, two literals are equal if and only if they have the same address. Literals are used by monomials, polynomials, and tensors.

`ST_MINT'
Machine Integers, usually 32-bit wide. This was essentially an exercise to test the basic mechanisms of SAML. Mostly useless, but still used by the library in a few places.

`ST_INTEGER'
Integers without any size limitation. They are stored in decimal form (each 32-bit word contains nine digits) which makes I/O very simple. But obviously it's not the most efficient implementation. If we were using base 2^32 instead of 10^9, most operations would be twice as fast, at least asymptotically. I'll probably do it someday, to avoid code duplication between integers and reals.

`ST_CYCLIC'
Cyclic Integers, i.e., elements of Z/nZ for some n. The modulus n cannot exceed 2^32-1. This limitation sounds annoying, but allows very fast operations. Moreover, identical numbers will always refer to the same mnode (just like literals), which greatly reduces memory usage and the overhead of malloc/free operations, even with high moduli.

`ST_FLOAT'
Floating-point numbers, with arbitrary (but fixed) precision. It should be rewritten to use 32-bit limbs instead of 16-bit ones, but the performance is still acceptable.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.4 Parametrized Math Types

These math types are only "patterns" allowing to build complicated types from simple ones. For instance, you can create fractions over any ring without divisors of zero, and polynomials over any ring. For now, the following parametrized math types are implemented: ST_RATIONAL, ST_COMPLEX, ST_MONO, ST_POLY, ST_UPOLY, ST_LINE, ST_MATRIX and ST_TENSOR.

`ST_RATIONAL'
Can be built over any commutative ring without divisors and zero, and gives the field of fractions on this ring. The ring must support the mnode_gcd() to allow simplification of fractions.

`ST_COMPLEX'
The set of a+I.b objects, where a and b are running in some commutative ring A, with the rule I^2=-1. This was essentially an exercise to test the basic mechanisms of SAML. The division gives incorrect results unless A is a field. Not too useful, but very pedagogic.

`ST_MONO'
Monomials with coefficients in some commutative ring. This type is used inside the library, in order to implement polynomials, but is probably not very useful for application code.

`ST_POLY'
Polynomials with any number of indeterminates, with coefficients in some commutative ring. Very useful. Caution: if the ring has divisors of zero, polynomial division won't work correctly if the leading coefficient of the denominator is a divisor of zero. A sparse representation is used.

I'd like to phase out this data type. It's usually slower and more memory-hungry than ST_APOLY, the code is clumsy and doesn't handle exceptional conditions, like exponent overflow (which is ignored). On the other hand, for some problems this representation can be much more efficient, so I don't want to throw it away entirely.

`ST_UPOLY'
Polynomials in one (unnamed) variable, with coefficients in some commutative ring. These polynomials are densely encoded.

`ST_APOLY'
An alternative implementation of multivariate polynomials. Usually faster, more space-efficient and more reliable than the old ST_POLY representation.

`ST_LINE'
An array of anything, even objects of different types. Very useful. The library uses this type to implement matrix lines.

`ST_MATRIX'
Rectangular matrices over some ring. Matrices are a special case of tensors, so I haven't tried very hard to support all possible operations on matrices. In particular, multiplication and inversion are not implemented. The most useful operation on this type is the determinant; it is used by the library to implement resultants.

`ST_TENSOR'
A hopefully better approach to linear and multilinear algebra. They can be built over any ring, and some operations like differentiation or substitution will be inherited. Isn't that nice? Tensors are densely encoded. The manpage of samuel(1) briefly explains them.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3. The public C interface


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.1 Initialization

In previous versions of the library, you had to call saml_init() before any other operation. Otherwise, the program would crash at the first invocation of mref_new(). This is why you'll find this function call at the beginning of some programs; but it's useless now, the initialization will automagically take place at the first invocation of mref_new().

@deftypefn{}: void saml_init (void)
Initializes the library. This is no longer needed. This function would probably be more useful if it took an integer argument to indicate which part of the library should be initialized -- with the restriction that some types, especially ST_VOID, should always be initialized.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.2 Creation and Destruction of mrefs

All library functions operate on mrefs (variables). New mrefs are created with mref_new(); the return value is a small non-negative integer which identifies the mref. When the mref is no longer needed, you can delete it with mref_free(id) where id is the value returned by mref_new(). New mrefs are initialized with the special value "void", which really means "uninitialized".

@deftypefn{}: mref_t mref_new (void)
This function takes no argument, and returns a handle to a new mref.

@deftypefn{}: void mref_free (mref_t id)
Frees the mref with identifier id. Aborts with an error message if id hasn't been obtained with mref_new(), or if it has already been freed.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.3 Operations on mrefs

All operations taking mref identifiers as arguments will abort() if any of these identifiers is invalid, with a beautiful error message. We won't precise this every time.

@deftypefn{}: char* mref_string (mref_t id)
Returns a pointer to a string containing a human-readable form of the data stored in mref id. This string is in an alloc-ring, and it will be recycled after 16 calls to this function. This is not overly elegant, but it certainly avoids memory leaks! Never try to free() these strings yourself.

@deftypefn{}: int mref_type (mref_t id)
Returns the numeric type of the data stored in mref id. The possible values are given in saml-mtypes.h. It would be better to call this the "superficial type" of id because it only reflects the type of the topmost object; thus, all polynomials have type ST_POLY, no matter the type of their coefficients.

@deftypefn{}: const char* mref_stype (mref_t id)
Returns the superficial type of the data stored in mref id, in a human-readable form. The string lies in static storage and can't be modified.

@deftypefn{}: int mref_length (mref_t id)
If the mref id is an array, returns the length of this array, and otherwise zero.

@deftypefn{}: int mref_array (mref_t id, int type, int length)
Creates an array of type type and length length, containing void objects, and stores it in mref id. Returns zero on success, a negative error number on failure. The error codes are given in the header file saml-errno.h.

@deftypefn{}: int mref_getitem (mref_t id, mref_t array, int i)
Gets the i-th element of array and store it into id. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_setitem (mref_t array, int i, mref_t id)
Replaces the i-th element of array with the current value of id. Returns zero on success or a negative error number on failure.

@deftypefn{}: int mref_build (mref_t id, int type, const char* string)
Builds an object of type type from the string string, and store the result into the mref id. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_cast (mref_t id, int type)
Modifies the superficial type of mref id. Returns zero on success, a negative error number on failure. For instance, if id contains a rational and type is ST_POLY, then id will be cast to a polynomial with rational coefficients.

@deftypefn{}: int mref_promote (mref_t id, mref_t model)
Modifies the type of mref id to match the model. This is usually different from the previous function, because a model contains much more information than simply its "superficial type". For instance, if id is an integer and model is a polynomial with rational coefficients, then id will be converted to a polynomial with rational coefficients too, whereas mref_cast(id,ST_POLY) would have given a polynomial with integer coefficients. Returns zero on success, a negative error number on failure.

@deftypefn{}: void mref_copy (mref_t dest, mref_t src)
Copies src into dest.

@deftypefn{}: void mref_swap (mref_t id1, mref_t id2)
Swaps the contents of mrefs id1 and id2.

@deftypefn{}: int mref_add (mref_t dest, mref_t id1, mref_t id2)
Adds the contents of id1 and id2 and stores the result into dest. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_sub (mref_t dest, mref_t id1, mref_t id2)
Stores the difference of id1 and id2 into dest. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_mul (mref_t dest, mref_t id1, mref_t id2)
Stores the product of id1 and id2 into dest. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_div (mref_t dest, mref_t id1, mref_t id2)
Stores the quotient of id1 by id2 into dest. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_mod (mref_t dest, mref_t id1, mref_t id2)
Reduces id1 modulo id2 and puts the result in dest. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_gcd (mref_t dest, mref_t id1, mref_t id2)
Stores the greatest common divisor of id1 and id2 into dest. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_invert (mref_t dest, mref_t id)
Puts the inverse of id into dest. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_negate (mref_t dest, mref_t id)
Puts the opposite of id into dest. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_zero (mref_t dest, mref_t model)
Puts into dest an object which has the same type type as model and is a neutral element for addition. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_one (mref_t dest, mref_t model)
Puts into dest an object which has the same type type as model and is a neutral element for multiplication. Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_notzero (mref_t id)
Returns zero if the object is zero, one if it's non-zero, or -1 if there's no such thing as a "zero" for objects of this type.

@deftypefn{}: int mref_isneg (mref_t id)
Returns one if the object is less than zero, zero otherwise, and -1 if the object can't be compared with zero (either because there's no order, or no zero).

@deftypefn{}: in mref_differ (mref_t id1, mref_t id2)
Return one if the objects differ, zero if they are equal, and -1 if it can't be decided.

@deftypefn{}: int mref_lessthan (mref_t id1, mref_t id2)
Returns one if id1 is less than id2, zero otherwise, and -1 if the objects can't be compared.

@deftypefn{}: int mref_power (mref_t dest, mref_t id, int exponent)
Raises the mref id to the power exponent. If exponent is negative, id is inverted and then raised to the power -exponent. Feature: if exponent is zero, then the result is always one. Caution: the third argument is the exponent itself, not the identifier of a mref. Thus, this function cannot be used with huge exponents (greater than INT_MAX). Returns zero on success, a negative error number on failure.

@deftypefn{}: int mref_sqrt (mref_t dest, mref_t id)
Puts the square root of id into dest. Returns zero on success, a negative error number on failure.

@deftypefn{}: int saml_parse (mref_t dest, mref_t model)
This function should only be called after the lexer has been initialized with either saml_init_lexer_fd() or saml_init_lexer_mem(), otherwise, the behaviour is undefined. The file descriptor or memory area will be parsed and the result put into dest; it will have the same type as the model. This function returns 0 on success and a negative error number on failure.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.4 Type-specific operations on mrefs

The following operations are meaningful only on certain types (polynomials, tensors). Like the other ones, they return 0 on success and a negative error number on failure.

@deftypefn{}: int mref_subs (mref_t dest, mref_t src, mref_t e1, mref_t e2)
This operation is meaningful only if src is a polynomial or a tensor with polynomial coordinates. The other operands e1 and e2 must have the same type as src. Moreover e1 must have exactly one term (so it is a monomial) with coefficient one (this restriction is not enforced; the coefficient is ignored). This function will substitute all occurences of e1 by e2 in src, and the result will be stored into dest.

@deftypefn{}: int mref_det (mref_t dest, mref_t matrix)
The determinant of the matrix (if the second argument has type ST_MATRIX) is stored into dest.

@deftypefn{}: int mref_elim (mref_t dest, mref_t lit, mref_t e1, mref_t e2)
Eliminates the literal lit between expressions e1 and e2. The resultant is put into dest. The operands must be polynomials or tensors with polynomial coordinates, and have the same type.

@deftypefn{}: int mref_diff (mref_t dest, mref_t expr, mref_t lit)
Differentiates expr with respect to the literal lit and puts the result in dest. The operands must be polynomials or tensors with polynomial coordinates, and have the same type.

@deftypefn{}: int mref_move_lit (mref_t dest, mref_t tensor, mref_t l1, mref_t l2)
Renames all occurences of the tensorid l1 in tensor with l2, maybe causing a contraction if the tensor already contains l2. The operands must have types ST_TENSOR and ST_LITERAL. If the tensor doesn't contain the first literal, an error is returned. This function is similar to mref_subs() but must be different, for namespace reasons, since the same literal could be used both as a tensorid and as a polynomial indeterminate.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.5 Handling run-time errors

We have already seen how most mref functions return a negative value when an error occurs. However, it is tedious (and error-prone) to test all return values, so other mechanisms exist.

The mnode layer, in the case of an error, returns a mnode of the special type ST_VOID, which contains the error number and the name of the routine where it occurred. The mref layer detects this condition, calls the error handler with two arguments and (unless this handler caused the program to abort) returns -errno to the caller. This handler is declared in the `saml.h' header file as follows:

 
extern void (*saml_error_handler) (int reason, const char* where)

The library provides three error-handlers as examples, but the user can install his own. Here they are:

@deftypefn{}: void saml_default_ehandler (int, const char *)
The default handler; displays an error message on standard error and increments the global variable saml_errors.

@deftypefn{}: void saml_fatal_ehandler (int, const char *)
Displays an error message on standard error and aborts. This certainly prevents the error from propagating...

@deftypefn{}: void saml_silent_ehandler (int, const char *)
Increments the global variable saml_errors. This is useful when you expect errors (for example, to verify user-supplied data) and you'd like to print a more informative error message; this is what factorint(1) does to detect malformed numbers.

Objects of type ST_VOID (uninitialized variables and run-time errors) are also handled in a special way by the mnode dispatcher. When a mref function takes several arguments and these arguments have conflicting types, the dispatcher will return a "type mismatch" error unless one of the arguments has type ST_VOID; in this case, this argument will be the return value. In other words, errors will be propagated without alteration (unless they meet another error, then we'll have to drop one of them). This behaviour is similar to NaN for floating-point numbers: errors are "sticky" and can't disappear.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.6 Error codes

`SE_ERROR'
General error. The least informative error message.

`SE_WR_TYPE'
Wrong type. Some operand didn't have the expected type for the operation.

`SE_WR_SIZE'
Wrong size. One of the operands is an array whose size is not what was expected. Also returned when you try to create an array with a negative size.

`SE_TCONFL'
Type conflict. By design, SAML only allows you to combine elements of the same type, and get a result of the same type (up to a few exceptions). This error is usually raised by the mnode layer, not by the type-specific code, but there are exceptions.

`SE_SCONFL'
Size conflict. The operands are arrays, but their sizes don't match. For example, you can't add two matrices of different dimensions.

`SE_ONSUPP'
Operation not supported. The operation is not defined for objects of this type. Usually raised by the mnode layer, when an entry is NULL in the s_mtype structure.

`SE_TBL_FULL'
Table is full. Blame some internal static table. I don't think there are any for the moment.

`SE_DIVZERO'
Division by zero. An attempt to invert zero will raise the same error.

`SE_OODOMAIN'
Out of definition domain. The operands are such that the result is undefined. For example, the square root of a negative number will raise this error.

`SE_INDEX'
Index out of range. Especially for operations on arrays.

`SE_SINGULAR'
Not invertible. Someone tried to invert, or divide by a singular element, for example a matrix of determinant zero.

`SE_NSTYPE'
No such type. Probably only returned by mnode_cast().

`SE_NOT_ARRAY'
Not an array. Thus, some operations (like indexing) are undefined.

`SE_IUTYPE'
Type already in use. Can be returned by register_mtype().

`SE_STRING'
Cannot parse string. The string argument passed to mnode_stringify() could not be transformed into something of the given type.

`SE_ICAST'
Impossible cast. The cast or promotion couldn't be performed.

`SE_NOTRDY'
Not yet implemented. Blame the laziness of the software writer. Better, write the function yourself.

`SE_EMPTY'
Not initialized. This is not really an error; this is what fresh mrefs contain when they are created, so that one doesn't attempt operations on them.

`SE_HETERO'
Heterogeneous operands. The operands have the right types, but somehow they can't be combined. For example, adding two monomials which are not proportional, or tensors which don't have the same tensorids, will cause this error.

`SE_NSCALAR'
Not a scalar. Some operations on tensors are meaningful only when the tensor is a simple element of the base field.

`SE_OVERFLOW'
Internal overflow. Some data structures have hard-coded limits (for example, the exponents in ST_APOLY polynomials can't exceed 2^32-1), and this error will be raised if the result can't fit. This is probably better than silently ignoring the rollover.

`SE_TOO_SPARSE'
Object is too sparse. Thus, it can't be converted to a dense representation without using huge amounts of memory and processor time. Polynomials can have this problem.

`SE_NSMOD'
Not the same modulus. The operands live in different quotient spaces, so it's meaningless to combine them.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.7 Growable strings

Resizable strings are extensively used by SAML, and since user programs may need them, I have included them in the public interface. Unlike C strings, they're not necessarily NUL-terminated and can contain embedded NULs; the length is part of the data structure. The gr_string structure is defined in `saml-util.h', as follows:

 
typedef struct _growable_string {
        size_t  buflen;    /* allocated length */
        size_t  len;       /* current length */
        char    s[0];      /* data */
} gr_string;

It is OK to manipulate these objects directly, provided that len (the current length) never exceeds buflen (the allocated length). To create them, or increase their length, you should use the following functions, also prototyped in `saml-util.h':

@deftypefn{}: gr_string* new_gr_string (size_t blen)
Returns a pointer to a newly-allocated growable string, initially empty, with blen bytes of preallocated storage (or some default value if blen is zero).

@deftypefn{}: gr_string* grs_append (gr_string* dest, const char* src, size_t len)
Appends len bytes, starting at address src, to the growable string pointed to by dest. The return value is the new address of the gr_string, which can be different from dest if a reallocation took place.

@deftypefn{}: gr_string* grs_prepend (gr_string* dest, const char* src, size_t len)
Similar to the above, but the data are inserted at the beginning of the growable string, not at the end.

@deftypefn{}: gr_string* grs_append1 (gr_string* dest, char c)
Appends one byte to a growable string and returns its new address.

@deftypefn{}: gr_string* grs_prepend1 (gr_string* dest, char c)
Prepends one byte to a growable string and returns its new address.

Important: Never forget that a gr_string can be relocated to a new address if you add data to it. Thus, you should never ignore the return values of the above functions. Also, don't forget to put an explicit '\0' at its end if you want to use it as a normal C string.

To conclude, let's notice that it's very easy to shrink an existing growable string; just reduce the len field. In particular, setting this field to zero will clear the string. This trick is used several times in the sample programs.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.8 The parser

SAML provides a simple parser for the common cases where you want to read symbolic expressions from a file or an area of memory. To use it, you must first initialize the lexer, with one of the following functions:

@deftypefn{}: void saml_init_lexer_fd (FILE* fd)
Asks the lexer to read from the file descriptor fd, from the current position (usually the beginning) up to the end.

@deftypefn{}: void saml_init_lexer_mem (const void* start, size_t length)
Parse the string starting at address start and with length length. The string doesn't need to be zero-terminated, and embedded zeros, if any, will be ignored.

@noindent{}and then you can call saml_parse().

The lexical conventions are simple: identifiers can be either C-style identifiers like __foo_1 or FORM-style bracketed identifiers like [z^2+c]. Whitespace is ignored. The complete grammar can be found in the file simple-parser.c.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.9 Things to do

[ Some remarks about the parser ]


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4. The mnode interface


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.1 Data structures

Mnodes are only visible inside the library. To use them, you have to include `mnode.h'. Every mnode begins with the following header, so a pointer to a mnode (which has type `s_mnode*' or `mn_ptr' for brevity) is identified with a pointer to its header:

 
typedef struct mnode_header {
        int type;              /* superficial type */
        int refs;              /* reference counter */
} s_mnode, *mn_ptr;

The first field contains the so-called "superficial" type, used for dispatching. The second field counts how many times the mnode is referenced -- the mnode is deleted when this counter reaches zero. But this is only a template, all mnodes will contain other data appended to the above header. The format of these data is type-specific; there can be integers, references to other mnodes, whatever. So-called standard mnodes are simply arrays of references to other mnodes; they are defined in `mnode.h' as follows:

 
typedef struct {
        struct mnode_header hdr;        /* see above */
        int length;                     /* length of the remaining data */
        mn_ptr x[0];                    /* remaining data (appended) */
} std_mnode, *smn_ptr;

With these types you can use the array operations of the public interface: mref_array(), mref_getitem() and mref_setitem(), but be careful: SAML won't warn you if you attempt these operations on non-standard types, and even standard types can expect class invariants, which can be violated with these functions.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.2 Initialization of new types

Each type, standard or not, is identified by a small non-negative numbers; builtin types are defined in `saml-mtypes.h'. Each type has an initialization routine, usually called from saml_init(), which is responsible for registering the new type and conversion routines, and maybe initialize some private data structures, like hash tables. If you create a new type, you should insert its initialization routine in the saml_init() routine.

@deftypefn{}: void register_mtype (int type, void* desc)
Register a new math type. The second argument should be a pointer to a structure of type s_mtype or more probably unsafe_s_mtype. These two data structures are bitwise equivalent, but the second one is more laxist on type verifications, which is often desirable.

The s_mtype structure, defined in `mnode.h', describes the operations supported by this type. A NULL value means that some operation is unsupported. Since these functions are always called via the dispatcher, the implementor can assume that all type checking has already been done, and the arguments have the right types. They should return a new mnode with the same type (or a new reference to an already existing mnode) or an error mnode.

Not everything is stored in the s_mtype structure. Type conversion is particular because its requires a double dispatching (depending on the initial and final type), and thus is implemented in a special way. You can register a conversion routine with

@deftypefn{}: void register_CV_routine (int t1, int t2, void* fn)
Registers a conversion routine fn from type t1 to type t2. You can register a generic conversion routine by passing -1 as the first argument (generic conversion routines are supposed to take an argument of any type, and convert it to type t2).

Assume that t1 and t2 correspond to types foo and var. The prototype of the conversion routine should be as follows:

@deftypefn{}: s_mnode* foo2bar (foo_mnode* arg, bar_mnode* model)
This function will always be called with arg pointing to a mnode of type foo (unless it has been declared as generic; in this case, the type of arg is unknown) and model being either NULL or the address of a mnode of type bar. It should return a new reference to an object of type bar or an error.

Generic conversion routines are usually not needed, except in rare cases where a model is needed for the conversion (matrices are an example of this, because you must know their size to convert a number to a scalar matrix).


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.3 Constructors and destructors

Special functions are the constructors and destructors. File `mnode.c' provides three allocators for new mnodes:

@deftypefn{}: s_mnode* __mnalloc (int typeid, size_t len)
Creates a mnode with type typeid and len bytes long (including the struct mnode_header part) and one reference.

@deftypefn{}: std_mnode* mstd_alloc (int typeid, int len)
Creates a standard mnode (i.e., an array) with type typeid and len entries. Standard mnodes should use the standard destructor mstd_free().

@deftypefn{}: s_mnode* mnode_error (int reason, const char* where)
Creates an error mnode, i.e., a mnode with type ST_VOID. The first argument is the error code, the second argument should be the name of the function where the error occurred.

To delete or duplicate existing mnodes, it is best to use the following (inline) functions, otherwise it gets messy:

@deftypefn{}: void unlink_mnode (s_mnode* node)
Removes a reference to node; it decrements the reference counter, and calls the destructor if the counter reaches zero.

@deftypefn{}: s_mnode* copy_mnode (s_mnode* node)
Creates a new reference to node; it increments the reference counter and returns node. It is equivalent to duplicating the object, only much much faster.

Most routines (addition, for instance) return a new mnode, more precisely, they return a new reference to a mnode. Keep this in mind when you write operations for parametrized types: count how many references you create (most operations will create one reference) and how many you delete (with unlink_mnode()). If you unlink too many references, you will sooner or later get a crash because of a dangling pointer. On the other hand, if you forget to remove references to temporary objects, they will never be freed and it is a memory leak. As a debugging aid to track this kind of problems, the library provides three global variables, called:

 
extern long nb_mnodes_allocated;
extern long nb_mnodes_reserved;
extern long nb_mnodes_freed;

These three counters are initially zero. The first counter is incremented every time __mnalloc() is called (perhaps via mstd_alloc()), and the last counter is incremented every time destroy_mnode() is called (via unlink_mnode() normally). The second counter should be incremented each time you allocate a mnode you don't plan to ever release (for good or bad reasons); for instance, the module `Integer.c' preallocates two mnodes for 0 and 1 in its initialization routine, and so increments nb_mnodes_reserved by two units. Reserved mnodes should be exceptional.

If all the reference-counting is correct, the counters will verify the following relation, after all mrefs have been freed:

 
nb_mnodes_allocated == nb_mnodes_reserved + nb_mnodes_freed

Yes, I have heard about automatic garbage collecting. It is nice for the programmer but not so nice for the computer; in my experiments, the garbage-collected version of samuel used three times as much memory as the normal reference-counting version, and it was slightly slower. You can't have your cake and eat it, too.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.4 Conversion routines

Most SAML routines expect arguments of the same type. You cannot directly add an integer and a fraction; you must convert the integer to a fraction first, and then call mref_add() or mnode_add(). At the mnode layer, two functions allow you to convert objects:

@deftypefn{}: s_mnode* mnode_promote (s_mnode* arg, s_mnode* model)
Convert arg to a mnode with the same type as model and return this, or an error.

@deftypefn{}: s_mnode* mnode_cast (s_mnode* arg, int newtype)
Convert arg to a mnode with superficial type newtype and return this, or an error.

It should be noted that mnode_promote() can be useful even if arg and model have the same superficial type; you might use it, for instance, to convert a polynomial with coefficients in some ring K to a polynomial with coefficients in L, if there exists a natural map form K to L.

The only problem with mnode_promote() is that it requires a "model", i.e. a mnode which has already the right type. It's a chicken-and-egg problem: how do you create the first object with this type? This is why mnode_cast() is useful, it allows you to create new types from simpler ones.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.5 How cast and promote work

This is not pretty nor easy to describe, but I'll try anyway; in any case you should read `promote.c' for more detail. Conversion routines between types are registered through register_CV_routine() and stored into a dictionary whose keys are the (type1,type2) pairs.

Let type1 and type2 the original type and the destination type, respectively. If there is a conversion routine from type1 to type2, then we execute it (and if it fails we return the error to the caller). Otherwise, if type1=type2 or type1 is ST_VOID, we return an unmodified copy of the initial object. Otherwise, we look for a "generic" conversion routine to type2; if there is one, we call it, and if it succeeds we return the result. Finally (i.e., if there was no generic conversion routine or if it failed) then we call mnode_make() to "cast" the object to type2, and report either success or failure.

This is hairy and probably more complicated then needed. More importantly, it's basically wrong, as it supposes that cast and promote are essentially the same operation. But in fact, the role of cast is to create new types: for instance, I have an object of type A and want to create a polynomial whose coefficients have type A. The current scheme, for instance, doesn't allow you to create nested "Unary polynomials" using cast().


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.6 Things to do

Type conversion is a big mess and should be redone entirely. Also, some very important things are missing from the API. There's no way to get the numerator and denominator of a fraction, for instance, because it isn't a "standard" type (i.e., an array) like unary monomial. More generally there's no way to break a complex object like a polynomial or a tensor into simpler objects: there are no iterators. Sure, it would expose some implementation details, but I think it's unavoidable, in some cases it is really necessary.

I think the right way to fix these two problems would be to add to each "opaque" type (like "Rational"), a "broken" type which would be an array-type -- here, an array of length two, containing the numerator and the denominator. The broken type would allow no operations, apart from accessing the pieces, and conversion routines to and from the opaque type. This would be much cleaner than the current ad-hockery and would give us a safe way to peek into objects.

There should be some kind of "call by name" as in Objective-C to call arbitrary methods, but then we should choose a convention for passing arguments and the retrieving the result(s).

We should really implement mnode_info() someday. It's supposed to get "hints" about the mnode from an upper layer of the library, for example, what's its estimated complexity (this is important for the Bareiss algorithm), are multiplications really expensive, should we avoid denominators, etc.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5. Python bindings

For C programmers, mrefs are a great advantage over mnodes because they handle the gory details of reference counting, and they're mutable objects like most other C types. In the case of Python, these considerations are not relevant, because the language does the reference-counting automagically and has no problem with immutable objects. Therefore, the class Mathnode defined by the saml1 module is only a thin layer around mnodes, and the semantics are pretty much the same; in particular, the objects are immutable. The main difference is that "error" mnodes are reported as exceptions instead of class instances.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.1 The module saml1

This module defines the class Mathnode, and the functions MnAllocStats(), MnArray() and resultant(). It also defines the symbolic constants ST_* corresponding to the builtin SAML types, and the Saml*Error exceptions.

saml1.MnAllocStats()
Returns a triple of integers, containing the number of allocated, reserved, and freed mnodes (so far). This is useful to detect leaks in the library, or to evaluate the efficiency of a calculation.

saml1.MnArray(mtype, tuple)
Builds a standard Mathnode of type mtype from a tuple of Mathnodes. This only applies to standard math-types.

saml1.resultant(poly1, poly2, lit)
Computes the resultant of two polynomials with respect to some literal. Defining this function as a method seemed too error-prone (should self have been the first, second, or third argument?)


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.2 The Mathnode class

Most of the interesting operations on mnodes are accessible as methods of the class Mathnode. Usually the self argument will be the first argument of the corresponding operation on mnodes. This class also offers attributes to get the math-type and the reference counter of the underlying mnode. Here is the complete list of attributes and methods (note that all attributes are read-only, since the objects are immutable):

self.refs
Returns a Python integer, containing the reference counter of the mnode (that's mainly a debugging aid).

self.type
Returns a Python integer, containing the math-type of the mnode in numeric form.

self.stype
Returns a string containing the math-type in a human-readable form.

self.__members__
Returns the list of all possible attributes (this is a standard Python convention).

self.__methods__
Returns the list of all possible methods (Python creates this list automagically).

Mathnode(mtype, string)
The constructor for this class: it takes two arguments, an integer and a string. The first argument is the type of the object to be created, the second is a representation of this object. Assuming all went well, a Mathnode is returned. This functions corresponds to mnode_build(), and is supported only for "simple" math-types like integers, rationals or literals. More complicated types (like arrays, polynomials and tensors) can't be obtained this way; you must use either MnArray() to create an array type, or cast() or promote() a simpler object.

self.cast(new@_type)
Takes an integer argument, and returns a Mathnode of type new_type, this function corresponds to mnode_cast().

self.promote(model)
Takes a Mathnode as argument, and returns a Mathnode with the same value as self but the same type as model. This function corresponds to mnode_promote().

self.parse(string)
Parses a string and returns a Mathnode with the same type as self. This is a wrapper around saml_parse(), which is not a function on mnodes but still a very useful one.

self.zero()
Returns a Mathnode whose value is zero and has the same type as self. This corresponds to mnode_zero().

self.one()
Returns a Mathnode whose value is one and has the same type as self. This corresponds to mnode_one().

self.negate()
Returns the opposite of self, this corresponds to mnode_negate().

self.sqrt()
Returns the square root of self, this corresponds to mnode_sqrt()

self.det()
Returns the determinant of self, this corresponds to mnode_det()

self.ne0()
Returns the integer 1 if self is non-zero, and 0 otherwise. This corresponds to mnode_notzero().

self.lt0()
Returns the integer 1 if self is less than zero, and 0 otherwise. This corresponds to mnode_isneg().

self.differ(arg)
Takes a Mathnode as argument, and returns the integer 1 if self and arg differ, and 0 otherwise. This corresponds to mnode_differ().

self.lessthan(arg)
Takes a Mathnode as argument, and returns the integer 1 if self is less than arg, and 0 otherwise. This corresponds to mnode_lessthan().

self.info(what)
Takes an integer as argument, and returns an integer containing some information about self. This corresponds to mnode_info() (which is not yet implemented currently, but the stubs are here).

self.gcd(arg)
Takes a Mathnode as argument, and returns the greatest common divisor of self and arg. This corresponds to mnode_gcd().

self.pow(exponent)
Takes an integer as argument, and returns self raised to this (possibly negative) power. This corresponds to mnode_pow().

self.mvl(lit1, lit2)
Takes two strings as arguments, and (assuming self is a tensor) returns a new tensor where lit1 has been moved to lit2, possibly causing a contraction. This corresponds to mnode_move_lit().

self.len()
If self is a standard type, returns the length of the array.

self.item(index)
If self is a standard type, returns the Mathnode at position index in the array.

The usual operations (addition, substraction, multiplication, division and remainder) are accessible with the normal binary operators. The binary comparison operators are also usable; the only problem is that they can't raise an exception if an error occurs. Finally, integers can be coerced to Mathnodes, so it's possible to write `z+1' or `if 3<z<4' and have the conversions performed automatically.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.3 Exceptions

Mnodes representing errors (that is, mnodes for which saml_errno() would return a non-zero value) are not considered as Mathnode objects, but are reported as exceptions. There is a one-to-one correspondance between the saml1 exceptions and the possible return values of saml_errno(), except out-of-range values which are all reported as SamlUnknownError (this is consistent with the behaviour of saml_strerror(), btw). Have a look at saml1.c for the complete list of exceptions.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.4 Some examples

Calling SAML from Python is much simpler than the same job in C, because all objects are created and destroyed automatically. For instance, our first sample program (adding 123 and 456) would be simply

 
from saml1 import *
print Mathnode(ST_INTEGER,'123') + Mathnode(ST_INTEGER,'456')

Note that the second Mathnode() wasn't absolutely necessary, we could have relied on automatic coercition, and written

 
from saml1 import *
print Mathnode(ST_INTEGER,'123') + 456

In any case, we need at least one object of the "right" type, then we can rely on promote() and parse() and automatic conversion of integers. For example, to compute (1+X+X^2/2)^100, we need polynomials with rational coefficients. A solution is

 
from saml1 import *
model = Mathnode(ST_RATIONAL,'0').cast(ST_APOLY)
print model.parse('(1+X+X^2/2)^100')

In this particular example we could have omitted the assignment to model and done the whole thing in one line, but usually we'll need the model for lots of other things, so we'll keep it preciously.

Other examples can be found in the `python' subdirectory.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.5 Things to do

Coercition should work with Python long integers as well, and there should be methods to convert a Mathnode to an integer when possible. Standard mathnodes (i.e., array types) should behave like Python arrays.

Perhaps we should ensure that different Mathnodes always point to different mnodes; thus, Mathnode identity would be equivalent to mnode identity. It would allow us to use some Mathnodes (for instance, literals) as dictionary keys.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6. Implementation details


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.1 Integers

Integers are implemented as custom mnodes (but put zero in the length field so that they can use the standard destructor), with the following data structure:

 
typedef struct {
        int blocks;
        __u32 d[0];
} big_int;

The blocks field contains the number of limbs, and the sign of the number. Zero has zero blocks. Numbers are written in base 10^9 (the highest power of 10 which fits in a 32-bit word) in little-endian format. For space-efficiency reasons, two mnodes are preallocated for "0" and "1".

Most algorithms are taken from Knuth, Seminumerical Algorithms. The square root algorithm is the simplest I could think of, and is obviously very inefficient for large numbers. I only needed it to implement Brillhart-Morrison's algorithm, so performance wasn't crucial.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.2 Reals

This is also a custom type with the standard destructor. Reals are stored as follows:

 
typedef struct {
        int blocks;
        int exponent;
        __u16 d[0];
} big_float;

Here blocks is always positive, and a power of two; it is the size of the d[?] array. The number is stored under the form m.2^e, where m is between 1/2 (included) and 1 (excluded) and e is some integer. The exponent field has the same sign as the number, and its absolute value is 2^30+e. The mantissa is expressed in base 2^16, and stored in big-endian format, i.e., d[0] is the most significant limb, in particular its most significant bit is 1, unless the number is zero.

Here again, most algorithms are taken from Seminumerical Algorithms. Someday we should rewrite this type to use 32-bit limbs, implement the square root, and all usual transcendental functions. Later...


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.3 Cyclic integers

All cyclic integers (ST_CYCLIC) are stored in a hash table, which is resized when appropriate. When a new cyclic integer must be returned, we examine if it already exists in the table; if it does, we just copy_mnode() it, otherwise we create the new object, and insert it into the hash table.

It is exactly similar to what we do for literals, but the goal here is mainly to save memory, and to avoid extra malloc/free operations which are much more expensive than a hash table lookup.

Important: The cyclic_invert() routine assumes that the modulus n is prime. Otherwise it will just raise your number to the power n-2. Maybe I should fix this someday.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.4 Multivariate Polynomials (first implementation)

Polynomials (ST_POLY) are a standard type. A polynomial with n terms (with the convention that n=0 for the null polynomial) is represented as an array of n+1 monomials, where the first one is zero. Why, you might ask? This is because we need to keep type information about the coefficients of the polynomial, even if the polynomial is zero. On the other hand, it complicates the algorithms if we allow some terms to be zero; so this term is not really part of the polynomial, it is only used to store type information.

Monomials are a custom type with a custom destructor. Their representation consists of the coefficient, followed by (literal, exponent) pairs. This representation is particularly suited to very sparse polynomials, but requires many allocations/deallocations and reference-count modifications.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.5 Univariate Polynomials

Dense unary polynomials with an unnamed variable (ST_UPOLY) are also a standard type. A polynomial of degree d is represented as an array of d+1 coefficients; the null polynomial is represented like other constant polynomials. They are used by other parts of the library to implement polynomial substitution, or polynomial gcd (using Collins' method).


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.6 Multivariate Polynomials (second implementation)

The first implementation, while perfectly suited to very sparse polynomials, is not particularly time- or space-efficient when the number of literals is small. All monomials are allowed to use their own set of literals, which is usually redundant, and it complicates all operations.

In this alternative implementation (ST_APOLY), all literals are grouped at the beginning of the object -- so, they don't have to be repeated for each monomial; monomials are simply a (pointer to a) coefficient and an array of exponents. For common problems, in appears to be much faster and less memory-hungry than the previous approach. On the other hand, it will be very inefficient in some cases, for instance imagine a polynomial with is the sum of thousand different literals.

After some work, it seems to work nicely now. It is not a standard type. The set of coefficients is sparsely encoded, but the exponents are densely encoded.

[Description of how data are represented]


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.7 Tensors

Tensors are the solution to a serious limitation of SAML, the absence of an external multiplication. Since one cannot multiply objects of different types, the only solution to implement vectors, matrices, numbers, and still be able to combine them together, is to consider them as special cases of one type: the tensor.

Tensors are represented by a header containing its order (the number of literals involved), then (literal, start, len) triples, and finally the data. Special functions are provided to iterate over the indices. The representation is always dense.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

6.8 Things to do

Rewrite all the multiprecision stuff to use base 2^32 exclusively, and unify the routines for integer and real arithmetic.

The "new" implementation of polynomials (type ST_APOLY) uses too much memory and CPU time when there are too many literals. We should devise something else, allowing for moderately sparse exponent vectors, but still time- and space-efficient when the number of literals is small. It would allow us to throw away ST_POLY entirely.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

7. Application programs


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

7.1 samuel

samuel(1) is a user-friendly front-end to SAML, and an unvaluable tool to debug the library. It can be used either in interactive mode or in shell scripts. It doesn't allow rational functions, yet. There are other limitations; look at its manpage. User-friendliness has been improved with readline(3) support, and the possibility to filter all the output through an external program. As an illustration of this, colorize(1) formats and colorizes polynomials; to use it, just pass the option `--post=colorize' to samuel and watch the fun.

The syntax was designed to avoid keywords because they might clash with literal names. The option -b (or --batch) is useful in non-interactive mode, to avoid ambiguities between variable and literal names.

The current version of samuel does no longer use the stack provided by yacc to store temporary expressions, but handles the stack itself. This has at least two advantages. The most visible one is that it avoids memory leaks when a parse error occurs. Also, it simplifies the handling of variable-length argument lists (like determinants).

Security issue: samuel can execute arbitrary code. Thus, you should not feed it with unverified data. Reading and writing to pipes is especially dangerous.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

7.2 induce

induce(1) is heavily inspired from make(1), but it handles algebraic expressions instead of files. It is a purely declarative language; there are no variables (once a node has been computed, its value can no longer be modified) and no loops. This has the advantage that intermediary results can be memoized, even across multiple invocations of induce. The implementation consists of several more-or-less independant parts.

When the roots are read from standard input, they are processed one line at a time (so induce can be used as a bidirectional filter, in line mode). The above procedure is repeated for each line. A consequence of this is that induce doesn't remember the dependencies and the results of the previous lines; this is intentional, and ensures that the process size remains bounded. If you want memoizing between lines, mark as `precious' the appropriate nodes.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

7.3 factorint

This program is rather straightforward. Factoring integers is a very specialized task, thus it deserves a separate program (at least if one believes in the Unix philosophy of "small sharp tools"). I've taken the same notations as Knuth in Seminumerical Algorithms. The coolest feature is the option -m (or --mail) which puts factorint in the background, and sends a mail to you when the factorization is complete.

See the manpage for details; factorint uses the Rabin-Miller test (aka "strong Fermat test") for pseudo-primality testing. Factorization uses Pollard's rho method, and then the Multiple-Polynomial Quadratic sieve.

Factorization can be extremely long. On my machine (a 486dx2/66 running Linux) it takes two or three days to factorize a seventy-digit number; Moreover, the program eats up to fifteen megabytes of memory and the same amount of disk space.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

7.4 Things to do

Put tests, loops and control structures in the samuel interpreter? I'm not sure it's really a good idea; we'd have to redesign the interpreter entirely, whereas such functionality can be provided by external tools (either using pipes, or by interfacing with scripting languages). Let's keep it simple.

On the other hand induce lacks many commodities: it would be nice to have (internal) functions and loops and sums over a range of indices, but I doubt it'll ever be implemented, because these things (and much much more) are trivial to do in Python. Maybe we could rewrite induce in python.

The program factorint still has a few quirks with numbers whose factorization contains big primes raised to exponents greater than one. Sure, it recognizes perfect squares but MPQS will choke on numbers like p^2.q^3 where p and q are two big primes. Moreover, it might be useful to implement Lenstra's algorithm which uses elliptic curves.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

8. Bibliography


[Top] [Contents] [Index] [ ? ]

Table of Contents

1. Introduction to SAML
1.1 What is this document about?
1.2 Copyright
1.3 Our first SAML program
2. Terminology and data types
2.1 Math Nodes (mnodes)
2.2 Math References (mrefs)
2.3 Simple Math Types
2.4 Parametrized Math Types
3. The public C interface
3.1 Initialization
3.2 Creation and Destruction of mrefs
3.3 Operations on mrefs
3.4 Type-specific operations on mrefs
3.5 Handling run-time errors
3.6 Error codes
3.7 Growable strings
3.8 The parser
3.9 Things to do
4. The mnode interface
4.1 Data structures
4.2 Initialization of new types
4.3 Constructors and destructors
4.4 Conversion routines
4.5 How cast and promote work
4.6 Things to do
5. Python bindings
5.1 The module saml1
5.2 The Mathnode class
5.3 Exceptions
5.4 Some examples
5.5 Things to do
6. Implementation details
6.1 Integers
6.2 Reals
6.3 Cyclic integers
6.4 Multivariate Polynomials (first implementation)
6.5 Univariate Polynomials
6.6 Multivariate Polynomials (second implementation)
6.7 Tensors
6.8 Things to do
7. Application programs
7.1 samuel
7.2 induce
7.3 factorint
7.4 Things to do
8. Bibliography

[Top] [Contents] [Index] [ ? ]

Short Table of Contents

1. Introduction to SAML
2. Terminology and data types
3. The public C interface
4. The mnode interface
5. Python bindings
6. Implementation details
7. Application programs
8. Bibliography

[Top] [Contents] [Index] [ ? ]

About this document

This document was generated using texi2html

The buttons in the navigation panels have the following meaning:

Button Name Go to From 1.2.3 go to
[ < ] Back previous section in reading order 1.2.2
[ > ] Forward next section in reading order 1.2.4
[ << ] FastBack previous or up-and-previous section 1.1
[ Up ] Up up section 1.2
[ >> ] FastForward next or up-and-next section 1.3
[Top] Top cover (top) of document  
[Contents] Contents table of contents  
[Index] Index concept index  
[ ? ] About this page  

where the Example assumes that the current position is at Subsubsection One-Two-Three of a document of the following structure:

This document was generated by Debian Alpha Buildd on January, 21 2002 using texi2html