The Yacas Inner Workings

Programming In Yacas

Yacas provides an alternative way of defining user functions, more relevant to computer algebra. User functions, or more appropriately 'rule data bases', can be defined for functions with different numbers of arguments (or arity). So there can be a function f(x) and a function f(x,y) , both with a different rules database. f(x) is said to 'have arity one' and f(x,y) has arity 2.

For each function to be used, it should first be declared with RuleBase("function",{argument list});. So, a new rules database for f(x,y) would be created by typing RuleBase("f",{x,y});

After that, rules can be added for this function. A rule simply states that, for a specific function with a specific arity, if a certain predicate is true, then evaluate some expression and return its result, which then is to be treated as its simplified form. As a simple example, consider:
In( n ) = RuleBase("f",{n});

Out( n ) = True;

In( n+1 ) = Rule("f",1,5, n=0) 1;

Out( n+1 ) = True;

In( n+2 ) = Rule("f",1,6, IsInteger(n) And n>0) n*f(n-1);

Out( n+2 ) = True;

This defines a function that returns the factorial of a number. f(4);should now return 24, and f(a); should return f(a);The 'Rule' commands specify rules for function "f" with arity 1, one with precedence 5 and predicate n=0 , and another with precedence 6 and the predicate 'n must be a positive integer'. The rules with lowest precedence get evaluated first. So the rule with precedence 5 will be tried before the rule with precedence 6.Additionally, you could tell the parser to treat this function as a postfix operator:
In( n ) = PostFix("f");

Out( n ) = True;

In( n+1 ) = 4 f;

Out( n+1 ) = 24;

Note: the standard scripts already contain a factorial function, !.

There is a function Function , defined in the standard scripts, that allows you to define simple functions. A very simple example would be
Function First(list)  list[[ 1 ]] ; 
which simply returns the first element of a list. This could also have been written as
Function First(list)

[

list[[1]] ;

];

As mentioned before, the [ and ] brackets can be used to combine multiple operations to be performed one after the other. The result of the last performed action is returned.

Also, the := operator is overloaded to also be able to define functions that way. So the function first could also have been defined by simply typing
First(list):=list[[1]] ;
Now look at the definition of the function ForEach as defined in the standard package:
Function("ForEach",{foreachitem,foreachlist,foreachbody})

[

Local(foreachi,foreachlen);

foreachlen:=Length(foreachlist);

foreachi:=0;

While (foreachi < foreachlen)

[

foreachi++;

MacroLocal(foreachitem);

MacroSet(foreachitem,foreachlist[[foreachi]]);

Eval(foreachbody);

];

];

Bodied("ForEach");

UnFence("ForEach",3);

HoldArg("ForEach",foreachitem);

HoldArg("ForEach",foreachbody);

Functions like this should probably be defined in a separate file. You can load such a file with the command Load("file")This is an example of a macro-like function. Let's look at the last few lines first. There is a Bodied(...) , which states that a for command is to be entered as ForEach(item,{list}) body; . That is, the last argument to the command ForEach should be outside its brackets.UnFence(...) states that this function can use the local variables of the calling function. This is necessary, since the body to be evaluated for each item will probably use some local variables from that surrounding.

Finally, HoldArg("function",argument) specifies that the argument argument should not be evaluated before being bound to that variable. This holds for foreachitem and foreachbody , since foreachitem specifies a variable to be set to that value, and foreachbody is the expression that should be evaluated after that variable is set.Inside the body of the function definition there are calls to Local(...). 'Local' declares some local variable that will only be visible within a block [ ... ]. The command MacroLocal, works almost the same. The difference is that it evaluates its arguments before performing the action on it. This is needed in this case, because the variable foreachitem is bound to the variable to be used, and it is the variable it is bound to we want to make local, not foreachitem itself. MacroSet works similarly, it does the same as Set, except that it also first evaluates the first argument, thus setting the variable requested by the user of this function. The Macro... functions in the built-in functions generally perform the same action as their non-macro versions, apart from evaluating an argument it would otherwise not evaluate.

To see the function in action, you could type:
ForEach(i,{1,2,3}) [Write(i);NewLine();];
This should obviously write 1, 2 and 3, each on a new line.

Note: the variable names foreach... have been chosen so they won't get confused with normal variables you use. This is a major design flaw in this language. Suppose there was a local variable foreachitem, defined in the calling function, and used in foreachbody. These two would collide, and the interpreter would use only the last defined version. In general, when writing a function that calls Eval , it is a good idea to use variable names that can not easily be mistaken. This is generally the single largest cause of bugs when writing programs in Yacas. This issue should be addressed in the future.

Syntax

Although the engine is set up in such a way that you could easily roll your own parser and use that, the standard parser is quite sufficient. It is essentially a predictive parser for infix notations. This is the notation you will generally use for mathematical expressions.

You can just type things like a+b*c , which would of course be interpreted as a+(b*c) . Typing (a+b)*c would naturally perform the addition before the multiplication. Precedence can be set for the defined operators (precedence determines whether addition is performed before multiplication, for instance, it implicitly determines where the brackets are placed internally). General function calls have the form func(arg1,arg2); etc. Function calls don't need arguments: Exit(); will suffice in that case.

Atoms are generally thought of as strings. You can always separate things with spaces or newlines. The symbolic signs like +-*= etc. are treated as separate characters from the alphabetic and numerical ones, so a+2 will be separated into the tokens a, + and 2 by the lexical analyzer. The standard lexical analyzer is case-sensitive.

Additional Syntactic Sugar

The parser is extended slightly to allow for fancier constructs.

Internal Representation

The internal representation of expressions is very much like what you find in normal Lisp interpreters. Everything is either an "atom", or a list of atoms. Atoms are just text strings. During evaluation of such expressions, atoms can be interpreted as numbers, or as variables that are bound to some value.

Lists of atoms are generally interpreted in the following way: the first atom of the list is some command, and the atoms following in the list are considered the arguments.

There is one main difference with normal Lisp, as far as evaluation is concerned. Evaluation is sometimes abused as a mechanism for simplifying an expression. In that case, if an expression can not be simplified, the unsimplified version is returned. This is behaviour specific to a computer algebra system. Also, the first element of the list, which is a command to be performed, is also considered a type specifier for the object (the list).

Scope Of Variable Bindings

When setting variables or retrieving variable values, variables are automatically bound global by default. You can explicitly specify variables to be local. When invoking a function, a new stack frame is pushed for the locals, so the code inside a function body doesn't see the locals of the caller. A statement block can have its own locals that are not visible outside the block (blocks have the form [statement1();statement2(); ] etc.).

You can tell the interpreter that a function can see the local variables from the calling environment by using the UnFence command on the specified function.

User Defined Functions (The Rules Database)

Internally user defined functions are built up through rules databases. For a function "foo" you can define several rules databases, one for each arity. So, RuleBase("foo",{a}); would declare a new rules database for functions of the form foo(a). You can define versions for different arities, so another RuleBase("foo",{a,b}); would declare one for functions with the form foo(a,b);

You can specify that arguments are not evaluated before they are bound to the parameter: HoldArg("foo",a); would then declare that the a arguments in both foo(a) and foo(a,b) should not be evaluated before bound to a. This is useful for procedures performing actions based partly on a variable in the expression, like integration, differentiation, looping, etc., and will be typically used for functions that are algorithmic and procedural by nature.

Then declaring a rule for that function goes through Rule. The arguments to Rule merit some explanation: Rule("foo",arity,precedence,predicate,body); specifies that for function foo with arity (foo(a,b) has arity 2), there is a rule that if predicate is true, then body should be evaluated, and the original expression replaced by the result. precedences go from low to high, so a rule with precedence 0 will be tried before a rule with precedence 1. You can then later on specify parsing properties for the function if you like. If none of the predicates is true, the function with its arguments evaluated is returned.

This scheme is slightly slower for ordinary functions that just have one rule (with the predicate True), but it is desired behaviour for symbolic maninpulation. You can slowly build up functions using this scheme, testing the properties underway.

Evaluation Scheme

Evaluation when performed on an atom (an object with a text string representation) goes as follows: if the variable is bound locally as a variable, the object it is bound to is returned, otherwise, if it is bound as a global variable that is returned. Otherwise, the atom is returned unevaluated.

If the object is a compound object (internally, a list), the engine tries to find out if it is an internal command. If so, that is called. Otherwise, it possibly is a user-defined function (a "rule database"), and in that case the rule database is applied to it. If none of these are true, the object is returned unevaluated.

The main properties of this scheme are: when objects are bound to variables, they generally are evaluated first. When referencing the value of that variable, it isn't reevaluated again. Default behaviour of the engine is to return the original expression if it couldn't be evaluated. This is desired behaviour if evaluation is abused for simplifying expressions.

One major design flaw in Yacas (one other functional languages like LISP also have) is that when some expression is re-evaluated in another environment, the local variables contained in the expression to be evaluated might have a different meaning. This means for now that care has to be taken not to use too obvious variable names in functions that also call Eval.