Language Documentation

For microscheme version 0.9.2

This document is not intended to be a full Scheme tutorial. It is a specification of the subset of Scheme implemented by microscheme, and the particular workings of the runtime system. This document describes the built-in types and procedures of microscheme, and the mechanism by which runtime exceptions are reported.

Users who are new to Scheme might benefit from following a desktop tutorial such as Teach Yourself Scheme in Fixnum Days first, because programs running on a PC are much easier to debug than ones running on an Arduino.


Usage


  microscheme [-aucvrio] [-m model] [-d device] [-w filename] [-t rounds] program[.ms]

  Option flags:
    -a    Assemble (implied by -u) (requires -m)
    -u    Upload (requires -d)
    -c    Cleanup (removes intermediate files)
    -v    Verbose
    -r    Verify (Uploading takes longer)
    -i    Allow the same file to be included more than once
    -o    Disable optimizations  
    -h    Show this help message 

  Configuration flags:
    -m model     Specify a model (UNO/MEGA/LEO...)
    -d device    Specify a physical device
    -w files     'Link' with external C or assembly files
    -t rounds    Specify the maximum number of tree-shaker rounds
					


Type SystemTop

The basic built-in types are numbers, pairs, vectors, procedures, characters, Booleans and the empty list. Each of these has a corresponding type predicate: (number? 5). The basic numeric type holds 15-bit unsigned integers. The figure below shows the bit-level representation of these data types:

Lists and Strings are built-in compound data types. This means that they are represented as combinations of the basic types.

Lists are represented as chains of pairs, terminating in the empty list. A list is any expression x such that (list? x) evaluates to #t, and one may be constructed using the (list x y z …) form, or manually: (cons x (cons y (cons z '()))). (Please note that microscheme has no 'quote' primitive, and '() is a special notation for the empty list.)

Strings are represented as vectors of characters. A string is any expression y such that (string? y) evaluates to #t. They may be constructed using the syntactic sugar: "Hello", or manually: (vector #\H #\e #\l #\l #\o).


Fundamental FormsTop

A microscheme program is a sequence of top-level expressions. A top-level expression is a definition or any other expression. Definitions may not appear as subexpressions.

(define var val) represents a top-level (i.e. global) binding of the name var to the value val.

(define (proc vars …) exprs …) represents a procedure named proc, with parameters vars.

Expressions can take the following fundamental forms. (The keywords that identify fundamental forms are not first-class values. For example, lambda is not bound to any value at runtime.)

(lambda (vars …) exprs …) represents a lambda abstraction, or an anonymous procedure with parameters vars.

(let ((var val) …) exprs …) represents a lambda abstraction.

(if cond conseq else) represents a control branch. If cond evaluates to #t then this expression will evaluate to conseq, otherwise it will evaluate to else, or #f if no alternative is given.

(set! var val) represents an assignment.

(begin exprs …) represents a sequence of expressions.

(and exprs …) evaluates expressions until one evaluates to false.

(or exprs …) evaluates expressions until one evaluates to true.

(free! exprs …) evaluates expressions, recovering heap memory afterwards. (Caution!)

(include “filename”) is equivalent to the code contained in filename.

(list exprs …) represents a list whose elements are expressed by exprs.

(vector exprs …) represents a vector whose elements are expressed by exprs.

(call-c-func “function” args …) invokes the Foreign Function Interface.

(procedure args …) represents a procedure application.


Primitive Procedures: Top

Primitive procedures are built-in to the language, and have efficient assembly-code implementations. The compiler performs function inlining on calls to these procedures, and they are available as first-class values at runtime.

(eq? x y)   (= x y)   (> x y)   (>= x y)   (< x y)   (<= x y)   (not x)   (¬ x)   (number? x)   (pair? x)   (vector? x)   (procedure? x)   (char? x)   (boolean? x)   (null? x)   (assert x)   (error)   (stacksize)   (heapsize)   (pause x)   (micropause x)   (+ x y)   (* x y)   (- x y)   (div x y)   (mod x y)   (zero? x)   (cons x y)   (car x)   (cdr x)   (set-car! x y)   (set-cdr! x y)   (vector-length x)   (vector-ref x y)   (vector-set! x y z)   (make-vector x)   (digital-state x y)   (set-digital-state x y z)   (register-state x)   (set-register-state x y)   (char->number x)   (number->char x)   (arity x)   (apply proc lst)   (<< x)   (>> x)   (| x y)   (& x y)   (~ x)   (^ x y)


Library Procedures:Top

Library Procedures are written in microscheme, and are included automatically by the compiler. (There is no unnecessary memory penalty, because all unused definitions are eliminated by the tree-shaker.) Library Procedures are not inlined by the compiler, and do not have hand-optimized assembly implementations, but they generally represent much richer computations than primitives.

(equal? x y)   (list? x)   (string? x)   (forever x)   (for x y z)   (for-both x y)   (set-ddr x y)   (set-pin x y)   (output? x)   (input? x)   (high? x)   (low? x)   (output x)   (input x)   (high x)   (low x)   (toggle x)   (serial-init)   (serial-write x)   (serial-read)   (serial-available)   (analog-init)   (analog-read x)   (analog-write x y)   (reverse x)   (length x)   (member x lst)   (for-each fnc lst)   (fold x y z)   (fold-right x y z)   (map x y)   (map-2 x y z)   (zip x y)   (all test lst)   (for-each-vector x y)   (for-each-vector-reverse x y)   (all-vector x y)   (vector-concat x y)   (vector-copy x y)   (vector-last x)   (sub-vector x a b)   (vector-first x y)   (vector->list x)   (list->vector x)


Memory ManagementTop

Since microscheme has no garbage collector, it is important to program in a memory-conservative style. The most significant part of that process is identifying memory-thirsty constructs, and keeping them out of any loops, so they will be evaluated as few times as possible.

The fundamental forms which allocate new heap space during their evaluation are lambda, cons, list and vector. It is easy to keep list and vector allocation out of loops by thinking of them as mutable data structures which are allocated in some outer context, and only mutated inside loops. It is tempting to use lambda to produce a simple function, perhaps to pass as an argument to some higher-order function. Though this is an elegant use of the functional paradigm, some heap space is required for a new 'closure' object every time the lambda is evaluated. In these cases, one must manually perform 'lambda lifting', by turning those anonymous procedures into named global procedures, which are allocated only once. Here is an example:


  ; Unlifted version
  ; (Heap space used every time (loop) is evaluated)

  (define (loop)
    (for 0 1
      (lambda (i)
        (print (vector-ref my-data i)))))

  ; Lifted version
  ; (No heap space used every time (loop) is evaluated)

  (define (print-my-data i)
    (print (vector-ref my-data i)))

  (define (loop)
    (for 0 1 print-my-data))
				

In the unlifted version of the above code, some heap space is used (allocated) every time (loop) is evaluated. If this loop is evaluated regularly, the program will eventually run out of memory, and behave unpredictably. If the program is working at a high frequency, it may appear to crash straight away. The lifted version, however, may be evaluated indefinitely with no memory usage.

Another way of curtailing memory usage is the (free! exprs …) construct. Using free!, the subexpressions are simply evaluated in order, but any heap space used is recovered afterwards. This must be used with great care, because it may lead to dangling pointers. It must be used in cases where the subexpressions are intended to be run in a sandbox, with no new data structures persisting outside of the free! block. (Using free! may be described as avoiding garbage collection by occasionally setting fire to the trash can!). In the following example, free! is used to eliminate the runaway heap usage in the same way as the previous example:


  ; free! version
  ; (No heap space used every time (loop) is evaluated)

  (define (loop)
    (free!
      (for 0 1
        (lambda (i)
          (print (vector-ref my-data i))))))
          					

Foreign Function InterfaceTop

Microscheme has a robust FFI (Foreign Function Interface) meaning that C code may be invoked directly from (ms) programs. The FFI is invoked using the special form: (call-c-func "function" args …). Functions with up to 10 arguments are supported.

Since Scheme is a dynamically typed language, whereas C is statically typed, typing interactions must be planned and annotated carefully. Generally, type conversion is achieved from the C side of the interface. Any C function invoked by the FFI should take arguments of type ms_value, and return a result of that type. (Values of this type may be cast directly to and from unsigned integers.) The file "microscheme_types.c" contains the necessary C framework to work with other microscheme types, including vectors and lists. The file "ffitest.c" shows how (ms) data structures may be recursed through in C code.

One benefit of the FFI is that C libraries (such as the standard Math library) are effectively available within microscheme. The following example shows how a simple wrapper to the "power" function may be called via the FFI.


  // ffitest.c

  #include <math.h>
  #include "microscheme_types.c"

  ms_value mathpow(ms_value x, ms_value y) {
      return round(pow(x, y));
  }
				

  ;; ffitest.ms

  ;; Define a scheme wrapper around our external function...
  (define (pow x y)
      (call-c-func "mathpow" x y))

  ;; We call the external C function, and send the result via Serial
  (serial-init 96)
  (serial-write (pow 2 5))
				  

The microscheme and C source files are combined during compilation at the 'linking' stage. The simplest way to achieve this is using the '-w filename' command line option when invoking microscheme. To compile the above example, one might run the command: $ microscheme -m UNO -w ffitest.c -a ffitest.ms.

Microscheme produces properly relocatable assembly files. Therefore, in a more complicated project, one may compile microscheme source files to assembly (by omitting the '-a' flag), then link and upload the program manually.

Interrupts are not used in the microscheme runtime system, but the FFI does enable or re-enable interrupts whenever C code is running. Therefore, C interrupt handlers will generally work, so long as control is periodically passed to some C function.


Runtime ExceptionsTop

Like Scheme, microscheme is strongly, dynamically typed. Exceptions are semantic errors that arise at runtime. Microscheme makes use of the Arduino's built-in LED on digital pin 13 to give on-device indications of these situations. Generally, exceptions are not recoverable, and the device will need to be reset if an exception is raised. While it is possible to use digital pin 13 for general input and output, it is highly recommended to leave it free for exception indication.

StatusMeaningIndication
RUNProgram RunningNo Light
NVPNot a Valued ProcedureSingle Flashes
NARNumber of ARguments2 Flashes
NANNot A Number3 Flashes
NAPNot A Pair4 Flashes
NAVNot A Vector5 Flashes
OOBOut Of Bounds6 Flashes
DBZDivide By Zero7 Flashes
ERRCustom ExceptionContinuous Flashes
HALTProgram CompletedContinuous Light

NVP: A procedure application takes the form (proc X1 X2 ... Xn) where proc is an expression. At the time of application, if proc does not evaluate to a (valued) procedure, such as the result of a (lambda …) form, or a variable bound to a procedure, then NVP will be raised.

NAR: A procedure application takes the form (proc X1 X2 ... Xn) where X1 X2 ... Xn are arguments. At the time of application, if proc evaluates to a procedure taking m arguments, but m ≠ n, then NAR will be raised.

NAN: Indicates that an arithmetic operator (+, -, *, /, div, mod) received an argument that did not evaluate to a number.

NAP: Indicates that a pair operator (car, cdr, set-car!, set-cdr!) received an argument that did not evaluate to a pair.

NAV: Indicates that a vector operator (vector-ref, vector-set!) received an argument that did not evaluate to a vector.

OOB: Indicates that a vector operator (vector-ref, vector-set!) received an index that was outside the dimensions of the vector given.

DBZ: Indicates an attempt to divide by zero.

ERR: This exception is raised manually by the programmer. See (error) and (assert expr) in the language guide.