The PILS programming language

1  Introduction
        1.1  Terms of use, development and redistribution
        1.2  Reading this document
        1.3  Some general characteristics of PILS
        1.4  The building blocks
        1.5  Constants and expressives
        1.6  Rules, patterns and objects
        1.7  Error handling
        1.8  The programming system
        1.9  Installing and running on MS Windows

2  Syntax and data model
        2.1  Constant syntax
                2.1.1  Numbers
                2.1.2  Strings
                2.1.3  Timestamps, datings, durations and colors
                2.1.4  Constant brackets [ ... ]
                2.1.5  Cliches, nodes and lists
                2.1.6  Tails and principal legs
                2.1.7  Words and languages
                2.1.8  Namespaces and the language node
                2.1.9  Language setting and internationalisation
                2.1.10  Regular names
                2.1.11  The apostroph
                2.1.12  Language and namespace shorthands
        2.2  Expressions
                2.2.1  Expressive nodes
                2.2.2  Name and phrase calls
                2.2.3  Object calls
                2.2.4  Calls, errors and misses
                2.2.5  Phrase breaking dot
                2.2.6  Operator precedence
        2.3  Rules, patterns and assignments
                2.3.1  Rulesets
                2.3.2  Patterns
                2.3.3  Assignments
        2.4  Syntactical sugar
                2.4.1  Hidden leg values
                2.4.2  Pattern dots
                2.4.3  Inversions
                2.4.4  Comments
                2.4.5  Numbered recurrences

3  Built-in operations
        3.1  Numeric operations
        3.2  Operations on timestamps, datings and durations
        3.3  String operations
                3.3.1  Conversions
                3.3.2  Casing conversions and the İstanbul express
                3.3.3  String splitters
                3.3.4  Operations common to strings and lists
        3.4  List operations
                3.4.1  Simple list operations
                3.4.2  Building, filtering and folding of lists
                3.4.3  Sorting, summing, grouping and rearranging of lists
        3.5  Node and cliche operations
                3.5.1  Deep search and replace with  **=**
        3.6  Various control structures
                3.6.1  Evaluating and quoting
                3.6.2  Declaring local bindings and rules
                3.6.3  Conditionals
                3.6.4  Explicit calls
                3.6.5  Trials and loops
                3.6.6  Recall
                3.6.7  Exits
                3.6.8  Expression sequencers
        3.7  Composite objects
                3.7.1  Callarounds
                3.7.2  Aggregates
        3.8  Controlling the :who binder
        3.9  Not-filters

4  Rules and patterns
        4.1  Patterns
                4.1.1  Constants, variables and jokers
                4.1.2  Lists, nodes and escapes
                4.1.3  Type checks
                4.1.4  Searches
                4.1.5  Name splitters
                4.1.6  Aliases
                4.1.7  Comparisons
                4.1.8  Extracting the lengths of lists and strings
                4.1.9  List indexers
                4.1.10  Partial node specifications and anyway-specifiers
                4.1.11  Constant replacers
                4.1.12  System oriented extractors
                4.1.13  On the implementation of patterns
        4.2  Rule actions
                4.2.1  Responders
                4.2.2  Binders
                4.2.3  Tags
        4.3  Pinning bugs with responders

5  System-oriented features
        5.1  Supplementary constant types
        5.2  States in a changing world
                5.2.1  Channels, listeners and plugs
                5.2.2  Alien controlled minds
                5.2.3  Straps
        5.3  Simple file access
                5.3.1  File objects
                5.3.2  Folder objects
                5.3.3  Files and folders in patterns
                5.3.4  Zip files
        5.4  Worker threads, knots and latecomers
                5.4.1  Creating a worker thread
                5.4.2  Knot calls
                5.4.3  Latecomers

6  PILS programs
        6.1  Modules
                6.1.1  Module names and references
                6.1.2  Common functionality in === modules
                6.1.3  Module references in changing programs
                6.1.4  Module instantiation
                6.1.5  Datafiles and data dependent modules
                6.1.6  Module and program attributes by “this”
                6.1.7  Language modules
        6.2  Libraries
        6.3  Program straps
        6.4  Command line processing and single-instance checking

7  The PILS editor
        7.1  Creating a PILS program
        7.2  Opening an existing program
        7.3  Working with modules
                7.3.1  Editing
                7.3.2  Navigating the module tree
                7.3.3  Creating, moving and deleting modules
                7.3.4  Changing the language of a module
                7.3.5  Changing the program language
        7.4  Using PILS libraries
                7.4.1  Searching across libraries
                7.4.2  How libraries are stored
        7.5  Testing
                7.5.1  Local test rules
                7.5.2  Test modules
                7.5.3  Test projects

8  GUI programming with PILS
        8.1  Windows and panes
        8.2  Events and extenders

9  PILS graphs

1  Introduction

I developed the PILS programming language and system during the years 1979 to 2008, originally as an attempt to improve on Lisp. However, under the influence of Prolog, C++, XSLT and SQL, and certain features of spoken Danish, it grew into something quite different.

Like Lisp, PILS uses a unified data model for programs and data, but whereas Lisp uses lists, PILS uses attribute-based nodes as the building blocks of its data model. For brevity, attributes are referred to as legs. The simple functions of Lisp have, by Prolog inspiration, given way to rulesets based on pattern matching, which – inspired by C++ – are treated as objects and can be combined in many ways.

PILS is usable for all sorts of projects except those that need heavy number crunching or finer synchronization primitives than those offered by PILS. A rich set of fast text and list processing operations is offered, and the programming system, though small in size, is quite mature and flexible, with support for pinning bugs in the source code. Bindings to the juce library and worker threads allow for smooth GUI applications, though the bindings and the library that wraps them are not yet full-featured.

1.1  Terms of use, development and redistribution

The source code for the PILS system is in the public domain (i.e. you can do what you want with it), except for the Juce GUI library (see  http://www.rawmaterialsoftware.com/juce/ ), which is licensed under GPL 2 unless a license is paid for.

PILS was designed for open-source projects and does not support obfuscation or compilation.

Military institutions and the weapons industry should expect no cooperation from my side. Aside from that, I will help with PILS when possible.

1.2  Reading this document

I have tried to arrange the material in an order that makes sense, but due to the intertwined dependencies between various aspects of the language and programming system, some sections refer to material that is introduced in later sections. If you are new to PILS, you should probably first read it and grasp as much as you can, then try some PILS coding, then reread.

1.3  Some general characteristics of PILS

1.4  The building blocks

The principal building block of PILS data is a node, consisting of a node name and one or more uniquely named attributes, called legs. The node name and leg names can be arbitrary constants, including constant nodes. Leg values can be arbitrary PILS data.

The node name and leg names are kept in a cliche which is shared by all nodes with the same combination of node name and leg names, regardless of leg order.
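
The sharing of cliches can be pictured with a small Python model (Python, not PILS; the function name `cliche` and the table `_cliches` are invented for this illustration). It assumes that a cliche is fully identified by the node name plus the set of leg names, which is why leg order plays no role:

```python
# Hypothetical model: one shared cliche per (node name, set of leg names).
_cliches = {}

def cliche(node_name, leg_names):
    """Return the shared cliche for this node name and set of leg names."""
    key = (node_name, frozenset(leg_names))
    # setdefault stores the key on first use and returns the stored one after.
    return _cliches.setdefault(key, key)

a = cliche("line", ["from", "to"])
b = cliche("line", ["to", "from"])   # same legs, different order
c = cliche("line", ["from"])         # different set of legs
```

Here `a` and `b` are the very same object, while `c` is a different one. How the PILS executable actually stores cliches is not described in this document.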

Besides nodes and cliches, PILS has lists, numbers, strings and a few special types. Strings are utf-8 encoded. Numbers are high-precision floating point values (doubles), of which integers are a special case. Lists are sequences indexed from 1 and support piped operations: when a list is subjected to a series of operations, the operations are performed on each list element in turn when possible, which saves constructing temporary list objects.
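
The idea behind piped operations resembles lazy iteration in other languages. The following Python sketch is only an analogy, with invented helper names `piped` and `element`; it says nothing about how the PILS interpreter implements piping:

```python
def piped(items, *operations):
    """Chain operations lazily, so each element passes through all of them
    in turn and no intermediate list is built."""
    stream = iter(items)
    for operation in operations:
        stream = map(operation, stream)   # map() is lazy in Python 3
    return stream

def element(sequence, index):
    """Fetch an element using 1-based indexing, as PILS lists do."""
    return sequence[index - 1]

result = list(piped([1, 2, 3], lambda x: x * 10, lambda x: x + 1))
first = element(result, 1)
```

Only the final `list(...)` call materialises a list; the two operations are applied element by element.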

1.5  Constants and expressives

Depending on how the interpreter treats them, all PILS data can be classified as constants or expressives. Both are immutable, but constants are immune to evaluation: the result of evaluating a constant is the constant itself, and the only thing that fits a constant pattern is that same constant. Expressives, on the other hand, may evaluate to something else, and can be matched by something else.

Constants have no identity and are always uniquely represented. They are registered in a global hash table on creation.
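
Unique representation through a global table is commonly called interning. A minimal Python sketch of the principle follows; the names `_constants` and `intern_constant` are invented here, and tuples merely stand in for PILS constants:

```python
_constants = {}   # hypothetical global table of interned constants

def intern_constant(value):
    """Return the single shared representative of an immutable value."""
    return _constants.setdefault(value, value)

# Two structurally equal values, built independently of each other.
x = intern_constant(tuple(["point", 1, 2]))
y = intern_constant(tuple(["point", 1, 2]))
z = intern_constant(tuple(["point", 1, 3]))
```

After interning, equality can be decided by an identity test (`x is y`), which is the kind of pointer comparison that makes recognition of constants cheap.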

All programming constructs are represented by specially named nodes. Such nodes, and the nodes and lists that contain them, are expressives; all other data are constants. Expressives have identity, which the programming system uses to pinpoint failing expressives in the text from which they were parsed. This is how failing expressions are traced to the PILS source text.

External objects are treated as constants, though they may have state.

1.6  Rules, patterns and objects

PILS objects are built from rulesets – lists of rules of the form  {pattern|action} . When a ruleset is evaluated, it is bound to the current context, creating an object which can process a call by trying rules with patterns that match the call. For the call to succeed, an action must take responsibility, typically by means of an  :ok  responder.

Rulesets can be created dynamically, simply by constructing nodes of the appropriate form – they will be compiled and indexed automatically. Bound rulesets can be combined into aggregates.

The pattern matching exploits the uniqueness of constants; often, complex structures can be recognized or rejected based on simple pointer comparisons. There is an overhead on constructing data structures, but this is mostly outweighed by the fast data recognition you get in return, except for number crunching which suffers from the boxing and hashing required to make numbers fit into a data model built for pattern-matching and searching.

1.7  Error handling

PILS has a fine-grained error handling system which, to my knowledge, is not found in similar form elsewhere.

In weakly-typed languages, bugs can be hard to locate, as failures will happen in unexpected places, far from the bugs that cause them.

PILS remedies this by means of responders and blaming – the failure of an expression can be blamed on another expression, responsible for calling it. This is accomplished by calling with the  :try  and  :need  responders, which will blame the caller of their containing rule in case of failure.

The mechanism bears some resemblance to the concept known to mainstream programmers as structured exception handling, but PILS blaming is more fine-grained and integrated with lexical scoping.

1.8  The programming system

PILS projects are stored in PILS library files. These are flat utf-8 text files that consist of a list of other library files required, followed by a list of named PILS modules. Module names are lists of PILS names, shown as a tree by the programming system.

When a PILS file is opened, a program strap is created, merging the library and the libraries it imports, and their imports, and the libraries that constitute the programming system, and possibly some configuration files.

The executable has bindings for the juce library. These are used through a framework library that hides some of the framework-specific details and allows programming in languages other than English.

1.9  Installing and running on MS Windows

PILS comes as an installer created with InnoSetup – the interface will be familiar to most users. The executable is statically linked against the run time libraries of Visual C++ 2008 Express, so no redistributables are required. An uninstall function is supplied.

PILS was developed and tested on the Danish edition of Windows XP.

2  Syntax and data model

2.1  Constant syntax

2.1.1  Numbers

Numbers are written as usual, with a leading  -  for negative numbers. The decimal point, if present, must be between digits and can be  ,  (comma) or  .  (dot). C style hexadecimal notation is supported for integers.
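
As an illustration of these rules, here is a small Python sketch of a number reader. It is not the PILS parser; the function name `parse_number` is invented, and it assumes that hexadecimal literals carry a `0x` prefix and no sign:

```python
import re

_DECIMAL = re.compile(r"-?\d+(?:[.,]\d+)?")   # decimal point must sit between digits
_HEX = re.compile(r"0[xX][0-9a-fA-F]+")       # C style hexadecimal integer

def parse_number(text):
    """Parse a number literal into a float (PILS numbers are doubles)."""
    if _HEX.fullmatch(text):
        return float(int(text, 16))
    if _DECIMAL.fullmatch(text):
        return float(text.replace(",", "."))
    raise ValueError(f"not a number literal: {text!r}")
```

With this sketch, `parse_number("3,14")` and `parse_number("3.14")` give the same value, while `"5."` and `".5"` are rejected because the decimal point is not between digits.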

2.1.2  Strings

Strings are written as arbitrary character sequences delimited by  "  (double-quote). A double-quote inside a string must be doubled.

            "This is a string containing a double-quote """

After the last  "  may follow a scraper – a sequence of two-digit hexadecimal byte values, doubled double-quotes and  scrapes.

The scrapes are:

            =  for LF (linefeed, standard line terminator in Unix and PILS)
            <  for CR (Carriage return, Macintosh line terminator)
            >  for TAB
            ~  or  *  for NULL
            /  for CR+LF (DOS/Windows line terminator)

Example:

            "This string ends with a line terminator"=

A scraper can be followed by more text:

            "This is a"="two-line string"=

Strings can be divided over multiple lines with  -  (hyphen).

            "This is a single line of text "-
            "that has been spread over two lines in the source."

The encoding is always utf-8, so accented letters are represented as two-byte-sequences.
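
The string syntax above can be summarised by a small decoder. The Python sketch below is an interpretation, not the actual PILS reader: the name `decode_string` is invented, a doubled double-quote is taken to mean one literal double-quote both inside the text and inside a scraper, and neither whitespace in scrapers nor line continuation with  -  is handled.

```python
SCRAPES = {"=": b"\n", "<": b"\r", ">": b"\t",
           "~": b"\0", "*": b"\0", "/": b"\r\n"}
HEX_DIGITS = "0123456789abcdefABCDEF"

def decode_string(literal):
    """Decode one string literal, including scrapers, into raw bytes."""
    if not literal.startswith('"'):
        raise ValueError("literal must start with a double-quote")
    out = bytearray()
    pos, in_text = 1, True
    while pos < len(literal):
        ch = literal[pos]
        if literal.startswith('""', pos):      # doubled quote: one literal quote
            out += b'"'
            pos += 2
        elif ch == '"':                        # switch between text and scraper
            in_text = not in_text
            pos += 1
        elif in_text:                          # ordinary character, utf-8 encoded
            out += ch.encode("utf-8")
            pos += 1
        elif ch in SCRAPES:                    # one-character scrape
            out += SCRAPES[ch]
            pos += 1
        elif ch in HEX_DIGITS and literal[pos + 1:pos + 2] in HEX_DIGITS \
                and len(literal) > pos + 1:    # two-digit hexadecimal byte value
            out += bytes([int(literal[pos:pos + 2], 16)])
            pos += 2
        else:
            raise ValueError(f"unexpected {ch!r} in scraper")
    if in_text:
        raise ValueError("unterminated string")
    return bytes(out)
```

The result is returned as bytes rather than text, since hexadecimal byte values in a scraper need not form valid utf-8.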

Numbers and strings can be used directly as node and leg names.

2.1.3  Timestamps, datings, durations and colors

PILS has three built-in types for handling of time.

Timestamps are stored internally as GMT, but written using local time with timezone indicators:

            2008-12-06T19:00:05.161+01:00

Minutes, seconds and milliseconds are optional, as are the minutes in the timezone indicator. GMT can be indicated by  +00:00 ,  +00  or  Z .
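
The relation between the written local time and the stored GMT value can be shown with Python's standard library. This sketch only covers the fully written form shown above; the shortened forms (omitted parts,  +00 ,  Z ) are not accepted by `datetime.fromisoformat` in every Python version and are left out. The name `to_gmt` is invented for the example.

```python
from datetime import datetime, timezone

def to_gmt(text):
    """Parse a fully specified timestamp and normalise it to GMT (UTC)."""
    return datetime.fromisoformat(text).astimezone(timezone.utc)

stamp = to_gmt("2008-12-06T19:00:05.161+01:00")
same_instant = to_gmt("2008-12-06T18:00:05.161+00:00")
```

Both calls describe the same instant, 18:00:05.161 GMT, so the two values compare equal.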

Datings are abstract time indications consisting of a date and time. They are not associated with any timezone. They are written like timestamps but without the timezone indicator.

Durations are written like:

            1001d5h20m4.567s

for 1001 days, 5 hours, 20 minutes, 4 seconds and 567 milliseconds. Weeks, months and years are not supported in durations. Parts of the duration may be omitted; a quarter of an hour can be written as:

            15m

Durations can be negative:

            -3h20m

for minus three hours and 20 minutes.

The decimal point in time indications is always  .  and is used only to separate seconds from milliseconds.
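
A reader for the duration notation might look like the following Python sketch. The name `parse_duration` is invented, and the sketch assumes that the parts always appear in the order days, hours, minutes, seconds, and that at most three decimals are given for the milliseconds:

```python
import re
from datetime import timedelta

_DURATION = re.compile(
    r"(?P<sign>-)?(?:(?P<d>\d+)d)?(?:(?P<h>\d+)h)?(?:(?P<m>\d+)m)?"
    r"(?:(?P<s>\d+)(?:\.(?P<ms>\d{1,3}))?s)?"
)

def parse_duration(text):
    """Parse a duration such as 1001d5h20m4.567s into a timedelta."""
    match = _DURATION.fullmatch(text)
    if match is None or not any(match.group(key) for key in "dhms"):
        raise ValueError(f"not a duration: {text!r}")
    part = lambda key: int(match.group(key) or 0)
    # Pad the fraction so that "4.5s" means 500 milliseconds.
    millis = int((match.group("ms") or "0").ljust(3, "0"))
    total = timedelta(days=part("d"), hours=part("h"), minutes=part("m"),
                      seconds=part("s"), milliseconds=millis)
    return -total if match.group("sign") else total
```

For example, `parse_duration("15m")` yields a duration of 900 seconds, and `parse_duration("-3h20m")` negates the whole duration, not just the hours.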

Colors are written as  #  followed by RGB (red-green-blue) values and an optional alpha value, each written as two hexadecimal digits, as in HTML. If no alpha value is given, 0xff is used.

            #ffff00            yellow
            #ffff00ff            yellow, with explicit alpha value

Presently, only 8-bit color channels are supported.
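
The color notation maps directly onto four 8-bit channels. A small Python sketch (the name `parse_color` is invented for the illustration):

```python
import re

def parse_color(text):
    """Return (red, green, blue, alpha) as integers in the range 0..255."""
    if not re.fullmatch(r"#[0-9a-fA-F]{6}(?:[0-9a-fA-F]{2})?", text):
        raise ValueError(f"not a color: {text!r}")
    channels = [int(text[i:i + 2], 16) for i in range(1, len(text), 2)]
    if len(channels) == 3:
        channels.append(0xFF)     # no alpha given: fully opaque
    return tuple(channels)
```

Both  #ffff00  and  #ffff00ff  come out as `(255, 255, 0, 255)`.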

2.1.4  Constant brackets [ ... ]

Composite constants (cliches, nodes and lists) are usually enclosed in constant brackets  [ ] . For cliches, the brackets are mandatory and part of their syntax; for nodes and lists, they are optional. Inside constant brackets, simplified syntax rules are used: the constructs described in the chapter on expressions are not valid, and names and operators are treated as simple constants. An example:

            2 + 2    is an expression that will result in the value  4

            [2 + 2]    is a list of three constants  2 ,  [+]  and  2

2.1.5  Cliches, nodes and lists

Cliches:  [node-name|leg-name|leg-name ...]

Node constants:  node-name: .leg-name leg-value ...

List constants:  element, ...  (the  ,  after the last element is optional)

The empty list  []  can be omitted when this does not lead to ambiguities. Inside constant brackets,  ?  also denotes the empty list.

In constant nodes, leg values can be omitted when identical to the leg name.

Nodes are greedy:  whenever a node is started, it goes as far as possible. As a consequence of this, embedded control structures usually do not need parentheses.

Within constant brackets, list constants of two or more elements can be written as shortlists without commas. Shortlists can be embedded in comma separated lists, as in this matrix:

            [1 0 0, 0 1 0, 0 0 1]

Generally, string lists should not be written as shortlists, as this leads to confusing syntax. Write:

            s *=* ["bad", "good"]

rather than

            s *=* ["bad" "good"]

2.1.6  Tails and principal legs

A tail is a leg whose name is the empty list,  [] . The PILS interpreter links many structures by their tail. The tail can be written in any position – directly after the node head:

            node: tail

or later, using  ;  (semicolon):

            node: .name value; tail

Other legs may follow the tail.

The tail must be separated from the preceding  :  or  ;  by whitespace.

A principal leg is a leg whose name is the same as the node name. This has no special significance to PILS except in namespace-nodes as described below, but the convention is often useful and is supported by a special notation: if the principal leg is written first, the name can be omitted:

            message: . "Hi"    is the same as    message: .message "Hi"

2.1.7  Words and languages

The PILS data model is very liberal with names: any constant, including node and list constants, can be used as a node name or leg name, though name cliches are commonly used.

A name cliche is a cliche of two strings  [namespace-identifier|name-string] , which is produced by parsing a token, with an optional namespace prefix, through a language node.

The language node maps prefixes to namespace-identifiers, and can map specific name-strings to specific constants for a given prefix, adapting PILS to use keywords of your preferred language. Language objects are used for user interfaces of PILS applications as well.

All built-in names recognized by the PILS interpreter kernel have  "pils.org/ns/sne"  as their namespace-identifier –  sne  is an acronym for Scandinavian Nerd-English. Native speakers of the English language should not hesitate to create language objects with a more natural terminology if the  sne  conventions seem inappropriate to them – and speakers of other languages should seriously consider using or creating mappings for their language.

Localized – or polyglot – programming languages have been shunned since localized macro languages messed up word processing and spreadsheets long ago; however, PILS has been designed with these issues in mind.

2.1.8  Namespaces and the language node

Parsing is always controlled by a language node:

            [language: language-definition-node]

The language-definition-node is a string-named node constant whose string-named legs hold namespace-nodes. The leg names are used as namespace prefixes. A principal leg is required and used to resolve names with no prefix.

A namespace-node is a node constant whose name is used as namespace-identifier for untranslated names with this namespace prefix. String-named legs of the namespace-node define translations. If the node name is  - , only translated identifiers are allowed; this is useful for restricting namespaces.

The tail of a namespace-selector is a two-byte write-control string. Its first byte holds flags that control the use of various syntactic conventions when writing; the last byte is used as decimal point and should be  "."  or  "," .

The write control flags are:

0x01
            control characters are escaped when writing strings
            (protects against dos/unix newline conversion and null characters)
0x02
            all non-ascii characters are escaped
            (allows writing PILS expressions in pure ASCII)
0x04
            non-utf-8 compliant data go unescaped (normally, such data will be escaped)
            (saves space when writing binary data as strings)
0x10
            all names are written as cliches with string literals
            (verbose but independent of language)
0x20
            sugar-free, expressions and rulesets are written using the basal node syntax
            (instructional – shows how PILS parses your expressions)

All combinations are allowed; 0x02 overrules 0x04. The escape flags affect both names and string literals; if a name has characters that must be escaped, it is written using the string convention.

Parsing is not affected by the write-control string; in particular, both  "."  and  ","  are valid decimal points, regardless of which one is specified in the write-control string.
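
To make the layout of the write-control string concrete, here is a Python sketch that splits it into flags and decimal point. The flag values are the ones listed above; the member names and the function name `parse_write_control` are invented for this illustration, and rejecting other decimal points is an assumption of the sketch.

```python
from enum import IntFlag

class WriteFlags(IntFlag):        # member names are hypothetical
    ESCAPE_CONTROL = 0x01         # control characters are escaped
    ESCAPE_NON_ASCII = 0x02       # all non-ascii characters are escaped
    RAW_NON_UTF8 = 0x04           # non-utf-8 data go unescaped
    NAMES_AS_CLICHES = 0x10       # names written as cliches with string literals
    SUGAR_FREE = 0x20             # basal node syntax only

def parse_write_control(control):
    """Split a two-byte write-control string into (flags, decimal point)."""
    if len(control) != 2 or control[1:] not in (b".", b","):
        raise ValueError("expected a flag byte followed by '.' or ','")
    return WriteFlags(control[0]), control[1:].decode("ascii")

flags, point = parse_write_control(b"\x01.")
```

Here `flags` is `WriteFlags.ESCAPE_CONTROL` and `point` is `"."`, which corresponds to a string literal written as an empty string followed by the scraper  01  and the text  "." .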

PILS is booted with a language node similar to this:

            [ language:
              "system":  ! default namespace prefix
              ."system" ["pils.org/ns/sne":]  ! default namespace identifier
              ."pils-configuration"  ! auxiliary namespace
                [-:  ! only the following translated names are accepted in this namespace
                  ."platform". "juce"  ! values used by the boot process
                  ."system". "Win32"  ! replace with your favourite OS
                        ! (other configuration values omitted here for brevity)
                ]
              ;
              ""01"."  ! writing flags, decimal point
            ]

In this particular case, names like  pils-configuration:framework  are translated to strings which are used in the boot process to decide which PILS libraries to load.

2.1.9  Language setting and internationalisation

PILS can be localized to Danish by copying the library  <lib>/pils/danish/system/config.pils  to the user's  Application Data  folder. This library will then be included in all running programs, redirecting certain functions (notably  say  and  saying ) to use the Danish language object defined in  <lib>/pils/danish/system/danish.pils .

Presently, only English and Danish are supported. To support another language, such as French, you should  add these files to the system:

            <lib>/pils/french/system/config.pils

which controls the indirection, and

            <lib>/pils/french/system/french.pils

which defines the language object. A simplistic French language object might look like:

            [ language:
              "fr":
              ."fr"
                [ "pils.org/ns/fr":
                  ."bon" good
                  ."un" one
                  ."di" say
                  ."blanc" white
                ]
              ."system" ["pils.org/ns/sne":]
              ;
              ""01","
            ]

This will translate the name  un  to  one  (or, to spell it out:  ["pils.org/ns/sne"|"one"] ),  while  vin  would become  ["pils.org/ns/fr"|"vin"]  since wine is not included in this simple vocabulary.

If a French-speaking programmer writes this:

            di [un bon vin blanc]

an English user will see:

            One good vin white

which isn't quite what they teach at Harvard, but still intelligible.
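
The lookup behind this example can be modelled in a few lines of Python. This is a simplified picture, not the PILS implementation: name cliches are modelled as (namespace-identifier, name-string) pairs, the helper names are invented, and the capitalised output merely imitates the sentence shown above.

```python
FR_NS = "pils.org/ns/fr"
SNE_NS = "pils.org/ns/sne"
FRENCH = {"bon": "good", "un": "one", "di": "say", "blanc": "white"}

def parse_name(token):
    """Map a French token to a (namespace-identifier, name-string) pair."""
    if token in FRENCH:
        return (SNE_NS, FRENCH[token])    # translated to a built-in name
    return (FR_NS, token)                 # untranslated: stays in the French namespace

def show_in_english(names):
    """Render name pairs for an English reader using only the name-strings."""
    return " ".join(name for _, name in names).capitalize()

names = [parse_name(token) for token in "un bon vin blanc".split()]
sentence = show_in_english(names)
```

The untranslated  vin  keeps its French namespace-identifier, and `sentence` becomes `"One good vin white"`.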

It is still possible to define rules that refine the translation somewhat.

The Danish language library defines a crude rule that tries to translate the English word  no  differently according to the context:  no  as an answer (No thanks) and  no  as a qualifier (No smoking) must be translated differently to make sense in Danish.

For an example of a full-blown language node, see  lib/danish/system/danish.pils . Note that even language neutral operators should be defined as translations; otherwise they will get a national namespace identifier and not be recognized as built-in operators by PILS – though you could still use namespace prefixing to refer to the built-in symbols.

The  İstanbul express  workaround – described in the section on text operators – explains how translations can be used to adapt the case conversion operators for Turkish, where the letter  i  preserves its dot in upper case.

2.1.10  Regular names

Arbitrary strings can be used as name-strings, but most of the time, regular names are used. These are written without string quotes.

A regular name is any sequence of letters, operator symbols  * / \ ^ # $ % & , additive symbols  + - , relational symbols  < > = ~ , digits  0 1 2 3 4 5 6 7 8 9  and  .  (dot), treating ' @ ` _ and all non-ascii characters (i.e. utf-8 multibyte characters) as letters, with the following exceptions:

The general idea is, anything goes unless it means something else to the parser.

Names ending in an operator, additive or relational symbol are classified as operators, additive operators or relational operators, respectively. Inside constant brackets  []  and for node or leg names, this classification is irrelevant, allowing operators to be used as node and leg names. The assignment symbol  :=  is formed by using a relational operator as a leg name.

A prefix is a regular name and a  :  with no spaces. The prefix name is looked up in the namespace node,  and this value is used to translate the following name or combine it with a namespace-identifier.

When no prefix is used, or when the dummy prefix  ?:  is used with string literal names, the default namespace will be used.

A PILS expression cannot define namespace prefixes locally; the only valid namespace prefixes are those defined by the language object. It is still possible to read and write names of other namespaces, using the cliche notation:

            ["namespace-uri"|"local-name"]

2.1.11  The apostroph

A single apostroph  '  is interpreted as a repetition of the last read name which is neither an operator nor a relational operator.

            { ok | :ok ' + 1 }

is the same as

            { ok | :ok ok + 1 }

or

            { ok | :' ' + 1 }

Further, all operators can be escaped by adding an apostroph  ' . This causes them to be read as ordinary names, not including the apostroph.

Further apostrophs can be added to such names; the last is ignored and the rest is taken as an ordinary name. The same applies to names consisting of nothing but apostrophs. However, the final apostroph is included in the name if the last non-apostroph is not some kind of operator symbol.

2.1.12  Language and namespace shorthands

As a convenience, the parser accepts  {}  as a shorthand for the language node.

The namespace-uri for a namespace prefix can be referred to by:  prefix:?  or, for the default namespace,  ?:?

2.2  Expressions

2.2.1  Expressive nodes

Generally, nodes are evaluated by evaluating their legs and creating a node from them with the same head and leg names (for node constants, this always results in the exact same node and is equivalent to simply returning the node).

Nodes with names  []  (the empty list) or  [|action]  are interpreted differently, and are written with special syntax:

            ;name value ...    instead of    []: .name value ...
            ;: value ...    instead of    []: value ...
            :name ...    instead of    [|action]: .name ...
            : value ...    instead of    [|action]: value ...

Empty-named nodes are used for declaring local bindings:

            ;name value; ...

Action nodes are used for control structures:

            :if condition; expression .else else-expression

2.2.2  Name and phrase calls

A standalone name

            name

is read as:

            :call [name]

In patterns, this binds the name to the corresponding value. In expressions, the current context will be searched for bindings or rules that match the name.

If the name is followed by a constant, a ruleset or an expression in parentheses, a node is formed – this is called a phrase.

            a 3    is read as    :call [a: 3]

            b {x|y}    is read as    :call b: {x|y}

Phrases can have named legs

            line (.from a .to b)    is the same as:    :call line: .from a .to b

Principal legs and hidden leg values can be used:

            message (.)    is the same as:    :call message: .message message

Constant brackets can be used to form phrases:

            message [. "Hello"]    is the same as:    :call [message: .message "Hello"]

In patterns, constant phrases work like names but expressive phrases are treated as ordinary nodes, which rarely makes sense. In  expressions, expressive phrases are evaluated before the call is executed.

2.2.3  Object calls

Two consecutive items are read as an object call:

            (a) (b)    is the same as    :who a .call b
            or, to spell it out:    :who (:call [a]) .call :call [b]

If the second item is a name or phrase, it is read as a method call:

            a b     is the same as    :who a .call [b]
            or, to spell it out:    :who (:call [a]) .call [b]

Object calls are chained from the left:

            a b c    is the same as    (a b) c
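
The left-to-right chaining amounts to a left fold over the method names. In the Python sketch below, tuples stand in for the  :call  and  :who .call  nodes; the function names are invented for the illustration:

```python
from functools import reduce

def call_name(name):
    """Model of  :call [name]  for a standalone name."""
    return ("call", name)

def object_call(who, call):
    """Model of  :who who .call call ."""
    return ("who", who, "call", call)

def chain(first, *methods):
    """Read  a b c  as  (a b) c : fold the method calls from the left."""
    return reduce(object_call, methods, call_name(first))

tree = chain("a", "b", "c")
```

`tree` nests the call of  b  on  a  inside the call of  c , mirroring  (a b) c .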

In patterns, object calls are not generally meaningful. Some method calls are reserved for typechecks and various other purposes. To match an object call node, a pattern should specify an escaped object call, such as  : (a) (b)  or, using a more explicit syntax,  ::who a .call b .

2.2.4  Calls, errors and misses

An expression can return to its caller in one of the following ways:

  1. The expression succeeds, and a value is returned.

    2 + 2    returns    4

  2. The expression misses, that is, PILS has no methods of evaluating it.

    2 + "two"    misses, assuming no rule processes it

  3. The expression fails, that is, a built-in operator or a supplied rule explicitly fails.

    {} read "[shit}"    fails with    [error: . ?:"Missing constant" .start 5 .end 5]

Failures signal that something is wrong with the program or its data, and should generally be reported to the user, while misses can be treated more lightly, depending on the context.

When a  :call  node is evaluated, this happens:

  1. The  call  leg is evaluated.

  2. The context is searched for rules or bindings that can process the call.

  3. If no rules or bindings process the argument, a miss is signaled.

When a  :who .call  node is evaluated, this happens:

  1. If the node is a built-in operation, special logic takes over. If the built-in operator cannot deal with the data, it will simulate the following steps 2 and 3 using already evaluated parts of the expression, so that things are as if the built-in operator was never tried. If a piped operation like  every  or  fold  misses, this rollback is not possible and the miss is treated as in 6.

  2. The  call  leg is evaluated, this is the argument.

  3. The  who  leg is evaluated, this is the object.

  4. The object is searched for rules or bindings that can process the argument.

  5. If no rules or bindings take effect, a  :who .call  node is constructed and treated as a call in the current context, allowing fallback rules to supplement built-in operators and object calls with locally defined rules.

  6. If this errs or misses and the expression was the action of a  :try  or  :need  responder, the error or miss is treated by the responder.

  7. If this does not apply, an  :error  call is made in the current context. For misses, an error value is constructed; for errors, the supplied value is used.

  8. If the  :error  call misses, the error value is returned and execution continues.

Note that the argument is evaluated before the object. This convention was chosen mainly because it fits in with the optimization strategies used in the interpreter, especially the piping of lists. It is possible that future PILS implementations might evaluate them in parallel.
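
A much simplified Python model of steps 2 to 5, 7 and 8 may help to see the order of events. Everything in it is an assumption made for illustration: objects are modelled as dictionaries, the fallback rules of the context as a function, there are no built-in operators, responders or  :error  rules, and the shape of the error value is invented.

```python
MISS = object()   # sentinel: no rule or binding processed the call

def who_call(eval_who, eval_call, fallback, trace):
    """Evaluate a modelled  :who .call  node and record the evaluation order."""
    argument = eval_call()                 # step 2: the argument comes first
    trace.append("call")
    obj = eval_who()                       # step 3: then the object
    trace.append("who")
    result = obj.get(argument, MISS)       # step 4: search the object
    if result is MISS:                     # step 5: fallback rules in the context
        result = fallback(obj, argument)
    if result is MISS:                     # steps 7-8: no  :error  rule in this model,
        return ("error", "miss", argument) # so the error value itself is returned
    return result

trace = []
point = {"x": 3}
no_fallback = lambda obj, argument: MISS
hit = who_call(lambda: point, lambda: "x", no_fallback, trace)
rescued = who_call(lambda: point, lambda: "y",
                   lambda obj, argument: 0 if argument == "y" else MISS, trace)
missed = who_call(lambda: point, lambda: "z", no_fallback, trace)
```

The first call is answered by the object, the second by the fallback, and the third returns an error value so that execution can continue.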

2.2.5  Phrase breaking dot

An open-air dot can be used to separate a name from a following constant, ruleset or parenthesised expression, to prevent the forming of a phrase.

            a . 3

is read as:

            :who (:call [a]) .call 3

(If a yields a list, this will get the 3rd element.)

An open-air dot can also be used to separate a name or phrase from a preceding element, to indicate that the name or phrase should be embedded in a  :call  node.

            a . b

is the same as

            (a) (b)

or:

            :who (:call [a]) .call :call [b]

(If  a  and  b  yield texts, this will concatenate them.)

2.2.6  Operator precedence

When an operator starts a sequence or follows another operator or relational operator, it is read as a prefix operator. Prefix operators have the highest operator precedence.

            + n

is read as:

            :who n .call [+]

When an operator follows an operand, it is read as an infix operator.

            a * b

is read as

            :who a .call *: b

Operators ending in  * / \ ^ # $ % &  have the same high precedence as sequence calls.

Additive operators, ending in  +  or  - , have medium precedence.

Relational operators, ending in  = ,  < ,  >  or  ~ , have low precedence.

Within the same precedence, expressions are assembled left-to-right.
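
As an illustration, the classification by last character can be modelled in Python. This is a sketch of the rule stated in this section only; the function name and the numeric levels are hypothetical, and prefix operators are not covered.

```python
HIGH, MEDIUM, LOW = 3, 2, 1


def infix_precedence(op):
    """Classify an infix operator by its last character."""
    last = op[-1]
    if last in "*/\\^#$%&":
        return HIGH      # same precedence as sequence calls
    if last in "+-":
        return MEDIUM    # additive operators
    if last in "=<>~":
        return LOW       # relational operators
    raise ValueError("not an operator: " + op)
```

Under this rule, a compound operator such as  <+#-*  is a high-precedence operator because it ends in  * , while  :=  and  <>  are low-precedence operators.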

2.3  Rules, patterns and assignments

2.3.1  Rulesets

The building blocks of PILS objects are rulesets. A ruleset such as

            { pattern2 | action2 }
            { pattern1 | action1 }

is represented by a  :ruleset  node:

            :ruleset (;match pattern2 .action action2), (;match pattern1 .action action1)

When a ruleset is evaluated in a given context, it is bound to the context by creating a node

            :ruleset .where

which can be used as an object. When a call is performed on a bound ruleset, the rules are tried, last-written rule first, using an internal index for fast skipping of irrelevant rules. When trying a rule, the call is first matched into the pattern, and if this succeeds, the action is evaluated, with bindings established by the pattern match.

For the rule to have any effect, the action must respond to the call by performing a responder statement, typically  :ok  but other responders allow refined control. If no action responds, control passes to the next rule. If no rule responds to the call, it falls through.
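
The dispatch order can be modelled with a short Python sketch. It assumes that a rule is a pair of a matcher (returning bindings or  None ) and an action, and that an action signals "no response" with a sentinel; both conventions are inventions for this example, and the internal index is left out.

```python
class Miss(Exception):
    """Raised when the call falls through all rules."""


NO_RESPONSE = object()   # an action returns this when it does not respond


def call_ruleset(rules, argument):
    """Try the rules last-written first; the first responding action wins."""
    for match, action in reversed(rules):
        bindings = match(argument)
        if bindings is None:
            continue                     # pattern did not match
        result = action(**bindings)
        if result is not NO_RESPONSE:
            return result                # the action responded
    raise Miss(argument)                 # no rule responded


def match_number(arg):
    return {"n": arg} if isinstance(arg, (int, float)) else None


def match_zero(arg):
    return {} if arg == 0 else None


rules = [
    (match_number, lambda n: "number"),
    (match_number, lambda n: "big" if n > 100 else NO_RESPONSE),
    (match_zero, lambda: "zero"),        # written last, tried first
]
```

With these rules,  0  yields  "zero" ,  500  yields  "big" , and  5  matches the second rule without a response, so control passes on and  "number"  is returned.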

Rules can query various aspects of the call by the implicit binders, of which  :who  is the most important and roughly corresponds to the this  pointer of C++ and the like.

2.3.2  Patterns

Patterns are data with a structure similar to that of the data to be matched. Any constant is a pattern that accepts itself and nothing else. Expressive lists match lists with the same length and matching elements. Expressive nodes generally match nodes with the same cliche and matching legs, though some nodes have special interpretations in patterns.

Variables are bound to the corresponding value, testing for identity if a variable occurs more than once in a pattern. They are written simply as names, represented by nodes  :call [name] .

The joker – written as  ?  and represented as  :call []  – is not bound or tested.
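
A minimal Python model of constants, lists, variables and the joker in patterns may help. It is a sketch only: the  Var  class and  JOKER  object are hypothetical stand-ins, Python equality stands in for the identity test, and nodes and the special pattern nodes are not modelled.

```python
JOKER = object()   # stands in for  ?


class Var:
    """Stands in for a variable, i.e. a  :call [name]  node in a pattern."""

    def __init__(self, name):
        self.name = name


def match(pattern, value, bindings=None):
    """Return a dict of bindings if value matches pattern, else None."""
    bindings = {} if bindings is None else bindings
    if pattern is JOKER:
        return bindings                          # not bound, not tested
    if isinstance(pattern, Var):
        if pattern.name in bindings:             # repeated variable: must agree
            return bindings if bindings[pattern.name] == value else None
        bindings[pattern.name] = value
        return bindings
    if isinstance(pattern, list):                # same length, matching elements
        if not isinstance(value, list) or len(pattern) != len(value):
            return None
        for p, v in zip(pattern, value):
            if match(p, v, bindings) is None:
                return None
        return bindings
    return bindings if pattern == value else None   # a constant accepts itself
```

For example, the pattern  [Var("x"), JOKER, Var("x")]  accepts  [1, 2, 1]  and binds  x  to  1 , but rejects  [1, 2, 3] .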

Other special nodes allow named sub-patterns, typechecks, compares with numeric literals, text and list search constraints and length extraction, matching specific positions in a list of unspecified length, simple translations of scalar values, and partial specifications of nodes with default values.

The special nodes can be used as ordinary nodes by escaping them:

            : special-node

2.3.3  Assignments

Though generally a pure functional language, PILS has an assignment statement which is used in various list processing commands, and to set attributes of special objects.

It looks like:

            target := value
            target := value; tail

Assignments are evaluated by first evaluating value, then target is evaluated with a special variant of calling that searches for assignment rules or assignment-sensitive built-in operations.

If tail is present, the result is discarded after performing the assignment, and tail is evaluated and returned. The tail form is equivalent to:

            (target := value) and: tail

Rules for assignments look like:

            { target := value | action }

These rules are dealt with separately by the rule compiler. General rules like  {x|...}  cannot be invoked by assignments, and assignment rules cannot be invoked by ordinary calls, even with arguments that look like assignments.

The PILS parser does not recognize assignments as such; they are parsed as sequences that end with a greedy node, and represented as:

            :who (target) .call (:= value)
            :who (target) .call (:= value; tail)

2.4  Syntactical sugar

The following conventions apply to expressions only; they are not valid within constant brackets [] .

2.4.1  Hidden leg values

In the common case where a named leg holds a call to its name, the leg value can be omitted.

This is used in patterns as well as expressions, in the very common case where the name of a leg is bound to its value, i.e. named parameters.

The rule also applies to principal legs, but not to tails.

            message: .    is the same as    message: .message message

A hidden leg is often used with the  :ok  responder for simple extraction rules. To extract the tail of a node:

            {* ;: tail|:ok tail}

or shorter:

            {* ;: ok|:ok}    which is the same as    {* ;: ok|:ok ok}

2.4.2  Pattern dots

If a rule pattern starts with a dot, an immediately following phrase or name is not made into a call. This is handy for specifying parameterless operations:

            { .name | ... }

is easier to type than

            { [name] | ... }

2.4.3  Inversions

A sequence of the form

            completion .(incomplete-expression)

is read as:

            incomplete-expression completion

In the original form, completion is parsed with the priority of a sequence; it can be a series of operations. When incomplete-expression is parsed and  )  is reached, completion is inserted as a unit.

            a b .(c d:)    is the same as:    c d: a b

An abbreviated form exists for single-leg action nodes, escapes and double-escapes:

            expression .:name    is the same as    :name expression

            expression .::    is the same as    :: expression

The  :need  and  :list  constructs are often used in this form:

            expression .:need    is the same as    :need expression

            (... list := ...) .:list    is the same as    :list (... list := ...)

Inversions can be used to give complex expressions a narrative flow: you can write an expression and test it, then embed it in a control structure by adding an inversion at the bottom, without the need to change the part you already wrote.

The construct is modeled after a feature of natural language, as in this English phrase:

            Things I like to do

Strictly speaking, things should follow do, but natural languages allow us to put essentials first, and so does PILS, to some degree.

2.4.4  Comments

PILS supports 3 types of comments:

            ! comment (until the first CR or LF character)

            !! comment (may include newlines) !!

            :- comment-expression; expression

!  and  !!  comments are treated as white space when parsing;  :-  comments are parsed as action nodes, allowing comment-expressions to be embedded in expressions and patterns.

They correspond vaguely to  //comment newline,  /*comment*/  and  #pragma  of C++.

2.4.5  Numbered recurrences

Often, the same substructure is used in many places in a larger structure. Numbered recurrences were introduced for the purpose of saving space when such structures are stored as text. They are generally not used in PILS source code.

Strings, rulesets and expressions in square or round brackets can be prefixed by a positive integer, and later referred to by the same integer followed by a  .  (full stop).

            1"toot", 1., 1.    is the same as    "toot", "toot", "toot"

            123(x) + 123.    is the same as    x + x

3  Built-in operations

Most built-in operations are implemented as intercepted object calls. They are represented by nodes of form   :who .call  which are given special treatment by the interpreter.

Most of these operations only work when called directly, not through other constructions such as the  (argument) call (object)  form. However, list element extraction by index and the operations defined on language objects also work when called indirectly.

Note the difference: This is a direct call:

            {.cliche|:ok [my-cliche]} cliche    returns    [[|action]|where|ruleset]

The built-in cliche operation gets the cliche of the node used to represent a bound ruleset.

This is an indirect call and invokes the  [cliche]  method defined by the ruleset:

            [cliche] call {.cliche|:ok [my-cliche]}    returns    [my-cliche]

3.1  Numeric operations

All numeric operations are done using doubles (typically 64 bit floating point with 52 bit mantissa, but implementations may differ). After a chain of operations, a trial conversion of the result is made to a 32 bit signed integer value and back to double. If this results in the same double as before, the number is recognized as an integer. Both integers and non-integers are hashed and boxed, though the integers 0 – 2^16-1 have prefabricated boxes, to speed indexing calculations.
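
The round-trip test can be illustrated in Python. This is a sketch under the assumption that values outside the 32 bit range, as well as infinities and not-a-number, simply do not count as integers; the function name is hypothetical.

```python
import math

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def is_recognized_integer(x):
    """True if x survives a trip to a 32 bit signed integer and back."""
    if not math.isfinite(x):
        return False
    if not (INT32_MIN <= x <= INT32_MAX):
        return False                 # does not fit in 32 bits
    return float(int(x)) == x        # int() truncates, so fractions differ
```

So  3.0  and  65535.0  are recognized as integers, while  3.5  and  2147483648.0  are not.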

The overhead of interpretation, hashing and boxing makes numeric performance slow. This is by design: fast pattern matching and node lookup are of greater importance.

Infinities, not-a-number etc. are not supported by PILS. Handling them has only been sporadically tested, so they may create havoc.

Binary operations are

            +    addition
            -    subtraction
            *    multiplication
            /    floating point division (even when supplied with integer arguments)
            \    integer division
            %    modulo

Beware: all infix operators have the same precedence: the result of  1 + 2 * 3  is  9 , not  7 .

Prefix  -  (minus) works as you expect:   ;x 7; - x    returns  -7 .

            abs  returns the absolute value

            round  returns the nearest integral value, exact halves are rounded away from 0

            trunc  truncates towards 0

These postfix operators are like the corresponding C++ functions with double values:

            sin cos tan asin acos atan sqrt log exp

The  round  and  truncate  operators can be used with a unit argument:

            x round 10

rounds x to the nearest 10.
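
The rounding rules differ from the default rounding of some languages, so a Python model may be useful. It is a sketch: the function names are hypothetical, and the behaviour of truncation with a unit (truncating towards 0 to a multiple of the unit) is an assumption, since the text only spells out the rounding case.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def _to_unit(x, unit, rounding):
    # Count whole units using exact decimal arithmetic, then scale back.
    steps = (Decimal(x) / Decimal(unit)).quantize(Decimal(1), rounding=rounding)
    return float(steps * Decimal(unit))


def round_half_away(x, unit=1):
    """Nearest multiple of unit; exact halves are rounded away from 0."""
    return _to_unit(x, unit, ROUND_HALF_UP)


def truncate_toward_zero(x, unit=1):
    """Multiple of unit nearest to 0 (assumed meaning of the unit form)."""
    return _to_unit(x, unit, ROUND_DOWN)
```

For instance,  round_half_away(2.5)  gives  3.0  and  round_half_away(-2.5)  gives  -3.0 , whereas Python's own  round  would give  2  and  -2 .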

Comparisons  = <> < <= > >=  succeed and return the first argument if the comparison is true, but fail if it is false; PILS has no boolean type.

PILS attempts to write numbers so that parsing them will produce the exact same number but unfortunately this cannot be relied on for large exponents.

There is no support for format strings.

3.2  Operations on timestamps, datings and durations

Ordinary arithmetic operations can be used with timestamps and durations in these combinations:

            timestamp + duration
            dating + duration
            duration + timestamp
            duration + dating
            duration + duration

            timestamp - timestamp
            timestamp - duration
            dating - dating
            dating - duration
            duration - duration

            duration * number
            number * duration

            duration / number
            duration / duration

            duration \ duration

            timestamp round duration
            dating round duration
            duration round duration

            timestamp truncate duration
            dating truncate duration
            duration truncate duration

Two values of the same type can be compared using  = <> < <= > >=

            timestamp comparison timestamp
            dating comparison dating
            duration comparison duration

The system operations  now  and  timestamp  both get the current system time, but  timestamp  will produce unique values within each process, incrementing when necessary.
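
The difference between the two operations can be modelled as follows. This is a Python sketch, not the actual implementation: the class name, the nanosecond resolution and the increment of one tick are assumptions, and the model is not thread-safe.

```python
import time


class UniqueClock:
    """Model of  now  (plain system time) and  timestamp  (unique values)."""

    def __init__(self, now=time.time_ns, tick=1):
        self._now = now        # source of the current system time
        self._tick = tick      # assumed smallest increment
        self._last = None      # last value handed out by timestamp()

    def now(self):
        return self._now()

    def timestamp(self):
        t = self._now()
        if self._last is not None and t <= self._last:
            t = self._last + self._tick    # increment when necessary
        self._last = t
        return t


# A frozen time source shows the effect clearly.
frozen = UniqueClock(now=lambda: 1000)
stamps = [frozen.timestamp() for _ in range(3)]
```

With the frozen source,  now  keeps returning  1000 , while  stamps  holds three distinct, increasing values.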

3.3  String operations

3.3.1  Conversions

s utf-8    converts between utf-8 encoded strings and lists of unicode values as integers.

s utf-16    converts between utf-16-encoded strings with byte order mark, and integer unicode value lists

s utf-16le    same, for little-endian utf-16 without byte order mark

s utf-16be    same, big-endian

s bytes    converts between strings and lists of byte values 0 – 255.

This concatenates 2 strings:

            (s) (t)

This concatenates a list of strings ss  using a separator string t:

            ss splice (t)

To split it again:

            s split (t)

The replacement operator

            s *=* ss

expects ss to be an even-length list of strings and interprets them as replacement pairs. For each position in s, the pairs are tried. If a match is found, the replacement is done and the search advances to the position after the replaced text, starting from the beginning of ss.

The prefix/suffix-replacement operator

            s (<)$*=* ss

only covers matches at the beginning/end of s and never replaces more than once.
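
The scanning behaviour of the two replacement operators can be sketched in Python. The sketch works on Python characters rather than bytes, covers only the prefix direction of the second operator, and assumes that a string without matches is returned unchanged; the function names are hypothetical.

```python
def _pairs(ss):
    if len(ss) % 2:
        raise ValueError("ss must be an even-length list")
    return list(zip(ss[0::2], ss[1::2]))


def replace_pairs(s, ss):
    """Model of  s *=* ss : try every pair at every position."""
    pairs = _pairs(ss)
    out = []
    i = 0
    while i < len(s):
        for old, new in pairs:               # always restart from the first pair
            if old and s.startswith(old, i):
                out.append(new)
                i += len(old)                # continue after the replaced text
                break
        else:
            out.append(s[i])                 # no pair matched at this position
            i += 1
    return "".join(out)


def replace_prefix(s, ss):
    """Model of  s $*=* ss : replace at the beginning, at most once."""
    for old, new in _pairs(ss):
        if old and s.startswith(old):
            return new + s[len(old):]
    return s
```

Because the search continues after the replaced text, replacements are never re-replaced:  replace_pairs("abab", ["a", "b", "b", "a"])  gives  "baba" , not  "aaaa" .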

Strings can be compared with the relational operators  = <> < <= > >= .  As always,  =  and  <>  test for identity; the unique constants mechanism ensures that equal strings are identical. The other operators use bytewise unsigned comparisons – for utf-8 data, this is equivalent to character-wise comparison. For case-insensitive comparisons, use the  lower operation on both operands. For proper alphabetical sorting, use the  order  operation with a key-function that produces collation keys, i.e. strings that, when sorted by raw comparisons, get sorted in the proper localized alphabetical order of the original strings.

3.3.2  Casing conversions and the İstanbul express

To convert a string to upper or lower casing:

            s upper
            s lower

To convert the first character of a string to title or upper casing, leaving the rest untouched:

            s title

The casing operations are implemented by code generated from unicode tables to work directly on utf-8, and should work with most languages.

Unfortunately, the Turks won't like this:

            "istanbul" title    returns    "Istanbul"

To get the correct capital letter, the operator must be supplemented by a replacement operation:

            "istanbul" $*=* ["i", "İ", "ı", "I"] title    returns    "İstanbul"

The İstanbul Express is a workaround designed to make this easier:

The  [===,]  module, available to all PILS modules, defines these operator replacements, combining the casing conversion operators with appropriate replacements:

            { text . ([upper] // replace) | :try text *=* replace .:need upper }
            { text . ([lower] // replace) | :try text *=* replace .:need lower }
            { text . ([title] // replace) | :try text $*=* replace .:need title }

By replacing the untranslated principal namespace with a translating version that intercepts the  upper ,  lower  and  title  operators as follows:

            {} **=** ;[[?:?]:]
            [ [?:?]:
              ."upper" [upper|"i", "İ", "ı", "I"]
              ."lower" [lower|"I", "ı", "İ", "I"]
              ."title" [title|"i", "İ", "ı", "I"]
            ]

and using this language to parse our test expression, we get

            "istanbul" title    returns    "İstanbul"

So the casing conversion is configurable by means of the language object.

3.3.3  String splitters

The splitter is an extended  split  operation, designed for writing tokenizers but useful for text analysis in general. The splitter was added to PILS mainly because of difficulties with integrating regular expressions with the uniquely represented strings of PILS.

A splitter is a constant node specifying a grammar for recursive descent parsing. The tail is a prioritized list of top level target names (start nonterminals); the corresponding legs specify roads to the targets (productions). The roads consist of targets, text snippets and control instructions. Targets can be invoked recursively, including internal targets not found in the priority list.

A road can have alternate lanes, each of which can have several steps and tests. Tests are steps that test a condition without advancing.

            s split
            [ target, ...
              .target road
              ...
            ]

This walks through s and produces a list of pairs (target, substring), identifying top level matches. When no targets match, (, singlebyte-string) is produced.

Empty matches are rejected at the top level and by the repeaters  [*: ...]  and  [+: ...] , to prevent infinite loops, but allowed elsewhere.

Valid match instructions are:

            1 – wildcard, matches any single byte
            "a-z" – a 3 byte string, matches one byte in the range
            any other string – requires exact match
            [*: road] – zero or more road runs
            [+: road] – one or more road runs
            [=*: road] – skip while road is blocked, then take it
            [^*: road] –  skip while road is blocked, don't take it

Valid test instructions are:

            [-: road] – road must fail
            [/: target] – the last successful top-level target must be target...
            [/: target, ...] – ...or one of them.  $  means no top level targets emitted yet.

Lists are used in two levels with different interpretation:

            lane, ... – one of the lanes must be taken.
            A lane can be a list of steps that must be taken consecutively.

For clarity, lanes should be separated by commas while multiple steps should be written as shortlists.

In the common case where a step list is the only lane in a road, append a comma to make it into a 1-element alternative list, like the  .hexnumber  below.

This example recognizes C-style hexadecimal numbers:

            s split
            [ hexnumber,
              .hexnumber "0x" [+: hexdigit],
              .hexdigit "0-9", "A-F", "a-f"
            ]
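
The top-level walk of the splitter can be modelled in Python. This sketch swaps in regular expressions for the roads, which is not how PILS works internally; it only shows the prioritized targets, the rejection of empty matches and the fallback to single unmatched pieces. The function name and the use of  None  for the empty target are assumptions.

```python
import re


def split_by_targets(s, targets):
    """Return (target, substring) pairs; targets is a prioritized list."""
    out = []
    i = 0
    while i < len(s):
        for name, pattern in targets:
            m = pattern.match(s, i)
            if m and m.end() > i:            # empty matches are rejected
                out.append((name, m.group()))
                i = m.end()
                break
        else:
            out.append((None, s[i]))         # no target matched here
            i += 1
    return out


# Stand-in for the hexnumber road: "0x" followed by one or more hex digits.
HEX_TARGETS = [("hexnumber", re.compile(r"0x[0-9A-Fa-f]+"))]

pieces = split_by_targets("x0x1F", HEX_TARGETS)
```

Here  pieces  is  [(None, "x"), ("hexnumber", "0x1F")] : the leading  x  matches no target and is emitted on its own.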

This splits s in single utf-8 characters (valid) and lumps of non utf-8 conformant data (invalid):

            s split
            [ valid invalid
              .valid ""00"-"7f, ""c0"-"df x,  ""e0"-"ef x x, ""f0"-"f7 x x x
              .x "80"-"bf
              .invalid [^*: valid]
            ]

The  -:  instruction is useful for tests like these.

            [-: 1]     end of string
            [-: -: road]    lookahead, road must be possible but is not taken

3.3.4  Operations common to strings and lists

The operations in this section work on strings as well as lists. Strings are processed bytewise, so certain operations can produce unexpected results with utf-8 multibyte characters.

String and list handling is optimized to avoid creating and hashing of temporary objects. Binding the temporaries to variables breaks these optimizations – if you have performance problems, use chained operations when possible.

In the following, s and t are strings or lists, n a non-negative integer and op an operator. An  (<)  in front of an operator – such as   (<)+#  – indicates that the operator  +#  has a pendant  <+#  that works in the reverse direction, counting indexes backwards.

s . n    – the nth byte/element of s, counting from 1, fails unless 1 <= n <= s count .

s count    yields the byte/element count of s.  Use  s utf-8 count  for character count.

s count (t)    counts non-overlapping occurrences of t within s.

s reverse    – s backwards. Use  s utf-8 reverse utf-8  if you need character based reversal.

s (<)+# n    – the first n  elements of s; fails if  n > s count .

s (<)-# n    – s without the first n elements; fails if  n > s count .

s (<)++# n    – the first n elements of s, or all of s if  n > s count .
For lists,  ++#  is piped and only evaluates the first n elements of s.

s (<)--# n    – s without the first n elements; "" if  n > s count .
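
The four forward-direction operators differ only in how they treat an  n  that is too large. A Python model, with hypothetical function names and a  Miss  exception standing in for failure:

```python
class Miss(Exception):
    """Stands in for a failing operation."""


def take(s, n):
    """Model of  s +# n : the first n elements, fails if n > count."""
    if n > len(s):
        raise Miss()
    return s[:n]


def drop(s, n):
    """Model of  s -# n : s without the first n elements, fails if n > count."""
    if n > len(s):
        raise Miss()
    return s[n:]


def take_up_to(s, n):
    """Model of  s ++# n : the first n elements, or all of s."""
    return s[:n]


def drop_up_to(s, n):
    """Model of  s --# n : s without the first n elements, or empty."""
    return s[n:]
```

For example,  take("abc", 5)  fails, while  take_up_to("abc", 5)  gives  "abc"  and  drop_up_to("abc", 5)  gives  "" .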

In the following,  (<)(+#/-#)op  indicates that op has 2 pendants:
            s +#op t    =     s +# (s op t)
            s -#op t    =     s -# (s op t)
as well as reverse direction pendants <op  <+#op  <-#op

These are safe with utf-8 multibyte characters:

s (<)(+#/-#)=* t    – count of s to and including first occurrence of t, or  0  if t is not found.

s (<)(+#/-#)^* t    –  count of s to but excluding first occurrence of t, or  s count  if t is not found.

s (<)(+#/-#)$* t    – count of t if s begins with t, else  .

These are unsafe if t has utf-8 multibyte characters:

s (<)(+#/-#)#* t    – count of common start of s and t.

s (<)(+#/-#)~* t    – count of s to and including last element of sparse occurrence of t, or  0 .

s (<)(+#/-#)+* t    – count of initial elements of s also found somewhere in t.

s (<)(+#/-#)-* t    – count of initial elements of s not present in t.

To illustrate the workings of the combined operators, the expression below extracts the name part of a filename by excluding directory and extension, allowing for  \  or  /  as directory separators:

            fn <+#-* "\/" <-#=* "."

The <+#-* operator is safe with utf-8 filenames because ASCII characters  \  and  /   never occur in utf-8 multibyte sequences.
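
To make the reading of the filename expression explicit, here is a Python model of its two steps. It reflects one interpretation of the combined operators as described above (take the trailing elements not found in the separator set, then drop from the end up to and including the last dot) and is not derived from the implementation; the function name is hypothetical.

```python
def name_part(fn):
    """Model of  fn <+#-* "\\/" <-#=* "."  under the reading given above."""
    # Step 1, <+#-* : count trailing characters that are not separators
    # and keep exactly that many characters from the end.
    n = 0
    for ch in reversed(fn):
        if ch in "\\/":
            break
        n += 1
    base = fn[len(fn) - n:]
    # Step 2, <-#=* : count from the end up to and including the last ".",
    # or 0 if there is none, and drop that many characters from the end.
    k = base.rfind(".")
    count = len(base) - k if k >= 0 else 0
    return base[:len(base) - count]
```

For example,  name_part(r"C:\dir\file.txt")  gives  "file" , and a name without an extension is returned whole.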

3.4  List operations

The following paragraphs describe operations specific to lists.

3.4.1  Simple list operations

The built-in list operations of PILS work by piping: instead of constructing lists directly, elements are passed one by one to pending list operations. If no list operations are pending, the elements are collected in an internal structure and the list is built when the operation is finished.

In the following, m and n are integers, s and t are lists unless otherwise mentioned.

listwise  and  singlewise  are convenience operations used for handling situations where a single element or a list of elements may be passed interchangeably, such as parameter lists vs. passing a single parameter.

            s listwise    is like    s call {e|:ok e,} {& ok|:ok}

If s is a list, it is simply passed on. If not, it is wrapped in a single element list.

            s singlewise    is like    s try {ok,|:ok}

The opposite of  listwise : if s is a single-element list, the element is extracted; all other values are simply passed on.

            s & t  concatenates  s listwise  and  t listwise .

            s first (e)    prepends e to s, for use with the  fold  operation

The  up  and   down  operations produce lists of increasing and decreasing integers:

            m up (n)    m, m + 1, m + 2 ... n  or  []  if  n < m
            m down (n)    m, m - 1, ... n  or  []  if  n > m
            n up    same as    1 up (n)
            n down    same as    n down 1

Lists can be split in sublists of a certain length:

            s split (n)
            s split 2    splits s in pairs.

A list of lists can be joined to a simple list:

            s splice

Non-list elements of s are included in the spliced list.
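
The simple list operations of this section can be modelled in Python with ordinary lists. The function names are hypothetical, piping is not modelled, and the handling of a shorter final sublist in the split is an assumption.

```python
def listwise(s):
    return s if isinstance(s, list) else [s]


def singlewise(s):
    return s[0] if isinstance(s, list) and len(s) == 1 else s


def concat(s, t):
    """Model of  s & t ."""
    return listwise(s) + listwise(t)


def up(m, n):
    """m, m + 1, ... n, or the empty list if n < m."""
    return list(range(m, n + 1))


def down(m, n):
    """m, m - 1, ... n, or the empty list if n > m."""
    return list(range(m, n - 1, -1))


def split_every(s, n):
    """Sublists of length n (assumption: the last one may be shorter)."""
    return [s[i:i + n] for i in range(0, len(s), n)]


def splice(s):
    """Join a list of lists; non-list elements are included as they are."""
    out = []
    for e in s:
        if isinstance(e, list):
            out.extend(e)
        else:
            out.append(e)
    return out
```

For example,  concat(1, [2, 3])  gives  [1, 2, 3] , and  splice([[1, 2], 3, [4]])  gives  [1, 2, 3, 4] .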

3.4.2  Building, filtering and folding of lists

Lists can be built by the list builder:

            :list ... list := value ... ...
            :list [tag]; ... list [tag] := value ...

This produces a list of all the values assigned. The tagged form allows selective writing in nested list builders.

List builders are often written as inversions: (... list := value ... ...) .:list

Lists can be filtered by the operations below. The filters are usually rulesets and do not need parentheses.

            s each (filter)

Try to apply the filter function to all elements of s in turn. Pass a list of the results, ignoring misses.

            s except (filter)

Like each, but pass only the elements that miss the filter. Results returned by the filter are ignored.

            s legs (assign-filter)

Like each, but for each element e at position n (starting from 1 as always), the assignment  n := e  is tried, instead of just e.

            s every (filter)

Like  each  but requires the filter function to succeed for all elements.

            s find (filter)

Like  s each (filter) 1  but faster – returns the first filtered element, fails if the filter never succeeds.
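
The differences between these filtering operations are easiest to see side by side. In this Python model a filter is a function that either returns a value or raises  Miss ; the names are hypothetical,  legs  is left out, and piping is not modelled.

```python
class Miss(Exception):
    """Stands in for a filter that misses."""


def each(s, f):
    """Results of f for the elements where it succeeds."""
    out = []
    for e in s:
        try:
            out.append(f(e))
        except Miss:
            pass
    return out


def except_(s, f):
    """The elements that miss the filter; results of f are ignored."""
    out = []
    for e in s:
        try:
            f(e)
        except Miss:
            out.append(e)
    return out


def every(s, f):
    """Like each, but a single miss makes the whole operation miss."""
    return [f(e) for e in s]


def find(s, f):
    """The first filtered element; misses if the filter never succeeds."""
    for e in s:
        try:
            return f(e)
        except Miss:
            continue
    raise Miss()


def half_of_even(n):
    # Example filter: responds to even numbers only.
    if n % 2:
        raise Miss()
    return n // 2
```

With  half_of_even  as the filter,  each  over  [1, 2, 3, 4]  gives  [1, 2] ,  except_  gives  [1, 3] , and  every  misses because of the odd elements.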

To eliminate duplicates from a list:

            s distinct

To eliminate duplicates based on a key function rather than the element:

            s distinct (filter)

All elements of s will be tried by the filter. If this produces a constant that has not been used, the original element is passed through. If a  :name name; value  node is produced and the name has not been used, the value is passed through, instead of the original element.
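
A Python model of both forms follows. The  Named  class is a hypothetical stand-in for a  :name name; value  node, and the case where the filter misses is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Named:
    """Stand-in for a  :name name; value  node."""
    name: object
    value: object


def distinct(s, key=lambda e: e):
    """Pass an element (or the supplied value) the first time its key is seen."""
    seen = set()
    out = []
    for e in s:
        k = key(e)
        if isinstance(k, Named):
            name, value = k.name, k.value    # value replaces the element
        else:
            name, value = k, e               # the original element passes
        if name not in seen:
            seen.add(name)
            out.append(value)
    return out
```

For example, with the first letter as the key,  ["apple", "avocado", "banana"]  is reduced to  ["apple", "banana"] .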

To consume a list:

            s fold (assign-filter)

Implements a list consumer with state. s must have at least one element; the first element will be the initial state. This is usually specified like

            s first (initial-state) fold (assign-filter)

For all following elements e, assign-filter is required to process the assignment  {state := e|...}, resulting in a new state. Finally, the state is returned.

The  fold operation is often combined with a list builder to create stateful filtering. Example:

            ["Rene" male "John" "Peter" female "Jane" "Susan"]
            first [unknown] fold
            { gender := $ name | :ok list := gender, name; gender }
            { ? := / gender | :ok gender }
            .:list

The result is:

            [unknown "Rene", male "John", male "Peter", female "Jane", female "Susan"]

($  and  /  are typechecks, as explained in the section on patterns.)
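
The example can be retraced in Python. This is a model, not a translation: the  Name  class is a hypothetical stand-in for PILS names, Python strings stand in for texts, and the list builder is modelled by appending to a local list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """Stand-in for a PILS name such as  male  or  female ."""
    text: str


def fold(s, step):
    """The first element is the initial state; step gives each new state."""
    it = iter(s)
    state = next(it)
    for e in it:
        state = step(state, e)
    return state


def tag_by_gender(items):
    built = []                                 # models the list builder

    def step(gender, e):
        if isinstance(e, Name):                # a name: becomes the new state
            return e
        built.append((gender.text, e))         # a text: emit (gender, name)
        return gender                          # and keep the state

    fold([Name("unknown")] + items, step)
    return built


tagged = tag_by_gender(
    ["Rene", Name("male"), "John", "Peter", Name("female"), "Jane", "Susan"]
)
```

The state carries the most recent gender name through the list, and every text element is emitted together with the state that was current when it was met.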

3.4.3  Sorting, summing, grouping and rearranging of lists

To sort a list by keys, use:

            s order (key-function)
            s order [ok]    (simple sorting using a trivial key-function)

key-function is required to process all elements of s. A sorted copy is then passed, based on comparing the resulting keys. Numbers and time values are compared numerically; for strings, binary comparison is used. To achieve collated ordering, the key-function must transform the strings to collation keys. For multiple-key sorting, the key-function should return a list.

To extract a single element of a list by smallest/largest key:

            s smallest (key-function)
            s largest (key-function)

To simply get the smallest/largest element:

            s smallest
            s largest

To get the longest string in a list:

            s largest {? $ ok|:ok}

If the largest key is shared by several elements, the first of them is returned.
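
Key-based sorting and selection map closely onto Python's built-ins, which makes for a compact model. The function names are hypothetical; the tie rule shown for  largest  is the one stated above.

```python
def order(s, key=lambda e: e):
    """A sorted copy of s, compared by key; a list key gives multi-key sorting."""
    return sorted(s, key=key)


def smallest(s, key=lambda e: e):
    return min(s, key=key)


def largest(s, key=lambda e: e):
    # max() returns the first element among those sharing the largest key.
    return max(s, key=key)


people = [["Smith", "Zoe"], ["Jones", "Al"], ["Smith", "Amy"]]
by_name = order(people, key=lambda p: [p[0], p[1]])
longest = largest(["aa", "bb", "c"], key=len)
```

Here  by_name  is sorted by last name and then first name, and  longest  is  "aa" , the first of the two strings sharing the largest length.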

To sum a list of numbers:

            s sum

To group data by key:

            list-expression groups (key-function)

The key-function is tried for all list elements, and should either return a constant name or a  :name name; value  node or fail; returning an expressive other than a  :name name; value  node is an error. Returning a constant name is equivalent to returning a   :name name; value  node with that name, and the original element as a value.

The resulting lists of values for each name are assembled in a node:

            groups: .name value-list ...

If no values are present,  []  is returned.

The expression

            15 up groups {n|:ok {} write (n) count}{13|?}

will group the numbers from 1 to 15 by digit count, leaving out 13. The result is:

            [groups: .2 10 11 12 14 15 .1 1 2 3 4 5 6 7 8 9]

For each name, the values are listed in the order in which they occur in the original list; however, the order of the keys is arbitrary, as is always the case with attribute names.
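
The grouping example can be reproduced with a small Python model. A dictionary stands in for the  groups:  node, a  Miss  exception for a failing key-function, and the  :name name; value  form is not modelled; all names are hypothetical.

```python
class Miss(Exception):
    """Stands in for a key-function that fails."""


def groups(s, key):
    """Map each key to the list of its elements, in original order."""
    result = {}
    for e in s:
        try:
            k = key(e)
        except Miss:
            continue                  # elements without a key are left out
        result.setdefault(k, []).append(e)
    return result


def digit_count_except_13(n):
    if n == 13:
        raise Miss()
    return len(str(n))


grouped = groups(range(1, 16), digit_count_except_13)
```

As in the PILS example, the numbers 1 to 9 end up under key  1  and the numbers 10, 11, 12, 14 and 15 under key  2 .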

For the common case where only the first or last value for each key is wanted:

            list-expression firsts (key-function)
            list-expression lasts (key-function)

works like  groups , but records only the first/last element for each key, instead of a list of all its elements.

            list-expression singles (key-function)

like  firsts  but fails if a key is used more than once

            list-expression folds (combined-key-function-and-assign-filter)

like  firsts  but when an already-used key is encountered, the old and new value are folded, using assign rules of the same filter that was used for key extraction. The whole expression misses if a fold misses.

This can be used for summing etc.:

            [apples 1, oranges 2, bananas 3, apples 4]
            folds {kind, count|:ok :name kind; count} {a := b|:try a + b}

returns

            [folds: .apples 5 .bananas 3 .oranges 2]

as the apples  counts are added in the folding rule.
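
The four per-key operations can be compared in a Python model. To keep it short, the input is assumed to be a list of  (key, value)  pairs that has already passed through the key-function, a dictionary stands in for the resulting node, and the function names are hypothetical.

```python
class Miss(Exception):
    """Stands in for a failing operation."""


def firsts(pairs):
    out = {}
    for k, v in pairs:
        out.setdefault(k, v)          # keep the first value per key
    return out


def lasts(pairs):
    out = {}
    for k, v in pairs:
        out[k] = v                    # keep the last value per key
    return out


def singles(pairs):
    out = {}
    for k, v in pairs:
        if k in out:
            raise Miss()              # a key was used more than once
        out[k] = v
    return out


def folds(pairs, combine):
    out = {}
    for k, v in pairs:
        out[k] = combine(out[k], v) if k in out else v
    return out


fruit = [("apples", 1), ("oranges", 2), ("bananas", 3), ("apples", 4)]
totals = folds(fruit, lambda a, b: a + b)
```

Here  totals  maps  apples  to  5 ,  oranges  to  2  and  bananas  to  3 , matching the PILS result above.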

The  traverse  operation rearranges a rectangular two-dimensional list, swapping dimensions:

            [one 1, two 2, three 3] traverse    returns    [one two three, 1 2 3]
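
In Python terms, this is the familiar transposition of a list of rows. A minimal sketch, with a hypothetical function name and assuming a rectangular input:

```python
def traverse(rows):
    """Swap the two dimensions of a rectangular list of lists."""
    return [list(column) for column in zip(*rows)]


swapped = traverse([["one", 1], ["two", 2], ["three", 3]])
```

Applying the operation twice gives back the original list.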

3.5  Node and cliche operations

Named nodes can be constructed by simple evaluation:

            oops: 1 + 2    returns the node constant    [oops: 3]

Action and binding nodes can be constructed similarly with escapes.

            : 4 + 5 + 6

is the same as

            ::who (:who 4 .call +: 5) .call +: 6

and returns

            9 + 6

            :: 4 + 5 + 6    returns    : 15    – the double escape constructs an escape.

An escaped ruleset evaluates all patterns and actions, and constructs a new ruleset from them.

To process the legs of a node in turn:

            node legs (assign-filter)

For each leg, the assignment  name := value  is tried in the filter; the results are passed as a list, ignoring misses.

To copy a node but without a specific leg:

            node without (leg-name)

To add or replace a leg:

            node merge (leg-name, leg-value)

To merge two nodes a and b:

            a merge (b)

When merging nodes a and b this way, the legs of b override legs of a with the same names. The head of b is used unless it is  [] ; in that case the head of a is used.

If either argument of  merge  is  [] , the other argument is returned, except for this case:

            [] merge (nonempty-list)

which is undefined unless nonempty-list is a pair  (name, value)  where name is a constant. In that case, the result is a node  anyway: .name value

This convention serves for building nodes with  merge,  starting with the empty list.
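
The rules for heads and legs when merging two nodes can be modelled in Python. In this sketch a node is a hypothetical  (head, legs)  pair with a dictionary of legs, and  None  stands in for the head  [] ; the special cases involving the empty list as an argument are not modelled.

```python
def merge(a, b):
    """Legs of b override legs of a; the head of b wins unless it is empty."""
    head_a, legs_a = a
    head_b, legs_b = b
    head = head_b if head_b is not None else head_a
    return (head, {**legs_a, **legs_b})


def merge_leg(node, name, value):
    """Add or replace a single leg."""
    head, legs = node
    return (head, {**legs, name: value})


def without(node, name):
    """Copy of the node without the named leg."""
    head, legs = node
    return (head, {k: v for k, v in legs.items() if k != name})


item = ("item", {"price": 10, "color": "red"})
repriced = merge(item, (None, {"price": 12}))
```

Here  repriced  keeps the head  item  and the  color  leg, while the  price  leg is taken from the second node.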

To replace the name of a node or cliche:

            a head (h)

To build a single-leg cliche as used for names:

            namespace-identifier // local-name

This operation is mostly used with text strings but works with other constants as well.

3.5.1  Deep search and replace with  **=**

To search a structure of nodes or lists recursively, possibly replacing some parts:

            s **=** filter

First, filter gets a chance to process s. If this succeeds, the result is returned with no further processing. If it misses and s is a list or node, its leg values or elements are processed by first calling the assignment  n := value ,  n being the name or number of the leg, and then simply  value , in filter; if both miss and value is a node or list, the process is applied recursively. When the filter calls succeed and return other values than the original, new nodes or lists are generated if the result of the operation is to be used.

This example will zero all numbers except price legs which are doubled. Note that the assignment is tried first, regardless of rule order.

            s **=** {[price] := # ok|:ok . * 2} {#?|:ok 0}

Similarly, when transforming a list element, the  assignment  count := value  is tried before  value ;  count  being the position in the list, starting from 1.

To search a structure without transforming it, simply do a transformation in a statement where the result is not used – typically a list builder. To list all node names in s:

            s **=** {name * ?|list := name} .:list

Tip: If the filter only has assignment rules, the top level will not be matched. If the topmost rule (which is the last to be searched) is:

            { ? := ok | :ok }

only direct legs of the top node or list will be searched.
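
The traversal order of the deep search can be modelled in Python. In this sketch dictionaries stand in for nodes, lists for lists, and the filter is split into two hypothetical functions, one for plain values and one for assignments, each raising  Miss  when it does not respond. The example filter mirrors the price example above.

```python
class Miss(Exception):
    """Stands in for a filter that misses."""


def deep_replace(s, on_value, on_assign):
    try:
        return on_value(s)                       # the filter sees s first
    except Miss:
        return _children(s, on_value, on_assign)


def _children(s, on_value, on_assign):
    if isinstance(s, dict):                      # node: legs by name
        return {k: _leg(k, v, on_value, on_assign) for k, v in s.items()}
    if isinstance(s, list):                      # list: elements by position
        return [_leg(i, v, on_value, on_assign) for i, v in enumerate(s, start=1)]
    return s


def _leg(name, value, on_value, on_assign):
    try:
        return on_assign(name, value)            # the assignment is tried first
    except Miss:
        pass
    try:
        return on_value(value)                   # then the plain value
    except Miss:
        return _children(value, on_value, on_assign)   # then recurse


def is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def double_price(name, value):
    if name == "price" and is_number(value):
        return value * 2
    raise Miss()


def zero_numbers(value):
    if is_number(value):
        return 0
    raise Miss()


data = {"price": 10, "qty": 3, "items": [1, {"price": 2}]}
changed = deep_replace(data, zero_numbers, double_price)
```

All numbers are zeroed except the  price  legs, which are doubled at every depth because the assignment is tried before the plain value.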

3.6  Various control structures

3.6.1  Evaluating and quoting

PILS expressions can be evaluated by the evaluate-operator  ---  which can be used as a prefix or infix operator:

            --- e    evaluates the value of e in the current context (e gets evaluated twice).

            e --- c    evaluates the value of e using the value of c as context.

Expressions can be quoted with the  :quote  statement:

            :quote e    or, using an inversion:    e .:quote

This returns e as it stands, without evaluation.

Note:  :quote is rarely needed in PILS. There is no reason to use it for constant nodes or lists.

3.6.2  Declaring local bindings and rules

To bind names to values locally:

            ;name expression; tail

expression is evaluated in the current context cc, and tail is evaluated in a context  ;name evcc  where ev is the value of expression.

Several names can be bound in one go:

            ;name1 expr1 .name2 expr2 ...; tail

All the exprs are evaluated in the original current context, in undefined order. (Future PILS implementations might evaluate them in parallel.)

All constants except  []  are valid names for binding.

To use a ruleset locally:

            :use rules;
            expression

or

            expression
            ===
            rules

These have the same effect: rules is evaluated in the current ruleset and used to extend it, then  expression is evaluated in the extended context. Typically, rules is a ruleset and gets bound by the evaluation, but any expressions – in particular, references to PILS modules – can be used.

3.6.3  Conditionals

A conditional has this general form:

            :if condition; success-expression .else else-expression

The else-expression is optional:

            :if cc; e    is like    :if cc; e .else []

If  condition  succeeds, its value is ignored and  success-expression  is evaluated. If  condition  fails, the failure is ignored and else-expression is evaluated.

Conditions can be combined with  and  and  or .

            :if (item price <= maxprice) and ((item quality = [good]) or (item quality = [excellent])); buy (item)

3.6.4  Explicit calls

As an alternative to the form  (object) (argument) , object calls can be specified with the  call  operation:

            (argument) call (object)

The operation applies  object  to  argument .

This differs from  (object) (argument)  in the following ways:

object gets evaluated before argument

3.6.5  Trials and loops

A call can be attempted with the  try  operation:

            (argument) try (filter)

This is like the  call  operation, but if  filter  does not accept  argument ,  argument  is simply passed on.

Looping is done with the  repeat  operation:

            (argument) repeat (filter)

This is like the  try  operation, but  filter  is repeatedly applied on the last value. When filter fails, the last value is returned.

This example lists the integers from 1 to 10  (though  10 up  is an easier way):

            1 repeat {n <= 10|:ok list := n; n + 1} .:list
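
For readers who think in mainstream languages, the behaviour of  try  and  repeat  can be sketched in Python. This is a model, not PILS: a miss is represented by a hypothetical  Miss  exception, the filter by a plain function, and the list builder by an ordinary Python list.

```python
class Miss(Exception):
    """Signals that a filter did not accept its argument (assumed model)."""


def try_(argument, filt):
    """Model of  (argument) try (filter) : a miss passes the argument on."""
    try:
        return filt(argument)
    except Miss:
        return argument


def repeat(argument, filt):
    """Model of  (argument) repeat (filter) : reapply until the filter misses."""
    value = argument
    while True:
        try:
            value = filt(value)
        except Miss:
            return value


collected = []  # stands in for the PILS list builder


def step(n):
    """Counterpart of  {n <= 10|:ok list := n; n + 1} ."""
    if n > 10:
        raise Miss
    collected.append(n)
    return n + 1


last = repeat(1, step)
```

After the loop,  collected  holds the integers 1 to 10, and  last  is the first value the filter refused.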

To iterate a constant as long as it changes:

            (argument) again (filter)

argument is evaluated and must return a constant, or the operation will fail. Then, filter is repeatedly applied and must always return a constant, or the operation will fail. When this constant is the same as the last value, it is returned.

            1 again {x|:ok 1 / x + 1}

is similar to:

            1 repeat {x|;new 1 / x + 1; :if new <> x; :ok new}

and returns  1.618033988749895  on the x86 (it may fail on architectures with different floating-point rounding characteristics).

To prevent infinite loops, the  again  operator has built-in cycle detection: at iterations 64, 128, 256, 512, 1024... a value is sampled and compared against the following values. If it reappears after at least one other value, a cycle has been detected and the  again  operation fails. In examples like the above, floating point rounding errors may cause the loop to end up flipping between two or more values; without the cycle detection, this would result in an endless loop.
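
The sampling scheme can be sketched in Python. This is one possible reading of the description, not the PILS implementation: the names  again ,  Fail  and  step  are hypothetical, and the exact bookkeeping of the sampled value is an assumption.

```python
class Fail(Exception):
    """Signals that the iteration failed (assumed model)."""


def again(value, step):
    """Iterate step until the value stops changing; fail on a detected cycle."""
    have_sample = False   # becomes True at iteration 64
    sample = None         # value sampled at iterations 64, 128, 256, ...
    seen_other = False    # True once a different value has followed the sample
    next_sample_at = 64
    iteration = 0
    while True:
        new = step(value)
        iteration += 1
        if new == value:
            return new    # the constant no longer changes
        if have_sample:
            if new == sample and seen_other:
                raise Fail("cycle detected")
            if new != sample:
                seen_other = True
        if iteration == next_sample_at:
            sample, have_sample, seen_other = new, True, False
            next_sample_at *= 2
        value = new


halved = again(1000, lambda x: x // 2)  # settles at a fixed point

try:
    again(0, lambda x: 1 - x)           # flips between 0 and 1 forever
    flipped = "returned"
except Fail:
    flipped = "failed"
```

Doubling the sampling distance keeps the cost of detection low while still catching cycles of any length eventually.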

3.6.6  Recall

Mindful recursive analysis is supported by the  recall  operation:

            (initial-constant) recall (analysis-rules)

This is like the  call  operation, but occurrences of the  :who  binder inside analysis-rules refer to a wrapper object that remembers both ongoing and completed calls; only constants are allowed as calls. If an already completed call is repeated, the old value is returned. If an ongoing call directly or indirectly repeats itself recursively, the inner call misses.
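
The wrapper can be pictured as a memoizing function that also tracks calls in progress. The Python sketch below is a model under assumptions: the wrapper is passed to the rules as an explicit  who  parameter, a miss is a hypothetical  Miss  exception, and the graph example is invented for illustration.

```python
class Miss(Exception):
    """Signals that a call was not handled (assumed model)."""


def recall(initial, rules):
    """Model of  (initial-constant) recall (analysis-rules) ."""
    done = {}        # completed calls and their remembered values
    ongoing = set()  # calls currently being evaluated

    def who(arg):
        if arg in done:
            return done[arg]      # a repeated completed call returns the old value
        if arg in ongoing:
            raise Miss(arg)       # an ongoing call that repeats itself misses
        ongoing.add(arg)
        try:
            value = rules(arg, who)
        finally:
            ongoing.discard(arg)
        done[arg] = value
        return value

    return who(initial)


# Hypothetical example: nodes reachable in a graph that contains a cycle.
GRAPH = {"a": ("b", "c"), "b": ("a", "d"), "c": ("d",), "d": ()}
calls = []  # records which nodes were actually analysed


def reachable(node, who):
    calls.append(node)
    found = {node}
    for successor in GRAPH[node]:
        try:
            found |= who(successor)
        except Miss:
            pass  # already being analysed further up the call chain
    return frozenset(found)


from_a = recall("a", reachable)
```

Without the bookkeeping, the cycle between  a  and  b  would recurse forever, and  d  would be analysed twice. Note that in this model a value remembered for an inner call (here  b ) can be partial, because it was computed while an enclosing call was still ongoing.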

3.6.7  Exits

The  :exit  statement looks similar to the  :list  statement:

            :exit ... exit := value ... ...
            :exit [tag]; ... exit [tag] := value ...

Like the list statement, exit statements are often written using inversion:

            (... exit := value ... ...) .:exit

If the assignment is performed, value is returned immediately from the  :exit  statement.

3.6.8  Expression sequencers

Two expressions can be combined by the operators  and ,  or  and  anyway :

            e1 and: e2

e1 is evaluated. If successful, the result is discarded and control passes to e2. If e1 fails, an error is raised and e2 is not evaluated.

            e1 or: e2

e1 is evaluated and returned if successful. If e1 fails, control passes to e2.

            e1 anyway: e2

 e1 is evaluated. Whether it fails or succeeds, control is passed to e2.
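
The three sequencers map naturally onto exception handling. In this Python sketch, which is a model rather than PILS, expressions are passed as zero-argument functions so that their evaluation can be delayed, and failure is a hypothetical  Fail  exception.

```python
class Fail(Exception):
    """Signals that an expression failed (assumed model)."""


def and_then(e1, e2):
    """Model of  e1 and: e2 ."""
    e1()            # a failure propagates, so e2 is never evaluated
    return e2()


def or_else(e1, e2):
    """Model of  e1 or: e2 ."""
    try:
        return e1()
    except Fail:
        return e2()


def anyway(e1, e2):
    """Model of  e1 anyway: e2 ."""
    try:
        e1()
    except Fail:
        pass        # the outcome of e1 is ignored either way
    return e2()


trace = []  # records which expressions were evaluated


def good():
    trace.append("good")
    return "good"


def bad():
    trace.append("bad")
    raise Fail("bad")


results = [and_then(good, good), or_else(bad, good), or_else(good, bad), anyway(bad, good)]

try:
    and_then(bad, good)
    and_failed = False
except Fail:
    and_failed = True
```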

3.7  Composite objects

3.7.1  Callarounds

Callarounds - nodes of form

            call: argument

have a special interpretation when used as objects: they switch the argument with the object. To illustrate its use, consider this expression:

            list each {item|:try item price}

The expression above will return a list of the prices of those list items that have a price. However, a shorter and faster way to get it is:

            list each [call: price]

This works as follows: each item from list is given as argument to the object  [call: price] , which deals with them by trying to call their  price  methods.
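
The role switch can be illustrated with a small Python model. All names here are hypothetical: nodes are modelled as functions that answer leg names, a miss is a  Miss  exception, and  each  is assumed to skip items for which the object misses, as the price example suggests.

```python
class Miss(Exception):
    """Signals that an object did not handle a call (assumed model)."""


def node(**legs):
    """A node modelled as an object that answers calls for its leg names."""
    def call(name):
        if name in legs:
            return legs[name]
        raise Miss(name)
    return call


def callaround(argument):
    """Model of  [call: argument] : the incoming item becomes the object."""
    def obj(item):
        return item(argument)
    return obj


def each(items, obj):
    """Apply obj to every item, keeping the results of successful calls."""
    results = []
    for item in items:
        try:
            results.append(obj(item))
        except Miss:
            pass
    return results


items = [node(price=5), node(name="unpriced"), node(price=7)]
prices = each(items, callaround("price"))
```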

3.7.2  Aggregates

Two objects a and b can be combined into an extending aggregate with the  +++  operator:

            a +++ b

When the extending aggregate is called upon, b gets a try first. If b does not handle the call, a gets a try.

Extender aggregates are similar to subclassing in traditional object oriented languages, but aggregates are a runtime concept and work directly on objects, as PILS has no class concept.

If  b = [] , the  +++  operator simply returns  a .

A filtering aggregate can be created with the  ->  operator:

            a -> b

When this aggregate is called upon, a must process the call, and b must process the result, or the call fails.

Filtering aggregates cannot process assignments.

Note: the  ->  operator has the priority of a relational operator, associating from the right. The priority follows from the last character.
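
Both kinds of aggregate can be modelled as function combinators. The Python sketch below is an illustration only: objects are plain functions, a miss is a hypothetical  Miss  exception, and the  :who  binder and assignments are left out.

```python
class Miss(Exception):
    """Signals that an object did not handle a call (assumed model)."""


def extend(a, b):
    """Model of  a +++ b : b gets a try first, then a."""
    def aggregate(call):
        try:
            return b(call)
        except Miss:
            return a(call)
    return aggregate


def pipe(a, b):
    """Model of  a -> b : a must process the call, b must process the result."""
    def aggregate(call):
        return b(a(call))
    return aggregate


def shape(call):
    answers = {"name": "shape", "sides": 0}
    if call in answers:
        return answers[call]
    raise Miss(call)


def square(call):
    answers = {"name": "square"}
    if call in answers:
        return answers[call]
    raise Miss(call)


def double(n):
    if isinstance(n, int):
        return n * 2
    raise Miss(n)


def small(n):
    if n < 10:
        return n
    raise Miss(n)


def misses(obj, call):
    """Helper: True if obj misses the call."""
    try:
        obj(call)
    except Miss:
        return True
    return False


extended = extend(shape, square)
checked = pipe(double, small)
```

Here  extended  answers  name  with the overriding value from  square  and falls back to  shape  for  sides , much like a subclass overriding one method.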

3.8  Controlling the :who binder

The  :who  binders will refer to the entire aggregate if used in a or b, allowing the component objects to call methods on the aggregate; this is similar to virtual methods. To isolate an aggregate component, wrap it like this:

            who: object

Any  :who  binders in object will now refer to object, rather than the containing aggregate.

For use with the programming system's wrapping of platform specific objects, an extended form is supported:

            who: . who-binding ; object

where  who-binding  specifies the value of  :who  bindings inside object.

3.9  Not-filters

Aggregates of form

            not: object

will miss calls that are processed by object. If however object misses the call, the call is successfully returned unchanged.

4  Rules and patterns

4.1  Patterns

Patterns are used in rules; they specify a structure into which data can be fitted.

4.1.1  Constants, variables and jokers

A constant matches the exact same constant and nothing else.

A standalone name (written simply as  name  but represented by a node  :call [name] ) is a variable that matches anything and binds it to the name. If a name is bound more than once in the same pattern, the values must be identical.

The joker  ?  or, to spell it out:  :call [] , matches anything and ignores it.

Any other node  :call constant  is treated as a variable.

Beware: this rule:

            { fac (0) | :ok 1 }

may not do what you expect. The phrase  fac (0)  is read as  :call [fac: 0]  which is a variable with the name  [fac: 0] .

4.1.2  Lists, nodes and escapes

A list pattern matches a list of the same length, with matching elements.

A node pattern matches a node with the same name and leg names, and matching leg values, unless otherwise specified in the following paragraphs. As always, leg order has no significance.

For constant nodes and lists, this is equivalent to matching the exact same constant, as specified for constants.

Action nodes should generally be escaped when used in patterns with the intention of matching a similar node, as some action nodes have special meaning in patterns.

An escaped pattern

             : pattern

works the same as if  pattern   had been specified directly, except that if pattern is a node, any special conventions for node patterns are ignored. This allows the use of node patterns that would otherwise mean something else. A double-escape

            :: x

matches a single escape – the outer escape tells the pattern compiler to treat the inner escape as a normal node, i.e. to match a node of the same structure, which is an escape.

4.1.3  Type checks

These operations specify type checks. s is typically a variable or  ? .

            # s    number
            % s    integer
            + s    integer > 0, same as  % s > 0 , see below
            s count    integer >= 0, same as  % s >= 0

            s time    timestamp
            s duration    duration
            s dating    dating

            $ s    string
            +$ s    nonempty string, same as  s $ (? > 0) , see below
            ++$ s    string of 2 or more bytes, same as  s $ (? > 1)

            & s    list
            +& s    nonempty list, same as  s & (? > 0)
            ++& s    list of length > 1, same as  s & (? > 1)

            * s    node
            / s    cliche
            = s    any constant
            s legs    node or list

The  =  type check can be combined with other typechecks:

            = & s    list constant

The system interface may define additional type checks for various system objects.

4.1.4  Searches

The search operators    (<)=*    (<)^*    (<)$*    (<)#*    (<)~*    (<)+*    (<)-*   can be used against string literals and list constants. The resulting integer cannot be extracted in the pattern but is required to be  > 0 .

To accept strings that begin with  "http://"  and end with  "/" :

            s $* "http://" <$* "/"

To accept lists containing the name [language] :

            s =* [language,]

(Note the  ,  – the search operations use lists, not list elements.)

4.1.5  Name splitters

Single attribute cliches like  [namespace|namestring]  can be split by a pattern like:

            namespace // namestring

namespace  and  namestring  can be any constants, but typically they are strings.

This pattern will match a name in the default namespace, binding  name  to the unsplit name and  namestring  to the namestring:

            name = ?:? // namestring

Splitting of cliches with more than one leg name is, by design, not supported since the undefined order of the legs would lead to ill-defined semantics.

4.1.6  Aliases

A pattern like  alias = pattern  requires the data to match both alias and pattern. Typically, alias is a variable, used to bind a node or list that is processed further by pattern.

4.1.7  Comparisons

A value can be required to differ from a constant

            s <> constant

or compared against numeric literals:

            s compare-operator literal
            literal compare-operator s

where literal is a numeric literal and compare-operator is one of the operators  < > <= >=

This pattern accepts an integer in the interval 10 – 20:

            % x >= 10 <= 20

4.1.8  Extracting the lengths of lists and strings

To extract the length of a list or string:

            list & length

            string $ length

Lengths can be constrained using comparisons. This pattern matches a string of no less than 5 and no more than 10 bytes:

            string $ (5 <= length <= 10)

4.1.9  List indexers

Specific list elements can be retrieved, using integer literals as indexes. Positive integers specify elements counting from  1  (as usual),  0  specifies the last element, -1  the last but one etc.

This extracts the 3rd element  e  from a list:

            (e) 3

This accepts a list  q  with identical first and last elements:

            q = (e) 1 = (e) 0
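
The index convention differs from most languages, so a small conversion sketch may help. The Python function below is hypothetical and only models the numbering; raising  IndexError  for positions outside the list is an assumption (in a pattern, the match would presumably just not succeed).

```python
def pils_index(items, i):
    """Fetch an element using the index convention described above."""
    if i > 0:
        position = i - 1                # 1 is the first element
    else:
        position = len(items) - 1 + i   # 0 is the last, -1 the last but one
    if not 0 <= position < len(items):
        raise IndexError(i)
    return items[position]


def same_ends(items):
    """Counterpart of the pattern  q = (e) 1 = (e) 0 ."""
    return len(items) > 0 and pils_index(items, 1) == pils_index(items, 0)


q = [10, 20, 30, 40]
third = pils_index(q, 3)
last = pils_index(q, 0)
last_but_one = pils_index(q, -1)
```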

4.1.10  Partial node specifications and anyway-specifiers

When specifying unescaped nodes in a pattern, the node name  []  acts as a joker:

            ;x

matches any node with an  x  leg and no other legs, binding  x  to the leg value.

To extract the node name:

            h ;x

This is as  ;x  but  h  is bound to the node name.

Using the node typecheck operator  *  with a node allows nodes with more legs than those specified in the pattern:

            * point: .x .y    will accept nodes like   point: .x .y .z   etc.

Anyway-specifiers allow nodes with fewer legs to be accepted, specifying constant default values for the missing legs:

            (name: .leg ...) anyway [.leg ...]

This can be combined with the * operator to allow more legs as well:

            (* name: .leg ...) anyway [.leg ...]

When an anyway-specifier is used, the node name  []  is not treated as a joker. If all the specified legs have defaults, the node name itself is acceptable to the pattern as a placeholder for a node with no legs.

The anyway-specifier is designed to allow default values for named parameters.
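
The interplay of exact matching, the  *  operator and anyway-specifiers can be sketched in Python. This is a simplified model under assumptions: a node is a pair of a name and a dict of legs, the hypothetical function  match_node  returns the bindings or  None  for a miss, and the special handling of the node name  []  is left out.

```python
def match_node(node, name, legs, defaults=None, allow_extra=False):
    """Match node against a pattern with the given leg names.

    defaults    models an anyway-specifier (values for missing legs)
    allow_extra models the  *  operator (more legs are tolerated)
    """
    node_name, node_legs = node
    if node_name != name:
        return None
    defaults = defaults or {}
    bindings = {}
    for leg in legs:
        if leg in node_legs:
            bindings[leg] = node_legs[leg]
        elif leg in defaults:
            bindings[leg] = defaults[leg]   # missing leg, default supplied
        else:
            return None                     # missing leg, no default
    if not allow_extra and set(node_legs) - set(legs):
        return None                         # unexpected extra legs
    return bindings


flat = ("point", {"x": 1, "y": 2})
solid = ("point", {"x": 1, "y": 2, "z": 3})
partial = ("point", {"x": 1})

exact = match_node(flat, "point", ["x", "y"])
too_many = match_node(solid, "point", ["x", "y"])
tolerant = match_node(solid, "point", ["x", "y"], allow_extra=True)
defaulted = match_node(partial, "point", ["x", "y"], defaults={"y": 0})
```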

4.1.11  Constant replacers

The constant replacer operations perform simple replacements on data before passing it to the qualified pattern. There are two variations:

            pattern call [.key replacement ...]

The keys are the only values accepted. The corresponding replacement is passed to pattern.

            pattern try [.key replacement ...]

The keys are replaced; other values are allowed and passed on unchanged.

Constant replacers are useful when dealing with enumerations in system interfacing, but they have other uses as well.
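
The difference between the two variations comes down to how unknown values are treated. A minimal Python model, with a dict standing in for the replacement node and a hypothetical  Miss  exception for a failed match:

```python
class Miss(Exception):
    """Signals that the pattern did not match (assumed model)."""


def replace_call(value, table):
    """Model of  pattern call [...] : only the keys are accepted."""
    if value in table:
        return table[value]
    raise Miss(value)


def replace_try(value, table):
    """Model of  pattern try [...] : keys are replaced, the rest passes through."""
    return table.get(value, value)


# Hypothetical enumeration, as might occur in a system interface.
ALIGNMENT = {"left": 0, "center": 1, "right": 2}

strict = replace_call("center", ALIGNMENT)
lenient = [replace_try(v, ALIGNMENT) for v in ("left", "justify")]

try:
    replace_call("justify", ALIGNMENT)
    strict_missed = False
except Miss:
    strict_missed = True
```

In both cases the returned value is what would be handed on to the qualified pattern.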

4.1.12  System oriented extractors

To extract the full paths of files and folders:

            file file: filename

            folder folder: path

To extract PILS extenders of system objects:

            system-object når: pils-extender

4.1.13  On the implementation of patterns

While PILS is generally an interpreted language, patterns are compiled into an intermediate form called quicksteps, which are substantially faster than the interpreted structures. During a match, quicksteps never perform reference counting or create objects, or use the context. When a match has succeeded, bindings are created and reference counters updated.

Compilation is a fast process that happens automatically whenever rulesets are read by the parser or generated by PILS code.

To speed up rule search, the rules of a ruleset are indexed when possible. For a pattern to be indexable, the top level must be a directly specified list, a node with fully specified head and leg names, or a cliche, string or number.

Aliases do not affect indexing, so rules like  { alias = indexable-pattern | ... } are indexed as well.

Assignment rules are indexed by the left side of the assignment, and kept separate from normal rules.

Indexing is transparent to the programmer, except for the performance implications. When constructing large rulesets, such as state machines for parsers, the rules should be constructed as indexable whenever possible, and non-indexable rules should be lumped together, preferably near the top of the ruleset.

4.2  Rule actions

When a rule action is entered, the rule call is unresponded: it has not been decided whether the rule will succeed or fail. The responders and binders described below are valid in unresponded rule calls only; in a responded rule call, they will fail.

4.2.1  Responders

PILS has five responders:

            :ok tail
            :try tail
            :need expression    (often written as  expression .:need )
            :do tail
            :error expression

The lexical scoping of responders and their use for individual operations rather than statement blocks allow a more fine-grained error handling than the try-catch  approach used in mainstream languages.

The  :ok  responder is the preferred way of getting a result from a rule. The responder takes responsibility for the call and returns the evaluated  expression, which should not contain further responders for the call.

This is similar to the return statement of C and its derivatives, with tail flattening granted.

Implementation note: Whenever the  :ok  responder is the first and only rule action, with nothing but binders before it, it is compiled and the stack setup required for responders is eliminated. If the following expression is a constant, a parameter value or a simple node or list construction based only on constants and parameter values, the rule will be fully compiled.

The  :try  responder tries to pass the responsibility to tail, without taking responsibility by itself. If the top level construct of  tail  succeeds or fails, the same happens to the original call, with tail flattening all the way. If the  :try  falls through, an empty list is returned. The rule will then miss, unless another responder is used.

Unlike the try statement of C++ and derivatives, the  :try  responder does not cover subexpressions that might fail or miss.

The  :need  responder is used for partial results that are required but not sufficient for the rule to succeed.

Failure will be forwarded to the original call, whereas success will be returned as result of the  :need  statement. Fall-throughs will make the innermost  :try  for the same rule miss. If no  :try  is active, the rule will miss.

If a needed  expression has subexpressions susceptible of failure, nested  :need  responders can be used to cover these. The inversion form  expression .:need  is convenient for this use.

The  :possibly  responder returns the evaluated  expression, but in contrast to the  :ok  responder, embedded responders are allowed, especially the  :need  responder, which has the same effect as when used inside a  :try  responder.

The  :error  responder throws an error directly to the original call.

4.2.2  Binders

These binder statements:

            :self tail
            :who tail
            :where tail
            :what tail

bind their names to data circumstantial to the call, as follows.

self  is bound to the bound ruleset.

who  is bound to the object being called, possibly an aggregate containing the bound ruleset. For object calls on a bound ruleset that is not aggregated,  who  is the same as  self .

For simple calls that do not specify an object,  who  is the context in which the call was made, i.e. the same as  where .

where  is bound to the context in which the call was made.

what  is bound to the call node that initiated the call.

The  self  and  who  binders are typically used when an object needs to use its own methods. Using the  who  binder is similar to virtual calls of OO languages.

The  where  and  what  binders are used by the bug pinner to locate failing calls. When a call is forwarded by the  :try  or  :need  responder, binders in the forwarded call will behave as follows:

            :where  and  :what  refer to the original call.

            :self  and  :who  refer to the forwarded call.

4.2.3  Tags

The binders and responders exist in tagged versions, for use within embedded rules.

The tagged binders

            :self [tag]; tail
            :who [tag]; tail
            :where [tag]; tail
            :what [tag]; tail

bind  name [tag]  instead of name.

The tagged responders

            :ok [tag]; tail
            :try [tag]; tail
            :need [tag]; expression
            :error [tag]; expression

are used with the tagger:

            :tag [tag]; tail

allowing an inner rule to act on behalf of an encompassing rule.

            { ...
            | ... :tag [tag]; ...
              ... { ... | ... :ok [tag];  ... }
            }

Generally, tagged responders are rarely used, though the PILS programming system uses them to deal with syntax errors in modules.

4.3  Pinning bugs with responders

The explicit managing of the responsibility of calls helps in locating bugs. A major problem of weakly typed languages will surface in situations as shown by the following simplistic example:

             ! Anna's application tries to double a number using Benny's doubler library
            double "MMVII"
            ===
            ! Benny's library uses Barry's library
            { double: x | :ok multiply (x  .by 2) }
            ===
             ! Barry's library routine expects numbers
            { multiply: x .by | :ok x * by }

When Anna executes her application, the operation  x * by   in Barry's library will fail, leaving Anna mystified, not knowing how Barry's code got involved in her application.

The situation is as illustrated by this nursery rhyme (by Halfdan Rasmussen, translation suggested by Ian Noble in a usenet discussion):

            Benny's breeks were burning.
            Barry roared anon.
            Barry having, namely,
            Benny's britches on.

This is a general problem with weakly typed programming languages – when the behaviour of the data is not defined by an explicit contract, programmers have to resort to stack dumps, single stepping, or test logs to find the real cause of such failures. In PILS however, a sensible use of responders can help pinning a bug in one swift move.

Using the  :try  or  :need  responder, PILS allows the library routine to perform specific operations on behalf of its caller, so when things go wrong, the caller is blamed.

In the above example, Barry should revamp his library to use a more pessimistic approach that doesn't trust the arguments. Anna will now have this setup:

            double "MMVII"  ! Anna's application
            ===
            { double: x | :ok multiply (x .by 2) } ! Benny's library
            ===
            { multiply: x .by | :try x * by } ! Barry's revised library

Now, the  multiply (x .by 2)  operation in Benny's library gets the blame. Benny might revise his library, so Anna gets this setup:

            double "MMVII"  ! Anna's application
            ===
            { double: x | :try multiply (x .by 2) } ! Benny's revised library
            ===
            { multiply: x .by | :try x * by } ! Barry's revised library

And the failure is now immediately blamed on the  double "MMVII"  call – pointing right to the smoking match in Anna's hand. Mystery solved.

The above example only serves as a simple illustration of the concept; such simple calculations would really be better served by typechecks in the pattern. However, in cases where the arguments are objects supposed to implement certain methods, type checks are of little help, PILS not having classes or interfaces to test against. Instead, the methods can be called without taking responsibility, so that the caller that supplied the arguments gets the blame when needed methods are not supported by the arguments.

PILS is crafted to support this technique without the code bloat that would result if individual operations had to be dealt with by traditional try-catch blocks.
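
The way blame moves up the call chain in the three setups above can be modelled outside PILS. The Python sketch below is an illustration, not a description of the bug pinner: the names  Failure ,  rule  and  build  are hypothetical, and the call site is passed along explicitly as a string, where PILS would track it by itself.

```python
class Failure(Exception):
    """A failed operation, tagged with the expression that gets the blame."""

    def __init__(self, blamed):
        super().__init__(f"failed: {blamed}")
        self.blamed = blamed


def primitive_multiply(x, by):
    """Barry's operation; it only works on numbers."""
    if not isinstance(x, (int, float)) or not isinstance(by, (int, float)):
        raise Failure("x * by")
    return x * by


def rule(body, forwards):
    """Wrap body as a rule.

    forwards=False models  :ok  (the rule keeps the responsibility);
    forwards=True models  :try  (a failure is blamed on the call site).
    """
    def call(call_site, *args):
        try:
            return body(*args)
        except Failure as failure:
            if forwards:
                raise Failure(call_site) from failure
            raise
    return call


def build(barry_forwards, benny_forwards):
    """Assemble Benny's doubler on top of Barry's multiplier."""
    multiply = rule(primitive_multiply, barry_forwards)
    return rule(lambda x: multiply("multiply (x .by 2)", x, 2), benny_forwards)


def blamed_for(double):
    """Run Anna's application and report which expression gets the blame."""
    try:
        double('double "MMVII"', "MMVII")
    except Failure as failure:
        return failure.blamed
    return None


original = blamed_for(build(False, False))
barry_revised = blamed_for(build(True, False))
both_revised = blamed_for(build(True, True))
```

Each library that forwards responsibility moves the reported location one step closer to the call that supplied the bad argument.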

5  System-oriented features

5.1  Supplementary constant types

The PILS kernel supports extending the data model with constant types, by means of registering parser plugins and methods.

When the parser encounters a starting brace  {  a lookahead is performed, keeping track of strings and nested parentheses. If a closing brace  }  is encountered with no unnested rule bar  |  in between, the parser plugins get a try at parsing the enclosed characters.

The supplementary types are provided by the framework bindings and may vary between frameworks.

5.2  States in a changing world

So far, all PILS data structures presented are stateless and immutable. To accommodate the needs of interacting with a changing world, a few state-bearing objects have been introduced into PILS. They have been crafted so as to minimize the risk of creating memory and resource leaks by circular structures.

5.2.1  Channels, listeners and plugs

Let channel be a node constant of the form

            [channel: key]

where key can be any constant. The operation

            channel listen (listener)

creates an opaque plug, which for its lifetime plugs listener into channel. The only way to unplug it is to trash the plug by losing all references to it.

When a call is made to channel, it gets forwarded to its listeners in turn, the youngest listeners first, until one of them responds, as if the listeners had been aggregated with the  +++  operator, except that the  :who  binder will bind to the plug.

Like all constants, channels are unique. If an expression like  (channel: filename)  is performed several times for the same file name, the channel will be reused. The PILS programming system uses such channels to keep file windows and module editors unique.

Channels are thread safe.
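
The lifetime rule for plugs resembles weak references in other languages. The following Python sketch models a channel under assumptions: all class and function names are hypothetical, a miss is a  Miss  exception, and neither thread safety nor the  :who  binding is modelled.

```python
import gc
import weakref


class Miss(Exception):
    """Signals that no listener responded (assumed model)."""


class Plug:
    """Opaque handle; the listener stays plugged in while this object lives."""

    def __init__(self, listener):
        self._listener = listener


class Channel:
    def __init__(self, key):
        self.key = key
        self._plugs = []  # weak references to plugs, oldest first

    def listen(self, listener):
        plug = Plug(listener)
        self._plugs.append(weakref.ref(plug))
        return plug

    def call(self, argument):
        # forget plugs that have been trashed
        self._plugs = [ref for ref in self._plugs if ref() is not None]
        for ref in reversed(self._plugs):  # youngest listeners first
            plug = ref()
            if plug is None:
                continue
            try:
                return plug._listener(argument)
            except Miss:
                continue
        raise Miss(argument)


_channels = {}


def channel(key):
    """Channels are unique per key, like PILS constants."""
    if key not in _channels:
        _channels[key] = Channel(key)
    return _channels[key]


def demo():
    ch = channel("notes.txt")
    old = ch.listen(lambda arg: f"old:{arg}")
    new = ch.listen(lambda arg: f"new:{arg}")
    first = ch.call("x")
    del new        # losing the last reference unplugs the listener
    gc.collect()   # only needed on runtimes without reference counting
    second = ch.call("x")
    return first, second, channel("notes.txt") is ch
```

Because the channel holds its plugs weakly, a listener never keeps itself alive through the channel, which avoids the circular structures mentioned above.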

5.2.2  Alien controlled minds

When PILS is used with a windowing framework, the life span of windows is generally controlled by the user or the framework, not by PILS. Such externally controlled objects are said to be aliens. PILS creates special wrappers for them and treats them as constants, with methods and event handling defined by the framework bindings.

Aliens can have PILS objects attached to them in possibly circular ways.

PILS uses the event handling mechanism of the framework to track the destruction of aliens. When an alien is destroyed, PILS blinds its wrapper and releases any associated data.

To associate data with an alien and retrieve it:

            alien mind . key := value;
            ...
            alien mind . key

Values can be overwritten but keys cannot be deleted. There is no way to get at the values without knowing their keys.

5.2.3  Straps

Nodes of form  [strap: key]  are similar to channels, but can only have a single listener, and only when strapped to one or more aliens. Once these aliens perish, the strap loses its mind.

Straps are used by the programming system for program straps which are strapped to all windows of a particular program. When all windows of a program are destroyed, the program strap loses its mind and resources are freed.

During startup of programs, their straps are temporarily strapped to the universal key.

5.3  Simple file access

PILS standard file and folder objects allow simple operations on files and file systems.

A folder is the same as a directory.

Reading and writing of files is only supported on whole files. File and folder objects are really name wrappers with allowance to use the file system; PILS has no concept of open files.

File and folder names have full paths, using  /  as directory separator on all systems, including MS Windows. The MS Windows style separator  \   is not used in PILS filenames.

Folder names always end with  / .

The  file  and  folder  functions are methods of the universal key and do not work without it. By hiding the universal key, sandboxes can be constructed, allowing foreign scripts to execute with restricted file system access.

5.3.1  File objects

            file (filename)
            datafile (filename)

creates a filename wrapper object –  file  is the built-in function,  datafile  is a wrapper for use with data dependent modules.

To retrieve the filename:

            file name

To read and write its contents in one go:

            file text
            file text := text

Reading and writing is done with raw bytes; encoding conversions are handled as separate operations:

            file text bytes utf-8

reads an ANSI encoded file and converts it to a utf-8 encoded string.

            file text := text utf-8 bytes

writes a utf-8 encoded string to an ANSI encoded file.

To manipulate files:

            file (filename) copy (newname)
            file (filename) move (newname)
            file (filename) delete

These do the obvious thing – newname is a string; the result of copy  and  move  is a filename wrapper for newname.

To test the existence of a file:

            file ok

misses if the file does not exist, else file is returned.

            file writable    returns  1  if writing is possible and permitted,  0  if not.
            file readable    same for reading

            file count    returns the file size in bytes

            file timestamp    reads the last-modified timestamp of the file
            file timestamp (time)   sets the last-modified timestamp

There are no methods for extracting parts of the name; this is easily accomplished with the standard PILS string operators:

            fn <-#-* "/"    returns the folder name
            fn <+#-* "/"    returns the file name without folder
            fn <+#-* "/" <-#=* "."    returns the file name without folder and type
            fn <+#-* "/" <+#=* "." --# 1    returns the file type
            fn <-# (file name <+#-* "/" <=* ".")    returns the file name without type
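
For comparison, the same name parts can be extracted in Python with plain string searches. This sketch is not a translation of the PILS operators: the function  split_name  is hypothetical, and it assumes that the folder name keeps its trailing  / , that the type is what follows the last dot, and that the type is returned without the dot.

```python
def split_name(fn):
    """Split a full file name into the parts listed above."""
    slash = fn.rfind("/")
    folder, base = fn[:slash + 1], fn[slash + 1:]
    dot = base.rfind(".")
    if dot < 0:
        stem, file_type = base, ""   # no type present
    else:
        stem, file_type = base[:dot], base[dot + 1:]
    return {
        "folder": folder,                # folder name, ending with "/"
        "file": base,                    # file name without folder
        "stem": stem,                    # file name without folder and type
        "type": file_type,               # file type
        "without_type": folder + stem,   # full name without type
    }


parts = split_name("/home/anna/report.final.txt")
```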

5.3.2  Folder objects

            folder (folder-name)

creates a folder-name wrapper object. folder-name  must end with a  "/"  character. To retrieve the name:

            folder name

To search the folder for files and subfolders:

            folder files    returns a list of file objects
            folder folders    returns a list of folder objects

Search patterns are not supported – to get all files of type  .pils  in a folder:

            folder files each {file|:if file name <$* ".pils"; :ok}

To travel a folder tree and list all files:

            folder call {folder|:who :ok folder files & (folder folders every (who) splice)}

To create and delete empty folders:

            folder create
            folder delete

5.3.3  Files and folders in patterns

To facilitate filtering of files and folders,  file  and  folder can be used as operators in patterns.

            file file (fn)   matches a file whilst matching its full name to fn

            folder folder (fn) matches a folder whilst matching its full name to fn

This example will list all  .pils  files inside your document folder with their timestamps, excluding backup folders:

            folder (platform path documents) call
            { folder folder (?) | :who :ok folder files & (folder folders each (who) splice) }
            { ? folder (? <$* "/backup/") | ? }
            each { file file (filename <$* ".pils") | :ok filename, file timestamp }
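
For readers more familiar with mainstream languages, roughly the same traversal can be written in Python with the standard library. This is a comparison sketch, not PILS: the function names are hypothetical, timestamps are returned as raw modification times, and backup folders are recognised by the folder name  backup .

```python
import os
import tempfile
from pathlib import Path


def list_pils_files(root):
    """List (name, timestamp) for all .pils files below root, skipping backup folders."""
    found = []
    for folder, subfolders, files in os.walk(root):
        # pruning the list in place keeps os.walk out of backup folders
        subfolders[:] = [name for name in subfolders if name != "backup"]
        for name in sorted(files):
            if name.endswith(".pils"):
                path = Path(folder) / name
                found.append((path.as_posix(), path.stat().st_mtime))
    return found


def demo():
    """Build a small temporary tree and list the files found in it."""
    with tempfile.TemporaryDirectory() as root:
        for rel in ("a.pils", "notes.txt", "sub/b.pils", "backup/old.pils"):
            path = Path(root, rel)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text("! sample")
        listing = list_pils_files(root)
        return sorted(Path(name).relative_to(root).as_posix() for name, _ in listing)
```

As in the PILS version, the traversal and the filtering are kept separate: one step decides which folders to enter, the other which files to report.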

5.3.4  Zip files

To read a zip file:

            file (zipfilename) zip

returns a list of lists  (path, data, timestamp)  for all archive entries in the zipfile.

To create a zipfile:

            file (zipfilename) zip := entries

Note: zipfile creation is currently not supported in jucePILS.

As with filenames, directories in entry paths are separated by  / , which is also the separator prescribed by the zip format.

There is presently no support for modifying zip files or reading single entries, and no other archive formats are supported.
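
The shape of the returned data can be illustrated with Python's standard  zipfile  module. This is a comparison sketch, not the PILS implementation: the function names are hypothetical, and the timestamp is represented as the six-field tuple that  zipfile  provides.

```python
import io
import zipfile


def read_zip(source):
    """Return a list of (path, data, timestamp) for all entries of an archive."""
    with zipfile.ZipFile(source) as archive:
        return [
            (info.filename, archive.read(info), info.date_time)
            for info in archive.infolist()
        ]


def demo():
    """Create a one-entry archive in memory and read it back."""
    buffer = io.BytesIO()
    entry = zipfile.ZipInfo("docs/readme.txt", date_time=(2007, 1, 2, 3, 4, 6))
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(entry, b"hello")
    buffer.seek(0)
    return read_zip(buffer)
```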

Note: OpenOffice files use the zip format. The PILS documentation files are authored using OpenOffice writer, and converted to HTML by a PILS script.

5.4  Worker threads, knots and latecomers

Worker threads allow the PILS user interface to stay responsive while lengthy computations are going on, and allow programs to benefit from multiprocessor systems.

PILS threads and knots are designed for this use and do not offer the fine-grained synchronization mechanisms required for process control. In return, the programmer does not need to worry about deadlocks or calling objects from the wrong threads.

The  lib/pils/english/compute/compute.pils  library offers a simple wrapper for using a single worker thread for parsing of files or similar processing.

5.4.1  Creating a worker thread

            :thread expression

This construct is only valid in the main thread. A worker thread is created and started, and  []  is returned. The worker thread evaluates expression, discards the result and terminates.

System calls and user interface manipulations cannot be performed directly by worker threads. However, a worker thread can temporarily take over the main thread by calling a knot.

5.4.2  Knot calls

The only means of synchronization available for PILS threads is the knot call, that is, calling methods of an object wrapped in a knot:

            (knot: object) method

The knot wrapper effectively makes object thread safe, by forwarding calls from worker threads to the main thread, for processing in its idle time, while the worker thread is blocked.

There is no support for parking a thread to awake it later.

When the call succeeds, fails or falls through, the worker thread is restarted in a state that reflects this. The thread switch is generally transparent, except that tail flattening and piping are suppressed, and responders, list builders and exits do not work across thread borders.

Generally, worker threads should do lengthy computation on their own most of the time, occasionally using knots for system access, user interface updates and information exchange. If a worker thread stays in a knotted call, the PILS system will be blocked.

When used by the main thread, knotted calls pass through with no synchronization. They are still thread safe because only the main thread performs knotted calls.

The PILS module that displays error messages is knotted, so when a worker thread throws an error, the bug pinner window will be shown by the main thread.
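
The forwarding idea behind a knot can be illustrated outside PILS. The Python sketch below is an analogy under stated assumptions, not the PILS implementation: the names  Knot ,  call  and  pump  are invented for this example, and  pump  stands in for whatever the main thread does in its idle time.

```python
import queue
import threading


class Knot:
    """Wrap an object so that calls from other threads run on the owner thread."""

    def __init__(self, target):
        self._target = target
        self._owner = threading.current_thread()
        self._requests = queue.Queue()

    def call(self, method, *args):
        # The owner thread passes straight through, with no synchronization.
        if threading.current_thread() is self._owner:
            return getattr(self._target, method)(*args)
        # Any other thread queues the request and blocks until it is served.
        done = threading.Event()
        box = {}
        self._requests.put((method, args, box, done))
        done.wait()
        if "error" in box:
            raise box["error"]
        return box["value"]

    def pump(self, timeout=0.05):
        """Serve one queued request on the calling (owner) thread, if any."""
        try:
            method, args, box, done = self._requests.get(timeout=timeout)
        except queue.Empty:
            return False
        try:
            box["value"] = getattr(self._target, method)(*args)
        except Exception as exc:  # hand the failure back to the worker
            box["error"] = exc
        done.set()
        return True
```

In this sketch the owner thread would call  pump  repeatedly from its idle loop. A worker that issues many long-running calls keeps the owner busy, which mirrors the warning above about threads that stay in a knotted call.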

5.4.3  Latecomers

During construction of user interfaces, some operations need to be queued for later execution, at a time when the window being constructed has been sized and shown.

            :later expression

expression is queued with the current context, and  []  is returned immediately. When the main thread is idle, expression is evaluated in the supplied context, and then trashed.

Latecomers can be created from the main thread as well as from worker threads; their execution is always in the main thread.
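
The queue-now, run-when-idle behaviour can be sketched in Python as follows. This is an illustration only: the class  LaterQueue  and its methods are hypothetical names, Python closures stand in for a PILS expression with its context, and  run_idle  is assumed to be called by the main thread when it has nothing else to do.

```python
import threading
from collections import deque


class LaterQueue:
    """Collect actions from any thread and run them later on one thread."""

    def __init__(self):
        self._pending = deque()
        self._lock = threading.Lock()

    def later(self, action):
        """Queue a zero-argument callable and return immediately."""
        with self._lock:
            self._pending.append(action)
        return None

    def run_idle(self):
        """Run all queued actions in order on the calling thread; return the count."""
        ran = 0
        while True:
            with self._lock:
                if not self._pending:
                    return ran
                action = self._pending.popleft()
            action()  # executed outside the lock, then discarded
            ran += 1
```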

6  PILS programs

A PILS program is a collection of modules, stored in one or more library files. All modules and libraries involved in running a program are gathered in a program strap.

Program straps are managed by the library  pils/english/system/system.pils  which is located by the executable at startup, searching the directory of the executable and its parent directories. When found, the module [pils system boot] is parsed and called.

As these modules have to work before the waxball is initialized, they rely on a somewhat ruder mechanism for referencing each other. You should not edit them unless you know what you are doing – if you break them, the error reporting of PILS will break down too and you will have to resort to primitive and tedious methods for sorting out what went wrong.

6.1  Modules

The basic unit of the PILS programming system is a module. Generally, a module is a named PILS expression, very often one that results in a bound ruleset of exported rules.

            { ... | } ! exported rules
            ...
            ===
            private-definitions

In the library file, each module is stored with a header containing the module name and some attributes, of which the  .timestamp  attribute keeps track of when the module was last edited, and the  .language  attribute indicates a PILS language object to be used for parsing the module, as described below.

Modules are similar to singleton objects of object-oriented programming, on a per program basis: when several documents are open, they will use different program straps and each of these will have separate instances of the modules. Furthermore, when a module is being edited while in use, multiple instances can sometimes exist within the same program.

6.1.1  Module names and references

A module name is a list of PILS names. The programming system presents the modules as a tree, merging common beginnings.

Modules cannot implicitly access their own exported methods. If a rule needs to access other rules of the same ruleset or aggregate, the  :who  or  :self  binder should be used.

Modules can refer to each other by absolute name:

            @@ [game board checkers]    always refers to    [game board checkers]

Or by relative names. In a module named  [game board] :

            @@ chess

refers to  [game board chess] ,  [game chess] or  [chess,]

            @@ . [chess sizer]

refers to  [game board chess sizer] ,  [game chess sizer] or  [chess sizer]
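
The candidate names listed above follow a simple pattern: the relative name is appended to the current module name, then to each shorter beginning of it, down to the root. The Python sketch below reproduces that list. The function names are hypothetical, and picking the first candidate that exists is an assumption; the text only lists the candidates.

```python
def candidate_modules(current, relative):
    """List the absolute names a relative reference may denote, longest first."""
    current = tuple(current)
    relative = tuple(relative)
    return [current[:depth] + relative for depth in range(len(current), -1, -1)]


def resolve(current, relative, existing):
    """Return the first candidate present in `existing`, or None (assumed rule)."""
    for name in candidate_modules(current, relative):
        if name in existing:
            return name
    return None
```

For example,  candidate_modules(("game", "board"), ("chess", "sizer"))  yields the three names given for  @@ . [chess sizer]  above, in the same order.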

Absolute module references are implemented by a function rule  {@@: module-name|...}  whereas relative module references are implemented by a helper object named  @@ .

Both absolute and relative module references can use computed module names.

For the common case when a module exports a method of the same name, the helper object  @  implements a shorthand:

            @ board (...)    is the same as    @@ board board (...)

            @ board    is the same as    @@ board board

For specifying longer references:

            @ [board] checkers (...)    is the same as    @@ . [board checkers] checkers (...)

            @ [game board] checkers (...)    is the same as    @@ . [game board checkers] checkers (...)

6.1.2  Common functionality in === modules

Functionality that is commonly used throughout an application can be implemented by rules in a module with  ===  as its last name. The rules can be used directly by all other modules in that branch of the module tree.

The system library  system.pils  contains a module  [===,]  defining functionality available to all PILS modules.

6.1.3  Module references in changing programs

Module references are basically dynamic: if a module is edited and saved while in use, further references to the module will retrieve an instance of the updated module.

However, module references that were executed before the change was saved still hold the old version.

This distinction is important when experimenting with changes in a running application: the features accessed via module references will change immediately as the modules are edited, whereas features implemented by objects that were created at application startup will not change unless the application is restarted.

The latter procedure – restarting the application to test a change – is still a common way of doing things, especially with compiled languages, but rarely necessary when working with PILS.

6.1.4  Module instantiation

When a module is referenced for the first time by a running program, the programming system will instantiate it by parsing and evaluating the expression, keeping the instance in an instance mind for further references.

Instantiation happens on program basis. When several PILS programs are running, each will have its own program strap with separate module instantiations.

Whenever a module is edited and saved, it is removed from the instance mind of all affected program straps. Other modules that referenced the module during their instantiation are removed too. The old instances are still usable but future references will refer to new instances.
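
The bookkeeping described in this section can be sketched in Python. This is a simplified model, not the PILS programming system: the names  InstanceCache ,  ref  and  save  are invented, modules are modelled as factory functions, and dependents are flushed transitively on the assumption that indirect references count as well.

```python
class InstanceCache:
    """Instantiate modules on first reference and flush them when they are saved."""

    def __init__(self, sources):
        self.sources = dict(sources)  # module name -> factory(ref)
        self.instances = {}           # module name -> cached instance
        self.dependents = {}          # module name -> modules that referenced it
        self._stack = []              # modules currently being instantiated

    def ref(self, name):
        """Return the instance of a module, creating it on first use."""
        if self._stack:
            # Remember which module referenced `name` during its instantiation.
            self.dependents.setdefault(name, set()).add(self._stack[-1])
        if name not in self.instances:
            self._stack.append(name)
            try:
                self.instances[name] = self.sources[name](self.ref)
            finally:
                self._stack.pop()
        return self.instances[name]

    def save(self, name, factory):
        """Store an edited module and flush it together with its dependents."""
        self.sources[name] = factory
        pending = [name]
        while pending:
            current = pending.pop()
            self.instances.pop(current, None)
            pending.extend(self.dependents.pop(current, ()))
```

Code that obtained an instance before  save  was called keeps using the old object, which matches the behaviour described in section 6.1.3.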

6.1.5  Datafiles and data dependent modules

If, during instantiation, a module uses data from text files, XML files, spreadsheets etc., changes in the files may necessitate reinstantiation of the module so that the changed data are reflected by the PILS program. This is supported by the functions

            datafile (filename)

which creates a file object and, if called during module instantiation, registers the module's dependency on the timestamped file, and

            datafile-check

which checks the timestamps of all datafiles registered by the program, flushing from the instance mind of the program strap all module instances that depend on files with a changed timestamp.
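
The timestamp check can be pictured with the following Python sketch. It is a model under assumptions rather than the actual mechanism: the class and method names are hypothetical, the module name is passed explicitly instead of being taken from the instantiation in progress, and the modification time reported by the file system is used as the timestamp.

```python
import os


class DatafileRegistry:
    """Remember which module read which file, and report modules gone stale."""

    def __init__(self):
        self._seen = {}  # (module, path) -> modification time at registration

    def datafile(self, module, path):
        """Register that `module` depends on the file at `path`."""
        self._seen[(module, path)] = os.path.getmtime(path)
        return path

    def check(self):
        """Return the modules whose files changed, and forget their entries."""
        stale = set()
        for (module, path), stamp in self._seen.items():
            if os.path.getmtime(path) != stamp:
                stale.add(module)
        for key in [k for k in self._seen if k[0] in stale]:
            del self._seen[key]  # re-registered when the module is instantiated again
        return stale
```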

6.1.6  Module and program attributes by “this”

The name  this  can be used to query certain properties of the current module and program.

            this program filename
            this program path
            this program language

get the filename, path and language of the program file for which the module is instantiated.

            this module name
            this module text
            this module expression
            this module language
            this module library

get attributes of the module.

The bug pinner depends on this, and the name  this  should never be used for other purposes.

6.1.7  Language modules

Each module has an associated language attribute which stores a language name, which can be a list of names.

The language of a module is found by prepending  [pils language]  to the language name and looking this up as a relative module reference, like the expression:

            @@ . ([pils language] & language-name)

To get started, the language name  [system]  always refers to the language object used for booting PILS.

In most cases, the language module is a PILS language object, possibly with national translations of the PILS vocabulary, possibly with application or library specific namespace prefixes such as the  j:  used by the  juce  library bindings, or namespace prefixes suitable for dealing with specific XML formats such as OpenOffice documents.

Alternatively, custom parser objects can be defined in language modules. A simple example of this is the  text  language, with a trivial parser that simply returns the text, which is useful for saving text data in PILS libraries.

The PILS editor exposes language modules as menu options.

6.2  Libraries

A library is a file that contains modules and possibly references to other libraries. When a library is loaded, its referenced libraries will be loaded too if this has not already been done.

A PILS program is simply a library which is used as an application of its own.

6.3  Program straps

When the user opens a program, it is loaded with the libraries it depends on. The libraries are then merged into a program library which includes all modules present in all the libraries.

The life time of a program library is managed by a strap which is strapped to all top level windows created from that program. The program is released when the last of its windows is closed.

When a library is used as part of a program library, its module references pertain to the whole program library.

Four libraries – the system library, the editor library, the platform library and a user configuration library – must be available and loaded for the PILS system to work. The system, editor and platform libraries are located by relative paths from the executable; the user configuration library is chosen at startup, based on your computer's language settings.

If your computer is set to a language other than English, configuration and language files for that language will be linked in if they exist, and the  say  function – which should be used for all messages and labels in the user interfaces of PILS programs – will then use that language.

Currently, only English and Danish are supported.

6.4  Command line processing and single-instance checking

On startup, the PILS executable will try to detect a previous instance and pass the command line to it. If this fails for any reason, a new instance is started.

At startup, PILS creates an object that must be able to process command lines. This object is defined by the module  [pils system start]  in the waxball organizer library.

This design is slightly complicated by the fact that waxballs come and go as programs are opened and closed, so a channel  [channel: pils: command-line]  is used to pass the requests to a live waxball with a suitable instance of the  [pils system start]  module.

The MS Windows installer defines operations open and edit for the PILS file type in the registry; the command lines are distinguished by an  -edit  option in the edit command line. This is recognized by a rule in  [pils system start]  and has the effect of directly opening the editor, regardless of whether another action is specified in the  [pils run start]  module.

7  The PILS editor

PILS programs are created, edited and tested using the PILS editor – a simple tabbed text editor with facilities for searching and navigating PILS libraries and executing test functions.

A PILS editor window always deals with a specific library and will only edit modules within that library. To edit multiple libraries, multiple windows are used.

The editor itself is located in a PILS library,  lib/english/pils/editor , and runs in the waxball of the library being edited, allowing libraries to modify the behavior of the editor by redefining its modules.

When a module is saved, all modules that referred to it directly or indirectly during their instantiation are flushed from the instance minds.

Note: redefining of editor modules should be done with caution. If a broken editor module prevents the editor from working, you cannot use that same editor to remove or mend the module. You will then need to open the library file using another editor, or restore an earlier version of the PILS editor.

7.1  Creating a PILS program

To create a PILS program, simply create an empty file with the extension  .pils , and open it with the PILS executable. You will be asked to choose the language you want to use for programming.

The file

             lib/pils/language/pils/new.pils

serves as a simple template with the appropriate settings and will be copied over your empty file and opened. (Currently, only Danish and English are supported).

The language you choose will be used to store and show module names and will be the initial language for new modules in your program. You can change the file language later.

Note: The user interface language is controlled by your computer's language setting, not by the program language. If you write your programs using English terms and execute them on a system with Danish language settings, or the other way around, PILS will attempt to translate messages and labels.

7.2  Opening an existing program

Double-clicking a PILS file will call the  open  function in the module  [pils run command open] . If this module has not been redefined, an editor window will be opened.

Similarly, the Edit command will call the  edit  function in the module  [pils run command edit] , which will also open an editor window.

When you build PILS applications, you will usually start the application by starting the editor and executing a test function. When testing and polishing becomes more important, you can create a module  [pils run command open]  and define a rule:

            { .open | ... }

The editor window will still be available by using the Edit command.

In the command line, the option  -edit  (always in English, with a prepended hyphen) is used to distinguish the Edit command from the Open command which is the default.

7.3  Working with modules

When you open a program with the editor, the last changed module will be shown in a tab.

7.3.1  Editing

The editor is simple – the usual shortcuts apply:

            Ctrl-C    copy

            Ctrl-X    cut

            Ctrl-V    paste

            Ctrl-Z    undo

            Ctrl-Y    redo

            Ctrl-S    save

All saving goes straight to the disk file. You cannot test a module without saving it.

Attempts to save modules with syntax errors will be rejected, and you will be directed to the spot where the parser failed.

Tip: If you need to save a module with syntax errors, set the language to  text  as described below.

7.3.2  Navigating the module tree

            Ctrl-M    shows the modules of your program in a tree view.

To open a module, activate it by double clicking or hitting the  Enter  key. Use the  Esc  key to close the tree pane.

7.3.3  Creating, moving and deleting modules

To create a new module:

            Ctrl-N    creates a submodule of the current module.

To create a root module, create a submodule of an existing root module and move it down, using the Module menu.

            Module->move->down    moves the module and its submodules towards the root

            Module->move->up->sibling    moves the module and submodules up on sibling

            Ctrl-K    copies the current module and submodules to another name.

            Ctrl-R    renames the current module and submodules.

            Module->delete    deletes the current module if it is empty and has no submodules.

            Module->move->library->library    moves to another library, with submodules.

Tip: Deleting modules is – by design – a bit troublesome, to save you accidentally deleting code by hitting the wrong key. If you need to delete many modules, create a trash library, put them there and delete the trash library file.

7.3.4  Changing the language of a module

To change the language of a module, make your pick from the  Language  menu. The module must be saved after the language change.

Except for  system  which always refers to the language object used to boot PILS and to read program headers, the entries in the language menu refer to modules with the words  pils language in their name. To add a language to the menu, create a module named

            pils language name

and write a language object or an expression that creates one, or an object of your own making with at least a  read  method, like the languages  text  and  textlist  defined by modules in the editor library –  text  simply delivers the module text,  textlist  splits it into lines.

Language modules can be specific to a branch in the module tree:

            job myapp pils language mylanguage

defines a language which is only available to modules in the  job myapp  branch of the module tree.

If you set the language of a module to a branch-specific language, you should not move the module outside that branch.

The juce binding library – which wraps the PILS juce bindings – uses a branch-specific language which associates the  j:  prefix with the namespace used internally by PILS for juce methods and classes.

7.3.5  Changing the program language

You can change the program language if you do not want to stay with the language you chose when you created the file. First, set your program to use an appropriate language library:

            lib/pils/language/pils/language.pils

Then, use the menu  File->settings->language-> language  to set the program language.

This triggers a rebuild of the editor window, to ensure that module names are shown correctly.

The language change does not affect the language setting of modules already created.

7.4  Using PILS libraries

To use PILS libraries, select

            Libraries->use

and use the  Insert  and  Delete  keys to add or remove libraries from the list.  Insert  will open a standard file picker dialog, and encode the selected file name by one of the following prefixes if applicable:

            <lib>/    – the  lib/pils/  folder

            <doc>/    – the user's document folder

            <.>/    – the folder of the importing library

            <..>/    – parent of the folder of the importing library

This helps keep the imports valid when PILS libraries are moved.
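
A possible way to picture the prefix encoding is the Python sketch below. It is not the editor's code: the function name and parameters are hypothetical, the folder locations are passed in explicitly, and trying the prefixes in the order they are listed above is an assumption.

```python
from pathlib import PurePosixPath


def encode_library_path(path, lib_dir, doc_dir, importer_dir):
    """Replace a known folder at the start of `path` by its symbolic prefix."""
    target = PurePosixPath(path)
    importer = PurePosixPath(importer_dir)
    roots = [
        ("<lib>", PurePosixPath(lib_dir)),  # the lib/pils/ folder
        ("<doc>", PurePosixPath(doc_dir)),  # the user's document folder
        ("<.>", importer),                  # folder of the importing library
        ("<..>", importer.parent),          # its parent folder
    ]
    for prefix, root in roots:
        try:
            rest = target.relative_to(root)
        except ValueError:
            continue  # `path` is not inside this folder
        return f"{prefix}/{rest}"
    return str(target)  # no prefix applies: keep the name as it is
```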

7.4.1  Searching across libraries

When you start using multiple libraries, you will occasionally need to search through all the libraries involved in your project to find something whose location you have forgotten. To search across libraries, press

            Ctrl-D    

This will open a Detective pane, which combines a high and a low search stripe with a module tree. Besides the usual Whole words only and  Case sensitive checkboxes, the Detective supports a Structural mode which will parse your search term as a PILS expression and use it as part of a PILS rule, allowing you to search for particular constructs without knowing the detailed contents.

The module tree will update immediately as you type your terms, marking all modules with hits. Their parent modules will also be marked, down to the root, to help you find the hits.

When you select a module with hits, a hit list will be shown, with line numbers of the hits. As you select the hits, they will be selected automatically in the editing windows of their respective libraries.

7.4.2  How libraries are stored

Libraries have the file type  .pils  and are utf-8 text files with CR+LF (DOS/Windows style) line breaks. If needed, they can be edited with MS Windows Notepad or similar. The program strap manager ignores the extra bytes (BOM) prepended to utf-8 files by Windows applications.

A PILS program consists of an optional program-header and a  sequence of module entries stored in a utf-8 encoded text file, separated by markers:

             library-header><:module-header><;module-body><:module-header><;module-body ...

with any other occurrences of  ><  encoded as  ><.  (adding a dot) to distinguish them from markers. Their order has no significance; they are sorted by binary string comparisons of the headers, which is fast and convenient when working with non-PILS-aware text editors or versioning systems.
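
Because every literal  ><  inside headers and bodies carries an extra dot, a reader can split the file on the markers first and remove the dots afterwards. The Python sketch below shows this idea. It is not the program strap manager's code: the function names are hypothetical, and headers are returned as raw text without being parsed as PILS constants.

```python
def escape(text):
    """Encode literal '><' so that it cannot be mistaken for a marker."""
    return text.replace("><", "><.")


def unescape(text):
    """Undo escape(): drop the dot that follows each '><'."""
    return text.replace("><.", "><")


def parse_library(text):
    """Split library text into (library header, [(module header, module body), ...])."""
    text = text.lstrip("\ufeff")  # ignore a leading BOM
    head, *entries = text.split("><:")
    modules = []
    for entry in entries:
        header, _, body = entry.partition("><;")
        modules.append((unescape(header), unescape(body)))
    return unescape(head), modules
```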

Module entries consist of a module header and a module body. A module header is a serialised constant node  [module: .  ...]  with a mandatory principal leg holding the name, and some optional legs such as  language  and  timestamp . The module name is a list of 1 or more PILS names.

The program-header, if present, holds information common to all modules in the program. Its  language  leg, if present, determines what language is used for the module headers.

When a library is read in, the modules are stored in a node, using the module names as leg names; the leg values are module headers with the module text in the principal leg. The library header is held in the tail leg. This allows modules to be found by fast binary search.

7.5  Testing

7.5.1  Local test rules

Rules of the form

            { .name | :ok test-expression }

are available through a module's Test menu when the module is in a saved state. When you activate the menu, the rule is called and the result – if any – is displayed in a  Result  window. If you do not want to see the result, simply omit the responder.

            { .name | test-expression }

To make life easier, you can name your test rule like this:

            { .test | :ok test-expression }

The quicktest shortcut

            Ctrl-T

will save your module if not already saved, and then call a  .test   rule if it exists.

7.5.2  Test modules

Test modules serve to define test rules that can be used from all modules in a branch.

Say you work with a module named  [a b c] . When you save the module or press  Ctrl-T, these module names will be searched for test functions in listed order:

            [a b c]
            [a b c test]
            [a b test]
            [a test]
            [test,]

For each of these module names, the whole waxball is searched. Every test function is bound to the first module in which it is found.

If you define your test functions in  [test,] , you can use them from everywhere. However, you should consider limiting their scope, or your test menus may become cluttered.
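
The search order shown above for  [a b c]  can be generated mechanically, as in this short Python sketch. The function name is hypothetical and module names are modelled as tuples of strings.

```python
def search_order_for_tests(module):
    """List the module names searched for test functions, in order."""
    module = tuple(module)
    names = [module]  # the module itself comes first
    for depth in range(len(module), -1, -1):
        # then a "test" module at each level, from the module down to the root
        names.append(module[:depth] + ("test",))
    return names
```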

7.5.3  Test projects

When testing a library that is to be used with several programs, you may want to include a program for testing, without including it in other programs that use the library. Such a program is called a test project.

Open the library's  Use libraries  panel, add the tester, and mark it with  Ctrl-T . The tester will now be included only when the library is started directly, not when it is loaded by another library.

8  GUI programming with PILS

GUI programming is done via a PILS library  <pils>/english/system/juce/juce  built on PILS bindings to an underlying system. At present, only  Juce  is supported; see http://www.rawmaterialsoftware.com . The PILS bindings were generated by a PILS program – plumming-generator.pils – and cover most of the Juce library.

The relevant PILS library is automatically included in all PILS waxballs and is accessed by  platform  which is an alias for  @ [platform juce entry] , with submodules of this module serving as entrances to the library.

8.1  Windows and panes

              platform window [test] textpane text "Once upon a time, " selection (-1, -1) focus

The expression above creates a window titled  Test , fills it with a text pane with the text "Once upon a time, " and positions the caret at the end so you can continue the story.

User interfaces are generally built by creating a window and populating it with panes, which are created by methods of the parent window or pane. Panes are created from modules  [platform juce parent xxx] , these are indirectly exposed as methods of windows and panes for which child panes make sense.

The set of panes supported by PILS and their interfaces is still subject to change. In the long term, most of them are likely to be replaced by panes implemented in PILS, using PILS graphs for rendering. Therefore, no extensive documentation is available. To find out how the panes work and what methods they support, please consult the relevant modules in the bindings library.

Note that the Juce library allows controls to be created independent of their parents, and reparented – whereas PILS panes should be treated as belonging to their parents and not re-parented. This has to do with ensuring their proper deletion: the Juce framework was not constructed for binding to dynamic languages and does not have the notifications required for clean maintaining of wrappers. Therefore, a hack was applied: when panes are orphaned, PILS deletes them.

8.2  Events and extenders

To respond to user actions, extend the GUI objects with PILS rules using the  when  operation

            ;window platform window [test];
            window textpane
            when {.changed|:who :if who text <$* "ever after."; window close}
            text "Once upon a time, "
            selection (-1, -1) focus

This version automatically closes the window when the text ends with "ever after." . Note that the textpane extender is adjoined immediately after the pane is created and refers to the pane through the  :who  binder. As a guard against circular references, PILS will not allow you to extend a multiply referenced object, so this won't work:

            ;textpane window textpane;
            textpane when {.changed|:if textpane text <$* "ever after."; window close}

In this case, the  when  operation will fail and refuse to extend the object, as this would result in a circular reference.

9  PILS graphs

PILS has a built-in lightweight vector graphic format, currently based on the graphic layer of the Juce framework.

— This HTML is generated by PILS from The PILS language.odt