SOUL
Stack Oriented Useful Language

Ian James
© December 2008,
January–April 2023

This programming language was originally designed to be integrated with a new operating system called NOUS, where a major subset of the functionality would be given to the user, which was the U in the name. It was abandoned for many years, and then during a recent period of study of the various languages currently available—with a view to adopting something more productive than the Java I’ve been using—I decided my original idea was very much worth resurrecting.

Although still a hobbyist rather than enterprise-production language at this stage, there are a few noteworthy and possibly competitive features of SOUL:

  • stack-based, so most operations are predictable & relatively easy to analyze,
  • based on PostScript, certainly the most readable of the Forth family languages,
  • more readable and intuitive than PostScript, though currently lacking its graphics capabilities,
  • concatenative paradigm makes many complex methods of the functional paradigm trivial,
  • currently interpreted via C, but once the compiler is rewritten, will run faster than C in many cases.

Yes, that’s right. I have been chafing at the slowness of Java after programming in C many years before. And now, after having followed the rigmarole and ceremony imposed by many popular languages, I felt it was time to use a language more aligned to my personal preferences – fast, concise, elegant, small. After reviewing my original notes and design for SOUL, I felt it was potentially much more suitable, more useful.


Basic Premise

Being stack-based typically means values (which may be numbers or names) are pushed onto a stack, available for operators to come along and use whatever values they require from the same stack, and possibly leave a result there. Values come off the stack (are popped) in the reverse order to how they went on. Adding two numbers with the addition operator,

2 3 add

we think of 2 and then 3 sitting on top of the stack (which starts at the left) and the operation happens, leaving 5 in their place – ready to be used by the next operator. This shows the essence of a simple, concatenative approach that can be used to build much more involved operations. In a sense, such expressions can be read like a phrase from a verb-last natural language like Japanese or Korean, and likewise this is a natural alternative to the more common infix expressions like

2 plus 3

There is also no need for parentheses or operator precedence rules, since the order of operations is never ambiguous:

(2 plus 3) times 4 compared with 2 3 add 4 mul
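
As a rough illustration of why postfix needs no parentheses or precedence rules, here is a minimal Python sketch of a stack evaluator. It is not the SOUL interpreter; the function name `evaluate` and the two-operator table are assumptions made for this example only.

```python
def evaluate(source):
    """Evaluate a whitespace-separated postfix expression of integers, add and mul."""
    ops = {
        "add": lambda a, b: a + b,
        "mul": lambda a, b: a * b,
    }
    stack = []
    for token in source.split():
        if token in ops:
            b = stack.pop()  # the top of the stack is the second operand
            a = stack.pop()
            stack.append(ops[token](a, b))
        else:
            stack.append(int(token))  # anything else is treated as an integer
    return stack


print(evaluate("2 3 add 4 mul"))  # [20]
```

Each token is handled as soon as it is read, so the order of operations is simply the order of the text.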

The other main feature is lists of values and operators, which can become active procedures that use the stack in the same way as single native operators. This system of quotations is also found in Lisp.

[ add mul ]

If this is activated and applied to a stack holding 4, 2 and 3, it will leave 20 in their place. (This example also shows how function composition can be a simple affair.)
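
The behaviour of a quotation can be sketched in Python by treating it as a nested list that stays passive on the stack until `exec` is reached. This is only a model under that assumption; the names `OPS` and `run` are hypothetical and not part of SOUL.

```python
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def run(program, stack):
    """Run a list of items against the stack; nested lists are pushed unevaluated."""
    for item in program:
        if isinstance(item, list):
            stack.append(item)        # a quotation stays passive
        elif item == "exec":
            run(stack.pop(), stack)   # activate the quotation on top of the stack
        elif item in OPS:
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[item](a, b))
        else:
            stack.append(item)        # a plain number
    return stack


# Models the stack 4 2 3 followed by [ add mul ] exec
print(run([4, 2, 3, ["add", "mul"], "exec"], []))  # [20]
```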

SOUL does type checking at compile-time, but procedures that we create do not require annotations. The example above has a stack-effect described as rrr r, showing it requires 3 rational numbers and leaves one rational number – this is automatically calculated and may be reported during compilation and debugging.
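
One plausible way such a stack effect could be calculated is sketched below in Python. It only counts values and ignores their types (the r in rrr r), and it is not taken from the SOUL compiler; the table `EFFECTS` and the function `stack_effect` are assumptions for illustration.

```python
# Assumed arities: (values taken, values left) for each operator.
EFFECTS = {"add": (2, 1), "mul": (2, 1), "neg": (1, 1), "dup": (1, 2)}


def stack_effect(words):
    """Return (needed, left): values required from the caller and values remaining."""
    needed = 0  # values the procedure must find already on the stack
    depth = 0   # values currently available that the procedure itself produced
    for word in words:
        takes, leaves = EFFECTS[word]
        if takes > depth:
            needed += takes - depth  # borrow the shortfall from the caller
            depth = 0
        else:
            depth -= takes
        depth += leaves
    return needed, depth


print(stack_effect(["add", "mul"]))  # (3, 1)
```

The result (3, 1) matches the three inputs and one output described for [ add mul ].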

The format of SOUL code is free-form and line-breaks are not significant, although the inclusion of transparent commas and certain transparent words (semantic fillers) can help to make the flow more natural.

Stack values

The following types of values can be placed passively on the stack, until an appropriate operator or procedure is given to make use of them.

4,506 % integer (usually 32-bit) with optional transparent comma
0.456 % decimal fraction, float (usually 32-bit)
4.56e-1 % decimal fraction, exponent form
0x119A % hexadecimal integer
0b0001,0001,1001,1010 % binary integer with transparent commas
0.5*1.825 % duple (here of floats)
255*108*255 % triple (here of integers)
True False % the two booleans
Pi Rand % the float value of π, and a random integer
(Literal string between parentheses with {evaluatedNamedVariable} in curly brackets, running over several lines with a copyright sign using embedded hex 'A9 and a bar, | which marks an explicit newline.)
% if evaluatedNamedVariable holds 42, the string above evaluates to:
% Literal string between parentheses with 42 in curly brackets, running over several lines with a copyright sign using embedded hex © and a bar,
% which marks an explicit newline.
^m % the literal ascii letter m with byte value 0x6D (8-bit integer)
varname % name with active value (not an operator/procedure)
/varname % inactive name; may be (re)defined, or explicitly activated
[ ... ] % quotation, list of values and operators inactive until applied

Simple flow control

mul, [proc] exec, /proc exec % an active operator & 2 ways to make a passive procedure active
test [proc] ifso % applies (executes) proc if boolean value test is True
[test | proc] ifso % alternative form, a “test-apply” where test is a procedure
test /proc ifso % another possible form, using a (passive) procedure name
test [ proc1
     | proc2 ] ifnot
% executes proc1 if boolean test is True, otherwise proc2
9 [proc] repeat % does proc nine times

Operators that make loops, like repeat and the array operators such as forall (see below), have an implicit indexer or iota called It, which increments at each iteration (0 1 2 3 ...) and is available for use by the procedure being looped over.
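
The implicit indexer can be pictured with a small Python sketch in which the loop passes its counter to the procedure on every pass. The function `repeat` and its signature are assumptions for illustration, not SOUL's implementation.

```python
def repeat(count, proc, stack):
    """Apply proc count times, handing it the loop index (SOUL's It) on each pass."""
    for it in range(count):  # it takes the values 0, 1, 2, ...
        proc(stack, it)
    return stack


# Models `4 [It 1 add] repeat`: each pass pushes It + 1 onto the stack.
print(repeat(4, lambda stack, it: stack.append(it + 1), []))  # [1, 2, 3, 4]
```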

Conversions

It is sometimes useful to change how a value is used, by changing its type or representation.

cvb, cvi, cvf, cvt % convert value on stack to be byte, integer, float, boolean
3 4 (mul) cvn % convert string to operator/procedure name; applying exec will give 12
2018915346 4bytes  //0x12 0x34 0x56 0x78 % convert 32-bit value on stack to little-endian sequence of bytes
16,383 2bytes  //0xFF 0x3F % convert 16-bit value on stack to little-endian sequence of bytes
0.5 pm2b  //0xFF 0x3F % convert & scale float in range ± 1.0 to i16, as bytes
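
The byte sequences in these comments can be checked with a short Python sketch. The helper names are hypothetical, and the scaling used for pm2b (multiply by 32767, then truncate) is an assumption that happens to reproduce the documented result, not a statement of how SOUL does it.

```python
def to_le_bytes(value, width):
    """Split an unsigned integer into `width` bytes, least significant first."""
    return list(value.to_bytes(width, "little"))


def pm2b(x):
    """Scale a float in the range -1.0..1.0 to a signed 16-bit value, as bytes.

    Assumption: the scale factor is 32767 and the result is truncated.
    """
    return list(int(x * 32767).to_bytes(2, "little", signed=True))


print([hex(b) for b in to_le_bytes(2018915346, 4)])  # ['0x12', '0x34', '0x56', '0x78']
print([hex(b) for b in to_le_bytes(16383, 2)])       # ['0xff', '0x3f']
print([hex(b) for b in pm2b(0.5)])                   # ['0xff', '0x3f']
```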

Simple assignment

Assignment with def comes from PostScript, but here is used for immutable values and quotes. The semicolon form is for variables and lists (which look like quotes but can be mutated), taking values directly off the stack.

3 /OT ; % let 3 be assigned to variable name OT
/MinutesRate 5 def % define name MinutesRate to be the constant 5
/calcHoursPay [60 mul MinutesRate mul] def % assign the quote/procedure to name calcHoursPay
OT calcHoursPay /Overtime ; % applying values & procedure puts 900 in Overtime

Array assignment

Arrays come in several kinds, with a separate allocator for each value type. The extra “dimension” argument allows for tuples. Indexing is zero-based.

/arr 10 1 iarray % allocate 10 single integer spaces to arr
5 arr 4 put % put value 5 into array at index 4
< 2 3 4 arr 1 put % gather values 2, 3 and 4 into array starting at index 1
arr [7 !] forall % set/replace each value in array with value 7
arr [It 1 add !] forall % set values using the loop’s implicit indexer
arr /arr2 ; % make full copy of arr, now having 1 2 3 4 5 6 7 8 9 10
< 1 2 3 4 /arr3 ; % create quick integer array off the stack
< 4 [It 1 add] repeat /arr4 ; % create same list by gathering procedure results
arr3 RO, arr4 RW % makes arr3 immutable & arr4 mutable, from here on

Other array operations

warray, barray, farray, tarray % allocators for arrays of i16, i8, f32 and booleans
arr3 length % puts length of arr3 (4, see above) on the stack
arr 1 get % puts element from index 1 of arr on the stack
arr [It 1 add !] map % like forall but leaves ungathered new iarray on stack
arr2 [5 gt] filter % test makes ungathered new iarray with 6 7 8 9 10
arr2 [5 gt] truth % makes ungathered new tarray with 0 0 0 0 0 1 1 1 1 1
arr2 [add] reduce % applies procedure to each pair of elements, giving 55
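
For readers more familiar with mainstream languages, the results quoted above can be reproduced with rough Python equivalents. This is only a comparison of outcomes; it says nothing about how SOUL stores arrays or leaves "ungathered" values on the stack.

```python
from functools import reduce

arr2 = [it + 1 for it in range(10)]        # like arr [It 1 add !] forall, giving 1..10
filtered = [v for v in arr2 if v > 5]      # like arr2 [5 gt] filter
truth = [int(v > 5) for v in arr2]         # like arr2 [5 gt] truth
total = reduce(lambda a, b: a + b, arr2)   # like arr2 [add] reduce

print(filtered)  # [6, 7, 8, 9, 10]
print(truth)     # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(total)     # 55
```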

Math & logic operations

add, sub, mul % binary ops, taking two values from stack & leaving one
neg % unary op, negating value on stack
4 3 eq % test for equality, here leaves boolean False on stack
4 3 ne % test for inequality, here leaves boolean True on stack
4 3 gt % test for greater-than, here leaves boolean True on stack
4 3 gt not % logical not, here leaves boolean False on stack
True False both % logical and, here leaves boolean False on stack
True False orr % logical or (in verb form), here leaves True on stack
0b1010 not % bitwise complement, here leaves 0b0101 on stack
0b1010 0b1100 both % bitwise and, here leaves 0b1000 on stack
0b1010 0b1100 orr % bitwise or, here leaves 0b1110 on stack
0.25 not % fuzzy-logical complement, here gives probability 0.75
0.25 0.40 both % fuzzy-logical and, here leaves 0.10 on stack
0.25 0.40 orr % fuzzy-logical or, here leaves 0.55 on stack
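
The three readings of not, both and orr can be modelled in Python by dispatching on the operand type. The fuzzy formulas used here (1 - a, a * b, and a + b - a * b) are inferred from the example results above rather than taken from SOUL's source, and the 4-bit width for the bitwise complement is likewise an assumption chosen to match 0b1010 giving 0b0101.

```python
def soul_not(a, bits=4):
    if isinstance(a, bool):              # check bool first: bool is a subclass of int
        return not a
    if isinstance(a, int):
        return ~a & ((1 << bits) - 1)    # complement within an assumed bit width
    return 1.0 - a                       # fuzzy complement


def both(a, b):
    if isinstance(a, bool):
        return a and b
    if isinstance(a, int):
        return a & b
    return a * b                         # fuzzy and, as a product


def orr(a, b):
    if isinstance(a, bool):
        return a or b
    if isinstance(a, int):
        return a | b
    return a + b - a * b                 # fuzzy or, as a probabilistic sum


print(both(True, False), orr(True, False))   # False True
print(bin(both(0b1010, 0b1100)))             # 0b1000
print(round(orr(0.25, 0.40), 2))             # 0.55
```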

Output

(hello) emit 4 3 add emit % prints to the console:
% hello
% 7
cr % prints newline to the console
(filename.txt) out /file01 ; % opens file for appending, assigns its handle to name file01
file01 (goodbye) write % appends farewell text (sequence of bytes) to file
file01 close % closes the file, releasing its resources

Stack operators

Several operators are available for manipulating the top 2, 3 or 4 items on the stack. They are shown here alongside the equivalent PostScript operators (one observation about PostScript is its somewhat distracting use of extra values to drive the operator). Ideally, we will access the stack to only a few elements’ depth while avoiding ad hoc variables. This should make the program run extremely fast, since the fixed-depth main operation stack is held in CPU registers (if it is full, we start using stack space in memory).


SOUL                  result                    PostScript
a b exch              b a                       a b exch
a b pop               a                         a b pop
a dup                 a a                       a dup
a b dup2              a b a b                   a b 2 copy
a b c dup3            a b c a b c               a b c 3 copy
a b c lift            b c a                     a b c 3 -1 roll
a b c tuck            c a b                     a b c 3 1 roll
a b c turn            c b a                     a b c exch 3 -1 roll
a b c d rot1          b c d a                   a b c d 4 -1 roll
a b c d rot2          c d a b                   a b c d 4 -2 roll
a b c d rot3          d a b c                   a b c d 4 1 roll
a b 2nd               a b a                     a b 1 index
a b c 3rd             a b c a                   a b c 2 index
a 2more               a a a                     a dup dup
a 3more               a a a a                   a dup dup dup
a b !                 b                         a b exch pop
a b c d two!          c d                       a b c d 4 2 roll pop pop
a b c d e f three!    d e f                     a b c d e f 6 3 roll pop pop pop
a b c d pop3          a                         a b c d pop pop pop
clear                 empty stack               clear
pstack                print the current stack   pstack
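
The correspondence between the SOUL operators and PostScript's roll can be checked with a small Python model of `n j roll`, where the last list item is the top of the stack. The function `roll` is written for this example and only mimics PostScript's documented behaviour; the comments naming SOUL operators refer to the results listed in the table.

```python
def roll(stack, n, j):
    """Model PostScript `n j roll`: rotate the top n items by j places (positive = toward the top)."""
    if n == 0:
        return list(stack)
    j %= n
    if j == 0:
        return list(stack)
    top = stack[-n:]
    return stack[:-n] + top[-j:] + top[:-j]


print(roll(list("abc"), 3, -1))            # lift:  ['b', 'c', 'a']
print(roll(list("abc"), 3, 1))             # tuck:  ['c', 'a', 'b']
print(roll(list("abcd"), 4, -2))           # rot2:  ['c', 'd', 'a', 'b']
print(roll(list("abcdef"), 6, 3)[:-3])     # three! (roll, then pop pop pop): ['d', 'e', 'f']
```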

Comments

There are a few styles of commenting, some more quickly typed, visually apt, familiar or useful than others.

code here...   % ignoring to end of line (same as PostScript, TeX)
code here...  // ignoring to end of line (same as C++, Java etc)
code here...  /' ignoring to end of line, variation
code here...  -- ignoring to end of line (same as Ada, Haskell); good for horizontal rules
code here...  ## ignoring to end of line; good for visual highlighting
code here...  NB ignoring to end of line (similar to J); “note well”
code here... BTW ignoring to end of line; a gentle reminder
''
ignored multi-line passage...
to switch off this exclusion, simply precede opener’s quote-quote with % or /,
since the closer’s commas are normally transparent:

,,

Very impressed that you’ve made it this far. Not sure how much more I can achieve without significant investment in time. In particular, features which require deeper study and extended development include
  • 64-bit assembler for the compiler
  • unique memory management system
  • basic libraries
  • easy-going module system
  • FFI
  • threads & concurrency
My original compiler was only a few hundred lines of 80486 assembler code, and pretty much anything needed for a modern language will be an order of magnitude beyond what I have ever attempted. If you’d like to play with the language as it exists thus far, here is the interpreter. Let me know how you go.

⇒⇒⇒  SOUL zip file (< 2 MB)

This page © Ian James.
ianrjames@hotmail.com