Tacit Programming
Contents
12. Tacit Programming#
Tacit programming is a programming paradigm that APL supports. In order to understand what tacit programming is, we need to know what the word “tacit” means, in English:
“Tacit, adjective: understood or implied without being stated.”
In tacit programming, the thing that is implied without being stated is what arguments the functions receive. In other words, in tacit programming we create functions by combining other functions without specifying where the arguments go. This sounds much more confusing than what it really is, so let us study some examples.
12.1. Combining Functions with Operators#
12.1.1. Derived Functions are Tacit#
The simplest example of tacit programming arises from the use of the primitive operators. Recall these two helper functions from a previous exercise:
Trim ← {3↑⍵}
IsLong ← {3<≢⍵}
These are two regular dfns.
Now, we can use them to create a function TrimLong
that will trim all the elements of the argument vector that are too long:
TrimLong ← {(Trim¨)@(IsLong¨) ⍵}
TrimLong ⎕← ⍳¨⍳5
As it stands, the dfn TrimLong
has nothing special about.
However, we can get rid of the braces and the omega ⍵
because those things are redundant:
TrimLong ← (Trim¨)@(IsLong¨)
TrimLong ⍳¨⍳5
This alternative implementation works because the operator at needs two operands to derive a new function, and by providing the two operands, we assign the derived function directly to the variable TrimLong
.
We do not need to wrap the derived function in a dfn.
In fact, when we learned about at in Section 11.7, we learned exactly how the derived function will handle its right argument:
the right argument will be passed directly to the right operand of at, which is
IsLong¨
; thenthe elements of the right argument for which
IsLong
evaluates to 1 are collected as passed to the left operandTrim¨
.
Looking closely at the tacit definition of TrimLong
we see that we actually have two levels of tacit programming.
Notice how the right operand of the operator at is IsLong¨
.
Why is it IsLong¨
and not just IsLong
?
The dfn IsLong
takes a vector and determines if it has more than three elements, but we already know that the right operand of at will receive a nested vector.
In our example above, that was ⍳¨⍳5
:
⍳¨⍳5
Thus, if we want to determine which nested elements of the argument vector are too long, we need to use the operator each to modify IsLong
.
The use of each modifies how IsLong
works and that modification is done tacitly because of the definition of each: we do not need to write anything to explain that the function IsLong
is going to be applied to each scalar of the argument to IsLong¨
.
Let us try another tacit definition:
MaxWindow ← {⌈/,⍺↓⍵}⌺3 3
The (tacit!) function MaxWindow
takes a matrix argument and computes the maximum value in every 3 by 3 window:
⎕← mat ← 4 6⍴(⍳3),⍳2
MaxWindow mat
Suppose that we want to modify this function so that we can apply it to higherrank arrays.
Our goal is that the function MaxWindow
gets applied to each 2cell (each submatrix), so we can do that with the operator rank:
HighRankMaxWindow ← MaxWindow⍤2
⎕← cuboid ← 2 5 7⍴(⍳3),(1+⍳3)
HighRankMaxWindow cuboid
We could have defined HighRankMaxWindow
directly:
HighRankMaxWindow ← ({⌈/,⍺↓⍵}⌺3 3)⍤2
12.1.2. Operator Binding Order#
We have seen two tacit functions that make use of multiple operators to derive successive functions:
TrimLong ← (Trim¨)@(IsLong¨)
HighRankMaxWindow ← ({⌈/,⍺↓⍵}⌺3 3)⍤2
However, both functions have superfluous parenthesis, because we have not been considering the order in which operators bind in Dyalog APL.
When we have an expression, we do not need to parenthesise from the right. For example,
1 + (⍳5)
is just
1 + ⍳5
When using multiple operators together, we do not need to parenthesise from the left.
For example, the function TrimLong
was defined as
TrimLong ← (Trim¨)@(IsLong¨)
but it could have been defined as
TrimLong ← Trim¨@(IsLong¨)
TrimLong ⍳¨⍳5
The leftmost set of parentheses was not necessary because operators bind from the left.
Thus, the expression Trim¨@IsLong¨
would have been equivalent to ((Trim¨)@IsLong)¨
.
This shows that the leftmost set of parentheses is not necessary, whereas the rightmost set is necessary, otherwise the rightmost operator each binds to the derived function Trim¨@IsLong
instead of the dfn IsLong
.
Similarly, the definition of HighRankMaxWindow
has an extra set of parentheses.
Instead of ({⌈/,⍺↓⍵}⌺3 3)⍤2
, we can write
HighRankMaxWindow ← {⌈/,⍺↓⍵}⌺3 3⍤2
HighRankMaxWindow cuboid
Now, we will learn about a tool that will help us study tacit functions and, in particular, understand what parentheses are needed and which ones are not.
12.2. Inspecting Tacit Functions#
The user command ]box
that you first learned in Section 3.6.3 can also be used to customise how tacit functions are displayed.
This customisation is done through the switch trains
.
Take a look at the help message below and read the different options for the switch trains
:
]box ?
This switch is called “trains” because trains are the more general form of tacit programming in Dyalog. We will learn about trains in Section 12.6.
Let us go through the multiple options available in the subsections that follow.
12.2.1. Box#
The option box
draws boxes that indicate the binding order of operators and operands.
Thus, inner boxes indicate that their contents bind first, and the contents of outer boxes bind later, possibly with content from inner boxes.
A couple of examples will follow.
]box trains=box
First, we will see how the version of TrimLong
that does not have superfluous parentheses is represented:
Trim¨@(IsLong¨)
As we can see, with the box diagram, we see that the operands of the operator at are the two boxes on its side:
on the left, the box contains two other boxes, the dfn
{3↑⍵}
and the operator¨
, so we get{3↑⍵}¨
as the left operand; andon the right, the box contains two other boxes, the dfn
{3<≢⍵}
and the operator¨
, so we get{3<≢⍵}¨
as the right operand.
If we drop the right set of parentheses, the box diagram changes to reflect the fact that the rightmost each has as left operand everything else:
Trim¨@IsLong¨
If you look closely, the box diagrams look like nested vectors. A 2element vector represents the left operand of an operator and its operator, and a 3element vector represents the left operand, a dyadic operator, and the right operand.
Working from the innermost box, the operator each binds with {3↑⍵}
to create the first derived function F1
:
⎕← F1 ← '{3↑⍵}' '¨'
Then, the operator at binds with F1
on the left and with {3<≢⍵}
on the right to create the second derived function F2
:
⎕← F2 ← F1 '@' '{3<≢⍵}'
Finally, the last each binds with F2
on the left to create the third and final derived function F3
:
⎕← F3 ← F2 '¨'
Similarly, we can see that HighRankMaxWindow
does not need any parentheses to be interpreted as we needed:
{⌈/,⍺↓⍵}⌺3 3⍤2
The operator stencil got bound with its operands first, and that derived function was the left operand to rank.
12.2.2. Tree#
The option tree
draws the tacit function in a tree structure, with the top/root of the tree being the operator that binds last.
A monadic operator gets a branch to its left operand and a dyadic operator gets two branches, one for each operand.
]box trains=tree
If we inspect the tacit definition of HighRankMaxWindow
first, it should show the operator rank at the top with a subtree on the left to represent the left operand {⌈/,⍺↓⍵}⌺3 3
and a branch on the right pointing to the right operand 2
:
HighRankMaxWindow
This tree structure shows that the function HighRankMaxWindow
is a function derived from the operator rank.
Then, to interpret the left operand, we have to inspect the subtree on the left:
⌺
┌───┴────┐
{⌈/,⍺↓⍵} (2⍴3)
The left subtree shows that the left operand is a function derived from the operator stencil with a left operand dfn and a right operand vector.
We can also inspect the tree structure of the expression for TrimLong
without any parentheses:
{3↑⍵}¨@{3<≢⍵}¨
The fact that the root of the tree is the operator each shows that we needed a set of parentheses somewhere. The tree should have the operator at at the root with another derived function on each branch:
{3↑⍵}¨@({3<≢⍵}¨)
12.2.3. Parens#
The option trains=parens
will always add as many parentheses as possible, even if superfluous, to make explicit the binding of the operators and their operators:
]box trains=parens
{3↑⍵}¨@{3<≢⍵}¨
{⌈/,⍺↓⍵}⌺3 3⍤2
Note that if any of the elements to be displayed take up multiple lines, then the function will be displayed as if ]box
were OFF.
This display format may look unusual, so we show two functions in that format so you get acquainted with it:
]box OFF
{3↑⍵}¨@{3<≢⍵}¨
{⌈/,⍺↓⍵}⌺3 3⍤2
]box ON
12.2.4. Def#
The option trains=def
will show the simplest expression that still defines the same function.
]box trains=def
For example, if we display our original implementation of TrimLong
that contained superfluous parenthesis, this option will remove them:
(Trim¨)@(IsLong¨)
Whenever you are writing a derived function and are not sure if the operands will bind like you need them to, use these tools to inspect the derived function and understand what you need to do to make sure you define your function correctly.
Let us reset ]box
to using the option trains=tree
:
]box trains=tree
Deriving functions from operators is the simplest form of tacit programming. In the sections that follow, we will learn about function composition and trains which provide other mechanisms for tacit programming.
12.3. Function Composition#
Function composition refers to the act of defining a new function at the expense of smaller functions that get applied in the pattern specified by the combining operators.
Trains, explained in detail in Section 12.6, can also be thought of as a form of function composition, but this section will focus on the three operators that Dyalog provides for function composition.
The three operators introduced here, beside, atop, and over, are dyadic operators that take their operands and produce a single, composite operation. One can regard these operators as easy ways of specifying inline “minifunctions”. As such, these operators do not add functionality to the language that could not be obtained by other means; they just represent a very convenient notation to express some common patterns.
We start by introducing beside.
12.3.1. Beside#
Beside is a dyadic operator represented by jot ∘
, which you can type with APL+j, as you already know.
In this section, we will cover the forms of beside where the operands are functions.
The section on argument binding (here) will cover the forms in which beside has an array operand.
Beside takes the two operand functions and applies them to the argument(s) with a pattern that depends on the valence of the derived function:
F∘G ⍵ ←→ F G ⍵
, bothF
andG
are used as monadic functions.⍺ F∘G ⍵ ←→ ⍺ F G ⍵
,F
is used dyadically andG
is used monadically.
The patterns shown above can also be represented in diagrams like the ones in Fig. 12.1.
The composition with beside, F∘G
, can be interpreted as “preprocess the right argument of F
with G
”.
The operator beside is rarely used alone.
After all, the expressions above show that instead of writing F∘G ⍵
one could just write F G ⍵
or, instead of writing ⍺ F∘G ⍵
, one could just write ⍺ F G ⍵
.
Beside is often used together with the operator each, as this may give important advantages for execution time and memory consumption.
Another example usage of beside is to create a derived function to be used together with the operator reduce.
We will show examples of these usages below.
12.3.1.1. Monadic derived function#
Quite often you would like to apply two monadic functions to each item of an array. This is very easy to do with the help of the powerful operator each.
Let us look at the simple example in which we just want to find the rank of each item of the variable weird
:
⎕← weird ← 2 2⍴456 (2 2⍴ 'Dyalog' 44 27 (2 2⍴8 6 2 4)) (17 51) 'Twisted'
The rank of array
is ≢⍴array
, so the rank of each item of weird
is:
≢¨⍴¨weird
In the expression above, the ⍴¨
creates a (potentially big) array containing the shape of each item of weird
.
Then, the ≢¨
gets the length of each vector of the intermediate result.
Remember: the rank of an array is the length of the shape of the array.
This is inefficient for two reasons:
Firstly, APL must allocate memory to hold the intermediate array, which will be discarded as soon as the entire expression has been evaluated.
We can see this intermediate result if we insert a ⎕←
between ≢¨
and ⍴¨
:
≢¨ ⎕← ⍴¨ weird
Secondly, internally APL must loop through a potentially large number of items twice.
With the help of beside, we can eliminate both problems: APL only needs to traverse the array once, applying both functions to each item in succession. During the processing of each item, only a very small intermediate array will be created holding the shape of each item, and it will be discarded before processing the next item:
≢∘⍴¨ weird
The expression ≢∘⍴¨
is another example of tacit programming.
From the example above, we know that ≢∘⍴¨
computes the rank of each item in a nested array, if applied monadically.
However, that expression contains two functions and two operators, and in no way we specify explicitly what arguments go where.
Hence, ≢∘⍴¨
is an example of tacit programming.
As another example usage of beside, consider the expression below:
+/∘⍳¨ 2 4 7
This expression adds up the items of ⍳2
, those of ⍳4
, and finally those of ⍳7
.
Using beside to compose the plus reduction and the index generator functions uses up less space than using the operator each twice because the intermediate result would be a nested vector with all the vectors created by the index generator:
+/¨ ⎕← ⍳¨ 2 4 7
If the initial argument contained more integers and they were all large integers, the intermediate result would be unnecessarily long.
In the third example that follows, both operands to beside are userdefined functions we have seen before:
Sqrt ← {⍵*0.5}
Average ← {(+/⍵)÷≢⍵}
Sqrt∘Average¨ (11 7)(8 11)(21 51)(16 9)
12.3.1.2. Dyadic derived function#
Here is an example of composition of times and index generator with the operator beside:
1 10 100 ×∘⍳¨ 2 4 3
The expression above multiplies 1
with ⍳2
, then it multiplies 10
with ⍳4
, and finally, it multiplies 100
with ⍳3
.
Another example involves the approximation of the golden mean, which can be calculated by this infinite series:
As you can see, we have inserted +÷
between the items of a series of ones.
This operation is a reduction by +÷
, but the operator reduce only accepts a single function on its left.
To overcome this, we can use the operator beside to compose the two functions +
and ÷
together, thereby creating a single, derived function that may be used together with reduce:
+∘÷/ 1 1 1 1 1 1
+∘÷/ 50⍴1
12.3.2. Atop#
Atop is a dyadic operator represented by jot diaeresis ⍤
, which you can type with APL+Shift+J, which is the same glyph as the one used for the operator rank.
The difference between the operator rank and the operator atop lies in the right operand:
for the operator rank, the right operand is an array; and
for the operator atop, the right operand is a function.
Atop takes the two operand functions and applies them to the argument(s) with a pattern that depends on the valence of the derived function:
F⍤G ⍵ ←→ F G ⍵
, bothF
andG
are used as monadic functions and this is exactly the same asF∘G
.⍺ F⍤G ⍵ ←→ F ⍺ G ⍵
,F
is used monadically andG
is used dyadically.
The patterns shown above can also be represented in diagrams like the ones in Fig. 12.2.
The left operand is applied atop the right operand.
In other words, the left operand function is applied after the right operand is applied to the available arguments.
Yet another way of describing the operator atop is by saying that F⍤G
postprocesses the result of G
with F
.
We will show some example usages below.
Much like beside, seen before, and over, which will be shown next, the operator atop is typically used in conjunction with other operators, for example each or reduce.
12.3.2.1. Example usages of atop#
Suppose that you need to determine whether a number comes before or after another number.
This type of comparison can be made with one of the many comparison primitives: <
, ≤
, ≥
, and >
.
However, the comparison primitives return Boolean results, which are either 0 or 1.
If you wanted a more finegrained comparison that distinguishes whether the left argument is before the right argument, after the right argument, or is the same as the right argument, you could use a derived function with atop:
5 ×⍤ 2 5 8
The result of ⍺ ×⍤ ⍵
is one of three values:
1
if⍺
comes after⍵
;0
if⍺
is the same as⍵
; and¯1
if⍺
comes before⍵
.
The dyadic derived function ×⍤
can be interpreted as “the sign of the difference”, which can be seen as postprocessing the difference of the two arguments with the function sign.
It is true that the expression shown above could be rewritten without atop:
× 5  2 5 8
But that is not necessarily a better alternative to using the derived function ×⍤
and, in some cases, rewriting ×⍤
without the atop may not be an alternative.
12.3.2.2. Function atop tack#
There is a common usage pattern for the operator atop that involves using a tack.
Consider the monadic derived function ≢⍤⊢⌸
that uses the operator key and the operator atop, where the left operand to ⌸
is ≢⍤⊢
because operators bind from the left.
The monadic derived function ≢⍤⊢⌸
counts how many times each unique item appears in its argument:
≢⍤⊢⌸ 'MISSISSIPPI'
The result above means that one of the letters show up 1 time, two other letters show up 4 times each, and the fourth letter shows up 2 times. These counts are in the order of the unique elements, so we can easily find out the letters associated with the counts:
∪'MISSISSIPPI'
The point of using F⍤⊢⌸
is that the operator key passes two arguments to its left operand (to give more flexibility to the user) but we only care about one of those, so we use the appropriate tack to select the argument we want, and then apply F
to that argument.
A similar pattern is exhibited in this alternative implementation of nwise reduction as a dop:
_NWiseReduce ← {⍺⍺/⍤⊢⌺⍺⊢⍵}
2 +_NWiseReduce ⍳10
⍺⍺/⍤⊢⌺⍺
is a single derived function and we can inspect its structure:
{}/⍤⊢⌺0
In the expression above, we substituted the left operand ⍺⍺
of the dop with {}
because writing ⍺⍺
outside of a dop gives a SYNTAX ERROR
.
Similarly, we used 0
as the right operand of ⌺
because we cannot write ⍺
outside of a dfn/dop.
So, in inspecting the structure of ⍺⍺/⍤⊢⌺⍺
, we see that ⍺⍺/⍤⊢
is the left operand of the operator stencil.
Recall that the left operand of the operator stencil takes two arguments:
the left argument gives information about the padding of the current subarray; and
the right argument is the subarray being processed.
Because we do not need the information about the left argument, we use ⍺⍺/⍤⊢
to apply ⍺⍺/
directly to the right argument.
12.3.3. Over#
Over is a dyadic operator represented by circle diaeresis ⍥
, which you can type with APL+Shift+O (that is the letter “Oh”, and not the number zero).
Over takes two operand functions and applies them to the argument(s) with a pattern that depends on the valence of the derived function:
F⍥G ⍵ ←→ F G ⍵
, bothF
andG
are used as monadic functions and this is exactly the same asF∘G
orF⍤G
.⍺ F⍥G ⍵ ←→ (G ⍺) F (G ⍵)
,F
is used dyadically andG
is used monadically.
The patterns shown above can also be represented in diagrams like the ones in Fig. 12.3.
The usage F⍥G
of the operator over can be interpreted as “apply F
after preprocessing all arguments with G
”.
We will show some example usages below.
Of the three compositional operators discussed so far, beside, atop, and over, the operator over is the one that is more commonly used alone.
We will show some example usages below.
How can we check if two words start with the same letter?
word1 ← 'banana'
word2 ← 'bat'
We can use first to get the first character of each word and see if they match:
(⊃word1) = ⊃word2
Alternatively, we can check for equality over the first character:
word1 =⍥⊃ word2
Suppose that we represent an interval of numbers with a 2item vector with the two endpoints.
For example, 0 3.34
would be the interval of all the numbers between 0
and 3.34
.
The centre of an interval can be computed as such:
Centre ← {(+/⍵)÷2}
Centre 0 3.34
The distance between two intervals can be defined as the distance between the centres, which we can compute with ⍥Centre
.
So, the distance between the intervals ¯1 1
and 0 3.34
is
0 3.34 ⍥Centre ¯1 0
If we swap the order of the two arguments, we see that the distance becomes negative:
¯1 0 ⍥Centre 0 3.34
This does not make much sense, so we might want to fix this by saying that we want the absolute value after the difference of the centres, which is done by using the operator atop to “postprocess” the result of the subtraction:
¯1 0 ⍤⍥Centre 0 3.34
⍤⍥Centre
12.3.4. Comparison of the Three Operators#
The three compositional operators beside, atop, and over, all behave in the same way if the derived function is used monadically. The difference lies in the dyadic use of the derived function, as the table below shows.
Operator 
Monadic use 
Dyadic use 










These differences can also be summarised in Fig. 12.4:
12.3.5. To Compose or Not To Compose#
As stated in the beginning of this section, the three operators beside, atop, and over, do not add new primitive behaviour. In fact, for each of those operators, we have shown equivalent expressions that do not use the operators and that achieve the same effect.
However, there are advantages to using compositional operators to create derived functions. Here are some of those advantages:
the derived function can be assigned a name;
function composition with these operators works inside trains (which will be introduced soon); and
these operators clarify the meaning of your programs when used correctly. In other words, good usage of beside, atop, and over, can make it easier to read and understand a program.
With practice, you will develop a better understanding for when using these operators explicitly is a good choice. In part, it will also come down to personal taste: some prefer to use plenty of compositional operators and others prefer to use none, but optimal usage of these operators lies somewhere in between those two extremes.
12.4. Binding#
The glyph jot ∘
has yet another use as a dyadic operator.
If one of the operands, and only one, is an array, then ∘
stands for the operator bind.
The operator bind is used to set a fixed argument (the array operand) to a given function (the function operand). Depending on whether the array operand is on the left or on the right of the function operand, the operator bind sets the left or right argument of the function, respectively.
More explicitly, in array∘F
, the operator bind sets the left argument of the function F
to be array
, and in F∘array
, the operator bind sets the right argument of the function F
to be array
.
Next, we take a look at a few examples of the operator bind:
3∘↑¨ (⍳5) 'Houston' (21 53 78 55) (11 22)
This expression applies 3↑
to each of the items of the right argument.
So far, this is not a very good example, as the expression would work and give the same result even without the operator bind:
3 ↑¨ (⍳5) 'Houston' (21 53 78 55) (11 22)
However, binding the value 3 to take makes it possible to combine the function with yet another function, so that we can again obtain the advantage of not creating unnecessary intermediate values:
⌽∘(3∘↑)¨ (⍳5) 'Houston' (21 53 78 55) (11 22)
Moreover, if the array operand is not a scalar, it may be impossible to omit the operator bind.
In the example that follows, the operator bind must be present, otherwise we get a LENGTH ERROR
:
2 2∘⍴¨ (⍳5) 'Houston' (21 53 78 55) (11 22)
2 2 ⍴¨ (⍳5) 'Houston' (21 53 78 55) (11 22)
LENGTH ERROR
2 2⍴¨(⍳5)'Houston'(21 53 78 55)(11 22)
∧
When we use the operator bind, we create a derived function that is monadic, which means the derived function always takes a single right argument.
Thus, an expression like (F∘arr1) arr2
evaluates to arr2 F arr1
, because the operator bind created the derived function F∘arr1
where the right argument of F
is set to arr1
.
For example,
(*∘0.5) 16 81 169
Once bound to 0.5, the function power behaves like the function square root, which expects a right argument (the number(s) to compute the square root of).
In this form, the derived function must be parenthesised so that the operand 0.5 is separated from the right argument vector. Another alternative, that we saw in the chapter about operators, is to use a tack:
*∘0.5 ⊢ 16 81 169
12.5. Commute, Selfie, and Constant#
The three operators commute, selfie, and constant, are the three usages of the glyph tilde diaeresis ⍨
.
By now, you should be able to guess that the key combination to type ⍨
is APL+Shift+T.
After all, the function without ~
is APL+t.
12.5.1. Commute and Selfie#
As its name implies, commute is a monadic operator which commutes the arguments of its derived function.
For example,
4 ÷ 2
but if we use the operator commute, then
4 ÷⍨ 2
That is, x F⍨ y
is equivalent to y F x
.
When the derived function F⍨
is used monadically, as in F⍨ y
, then the same argument gets used on both sides of the function.
Thus, F⍨ y
is equivalent to y F y
.
For example, ⍴⍨ 3
is equivalent to 3⍴3
:
⍴⍨ 3
Based only on these simple examples, one might think that the operators commute and selfie are useless (typing ⍴⍨3
is no easier than typing 3⍴3
).
However, both may be used to reduce the number of parentheses needed in an expression.
For example, suppose we want to create a vector like 3⍴3
or 5⍴5
, using the last item of an arbitrary vector v
.
A direct approach would be to write ((≢v)⌷v)⍴(≢v)⌷v
or (⊃⌽v)⍴⊃⌽v
:
v ← 8 3 6 7 4
((≢v)⌷v)⍴(≢v)⌷v
(⊃⌽v)⍴⊃⌽v
The operator selfie allows a simpler expression:
⍴⍨⊃⌽v
It is not only for “cosmetic” reasons that it is desirable to avoid repeating an expression. It also means that the interpreter only has to evaluate the expression once, possibly saving some execution time. Furthermore, avoiding a verbatim repetition of a piece of code improves maintainability considerably. If the expression needs to be modified, it is simply too easy to forget to modify all instances of it, or to make mistakes in some of the modifications.
Some APL programmers still prefer to use an intermediate variable or an inline direct function to obtain the same benefits in terms of efficiency and maintainability:
last ← ⊃⌽v
last⍴last
It is mostly a matter of taste which of the possible solutions different programmers prefer. The case illustrates that the APL language typically allows the same task to be solved in many different ways.
The operator commute can also allow for simpler expressions.
For example, if we wanted to compress the even numbers of the vector v
, we would write something like:
(~2v)/v
With the operator commute, we can write
v/⍨~2v
The pattern array /⍨
can be read as “the items of array
that…”.
The operators commute and selfie can also be helpful in trains. This will be understandable when we study trains in Section 12.6.
12.5.2. Constant#
The operator constant is a monadic operator which takes an array operand array
.
The derived function is an ambivalent constant function that always returns array
.
Thus, array⍨
is equivalent to the dfn {array}
:
1 2 3⍨ v
1 2 3⍨ (⍳5) 'Houston' (21 53 78 55) (11 22)
'Cat' (1 2 3⍨) 'Dog'
12.6. Function Trains#
A function train, often referred to as just a train, is a function that is derived from a sequence of two or more functions. This sequence of functions must be isolated from its arguments. Notice the difference in results between this uninteresting APL expression:
10 ,+ 2
And this one, where the three functions are parenthesised:
10 (,+) 2
Throughout this section you will learn what (,+)
means as a function train and you will understand why the result of 10 (,+) 2
is the vector 8 12
.
A function train with two functions is called an atop (or a 2train) and a function train with three functions is called a fork (or a 3train). They are the two building blocks of trains with arbitrary length, and we will start by looking at them.
12.6.1. 2train Atop#
A train with two functions is called an atop, which is also the name of an operator introduced in Section 12.3.2. The 2train and the operator atop share their name because they function in the same way:
(F G) ⍵ ←→ F⍤G ⍵ ←→ F G ⍵
; and⍺ (F G) ⍵ ←→ ⍺ F⍤G ⍵ ←→ F ⍺ G ⍵
.
For example, ()
is the absolute value of the difference:
10 () 5
5 () 10
Much like with functions that are combined with operators, we can inspect trains by typing them in the session:
()
Trains can also be assigned to names. When we do so, we do not need to parenthesise the sequence of functions because the functions are already isolated from the arguments:
AbsDiff ← 
5 AbsDiff 10
12.6.2. 3train Fork#
A train with three functions is called a fork.
In the fork (F G H)
, the two outer functions F
and H
are applied first, and then the function G
in the middle is applied to the results of the two outer functions.
If we type a fork in the session, we see a diagram that hints at the fact that origin of the name “fork”, because the diagram looks like a fork with three tines.
If we use an “empty” dfn {}
as a placeholder for a function, we can see the forklike diagram:
{}{}{}
12.6.2.1. Monadic Fork#
In the monadic case, we have that (F G H) ⍵ ←→ (F ⍵) G (H ⍵)
.
For example, the fork (⌽≡⊢)
, when used monadically, checks if the argument vector is a palindrome.
(Recall that a palindrome is a sequence that reads the same when reversed.)
(⌽≡⊢) 'TACOCAT'
(⌽≡⊢) 1 2 3 4 3 2 1
(⌽≡⊢) 'MISSISSIPPI'
Notice that, in a fork that is used monadically, the two outer functions are used monadically but the middle function is always used dyadically.
12.6.2.2. Dyadic Fork#
In the dyadic case, we have that ⍺ (F G H) ⍵ ←→ (⍺ F ⍵) G (⍺ H ⍵)
.
So, if a fork is used dyadically, both arguments get passed to both outer functions, and then the results are given as arguments to the middle function.
The train (,+)
, from the beginning of this section, was used dyadically, so now we can understand it:
10 (,+) 2
Is the same as:
(10  2) , (10 + 2)
Another good example of a dyadic fork is (≠⊆⊢)
, which can be used to split a vector on a separator.
Below, you can see this fork being used to split a sentence into words:
' ' (≠⊆⊢) 'this is a sentence with some words'
This fork is equivalent to the following expression:
sentence ← 'this is a sentence with some words'
(' ' ≠ sentence) ⊆ (' ' ⊢ sentence)
Of course, the right tack is used to say that the right argument to ⊆
should be the right argument of the fork, unchanged.
In fact, the expression above can be simplified to
(' ' ≠ sentence) ⊆ sentence
12.6.2.3. Arrays in Forks#
The left tine of a fork does not have to be a function. In fact, it can be any array, which will then be used as the left argument to the function in the centre of the fork.
For example, the fork (1=×)
checks if a number is positive:
(1=×) 13.4
(1=×) 0
(1=×) ¯73.42
However, the right tine of a fork cannot be an array. The right tine of a fork must be a function. If it were an array, then APL would not interpret that as a fork, but as a normal APL expression that happens to be parenthesised.
For example, one might think that (⊢*0.5)
is a fork that implements the function square root.
However, if you use this “fork”, this is what happens:
(⊢*0.5) 16
APL sees the expression (⊢*0.5) 16
as a 2item vector, where the first item is ⊢*0.5
, which is just *0.5
:
*0.5
If we wanted to insist on writing the function square root as a train, we could fix this by using the operator constant, to turn the value 0.5 into a function that always returns 0.5:
(⊢*0.5⍨) 16
12.6.3. Longer Trains#
A function train does not have to be limited to two or three functions. Function trains can have an arbitrarily large size.
In a function train with four or more functions, APL starts combining functions into 3trains from the right.
For example, to parse the train (≢≠⊆⊢)
, APL starts by looking at ≠⊆⊢
as a 3train, which creates a derived function T ← ≠⊆⊢
.
Now, we can look at (≢≠⊆⊢)
as (≢T)
which is a 2train, an atop.
T
is the train that we used to split a sentence into its words, so we see that the train (≢T)
can be used to count how many words are in a sentence:
' ' (≢≠⊆⊢) 'this sentence has five words'
As a beginner, it is recommended that you use the session to inspect the structure of longer trains:
≢≠⊆⊢
The diagrams that the session draws will help you inspect the structure of the derived function.
As another example, consider the train (5<(≢≠⊆⊢))
.
In this train, the parentheses make the structure explicit:
the outer train is a fork where the left tine is actually an array (the scalar 5), the middle tine is the function
<
and the right tine is another train; andthe right tine is the 4train from before.
Again, the session helps us visualise this structure:
5<(≢≠⊆⊢)
With the train above, we can check, for example, if a sentence has more than five words:
' ' (5<(≢≠⊆⊢)) 'this sentence has five words'
' ' (5<(≢≠⊆⊢)) 'this longer sentence contains a total of nine words'
If we look at a 5train, we see that a 5train is a fork with a right tine that is also a fork.
In the expression below, we use {Fi}
as a placeholder for an arbitrary function:
{F5} {F4} {F3}{F2}{F1}
When APL parses the expression above to create the appropriate derived function, it starts by taking the three rightmost functions and creating a fork, that we will call T1
:
T1 ← {F3}{F2}{F1}
Then, APL uses that fork T1
as the right tine of the fork that uses the fourth and fifth functions.
{F5} {F4} T1
Much like in regular expressions, in function trains we can use parenthesis to change the way APL parses things. For example, if we parenthesise the three central functions, we create a 3train where the middle tine is itself a fork:
{F5} ({F4}{F3}{F2}) {F1}
If we type a 6train, we see that the sixth function is applied atop the corresponding 5train:
{F6} {F5} {F4} {F3}{F2}{F1}
As you may have guessed by now, the length of the function train determines whether we have an atop or a fork, and that distinction depends on the parity of the length of the train:
a function train with an odd number of functions is a fork; and
a function train with an even number of functions is an atop.
Naturally, a typical train will get increasingly more difficult to understand (for humans) as it grows, so you should keep that in mind when writing your own trains. Nonetheless, even long trains have a uniform structure that the tree diagram makes clear:
{F9}{F8}{F7}{F6}{F5}{F4}{F3}{F2}{F1}
We can exploit the uniformity in the structure of (long) trains to write the specification of how trains of arbitrary length work:
Rule
In trains, the functions in odd positions are the functions that receive the arguments of the train directly and those functions are used monadically or dyadically, depending on whether the train is called monadically or dyadically.
The functions in even positions are used dyadically with the results of the surrounding functions as arguments.
If the leftmost function of a train is in an even position, that function will be applied atop the remainder of the train.
Let us inspect an arbitrary train with 8 functions:
{F8}{F7}{F6}{F5}{F4}{F3}{F2}{F1}
The functions in positions 1, 3, 5, and 7, will receive the train arguments directly:
┌────┴────┐
{F8} ┌────┼─────────┐
{F7} {F6} ┌────┼─────────┐
↑↑↑↑ {F5} {F4} ┌────┼────┐
↑↑↑↑ {F3} {F2} {F1}
↑↑↑↑ ↑↑↑↑
Then, the functions in even positions will receive, as arguments, the results of the surrounding functions.
Notice the ←
and →
next to the intersections of the tree diagram:
┌────┴────┐
{F8} ┌───→┼←────────┐
{F7} {F6} ┌───→┼←────────┐
↑↑↑↑ {F5} {F4} ┌───→┼←───┐
↑↑↑↑ {F3} {F2} {F1}
↑↑↑↑ ↑↑↑↑
Finally, if the leftmost function is in an even position, that function will be applied atop the remainder of the train.
In this example, that means F8
would be applied to the result returned by F6
.
12.6.4. Using Trains#
Tacit programming, and trains in particular, are infamous for being too complicated and convoluted. This is debatable and there are benefits to using trains. Some of the associated benefits are concrete and can be measured. For example, function trains lend themselves nicely to idiom recognition, which means they can be faster:
⎕← 10↑vec ← ?1e6⍴1e6
]runtime c (vec≥999000)⍳1 vec(⍳∘1≥)999000
Some other advantages of using trains are subjective and/or impossible to quantify. We will expose some of them here.
Trains provide a mathematically pure way of specifying data transformations, given that trains are functions derived from the composition of other functions. Thus, trains are an interesting alternative for when you need to define functions that transform your data without producing sideeffects. This “purity”, which can be hard to define and somewhat subjective, is what allows function trains to be inverted by the operator power, as is shown in Section 12.6.7.
Because function trains do not use any characters to specify the function composition or the way the arguments flow (because trains use tacit programming), it can be argued that trains are a more direct way of expressing pure computations, when compared to dfns or tradfns.
12.6.5. Carriages in a Train#
Function trains do not need to be composed solely of primitive functions. Any function can be used in a function train, namely other function trains, dfns, tradfns, and functions derived from operators. Below, you can find a couple of examples.
First, we create a fork that checks if a nonempty vector contains only a single unique element. In this fork, the right tine is a named dfn:
CountUnique ← {≢∪⍵}
(1=CountUnique) 1 1 1 1 2 1
There is nothing preventing us from using the dfn directly:
(1={≢∪⍵}) 1 1 1 1 1 1
Now, we replace the dfn with an equivalent formulation using the operator beside to compose the two functions that we need on the right tine:
(1=≢∘∪) 1 1 1 1 2 1
Because the derived function is being used monadically, we know that ≢∘∪
is the same as ≢⍤∪
and ≢⍥∪
.
On top of that, we know that ≢⍤∪
is the same as the 2train (≢∪)
, so we can replace the right tine with another train:
(1=(≢∪)) 1 1 1 1 1 1
As you can see, there is a lot of flexibility associated with function trains. Naturally, making use of this flexibility can cost you in readability.
For example, compare the two trains that follow:
' ' (≠≢⍤⊆⊢) sentence
' ' (≢≠⊆⊢) sentence
The two trains are equivalent, but they differ in the way we specify that the function tally must be applied.
In the train (≠≢⍤⊆⊢)
, we use the operator atop to specify we do the tally of the partition, whereas in the 4train (≢≠⊆⊢)
, we have the tally atop the fork (≠⊆⊢)
.
We can even see that the corresponding diagrams differ:
≠≢⍤⊆⊢
(≢≠⊆⊢)
When possible, avoid using operators inside trains because it breaks the uniform pattern of trains.
A train with 9 functions is a fork with a uniform structure, as shown by its diagram:
{F9}{F8}{F7}{F6}{F5}{F4}{F3}{F2}{F1}
If we insert some operators, the diagram loses regularity and becomes harder to follow:
{F9}{F8}⍤{F7}∘{F6}{F5}{F4}{F3}⍥{F2}{F1}
12.6.6. Hybrid Functionoperators#
Some primitives are both functions and operators, like reduce /
and reducefirst ⌿
.
One must be careful when using hybrid functionoperators in function trains if the intended purpose is to use them as functions.
Here is an expression that selects the positive numbers of a numeric vector:
v ← ¯0.4 ¯1.3 ¯0.9 3.2 4.5 0.6 4.9 ¯2 ¯2.3 0.3
(0∘<v)/v
One might try to rewrite that expression as a fork like so:
0 (</⊢) v
However, it clearly does not work.
The issue is that /
is being parsed as the operator reduce, so it is trying to use 0∘<
as its operand.
To force the glyph slash to be parsed as the function compress, one can write the function compress as the right operand of the atop ⊢⍤/
:
0 (<⊢⍤/⊢) v
This workaround forces the glyph slash to be seen as the function compress (and not the operator) because the atop ⍤
needs a right operand.
The atop ⊢⍤F
is equivalent to the function F
, so putting ⊢⍤/
in the train will not change its behaviour.
Another possible fix is to wrap the glyph in a dfn:
0 (<{⍺/⍵}⊢) v
12.6.7. Inverting Trains#
The operator power can take a negative right operand, which will invert the function and then apply it. An advantage of using compositional operators and/or function trains is that the derived functions are amenable to inversion by the operator power. The operator power can invert some of these functions because they are pure functions, and thus there are formulas that allow the interpreter to know how to invert the functions.
Below, you can find a train that adds 1 to the square of the argument:
(1+×⍨) 10
This train can be inverted by the operator power:
(1+×⍨)⍣¯1 ⊢ 101
However, the equivalent dfn cannot be inverted:
{1+×⍨⍵}⍣¯1 ⊢ 101
DOMAIN ERROR
{1+×⍨⍵}⍣¯1⊢101
∧
There are limitations to what (tacit) functions the operator power can invert, but this capability is nonetheless surprising.
12.7. Exercises#
What is the result of the expression 2∘×⍤+/ 3 5
?
What about 2∘×⍤+/ 3 5 7
?
Use the operator bind to create derived functions that accept numeric arrays and return a Boolean mask indicating:
what values are positive;
what values are equal to
¯1
; andwhat values are less than or equal to
3.5
.
Replace the underscores _
with the correct compositional operators so that the results become correct:
4 5 6 ,_≢ 1 0 1
3 3
4 5 6 ,_≢ 1 0 1
4 5 6 3
4 5 6 ,_≢ 1 0 1
1
Replace the underscores _
with the correct compositional operators so that the results become correct:
2 +/_ 11 6 7 19
3
(⍳6) /⍨_~ 1 0 0 1 0 0
2 3 5 6
2 ÷_+_÷ 4
1.33333
Rewrite the expressions below without parentheses. Use the operator commute instead.
(∪w)⍳w ← 'MISSISSIPPI'
1 2 3 3 2 3 3 2 4 4 2
≢(' '≠sentence)⊆sentence
7
(2 2,(⌊(≢⎕A)÷4))⍴⎕A
ABCDEF
GHIJKL
MNOPQR
STUVWX
Take the character vector sentence
and create a Boolean mask that indicates what elements of sentence
are blank spaces (' '
).
That mask, in conjunction with the operators at and constant, can also be used to replace all the spaces with a different character.
Replace all spaces with asterisks ('*'
) using this technique.
In typical code, this technique is useful when the Boolean mask is used for other computations other than the replacement with at.
The operator beside in F∘G
can be interpreted as “preprocess the right argument of F
with G
”.
Implement a dop _B_
such that ⍺ G _B_ F ⍵ ←→ (G ⍺) F ⍵
.
This operator is sometimes referred to as behind and is interpreted as “preprocess the left argument of F
with G
”.
Try to implement it in terms of beside and commute.
Hint: start from F∘G
.
Use these expressions to verify your implementation:
¯2 _B_+ 5
(⍳10) (3∘↑_B_,) 4 5 6
The dfn {(+/⍵)÷≢⍵}
computes the average of a numeric vector.
Rewrite it as a fork.
The monadic tacit function ∘.×⍨∘⍳
computes the multiplication table for the numbers from 1 to the right argument.
Rewrite it as a fork.
∘.×⍨∘⍳ 5
The train (FG)
behaves the same way as the derived function F⍤G
.
Write a train that behaves the same way as F∘G
when F∘G
is used dyadically.
The train (FG)
behaves in the same way as the derived function F⍤G
.
Write two trains that behave the same way as F⍥G
when F⍥G
is used dyadically:
one train that uses nested trains but no compositional operators; and
one train that uses compositional operators but no nested trains.
Write a train that behaves like G _B_ F
when G _B_ F
is used dyadically, assuming _B_
is the operator behind from a previous exercise.
Rewrite the dfns that follow as function trains:
{×⍵} ¯3.14
1
{*2×⍵} 2
54.5982
{2+5×⍵} ¯3.14
¯13.7
{⍵2*⍵} 0.5
¯0.914214
{⍵2*⍵} 0.5
0.914214
{(⍵*2)+⍵*3} 2
12
Rewrite the dfns that follow as function trains:
3 {÷⍵⍺} 4
1
3 {(⍺+⍵)×⍺⍵} 4
¯7
3 {1+(⍺+⍵)×⍺⍵} 4
¯6
3 {⍵*1+⍺} 4
256
3 {(⍳⍵)*1+⍺} 4
1 16 81 256
3 {⍵*1+⍳⍺} 4
16 64 256
The dfn {,1↑⍵+⍉⍵}
is a monadic dfn that accepts numeric matrix that are square (that is, that have the same number of rows and columns).
Of the trains below, which ones are equivalent to this dfn when called monadically?
(,1↑⊢+⍉)
(,∘1∘↑⊢+⍉)
(,⍉1∘↑⍤+⊢)
(,1↑+∘⍉⍨)
((,1∘↑)⍉+⊢)
(,∘(1∘↑)⊢+⍉)
(≢↑⍉,⍤+⊢)
You are welcome to inspect the (tree) diagrams of the trains, but do not run the trains. The objective of the exercise is to analyse the trains and reason about them.
12.8. Solutions#
Solution to Exercise 12.1
What is the result of the expression 2∘×⍤+/ 3 5
?
What about 2∘×⍤+/ 3 5 7
?
Because operators bind from the left, 2∘×⍤+/
is equivalent to ((2∘×)⍤+)/
.
The leftmost function is 2∘×
which is times bound to a left argument 2, which is the function double.
The function double is being used atop the function plus, so the reduction is a reduction by addition followed by doubling.
Therefore, the expression 2∘×⍤+/ 3 5
will result in 16:
2∘×⍤+/ 3 5
2∘×3+5
Similarly, if the vector argument is 3 5 7
, we start by adding 5 and 7 and doubling, which gives 24.
Then, we add 3 to 24 and double, which gives 54:
2∘×⍤+/ 3 5 7
Solution to Exercise 12.2
Positive values are greater than 0, which can be computed with this derived function:
>∘0
Values equal to ¯1
can be computed with:
¯1∘=
Values less than or equal to 3.5 can be found with:
≤∘3.5
Solution to Exercise 12.3
4 5 6 ,⍥≢ 1 0 1
4 5 6 ,∘≢ 1 0 1
4 5 6 ,⍤≢ 1 0 1
Solution to Exercise 12.4
Replace the underscores _
with the correct compositional operators so that the results become correct:
2 +/_ 11 6 7 19
3
(⍳6) /⍨_~ 1 0 0 1 0 0
2 3 5 6
2 ÷_+_÷ 4
1.33333
2 +/⍤ 11 6 7 19
(⍳6) /⍨∘~ 1 0 0 1 0 0
2 ÷⍤+⍥÷ 4
Solution to Exercise 12.5
Rewrite the expressions below without parentheses. Use the operator commute instead.
(∪w)⍳w ← 'MISSISSIPPI'
1 2 3 3 2 3 3 2 4 4 2
≢(' '≠sentence)⊆sentence
7
((⌊(≢⎕A)÷9),3 3)⍴⎕A
ABC
DEF
GHI
JKL
MNO
PQR
w⍳⍨∪w ← 'MISSISSIPPI'
≢sentence⊆⍨' '≠sentence
⎕A⍴⍨3 3,⍨⌊9÷⍨≢⎕A
Solution to Exercise 12.6
mask ← ' '=sentence
+/mask ⍝ number of blank spaces
The operator constant ⍨
turns the vector mask
into a function that returns the mask that determines where the asterisk is going to be put:
'*'@(mask⍨)sentence
If the mask had not been calculated previously, we could achieve the same effect with a right operand dfn that computes the mask:
'*'@{' '=⍵}sentence
We could even use a compositional operator:
'*'@(' '∘=)sentence
In this exercise, we tried to emulate the context in which the mask has already been computed because it was used for something else. In cases like this, it is not needed to recompute the mask.
Solution to Exercise 12.7
A direct implementation could be:
_B_ ← {(⍺⍺ ⍺)⍵⍵ ⍵}
¯2 _B_+ 5
(⍳10) (3∘↑_B_,) 4 5 6
Alternatively, one can implement behind in terms of beside and commute.
Using the hint, we can start with F∘G
.
We want to have ⍺ G _B_ F ⍵ ←→ (G ⍺) F ⍵
and we have ⍺ G _B_ F ⍵ ←→ ⍺ F∘G ⍵ ←→ ⍺ F G ⍵
.
The first thing that is wrong with this version is that G
is being applied to ⍵
and not ⍺
, so we can fix this by using the operator commute once.
If we have F∘G⍨
, then ⍺ F∘G⍨ ⍵ ←→ ⍵ F∘G ⍺ ←→ ⍵ F G ⍺
.
Thus, G
is being applied to the correct argument, but then F
is getting its arguments ⍵
and G ⍺
in the wrong order.
To fix this, we need to apply a second commute to the function F
alone.
If we use the pattern F⍨∘G⍨
then we have ⍺ F⍨∘G~ ⍵ ←→ ⍵ F⍨∘G ⍺ ←→ ⍵ F⍨ G ⍺ ←→ (G ⍺) F ⍵
, which is what we wanted.
We can implement this in our dop:
_B_ ← {⍺ ⍵⍵⍨∘⍺⍺⍨ ⍵}
¯2 _B_+ 5
(⍳10) (3∘↑_B_,) 4 5 6
Solution to Exercise 12.8
Avg ← +/÷≢
Avg 1 2 3 4
Solution to Exercise 12.9
(⍳∘.×⍳) 5
Solution to Exercise 12.10
When F∘G
is used dyadically, we have ⍺ F∘G ⍵ ←→ ⍺ F G ⍵ ←→ (⍺⊣⍵) F G ⍵ ←→ (⍺⊣⍵) F (G ⍺⊢⍵) ←→ (⍺⊣⍵) F (⍺ G⍤⊢ ⍵)
.
Therefore, the train (⊣FG⍤⊢)
is the same as F∘G
when both are used dyadically.
5 +∘ ¯2
5 (⊣+⍤⊢) ¯2
Solution to Exercise 12.11
When F⍥G
is used dyadically, we have ⍺ F⍥G ⍵ ←→ (G ⍺) F (G ⍵) ←→ (G ⍺⊣⍵) F (G ⍺⊢⍵)
.
Both G ⍺⊣⍵
and G ⍺⊢⍵
exhibit the pattern of an atop, which can be written as a 2train or with the operator atop.
If we use 2trains, we get ((G⊣)F(G⊢))
.
If we use the operator atop, we get (G⍤⊣FG⍤⊢)
.
These are the same as F⍥G
if used dyadically:
(⍳10) +⍥≢ ⎕A
(⍳10) ((≢⊣)+(≢⊢)) ⎕A
(⍳10) (≢⍤⊣+≢⍤⊢) ⎕A
Solution to Exercise 12.12
G _B_ F
used dyadically gives ⍺ G _B_ F ⍵ ←→ (G ⍺) F ⍵
, which is somewhat symmetric to F∘G ←→ ⍺ F (G ⍵)
.
Thus, if the fork for beside was (⊣FG⍤⊢)
we get that the fork for behind is (G⍤⊣F⊢)
:
¯2 _B_+ 5
¯2 (⍤⊣+⊢) 5
Solution to Exercise 12.13
In the case of the first dfn {×⍵}
, it is so short that we realise (×)
is just an atop:
(×) ¯3.14
When rewriting a dfn or another expression as a function train, arrays (whether the argument or constant arrays used in the expression) have to go in the odd positions. Dyadic functions that combine results that are computed from the argument tend to be the functions that go in the even positions.
For example, in the dfn {*2×⍵}
we see the arrays 2
and ⍵
and we could try to put them in the odd positions of a train: (... _ 2 _ ⊢)
.
We use the function right tack instead of ⍵
because, in tacit programming we do not mention the arguments explicitly.
Then, we see that the function times is a dyadic function combining the 2
and the ⍵
, so that can be the centre function in the fork of the first three functions: (... _ 2 × ⊢)
.
Finally, we see that the function exponential is applied atop the result, so the train becomes (*2×⊢)
:
(*2×⊢) 2
Similarly, in the dfn {2+5×⍵}
, we see three arrays and no computations are done to the left of the 2
, so we can expect to try and fit that dfn in a train that looks like (2 _ 5 _ ⊢)
.
Conveniently enough, it is enough to insert the same functions in the same positions, and the train is correct:
(2+5×⊢) ¯3.14
The same reasoning works for the dfns {⍵2*⍵}
and {⍵2*⍵}
, giving the trains below:
(⊢2*⊢) 0.5
(⊢2*⊢) 0.5
In the dfn {(⍵*2)+⍵*3}
, we see that the function +
is combining the results of computing ⍵*2
and ⍵*3
, thus +
seems like the centre function of a fork (F + G)
.
Now, we need F
to compute ⍵*2
and we need G
to compute ⍵*3
.
If you look at the expression ⍵*2
as being the function square applied to ⍵
, and if you look at the expression ⍵*3
as the function cube applied to ⍵
, then the fork (*∘2+*∘3)
might look more natural to you.
(*∘2+*∘3) 2
If you start to associate the argument(s) with the functions left and right tack, you might see ⍵*2
as the “fork” (⊢*2)
, except that forks cannot have arrays on the right.
You could fix this by writing (2*⍨⊢)
or (⊢*2⍨)
.
Thus, you would come up with a fork like ((2*⍨⊢)+(3*⍨⊢))
.
((2*⍨⊢)+(3*⍨⊢)) 2
Solution to Exercise 12.14
In trains that are used dyadically, whenever we find a function that is being used with both the left and right arguments, that is a strong indication that that function should be in an odd position of a train.
For example, for the first dfn {÷⍵⍺}
, we see the function minus being applied to both arguments, albeit in the wrong order.
This means that (... ⍨)
is a good start for our train.
Then, we can finish our train with the function reciprocal atop the subtraction:
3 (÷⍨) 4
Likewise, the dfn {(⍺+⍵)×⍺⍵}
shows the functions plus and minus being used dyadically, hinting at a fork of the form (+ ... )
.
Then, it is just a matter of inserting the function times in the centre:
3 (+×) 4
Again, we see two functions being used dyadically with ⍺
and ⍵
, that are likely to go in odd positions.
On top of that, we see the array 1
being used explicitly.
Constant arrays also go in odd positions, so we have a possible train structure that looks like (1 _ + _ )
.
Then, we fill in the gaps and arrive at the train below:
3 (1++×) 4
In dyadic trains where one of the arguments shows up isolated, that is likely to be a place to use a left or right tack, depending on whether we need ⍺
or ⍵
, respectively.
Below, by inspecting the positions of the arguments and the literal arrays, we can suspect our train will have a structure like (⊢ _ 1 _ ⊣)
.
To finish this train, we insert the functions that are missing:
3 (⊢*1+⊣) 4
When part of the expression that is being translated contains a function that is applied to only one of the arguments, we must work around that in some way.
For example, in the dfn {(⍳⍵)*1+⍺}
, we can modify the train (⊢*1+⊣)
by having the index generator atop the right tack:
3 (⍳⍤⊢*1+⊣) 4
In the example of the dfn {⍵*1+⍳⍺}
we can do the exact same thing to apply the index generator atop the left tack, resulting in the train
3 (⊢*1+⍳⍤⊣) 4
Another alternative would be to use the operator beside so that the function plus becomes plus with the right argument preprocessed by the index generator:
3 (⊢*1+∘⍳⊣) 4
(Note that, in the prose above, the “right argument” of plus that is preprocessed is actually the left argument of the train.)
Solution to Exercise 12.15
The dfn {,1↑⍵+⍉⍵}
takes a square numeric matrix, adds it with its transpose, and then returns the first row of the resulting matrix as a vector.
Here is an example:
{,1↑⍵+⍉⍵} 5 5⍴⍳25
Of the seven trains shown, the only one that is not equivalent to this dfn is the second option, (,∘1∘↑⊢+⍉)
.
The second option is not equivalent to the dfn because of the derived function ,∘1∘↑
that is applied atop the fork (⊢+⍉)
.
The derived function ,∘1∘↑
is parsed as (,∘1)∘↑
because operators bind from the left.
Thus, after we add the matrix to its transpose, we mix the matrix (which leaves it unchanged) and then we catenate the scalar 1
to its right, adding a column of ones:
(,∘1∘↑⊢+⍉) 5 5⍴⍳25
Now, we explain why the other trains are equivalent to the dfn.
The very first one, (,1↑⊢+⍉)
is a direct translation of the dfn:
(,1↑⊢+⍉) 5 5⍴⍳25
The third one is (,⍉1∘↑⍤+⊢)
, which is pretty similar to the first one, except that the part “take the first row” was moved atop the addition of the matrix argument and its transpose.
This can be seen in the tree diagram of the train:
(,⍉1∘↑⍤+⊢)
This means that, after we add the matrix argument with its transpose, we immediately take the first row.
Then, the function ravel that is atop the fork (⍉1∘↑⍤+⊢)
turns that 1row matrix into the result vector:
(,⍉1∘↑⍤+⊢) 5 5⍴⍳25
The fourth train is (,1↑+∘⍉⍨)
, which is a 4train:
(,1↑+∘⍉⍨)
What we need to do is understand the rightmost function in the train, which is +∘⍉⍨
.
The train is called monadically, so the function +∘⍉⍨
will also be called monadically, which means the operator ⍨
is the operator selfie: +∘⍉⍨ ⍵ ←→ ⍵ +∘⍉ ⍵
.
Now, the usage of the operator beside means that we preprocess the right argument of plus with transpose, so we end up with +∘⍉⍨ ⍵ ←→ ⍵ +∘⍉ ⍵ ←→ ⍵ + ⍉⍵
, which is exactly what we have in the dfn:
(,1↑+∘⍉⍨) 5 5⍴⍳25
The fifth and sixth trains, ((,1∘↑)⍉+⊢)
and (,∘(1∘↑)⊢+⍉)
, differ in two things.
The first difference is in the first three functions: (⍉+⊢)
versus (⊢+⍉)
.
However, plus is a commutative function, so this difference in the definition of the train doe not affect the result.
The second difference is in the way the leftmost function is defined. Notice how both trains are 4trains.
In one, we have a nested 2train (,1∘↑)
atop the fork.
This atop applies the functions take one and ravel consecutively to the addition of the matrix argument with its transpose.
In the other, we have the derived function ,∘(1∘↑)
atop the fork.
This derived function uses parentheses to prevent ,∘1∘↑
to bind from the left, like in the second train.
In short, both trains are equivalent to the original dfn:
((,1∘↑)⍉+⊢) 5 5⍴⍳25
(,∘(1∘↑)⊢+⍉) 5 5⍴⍳25
The final train that we have to study is (≢↑⍉,⍤+⊢)
.
This one looks more different from the original dfn because it uses primitive functions that are not present in the original dfn.
This train is a 5train that starts with the fork (⍉,⍤+⊢)
.
This fork starts by adding the argument matrix to its transpose and then ravels it, because of the function ravel atop the function plus:
(⍉+⊢) 5 5⍴⍳25
(⍉,⍤+⊢) 5 5⍴⍳25
Then, that ravel is going to be the right argument to the fork (≢↑⊢)
.
The function tally takes the original matrix as argument, so the result of that will be the number of rows (or columns, because it is a square matrix) that the matrix has.
Thus, from the ravel of the matrix addition, we will take the number of elements of a single row, which corresponds to the first row of the matrix addition:
(≢↑⍉,⍤+⊢) 5 5⍴⍳25