(Tacit-Programming-Tacit-Programming)=
# Tacit Programming

Tacit programming is a programming paradigm that APL supports.
In order to understand what tacit programming is, we need to know what the word “tacit” means, in English:

 > “Tacit, adjective: understood or implied without being stated.”

In tacit programming, the thing that is implied without being stated is what arguments the functions receive.
In other words, in tacit programming we create functions by combining other functions _without_ specifying where the arguments go.
This sounds much more confusing than it really is, so let us study some examples.

(Tacit-Programming-Combining-Functions-with-Operators)=
## Combining Functions with Operators

(Tacit-Programming-Derived-Functions-are-Tacit)=
### Derived Functions are Tacit

The simplest example of tacit programming arises from the use of the primitive operators.
Recall these two helper functions from a previous exercise:

In [1]:
Trim ← {3↑⍵}
IsLong ← {3<≢⍵}

These are two regular dfns.
Now, we can use them to create a function `TrimLong` that will trim all the elements of the argument vector that are too long:

In [2]:
TrimLong ← {(Trim¨)@(IsLong¨) ⍵}
TrimLong ⎕← ⍳¨⍳5

As it stands, the dfn `TrimLong` has nothing special about it.
However, we can get rid of the braces and the omega `⍵` because those things are redundant:

In [3]:
TrimLong ← (Trim¨)@(IsLong¨)
TrimLong ⍳¨⍳5

This alternative implementation works because the operator _at_ needs two operands to derive a new function, and by providing the two operands, we assign the _derived function_ directly to the variable `TrimLong`.
We do not need to wrap the _derived function_ in a dfn.
In fact, when we learned about _at_ in {numref}`Operators-At`, we learned exactly how the _derived function_ will handle its right argument:

 - the right argument will be passed directly to the right operand of _at_, which is `IsLong¨`; then
 - the elements of the right argument for which `IsLong` evaluates to 1 are collected and passed to the left operand `Trim¨`.
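
The two steps above can be sketched by hand. The snippet below is only an illustration, assuming the default `⎕IO←1`; the names `arg`, `mask`, and `res` are introduced here for the sketch and are not part of the original example.

```apl
Trim ← {3↑⍵}
IsLong ← {3<≢⍵}
arg ← ⍳¨⍳5
mask ← IsLong¨ arg            ⍝ Boolean mask computed by the right operand
res ← arg
(mask/res) ← Trim¨ mask/arg   ⍝ only the selected items go through the left operand
res ≡ (Trim¨)@(IsLong¨) arg   ⍝ the manual version is expected to match the derived function
```

The selective assignment `(mask/res) ← …` is just one way to mimic what _at_ does; the derived function does not need the intermediate names.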

Looking closely at the tacit definition of `TrimLong` we see that we actually have two levels of tacit programming.
Notice how the right operand of the operator _at_ is `IsLong¨`.
Why is it `IsLong¨` and not just `IsLong`?

The dfn `IsLong` takes a vector and determines if it has more than three elements, but we already know that the right operand of _at_ will receive a nested vector.
In our example above, that was `⍳¨⍳5`:

In [4]:
⍳¨⍳5

Thus, if we want to determine which nested elements of the argument vector are too long, we need to use the operator _each_ to modify `IsLong`.
The use of _each_ modifies how `IsLong` works and that modification is done tacitly because of the definition of _each_: we do not need to write anything to explain that the function `IsLong` is going to be applied to each scalar of the argument to `IsLong¨`.

Let us try another tacit definition:

In [5]:
MaxWindow ← {⌈/,⍺↓⍵}⌺3 3

The (tacit!) function `MaxWindow` takes a matrix argument and computes the maximum value in every 3 by 3 window:

In [6]:
⎕← mat ← 4 6⍴(⍳3),⍳2

In [7]:
MaxWindow mat

Suppose that we want to modify this function so that we can apply it to higher-rank arrays.
Our goal is that the function `MaxWindow` gets applied to each 2-cell (each sub-matrix), so we can do that with the operator _rank_:

In [8]:
HighRankMaxWindow ← MaxWindow⍤2

In [9]:
⎕← cuboid ← 2 5 7⍴(⍳3),(1+⍳3)

In [10]:
HighRankMaxWindow cuboid

We could have defined `HighRankMaxWindow` directly:

In [11]:
HighRankMaxWindow ← ({⌈/,⍺↓⍵}⌺3 3)⍤2

(Tacit-Programming-Operator-Binding-Order)=
### Operator Binding Order

We have seen two tacit functions that make use of multiple operators to derive successive functions:

In [12]:
TrimLong ← (Trim¨)@(IsLong¨)
HighRankMaxWindow ← ({⌈/,⍺↓⍵}⌺3 3)⍤2

However, both functions have superfluous parentheses, because we have not been considering the order in which operators bind in Dyalog APL.

When functions are applied to arrays, we do not need to parenthesise from the right, because each function takes everything to its right as its right argument.
For example,

In [13]:
1 + (⍳5)

is just

In [14]:
1 + ⍳5

When using multiple operators together, we do not need to parenthesise from the left.
For example, the function `TrimLong` was defined as

In [15]:
TrimLong ← (Trim¨)@(IsLong¨)

but it could have been defined as

In [16]:
TrimLong ← Trim¨@(IsLong¨)
TrimLong ⍳¨⍳5

The leftmost set of parentheses was not necessary because operators bind from the left.
Thus, the expression `Trim¨@IsLong¨` is equivalent to `((Trim¨)@IsLong)¨`.
The rightmost set of parentheses, on the other hand, **is** necessary: without it, the rightmost operator _each_ binds to the derived function `Trim¨@IsLong` instead of the dfn `IsLong`.
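
A smaller case may make the left-to-right binding easier to see. In the sketch below, the vector `v` is hypothetical data chosen for illustration; the point is that `+/¨` is read as `(+/)¨`, so the parentheses can be dropped.

```apl
v ← (1 2 3)(4 5)(6 7 8 9)
+/¨ v              ⍝ reduce binds to + first, then each binds to +/
(+/)¨ v            ⍝ same function, with the implicit parentheses written out
(+/¨ v) ≡ (+/)¨ v  ⍝ both spellings are expected to agree
```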

Similarly, the definition of `HighRankMaxWindow` has an extra set of parentheses.
Instead of `({⌈/,⍺↓⍵}⌺3 3)⍤2`, we can write

In [17]:
HighRankMaxWindow ← {⌈/,⍺↓⍵}⌺3 3⍤2
HighRankMaxWindow cuboid

Now, we will learn about a tool that will help us study tacit functions and, in particular, understand what parentheses are needed and which ones are not.

(Tacit-Programming-Inspecting-Tacit-Functions)=
## Inspecting Tacit Functions

The user command `]box` that you first learned in {numref}`Data-and-Variables-Nested-Arrays` can also be used to customise how tacit functions are displayed.
This customisation is done through the switch `-trains`.

Take a look at the help message below and read the different options for the switch `-trains`:

In [18]:
]box -?

This switch is called "trains" because trains are the more general form of tacit programming in Dyalog.
We will learn about trains in {numref}`Tacit-Programming-Function-Trains`.

Let us go through the multiple options available in the subsections that follow.

(Tacit-Programming-Box)=
### Box

The option `box` draws boxes that indicate the binding order of operators and operands.
Thus, inner boxes indicate that their contents bind first, and the contents of outer boxes bind later, possibly with content from inner boxes.

A couple of examples will follow.

In [19]:
]box -trains=box

First, we will see how the version of `TrimLong` that does not have superfluous parentheses is represented:

In [20]:
Trim¨@(IsLong¨)

As the box diagram shows, the operands of the operator _at_ are the two boxes on either side of it:

 - on the left, the box contains two other boxes, the dfn `{3↑⍵}` and the operator `¨`, so we get `{3↑⍵}¨` as the left operand; and
 - on the right, the box contains two other boxes, the dfn `{3<≢⍵}` and the operator `¨`, so we get `{3<≢⍵}¨` as the right operand.

If we drop the right set of parentheses, the box diagram changes to reflect the fact that the rightmost _each_ has as left operand everything else:

In [21]:
Trim¨@IsLong¨

If you look closely, the box diagrams look like nested vectors.
A 2-element vector represents the left operand of an operator and its operator, and a 3-element vector represents the left operand, a dyadic operator, and the right operand.

Working from the innermost box, the operator _each_ binds with `{3↑⍵}` to create the first derived function `F1`:

In [22]:
⎕← F1 ← '{3↑⍵}' '¨'

Then, the operator _at_ binds with `F1` on the left and with `{3<≢⍵}` on the right to create the second derived function `F2`:

In [23]:
⎕← F2 ← F1 '@' '{3<≢⍵}'

Finally, the last _each_ binds with `F2` on the left to create the third and final derived function `F3`:

In [24]:
⎕← F3 ← F2 '¨'

Similarly, we can see that `HighRankMaxWindow` does not need any parentheses to be interpreted as we intended:

In [25]:
{⌈/,⍺↓⍵}⌺3 3⍤2

The operator _stencil_ got bound with its operands first, and that derived function was the left operand to _rank_.

(Tacit-Programming-Tree)=
### Tree

The option `tree` draws the tacit function in a tree structure, with the top/root of the tree being the operator that binds last.
A monadic operator gets a branch to its left operand and a dyadic operator gets two branches, one for each operand.

In [26]:
]box -trains=tree

If we inspect the tacit definition of `HighRankMaxWindow` first, it should show the operator _rank_ at the top with a sub-tree on the left to represent the left operand `{⌈/,⍺↓⍵}⌺3 3` and a branch on the right pointing to the right operand `2`:

In [27]:
HighRankMaxWindow

This tree structure shows that the function `HighRankMaxWindow` is a function derived from the operator _rank_.
Then, to interpret the left operand, we have to inspect the sub-tree on the left:

```
    ⌺
┌───┴────┐
{⌈/,⍺↓⍵} (2⍴3)
```

The left sub-tree shows that the left operand is a function derived from the operator _stencil_ with a left operand dfn and a right operand vector.

We can also inspect the tree structure of the expression for `TrimLong` **without** any parentheses:

In [28]:
{3↑⍵}¨@{3<≢⍵}¨

The fact that the root of the tree is the operator _each_ shows that we needed a set of parentheses somewhere.
The tree should have the operator _at_ at the root with another derived function on each branch:

In [29]:
{3↑⍵}¨@({3<≢⍵}¨)

(Tacit-Programming-Parens)=
### Parens

The option `-trains=parens` will always add as many parentheses as possible, even if superfluous, to make explicit the binding of the operators and their operands:

In [30]:
]box -trains=parens

In [31]:
{3↑⍵}¨@{3<≢⍵}¨

In [32]:
{⌈/,⍺↓⍵}⌺3 3⍤2

Note that if any of the elements to be displayed take up multiple lines, then the function will be displayed as if `]box` were OFF.
This display format may look unusual, so we show two functions in that format so you get acquainted with it:

In [33]:
]box OFF

In [34]:
{3↑⍵}¨@{3<≢⍵}¨

In [35]:
{⌈/,⍺↓⍵}⌺3 3⍤2

In [36]:
]box ON

(Tacit-Programming-Def)=
### Def

The option `-trains=def` will show the simplest expression that still defines the same function.

In [37]:
]box -trains=def

For example, if we display our original implementation of `TrimLong` that contained superfluous parentheses, this option will remove them:

In [38]:
(Trim¨)@(IsLong¨)

Whenever you are writing a derived function and are not sure if the operands will bind like you need them to, use these tools to inspect the derived function and understand what you need to do to make sure you define your function correctly.

Let us reset `]box` to using the option `-trains=tree`:

In [39]:
]box -trains=tree

Deriving functions from operators is the simplest form of tacit programming.
In the sections that follow, we will learn about function composition and trains which provide other mechanisms for tacit programming.

(Tacit-Programming-Function-Composition)=
## Function Composition

Function composition refers to the act of defining a new function out of smaller functions that get applied in the pattern specified by the combining operators.

Trains, explained in detail in {numref}`Tacit-Programming-Function-Trains`, can also be thought of as a form of function composition, but this section will focus on the three operators that Dyalog provides for function composition.

The three operators introduced here, _beside_, _atop_, and _over_, are dyadic operators that take their operands and produce a single, composite operation.
One can regard these operators as easy ways of specifying inline "mini-functions".
As such, these operators do not add functionality to the language that could not be obtained by other means; they just represent a very convenient notation to express some common patterns.

We start by introducing _beside_.

(Tacit-Programming-Beside)=
### Beside

_Beside_ is a dyadic operator represented by _jot_ `∘`, which you can type with <kbd>APL</kbd>+<kbd>j</kbd>, as you already know.
In this section, we will cover the forms of _beside_ where the operands are functions.
The section on argument binding ([here](#Binding)) will cover the forms in which _beside_ has an array operand.

_Beside_ takes the two operand functions and applies them to the argument(s) with a pattern that depends on the valence of the derived function:

 - `F∘G ⍵ ←→ F G ⍵`, both `F` and `G` are used as monadic functions.
 - `⍺ F∘G ⍵ ←→ ⍺ F G ⍵`, `F` is used dyadically and `G` is used monadically.

The patterns shown above can also be represented in diagrams like the ones in {numref}`fig-Beside_Diagram`.

(fig-Beside_Diagram)=
```{figure} ../res/Beside_Diagram.svg
---
name: Beside_Diagram
---
Diagram showing how the operator _beside_ composes its operands.
```

The composition with _beside_, `F∘G`, can be interpreted as "preprocess the right argument of `F` with `G`".
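
As a quick check of the two patterns, the sketch below uses _negate_/_minus_ as `F` and _reverse_ as `G`; the choice of functions and data is only for illustration.

```apl
⍝ Monadic use: F∘G ⍵ ←→ F G ⍵
(-∘⌽ 1 2 3) ≡ - ⌽ 1 2 3
⍝ Dyadic use: ⍺ F∘G ⍵ ←→ ⍺ F G ⍵
(10 -∘⌽ 1 2 3) ≡ 10 - ⌽ 1 2 3
```

Both comparisons are expected to give `1`: in the dyadic case only the right argument is reversed before the subtraction.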

The operator _beside_ is rarely used alone.
After all, the expressions above show that instead of writing `F∘G ⍵` one could just write `F G ⍵` or, instead of writing `⍺ F∘G ⍵`, one could just write `⍺ F G ⍵`.
_Beside_ is often used together with the operator _each_, as this may give important advantages for execution time and memory consumption.
Another example usage of _beside_ is to create a derived function to be used together with the operator _reduce_.
We will show examples of these usages below.

(Tacit-Programming-Monadic-derived-function)=
#### Monadic derived function

Quite often you would like to apply two monadic functions to each item of an array.
This is very easy to do with the help of the powerful operator _each_.

Let us look at the simple example in which we just want to find the rank of each item of the variable `weird`:

In [40]:
⎕← weird ← 2 2⍴456 (2 2⍴ 'Dyalog' 44 27 (2 2⍴8 6 2 4)) (17 51) 'Twisted'

The rank of `array` is `≢⍴array`, so the rank of each item of `weird` is:

In [41]:
≢¨⍴¨weird

In the expression above, the `⍴¨` creates a (potentially big) array containing the shape of each item of `weird`.
Then, the `≢¨` gets the length of each vector of the intermediate result.
Remember: the rank of an array is the length of the shape of the array.

This is inefficient for two reasons:

 1. Firstly, APL must allocate memory to hold the intermediate array, which will be discarded as soon as the entire expression has been evaluated.

We can see this intermediate result if we insert a `⎕←` between `≢¨` and `⍴¨`:

In [42]:
≢¨ ⎕← ⍴¨ weird

 2. Secondly, internally APL must loop through a potentially large number of items twice.

With the help of _beside_, we can eliminate both problems:
APL only needs to traverse the array once, applying both functions to each item in succession.
During the processing of each item, only a very small intermediate array will be created holding the shape of each item, and it will be discarded before processing the next item:

In [43]:
≢∘⍴¨ weird

The expression `≢∘⍴¨` is another example of tacit programming.
From the example above, we know that `≢∘⍴¨` computes the rank of each item in a nested array, if applied monadically.
However, that expression contains two functions and two operators, and nowhere do we specify explicitly what arguments go where.
Hence, `≢∘⍴¨` is an example of tacit programming.

As another example usage of _beside_, consider the expression below:

In [44]:
+/∘⍳¨ 2 4 7

This expression adds up the items of `⍳2`, those of `⍳4`, and finally those of `⍳7`.
Using _beside_ to compose the _plus reduction_ and the _index generator_ functions uses less space than using the operator _each_ twice, because with two separate uses of _each_ the intermediate result is a nested vector holding all the vectors created by the _index generator_:

In [45]:
+/¨ ⎕← ⍳¨ 2 4 7

If the initial argument contained more integers and they were all large integers, the intermediate result would be unnecessarily large.

In the third example that follows, both operands to _beside_ are user-defined functions we have seen before:

In [46]:
Sqrt ← {⍵*0.5}
Average ← {(+/⍵)÷≢⍵}

In [47]:
Sqrt∘Average¨ (11 7)(8 11)(21 51)(16 9)

(Tacit-Programming-Dyadic-derived-function)=
#### Dyadic derived function

Here is an example of composition of _times_ and _index generator_ with the operator _beside_:

In [48]:
1 10 100 ×∘⍳¨ 2 4 3

The expression above multiplies `1` with `⍳2`, then it multiplies `10` with `⍳4`, and finally, it multiplies `100` with `⍳3`.

Another example involves the approximation of the **golden mean**, which can be calculated by this infinite continued fraction, written here in APL's right-to-left notation:

$$
1 +\div~ 1 +\div~ 1 +\div~ 1 +\div~ 1 +\div~ 1 +\div~ \cdots
$$

As you can see, we have inserted `+÷` between the items of a sequence of ones.
This operation is a _reduction_ by `+÷`, but the operator _reduce_ only accepts a single function on its left.
To overcome this, we can use the operator _beside_ to compose the two functions `+` and `÷` together, thereby creating a single, derived function that may be used together with _reduce_:

In [49]:
+∘÷/ 1 1 1 1 1 1

In [50]:
+∘÷/ 50⍴1
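
To see how the reduction unfolds, and how close the result gets, here is a small sketch. The name `phi` and the tolerance `1E¯10` are choices made for this illustration; `(1+5*0.5)÷2` is the usual closed form of the golden mean.

```apl
+∘÷/ 1 1 1                  ⍝ 1 + ÷ (1 + ÷ 1), that is, 1.5
phi ← (1+5*0.5)÷2           ⍝ closed form of the golden mean
1E¯10 > | phi - +∘÷/ 50⍴1   ⍝ with 50 ones the approximation should be well within this tolerance
```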

(Tacit-Programming-Atop)=
### Atop

_Atop_ is a dyadic operator represented by _jot diaeresis_ `⍤`, which you can type with <kbd>APL</kbd>+<kbd>Shift</kbd>+<kbd>J</kbd>.
This is the same glyph as the one used for the operator _rank_.

The difference between the operator _rank_ and the operator _atop_ lies in the right operand:

 - for the operator _rank_, the right operand is an array; and
 - for the operator _atop_, the right operand is a function.

_Atop_ takes the two operand functions and applies them to the argument(s) with a pattern that depends on the valence of the derived function:

 - `F⍤G ⍵ ←→ F G ⍵`, both `F` and `G` are used as monadic functions and this is exactly the same as `F∘G`.
 - `⍺ F⍤G ⍵ ←→ F ⍺ G ⍵`, `F` is used monadically and `G` is used dyadically.

The patterns shown above can also be represented in diagrams like the ones in {numref}`fig-Atop_Diagram`.

(fig-Atop_Diagram)=
```{figure} ../res/Atop_Diagram.svg
---
name: Atop_Diagram
---
Diagram showing how the operator _atop_ composes its operands.
```

The left operand is applied _atop_ the right operand.
In other words, the left operand function is applied after the right operand is applied to the available arguments.
Yet another way of describing the operator _atop_ is by saying that `F⍤G` post-processes the result of `G` with `F`.
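
A minimal sketch of the dyadic pattern, with functions and data picked only for illustration: `≢⍤∩` post-processes the _intersection_ of the two arguments with _tally_.

```apl
1 2 3 4 ≢⍤∩ 3 4 5                          ⍝ tally of the intersection 3 4
(1 2 3 4 ≢⍤∩ 3 4 5) ≡ ≢ 1 2 3 4 ∩ 3 4 5    ⍝ same as applying ≢ after the dyadic ∩
```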

We will show some example usages below.

Much like _beside_, seen before, and _over_, which will be shown next, the operator _atop_ is typically used in conjunction with other operators, for example _each_ or _reduce_.

(Tacit-Programming-Example-usages-of-atop)=
#### Example usages of atop

Suppose that you need to determine whether a number comes before or after another number.
This type of comparison can be made with one of the many comparison primitives: `<`, `≤`, `≥`, and `>`.
However, the comparison primitives return Boolean results, which are either 0 or 1.

If you wanted a more fine-grained comparison that distinguishes whether the left argument is before the right argument, after the right argument, or is the same as the right argument, you could use a derived function with _atop_:

In [51]:
5 ×⍤- 2 5 8

The result of `⍺ ×⍤- ⍵` is one of three values:

 - `1` if `⍺` comes after `⍵`;
 - `0` if `⍺` is the same as `⍵`; and
 - `¯1` if `⍺` comes before `⍵`.

The dyadic derived function `×⍤-` can be interpreted as "the sign of the difference", which can be seen as post-processing the difference of the two arguments with the function _sign_.

It is true that the expression shown above could be rewritten without _atop_:

In [52]:
× 5 - 2 5 8

But that is not necessarily a better alternative to using the derived function `×⍤-` and, in some cases, rewriting `×⍤-` without the _atop_ may not be an alternative.

(Tacit-Programming-Function-atop-tack)=
#### Function atop tack

There is a common usage pattern for the operator _atop_ that involves using a tack.

Consider the monadic derived function `≢⍤⊢⌸` that uses the operator _key_ and the operator _atop_, where the left operand to `⌸` is `≢⍤⊢` because operators bind from the left.
The monadic derived function `≢⍤⊢⌸` counts how many times each unique item appears in its argument:

In [53]:
≢⍤⊢⌸ 'MISSISSIPPI'

The result above means that one of the letters shows up 1 time, two other letters show up 4 times each, and the fourth letter shows up 2 times.
These counts are in the order of the unique elements, so we can easily find out the letters associated with the counts:

In [54]:
∪'MISSISSIPPI'

The point of using `F⍤⊢⌸` is that the operator _key_ passes two arguments to its left operand (to give more flexibility to the user) but we only care about one of those, so we use the appropriate tack to select the argument we want, and then apply `F` to that argument.
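
To make the two arguments visible, the sketch below replaces the left operand with explicit dfns (written here only for illustration) and assumes the default `⎕IO←1`.

```apl
{⍺ ⍵}⌸ 'MISSISSIPPI'      ⍝ each row pairs a unique letter with the indices where it occurs
{≢⍵}⌸ 'MISSISSIPPI'       ⍝ explicit dfn that ignores ⍺ and tallies ⍵
(≢⍤⊢⌸ 'MISSISSIPPI') ≡ {≢⍵}⌸ 'MISSISSIPPI'
```

The last line is expected to give `1`: `≢⍤⊢` is a tacit spelling of "ignore the left argument and tally the right one".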

A similar pattern is exhibited in this alternative implementation of _n-wise reduction_ as a _dop_:

In [55]:
_NWiseReduce ← {⍺⍺/⍤⊢⌺⍺⊢⍵}
2 +_NWiseReduce ⍳10

`⍺⍺/⍤⊢⌺⍺` is a single derived function and we can inspect its structure:

In [56]:
{}/⍤⊢⌺0

In the expression above, we substituted the left operand `⍺⍺` of the dop with `{}` because writing `⍺⍺` outside of a dop gives a `SYNTAX ERROR`.
Similarly, we used `0` as the right operand of `⌺` because we cannot write `⍺` outside of a _dfn_/_dop_.

So, in inspecting the structure of `⍺⍺/⍤⊢⌺⍺`, we see that `⍺⍺/⍤⊢` is the left operand of the operator _stencil_.
Recall that the left operand of the operator _stencil_ takes two arguments:

 - the left argument gives information about the padding of the current sub-array; and
 - the right argument is the sub-array being processed.

Because we do not need the padding information in the left argument, we use `⍺⍺/⍤⊢` to apply `⍺⍺/` directly to the right argument.

(Tacit-Programming-Over)=
### Over

_Over_ is a dyadic operator represented by _circle diaeresis_ `⍥`, which you can type with <kbd>APL</kbd>+<kbd>Shift</kbd>+<kbd>O</kbd> (that is the letter "Oh", and not the number zero).

_Over_ takes two operand functions and applies them to the argument(s) with a pattern that depends on the valence of the derived function:

 - `F⍥G ⍵ ←→ F G ⍵`, both `F` and `G` are used as monadic functions and this is exactly the same as `F∘G` or `F⍤G`.
 - `⍺ F⍥G ⍵ ←→ (G ⍺) F (G ⍵)`, `F` is used dyadically and `G` is used monadically.

The patterns shown above can also be represented in diagrams like the ones in {numref}`fig-Over_Diagram`.

(fig-Over_Diagram)=
```{figure} ../res/Over_Diagram.svg
---
name: Over_Diagram
---
Diagram showing how the operator _over_ composes its operands.
```

The usage `F⍥G` of the operator _over_ can be interpreted as "apply `F` after pre-processing all arguments with `G`".

We will show some example usages below.

Of the three compositional operators discussed so far, _beside_, _atop_, and _over_, the operator _over_ is the one that is most commonly used alone.

How can we check if two words start with the same letter?

In [57]:
word1 ← 'banana'
word2 ← 'bat'

We can use _first_ to get the first character of each word and see if they match:

In [58]:
(⊃word1) = ⊃word2

Alternatively, we can check for _equality over the first character_:

In [59]:
word1 =⍥⊃ word2

Suppose that we represent an interval of numbers with a 2-item vector with the two endpoints.
For example, `0 3.34` would be the interval of all the numbers between `0` and `3.34`.

The centre of an interval can be computed as such:

In [60]:
Centre ← {(+/⍵)÷2}
Centre 0 3.34

The distance between two intervals can be defined as the distance between the centres, which we can compute with `-⍥Centre`.

So, the distance between the intervals `¯1 0` and `0 3.34` is

In [61]:
0 3.34 -⍥Centre ¯1 0

If we swap the order of the two arguments, we see that the distance becomes negative:

In [62]:
¯1 0 -⍥Centre 0 3.34

This does not make much sense, so we might want to fix this by saying that we want the _absolute value after the difference of the centres_, which is done by using the operator _atop_ to "post-process" the result of the subtraction:

In [63]:
¯1 0 |⍤-⍥Centre 0 3.34

In [64]:
|⍤-⍥Centre

(Tacit-Programming-Comparison-of-the-Three-Operators)=
### Comparison of the Three Operators

The three compositional operators _beside_, _atop_, and _over_, all behave in the same way if the derived function is used monadically.
The difference lies in the dyadic use of the derived function, as the table below shows.

| Operator | Monadic use | Dyadic use |
| :- | :- | :- |
| `F∘G` | `F∘G ⍵ ←→ F G ⍵` | `⍺ F∘G ⍵ ←→ ⍺ F G ⍵` |
| `F⍤G` | `F⍤G ⍵ ←→ F G ⍵` | `⍺ F⍤G ⍵ ←→ F ⍺ G ⍵` |
| `F⍥G` | `F⍥G ⍵ ←→ F G ⍵` | `⍺ F⍥G ⍵ ←→ (G ⍺) F (G ⍵)` |

These differences can also be summarised in {numref}`fig-Compositional_Operators_Comparison`:

(fig-Compositional_Operators_Comparison)=
```{figure} ../res/Compositional_Operators_Comparison.svg
---
name: Compositional_Operators_Comparison
---
Diagram comparing the dyadic uses of the three compositional operators.
```
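
The differences can also be tried out with a single pair of functions. In the sketch below, `F` is `⌈` (_ceiling_ when monadic, _maximum_ when dyadic) and `G` is `-` (_negate_ when monadic, _minus_ when dyadic); the functions and the numbers are chosen only for illustration.

```apl
⌈∘- 2.5        ⍝ monadic use: ⌈ applied to ¯2.5, the same for all three operators
⌈⍤- 2.5
⌈⍥- 2.5
2.5 ⌈∘- 4      ⍝ beside: 2.5 ⌈ ¯4
2.5 ⌈⍤- 4      ⍝ atop:   ⌈ applied to 2.5 - 4
2.5 ⌈⍥- 4      ⍝ over:   ¯2.5 ⌈ ¯4
```

The three monadic uses should all give `¯2`, whereas the three dyadic uses should give `2.5`, `¯1`, and `¯2.5`, respectively.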


(Tacit-Programming-To-Compose-or-Not-To-Compose)=
### To Compose or Not To Compose

As stated in the beginning of this section, the three operators _beside_, _atop_, and _over_, do not add new primitive behaviour.
In fact, for each of those operators, we have shown equivalent expressions that do **not** use the operators and that achieve the same effect.

However, there are advantages to using compositional operators to create derived functions.
Here are some of those advantages:

 - the derived function can be assigned a name;
 - function composition with these operators works inside trains ([which will be introduced soon](#Function-Trains)); and
 - these operators clarify the meaning of your programs when used correctly. In other words, good usage of _beside_, _atop_, and _over_, can make it easier to read and understand a program.

With practice, you will develop a better understanding for when using these operators explicitly is a good choice.
In part, it will also come down to personal taste: some prefer to use plenty of compositional operators and others prefer to use none, but optimal usage of these operators lies somewhere in between those two extremes.

(Tacit-Programming-Binding)=
## Binding

The glyph _jot_ `∘` has yet another use as a dyadic operator.
If one of the operands, and only one, is an array, then `∘` stands for the operator _bind_.

The operator _bind_ is used to set a fixed argument (the array operand) to a given function (the function operand).
Depending on whether the array operand is on the left or on the right of the function operand, the operator _bind_ sets the left or right argument of the function, respectively.

More explicitly, in `array∘F`, the operator _bind_ sets the left argument of the function `F` to be `array`, and in `F∘array`, the operator _bind_ sets the right argument of the function `F` to be `array`.
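
A minimal sketch with the function _power_ shows the two cases side by side; the numbers are arbitrary.

```apl
(2∘*) 3        ⍝ left argument bound:  2 * 3
(*∘2) 3        ⍝ right argument bound: 3 * 2
```

The first expression should give `8` and the second `9`, because the bound array ends up on different sides of `*`.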

Next, we take a look at a few examples of the operator _bind_:

In [65]:
3∘↑¨ (⍳5) 'Houston' (21 53 78 55) (11 22)

This expression applies `3↑` to each of the items of the right argument.
So far, this is not a very good example, as the expression would work and give the same result even without the operator _bind_:

In [66]:
3 ↑¨ (⍳5) 'Houston' (21 53 78 55) (11 22)

However, binding the value 3 to _take_ makes it possible to combine the function with yet another function, so that we can again obtain the advantage of not creating unnecessary intermediate values:

In [67]:
⌽∘(3∘↑)¨ (⍳5) 'Houston' (21 53 78 55) (11 22)

Moreover, if the array operand is not a scalar, it may be impossible to omit the operator _bind_.
In the example that follows, the operator _bind_ must be present, otherwise we get a `LENGTH ERROR`:

In [68]:
2 2∘⍴¨ (⍳5) 'Houston' (21 53 78 55) (11 22)

In [69]:
2 2 ⍴¨ (⍳5) 'Houston' (21 53 78 55) (11 22)

LENGTH ERROR
      2 2⍴¨(⍳5)'Houston'(21 53 78 55)(11 22)
         ∧


When we use the operator _bind_, we create a derived function that is monadic, which means the derived function always takes a single right argument.
Thus, an expression like `(F∘arr1) arr2` evaluates to `arr2 F arr1`, because the operator _bind_ created the derived function `F∘arr1` where the **right** argument of `F` is set to `arr1`.

For example,

In [70]:
(*∘0.5) 16 81 169

Once bound to 0.5, the function _power_ behaves like the function square root, which expects a right argument (the number(s) to compute the square root of).

In this form, the derived function must be parenthesised so that the operand 0.5 is separated from the right argument vector.
An alternative, which we saw in [the chapter about operators](./Operators.ipynb), is to use a tack:

In [71]:
*∘0.5 ⊢ 16 81 169

(Tacit-Programming-Commute-Selfie-and-Constant)=
## Commute, Selfie, and Constant

The three operators _commute_, _selfie_, and _constant_, are the three usages of the glyph _tilde diaeresis_ `⍨`.
By now, you should be able to guess that the key combination to type `⍨` is <kbd>APL</kbd>+<kbd>Shift</kbd>+<kbd>T</kbd>.
After all, the function _without_ `~` is <kbd>APL</kbd>+<kbd>t</kbd>.

(Tacit-Programming-Commute-and-Selfie)=
### Commute and Selfie

As its name implies, _commute_ is a monadic operator which commutes the arguments of its derived function.

For example,

In [72]:
4 ÷ 2

but if we use the operator _commute_, then

In [73]:
4 ÷⍨ 2

That is, `x F⍨ y` is equivalent to `y F x`.

When the derived function `F⍨` is used monadically, as in `F⍨ y`, then the same argument gets used on both sides of the function.
Thus, `F⍨ y` is equivalent to `y F y`.

For example, `⍴⍨ 3` is equivalent to `3⍴3`:

In [74]:
⍴⍨ 3

Based only on these simple examples, one might think that the operators _commute_ and _selfie_ are useless (typing `⍴⍨3` is no easier than typing `3⍴3`).
However, both may be used to reduce the number of parentheses needed in an expression.

For example, suppose we want to create a vector like `3⍴3` or `5⍴5`, using the last item of an arbitrary vector `v`.

A direct approach would be to write `((≢v)⌷v)⍴(≢v)⌷v` or `(⊃⌽v)⍴⊃⌽v`:

In [75]:
v ← 8 3 6 7 4

In [76]:
((≢v)⌷v)⍴(≢v)⌷v

In [77]:
(⊃⌽v)⍴⊃⌽v

The operator _selfie_ allows a simpler expression:

In [78]:
⍴⍨⊃⌽v

It is not only for "cosmetic" reasons that it is desirable to avoid repeating an expression.
It also means that the interpreter only has to evaluate the expression once, possibly saving some execution time.
Furthermore, avoiding a verbatim repetition of a piece of code improves maintainability considerably.
If the expression needs to be modified, it is simply too easy to forget to modify all instances of it, or to make mistakes in some of the modifications.

Some APL programmers still prefer to use an intermediate variable or an inline direct function to obtain the same benefits in terms of efficiency and maintainability:

In [79]:
last ← ⊃⌽v
last⍴last

It is mostly a matter of taste which of the possible solutions different programmers prefer.
This case illustrates that the APL language typically allows the same task to be solved in many different ways.

The operator _commute_ can also allow for simpler expressions.
For example, if we wanted to select the even numbers of the vector `v` with _compress_, we would write something like:

In [80]:
(~2|v)/v

With the operator _commute_, we can write

In [81]:
v/⍨~2|v

The pattern `array /⍨` can be read as "the items of `array` that...".

The operators _commute_ and _selfie_ can also be helpful in trains.
This will be understandable when we study trains in {numref}`Tacit-Programming-Function-Trains`.

(Tacit-Programming-Constant)=
### Constant

The operator _constant_ is a monadic operator which takes an array operand `array`.
The derived function is an ambivalent constant function that always returns `array`.
Thus, `array⍨` is equivalent to the dfn `{array}`:

In [82]:
1 2 3⍨ v

In [83]:
1 2 3⍨ (⍳5) 'Houston' (21 53 78 55) (11 22)

In [84]:
'Cat' (1 2 3⍨) 'Dog'

(Tacit-Programming-Function-Trains)=
## Function Trains

A function train, often referred to as just a train, is a function that is derived from a sequence of two or more functions.
This sequence of functions must be isolated from its arguments.
Notice the difference in results between this uninteresting APL expression:

In [85]:
10 -,+ 2

And this one, where the three functions are parenthesised:

In [86]:
10 (-,+) 2

Throughout this section you will learn what `(-,+)` means as a function train and you will understand why the result of `10 (-,+) 2` is the vector `8 12`.

A function train with two functions is called an atop (or a 2-train) and a function train with three functions is called a fork (or a 3-train).
They are the two building blocks of trains with arbitrary length, and we will start by looking at them.

(Tacit-Programming-2-train-Atop)=
### 2-train Atop

A train with two functions is called an atop, which is also the name of an operator introduced in {numref}`Tacit-Programming-Atop`.
The 2-train and the operator _atop_ share their name because they function in the same way:

 - `(F G) ⍵ ←→ F⍤G ⍵ ←→ F G ⍵`; and
 - `⍺ (F G) ⍵ ←→ ⍺ F⍤G ⍵ ←→ F ⍺ G ⍵`.

For example, `(|-)` is _the absolute value of the difference_:

In [87]:
10 (|-) 5

In [88]:
5 (|-) 10

Much like with functions that are combined with operators, we can inspect trains by typing them in the session:

In [89]:
(|-)

Trains can also be assigned to names.
When we do so, we do not need to parenthesise the sequence of functions because the functions are already isolated from the arguments:

In [90]:
AbsDiff ← |-

In [91]:
5 AbsDiff 10

(Tacit-Programming-3-train-Fork)=
### 3-train Fork

A train with three functions is called a fork.
In the fork `(F G H)`, the two outer functions `F` and `H` are applied first, and then the function `G` in the middle is applied to the results of the two outer functions.

If we type a fork in the session, we see a diagram that hints at the origin of the name "fork", because the diagram looks like a fork with three tines.
If we use an "empty" dfn `{}` as a placeholder for a function, we can see the fork-like diagram:

In [92]:
{}{}{}

(Tacit-Programming-Monadic-Fork)=
#### Monadic Fork

In the monadic case, we have that `(F G H) ⍵ ←→ (F ⍵) G (H ⍵)`.
For example, the fork `(⌽≡⊢)`, when used monadically, checks if the argument vector is a palindrome.
(Recall that a palindrome is a sequence that reads the same when reversed.)

In [93]:
(⌽≡⊢) 'TACOCAT'

In [94]:
(⌽≡⊢) 1 2 3 4 3 2 1

In [95]:
(⌽≡⊢) 'MISSISSIPPI'

Notice that, in a fork that is used monadically, the two outer functions are used monadically but the middle function is always used dyadically.

(Tacit-Programming-Dyadic-Fork)=
#### Dyadic Fork

In the dyadic case, we have that `⍺ (F G H) ⍵ ←→ (⍺ F ⍵) G (⍺ H ⍵)`.
So, if a fork is used dyadically, both arguments get passed to both outer functions, and then the results are given as arguments to the middle function.

The train `(-,+)`, from the beginning of this section, was used dyadically, so now we can understand it:

In [96]:
10 (-,+) 2

Is the same as:

In [97]:
(10 - 2) , (10 + 2)

Another good example of a dyadic fork is `(≠⊆⊢)`, which can be used to split a vector on a separator.
Below, you can see this fork being used to split a sentence into words:

In [98]:
' ' (≠⊆⊢) 'this is a sentence with some words'

This fork is equivalent to the following expression:

In [99]:
sentence ← 'this is a sentence with some words'
(' ' ≠ sentence) ⊆ (' ' ⊢ sentence)

Of course, the right tack is used to say that the right argument to `⊆` should be the right argument of the fork, unchanged.
In fact, the expression above can be simplified to

In [100]:
(' ' ≠ sentence) ⊆ sentence

(Tacit-Programming-Arrays-in-Forks)=
#### Arrays in Forks

The left tine of a fork does not have to be a function.
In fact, it can be any array, which will then be used as the left argument to the function in the centre of the fork.

For example, the fork `(1=×)` checks if a number is positive:

In [101]:
(1=×) 13.4

In [102]:
(1=×) 0

In [103]:
(1=×) ¯73.42

However, the right tine of a fork cannot be an array.
The right tine of a fork must be a function.
If it were an array, then APL would not interpret that as a fork, but as a normal APL expression that happens to be parenthesised.

For example, one might think that `(⊢*0.5)` is a fork that implements the function square root.
However, if you use this "fork", this is what happens:

In [104]:
(⊢*0.5) 16

APL sees the expression `(⊢*0.5) 16` as a 2-item vector, where the first item is `⊢*0.5`, which is just `*0.5`:

In [105]:
*0.5

If we wanted to insist on writing the function square root as a train, we could fix this by using the operator _constant_, to turn the value 0.5 into a function that always returns 0.5:

In [106]:
(⊢*0.5⍨) 16

(Tacit-Programming-Longer-Trains)=
### Longer Trains

A function train does not have to be limited to two or three functions.
Function trains can have an arbitrarily large size.

In a function train with four or more functions, APL starts combining functions into 3-trains from the right.
For example, to parse the train `(≢≠⊆⊢)`, APL starts by looking at `≠⊆⊢` as a 3-train, which creates a derived function `T ← ≠⊆⊢`.
Now, we can look at `(≢≠⊆⊢)` as `(≢T)` which is a 2-train, an atop.
`T` is the train that we used to split a sentence into its words, so we see that the train `(≢T)` can be used to count how many words are in a sentence:

In [107]:
' ' (≢≠⊆⊢) 'this sentence has five words'

As a beginner, it is recommended that you use the session to inspect the structure of longer trains:

In [108]:
≢≠⊆⊢

The diagrams that the session draws will help you inspect the structure of the derived function.

As another example, consider the train `(5<(≢≠⊆⊢))`.
In this train, the parentheses make the structure explicit:

 - the outer train is a fork where the left tine is actually an array (the scalar 5), the middle tine is the function `<` and the right tine is another train; and
 - the right tine is the 4-train from before.

Again, the session helps us visualise this structure:

In [109]:
5<(≢≠⊆⊢)

With the train above, we can check, for example, if a sentence has more than five words:

In [110]:
' ' (5<(≢≠⊆⊢)) 'this sentence has five words'

In [111]:
' ' (5<(≢≠⊆⊢)) 'this longer sentence contains a total of nine words'

If we look at a 5-train, we see that a 5-train is a fork with a right tine that is also a fork.
In the expression below, we use `{Fi}` as a placeholder for an arbitrary function:

In [112]:
{F5} {F4} {F3}{F2}{F1}

When APL parses the expression above to create the appropriate derived function, it starts by taking the three rightmost functions and creating a fork, that we will call `T1`:

In [113]:
T1 ← {F3}{F2}{F1}

Then, APL uses that fork `T1` as the right tine of the fork that uses the fourth and fifth functions.

In [114]:
{F5} {F4} T1

Much like in regular APL expressions, in function trains we can use parentheses to change the way APL parses things.
For example, if we parenthesise the three central functions, we create a 3-train where the middle tine is itself a fork:

In [115]:
{F5} ({F4}{F3}{F2}) {F1}

If we type a 6-train, we see that the sixth function is applied atop the corresponding 5-train:

In [116]:
{F6}  {F5} {F4} {F3}{F2}{F1}

As you may have guessed by now, the length of the function train determines whether we have an atop or a fork, and that distinction depends on the parity of the length of the train:

 - a function train with an odd number of functions is a fork; and
 - a function train with an even number of functions is an atop.

Naturally, a typical train will get increasingly difficult to understand (for humans) as it grows, so you should keep that in mind when writing your own trains.
Nonetheless, even long trains have a uniform structure that the tree diagram makes clear:

In [174]:
{F9}{F8}{F7}{F6}{F5}{F4}{F3}{F2}{F1}

We can exploit the uniformity in the structure of (long) trains to write the specification of how trains of arbitrary length work:

```{admonition} Rule 
:class: tip
- In trains, the functions in odd positions (counting from the right, so that the rightmost function is in position 1) are the functions that receive the arguments of the train directly, and those functions are used monadically or dyadically, depending on whether the train is called monadically or dyadically.
- The functions in even positions are used dyadically with the results of the surrounding functions as arguments.
- If the leftmost function of a train is in an even position, that function will be applied atop the remainder of the train.
```

Let us inspect an arbitrary train with 8 functions:

In [175]:
{F8}{F7}{F6}{F5}{F4}{F3}{F2}{F1}

The functions in positions 1, 3, 5, and 7, will receive the train arguments directly:

```
┌────┴────┐
{F8} ┌────┼─────────┐
     {F7} {F6} ┌────┼─────────┐
     ↑↑↑↑      {F5} {F4} ┌────┼────┐
               ↑↑↑↑      {F3} {F2} {F1}
                         ↑↑↑↑      ↑↑↑↑
```

Then, the functions in even positions will receive, as arguments, the results of the surrounding functions.
Notice the `←` and `→` next to the intersections of the tree diagram:

```
┌────┴────┐
{F8} ┌───→┼←────────┐
     {F7} {F6} ┌───→┼←────────┐
     ↑↑↑↑      {F5} {F4} ┌───→┼←───┐
               ↑↑↑↑      {F3} {F2} {F1}
                         ↑↑↑↑      ↑↑↑↑
```

Finally, if the leftmost function is in an even position, that function will be applied _atop_ the remainder of the train.
In this example, that means `F8` would be applied to the result returned by `F6`.
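
As a small concrete instance of the rule, the sketch below uses a 5-train and a 6-train built from primitives, next to the explicit expressions they unfold to.
These particular trains and the name `t` are illustrative choices that do not appear elsewhere in this chapter.

```APL
t ← 1 2 3
⍝ 5-train: positions 5 and 1 (⌽ and ⊢) receive the argument,
⍝ position 3 holds the constant array 0, and the two , are used dyadically
(⌽,0,⊢) t          ⍝ 3 2 1 0 1 2 3
(⌽t),(0,(⊢t))      ⍝ same result, written as an explicit expression
⍝ 6-train: the leftmost function ≢ is in position 6, so it is applied atop
(≢⌽,0,⊢) t         ⍝ 7
≢(⌽t),(0,(⊢t))     ⍝ same result
```

Adding one more function on the left would turn the atop back into a fork, in line with the parity rule above.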

(Tacit-Programming-Using-Trains)=
### Using Trains

Tacit programming, and trains in particular, are infamous for being too complicated and convoluted.
This is debatable and there are benefits to using trains.
Some of the associated benefits are concrete and can be measured.
For example, function trains lend themselves nicely to idiom recognition, which means they can be faster:

In [117]:
⎕← 10↑vec ← ?1e6⍴1e6

In [118]:
]runtime -c (vec≥999000)⍳1 vec(⍳∘1≥)999000

Some other advantages of using trains are subjective and/or impossible to quantify.
We will present some of them here.

Trains provide a mathematically pure way of specifying data transformations, given that trains are functions derived from the composition of other functions.
Thus, trains are an interesting alternative for when you need to define functions that transform your data without producing side-effects.
This "purity", which can be hard to define and somewhat subjective, is what allows function trains to be inverted by the operator _power_, as is shown in {numref}`Tacit-Programming-Inverting-Trains`.

Because function trains do not use any characters to specify the function composition or the way the arguments flow (because trains use tacit programming), it can be argued that trains are a more direct way of expressing pure computations, when compared to dfns or tradfns.

(Tacit-Programming-Carriages-in-a-Train)=
### Carriages in a Train

Function trains do not need to be composed solely of primitive functions.
Any function can be used in a function train, namely other function trains, dfns, tradfns, and functions derived from operators.
Below, you can find a couple of examples.

First, we create a fork that checks if a non-empty vector contains only a single unique element.
In this fork, the right tine is a named dfn:

In [119]:
CountUnique ← {≢∪⍵}

In [120]:
(1=CountUnique) 1 1 1 1 2 1

There is nothing preventing us from using the dfn directly:

In [121]:
(1={≢∪⍵}) 1 1 1 1 1 1

Now, we replace the dfn with an equivalent formulation using the operator _beside_ to compose the two functions that we need on the right tine:

In [122]:
(1=≢∘∪) 1 1 1 1 2 1

Because the derived function is being used monadically, we know that `≢∘∪` is the same as `≢⍤∪` and `≢⍥∪`.
On top of that, we know that `≢⍤∪` is the same as the 2-train `(≢∪)`, so we can replace the right tine with another train:

In [123]:
(1=(≢∪)) 1 1 1 1 1 1

As you can see, there is a lot of flexibility associated with function trains.
Naturally, making use of this flexibility can cost you in readability.

For example, compare the two trains that follow:

In [124]:
' ' (≠≢⍤⊆⊢) sentence

In [125]:
' ' (≢≠⊆⊢) sentence

The two trains are equivalent, but they differ in the way we specify that the function _tally_ must be applied.
In the train `(≠≢⍤⊆⊢)`, we use the operator _atop_ to specify we do the _tally of the partition_, whereas in the 4-train `(≢≠⊆⊢)`, we have the _tally_ atop the fork `(≠⊆⊢)`.

We can even see that the corresponding diagrams differ:

In [126]:
≠≢⍤⊆⊢

In [127]:
(≢≠⊆⊢)

When possible, avoid using operators inside trains because it breaks the uniform pattern of trains.

A train with 9 functions is a fork with a uniform structure, as shown by its diagram:

In [128]:
{F9}{F8}{F7}{F6}{F5}{F4}{F3}{F2}{F1}

If we insert some operators, the diagram loses regularity and becomes harder to follow:

In [129]:
{F9}{F8}⍤{F7}∘{F6}{F5}{F4}{F3}⍥{F2}{F1}

(Tacit-Programming-Hybrid-Function-operators)=
### Hybrid Function-operators

Some primitives are both functions and operators, like `/` (the function _compress_ and the operator _reduce_) and `⌿` (their first-axis counterparts).
One must be careful when using hybrid function-operators in function trains if the intended purpose is to use them as functions.

Here is an expression that selects the positive numbers of a numeric vector:

In [130]:
v ← ¯0.4 ¯1.3 ¯0.9 3.2 4.5 0.6 4.9 ¯2 ¯2.3 0.3

In [131]:
(0∘<v)/v

One might try to rewrite that expression as a fork like so:

In [132]:
0 (</⊢) v

However, it clearly does not work.
The issue is that `/` is being parsed as the operator _reduce_, so it takes `<` as its operand and the train becomes the atop `(</ ⊢)` instead of a fork.
To force the glyph _slash_ to be parsed as the function _compress_, one can write the function _compress_ as the right operand of the atop `⊢⍤/`:

In [133]:
0 (<⊢⍤/⊢) v

This workaround forces the glyph _slash_ to be seen as the function _compress_ (and not the operator) because the atop `⍤` needs a right operand.
The atop `⊢⍤F` is equivalent to the function `F`, so putting `⊢⍤/` in the train will not change its behaviour.

Another possible fix is to wrap the glyph in a dfn:

In [134]:
0 (<{⍺/⍵}⊢) v

(Tacit-Programming-Inverting-Trains)=
### Inverting Trains

The operator _power_ can take a negative right operand, which will invert the function and then apply it.
An advantage of using compositional operators and/or function trains is that the derived functions are amenable to inversion by the operator _power_.
The operator _power_ can invert some of these functions because they are pure functions, and thus there are formulas that allow the interpreter to know how to invert the functions.

Below, you can find a train that adds 1 to the square of the argument:

In [13]:
(1+×⍨) 10

This train can be inverted by the operator power:

In [14]:
(1+×⍨)⍣¯1 ⊢ 101

However, the equivalent dfn cannot be inverted:

In [16]:
{1+×⍨⍵}⍣¯1 ⊢ 101

DOMAIN ERROR
      {1+×⍨⍵}⍣¯1⊢101
            ∧


There are limitations to what (tacit) functions the operator _power_ can invert, but this capability is nonetheless surprising.
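
The same idea applies to functions built with compositional operators.
The sketch below uses a temperature conversion; the name `CToF` is an illustrative choice, and the example assumes that the interpreter can derive the inverse of this particular composition, just as it did for the train above.

```APL
⍝ CToF (hypothetical name): multiply by 1.8, then add 32
CToF ← 32∘+∘(×∘1.8)
CToF 0 100            ⍝ 32 212
⍝ Inverting the derived function should give the reverse conversion
CToF⍣¯1 ⊢ 32 212      ⍝ expected to recover 0 100
```

Writing the same function as the dfn `{32+⍵×1.8}` would again prevent the inversion.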

(Tacit-Programming-Exercises)=
## Exercises

```{exercise}
:label: ex-operators-reduction
What is the result of the expression `2∘×⍤+/ 3 5`?
What about `2∘×⍤+/ 3 5 7`?
```


```{exercise}
:label: ex-bind-comparison
Use the operator _bind_ to create derived functions that accept numeric arrays and return a Boolean mask indicating:

 - what values are positive;
 - what values are equal to `¯1`; and
 - what values are less than or equal to `3.5`.
```


````{exercise}
:label: ex-fill-in-the-op
Replace the underscores `_` with the correct compositional operators so that the results become correct:

```APL
      4 5 6 ,_≢ 1 0 1
3 3
      4 5 6 ,_≢ 1 0 1
4 5 6 3
      4 5 6 ,_≢ 1 0 1
1
```
````


````{exercise}
:label: ex-fill-in-the-op-2
Replace the underscores `_` with the correct compositional operators so that the results become correct:

```APL
      2 +/_| 11 6 7 19
3
      (⍳6) /⍨_~ 1 0 0 1 0 0
2 3 5 6
      2 ÷_+_÷ 4
1.33333
```
````


````{exercise}
:label: ex-remove-parentheses
Rewrite the expressions below without parentheses.
Use the operator _commute_ instead.

```APL
      (∪w)⍳w ← 'MISSISSIPPI'
1 2 3 3 2 3 3 2 4 4 2
      ≢(' '≠sentence)⊆sentence
7
      ((⌊(≢⎕A)÷9),3 3)⍴⎕A
ABC
DEF
GHI

JKL
MNO
PQR
```
````


```{exercise}
:label: ex-constant-mask-at
Take the character vector `sentence` and create a Boolean mask that indicates what elements of `sentence` are blank spaces (`' '`).
That mask, in conjunction with the operators _at_ and _constant_, can also be used to replace all the spaces with a different character.
Replace all spaces with asterisks (`'*'`) using this technique.
```

In typical code, this technique is useful when the Boolean mask is also used for computations other than the replacement with _at_.

```{exercise}
:label: ex-behind
The operator _beside_ in `F∘G` can be interpreted as "preprocess the right argument of `F` with `G`".
Implement a dop `_B_` such that `⍺ G _B_ F ⍵ ←→ (G ⍺) F ⍵`.
This operator is sometimes referred to as _behind_ and is interpreted as "preprocess the left argument of `F` with `G`".
Try to implement it in terms of _beside_ and _commute_.

**Hint**: start from `F∘G`.
```

Use these expressions to verify your implementation:

In [195]:
¯2 |_B_+ 5

In [196]:
(⍳10) (3∘↑_B_,) 4 5 6

```{exercise}
:label: ex-average
The dfn `{(+/⍵)÷≢⍵}` computes the average of a numeric vector.
Rewrite it as a fork.
```


```{exercise}
:label: ex-multiplication-table
The monadic tacit function `∘.×⍨∘⍳` computes the multiplication table for the numbers from 1 to the right argument.
Rewrite it as a fork.
```


In [1]:
∘.×⍨∘⍳ 5

```{exercise}
:label: ex-trains-beside
The train `(F G)` behaves the same way as the derived function `F⍤G`.
Write a train that behaves the same way as `F∘G` when `F∘G` is used dyadically.
```


```{exercise}
:label: ex-trains-over
The train `(F G)` behaves in the same way as the derived function `F⍤G`.
Write two trains that behave the same way as `F⍥G` when `F⍥G` is used dyadically:

 - one train that uses nested trains but no compositional operators; and
 - one train that uses compositional operators but no nested trains.
```


```{exercise}
:label: ex-trains-behind
Write a train that behaves like `G _B_ F` when `G _B_ F` is used dyadically, assuming `_B_` is the operator _behind_ from a previous exercise.
```


````{exercise}
:label: ex-maths-as-tacit-monadic
Rewrite the dfns that follow as function trains:

```APL
      {|×⍵} ¯3.14
1
      {*2×⍵} 2
54.5982
      {2+5×⍵} ¯3.14
¯13.7
      {⍵-2*⍵} 0.5
¯0.914214
      {|⍵-2*⍵} 0.5
0.914214
      {(⍵*2)+⍵*3} 2
12
```
````


````{exercise}
:label: ex-maths-as-tacit-dyadic
Rewrite the dfns that follow as function trains:

```APL
      3 {÷⍵-⍺} 4
1
      3 {(⍺+⍵)×⍺-⍵} 4
¯7
      3 {1+(⍺+⍵)×⍺-⍵} 4
¯6
      3 {⍵*1+⍺} 4
256
      3 {(⍳⍵)*1+⍺} 4
1 16 81 256
      3 {⍵*1+⍳⍺} 4
16 64 256
```
````


```{exercise}
:label: ex-rewrite-train
The dfn `{,1↑⍵+⍉⍵}` is a monadic dfn that accepts numeric matrices that are square (that is, that have the same number of rows and columns).
Of the trains below, which ones are equivalent to this dfn when called monadically?

 1. `(,1↑⊢+⍉)`
 2. `(,∘1∘↑⊢+⍉)`
 3. `(,⍉1∘↑⍤+⊢)`
 4. `(,1↑+∘⍉⍨)`
 5. `((,1∘↑)⍉+⊢)`
 6. `(,∘(1∘↑)⊢+⍉)`
 7. `(≢↑⍉,⍤+⊢)`

You are welcome to inspect the (tree) diagrams of the trains, but do not run the trains.
The objective of the exercise is to analyse the trains and reason about them.
```


(Tacit-Programming-Solutions)=
## Solutions

```{solution} ex-operators-reduction 
```

What is the result of the expression `2∘×⍤+/ 3 5`?
What about `2∘×⍤+/ 3 5 7`?

Because operators bind from the left, `2∘×⍤+/` is equivalent to `((2∘×)⍤+)/`.
The leftmost function is `2∘×` which is _times_ bound to a left argument 2, which is the function _double_.
The function _double_ is being used _atop_ the function _plus_, so the reduction is a _reduction by addition followed by doubling_.
Therefore, the expression `2∘×⍤+/ 3 5` will result in 16:

In [135]:
2∘×⍤+/ 3 5

In [136]:
2∘×3+5

Similarly, if the vector argument is `3 5 7`, we start by adding 5 and 7 and doubling, which gives 24.
Then, we add 3 to 24 and double, which gives 54:

In [137]:
2∘×⍤+/ 3 5 7

```{solution} ex-bind-comparison 
```

Positive values are greater than 0, which can be computed with this derived function:

In [138]:
>∘0

Values equal to `¯1` can be computed with:

In [139]:
¯1∘=

Values less than or equal to 3.5 can be found with:

In [140]:
≤∘3.5
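
To see the three derived functions at work, the sketch below gives them names and applies them to a sample vector.
The names `IsPositive`, `IsMinusOne`, `AtMost3p5`, and `nums` are illustrative and are not part of the exercise.

```APL
⍝ Illustrative names for the three derived functions
IsPositive ← >∘0
IsMinusOne ← ¯1∘=
AtMost3p5  ← ≤∘3.5
nums ← ¯1 0 2.5 3.5 7
IsPositive nums     ⍝ 0 0 1 1 1
IsMinusOne nums     ⍝ 1 0 0 0 0
AtMost3p5 nums      ⍝ 1 1 1 1 0
```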

```{solution} ex-fill-in-the-op 
```


In [141]:
4 5 6 ,⍥≢ 1 0 1

In [142]:
4 5 6 ,∘≢ 1 0 1

In [143]:
4 5 6 ,⍤≢ 1 0 1
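
One way to check these answers is to unfold each derived function by hand.
The expressions below are a sketch of what _over_, _beside_, and _atop_ compute in each case:

```APL
(≢4 5 6),(≢1 0 1)   ⍝ over:   tally both arguments, then catenate, giving 3 3
4 5 6,(≢1 0 1)      ⍝ beside: tally the right argument only, giving 4 5 6 3
,(4 5 6≢1 0 1)      ⍝ atop:   ravel the result of the dyadic ≢ (not match), giving 1
```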

```{solution} ex-fill-in-the-op-2 
```

Replace the underscores `_` with the correct compositional operators so that the results become correct:

```APL
      2 +/_| 11 6 7 19
3
      (⍳6) /⍨_~ 1 0 0 1 0 0
2 3 5 6
      2 ÷_+_÷ 4
1.33333
```

In [144]:
2 +/⍤| 11 6 7 19

In [145]:
(⍳6) /⍨∘~ 1 0 0 1 0 0

In [146]:
2 ÷⍤+⍥÷ 4
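
As a sketch of the reasoning behind these answers, each derived function can be unfolded into an explicit expression (assuming the default `⎕IO←1`, as in the rest of the chapter):

```APL
+/(2|11 6 7 19)          ⍝ atop: residues first, then the sum, giving 3
(⍳6)/⍨(~1 0 0 1 0 0)     ⍝ beside: negate the right argument first, giving 2 3 5 6
÷((÷2)+(÷4))             ⍝ over, then atop: reciprocal of the sum of the reciprocals
```

The last line relies on `÷⍤+⍥÷` being parsed as `(÷⍤+)⍥÷`, because operators bind from the left.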

```{solution} ex-remove-parentheses 
```

Rewrite the expressions below without parentheses.
Use the operator _commute_ instead.

```APL
      (∪w)⍳w ← 'MISSISSIPPI'
1 2 3 3 2 3 3 2 4 4 2
      ≢(' '≠sentence)⊆sentence
7
      ((⌊(≢⎕A)÷9),3 3)⍴⎕A
ABC
DEF
GHI

JKL
MNO
PQR
```

In [162]:
w⍳⍨∪w ← 'MISSISSIPPI'

In [163]:
≢sentence⊆⍨' '≠sentence

In [168]:
⎕A⍴⍨3 3,⍨⌊9÷⍨≢⎕A

```{solution} ex-constant-mask-at 
```


In [169]:
mask ← ' '=sentence

In [170]:
+/mask  ⍝ number of blank spaces

The operator _constant_ `⍨` turns the vector `mask` into a function that returns the mask that determines where the asterisk is going to be put:

In [171]:
'*'@(mask⍨)sentence

If the mask had not been calculated previously, we could achieve the same effect with a right operand dfn that _computes_ the mask:

In [172]:
'*'@{' '=⍵}sentence

We could even use a compositional operator:

In [173]:
'*'@(' '∘=)sentence

In this exercise, we tried to emulate the context in which the mask has already been computed because it was used for something else.
In cases like this, there is no need to recompute the mask.

```{solution} ex-behind 
```

A direct implementation could be:

In [197]:
_B_ ← {(⍺⍺ ⍺)⍵⍵ ⍵}

In [198]:
¯2 |_B_+ 5

In [199]:
(⍳10) (3∘↑_B_,) 4 5 6

Alternatively, one can implement _behind_ in terms of _beside_ and _commute_.
Using the hint, we can start with `F∘G`.

We want to have `⍺ G _B_ F ⍵ ←→ (G ⍺) F ⍵` and we have `⍺ G _B_ F ⍵ ←→ ⍺ F∘G ⍵ ←→ ⍺ F G ⍵`.
The first thing that is wrong with this version is that `G` is being applied to `⍵` and not `⍺`, so we can fix this by using the operator _commute_ once.

If we have `F∘G⍨`, then `⍺ F∘G⍨ ⍵ ←→ ⍵ F∘G ⍺ ←→ ⍵ F G ⍺`.
Thus, `G` is being applied to the correct argument, but then `F` is getting its arguments `⍵` and `G ⍺` in the wrong order.
To fix this, we need to apply a second _commute_ to the function `F` alone.

If we use the pattern `F⍨∘G⍨` then we have `⍺ F⍨∘G⍨ ⍵ ←→ ⍵ F⍨∘G ⍺ ←→ ⍵ F⍨ G ⍺ ←→ (G ⍺) F ⍵`, which is what we wanted.
We can implement this in our dop:

In [200]:
_B_ ← {⍺ ⍵⍵⍨∘⍺⍺⍨ ⍵}

In [201]:
¯2 |_B_+ 5

In [202]:
(⍳10) (3∘↑_B_,) 4 5 6

```{solution} ex-average 
```


In [203]:
Avg ← +/÷≢

In [204]:
Avg 1 2 3 4

```{solution} ex-multiplication-table 
```


In [3]:
(⍳∘.×⍳) 5

```{solution} ex-trains-beside 
```

When `F∘G` is used dyadically, we have `⍺ F∘G ⍵ ←→ ⍺ F G ⍵ ←→ (⍺⊣⍵) F G ⍵ ←→ (⍺⊣⍵) F (G ⍺⊢⍵) ←→ (⍺⊣⍵) F (⍺ G⍤⊢ ⍵)`.
Therefore, the train `(⊣ F G⍤⊢)` is the same as `F∘G` when both are used dyadically.

In [205]:
5 +∘| ¯2

In [206]:
5 (⊣+|⍤⊢) ¯2

```{solution} ex-trains-over 
```

When `F⍥G` is used dyadically, we have `⍺ F⍥G ⍵ ←→ (G ⍺) F (G ⍵) ←→ (G ⍺⊣⍵) F (G ⍺⊢⍵)`.
Both `G ⍺⊣⍵` and `G ⍺⊢⍵` exhibit the pattern of an atop, which can be written as a 2-train or with the operator _atop_.

If we use 2-trains, we get `((G⊣)F(G⊢))`.
If we use the operator _atop_, we get `(G⍤⊣ F G⍤⊢)`.
These are the same as `F⍥G` if used dyadically:

In [207]:
(⍳10) +⍥≢ ⎕A

In [208]:
(⍳10) ((≢⊣)+(≢⊢)) ⎕A

In [209]:
(⍳10) (≢⍤⊣+≢⍤⊢) ⎕A

```{solution} ex-trains-behind 
```

`G _B_ F` used dyadically gives `⍺ G _B_ F ⍵ ←→ (G ⍺) F ⍵`, which is somewhat symmetric to `⍺ F∘G ⍵ ←→ ⍺ F (G ⍵)`.
Thus, if the fork for _beside_ was `(⊣ F G⍤⊢)`, we get that the fork for _behind_ is `(G⍤⊣ F ⊢)`:

In [210]:
¯2 |_B_+ 5

In [211]:
¯2 (|⍤⊣+⊢) 5

```{solution} ex-maths-as-tacit-monadic 
```

In the case of the first dfn `{|×⍵}`, it is so short that we realise `(|×)` is just an atop:

In [7]:
(|×) ¯3.14

When rewriting a dfn or another expression as a function train, arrays (whether the argument or constant arrays used in the expression) have to go in the odd positions.
Dyadic functions that combine results that are computed from the argument tend to be the functions that go in the even positions.

For example, in the dfn `{*2×⍵}` we see the arrays `2` and `⍵` and we could try to put them in the odd positions of a train: `(... _ 2 _ ⊢)`.
We use the function _right tack_ instead of `⍵` because, in tacit programming, we do not mention the arguments explicitly.
Then, we see that the function _times_ is a dyadic function combining the `2` and the `⍵`, so that can be the centre function in the fork of the first three functions: `(... _ 2 × ⊢)`.
Finally, we see that the function _exponential_ is applied atop the result, so the train becomes `(*2×⊢)`:

In [11]:
(*2×⊢) 2

Similarly, in the dfn `{2+5×⍵}`, we see three arrays and no computations are done to the left of the `2`, so we can expect to try and fit that dfn in a train that looks like `(2 _ 5 _ ⊢)`.
Conveniently enough, it is enough to insert the same functions in the same positions, and the train is correct:

In [17]:
(2+5×⊢) ¯3.14

The same reasoning works for the dfns `{⍵-2*⍵}` and `{|⍵-2*⍵}`, giving the trains below:

In [18]:
(⊢-2*⊢) 0.5

In [19]:
(|⊢-2*⊢) 0.5

In the dfn `{(⍵*2)+⍵*3}`, we see that the function `+` is combining the results of computing `⍵*2` and `⍵*3`, thus `+` seems like the centre function of a fork `(F + G)`.
Now, we need `F` to compute `⍵*2` and we need `G` to compute `⍵*3`.

If you look at the expression `⍵*2` as being the function _square_ applied to `⍵`, and if you look at the expression `⍵*3` as the function _cube_ applied to `⍵`, then the fork `(*∘2+*∘3)` might look more natural to you.

In [9]:
(*∘2+*∘3) 2

If you start to associate the argument(s) with the functions _left_ and _right tack_, you might see `⍵*2` as the "fork" `(⊢*2)`, except that forks cannot have arrays on the right.
You could fix this by writing `(2*⍨⊢)` or `(⊢*2⍨)`.
Thus, you would come up with a fork like `((2*⍨⊢)+(3*⍨⊢))`.

In [10]:
((2*⍨⊢)+(3*⍨⊢)) 2

```{solution} ex-maths-as-tacit-dyadic 
```

In trains that are used dyadically, whenever we find a function that is being used with both the left and right arguments, that is a strong indication that that function should be in an odd position of a train.
For example, for the first dfn `{÷⍵-⍺}`, we see the function _minus_ being applied to both arguments, albeit in the wrong order.
This means that `(... -⍨)` is a good start for our train.
Then, we can finish our train with the function _reciprocal_ atop the subtraction:

In [22]:
3 (÷-⍨) 4

Likewise, the dfn `{(⍺+⍵)×⍺-⍵}` shows the functions _plus_ and _minus_ being used dyadically, hinting at a fork of the form `(+ ... -)`.
Then, it is just a matter of inserting the function _times_ in the centre:

In [23]:
3 (+×-) 4

In the third dfn, `{1+(⍺+⍵)×⍺-⍵}`, we again see two functions being used dyadically with `⍺` and `⍵`, which are likely to go in odd positions.
On top of that, we see the array `1` being used explicitly.
Constant arrays also go in odd positions, so we have a possible train structure that looks like `(1 _ + _ -)`.
Then, we fill in the gaps and arrive at the train below:

In [24]:
3 (1++×-) 4

In dyadic trains where one of the arguments shows up isolated, that is likely to be a place to use a _left_ or _right tack_, depending on whether we need `⍺` or `⍵`, respectively.
For the dfn `{⍵*1+⍺}`, by inspecting the positions of the arguments and the literal arrays, we can suspect our train will have a structure like `(⊢ _ 1 _ ⊣)`.
To finish this train, we insert the functions that are missing:

In [25]:
3 (⊢*1+⊣) 4

When part of the expression that is being translated contains a function that is applied to only one of the arguments, we must work around that in some way.
For example, in the dfn `{(⍳⍵)*1+⍺}`, we can modify the train `(⊢*1+⊣)` by having the _index generator atop the right tack_:

In [26]:
3 (⍳⍤⊢*1+⊣) 4

In the example of the dfn `{⍵*1+⍳⍺}` we can do the exact same thing to apply the _index generator atop the left tack_, resulting in the train

In [30]:
3 (⊢*1+⍳⍤⊣) 4

Another alternative would be to use the operator _beside_ so that the function _plus_ becomes _plus with the right argument pre-processed by the index generator_:

In [31]:
3 (⊢*1+∘⍳⊣) 4

(Note that, in the prose above, the "right argument" of _plus_ that is pre-processed is actually the left argument of the train.)

```{solution} ex-rewrite-train 
```

The dfn `{,1↑⍵+⍉⍵}` takes a square numeric matrix, adds it with its transpose, and then returns the first row of the resulting matrix as a vector.
Here is an example:

In [44]:
{,1↑⍵+⍉⍵} 5 5⍴⍳25

Of the seven trains shown, the only one that is not equivalent to this dfn is the second option, `(,∘1∘↑⊢+⍉)`.

The second option is not equivalent to the dfn because of the derived function `,∘1∘↑` that is applied atop the fork `(⊢+⍉)`.
The derived function `,∘1∘↑` is parsed as `(,∘1)∘↑` because operators bind from the left.
Thus, after we add the matrix to its transpose, we mix the matrix (which leaves it unchanged) and then we catenate the scalar `1` to its right, adding a column of ones:

In [46]:
(,∘1∘↑⊢+⍉) 5 5⍴⍳25

Now, we explain why the other trains are equivalent to the dfn.
The very first one, `(,1↑⊢+⍉)` is a direct translation of the dfn:

In [50]:
(,1↑⊢+⍉) 5 5⍴⍳25

The third one is `(,⍉1∘↑⍤+⊢)`, which is pretty similar to the first one, except that the part "take the first row" was moved atop the addition of the matrix argument and its transpose.
This can be seen in the tree diagram of the train:

In [52]:
(,⍉1∘↑⍤+⊢)

This means that, after we add the matrix argument with its transpose, we immediately _take_ the first row.
Then, the function _ravel_ that is atop the fork `(⍉1∘↑⍤+⊢)` turns that 1-row matrix into the result vector:

In [55]:
(,⍉1∘↑⍤+⊢) 5 5⍴⍳25

The fourth train is `(,1↑+∘⍉⍨)`, which is a 4-train:

In [53]:
(,1↑+∘⍉⍨)

What we need to do is understand the rightmost function in the train, which is `+∘⍉⍨`.
The train is called monadically, so the function `+∘⍉⍨` will also be called monadically, which means the operator `⍨` is the operator _selfie_: `+∘⍉⍨ ⍵ ←→ ⍵ +∘⍉ ⍵`.
Now, the usage of the operator _beside_ means that we pre-process the right argument of _plus_ with _transpose_, so we end up with `+∘⍉⍨ ⍵ ←→ ⍵ +∘⍉ ⍵ ←→ ⍵ + ⍉⍵`, which is exactly what we have in the dfn:

In [54]:
(,1↑+∘⍉⍨) 5 5⍴⍳25

The fifth and sixth trains, `((,1∘↑)⍉+⊢)` and `(,∘(1∘↑)⊢+⍉)`, differ in two things.
The first difference is in the first three functions: `(⍉+⊢)` versus `(⊢+⍉)`.
However, _plus_ is a commutative function, so this difference in the definition of the train does not affect the result.

The second difference is in the way the leftmost function is defined.
Notice how both trains are 4-trains.

In one, we have a nested 2-train `(,1∘↑)` atop the fork.
This atop applies the functions _take one_ and _ravel_ consecutively to the addition of the matrix argument with its transpose.

In the other, we have the derived function `,∘(1∘↑)` atop the fork.
This derived function uses parentheses to prevent `,∘1∘↑` from binding from the left, which is what happens in the second train.

In short, both trains are equivalent to the original dfn:

In [56]:
((,1∘↑)⍉+⊢) 5 5⍴⍳25

In [57]:
(,∘(1∘↑)⊢+⍉) 5 5⍴⍳25

The final train that we have to study is `(≢↑⍉,⍤+⊢)`.
This one looks more different from the original dfn because it uses primitive functions that are not present in the original dfn.

This train is a 5-train that starts with the fork `(⍉,⍤+⊢)`.
This fork starts by adding the argument matrix to its transpose and then ravels it, because of the function _ravel_ atop the function _plus_:

In [59]:
(⍉+⊢) 5 5⍴⍳25

In [60]:
(⍉,⍤+⊢) 5 5⍴⍳25

Then, that ravel is going to be the right argument to the function _take_ in the outer fork, whose left tine is the function _tally_.
The function _tally_ takes the original matrix as argument, so the result of that will be the number of rows (or columns, because it is a square matrix) that the matrix has.
Thus, from the ravel of the matrix addition, we will _take_ the number of elements of a single row, which corresponds to the first row of the matrix addition:

In [61]:
(≢↑⍉,⍤+⊢) 5 5⍴⍳25
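
Once the reasoning is done, the conclusions can be double-checked in the session.
The sketch below compares each train with the dfn by using _match_ on the same test matrix; the names `m` and `Ref` are illustrative.

```APL
m ← 5 5⍴⍳25                ⍝ same test matrix as above
Ref ← {,1↑⍵+⍉⍵}            ⍝ the dfn from the exercise
(Ref m) ≡ (,1↑⊢+⍉) m       ⍝ 1
(Ref m) ≡ (,∘1∘↑⊢+⍉) m     ⍝ 0: the only train that differs
(Ref m) ≡ (,⍉1∘↑⍤+⊢) m     ⍝ 1
(Ref m) ≡ (,1↑+∘⍉⍨) m      ⍝ 1
(Ref m) ≡ ((,1∘↑)⍉+⊢) m    ⍝ 1
(Ref m) ≡ (,∘(1∘↑)⊢+⍉) m   ⍝ 1
(Ref m) ≡ (≢↑⍉,⍤+⊢) m      ⍝ 1
```

A single test matrix does not prove equivalence in general, so this check complements the analysis above rather than replacing it.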