10. Nested Arrays (Continued)

10.1. First Contact

10.1.1. Definitions

We have already met nested arrays in the chapter about Data and Variables; let us just remind ourselves of some definitions:

An array is said to be generalised or nested when one or more of its items are not simple scalars, but scalars containing “enclosed” arrays (this term will be explained soon).

Such an array can be created in many ways, although until now we have only covered the simplest one, called vector notation, or strand notation. Using this notation, the items of an array are just juxtaposed, and each item can be identified as a separate item because:

• it is separated from its neighbours by blanks, or

• it is embedded within quotes, or

• it is an expression embedded within parentheses, or

• it is a variable name, or the name of a niladic function which returns a result.

Just to demonstrate how it works, we will create a nested vector and a nested matrix:

one ← 2 2⍴8 6 2 4
two ← 'Hello'

nesVec ← 87 24 'John' 51 (78 45 23) 85 one 69
]display nesVec

┌→───────────────────────────────────────┐
│       ┌→───┐    ┌→───────┐    ┌→──┐    │
│ 87 24 │John│ 51 │78 45 23│ 85 ↓8 6│ 69 │
│       └────┘    └~───────┘    │2 4│    │
│                               └~──┘    │
└∊───────────────────────────────────────┘

nesMat ← 2 3⍴'Dyalog' 44 two 27 one (2 3⍴1 2 0 0 0 5)
]display nesMat

┌→───────────────────────┐
↓ ┌→─────┐       ┌→────┐ │
│ │Dyalog│ 44    │Hello│ │
│ └──────┘       └─────┘ │
│          ┌→──┐ ┌→────┐ │
│ 27       ↓8 6│ ↓1 2 0│ │
│          │2 4│ │0 0 5│ │
│          └~──┘ └~────┘ │
└∊───────────────────────┘

Later, we will provide a more formal description of this notation.

10.1.2. Enclose & Disclose

It seems so easy to create and work with nested arrays; couldn’t we turn a simple array into a nested array by, for example, replacing one item of a simple matrix with a vector?

For example, we create a simple matrix:

⎕← mat ← 2 3⍴87 63 52 74 11 62

87 63 52
74 11 62

Then, we try to change it into a nested array:

mat[1;2] ← 10 20 30

LENGTH ERROR
mat[1;2]←10 20 30
∧


It doesn’t work!

We cannot replace one item with an array of three items.

mat[1;2] is a scalar. We can only replace it with a scalar.

10.1.2.1. Enclose

Let us now use a little trick to make the assignment above work. We just have to zip up the three values into a single “bag”, using a function called enclose, represented by the symbol ⊂, typed with APL+z.

Then we will be able to replace one item by one bag!

mat[1;2] ← ⊂10 20 30
mat

┌──┬────────┬──┐
│87│10 20 30│52│
├──┼────────┼──┤
│74│11      │62│
└──┴────────┴──┘

Now it works!

We can, of course, do the same with character data, but we now know that an expression like

mat[2;3] ← 2 4⍴'JohnPete'

LENGTH ERROR
mat[2;3]←2 4⍴'JohnPete'
∧


is incorrect. We must enclose the array like this:

mat[2;3] ← ⊂2 4⍴'JohnPete'


The result is what we expected:

]display mat

┌→─────────────────────┐
↓    ┌→───────┐        │
│ 87 │10 20 30│ 52     │
│    └~───────┘        │
│               ┌→───┐ │
│ 74 11         ↓John│ │
│               │Pete│ │
│               └────┘ │
└∊─────────────────────┘

The result of enclose is always a scalar: cf. Section 10.1.2.4.

10.1.2.2. Disclose

If we look at the contents of mat[2;3], we see a little 2 by 4 matrix, but if we look at its shape, we see that surprisingly it has no shape. Its rank is zero, so it must be a scalar!

mat[2;3]

┌────┐
│John│
│Pete│
└────┘

⍴mat[2;3]


As we can see, its shape is empty. And its rank is zero:

⍴⍴mat[2;3]

0

The explanation is obvious: we have put this little matrix into a bag (a scalar), so we now see the bag, and not its contents. If we want to see its contents, we must extract them from the bag, using a function called disclose, which is represented by the symbol ⊃ and typed with APL+x.

⍴⊃mat[2;3]

2 4

And its rank is two, as expected:

⍴⍴⊃mat[2;3]

2

We experience the same behaviour if we try to extract one item from a nested vector.

Let us recall the nested vector nesVec:

nesVec

┌──┬──┬────┬──┬────────┬──┬───┬──┐
│87│24│John│51│78 45 23│85│8 6│69│
│  │  │    │  │        │  │2 4│  │
└──┴──┴────┴──┴────────┴──┴───┴──┘

We can use similar expressions to the ones we used on mat:

⍴nesVec[5]


The above looks like a scalar; it is a scalar, containing an enclosed vector.

Once we disclose it, we gain access to its contents (three elements, in this case):

⍴⊃nesVec[5]

3

In fact, this should not have come as a complete surprise to us. Earlier, we learned that the shape of the result of an indexing operation is identical to the shape of the indices. In this case (as well as in the matrix case above), the index specifies a scalar. Hence, it would be incorrect to expect anything other than a scalar as the result of the indexing operation!

10.1.2.3. Mnemonics

It is easy to remember how to generate the two symbols for enclose and disclose on a US or UK keyboard:

• Disclose ⊃ is generated by APL+X, as in eXtract; and

• Enclose ⊂ is generated by APL+Z, as in Zip-up.

For reference, the actual symbols are called left shoe and right shoe, respectively for ⊂ and ⊃; “enclose” and “disclose” are the names of the functions.

10.1.2.4. Simple and Other Scalars

We know that the result of enclose is always a scalar, but there is a difference between enclosing a scalar number or character, and enclosing any other array.

When appropriate, we shall use four different terms:

• simple scalar refers to a single number or letter (rank zero);

• enclosed array refers to a scalar that is the result of enclosing anything other than a simple scalar;

• item refers to a scalar that is a constituent of an array, whether it is a simple scalar or an enclosed array; and

• nested array is an array in which at least one of the items is an enclosed array.

Always remember these important points:

• enclose does nothing to a simple scalar - it returns the scalar unchanged. The same for disclose;

• all items of an array are effectively scalars, whether they are simple scalars or enclosed arrays: their rank is 0, and their shape is empty;

• a single item can be replaced only by another single item: a simple scalar, or an array of values zipped up using enclose (to form an enclosed array); and

• strand notation avoids the use of enclose, because of the conventions used to separate individual items from one another.
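The first three points can be checked directly in the session. The following is only a small sketch; the name bag is introduced here purely for illustration and does not appear elsewhere in the chapter:

```apl
bag ← ⊂10 20 30     ⍝ 'bag' is a hypothetical name used only in this sketch
5 ≡ ⊂5              ⍝ 1: enclosing a simple scalar leaves it unchanged
5 ≡ ⊃5              ⍝ 1: so does disclosing it
⍴⍴bag               ⍝ 0: an enclosed array has rank 0
10 20 30 ≡ ⊃bag     ⍝ 1: disclose recovers the original vector
```

Each comparison uses match (≡), which returns 1 only when both arguments are identical in structure and content.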

Let us create four vectors:

a ← 'coffee'
b ← 'tea'
c ← 'chocolate'

v ← a b c


The last statement is just a simpler way to write:

v ← (⊂a),(⊂b),⊂c


So, we can see that each of the items of v is an enclosed character vector. Thus,

⍴v[1]


is ⍬, not 6.

Here is another example:

nesVec[1 5 6] ← 'Yes' 987 'Hello'
]display nesVec

┌→────────────────────────────────────────┐
│ ┌→──┐    ┌→───┐        ┌→────┐ ┌→──┐    │
│ │Yes│ 24 │John│ 51 987 │Hello│ ↓8 6│ 69 │
│ └───┘    └────┘        └─────┘ │2 4│    │
│                                └~──┘    │
└∊────────────────────────────────────────┘

If we use any additional enclose primitives, the results are very different. And the results also vary depending on where the enclose primitives are used.

Here are two examples:

nesVec[1 5 6] ← 'Yes' 987 (⊂'Hello')
]display nesVec

┌→────────────────────────────────────────────┐
│ ┌→──┐    ┌→───┐        ┌─────────┐ ┌→──┐    │
│ │Yes│ 24 │John│ 51 987 │ ┌→────┐ │ ↓8 6│ 69 │
│ └───┘    └────┘        │ │Hello│ │ │2 4│    │
│                        │ └─────┘ │ └~──┘    │
│                        └∊────────┘          │
└∊────────────────────────────────────────────┘

nesVec[1 5 6] ← ⊂'Yes' 987 'Hello'
]display nesVec

┌→────────────────────────────────────────────────────────────────────────────────────────┐
│ ┌→──────────────────┐    ┌→───┐    ┌→──────────────────┐ ┌→──────────────────┐ ┌→──┐    │
│ │ ┌→──┐     ┌→────┐ │ 24 │John│ 51 │ ┌→──┐     ┌→────┐ │ │ ┌→──┐     ┌→────┐ │ ↓8 6│ 69 │
│ │ │Yes│ 987 │Hello│ │    └────┘    │ │Yes│ 987 │Hello│ │ │ │Yes│ 987 │Hello│ │ │2 4│    │
│ │ └───┘     └─────┘ │              │ └───┘     └─────┘ │ │ └───┘     └─────┘ │ └~──┘    │
│ └∊──────────────────┘              └∊──────────────────┘ └∊──────────────────┘          │
└∊────────────────────────────────────────────────────────────────────────────────────────┘

Now, we restore the values assigned earlier, because we will need nesVec below:

nesVec[1 5 6] ← 'Yes' 987 'Hello'


Most of the time, the user command ]box displays enough information when working with nested arrays in the session. However, in some situations, you might want or need more granular display information, which you can obtain by using the function DISPLAY. We have already seen the function DISPLAY and its main characteristics in Section 3.6.4. We now need to explore some additional characteristics of it.

10.1.3.1. Conventions

The following conventions are used in the character matrix that DISPLAY returns:

• A simple scalar has no box around it.

• All other arrays are shown with a surrounding box. The upper-left hand corner of the box describes the shape of the array. It can be:

    • ─, a simple line for a scalar that is an enclosed array;

    • →, a single arrow, for a vector;

    • ↓ or ↓↓, one or more vertical arrows for matrices and higher-rank arrays;

    • ⊖, a horizontal circled minus for an array with empty last axis; or

    • ⌽, a vertical circled bar for an array with another empty axis.

• The bottom-left hand corner of the box describes the nature of the array:

    • ─, a simple line for character contents;

    • ~, a tilde for numeric contents;

    • +, a plus symbol for mixed contents;

    • ∊, a membership symbol for nested arrays;

    • ∇, a del for ⎕OR arrays; or

    • #, a hash for namespace references.

We have not yet studied the last two concepts (⎕OR and namespaces); you can ignore them for now.

10.1.3.2. Change the Default Presentation

By default, the boxes are drawn with special line-drawing characters, but you can provide a zero left argument to force the function to use alternative (standard APL) characters:

)copy DISPLAY

C:\Program Files\Dyalog\Dyalog APL-64 18.2 Unicode\ws\DISPLAY.dws saved Thu Apr 7 00:29:10 2022

DISPLAY 'Hello'

┌→────┐
│Hello│
└─────┘

0 DISPLAY 'Hello'

.→----.
|Hello|
'-----'

As mentioned previously, the default presentation looks a lot better on the screen, but there may be situations where using standard APL characters may be preferred.

10.1.3.3. Distinguish Between Items

Now that we have discovered the existence of scalars which are enclosed arrays, we can use DISPLAY to distinguish between the two kinds of scalars.

Notice how DISPLAY does not draw a box around the 34 below:

DISPLAY 34

34

DISPLAY nesVec[6]

┌─────────┐
│ ┌→────┐ │
│ │Hello│ │
│ └─────┘ │
└∊────────┘

The sixth item of nesVec is an enclosed vector, so its corners are marked with a simple line and an ∊. It contains a second box whose corners tell us that 'Hello' is a character vector. nesVec[6] is a scalar containing a vector. If we disclose the item, we obtain a simple vector:

DISPLAY ⊃nesVec[6]

┌→────┐
│Hello│
└─────┘

10.1.3.4. Empty Arrays

Here is how DISPLAY identifies some empty arrays:

• Empty numeric vector:

DISPLAY ⍬

┌⊖┐
│0│
└~┘

• Empty text vector:

DISPLAY ''

┌⊖┐
│ │
└─┘

These are vectors, because there is no vertical arrow, and the ⊖ sign indicates that they are empty. At the bottom of the boxes, the symbols ~ and ─ show that an empty numeric vector and an empty character vector are different. One contains a zero, the other contains a blank. This indicates the type of the array, which is a property of an array even when the array is empty (in Section 10.9 we talk more about fill items).

We can see the same kind of output for empty matrices:

• Empty numeric matrix:

DISPLAY 0 5⍴0

┌→────────┐
⌽0 0 0 0 0│
└~────────┘

• Empty character matrix:

DISPLAY 0 10⍴''

┌→─────────┐
⌽          │
└──────────┘

• Another empty character matrix:

DISPLAY 5 0⍴''

┌⊖┐
↓ │
│ │
│ │
│ │
│ │
└─┘

• Empty numeric 3D array:

DISPLAY 2 3 0⍴0

┌┌⊖┐
↓↓0│
││0│
││0│
││ │
││0│
││0│
││0│
└└~┘

The output for the empty numeric 3D array contains 2 sets of 3 zeroes to show that its shape is 2 3 0.

10.2. Choose Indexing

Choose indexing is a different way of indexing arrays and is one example application of nested arrays.

In simple indexing, which you learned in Section 3.5, you can index into an array to select multiple items at the same time:

⎕← mat ← 3 4⍴⍳12

1  2  3  4
5  6  7  8
9 10 11 12

mat[1 2;3 4]

3 4
7 8

However, simple indexing always extracts items in a grid-like fashion. For example, the indexing above specified that the result would come from rows 1 and 2 and from columns 3 and 4. But what if you wanted only the items mat[1;3], mat[2;3], and mat[2;4]?

In that case, you can use choose indexing. In choose indexing, the index is a nested array and each scalar of that array identifies a single element of the array being indexed. Each scalar is an enclosed vector of indices, with one element per axis of the array being indexed.

So, if we want to index into a matrix, which has two axes, we need each scalar to be a vector of length two, enclosed. If we want three values, our index vector will have length three:

mat[(1 3)(2 3)(2 4)]

3 7 8

In choose indexing, the index does not have to be a vector. Much like with simple indexing, the shape of the index will determine the shape of the result:

mat[(1 1)(2 2)(3 3)(3 4)]

1 6 11 12

mat[2 2⍴(1 1)(2 2)(3 3)(3 4)]

 1  6
11 12

mat[2 2 1⍴(1 1)(2 2)(3 3)(3 4)]

 1
 6

11
12

If you want to use choose indexing to select a single item, you need to enclose the index vector to turn it into a scalar:

mat[⊂(1 2)]

2
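Because strand notation is shorthand for enclosing and catenating, a choose index can also be spelled out with explicit encloses. The sketch below assumes the default index origin (⎕IO←1) used throughout this chapter; the name idx is introduced only for illustration:

```apl
mat ← 3 4⍴⍳12
idx ← (⊂1 3),(⊂2 3),⊂2 4      ⍝ 'idx' is an illustrative name
idx ≡ (1 3)(2 3)(2 4)          ⍝ 1: same nested vector as the strand form
mat[idx]                       ⍝ 3 7 8
mat[⊂3 4]                      ⍝ 12: one enclosed pair selects one item
```

Seen this way, the single-item case is not special: the index is always an array of enclosed index vectors, and a lone pair simply has to be enclosed by hand.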

10.3. Depth

10.3.1. Enclosing Scalars

Applied to a simple scalar, enclose does nothing: the enclose of a simple scalar is the same simple scalar:

]display 35

35
]display ⊂35

35

However, when applied to any other array, enclose puts a “bag” around it.

]display 2 4 8

┌→────┐
│2 4 8│
└~────┘

If we use enclose once, we get a scalar containing a numeric vector:

]display ⊂2 4 8

┌─────────┐
│ ┌→────┐ │
│ │2 4 8│ │
│ └~────┘ │
└∊────────┘

With one more enclose, we get a scalar containing another scalar, itself containing a numeric vector.
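A short sketch of this double enclosure, using match (≡) and disclose to peel the levels off again. The names once and twice are made up for this illustration:

```apl
once  ← ⊂2 4 8        ⍝ 'once' and 'twice' are illustrative names
twice ← ⊂⊂2 4 8
once ≡ twice          ⍝ 0: the second enclose adds another level
once ≡ ⊃twice         ⍝ 1: one disclose removes the outer bag
2 4 8 ≡ ⊃⊃twice       ⍝ 1: two discloses recover the vector
⍴⍴twice               ⍝ 0: however many bags, the result is still a scalar
```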

10.3.2. The Depth of an Array

Suppose that we write a function Process, which takes as its argument a vector consisting of: the name of a town, the number of inhabitants, a country code, and the turnover of our company in that town.

For example, we could call the function as Process 'Lyon' 466400 'FR' 894600.

For the purpose of this example, the function will just display the items it receives in its argument. We choose to write it with the following syntax:

]dinput
Process ← {
    (town pop coun tov) ← ⍵
    ⎕← (15↑'Town = '),town
    ⎕← (15↑'Population = '),⍕pop
    ⎕← (15↑'Country = '),coun
    ⎕← (15↑'Turnover = '),⍕tov
}


Perhaps this is not the smartest thing we could do, but we did it!

Now, let us execute the function and verify that it works properly:

Process 'York' 186800 'GB' 540678

Town =         York
Population =   186800
Country =      GB
Turnover =     540678

This looks promising, but what will happen if the user forgets one of the items that the function expects? Let us test it:

Process 'York' 186800 'GB'

LENGTH ERROR
Process[1] (town pop coun tov)←⍵
∧


As we might expect, an error message is issued: we cannot put 3 values into 4 variables!

Let us add a little test to our function to check whether or not the right argument has 4 items.

Here is the new version; notice the new line of code:

]dinput
Process ← {
    4≠≢⍵: 'Hey, weren''t you supposed to provide 4 values?'
    (town pop coun tov) ← ⍵
    ⎕← (15↑'Town = '),town
    ⎕← (15↑'Population = '),⍕pop
    ⎕← (15↑'Country = '),coun
    ⎕← (15↑'Turnover = '),⍕tov
}


It seems to work well now:

Process 'York' 186800 'GB'

Hey, weren't you supposed to provide 4 values?

But one day the user forgets all but one of the items, and just types the name of the town. If the user is (un)lucky enough to type a town name with four letters, here is what happens:

Process 'York'

Town =         Y
Population =   o
Country =      r
Turnover =     k

This trivial example shows that when nested arrays are involved, it is not sufficient to rely on the shape of an array; we need additional information: specifically, is it a simple or a nested array? To help distinguish between simple and nested arrays, APL provides a function named depth. It is represented by the monadic use of the symbol ≡.

Here is a set of rules that define how to determine the depth of an array:

• the depth of a simple scalar is 0;

• the depth of any other array of any shape is 1, if all of its items are simple scalars.

We call such an array a simple array, so we can instead say:

• the depth of a non-scalar, simple array is 1;

• the depth of any other array is equal to the depth of its deepest item plus 1; and

• the depth is positive if the array is uniform (all of its items have the same depth), and negative if it is not.

Therefore, our Process function can only work when the argument ⍵ has depth ¯2! Why ¯2? Because the town name and the country name are character vectors, but the population and the turnover are numeric scalars, meaning that ⍵ has heterogeneous depth:

]dinput
Process ← {
    ¯2≠≡⍵: 'The argument has the wrong depth!'
    4≠≢⍵: 'Hey, weren''t you supposed to provide 4 values?'
    (town pop coun tov) ← ⍵
    ⎕← (15↑'Town = '),town
    ⎕← (15↑'Population = '),⍕pop
    ⎕← (15↑'Country = '),coun
    ⎕← (15↑'Turnover = '),⍕tov
}

Process 'York' 186800 'GB' 540678

Town =         York
Population =   186800
Country =      GB
Turnover =     540678

Process 'York'

The argument has the wrong depth!

Another intuitive definition of depth is this: ]display the array and count the number of boxes you must pass to reach its deepest item.

Here are some examples:

≡ 540678

0

As seen above, a simple scalar has depth 0.

The following vector contains only simple scalars. Its depth is 1:

≡ 15 84 37 11

1

The rank of an array does not directly influence its depth. If we reshape the vector above into a matrix, its depth is still 1 because it contains only simple scalars:

≡ 2 2⍴15 84 37 11

1

Now, let us consider this nested vector:

≡ vec1 ← (4 3) 'Yes' (8 7 5 6) (2 4)

2

It is composed of four enclosed vectors, each of depth 1 - so vec1 has depth 2. Now let us change the expression slightly:

≡ vec2 ← (4 3) 'Yes' (8 7 5) 6 (2 4)

¯2

This vector is no longer uniform: it contains four enclosed vectors and one simple scalar, so its depth is negative. The magnitude of the depth has not changed, since it reports the highest level of nesting.

In this context, the word “uniform” only means that the array contains items of the same depth.

• vec2 is not uniform: it contains vectors (of depth 1) mixed with a scalar (of depth 0); and

• vec1 is uniform: all its items are vectors (of depth 1), even though they do not have the same shape, the same type, and certainly not the same content.
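The rules also cover deeper and more irregular nesting. The following sketch applies depth to a few further arrays; the third expression is a made-up example, not one used elsewhere in the chapter, and the negative result assumes the default migration level (⎕ML←1):

```apl
vec1 ← (4 3) 'Yes' (8 7 5 6) (2 4)
≡ ⊂vec1               ⍝ 3: enclosing adds one level to the depth of vec1
≡ 'A'                 ⍝ 0: a single character is a simple scalar
≡ 1 (2 (3 4))         ⍝ ¯3: three levels at the deepest point, and not uniform
```

In the last case, the item 1 has depth 0 while the item 2 (3 4) itself has depth ¯2, so the whole array reports the deepest level, 3, with a negative sign.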

10.3.3. The Depth of an Array, Take 2

We used the example of the function Process to motivate the definition of the depth of an array, but perhaps we could have fixed our function in a different way.

This was the original definition of Process:

]dinput
Process ← {
    (town pop coun tov) ← ⍵
    ⎕← (15↑'Town = '),town
    ⎕← (15↑'Population = '),⍕pop
    ⎕← (15↑'Country = '),coun
    ⎕← (15↑'Turnover = '),⍕tov
}


Instead of checking the length of ⍵ to see if there are enough items, perhaps we could write a more lenient version of Process that uses default values for the population, the country, and the turnover.

That way, if the user does not input enough arguments, the function still works, and displays that some information is missing:

]dinput
Process ← {
    defaults ← ¯1 '?' ¯1
    (town pop coun tov) ← ⍵,(¯1+≢⍵)↓defaults
    ⎕← (15↑'Town = '),town
    ⎕← (15↑'Population = '),⍕pop
    ⎕← (15↑'Country = '),coun
    ⎕← (15↑'Turnover = '),⍕tov
}

Process 'York' 186800 'GB' 540678

Town =         York
Population =   186800
Country =      GB
Turnover =     540678

Process 'York' 186800 'GB'

Town =         York
Population =   186800
Country =      GB
Turnover =     ¯1

It seems like we have a more robust function, but let us see what happens if we keep removing items from the arguments:

Process 'York' 186800

Town =         York
Population =   186800
Country =      ?
Turnover =     ¯1

If we only pass in the town and the population, the function still works.

Now, let us try to pass in only the town:

Process 'York'

Town =         Y
Population =   o
Country =      r
Turnover =     k

Once again, the user runs into trouble because our function Process takes a look at the character vector 'York', sees it has four items, and thus adds no default values.

The issue can be resolved if the user remembers to enclose the town name:

Process ⊂'York'

Town =         York
Population =   ¯1
Country =      ?
Turnover =     ¯1

However, as you will come to understand, it is rarely a good idea to rely on the user to pass the arguments in the correct format.

Wouldn’t it be nice if we, the developers, could take care of that for ourselves? As it turns out, we can.

10.4. Nest

The function nest is a monadic function represented by the left shoe underbar character, ⊆, which you can type with APL+Shift+Z. (Remembering how to type ⊆ should not be too hard, because it lives on the same key as ⊂.)

The function nest is sometimes called enclose if simple, because that is exactly what it does: you give it an array, and ⊆ will enclose it if and only if the argument array is simple.

In an intuitive sense, but using less rigorous words, ⊆ will put a box around arrays that don’t have any boxes yet.

Let us take a look at a couple of examples. Here is a nested array:

'York' 186800 'GB' 540678

┌────┬──────┬──┬──────┐
│York│186800│GB│540678│
└────┴──────┴──┴──────┘

Because the array above is nested, it is not simple. Therefore, ⊆ applied to that array will do nothing:

⊆'York' 186800 'GB' 540678

┌────┬──────┬──┬──────┐
│York│186800│GB│540678│
└────┴──────┴──┴──────┘

On the other hand, we can compare the simple character vector

'York'

York

with what we get if we nest it:

⊆'York'

┌────┐
│York│
└────┘

Because 'York' was not nested, ⊆ did it for us.
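The two cases can be compared side by side with match (≡) and tally (≢). This is a minimal sketch; the name rec is introduced only for illustration:

```apl
(⊆'York') ≡ ⊂'York'          ⍝ 1: a simple vector gets enclosed
rec ← 'York' 186800           ⍝ 'rec' is an illustrative name
(⊆rec) ≡ rec                  ⍝ 1: an array that is already nested is unchanged
≢⊆'York'                      ⍝ 1: one item, the whole name
≢'York'                       ⍝ 4: four separate characters
```

The last two lines show why this matters for a function that counts its items: after nest, a lone town name counts as one item rather than four.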

10.4.1. Argument Homogenisation

In the context of the function Process from before, the function nest becomes quite useful. With it, we can handle the case when the user forgets to enclose the town name when no other information is given:

]dinput
Process ← {
    defaults ← ¯1 '?' ¯1
    (town pop coun tov) ← (⊆⍵),(¯1+≢⊆⍵)↓defaults
    ⎕← (15↑'Town = '),town
    ⎕← (15↑'Population = '),⍕pop
    ⎕← (15↑'Country = '),coun
    ⎕← (15↑'Turnover = '),⍕tov
}

Process 'York'

Town =         York
Population =   ¯1
Country =      ?
Turnover =     ¯1

In Section 10.15 you will be asked to use nest for the purpose of argument homogenisation again.

10.4.2. Nesting a Scalar

A word of caution is in order, pertaining to what happens if we nest a simple scalar. The function nest is supposed to enclose its argument array when it is a simple array. So, let us try to nest the scalar 42:

⊆42

42

At first glance, it looks like the function failed! After all, there is no box around the 42… But the function did not fail. The “issue” here is that simple scalars match their own enclosures. So, when ⊆ tried enclosing 42, nothing happened.

Bear this in mind when using nest, but do not worry about this giving you unpleasant surprises. For example, pretend there is a town called 'A' and let us call Process with that town name:

Process 'A'

Town =         A
Population =   ¯1
Country =      ?
Turnover =     ¯1

See? ⊆'A' gives 'A', but that didn’t prevent the function from correctly handling the default values for the missing pieces of information.

10.5. Each

10.5.1. Definition and Examples

To avoid the necessity of processing the items of an array one after the other in an explicitly programmed loop, one can use a monadic operator called each, which is represented by a diaeresis symbol, which looks like ¨ and is typed with APL+Shift+1.

As its name implies, each applies the function on its left (its operand) to each of the items of the array on its right (if the function is monadic), or to each pair of corresponding items of the arrays on its left and right (if the function is dyadic).

Let us try it with some small nested vectors and a monadic function:

vec3 ← (5 2) (7 10 23) (52 41) (38 5 17 22)
vec4 ← (15 12) 71023 (2 2⍴⍳4) (74 85 96)
vec5 ← (7 5 1) (19 14 13) (33 44 55)


Now, we can ask for the shape of vec3:

⍴vec3

4

Using ¨, we can ask for the shape of each of the items of vec3:

⍴¨vec3

┌─┬─┬─┬─┐
│2│3│2│4│
└─┴─┴─┴─┘

We can do the same with the second vector:

⍴¨vec4

┌─┬┬───┬─┐
│2││2 2│3│
└─┴┴───┴─┘

Beware! One item of vec4 is a scalar, so its shape is empty, as shown above. If ]box were off, this could look odd at first sight:

]box off

Was ON

⍴¨vec4

 2    2 2  3

]box on

Was OFF

If the function specified as the operand to each is dyadic, the derived function is also dyadic. As usual, if one of the arguments is a scalar, the scalar is automatically repeated to match the shape of the other argument. For example, take the following vector with the names of some months:

monVec ← 'January' 'February' 'March' 'April' 'May' 'June'


To take the first 3 letters of each vector in that vector of vectors, we would do

3↑¨monVec

┌───┬───┬───┬───┬───┬───┐
│Jan│Feb│Mar│Apr│May│Jun│
└───┴───┴───┴───┴───┴───┘

As we have just shown, there is no need to repeat the 3 to have the same shape as monVec.

Naturally, the operand to each can also be a user-defined function, provided that it can be applied to all of the items of the argument array(s):

Average ← {(+/⍵)÷≢⍵}
Average¨vec3

3.5 13.3333 46.5 20.5

Remark

In fact, each is a bit more than a “hidden” loop.

Please remember that all items of an array are scalars - either simple scalars or enclosed arrays. So, in an expression like ⍴¨vec5, shouldn’t we expect the result to be just a list of three empty vectors, since the shape of a scalar is an empty vector?

No, the each operator is smarter than that. For each item of the argument array, the item is first disclosed (the “bag” is opened), the function is applied to the disclosed item, and the result is enclosed again to form a scalar (i.e., put into a new “bag”). Finally, all the new bags (scalars) are arranged in exactly the same structure (rank and shape) as the original argument array to form the final result.

So,

⍴¨vec5

┌─┬─┬─┐
│3│3│3│
└─┴─┴─┘

is in fact equivalent to

(⊂⍴⊃vec5[1]), (⊂⍴⊃vec5[2]), (⊂⍴⊃vec5[3])

┌─┬─┬─┐
│3│3│3│
└─┴─┴─┘

(⍴¨vec5)≡(⊂⍴⊃vec5[1]), (⊂⍴⊃vec5[2]), (⊂⍴⊃vec5[3])

1

If the operand to each is a dyadic function, the corresponding items of the left and right arguments are both disclosed before applying the function.
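The dyadic case can be sketched in the same way as the monadic one above. In this made-up example, reshape is applied between corresponding items of a numeric vector and a character vector:

```apl
2 3 4 ⍴¨ 'abc'                                   ⍝ pairs 2 with 'a', 3 with 'b', 4 with 'c'
(2 3 4 ⍴¨ 'abc') ≡ 'aa' 'bbb' 'cccc'             ⍝ 1
(2 3 4 ⍴¨ 'abc') ≡ (⊂2⍴'a'),(⊂3⍴'b'),⊂4⍴'c'     ⍝ 1: each result is enclosed, then all are catenated
```

Here both arguments happen to be simple, so disclosing their items changes nothing; with nested arguments, each pair of items would be disclosed before reshape is applied.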

We have seen that the operand to each may be a primitive function or a user-defined function. It may also be a derived function returned by another operator. For example, in the following expressions, the operand to each is not /, but the derived function +/.

In this example, we sum the numbers inside each item of the vector:

+/¨vec3

7 40 93 82

In this next one, it still works, even though one item is a matrix:

+/¨vec4

┌──┬─────┬───┬───┐
│27│71023│3 7│255│
└──┴─────┴───┴───┘

Beware: in some cases, the same derived function can be applied with or without the help of each, but the result will not be the same at all:

]display vec5

┌→──────────────────────────────┐
│ ┌→────┐ ┌→───────┐ ┌→───────┐ │
│ │7 5 1│ │19 14 13│ │33 44 55│ │
│ └~────┘ └~───────┘ └~───────┘ │
└∊──────────────────────────────┘

Without ¨, +/ sums the three sub-vectors together:

+/vec5

┌────────┐
│59 63 69│
└────────┘

With ¨, +/¨ will compute the sum of each of the sub-vectors:

+/¨vec5

13 46 132

10.5.2. The Use of Each

Each is a “loop cruncher”. Instead of programming loops, in APL you can apply any function to each of the items of an array, each of which may contain a complex set of data.

This operator is also useful combined with match when a simple equal sign would have caused an error. For example, to compare two lists of names:

'John' 'Julius' 'Jim' 'Jean' ≡¨ 'John' 'Oops' 'Jim' 'Jeff'

1 0 1 0

When used inappropriately, the each operator can sometimes use a large amount of memory for its intermediate results, so you may need to use it with some care.

Suppose that we have a huge list customerTover, of turnover amounts, one item per customer (we have more than 5,000 of them!). Each item contains a matrix having a varying number of rows (products) and 52 columns (weeks). Our task is to calculate the total average turnover per week per customer. No problem, that’s just (+/¨+⌿¨customerTover)÷52.

However, if customerTover is very large, and we do not have much workspace left, the above expression may easily cause a WS FULL error.

The reason is that the intermediate expression +⌿¨customerTover produces a list of 52 amounts per customer, and that may require more workspace than we have room for.

Instead, we can put the entire expression into a function. As is often the case in APL (and in programming, in general), the hardest part of writing a function is finding a good name for it. Fortunately, we can get by without a name if we use an anonymous dfn, with {(+/+⌿⍵)÷52}¨customerTover.

Because we have “isolated” the entire logical process in the function and used each to loop through the items one by one, we will at most have only one customer’s data “active” at any time, and each intermediate result (a 52-item vector) will be thrown away before recalculating that for the next customer. The result of each function call is just one number, so it is much less likely that we will run into WS FULL problems.
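Both formulations return the same values; only the size of the intermediate results differs. The sketch below uses a tiny made-up customerTover with two customers (all amounts are invented for illustration), so that the results are easy to verify by hand:

```apl
customerTover ← (2 52⍴1) (3 52⍴2)     ⍝ two made-up customers, for illustration only
(+/¨+⌿¨customerTover)÷52              ⍝ 2 6
{(+/+⌿⍵)÷52}¨customerTover            ⍝ 2 6
```

The first customer has 2 products at 1 per week, giving a weekly total of 2; the second has 3 products at 2 per week, giving 6.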

10.5.3. Three Compressions!

In the following we will show three expressions which look similar, but their results are very different. Let us first recall that vec5 consists of three vectors, each containing three items:

vec5

┌─────┬────────┬────────┐
│7 5 1│19 14 13│33 44 55│
└─────┴────────┴────────┘

What is the result of a compression?

1 0 1/vec5

┌─────┬────────┐
│7 5 1│33 44 55│
└─────┴────────┘

Above, the vector 1 0 1 applies to the three items of vec5, compressing out the middle one.

]display 1 0 1/vec5

┌→───────────────────┐
│ ┌→────┐ ┌→───────┐ │
│ │7 5 1│ │33 44 55│ │
│ └~────┘ └~───────┘ │
└∊───────────────────┘

As mentioned, the compression applies to the items of vec5, as it would to any vector. So, the second item has been removed.

If we use 1 0 1/¨vec5, do you think the result is the same? Are you sure? It is not displayed the same way:

1 0 1/¨vec5

┌─────┬┬────────┐
│7 5 1││33 44 55│
└─────┴┴────────┘

Things are different here: each item of 1 0 1 is paired with the corresponding sub-vector, like this:

• 1/7 5 1 gives 7 5 1;

• 0/19 14 13 gives ⍬; and

• 1/33 44 55 gives 33 44 55.

Thanks to ]display, we can see that the middle item is an empty numeric vector:

]display 1 0 1/¨vec5

┌→───────────────────────┐
│ ┌→────┐ ┌⊖┐ ┌→───────┐ │
│ │7 5 1│ │0│ │33 44 55│ │
│ └~────┘ └~┘ └~───────┘ │
└∊───────────────────────┘

There is a third way of using compress. If we enclose the left argument, the entire mask 1 0 1 is applied to each sub-vector. The second item of each sub-vector has been removed:

]display (⊂1 0 1)/¨vec5

┌→──────────────────────┐
│ ┌→──┐ ┌→────┐ ┌→────┐ │
│ │7 1│ │19 13│ │33 55│ │
│ └~──┘ └~────┘ └~────┘ │
└∊──────────────────────┘
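One way to read the enclosed mask is as a function that applies the same compression to every item. The following sketch compares the third form with an equivalent dfn, and confirms that it differs from the second form:

```apl
vec5 ← (7 5 1) (19 14 13) (33 44 55)
((⊂1 0 1)/¨vec5) ≡ {1 0 1/⍵}¨vec5     ⍝ 1: the enclosed mask is reused for every item
((⊂1 0 1)/¨vec5) ≡ 1 0 1/¨vec5        ⍝ 0: without the enclose, the mask is split up item by item
```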

10.6. Processing Nested Arrays

We have already seen a number of operations involving nested arrays; we shall explore some more in this section. Because nested arrays generally tend to have a rather simple - or at least uniform - structure, we can illustrate the operations using our little vectors.

You can refer to the earlier section concerning the application of scalar dyadic functions to nested arrays.

However, let us here explore again how each applies to scalar dyadic functions:

vec5

┌─────┬────────┬────────┐
│7 5 1│19 14 13│33 44 55│
└─────┴────────┴────────┘

vec5 + 100 20 1

┌───────────┬────────┬────────┐
│107 105 101│39 34 33│34 45 56│
└───────────┴────────┴────────┘

100, 20, and 1 are added to the three sub-vectors, respectively.

Using each, the result is still the same:

vec5 +¨ 100 20 1

┌───────────┬────────┬────────┐
│107 105 101│39 34 33│34 45 56│
└───────────┴────────┴────────┘

If we enclose the right argument, then 100 20 1 becomes a scalar, and gets added to each of the three sub-vectors:

vec5 +¨ ⊂100 20 1

┌────────┬─────────┬─────────┐
│107 25 2│119 34 14│133 64 56│
└────────┴─────────┴─────────┘

If we drop the each operator, the result is the same because the scalar on the right is extended to match the shape of the left vector:

vec5 + ⊂100 20 1

┌────────┬─────────┬─────────┐
│107 25 2│119 34 14│133 64 56│
└────────┴─────────┴─────────┘

In fact, each is a superfluous operator when used with scalar dyadic functions, because scalar dyadic functions are pervasive, as seen in a previous section.
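To check this claim on something other than vec5, here is a minimal sketch. The vector v is a hypothetical example that does not appear elsewhere in this chapter; each comparison uses match (≡) so that the answer is a single Boolean.

```apl
⍝ v is a hypothetical nested vector, chosen only for illustration
v ← (1 2 3) (4 5 6)

(v + 10 20) ≡ (v +¨ 10 20)               ⍝ 1: each adds nothing here
(v + ⊂10 20 30) ≡ (v +¨ ⊂10 20 30)       ⍝ 1: same with an enclosed right argument
(v + 10 20) ≡ (11 12 13) (24 25 26)      ⍝ 1: 10 went to the first item, 20 to the second
```

Each only makes a difference for functions that are not scalar functions, such as compress in the previous section.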

10.6.2. Juxtaposition vs Catenation#

When you catenate a number of arrays, for example v ← a,b,c, you create a new array with the contents of a, b, and c catenated together to make a single new array, as we have seen many times before.

Let us use a small vector and see how it works:

small ← 3 4 5

1 2,small,6 7

1 2 3 4 5 6 7

As we can see, the result is a simple vector.

What happens here is, of course, that the 3-item vector small and the 2-item vector 6 7 are first combined into one 5-item vector. Then, this 5-item vector is combined with the 2-item vector 1 2 to form the resulting 7-item vector. Both the final and the interim results are simple vectors.

We can now explain what happens when you juxtapose two or more arrays (strand notation), for example v ← a b c d e: each array is enclosed, and the resulting scalars are catenated together.

Such an expression produces a vector made of as many items as we have arrays on the right. In the example that follows, the result is a nested vector:

1 2 small 6 7

┌─┬─┬─────┬─┬─┐
│1│2│3 4 5│6│7│
└─┴─┴─────┴─┴─┘

This is what we call vector notation or strand notation. In this case, we juxtaposed five arrays, so we created a nested array of length five.

What happens here is that each of the five arrays is first enclosed, and then the resulting five scalars are catenated together to produce the 5-item vector. Please remember that enclosing a simple scalar does not change it, so you can only see the difference for the array small:

(1 2) small 6 7

┌───┬─────┬─┬─┐
│1 2│3 4 5│6│7│
└───┴─────┴─┴─┘

Here, we juxtaposed four arrays, two of which are vectors. It is, again, an example of strand notation.

In other words, juxtaposition works on arrays seen as building blocks, while catenation works on the contents of the arrays.

It may help you to know that there is a strict relationship between catenation and strand notation: a b c is the same as (⊂a),(⊂b),(⊂c).

Here is an example:

a ← ⍬
b ← 'apl'
c ← 42

a b c

┌┬───┬──┐
││apl│42│
└┴───┴──┘
(⊂a),(⊂b),(⊂c)

┌┬───┬──┐
││apl│42│
└┴───┴──┘

The two results look the same; we can be sure they are the same by using ≡:

a b c≡(⊂a),(⊂b),(⊂c)

1

Now, we will turn our attention to two other expressions that give the same result,

(1 2) small,6 7

┌───┬─────┬─┬─┐
│1 2│3 4 5│6│7│
└───┴─────┴─┴─┘

and

(1 2) small 6 7

┌───┬─────┬─┬─┐
│1 2│3 4 5│6│7│
└───┴─────┴─┴─┘

These two expressions give the same result, but for a different reason than the one explained above. In fact, small is not catenated to the vector 6 7 as in the first example above. To read this expression correctly, we must recall that the comma is an APL function:

• its right argument is the vector 6 7, of course; and

• its left argument is whatever is on its left, up to the next function. As there is no such function (parentheses are not functions), the left argument is the result of the entire expression to the left of the comma, i.e., the 2-item vector (1 2) small.

So, the result is that the 2-item vector (1 2) small is combined with the 2-item vector 6 7 to form the resulting 4-item vector.

Remember this: when interpreting an expression, you must never “break” a sequence of juxtaposed arrays (a strand), even if it is a nested vector.

So, in the previous example, the left argument to catenate is this whole array:

(1 2) small

┌───┬─────┐
│1 2│3 4 5│
└───┴─────┘

When catenate is executed, the two items of this argument are catenated to the two items 6 7 of the right argument, making the same 4-item nested vector as in the previous example.

Can you predict the result of (1 2),small 6 7?

10.6.3. Characters and Numbers#

We have a character matrix cm and a numeric matrix nm:

⎕← cm ← 3 7⍴'FrancisCarmen Luciano'

Francis
Carmen
Luciano
⎕RL ← 73
⎕← nm ← (?3 4⍴200000)÷100

  21.21 1534.88 375.46  704.5
1125.14 1963.52 464.45 1438.25
 796.53 1569    157.14  886.59

We would like to have them displayed side by side.

10.6.3.1. Solution 1#

The first idea is to just type cm nm:

cm nm

┌───────┬──────────────────────────────┐
│Francis│  21.21 1534.88 375.46  704.5 │
│Carmen │1125.14 1963.52 464.45 1438.25│
│Luciano│ 796.53 1569    157.14  886.59│
└───────┴──────────────────────────────┘

The format of the result is not ideal; some values have two decimal digits, and some have only one or none. But there is a much more important problem. Imagine that we would like to draw a line on the top of the report. We can catenate a single dash along the first dimension:

'-'⍪cm nm

┌─┬───────┬──────────────────────────────┐
│-│Francis│  21.21 1534.88 375.46  704.5 │
│ │Carmen │1125.14 1963.52 464.45 1438.25│
│ │Luciano│ 796.53 1569    157.14  886.59│
└─┴───────┴──────────────────────────────┘

This is not what we expected: the dash has been placed on the left, not on the top! The reason is that the expression cm nm does not produce a matrix, but a 2-item nested vector. And when one catenates a scalar to a vector, it is inserted before its first item or after the last one, to produce a longer vector. This cannot produce a matrix, unless laminate is used, but we shall not try that now.

10.6.3.2. Solution 2#

Well, if juxtaposition doesn’t achieve what we want, why shouldn’t we catenate our two matrices?

cm,nm

Francis   21.21 1534.88 375.46  704.5
Carmen  1125.14 1963.52 464.45 1438.25
Luciano  796.53 1569    157.14  886.59

This is almost the same presentation, but not exactly; this is a matrix!

Now, let us try to draw the line:

'-'⍪cm,nm

-------       -       -      -       -
Francis   21.21 1534.88 375.46  704.5
Carmen  1125.14 1963.52 464.45 1438.25
Luciano  796.53 1569    157.14  886.59

Horrible! What happened?

When we catenated cm (shape 3 7) with nm (shape 3 4), we produced a 3 by 11 matrix. So, when we further catenated a dash on top of it, the dash was repeated 11 times to fit the last dimension of the matrix. This is why we obtained 7 dashes on top of the 7 text columns, and 4 more dashes, one on top of each of the 4 numeric columns. This is still not what we want!

10.6.3.3. Solution 3#

The final solution will be the following: convert the numbers into text, using the format function, and then catenate one character matrix to another character matrix:

'-'⍪cm,9 2⍕nm

-------------------------------------------
Francis    21.21  1534.88   375.46   704.50
Carmen   1125.14  1963.52   464.45  1438.25
Luciano   796.53  1569.00   157.14   886.59

Now, the line is exactly where we want it and the numbers are nicely formatted.
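To see why the line now has the right length, we can look at the shapes involved. The sketch below re-creates cm, and builds nm directly from the values displayed above (rather than from the random generator), so that it can be run on its own.

```apl
cm ← 3 7⍴'FrancisCarmen Luciano'
⍝ nm is typed in from the display above instead of being generated with ?
nm ← 3 4⍴21.21 1534.88 375.46 704.5 1125.14 1963.52 464.45 1438.25 796.53 1569 157.14 886.59

⍴9 2⍕nm            ⍝ 3 36: four fields of width 9, now a character matrix
⍴cm,9 2⍕nm         ⍝ 3 43: 7 text columns followed by 36 formatted columns
⍴'-'⍪cm,9 2⍕nm     ⍝ 4 43: the dash is extended to a full row of 43 characters
```

Because every column of the catenated matrix is now a character column, the single dash is replicated once per character, which gives an unbroken line.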

Exercise 10.1

Deduce the results of the following 3 expressions (depth, rank, shape), and then verify your solutions on the computer:

(⊂cm) (⊂nm)
(⊂cm),(⊂nm)
cm,⊂nm


10.6.4. Some More Operations#

Let us use vec5 once more.

10.6.4.1. Reduction#

+/vec5

┌────────┐
│59 63 69│
└────────┘

Notice the box around the final result!

The three enclosed arrays (scalars) have been added together, and the result is therefore an enclosed array (a scalar). You can tell this from the output, because there is a box around the result.

We know that the reduction of a vector (rank 1) produces a scalar (rank 0), and this rule still applies here.

To obtain the contents of the (enclosed) vector, we must disclose the result:

⊃+/vec5

59 63 69

The same thing can be observed if we try to collect all the values contained in vec5 into a single vector, by catenating them together:

,/vec5

┌───────────────────────┐
│7 5 1 19 14 13 33 44 55│
└───────────────────────┘

It worked, but here again we might want to disclose the result:

⊃,/vec5

7 5 1 19 14 13 33 44 55
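If the box around the result is not convincing enough, shape and depth can confirm what reduction returned. In this sketch, the value of vec5 is inferred from its display earlier in this section.

```apl
vec5 ← (7 5 1) (19 14 13) (33 44 55)   ⍝ value inferred from the display of vec5

⍴+/vec5        ⍝ empty result: the reduction returned a scalar
≡+/vec5        ⍝ 2: that scalar is an enclosed simple vector
⍴⊃+/vec5       ⍝ 3: once disclosed, we get an ordinary 3-item vector
⍴⊃,/vec5       ⍝ 9: the same applies to the catenate reduction
```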

10.6.4.2. Index Of and Membership#

The function index of (dyadic ⍳) may be used to search for (find the position of) items in a nested vector:

vec5 ⍳ (19 14 13)(1 5 7)

2 4

This is correct: the first vector appears in vec5 as vec5[2], and the second vector is not present.

But beware, there is a booby trap:

vec5 ⍳ (19 14 13)

4 4 4

(19 14 13) is not a nested array. vec5 is searched for each of these three numbers individually, and they are not found.

To get the expected result, we need to enclose the right argument to index of:

vec5 ⍳ ⊂19 14 13

2

It is also important to be aware of this when using membership:

(3 4 5)(7 5 1) ∊ vec5

0 1
(7 5 1) ∊ vec5

0 0 0
(⊂7 5 1) ∊ vec5

1

10.6.4.3. Indexing#

The rules we saw about indexing remain true: when one indexes a vector by an array, the result has the same shape as the array. If the vector is nested, the result is generally nested too:

]display vec4

┌→───────────────────────────────┐
│ ┌→────┐       ┌→──┐ ┌→───────┐ │
│ │15 12│ 71023 ↓1 2│ │74 85 96│ │
│ └~────┘       │3 4│ └~───────┘ │
│               └~──┘            │
└∊───────────────────────────────┘
]display vec4[2 2⍴4 2 1 3]

┌→─────────────────┐
↓ ┌→───────┐       │
│ │74 85 96│ 71023 │
│ └~───────┘       │
│ ┌→────┐    ┌→──┐ │
│ │15 12│    ↓1 2│ │
│ └~────┘    │3 4│ │
│            └~──┘ │
└∊─────────────────┘

We have also seen, in Section 3.5.3, that a nested array can be used as an index. For example, to index items scattered throughout a matrix, the array that specifies the indices is composed of 2-item vectors (row and column indices):

⎕← tests ← 6 3⍴11 26 22 14 87 52 30 28 19 65 40 55 19 31 64 33 70 44

11 26 22
14 87 52
30 28 19
65 40 55
19 31 64
33 70 44
tests[(2 3)(5 1)(1 2)]

52 19 26
tests[2 2⍴(2 3)(5 1)(1 2)]

52 19
26 52

Let us try to obtain the same result with the index function, or squad:

(2 3)(5 1)(1 2) ⌷ tests

LENGTH ERROR
(2 3)(5 1)(1 2)⌷tests
               ∧


The above cannot work. Index expects a 2-item vector: a list of rows and a list of columns.

(2 3)(5 1)(1 2) ⌷¨ tests

RANK ERROR
(2 3)(5 1)(1 2)⌷¨tests
               ∧


This second attempt won’t work either: each cannot pair the items of the left argument with the items of tests, because the left argument is a 3-item vector whereas tests is a 6 by 3 matrix.

In order to get this to work, we need to enclose tests:

(2 3)(5 1)(1 2) ⌷¨ ⊂tests

52 19 26

This last expression worked correctly. Each pair of indices is paired with tests as a whole because it has been enclosed, and therefore the scalar on the right is extended to match the 3-item vector on the left.
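The two approaches can be compared directly. The sketch below also shows what index does with a 2-item left argument, namely a rectangular selection of rows and columns; the name idx is introduced here only for convenience.

```apl
tests ← 6 3⍴11 26 22 14 87 52 30 28 19 65 40 55 19 31 64 33 70 44
idx ← (2 3)(5 1)(1 2)                   ⍝ hypothetical name for the scattered indices

tests[idx] ≡ idx ⌷¨ ⊂tests              ⍝ 1: same three scattered items
((2 5)(3 1)⌷tests) ≡ 2 2⍴52 14 64 19    ⍝ 1: rows 2 5 crossed with columns 3 1
```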

Always keep in mind the following rules:

• The items of a nested array are scalars and are therefore always processed as scalars.

In the expression below,

(5 6)(4 2)×10 5

┌─────┬─────┐
│50 60│20 10│
└─────┴─────┘

(5 6) is multiplied by 10 and (4 2) is multiplied by 5.

• A single list of values placed between parentheses is not a nested array:

(45 77 80)

45 77 80

The parentheses do nothing here.

• An expression is always evaluated from right to left, one function at a time. Note that strands can be easy to miss when determining what the left argument of a function is.

In the expression 2×a 3+b, the left argument of the plus function is not 3 alone, but the vector a 3.
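A small worked example may help. The values given to a and b below are hypothetical scalars, chosen only so that the arithmetic is easy to follow by hand.

```apl
a ← 10 ⋄ b ← 1      ⍝ hypothetical values, for illustration only

2×a 3+b             ⍝ 22 8: the strand a 3 is the left argument of +, giving 11 4
2×a (3+b)           ⍝ 20 8: with parentheses, only 3 is added to b, giving the strand 10 4
```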

Before we go any further with nested arrays, we recommend that you try to solve some exercises.

10.7. Intermission Exercises#

You are given three numeric vectors:

a ← 1 2 3
b ← 4 5 6
c ← 7 8 9


Exercise 10.2

Try to predict the results given by the following expressions in terms of depth, rank, and shape. Then check your results using ]display, or the appropriate primitives.

1. a b c × 1 2 3

2. (10 20),a

3. (10 20),a b

4. a b 2 × c[2]

5. 10×a 20×b

Exercise 10.3

Same question for the following expressions:

1. +/a b c

2. +/¨a b c

3. 1 0 1/¨a b c

4. (a b c)⍳(4 5 6)

5. 1 10 3 ∊ a

6. (⊂1 0 1)/¨a b c

7. 1 10 3 ∊ a b c

Exercise 10.4

What are the results of +/na and ,/na for the vector na shown below?

⎕← na ← 1 2 (2 2⍴3 4 5 6)7 8

┌─┬─┬───┬─┬─┐
│1│2│3 4│7│8│
│ │ │5 6│ │ │
└─┴─┴───┴─┴─┘

10.8. Split and Mix#

We saw that in some cases we can choose to represent data either as a matrix or as a nested vector; remember monMat and monVec.

Two primitive monadic functions are provided to switch from one form to the other:

• Mix (↑) returns an array of higher rank and lower depth than that of its argument; and

• Split (↓) returns an array of lower rank and higher depth than that of its argument.

10.8.1. Basic Use#

Let us apply mix to two small vectors:

vtex ← 'One' 'Two' 'Three'
vnum ← (6 2) 14 (7 5 3)
⎕← rtex ← ↑ vtex

One
Two
Three

Notice how we have converted a nested vector (of depth 2 and rank 1) into a simple matrix (of depth 1 and rank 2).

⎕← rnum ← ↑ vnum

 6 2 0
14 0 0
 7 5 3

In this example, we have converted a nested vector (of depth ¯2 and rank 1) into a simple matrix (of depth 1 and rank 2). The depth is negative because the items of vnum are not all at the same level of nesting.

Of course the operation is possible only because the shorter items are padded with blanks (for text) or zeroes (for numbers), or more generally by the appropriate fill item (this notion will be explained soon).

The last example above shows that when we say that the depth is reduced, we actually mean that the magnitude of the depth is reduced.

And now, let us apply split to the matrices we have just produced:

⎕← newtex ← ↓rtex

┌─────┬─────┬─────┐
│One  │Two  │Three│
└─────┴─────┴─────┘

We converted a simple matrix (of depth 1 and rank 2) into a nested vector (of depth 2 and rank 1).

⎕← newnum ← ↓rnum

┌─────┬──────┬─────┐
│6 2 0│14 0 0│7 5 3│
└─────┴──────┴─────┘

Note that the two new vectors (newtex and newnum) are not identical to the original ones (vtex and vnum) because, when they were converted into the matrices rtex and rnum, the shorter items were padded. When one splits a matrix, the items of the result all have the same size.
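We can verify this with match. The sketch below also shows that the round trip through mix and split has the same effect as overtaking every item to the length of the longest one.

```apl
vtex ← 'One' 'Two' 'Three'
vnum ← (6 2) 14 (7 5 3)

vtex ≡ ↓↑vtex           ⍝ 0: 'One' and 'Two' came back padded with blanks
(5↑¨vtex) ≡ ↓↑vtex      ⍝ 1: each item has been extended to 5 characters
(3↑¨vnum) ≡ ↓↑vnum      ⍝ 1: each item has been extended to 3 numbers
```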

10.8.1.1. Mix Applied to Heterogeneous Data#

The examples shown above represent very common uses of mix and split. However, it is of course also possible to apply the functions to heterogeneous data.

For example, we can mix text and numbers:

↑'Mixed' (11 43)

 M  i x e d
11 43 0 0 0

And we can also mix a simple vector with a nested one. As expected, the result below is a 2 by 3 matrix:

↑ 'Yes' ('Oui' 'Da' 'Si')

┌───┬──┬──┐
│Y  │e │s │
├───┼──┼──┤
│Oui│Da│Si│
└───┴──┴──┘

10.8.2. Axis Specification#

10.8.2.1. Split#

When we apply the function split to an array, its rank will decrease, so we must specify which of its dimensions is to be suppressed. If we don’t specify it explicitly, the default is to suppress the last dimension.

Let us work on chemistry, a matrix we used earlier:

⎕← chemistry ← 3 5⍴'H2SO4CaCO3Fe2O3'

H2SO4
CaCO3
Fe2O3

In this case, there are two possible uses of split: we can apply it either to the first dimension or to the second dimension.

If we specify the first axis, the matrix is split column-wise:

↓[1]chemistry

┌───┬───┬───┬───┬───┐
│HCF│2ae│SC2│OOO│433│
└───┴───┴───┴───┴───┘

If we specify the second axis, the matrix is split row-wise:

↓[2]chemistry

┌─────┬─────┬─────┐
│H2SO4│CaCO3│Fe2O3│
└─────┴─────┴─────┘

If we omit the axis specification, split defaults to the last axis:

↓chemistry

┌─────┬─────┬─────┐
│H2SO4│CaCO3│Fe2O3│
└─────┴─────┴─────┘
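For a matrix, the two possible splits are related by transposition, as the following sketch shows. It assumes the default index origin (⎕IO is 1), as everywhere else in this chapter.

```apl
chemistry ← 3 5⍴'H2SO4CaCO3Fe2O3'

(↓[1]chemistry) ≡ ↓⍉chemistry     ⍝ 1: splitting the columns is splitting the rows of the transpose
(↓[2]chemistry) ≡ ↓chemistry      ⍝ 1: the last axis is the default
≢¨↓[1]chemistry                   ⍝ 3 3 3 3 3: five items, one per column, each of length 3
```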

10.8.2.2. Mix#

The use of mix is a bit more complex because it adds a new dimension to an existing array. So does the function laminate, and the two functions use the same convention to specify where to insert the new dimension.

If we apply the function mix to a 3-item nested vector of vectors, in which the largest item is an enclosed 5-item vector, the result must be either a 5 by 3 matrix, or a 3 by 5 matrix (the default).

In the same way as for laminate, a new dimension is created. This new dimension can be inserted before or after the existing dimension. The programmer decides this by specifying an axis:

• [0.5] inserts the new dimension before the existing one, resulting in a 5 by 3 matrix; or

• [1.5] inserts the new dimension after the existing one, resulting in a 3 by 5 matrix.

↑[0.5]'One' 'Two' 'Three'

OTT
nwh
eor
  e
  e
↑[1.5]'One' 'Two' 'Three'

One
Two
Three

The last example is the default behaviour, where the new dimension is inserted after the existing one:

↑'One' 'Two' 'Three'

One
Two
Three

Let us now work with a nested matrix:

⎕← friends ← 2 3⍴'John' 'Mike' 'Anna' 'Noah' 'Suzy' 'Paul'

┌────┬────┬────┐
│John│Mike│Anna│
├────┼────┼────┤
│Noah│Suzy│Paul│
└────┴────┴────┘

The shape of this matrix is 2 3, and its items are all of length 4. So, mix can produce three different results, according to axis specifications as follows:

With the axis    the new dimension is inserted    and the resulting shape is
[2.5]            after 2 3                        2 3 4
[1.5]            between 2 and 3                  2 4 3
[0.5]            before 2 3                       4 2 3

Each of these three cases is illustrated below.

↑[2.5]friends    ⍝ Default case, [2.5] was unnecessary.

John
Mike
Anna

Noah
Suzy
Paul
⍴↑[2.5]friends

2 3 4
↑[1.5]friends

JMA
oin
hkn
nea

NSP
oua
azu
hyl
⍴↑[1.5]friends

2 4 3
↑[0.5]friends

JMA
NSP

oin
oua

hkn
azu

nea
hyl
⍴↑[0.5]friends

4 2 3

In the first example, the names are placed “horizontally” as rows in two sub-matrices.

In the second case, they are placed “vertically” in columns.

The third case is more difficult to read; the names are positioned perpendicularly to the matrices, with one letter in each. You might like to imagine that the letters are arranged in a cube, and that you are viewing it from three different positions.

Notice that, naturally, there is a connection between using ↑[k] and using mix followed by dyadic transpose.

The table above has shown that the main difference between the default mix and mix with axis is the place where the new axis is inserted into the shape of the result. Therefore, one can always use dyadic transpose after mix to move the new axis to the intended position.

Let us revisit the examples above using friends.

↑[0.5]friends will have a resulting shape of 4 2 3, while ↑friends has a shape of 2 3 4. Therefore, dyadic transpose needs to move the last axis of ↑friends to the front:

2 3 1⍉↑friends

JMA
NSP

oin
oua

hkn
azu

nea
hyl
(↑[0.5]friends)≡2 3 1⍉↑friends

1

Recall that the left argument of dyadic transpose tells you the position to which each axis goes. If la is the left argument of dyadic transpose, la ← 2 3 1, then la[1] tells us where the 1st axis goes, la[2] tells us where the 2nd axis goes, and la[3] tells us where the 3rd (and last) axis goes.

Because la[3] is 1, we know that the last axis (which was created by mix) will now become the first axis, and the axes that were in positions 1 and 2 will move one position down, to 2 and 3.

Similarly, we can determine what should be the left argument to dyadic transpose if we were to use it instead of doing ↑[1.5]friends. With ↑[1.5], we want the new axis to go in the middle. If we work from ↑friends, the last axis in ↑friends needs to go to position 2, so we have la ← ? ? 2. We just have to fill in the rest of the left argument, making sure that the original axes remain ordered:

la ← 1 3 2
la⍉↑friends

JMA
oin
hkn
nea

NSP
oua
azu
hyl
(↑[1.5]friends)≡la⍉↑friends

1

10.9. Type, Prototype, Fill Item#

Some operations like expand or take may insert new additional items into an array. Up to now, things were simple; numeric arrays were expanded with zeroes and character arrays were expanded with blanks. But what will happen if the array contains both numbers and characters (a mixed array), or if it is a nested array?

We need a variable to experiment a little:

⎕← hogwash ← 19 (2 2⍴⍳4) (3 1⍴'APL') (2 2⍴5 8 'Nuts' 9)

┌──┬───┬─┬────────┐
│19│1 2│A│┌────┬─┐│
│  │3 4│P││5   │8││
│  │   │L│├────┼─┤│
│  │   │ ││Nuts│9││
│  │   │ │└────┴─┘│
└──┴───┴─┴────────┘

What would be the result of expressions like 6↑hogwash or 1 1 0 1 0 1\hogwash?

In general, when expanding an array, APL inserts fill items, and it does so using the prototype of the array. In order to understand what the prototype of hogwash is, we first need to understand what the type of an array is.

Definition

The type of an array is an array with the exact same structure (shape, rank, and depth, for all levels of nesting) in which all numbers are replaced by zeroes and all characters are replaced by blanks.

For example, here is the type of hogwash:

⎕← hogwashType ← 0 (2 2⍴0) (3 1⍴' ') (2 2⍴0 0 '    ' 0)

┌─┬───┬─┬────────┐
│0│0 0│ │┌────┬─┐│
│ │0 0│ ││0   │0││
│ │   │ │├────┼─┤│
│ │   │ ││    │0││
│ │   │ │└────┴─┘│
└─┴───┴─┴────────┘

As we can (not) see, the type of a nested array may be difficult to interpret because of the invisible blanks:

]display hogwashType

┌→─────────────────────────┐
│   ┌→──┐ ┌→┐ ┌→─────────┐ │
│ 0 ↓0 0│ ↓ │ ↓          │ │
│   │0 0│ │ │ │ 0      0 │ │
│   └~──┘ │ │ │          │ │
│         └─┘ │ ┌→───┐   │ │
│             │ │    │ 0 │ │
│             │ └────┘   │ │
│             └∊─────────┘ │
└∊─────────────────────────┘

Having defined what the type of an array is, we can define what the prototype of an array is:

Definition

The prototype of an array is the type of its first item. In other words, the prototype of an array is its first item, in which all numbers are replaced by zeroes and all characters are replaced by blanks.

The prototype of an array is used as a fill item whenever an operation needs to create additional items.

The first item of hogwash is a number, so the prototype of hogwash is a single zero. If we lengthen the vector using overtake, it will be padded with zeroes (fill items):

6↑hogwash

┌──┬───┬─┬────────┬─┬─┐
│19│1 2│A│┌────┬─┐│0│0│
│  │3 4│P││5   │8││ │ │
│  │   │L│├────┼─┤│ │ │
│  │   │ ││Nuts│9││ │ │
│  │   │ │└────┴─┘│ │ │
└──┴───┴─┴────────┴─┴─┘

Similarly, if we expand the array, the new items will also be zeroes:

1 1 0 1 0 1\hogwash

┌──┬───┬─┬─┬─┬────────┐
│19│1 2│0│A│0│┌────┬─┐│
│  │3 4│ │P│ ││5   │8││
│  │   │ │L│ │├────┼─┤│
│  │   │ │ │ ││Nuts│9││
│  │   │ │ │ │└────┴─┘│
└──┴───┴─┴─┴─┴────────┘

Let us rotate the vector by one position:

hogwash ← 1⌽hogwash


Now, the first item is a numeric matrix:

⊃hogwash

1 2
3 4

Therefore, the prototype of hogwash is now

2 2⍴0

0 0
0 0

If we take six items from hogwash, two such matrices will be added:

6↑hogwash

┌───┬─┬────────┬──┬───┬───┐
│1 2│A│┌────┬─┐│19│0 0│0 0│
│3 4│P││5   │8││  │0 0│0 0│
│   │L│├────┼─┤│  │   │   │
│   │ ││Nuts│9││  │   │   │
│   │ │└────┴─┘│  │   │   │
└───┴─┴────────┴──┴───┴───┘

Let us rotate the variable once more:

hogwash ← 1⌽hogwash


Now, the first item is a little 3 by 1 character matrix containing the letters 'APL'. So, the prototype will be a 3 by 1 character matrix containing three blank spaces. This is the array that will be used by expand as the fill item. Let us verify it:

]display 1 1 0 1 0 1\hogwash

┌→──────────────────────────────────┐
│ ┌→┐ ┌→─────────┐ ┌→┐    ┌→┐ ┌→──┐ │
│ ↓A│ ↓          │ ↓ │ 19 ↓ │ ↓1 2│ │
│ │P│ │ 5      8 │ │ │    │ │ │3 4│ │
│ │L│ │          │ │ │    │ │ └~──┘ │
│ └─┘ │ ┌→───┐   │ └─┘    └─┘       │
│     │ │Nuts│ 9 │                  │
│     │ └────┘   │                  │
│     └∊─────────┘                  │
└∊──────────────────────────────────┘

If we repeat the rotation, the first item will be a nested matrix. So, the prototype (and hence, also the fill item) will be a 2 by 2 nested matrix. Let us try to overtake again:

hogwash ← 1⌽hogwash

]display 6↑hogwash

┌→────────────────────────────────────────────────────┐
│ ┌→─────────┐    ┌→──┐ ┌→┐ ┌→─────────┐ ┌→─────────┐ │
│ ↓          │ 19 ↓1 2│ ↓A│ ↓          │ ↓          │ │
│ │ 5      8 │    │3 4│ │P│ │ 0      0 │ │ 0      0 │ │
│ │          │    └~──┘ │L│ │          │ │          │ │
│ │ ┌→───┐   │          └─┘ │ ┌→───┐   │ │ ┌→───┐   │ │
│ │ │Nuts│ 9 │              │ │    │ 0 │ │ │    │ 0 │ │
│ │ └────┘   │              │ └────┘   │ │ └────┘   │ │
│ └∊─────────┘              └∊─────────┘ └∊─────────┘ │
└∊────────────────────────────────────────────────────┘

Obviously, fill items are generally only useful for arrays whose items have a uniform structure.
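By way of contrast, here is a sketch with a vector whose items all have the same structure, so that the fill item looks just like a genuine item. The vector pairs is hypothetical and is not used elsewhere in this chapter.

```apl
pairs ← (1 2) (3 4) (5 6)      ⍝ hypothetical vector of 2-item numeric vectors

(5↑pairs) ≡ (1 2) (3 4) (5 6) (0 0) (0 0)     ⍝ 1: overtake pads with 0 0
(1 0 1 1\pairs) ≡ (1 2) (0 0) (3 4) (5 6)     ⍝ 1: expand inserts 0 0
```

In both cases the fill item 0 0 is the type of the first item, 1 2.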

We will talk a bit about computing the type and prototype of arrays in Section 10.16.2.

10.10. Pick#

10.10.1. Definition#

Whenever you need to select one (and only one) item from an array, you can use the dyadic function pick, represented by the symbol ⊃. What makes pick different from ordinary indexing is that it is possible to “dig into” a nested array and pick an item at any level of nesting, and that it discloses the result. The latter is probably the reason why pick and the monadic function disclose use the same symbol.

The syntax of pick is as follows: r ← path ⊃ data.

The left argument is a scalar or a vector which specifies the path that leads to the desired item. Each item of path is the index or set of indices needed to reach the item at the corresponding level of depth of the array.

The operation starts at the outermost level and goes deeper and deeper into the levels of nesting. At each level, the selected item is disclosed before applying the next level of selection.

We shall work with the nested matrix weird from a previous section:

⎕← weird ← 2 2⍴456 (2 2⍴ 'Dyalog' 44 27 (2 2⍴8 6 2 4)) (17 51) 'Twisted'

┌─────┬────────────┐
│456  │┌──────┬───┐│
│     ││Dyalog│44 ││
│     │├──────┼───┤│
│     ││27    │8 6││
│     ││      │2 4││
│     │└──────┴───┘│
├─────┼────────────┤
│17 51│Twisted     │
└─────┴────────────┘

Let us try to select the value 51.

To select the 51 we must first select the vector located in row 2, column 1 of the matrix, and then select the second item of that vector. This is how we express this selection using pick:

(2 1) 2 ⊃ weird

51

The left argument (2 1) 2 is a 2-item vector because we need to select at two levels of nesting.

Using simple indexing and explicit disclosing we need a much more complicated expression to obtain the same selection:

⊃(⊃weird[2;1])[2]

51

Although, to be fair, in this special case the leftmost ⊃ was not required. (Can you figure out why?)

We can also select the letter “g” within “Dyalog”. To do so, we must first select the matrix located in row 1, column 2. Within this matrix, we must select the character vector located in row 1, column 1. Finally, we must select the 6th item of that character vector:

(1 2) (1 1) 6 ⊃ weird

g

This time, the left argument is a 3-item vector because we need to select at three levels of nesting:

• (1 2) is the set of indices for the selection at the outermost level of depth;

• (1 1) is the set of indices for the selection at the second level of depth; and

• 6 is the index for the selection at the third level of depth.

Using simple indexing, this selection is almost obscure:

⊃(⊃(⊃weird[1;2])[1;1])[6]

g

10.10.2. Left Argument Length#

The left argument to pick is a vector with as many items as the depth at which we want to select an item. Each item of the left argument has a number of items corresponding to the rank of the sub-item at the corresponding depth at which it operates.

If we remove the last item of path in the example above, the selection will stop one level above the level at which it stopped before. This means that we would select the entire character vector 'Dyalog' instead of just the letter 'g':

(1 2) (1 1) ⊃ weird

Dyalog

Yes, we selected the entire character vector. Please, note again that the result has been disclosed, so that a simple array is returned in this case, instead of a scalar which is an enclosed vector.

The difference becomes more clear if we compare this with the equivalent simple indexing without the final disclose:

(⊃weird[1;2])[1;1]

┌──────┐
│Dyalog│
└──────┘

We tried removing the last item of path, but what happens if we instead remove the last two items? We might then expect to select the entire 2 by 2 nested matrix that contains the character vector 'Dyalog':

(1 2) ⊃ weird

RANK ERROR
(1 2)⊃weird
     ∧


But it does not work!

The reason for this is a problem that we have seen before:

In the expression (1 2) (1 1) ⊃ weird, the item (1 2) is a scalar (an enclosed vector) because of strand notation. The left argument to pick has two items, because we want to select an item at the second level.

In the expression (1 2) ⊃ weird, we do not have a strand, so the argument (1 2) is not enclosed. It is a (simple) 2-item vector and, therefore, it describes a selection that goes two levels deep, with a single index at each level. The RANK ERROR is reported because we try to use a scalar 1 as an index at the outermost level. However, at this level the array is a matrix, so two items are needed to form a proper index.

We want to select at the outermost level, so the left argument to pick must have exactly one item. Therefore, we must explicitly enclose the vector, leading to the correct expression:

(⊂1 2) ⊃ weird

┌──────┬───┐
│Dyalog│44 │
├──────┼───┤
│27    │8 6│
│      │2 4│
└──────┴───┘

We still need two indices inside the enclosure because, at the outermost level, the array is a matrix.

The expression we used before (without the explicit enclose) is inappropriate for the array weird, but it could work fine with a different array; for example, to take the first item of a nested vector, and then select the second item of it, as shown here:

1 2⊃'Madrid' 'New York' 'London'

a

The 1 selects 'Madrid', and the 2 then selects the 'a'.

In this expression, an enclose would be wrong, as we need to select at two levels. However, at each level we only need one index, as we select from vectors at both levels.
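To make the role of each item of the path explicit, we can follow the path to the letter 'g' one level at a time. This is only a sketch: the names level1 and level2 are introduced here for illustration. Note that each single-level selection from a matrix needs an enclosed pair of indices, for the reason explained above.

```apl
weird ← 2 2⍴456 (2 2⍴'Dyalog' 44 27 (2 2⍴8 6 2 4)) (17 51) 'Twisted'

level1 ← (⊂1 2)⊃weird       ⍝ the 2 by 2 nested matrix in row 1, column 2
level2 ← (⊂1 1)⊃level1      ⍝ the character vector 'Dyalog'
6⊃level2                    ⍝ g

((1 2)(1 1) 6⊃weird) ≡ 6⊃level2     ⍝ 1: the full path gives the same item
```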

10.10.3. Disclosed Result#

As mentioned previously, pick returns the contents of the specified item, not the scalar which contains it.

Let us return to the original value of hogwash (i.e., its value before we rotated it):

hogwash ← 19 (2 2⍴⍳4) (3 1⍴'APL') (2 2⍴5 8 'Nuts' 9)


Because boxing is ON, we can readily tell the difference between

2⊃hogwash

1 2
3 4

and

hogwash[2]

┌───┐
│1 2│
│3 4│
└───┘

However, if boxing is OFF, we might make the mistake of believing that the two results are equal:

]box off

Was ON
2⊃hogwash

1 2
3 4
hogwash[2]

 1 2
 3 4

Because boxing is OFF, the two results look very similar. (An attentive reader will notice that the result of hogwash[2] is indented one space to the right, which indicates one level of nesting.) This is deceptive:

• the first expression (2⊃hogwash) returns the 2 by 2 matrix contained in hogwash:

⍴2⊃hogwash

2 2

• while the other expression merely returns the second item of hogwash, which is an enclosed matrix, that is, a scalar, so its shape is empty:

⍴hogwash[2]


To prevent us from shooting ourselves in the foot, let us turn boxing back ON:

]box on

Was OFF

10.10.4. Pick First#

We have not mentioned this before (because up to now we have only used it on 1-item arrays), but disclose ⊃ actually discloses just the first item of an array. All other items are ignored. In other words, disclose ⊃array is the same as 1⊃,array. For this reason, the function ⊃ is also called first:

⊃26 (10 20 30) 100

26
⊃'January' 'February' 'March'

January
⊃2 2⍴'Dyalog' (2 2⍴⍳4) 'APL' 100

Dyalog
⊃12

12
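The equivalence between ⊃array and 1⊃,array can be checked with match. The matrix m below is simply the one used in the third example above, given a (hypothetical) name.

```apl
m ← 2 2⍴'Dyalog' (2 2⍴⍳4) 'APL' 100    ⍝ the matrix from the example above

(⊃m) ≡ 1⊃,m        ⍝ 1: first is pick applied to the ravelled array
(⊃12) ≡ 1⊃,12      ⍝ 1: this also holds for a simple scalar
```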

10.10.5. Selective Assignment#

When one wants to modify an item deep inside an array, it is important to remember that pick returns a disclosed result.

For example, let us try to replace the number 5 with the character vector 'Five' in the fourth item of hogwash.

If we wanted to extract the value 5, we would just write

4 (1 1)⊃hogwash

5

To replace it, we use the same expression in a normal selective assignment:

(4 (1 1)⊃hogwash) ← 'Five'
hogwash

┌──┬───┬─┬────────┐
│19│1 2│A│┌────┬─┐│
│  │3 4│P││Five│8││
│  │   │L│├────┼─┤│
│  │   │ ││Nuts│9││
│  │   │ │└────┴─┘│
└──┴───┴─┴────────┘

And it works, though we haven’t enclosed the replacement value! Going back is just as easy:

(4 (1 1)⊃hogwash) ← 5
hogwash

┌──┬───┬─┬────────┐
│19│1 2│A│┌────┬─┐│
│  │3 4│P││5   │8││
│  │   │L│├────┼─┤│
│  │   │ ││Nuts│9││
│  │   │ │└────┴─┘│
└──┴───┴─┴────────┘
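It is instructive to compare this with bracket indexing, where the replacement value must be enclosed because it has to fit into a single scalar position. The sketch below works on two copies of hogwash, hypothetically named h1 and h2, so that the original is left untouched.

```apl
hogwash ← 19 (2 2⍴⍳4) (3 1⍴'APL') (2 2⍴5 8 'Nuts' 9)
h1 ← h2 ← hogwash        ⍝ two working copies

(2⊃h1) ← 'two'           ⍝ with pick, the new contents are given as they are
h2[2] ← ⊂'two'           ⍝ with brackets, the new contents must be enclosed

h1 ≡ h2                  ⍝ 1: both assignments had the same effect
```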

10.10.6. An Idiom#

Suppose you have a nested vector:

nv ← (3 7 5)(9 7 2 8)(1 6)(2 0 8)


You can select one of its items with:

2⊃nv

9 7 2 8

But how can you select two (or more) items? For example, the 2nd and the 4th items?

2 4⊃nv

8

This does not work; it selects only one item: the 4th item of the 2nd item, which is the number 8 in this case.

Maybe we can use each ⊃¨ to pick each of the items we want?

2 4⊃¨nv

LENGTH ERROR
2 4⊃¨nv
   ∧


This gives a LENGTH ERROR because ¨ is trying to pair each of the two numbers on the left with an item on the right, but nv has a total of four items.

In order to fix this, we need to enclose nv so that ¨ knows to pair each number on the left with the whole vector nv:

2 4⊃¨⊂nv

┌───────┬─────┐
│9 7 2 8│2 0 8│
└───────┴─────┘

This expression is known as the “chipmunk idiom”, probably because of the eyes and moustaches of the combined symbol: ⊃¨⊂.
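When only whole items are selected, as here, the idiom returns the same result as ordinary bracket indexing. Its real benefit appears when the paths go deeper, as the second expression in this sketch suggests.

```apl
nv ← (3 7 5)(9 7 2 8)(1 6)(2 0 8)

(2 4⊃¨⊂nv) ≡ nv[2 4]             ⍝ 1: for whole items, the two notations agree
((2 4)(4 1)⊃¨⊂nv) ≡ 8 2          ⍝ 1: 4th item of the 2nd item, 1st item of the 4th item
```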

10.11. Reach Indexing#

10.11.1. Relationship to Pick#

The way in which you can use pick to access elements from a nested array is very similar to another indexing notation that is called reach indexing. Unlike simple indexing and choose indexing, which only let you access the scalars of an array, reach indexing can be used to index into arbitrary levels of depth of nested arrays. Hence, its name.

In reach indexing, the index specification is given by a non-simple integer array, each of whose items reach down to a nested element of the array being indexed. As we will see, each of those items works in the same way as the left argument of pick.

Recall the nested array weird:

weird

┌─────┬────────────┐
│456  │┌──────┬───┐│
│     ││Dyalog│44 ││
│     │├──────┼───┤│
│     ││27    │8 6││
│     ││      │2 4││
│     │└──────┴───┘│
├─────┼────────────┤
│17 51│Twisted     │
└─────┴────────────┘

We learned that, to access the nested character vector 'Twisted', we could pick it with the left argument (⊂2 2). Similarly, we can access the integer 44 by picking it with the left argument (1 2) (1 2):

(⊂2 2)⊃weird

Twisted
(1 2)(1 2)⊃weird

44

To pick both in a single expression, we need to use the idiom we just learned:

(⊂2 2)((1 2)(1 2)) ⊃¨⊂ weird

┌───────┬──┐
│Twisted│44│
└───────┴──┘

By using reach indexing, we just need to take the left argument of ⊃¨⊂ and put it inside square brackets:

weird[(⊂2 2)((1 2)(1 2))]

┌───────┬──┐
│Twisted│44│
└───────┴──┘

One key difference between reach indexing and using pick (or the idiom) is that pick will disclose the result:

(⊂2 2)⊃weird

Twisted

Whereas reach indexing doesn’t:

weird[⊂(2 2)]

┌───────┐
│Twisted│
└───────┘
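The relationship can be checked directly. In this sketch, weird is rebuilt from its display so that the snippet runs on its own, and idx is just a name chosen here for the index vector:

```apl
weird ← 2 2⍴456 (2 2⍴'Dyalog' 44 27 (2 2⍴8 6 2 4)) (17 51) 'Twisted'
idx ← (⊂2 2)((1 2)(1 2))
weird[idx] ≡ idx ⊃¨⊂ weird        ⍝ → 1: reach indexing matches the idiom
weird[⊂2 2] ≡ ⊂ (⊂2 2)⊃weird      ⍝ → 1: reach keeps the enclosure that pick removes
```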

10.11.2. Reach Versus Choose Indexing#

In some situations, indices for reach indexing can look like indices for choose indexing. For example, to pick the character vector 'Dyalog' from the array weird, we do

(1 2)(1 1)⊃weird

Dyalog

Thus, one might think that weird[(1 2)(1 1)] uses reach indexing to fetch that same character vector (but enclosed). Alas, this doesn’t work. Or at least, not in the intended way:

weird[(1 2)(1 1)]

┌────────────┬───┐
│┌──────┬───┐│456│
││Dyalog│44 ││   │
│├──────┼───┤│   │
││27    │8 6││   │
││      │2 4││   │
│└──────┴───┘│   │
└────────────┴───┘

The result is not what we expected because (1 2)(1 1) was interpreted as an index vector with two scalars, whereas we wanted it to refer to a single element of the array weird. To fix this, we have to enclose that vector to make it a scalar:

weird[⊂(1 2)(1 1)]

┌──────┐
│Dyalog│
└──────┘

In reach indexing, each scalar of the index array reaches to a single item, so (1 2)(1 1) can be seen as an index vector for choose indexing (for the scalars weird[1;2] and weird[1;1]) or an index vector for reach indexing that accesses the same values.

Thus, we can see that reach indexing and choose indexing overlap, but when they do, both schemes interpret the indices in the same way.
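The two readings of the same index can be compared side by side. This is only a sketch, with weird rebuilt from its display so the lines run on their own:

```apl
weird ← 2 2⍴456 (2 2⍴'Dyalog' 44 27 (2 2⍴8 6 2 4)) (17 51) 'Twisted'
weird[(1 2)(1 1)] ≡ weird[1;2],weird[1;1]   ⍝ → 1: two items, two separate indices
weird[⊂(1 2)(1 1)] ≡ ⊂'Dyalog'              ⍝ → 1: one scalar item, one reach path
```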

10.12. Partitioned Enclose & Partition#

10.12.1. Partitioned Enclose#

The primitive function partitioned enclose is the dyadic use of the left shoe ⊂. It is used to group the items of an array into a vector of nested items, or enclosures, according to a specified pattern. It is used as r ← pattern ⊂ array, or optionally with an axis specification: r ← pattern ⊂[axis] array.

Partitioned enclose breaks up the right argument array into nested items, as determined by the left argument pattern.

10.12.1.1. Simple Boolean Vector Left Argument#

Let us start by understanding how partitioned enclose works when the left argument pattern is a simple Boolean vector:

1 0 0 1 0 0 0 0 0 ⊂ 'Partition'

┌───┬──────┐
│Par│tition│
└───┴──────┘
1 0 0 1 0 0 1 0 0  ⊂ 'Partition'

┌───┬───┬───┐
│Par│tit│ion│
└───┴───┴───┘

The two examples seem to show that the 1s in the left argument specify where new enclosures of the right argument start. The 0s just put the corresponding elements in the preceding enclosure.

Notice that, as soon as we start the last enclosure (with the last 1), the trailing 0s are irrelevant. Thus, we can safely omit them from the left argument:

1 0 0 1 0 0 1 ⊂ 'Partition'

┌───┬───┬───┐
│Par│tit│ion│
└───┴───┴───┘

Again, we can omit trailing zeroes, but we do not have to. In fact, in older versions of Dyalog APL, partitioned enclose expects the trailing zeroes to be present. In other words, the ability to not specify trailing zeroes was an extension to partitioned enclose that was introduced after partitioned enclose had been in the language.

We have seen what we can do about trailing zeroes. It is also important to understand what happens when the left argument has leading zeroes:

0 0 1 0 0 1 0 0 1 ⊂ 'Partition'

┌───┬───┬─┐
│rti│tio│n│
└───┴───┴─┘

Leading zeroes have not been preceded by any enclosures, so the corresponding items have nowhere to go. Because of that, they are omitted from the final result.
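In practice the Boolean left argument is usually computed from the data rather than typed by hand. A minimal sketch, with a made-up character vector name and a mask that marks capital letters as the start of each enclosure:

```apl
name ← 'HelloWorldAgain'                  ⍝ hypothetical input
mask ← name∊⎕A                            ⍝ 1 wherever a capital letter appears
(mask ⊂ name) ≡ 'Hello' 'World' 'Again'   ⍝ → 1
```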

We have already covered most of the behaviour of partitioned enclose; we are only missing some details.

10.12.1.2. Multiple Enclosures#

The left argument pattern can be a simple integer vector with arbitrary non-negative integers; it doesn’t have to contain only zeroes and ones. If we interpret the role of the zeroes and ones in a slightly different way, we can immediately understand how larger integers will work.

For that, we can use less rigorous language, and say that the enclosures of the result start in the places where we inserted dividers to split the right argument. Having said that, we just have to understand how those dividers are placed:

• a 0 in the left argument means that we will insert 0 dividers before the corresponding item of the right argument; and

• a 1 in the left argument means that we will insert 1 divider before the corresponding item of the right argument.

Thus, an integer n in the left argument means that we will insert n dividers before the corresponding item of the right argument:

3 0 0 1 0 0 2 0 0 ⊂ 'Partition'

┌┬┬───┬───┬┬───┐
│││Par│tit││ion│
└┴┴───┴───┴┴───┘

Above, pattern started with a 3 and array started with 'P'. Thus, partitioned enclose must insert 3 dividers before the 'P'. Because more than one divider was inserted, only the last one gets the corresponding item from the argument array.

Using mix as visual aid, we can see clearly where the dividers will be inserted:

↑(3 0 0 1 0 0 2 0 0) 'Partition'

3 0 0 1 0 0 2 0 0
P a r t i t i o n

The usage of mix shows that we insert 3 dividers before the initial 'P', 1 divider before the first 't', and 2 dividers before the last 'i'.
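Counting the enclosures and their lengths confirms the picture of dividers. This is a quick check rather than a new rule:

```apl
pattern ← 3 0 0 1 0 0 2 0 0
r ← pattern ⊂ 'Partition'
≢r        ⍝ → 6, the same as +/pattern
≢¨r       ⍝ → 0 0 3 3 0 3: the empty enclosures come from the extra dividers
```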

10.12.1.3. Trailing Empty Enclosures#

When the left argument pattern starts with an integer that is greater than one, the final result will have some leading empty enclosures. If we want to get a result with trailing empty enclosures, we just need to make sure that the length of pattern is one greater than the length of the right argument:

2 0 0 1 0 0 1 0 0 1 ⊂ 'Partition'

┌┬───┬───┬───┬┐
││Par│tit│ion││
└┴───┴───┴───┴┘

We can use mix again, and we will understand how the trailing 1 creates an empty enclosure by inserting a divider right after the last item of the right argument:

↑(2 0 0 1 0 0 1 0 0 1) 'Partition'

2 0 0 1 0 0 1 0 0 1
P a r t i t i o n

10.12.1.4. Scalar Left Argument#

So far, we have only seen how partitioned enclose works with a vector left argument. Now, we will see what happens if the left argument is a scalar.

First, take a look at this example:

1 0 0 0 0 0 0 0 0 ⊂ 'Partition'

┌─────────┐
│Partition│
└─────────┘

We know we can omit trailing zeroes, so we might be tempted to rewrite the example above as:

1 ⊂ 'Partition'

┌─┬─┬─┬─┬─┬─┬─┬─┬─┐
│P│a│r│t│i│t│i│o│n│
└─┴─┴─┴─┴─┴─┴─┴─┴─┘

However, when we do so, we get an unexpected result! That’s because the left argument is a scalar, and we can only omit trailing zeroes from vectors.

When the left argument is a scalar s, it gets extended to (≢array)⍴s. Therefore, the example above is equivalent to

(9⍴1) ⊂ 'Partition'

┌─┬─┬─┬─┬─┬─┬─┬─┬─┐
│P│a│r│t│i│t│i│o│n│
└─┴─┴─┴─┴─┴─┴─┴─┴─┘
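Scalar extension works for any non-negative integer, not only for 1. A short sketch on a three-letter vector:

```apl
(2 ⊂ 'abc') ≡ 2 2 2 ⊂ 'abc'    ⍝ → 1: the scalar 2 is extended to (≢'abc')⍴2
≢¨2 ⊂ 'abc'                    ⍝ → 0 1 0 1 0 1: an empty enclosure before each letter
≢0 ⊂ 'abc'                     ⍝ → 0: no dividers, hence no enclosures
```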

10.12.1.5. Partitioned Enclose with Axis#

When we first introduced partitioned enclose, we mentioned that it can also accept an axis specification, as such: pattern ⊂[axis] array. Obviously, when array is a vector, axis is irrelevant because we can only have axis ← 1.

For the axis specification to be relevant, array needs to be of rank two or higher. First, we want to know what is the default value for axis, and we can find that out with a quick test:

1 ⊂ 2 2⍴⍳4

┌─┬─┐
│1│2│
│3│4│
└─┴─┘

When applied to a matrix with no axis specification, partitioned enclose created enclosures around the columns of the matrix, which shows that the default axis is ≢⍴array, i.e., the last axis.

If we want to create enclosures around the rows, we can specify axis ← 1:

1 ⊂[1] 2 2⍴⍳4

┌───┬───┐
│1 2│3 4│
└───┴───┘

Notice that ⊂ returns a vector, while perhaps you expected the result to look like this:

⍪ 1 ⊂[1] 2 2⍴⍳4

┌───┐
│1 2│
├───┤
│3 4│
└───┘

This is how partitioned enclose works: it always returns a vector with the enclosures as items.

Here is another example, where we use a 3D array as the right argument:

⎕← cuboid ← 3 4 5⍴⎕A

ABCDE
FGHIJ
KLMNO
PQRST

UVWXY
ZABCD
EFGHI
JKLMN

OPQRS
TUVWX
YZABC
DEFGH

By using partitioned enclose along the first axis, we can get a vector with enclosures around the planes that compose cuboid:

1 0 1 ⊂[1] cuboid

┌─────┬─────┐
│ABCDE│OPQRS│
│FGHIJ│TUVWX│
│KLMNO│YZABC│
│PQRST│DEFGH│
│     │     │
│UVWXY│     │
│ZABCD│     │
│EFGHI│     │
│JKLMN│     │
└─────┴─────┘

The things we learned about the behaviour of the left argument of partitioned enclose still apply when we have a higher-dimensional right argument and/or an axis specification; we just need to interpret the left argument from the point of view of the correct axis:

1 0 1 ⊂[2] cuboid

┌─────┬─────┐
│ABCDE│KLMNO│
│FGHIJ│PQRST│
│     │     │
│UVWXY│EFGHI│
│ZABCD│JKLMN│
│     │     │
│OPQRS│YZABC│
│TUVWX│DEFGH│
└─────┴─────┘

In this example, the left argument is 1 0 1 and the axis specified is the second one, which has length

2⊃⍴cuboid

4

So, if the left argument is 1 0 1 and the axis in question has length four, we are omitting a trailing zero:

1 0 1 0 ⊂[2] cuboid

┌─────┬─────┐
│ABCDE│KLMNO│
│FGHIJ│PQRST│
│     │     │
│UVWXY│EFGHI│
│ZABCD│JKLMN│
│     │     │
│OPQRS│YZABC│
│TUVWX│DEFGH│
└─────┴─────┘

If you find it hard to visualise why the result is as shown, you can try to reason about partitioned enclose with an axis specification as a series of enclosures around indexing operations.

First, we can put the left argument up with the valid indices for the axis in question:

↑(1 0 1 0)(⍳2⊃⍴cuboid)

1 0 1 0
1 2 3 4

This shows that we will have an enclosure around indices 1 2 and another one around indices 3 4. Now, we just have to do the indexing along the correct axis. Because cuboid is a 3D array and we are working with the second axis, the indexing will look like cuboid[;??;]:

(⊂cuboid[;1 2;]),(⊂cuboid[;3 4;])

┌─────┬─────┐
│ABCDE│KLMNO│
│FGHIJ│PQRST│
│     │     │
│UVWXY│EFGHI│
│ZABCD│JKLMN│
│     │     │
│OPQRS│YZABC│
│TUVWX│DEFGH│
└─────┴─────┘
(1 0 1 0⊂[2]cuboid) ≡ (⊂cuboid[;1 2;]),(⊂cuboid[;3 4;])

1

10.12.1.6. Wrap-up#

Now that we have seen the various nuances associated with partitioned enclose, we can bundle them up together. In the expression r ← pattern ⊂[axis] array, we have that:

• array may be any array;

• pattern may be a non-negative integer scalar or a simple numeric vector composed of non-negative integers;

• if left unspecified, axis defaults to ≢⍴array, i.e., the last axis of array;

• if pattern is a scalar s, it is extended to (axis⊃⍴array)⍴s;

• if pattern is a vector, its maximum length is 1+axis⊃⍴array and if the pattern length is not the maximum, it is extended with trailing zeroes;

• each non-zero element in pattern specifies how many dividers to insert before the corresponding position along the appropriate axis of array;

• each enclosure has rank ≢⍴array and shape ⍴array, except in the position specified by axis;

• the result r is a vector containing all the enclosures specified by the pattern; and

• the length of r is +⌿pattern (after extensions).
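A few of these rules can be verified on a small matrix. The matrix m is an arbitrary example chosen for this sketch:

```apl
m ← 3 2⍴⍳6                  ⍝ hypothetical 3-row, 2-column matrix
r ← 1 0 1 ⊂[1] m            ⍝ dividers before rows 1 and 3
≢r                          ⍝ → 2, i.e. +⌿1 0 1
(⍴¨r) ≡ (2 2)(1 2)          ⍝ → 1: shapes differ from ⍴m only along axis 1
(⍴¨1 ⊂ m) ≡ (3 1)(3 1)      ⍝ → 1: default axis is the last one, scalar 1 extended to 1 1
```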

10.12.2. Partition#

The partition function is the dyadic usage of ⊆, and is somewhat similar to the partitioned enclose function.

In r ← pattern ⊆ array, pattern must be a simple vector of non-negative integers, with the same length as the specified axis of the array to be partitioned. It operates as follows:

• the first enclosure starts with the first item of the array;

• each enclosure ends when the next value of pattern is greater than the current one; and

• the items which correspond to zeroes in pattern are removed.

10.12.2.1. Working on Vectors#

We shall work with characters, but of course we could have worked with numbers just as well:

pattern ← 3 3 3 7 7 1 1 0 3 3 3 9 2 1 1 0
pattern ⊆ 'Once upon a time'

┌───┬────┬───┬────┐
│Onc│e up│n a│ tim│
└───┴────┴───┴────┘

The four enclosures correspond to the beginning of the array, plus the three increments: 3 → 7, 0 → 3, and 3 → 9. You will also notice that two characters have disappeared, because they corresponded to zeroes in the pattern.

This definition can be used to group the items of a vector according to a given vector of keys, provided that the keys are ordered in ascending order. For example:

area ← 22 22 41 41 41 41 57 63 63 63 85 85
cash ← 17 10 21 45 75 41 30 81 20 11 42 53
area ⊆ cash

┌─────┬───────────┬──┬────────┬─────┐
│17 10│21 45 75 41│30│81 20 11│42 53│
└─────┴───────────┴──┴────────┴─────┘
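The proviso about ascending keys matters. In this made-up example the key drops from 3 to 2, which is not an increase, so no new enclosure starts there and two groups are merged:

```apl
(1 1 3 3 2 2 ⊆ 'aabbcc') ≡ 'aa' 'bbcc'    ⍝ → 1: only two enclosures, not three
```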

This definition is also extremely convenient to divide a character string into a vector of strings on the basis of a separator. For example, let us partition a vector at each of its blank characters:

phrase ← 'Panama is a canal between Atlantic and Pacific'
↑phrase(phrase≠' ')

P a n a m a   i s   a   c a n a l   b e t w e e n   A t l a n t i c   a n d   P a c i f i c
1 1 1 1 1 1 0 1 1 0 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1
(phrase≠' ')⊆phrase

┌──────┬──┬─┬─────┬───────┬────────┬───┬───────┐
│Panama│is│a│canal│between│Atlantic│and│Pacific│
└──────┴──┴─┴─────┴───────┴────────┴───┴───────┘

The blanks have been removed, because they matched the zeroes, and a new enclosure starts at the beginning of each word, corresponding to the increment 0 → 1. As you might imagine, this is extremely useful in many circumstances. One can write a function to do it, with the separator passed as a left argument:

Cut ← {(~⍵∊⍺)⊆⍵}
↑' 'Cut phrase

Panama
is
a
canal
between
Atlantic
and
Pacific

In fact, we wrote the function to accept not just a single separator, but a list of separators, by replacing the perhaps more obvious (⍵≠⍺) by (~⍵∊⍺). Now we can use it like this:

↑'mw' Cut phrase

Pana
a is a canal bet
een Atlantic and Pacific
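Nothing in Cut is specific to characters. As a sketch, here it is applied to a numeric vector with 0 as the separator, followed by a check that adjacent separators do not produce empty items:

```apl
Cut ← {(~⍵∊⍺)⊆⍵}                             ⍝ same definition as above
(0 Cut 3 1 0 4 0 0 5 9) ≡ (3 1)(,4)(5 9)     ⍝ → 1
≢' ' Cut 'a  b'                              ⍝ → 2: the two blanks are simply dropped
```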

10.12.2.2. Working on Higher-Rank Arrays#

Although partition is very simple, and clearly useful, when applied to vectors, the situation is more complex when it is applied to matrices or higher-rank arrays. This is in contrast to the definition of partitioned enclose, which works on arrays of any rank in a very straightforward way. We shall not study the more complex application of partition here; if you are interested, please refer to Section 10.16.3 at the end of this chapter.

10.13. Union & Intersection#

In mathematics, one uses the two functions union and intersection to compare two sets of values. Dyalog APL provides the same functions, with the same symbols as the ones used in mathematics:

• union, left ∪ right (typed with APL + v), returns a vector containing all the items of left, followed by the items of right which do not appear in left. Both left and right must be scalars or vectors. Equivalent to left,right~left.

• intersection, left ∩ right (typed with APL + c), returns a vector containing the items of left that also appear in right. Both left and right must be scalars or vectors. Equivalent to (left∊right)/left.

15 76 43 80 ∪ 11 43 15 20 76 93

15 76 43 80 11 20 93
'we' 'are' 'so' 'happy' ∩ 'why' 'are' 'you' 'so' 'tired?'

┌───┬──┐
│are│so│
└───┴──┘
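The two equivalences quoted in the definitions can be checked on the numeric example. The names left and right are used here only to mirror the wording of the definitions:

```apl
left  ← 15 76 43 80
right ← 11 43 15 20 76 93
(left ∪ right) ≡ left,right~left      ⍝ → 1
(left ∩ right) ≡ (left∊right)/left    ⍝ → 1
left ∩ right                          ⍝ → 15 76 43
```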

Note that these functions do not remove duplicates (because, in mathematics, all the items of a set are supposedly distinct):

1 1 2 2 ∪ 1 1 3 3 5 5

1 1 2 2 3 3 5 5
'if' 'we' 'had' 'had' 'a' 'car' ∩ 'have' 'you' 'had' 'lunch' '?'

┌───┬───┐
│had│had│
└───┴───┘

10.14. Enlist#

Enlist is the monadic usage of epsilon ∊. Enlist returns a vector of all the simple scalars contained in an array. This could, at first sight, look very much like ravel, but it is not the same for nested arrays. Ravel just rearranges the top-level items of an array, while enlist removes all levels of nesting and returns a simple vector. Let us compare the two functions:

⎕← test ← 2 2⍴'One' 'Two' 'Three' 'Four'

┌─────┬────┐
│One  │Two │
├─────┼────┤
│Three│Four│
└─────┴────┘
,test

┌───┬───┬─────┬────┐
│One│Two│Three│Four│
└───┴───┴─────┴────┘
∊test

OneTwoThreeFour
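The difference is even clearer on an array nested more than one level deep. The array deep below is invented for this sketch, and the default ⎕ML ← 1 is assumed:

```apl
deep ← (1 2)(3 (4 5)) 'ab'     ⍝ hypothetical array with several levels of nesting
≢,deep                         ⍝ → 3: ravel only sees the three top-level items
(∊deep) ≡ 1 2 3 4 5,'ab'       ⍝ → 1: enlist reaches all seven simple scalars
```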

Fig. 10.1 Expert APLer determining how to best hammer a nail.#

10.15. Exercises#

Exercise 10.5

You are given two vectors. The first contains the reference codes for some items in a warehouse. Identical codes are grouped, but not necessarily in ascending order. The second vector contains the quantities of each item sold during the day or the week.

Write a dyadic function QuantitiesSold that accepts these two vectors as arguments and calculates how many items of each reference code have been sold. Preferably, use a partitioning function.

ref ← 47 47 83 83 83 83 83 29 36 36 36 50 50
qty ←  5  8  3 18 11  1  6 10 61 52 39  8 11

ref QuantitiesSold qty

13 39 10 152 19
13 39 10 152 19 ≡ ref QuantitiesSold qty

1

Exercise 10.6

You are given two character matrices with the same number of columns. Let us call them big and small.

You are asked to find where the rows of small appear in big, i.e., for each row in small find the index of the same row in big. For those rows of small which do not appear in big, you can return the value 0, or 1+≢big.

Currently, index of ⍳ works on matrices, but this hasn’t always been the case. Thus, can you solve this exercise without using index of on matrices? (Using index of on vectors is still allowed!)

⎕← big ← 5 2⍴⍳10

1  2
3  4
5  6
7  8
9 10
⎕← small ← (2 2⍴⍳4)⍪8+2 2⍴⍳8

 1  2
 3  4
 9 10
11 12

The result should be

big ⍳ small  ⍝ 1 2 5 0 is also acceptable.

1 2 5 6

Exercise 10.7

A partitioned enclose with a single zero as the left argument returns an empty vector. However, as you know already, not all empty vectors are the same. When working with empty vectors, we also work with prototypes, because an empty vector knows what it would contain if it were not empty.

Go over the expressions that follow and build the empty vector that matches the result of the empty partitioned enclose:

• 0⊂'Partition'

• 0⊂⍳10

• 0⊂⍬

• 0⊂3 2⍴⎕A

• 0⊂3 4 5⍴⍳60

• 0⊂(1 2 3)(4 5 6)(7 8 9)

• 0⊂('cat')('dog')(7 8 9)

• 0⊂(14 'cat' 8)('a' 2 'c' 4)(1 2 3)

The first one is already solved:

sol ← 0⍴⊂''
(0⊂'Partition')≡sol

1

Exercise 10.8

Write a monadic function StartAndEnd that, given a word or a list of words, returns a Boolean vector where 1 indicates a word that starts and ends with the same letter. Each word will have at least one letter and will consist entirely of either uppercase (A–Z) or lowercase (a–z) letters. Words consisting of a single letter can be scalars and are considered to start and end with the same letter.

StartAndEnd 'area' 'banana' 'shoes'

1 0 1
StartAndEnd 'cape'

0
StartAndEnd 'z'

1

Exercise 10.9

Write a dyadic function Extract that accepts a character vector left argument (let us call it text) and an integer vector right argument (let us call it start). We would like to extract a part of text as a simple character vector. The extract is defined as a number of sub-vectors, each being five characters long, and starting at the positions given by start.

text ← 'This boring text has been typed just for a little experiment.'
start ← 6 27 52
text Extract start

borintypedxperi
'borintypedxperi' ≡ text Extract start

1

Exercise 10.10

This exercise is the same as the previous one, but instead of extracting five characters each time, you are asked to extract a variable number of characters specified by the variable length.

You can use the same example as above plus the additional variable length:

length ← 3 8 4
text ExtractL start length

bortyped juxper
'bortyped juxper' ≡ text ExtractL start length

1

10.16. The Specialist’s Section#

Each chapter is followed by a "Specialist's Section" like this one. This section is dedicated to skilled APLers who wish to improve their knowledge. You will find here rare or complex usages of the concepts presented in this chapter, or discover extended explanations which need the knowledge of some symbols that will be seen much further in the book.

If you are exploring APL for the first time, skip this section and go to the next chapter.

10.16.1. Compatibility and Migration Level#

10.16.1.1. Migration Level#

In the early 1980s, a number of “second-generation” APL systems evolved to support nested arrays. Dyalog APL entered the market just as these systems were starting to appear, and decided to adopt the APL2 specification that IBM had been presenting to the world. In the event, unfortunately, the APL2 specification changed very late in this process, after Dyalog had more or less released Dyalog APL (or so the story goes). As a result, there are some minor differences between the dialects.

Just to give you an idea of the (sometimes) subtle differences, let us take a look at the expression a b c[2], where a, b, and c are three vectors; for example:

a ← 1 2 3
b ← 4 5 6
c ← 7 8 9


The expression a b c[2] is ambiguous; it may be interpreted in two different ways:

• does it mean “create a 3-item vector made of a, b, and the second item of c”; or

• does it mean “create a 3-item vector made of a, b, and c, and then take the second item of it” (that is to say, b enclosed)?

IBM chose the first interpretation, and in an IBM-compatible implementation of APL the result would be (1 2 3) (4 5 6) 8.

In Dyalog APL, indexing is a function like any other function, in that it takes as its argument the entire vector on its left. The result is therefore ⊂4 5 6 (⊂ because strand notation nested the items):

a b c[2]

┌─────┐
│4 5 6│
└─────┘
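Whichever dialect you are used to, parentheses remove the ambiguity. The following sketch spells out both readings in Dyalog APL:

```apl
a ← 1 2 3
b ← 4 5 6
c ← 7 8 9
(a b c[2]) ≡ (a b c)[2]               ⍝ → 1: how Dyalog reads the bare expression
(a b (c[2])) ≡ (1 2 3)(4 5 6) 8       ⍝ → 1: the other reading, made explicit
```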

As a minor player at the time, Dyalog wished to move the product in the direction of APL2, and in order to help the people who needed to use both IBM’s APL2 and Dyalog APL, and to make it easier to migrate an application from APL2 to Dyalog, a compatibility feature was introduced into Dyalog APL via a special system variable named ⎕ML, where the letters “ML” stand for “migration level”.

The default value for ⎕ML is 1.

To use code written according to IBM’s conventions, it is possible to set ⎕ML to higher values (up to 3), and obtain an increasing (but not total) level of compatibility with IBM’s APL2. In other words, setting ⎕ML to 0 means “the Dyalog way”, which also shows that the default value for ⎕ML is a small compromise between Dyalog’s original specification and APL2. Today, Dyalog has become a major player in the APL market. Pressure on Dyalog users to move in the direction of APL2 has faded and many users prefer the Dyalog definitions. The unfortunate result of the story is that, depending on the roots of an application, code may be written to use any one of the possible migration levels.

In this book we use the default value of ⎕ML ← 1, but we shall mention how some operations could be written in IBM’s notation.

It should be emphasised that when you select a non-zero value for ⎕ML, the “Dyalog way” of operation will no longer be available for the primitive functions that are sensitive to the selected value of ⎕ML.

Remark

⎕ML is a normal system variable. It can be localised in a function header or in a dynamic function, so that its influence is restricted to that function.

10.16.1.2. A List of Differences#

This list is not a complete list of language differences between IBM APL2 and Dyalog. It only lists the features of Dyalog APL that can be made to function like those of APL2 by setting ⎕ML appropriately.

The first column contains the operation we are talking about; the second and third columns compare how you perform that operation the “Dyalog way” (with ⎕ML ← 0) or in APL2, respectively; and the fourth column contains additional comments.

| Operation | Dyalog’s implementation | IBM’s implementation | Comments |
|---|---|---|---|
| Mix | ↑[n]var with n decimal | ⊃[n]var with n integer or decimal | Same behaviour, different symbols. IBM’s definition requires ⎕ML≥2. |
| Split | ↓[n]var | ⊂[n]var | Same behaviour, different symbols. IBM’s definition requires ⎕ML≥1. |
| Partition | pat⊂[n]var with pat Boolean | pat⊂[n]var with pat integer | Same syntax, but different behaviour. With ⎕ML≥3, Dyalog’s ⊂ becomes the same as ⊆, because ⊆ behaves like IBM’s partition. |
| First | ⊃var | ↑var | Same behaviour, different symbols. IBM’s definition requires ⎕ML≥2. |
| Enlist | n/a | ∊var | No Dyalog equivalent. Requires ⎕ML>0, but because the default value of ⎕ML is 1, enlist is typically available in Dyalog. |
| Type | ∊var | ↑0⍴⊂var | No special symbol in IBM’s definition. The IBM expression requires ⎕ML≥2. Because ⎕ML←1 is the default, ∊ is generally the enlist function, not the type function. See Section 10.16.2. |
| Depth | ≡var | ≡var | If the items of var have non-uniform depths, the IBM definition returns the absolute value of the depth rather than a negative value. IBM’s definition requires ⎕ML≥2. |
| ⎕TC | Backspace, Linefeed, Newline | Backspace, Newline, Linefeed | The order of the contents of ⎕TC changes. IBM’s definition requires ⎕ML≥3. |

10.16.2. Computing the Type and Prototype#

In Section 10.9 we defined the type of an array and the prototype of an array, and yet, we have not discussed how to compute them. However, you may have noticed that the discussion about ⎕ML referenced a primitive called type.

10.16.2.1. Migration Level Zero#

In Dyalog APL, when ⎕ML is set to 0 (the value that separates Dyalog APL from APL2 the most), the epsilon glyph stops representing the function enlist and becomes the function type.

In other words, the original versions of Dyalog APL included a function that computed the type of an array, and that function was ∊. However, in the present day, the default value of ⎕ML ← 1 means this function is generally not available.

Of course we can set ⎕ML ← 0 and see it in action:

⎕ML ← 0
∊hogwash

┌─┬───┬─┬────────┐
│0│0 0│ │┌────┬─┐│
│ │0 0│ ││0   │0││
│ │   │ │├────┼─┤│
│ │   │ ││    │0││
│ │   │ │└────┴─┘│
└─┴───┴─┴────────┘

We can even check that we determined the type of hogwash correctly before:

hogwashType≡∊hogwash

1

Similarly, if ⎕ML ← 0, we can easily compute the prototype of an array:

∊⊃hogwash

0

We can see that 0 really is the prototype of hogwash because when we overtake with ↑, the fill items are 0s:

6↑hogwash

┌──┬───┬─┬────────┬─┬─┐
│19│1 2│A│┌────┬─┐│0│0│
│  │3 4│P││5   │8││ │ │
│  │   │L│├────┼─┤│ │ │
│  │   │ ││Nuts│9││ │ │
│  │   │ │└────┴─┘│ │ │
└──┴───┴─┴────────┴─┴─┘

Now, before we forget, let us restore ⎕ML to its default value:

⎕ML ← 1
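Since ⎕ML can be localised, an alternative to switching it back and forth by hand is to set it inside a dfn. The helper name TypeOf is hypothetical; this is only a sketch of the idea:

```apl
⎕ML ← 1                               ⍝ the default, as restored above
TypeOf ← {⎕ML←0 ⋄ ∊⍵}                 ⍝ hypothetical helper: ⎕ML←0 is local to the dfn
(TypeOf 1 'a' (2 3)) ≡ 0 ' ' (0 0)    ⍝ → 1
⎕ML                                   ⍝ → 1: the caller's setting is untouched
```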


10.16.2.2. Migration Level Non-zero#

When we are working with the default ⎕ML value – or any non-zero ⎕ML value, for that matter – we cannot use the primitive function type ∊ because that function is not available. In those cases, we must resort to other techniques to determine the type or the prototype of an array.

To determine the prototype of an array, we can reshape the array to be empty, and then ask for its first element:

⊃0⍴hogwash

0

Similarly, to determine the type of an array arr, we can ask for the prototype of an array which has arr as the first item:

⊃0⍴⊂hogwash

┌─┬───┬─┬────────┐
│0│0 0│ │┌────┬─┐│
│ │0 0│ ││0   │0││
│ │   │ │├────┼─┤│
│ │   │ ││    │0││
│ │   │ │└────┴─┘│
└─┴───┴─┴────────┘
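The prototype is also what overtake uses as fill, which gives an easy way to observe it. The array nested is an arbitrary example for this sketch, with the default ⎕ML ← 1 assumed:

```apl
nested ← (1 2)(3 4)               ⍝ hypothetical nested vector
(⊃0⍴nested) ≡ 0 0                 ⍝ → 1: the prototype is the type of the first item
(3↑nested) ≡ (1 2)(3 4)(0 0)      ⍝ → 1: overtake pads with that prototype
(4↑'ab') ≡ 'ab  '                 ⍝ → 1: for a character vector the fill is a blank
```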

It is understandable if you find these two ways of determining the prototype and the type unsatisfying. After all, we are defining them in terms of themselves.

Another alternative follows, with a dfn that computes the type of an array recursively:

]dinput
type ← {
    0=≡⍵: ⊃(⍵≡⍕⍵)⌽0' '   ⍝ Simple scalar: ' ' if it is a character, 0 if it is a number.
    ∇¨⍵                  ⍝ Otherwise, recurse into each item.
}

type hogwash

┌─┬───┬─┬────────┐
│0│0 0│ │┌────┬─┐│
│ │0 0│ ││0   │0││
│ │   │ │├────┼─┤│
│ │   │ ││    │0││
│ │   │ │└────┴─┘│
└─┴───┴─┴────────┘
hogwashType ≡ type hogwash

1

After having defined type, prototype follows trivially:

prototype ← {type⊃⍵}

prototype hogwash

0

10.16.3. High-rank Partition#

We studied the function partition applied to vectors in Section 10.12.2; it appeared to be extremely useful.

Its use is much more complex when applied to arrays of arbitrary rank. Let us just try it on a matrix:

chemistry

H2SO4
CaCO3
Fe2O3
1 1 2 2 2 ⊆[2] chemistry

┌──┬───┐
│H2│SO4│
├──┼───┤
│Ca│CO3│
├──┼───┤
│Fe│2O3│
└──┴───┘

As we can see, partition operates along the specified axis, but it also separates all the items along the other axis, as if the matrix were seen through a grid. In other words, partition ⊆ will preserve the rank of its argument array:

≢⍴chemistry

2
≢⍴1 1 2 2 2 ⊆[2] chemistry

2

This is unlike partitioned enclose, which always returns a vector where each item has the original rank:

⎕← r ← 1 0 1 0 0 ⊂[2] chemistry

┌──┬───┐
│H2│SO4│
│Ca│CO3│
│Fe│2O3│
└──┴───┘

Although visually similar, the result of applying partitioned enclose is a vector of length 2, and each of its items, in turn, is a sub-matrix of the original matrix:

⍴r

2
⍴¨r

┌───┬───┐
│3 2│3 3│
└───┴───┘

For partition, it is the other way around: the result is still a matrix, and it is the items that become vectors:

⍴r ← 1 1 2 2 2 ⊆[2] chemistry

3 2
⍴¨r

┌─┬─┐
│2│3│
├─┼─┤
│2│3│
├─┼─┤
│2│3│
└─┴─┘

Once more, be careful about the visual similarity of the results if ]box happens to be OFF:

]box off
⎕← 1 1 2 2 2 ⊆[2] chemistry
⎕← 1 0 1 0 0 ⊂[2] chemistry
]box on

Was ON
 H2  SO4
 Ca  CO3
 Fe  2O3
 H2  SO4
 Ca  CO3
 Fe  2O3
Was OFF

Let us try using partition on a 3D array to see what the result looks like:

cuboid

ABCDE
FGHIJ
KLMNO
PQRST

UVWXY
ZABCD
EFGHI
JKLMN

OPQRS
TUVWX
YZABC
DEFGH
1 2 3 3 ⊆[2] cuboid

┌──┬──┬──┬──┬──┐
│A │B │C │D │E │
├──┼──┼──┼──┼──┤
│F │G │H │I │J │
├──┼──┼──┼──┼──┤
│KP│LQ│MR│NS│OT│
└──┴──┴──┴──┴──┘

┌──┬──┬──┬──┬──┐
│U │V │W │X │Y │
├──┼──┼──┼──┼──┤
│Z │A │B │C │D │
├──┼──┼──┼──┼──┤
│EJ│FK│GL│HM│IN│
└──┴──┴──┴──┴──┘

┌──┬──┬──┬──┬──┐
│O │P │Q │R │S │
├──┼──┼──┼──┼──┤
│T │U │V │W │X │
├──┼──┼──┼──┼──┤
│YD│ZE│AF│BG│CH│
└──┴──┴──┴──┴──┘

Once more, we see that the rank of the original array is preserved and that each scalar of the new array contains a vector.

Rules

In r ← pattern ⊆[axis] array:

• the result r is an array of the same rank as array;

• the dimensions of the result and of the right argument array match, except possibly along the axis specified by axis; and

• the length of the specified axis of the result is the number of partitions defined by pattern, which is +/2</0,pattern: one partition for each place where pattern increases, counting from an implied leading 0.
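One way to count the partitions, offered here as a sketch rather than an official formula, is to count the places where pattern increases after prefixing it with a 0. It agrees with the vector example from Section 10.12.2.1 and with the shape of the chemistry result (the matrix is rebuilt here so the lines run on their own):

```apl
pattern ← 3 3 3 7 7 1 1 0 3 3 3 9 2 1 1 0
≢ pattern ⊆ 'Once upon a time'             ⍝ → 4
+/ 2</0,pattern                            ⍝ → 4: the increases 0→3, 3→7, 0→3, 3→9
⍴ 1 1 2 2 2 ⊆[2] 3 5⍴'H2SO4CaCO3Fe2O3'     ⍝ → 3 2
+/ 2</0,1 1 2 2 2                          ⍝ → 2: the length of the second axis above
```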

10.16.4. Ambiguous Representation#

Even with ]box ON, and with the ]display user command, there are times where the visual representations of arrays are ambiguous:

⎕← v ← 5 8 '7' 9

5 8 7 9
]display v

┌→──────┐
│5 8 7 9│
└+──────┘

In this form, the dash which should tell us that the 7 is a character is indistinguishable from the dashes used to draw the box. We just know that one (or more) of the four items is a character because the + symbol tells us that this array is mixed.

A convenient way to distinguish between numbers and letters is to look at the type of the array and compare it with 0 (for numbers) or ' ' (for letters):

' '=type v

0 0 1 0

10.16.5. Pick Inside a Scalar#

Suppose that one item of a nested variable is a vector which has been enclosed twice, and we would like to select one value out of its contents. For example, how can we select the letter 'P' in the following vector:

⎕← nv ← (3 5 2)(⊂'CARPACCIO')(6 8 1)

┌─────┬───────────┬─────┐
│3 5 2│┌─────────┐│6 8 1│
│     ││CARPACCIO││     │
│     │└─────────┘│     │
└─────┴───────────┴─────┘

We might attempt to write

2 1 4 ⊃ nv

RANK ERROR
2 1 4⊃nv
∧


but that is incorrect because the second item of nv is a scalar (an enclosed vector). The index 1 would have been appropriate for a one-item vector, but not for a scalar. A scalar has no axes, so the only valid index into it is the empty vector ⍬:

2 ⍬ 4 ⊃ nv

P
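If the empty vector in the middle of the path looks too cryptic, the same item can be reached step by step. This sketch assumes the default ⎕ML ← 1, so that monadic ⊃ is first:

```apl
nv ← (3 5 2)(⊂'CARPACCIO')(6 8 1)
(2 ⍬ 4 ⊃ nv) ≡ 4⊃⊃2⊃nv      ⍝ → 1: pick item 2, disclose the scalar, then pick item 4
```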

10.17. Solutions#

The following solutions we propose are not necessarily the “best” ones; perhaps you will find other solutions that we have never considered. APL is a very rich language, and due to the general nature of its primitive functions and operators there are always plenty of different ways to express different solutions to a given problem. Which one is “the best” depends on many things, for example the level of experience of the programmer, the importance of system performance, the required behaviour in border cases, the requirement to meet certain programming standards and also personal preferences. This is one of the reasons why APL is so pleasant to teach and to learn!

We advise you to try and solve the exercises before reading the solutions!

Solution to Exercise 10.1

A reasonable first step is to figure out the depth, rank, and shape of the two arrays we are working with:

cm

Francis
Carmen 
Luciano
nm

  21.21 1534.88 375.46  704.5 
1125.14 1963.52 464.45 1438.25
 796.53 1569    157.14  886.59
• cm is a simple (character) matrix, hence has depth 1; because it is a matrix, its rank is 2 and its shape is 3 7; and

• nm is a simple (numeric) matrix, hence has depth 1; because it is a matrix, its rank is 2 and its shape is 3 4.

We can start by verifying this:

DRS ← {(≡⍵)(≢⍴⍵)(⍴⍵)} ⍝ Depth, Rank, and Shape of an array.
DRS cm

┌─┬─┬───┐
│1│2│3 7│
└─┴─┴───┘
DRS nm

┌─┬─┬───┐
│1│2│3 4│
└─┴─┴───┘
• For the expression (⊂cm)(⊂nm), we are enclosing both matrices into scalars. Then, strand notation will try to build a vector out of the different things it can find. Because it can find two things, ⊂cm and ⊂nm, the result will be a 2-item vector: rank 1, shape ,2.

We just have to determine the depth of the result. To determine the depth of a vector, we first have to determine the depth of all of its items. In this case, that will be ≡⊂cm and ≡⊂nm. Both of these represent enclosures of simple matrices, so the depth of the enclosure is one plus the depth of the simple matrix, i.e. 2. Thus, the result is a vector where all items have depth 2 and, therefore, the result has depth 3:

DRS (⊂cm)(⊂nm)

┌─┬─┬─┐
│3│1│2│
└─┴─┴─┘
• For the expression (⊂cm),(⊂nm), what changes is the usage of the catenate primitive to build the final result, instead of letting strand notation do its work. Again, we have that the building blocks are ⊂cm and ⊂nm, but now catenate uses those as the items that build the final result. Because we are catenating two scalars, we get a 2-item vector: rank 1, shape ,2. However, this time the matrices themselves become the items of the vector, thus its depth is only 2:

(⊂cm),(⊂nm)

┌───────┬──────────────────────────────┐
│Francis│  21.21 1534.88 375.46  704.5 │
│Carmen │1125.14 1963.52 464.45 1438.25│
│Luciano│ 796.53 1569    157.14  886.59│
└───────┴──────────────────────────────┘
DRS (⊂cm),(⊂nm)

┌─┬─┬─┐
│2│1│2│
└─┴─┴─┘

This contrasts with the previous expression, where the items of the vector were the enclosed scalars.

• Finally, for the expression cm,⊂nm, we have something that won’t be homogeneous, and thus the result is going to have a negative depth. Notice that catenate is being used between a matrix (cm), and a scalar (⊂nm). As you’ve seen in Section 4.11.3, catenating a scalar to a matrix makes it so that the scalar is repeated over the rows of the matrix, to extend the matrix by one column.

Because the matrix cm had 7 columns, the matrix cm,⊂nm will have 8 columns, and its number of rows will remain unchanged, meaning the final shape is 3 8. Its rank will also remain unchanged. What changes is the depth, because the result will no longer be a simple character matrix, but a nested matrix: some of the elements will be characters, others will be numeric matrices. Thus, the result will have some elements of depth 0 and others of depth 1, making it so that the final depth is ¯2:

cm,⊂nm

┌─┬─┬─┬─┬─┬─┬─┬──────────────────────────────┐
│F│r│a│n│c│i│s│  21.21 1534.88 375.46  704.5 │
│ │ │ │ │ │ │ │1125.14 1963.52 464.45 1438.25│
│ │ │ │ │ │ │ │ 796.53 1569    157.14  886.59│
├─┼─┼─┼─┼─┼─┼─┼──────────────────────────────┤
│C│a│r│m│e│n│ │  21.21 1534.88 375.46  704.5 │
│ │ │ │ │ │ │ │1125.14 1963.52 464.45 1438.25│
│ │ │ │ │ │ │ │ 796.53 1569    157.14  886.59│
├─┼─┼─┼─┼─┼─┼─┼──────────────────────────────┤
│L│u│c│i│a│n│o│  21.21 1534.88 375.46  704.5 │
│ │ │ │ │ │ │ │1125.14 1963.52 464.45 1438.25│
│ │ │ │ │ │ │ │ 796.53 1569    157.14  886.59│
└─┴─┴─┴─┴─┴─┴─┴──────────────────────────────┘
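
The depth, rank, and shape claimed for this last expression can be checked with DRS as well. The sketch below rebuilds cm and nm from the values displayed above; the way cm is constructed here (mixing three names) is an assumption, since the original definition is given in the exercise itself:

```apl
cm ← ↑'Francis' 'Carmen' 'Luciano'    ⍝ assumed construction of the 3 7 character matrix
nm ← 3 4⍴21.21 1534.88 375.46 704.5 1125.14 1963.52 464.45 1438.25 796.53 1569 157.14 886.59
DRS ← {(≡⍵)(≢⍴⍵)(⍴⍵)}                ⍝ Depth, Rank, and Shape of an array.
DRS cm,⊂nm                            ⍝ depth ¯2, rank 2, shape 3 8
```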

Solution to Exercise 10.2

Let’s take the three vectors we need and work with them:

a ← 1 2 3
b ← 4 5 6
c ← 7 8 9

• a b c × 1 2 3

a b c is a 3-item nested vector, and all the sub-vectors have depth 1, so a b c has depth 2. Multiplying with 1 2 3 doesn’t change the structure, only the contents, so a b c × 1 2 3 is a 3-item vector of rank 1 and depth 2:

DRS a b c × 1 2 3

┌─┬─┬─┐
│2│1│3│
└─┴─┴─┘
• (10 20),a

a is a 3-item simple vector and (10 20) is a 2-item simple vector, so their catenation yields a 5-item simple vector, thus its depth is 1, its rank is 1, and its shape is ,5:

DRS (10 20),a

┌─┬─┬─┐
│1│1│5│
└─┴─┴─┘

Note that the parentheses are superfluous and can be removed:

DRS 10 20,a

┌─┬─┬─┐
│1│1│5│
└─┴─┴─┘

The same is true for the next expression:

• (10 20),a b

Now we are catenating (10 20), which is a simple numeric vector, with a b, which is a 2-item nested vector. a b has depth 2, rank 1, and shape ,2. When we catenate the two vectors, we get a heterogeneous 4-item vector, thus its depth will be ¯2, its rank will be 1, and its shape will be ,4.

DRS (10 20),a b

┌──┬─┬─┐
│¯2│1│4│
└──┴─┴─┘

As mentioned before, removing the parentheses doesn’t change the result:

DRS 10 20,a b

┌──┬─┬─┐
│¯2│1│4│
└──┴─┴─┘
• a b 2 × c[2]

c[2] is a simple scalar, thus multiplying it with a b 2 won’t change the structure of the array. Now, a b 2 is a 3-item vector that is not homogeneous, because a and b are vectors whereas 2 is a simple scalar. a and b have depth 1 and 2 has depth 0, thus the final array will have depth ¯2, rank 1, and shape ,3:

DRS a b 2

┌──┬─┬─┐
│¯2│1│3│
└──┴─┴─┘
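
As a quick check that the multiplication really leaves the structure untouched, the full expression can be given to DRS too. This sketch reuses a, b, c, and DRS exactly as defined above:

```apl
a ← 1 2 3 ⋄ b ← 4 5 6 ⋄ c ← 7 8 9
DRS ← {(≡⍵)(≢⍴⍵)(⍴⍵)}
a b 2 × c[2]        ⍝ (8 16 24)(32 40 48) 16
DRS a b 2 × c[2]    ⍝ depth ¯2, rank 1, shape ,3
```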
• 10×a 20×b

The strand a 20 creates a 2-item vector that is being multiplied by b, but b is a 3-item vector, thus we will get a LENGTH ERROR if we try to evaluate this expression:

10×a 20×b

LENGTH ERROR
10×a 20×b
∧
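
If the intention behind 10×a 20×b was to scale a by 10 and b by 20 (an assumption about what the author of such an expression might want), parentheses make the grouping explicit. A minimal sketch:

```apl
a ← 1 2 3 ⋄ b ← 4 5 6
(10×a)(20×b)    ⍝ (10 20 30)(80 100 120)
10 20×a b       ⍝ same result: 10 is paired with a and 20 with b
```

In the second form, the 2-item vector 10 20 is multiplied item by item with the 2-item nested vector a b, so no length mismatch arises.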


Solution to Exercise 10.3

We continue our work:

• +/a b c

The strand a b c builds a 3-item vector, and the +/ will reduce it to a single scalar containing the result of a+b+c so, in other words, +/a b c is the same as ⊂a+b+c. a+b+c evaluates to a 3-item simple vector of depth 1, thus ⊂a+b+c has depth 2. Because it is a scalar, it has rank 0 and shape ⍬:

DRS +/a b c

┌─┬─┬┐
│2│0││
└─┴─┴┘
• +/¨a b c

This expression is slightly different from the previous one in that the operator each ¨ was put next to the plus-reduction. Because of the each, we will sum each of the three vectors, each of them producing a single scalar. Now, it’s important to understand that f¨array doesn’t change the outer structure of array. So, if a b c is a 3-item vector, f¨a b c will still be a 3-item vector. In our case, because f ← +/, we will get a 3-item simple numeric vector, meaning the final depth is 1, the rank is 1, and the shape is ,3:

DRS +/¨a b c

┌─┬─┬─┐
│1│1│3│
└─┴─┴─┘
• 1 0 1/¨a b c

The use of ¨ again tells us that the result will be a 3-item vector. Now we are left with examining the contents of the resulting vector. Because the left argument to compress isn’t enclosed, compress each will do 1/a, 0/b, and 1/c. 1/a and 1/c don’t change the right argument arrays, but 0/b produces an empty vector. Thus, the result is equivalent to a ⍬ c. However, this doesn’t change any of the characteristics of a b c, and the result is still a 3-item vector with rank 1 and depth 2:

DRS 1 0 1/¨a b c

┌─┬─┬─┐
│2│1│3│
└─┴─┴─┘
a ⍬ c≡1 0 1/¨a b c

1
• (a b c)⍳4 5 6

This is a tricky question, because b ← 4 5 6 might lead you into thinking that the result is 2. However, index of looks at its left argument and finds a vector (although a nested one) so it will look at the right argument as a collection of scalars, and it will check for the position of each scalar in the left argument vector. Because none of the scalars 4, 5, and 6, are in the left argument vector, the final result is 4 4 4 which is a vector of depth 1, rank 1, and shape ,3:

(a b c)⍳4 5 6  ⍝ the parentheses are superfluous

4 4 4
DRS (a b c)⍳4 5 6

┌─┬─┬─┐
│1│1│3│
└─┴─┴─┘
• 1 10 3 ∊ a

The result of a membership operation is an array with the same shape as the left argument, so it will be a 3-item vector. Then, each scalar in the result is either a 0 or a 1, so the result will be a simple vector: depth 1, rank 1, and shape ,3:

DRS 1 10 3 ∊ a

┌─┬─┬─┐
│1│1│3│
└─┴─┴─┘
• (⊂1 0 1)/¨a b c

This is similar to one of the previous expressions, but now the left argument to /¨ is enclosed, meaning that 1 0 1 is the left argument that is used when compressing each of the items of the right argument a b c. The operator each makes it so that the final result is a 3-item vector as well. Then, each of its scalars is the result of doing 1 0 1/ on one of a, b, or c; the results of which are always a 2-item simple vector. Therefore, the final result will be a nested vector of depth 2, rank 1, and shape ,3:

(⊂1 0 1)/¨a b c

┌───┬───┬───┐
│1 3│4 6│7 9│
└───┴───┴───┘
DRS (⊂1 0 1)/¨a b c

┌─┬─┬─┐
│2│1│3│
└─┴─┴─┘
• 1 10 3 ∊ a b c

As seen above, the structure of the result of membership ∊ depends on the left argument only, so we have again that the result has depth 1, rank 1, and shape ,3:

DRS 1 10 3 ∊ a b c

┌─┬─┬─┐
│1│1│3│
└─┴─┴─┘
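
Besides the structure, it can be instructive to look at the values behind some of these expressions. The sketch below reuses a, b, and c from the previous exercise; the expression with ⊂4 5 6 is an extra illustration, not part of the exercise, showing how enclosing the right argument makes index of look for the whole vector:

```apl
a ← 1 2 3 ⋄ b ← 4 5 6 ⋄ c ← 7 8 9
(+/a b c)≡⊂a+b+c     ⍝ 1
+/¨a b c             ⍝ 6 15 24
(a b c)⍳⊂4 5 6       ⍝ 2: the enclosed vector is looked up as a single item
1 10 3∊a             ⍝ 1 0 1
1 10 3∊a b c         ⍝ 0 0 0: the items of a b c are vectors, not the scalars 1, 10, and 3
```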

Solution to Exercise 10.4

We want to know what +/na and ,/na evaluate to, given na:

⎕← na ← 1 2 (2 2⍴3 4 5 6)7 8

┌─┬─┬───┬─┬─┐
│1│2│3 4│7│8│
│ │ │5 6│ │ │
└─┴─┴───┴─┴─┘

The simplest way to think about this is to write down the expression that the reduction is equivalent to:

⊂1 + 2 + (2 2⍴3 4 5 6) + 7 + 8


Notice the final enclose to guarantee that the result is a scalar: it’s there because reduce is supposed to reduce the rank of the argument. If we reduce a vector, the result will be a scalar, even if an enclosed one.

We have a series of scalar additions and, in the middle, addition with a matrix. There are no shape mismatches, and thus we can just add all the scalars to all the positions in the matrix. 1 + 2 + 7 + 8 is 18, thus the final result is ⊂18+2 2⍴3 4 5 6, or ⊂2 2⍴21 22 23 24:

+/na

┌─────┐
│21 22│
│23 24│
└─────┘

As for ,/na, we can do a similar exercise. However, now we can’t shuffle things into the order that we prefer because , is not commutative.

Starting from the right, we first evaluate 7,8 to get 7 8, and then we evaluate (2 2⍴3 4 5 6),7 8. Because the left argument to catenate is a matrix and the right argument is a vector, catenate will try to spread the vector across the rows of the matrix. Because the number of rows matches the elements in the vector, that happens successfully and the result is 2 3⍴3 4 7 5 6 8. Then, we catenate two scalars (separately) to the left of the matrix, so those get replicated across the rows.

The final result is:

⊂2 5⍴1 2 3 4 7 1 2 5 6 8

┌─────────┐
│1 2 3 4 7│
│1 2 5 6 8│
└─────────┘
,/na

┌─────────┐
│1 2 3 4 7│
│1 2 5 6 8│
└─────────┘
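
Both reductions can be confirmed by comparing them with the expanded expressions written out by hand. This is only a sketch of that check, using na as defined above:

```apl
na ← 1 2 (2 2⍴3 4 5 6) 7 8
(+/na)≡⊂1+2+(2 2⍴3 4 5 6)+7+8    ⍝ 1
(,/na)≡⊂1,2,(2 2⍴3 4 5 6),7,8    ⍝ 1
⍴⊃,/na                            ⍝ 2 5
```

The expanded forms are evaluated from right to left, which is exactly the order followed in the explanation of ,/na.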

Solution to Exercise 10.5

We have the two vectors here:

ref ← 47 47 83 83 83 83 83 29 36 36 36 50 50
qty ←  5  8  3 18 11  1  6 10 61 52 39  8 11


And we want to use a partitioning function to figure out how many items of each reference were sold. In order to do that, we can try to create a Boolean vector that identifies whenever the vector ref reaches a new reference. If we do a pairwise not-equals reduction, we get quite close:

2≠/ref

0 1 0 0 0 0 1 1 0 0 1 0

The only issue is that this fails to identify the initial reference, but we can fix it by catenating a 1 in the beginning:

1,2≠/ref

1 0 1 0 0 0 0 1 1 0 0 1 0

With that out of the way, we can use partitioned enclose to get a nested vector where each sub-vector contains the quantities sold for that reference:

↑ref qty

47 47 83 83 83 83 83 29 36 36 36 50 50
 5  8  3 18 11  1  6 10 61 52 39  8 11
(1,2≠/ref)⊂qty

┌───┬───────────┬──┬────────┬────┐
│5 8│3 18 11 1 6│10│61 52 39│8 11│
└───┴───────────┴──┴────────┴────┘

Finally, we can add those up with an each:

+/¨(1,2≠/ref)⊂qty

13 39 10 152 19

Alternatively, we can mix the results after partitioned enclose and then sum along the last axis. The partitioned enclose is likely to return sub-vectors that do not have the same length, which means that fill items are inserted when we mix:

↑(1,2≠/ref)⊂qty

 5  8  0 0 0
 3 18 11 1 6
10  0  0 0 0
61 52 39 0 0
 8 11  0 0 0

However, that doesn’t change the final result because adding zeroes does nothing:

+/↑(1,2≠/ref)⊂qty  ⍝ still correct

13 39 10 152 19

Let’s wrap it into a function:

QuantitiesSold ← {+/↑(1,2≠/⍺)⊂⍵}
ref QuantitiesSold qty

13 39 10 152 19

If we wanted to return the quantities and the respective references, we could use the same Boolean vector to compress the references and partition the quantities:

]dinput
QuantitiesSoldAndRefs ← {
pat ← 1,2≠/⍺
(pat/⍺),[.5](+/↑pat⊂⍵)
}

ref QuantitiesSoldAndRefs qty

47 83 29  36 50
13 39 10 152 19
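
One border case is worth keeping in mind: the pattern 1,2≠/⍺ starts a new partition whenever the reference changes, so it only groups sales correctly when equal references are adjacent, as they are in ref. The sketch below uses made-up data (ref2 and qty2 are hypothetical) to show the effect, and sorts with grade up as one possible remedy:

```apl
QuantitiesSold ← {+/↑(1,2≠/⍺)⊂⍵}
ref2 ← 47 83 47                   ⍝ hypothetical data: 47 occurs in two separate runs
qty2 ← 1 2 3
ref2 QuantitiesSold qty2          ⍝ 1 2 3: the two runs of 47 are not merged
i ← ⍋ref2                         ⍝ grade up, so that equal references become adjacent
ref2[i] QuantitiesSold qty2[i]    ⍝ 4 2
```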

Solution to Exercise 10.6

In order to look up the rows of one matrix among the rows of another matrix, we just need to use split to turn both matrices into vectors of rows:

⎕← big ← 5 2⍴⍳10

1  2
3  4
5  6
7  8
9 10
⎕← small ← (2 2⍴⍳4)⍪8+2 2⍴⍳8

 1  2
 3  4
 9 10
11 12
(↓big)⍳↓small

1 2 5 6
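
The same idea can be wrapped into a small function, in the spirit of the other solutions. The name RowIndex is hypothetical, and the sketch assumes the default ⎕IO←1 used throughout this chapter:

```apl
RowIndex ← {(↓⍺)⍳↓⍵}     ⍝ hypothetical name: position of each row of ⍵ among the rows of ⍺
big ← 5 2⍴⍳10
small ← (2 2⍴⍳4)⍪8+2 2⍴⍳8
big RowIndex small        ⍝ 1 2 5 6
(↓small)∊↓big             ⍝ 1 1 1 0
```

The index 6 is one more than the number of rows of big, which is how index of reports that the row 11 12 was not found; membership gives the same information as a Boolean vector.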

Solution to Exercise 10.7

In order to understand the results of the empty partitioned encloses, it is helpful to think about what the result would be if it were a “normal” partitioned enclose.

For example, for the solved case, here is a “normal” partitioned enclose of 'Partition', using an arbitrary left argument with a couple of 1s and a couple of 0s:

1 0 0 1 0 1 0 0 0⊂'Partition'

┌───┬──┬────┐
│Par│ti│tion│
└───┴──┴────┘

The result is, thus, a vector of character vectors. So, if the left argument is 0, we get “an empty vector of empty character vectors”. Here is a vector of empty character vectors:

'' '' '' ''

┌┬┬┬┐
│││││
└┴┴┴┘

But this vector has 4 elements. To make it empty, we need to reshape it:

0⍴'' '' '' ''


But it is a waste of typing effort to write four '', for nothing, when we can just let reshape take care of reusing data:

0⍴4⍴⊂''


Now, reshaping twice is redundant, so we can keep only the last reshape:

0⍴⊂''


This matches the empty result:

(0⍴⊂'')≡0⊂'Partition'

1

Let us follow a similar reasoning for the remaining expressions.

• 0⊂⍳10

This is very similar to the example above, except the data is numeric instead of textual, so the result is an empty vector of empty numeric vectors:

(0⍴⊂⍬)≡0⊂⍳10

1
• 0⊂⍬

This is an attempt at a tricky question, but remember that ⍬ is a simple numeric vector, although an empty one. Thus, the solution is the same:

(0⍴⊂⍬)≡0⊂⍬

1
• 0⊂3 2⍴⎕A

The partitioned enclose of a character matrix is a vector of character matrices, so the result of the empty partitioned enclose is going to be an empty vector of empty character matrices. Now, the question is: what is the shape of those empty matrices?

Well, by modifying the left argument to partitioned enclose, we can vary the number of columns in the sub-matrices:

1 1⊂3 2⍴⎕A

┌─┬─┐
│A│B│
│C│D│
│E│F│
└─┴─┘
1 0⊂3 2⍴⎕A

┌──┐
│AB│
│CD│
│EF│
└──┘

But we see that the sub-matrices always have three rows, because that is how many rows the original matrix has. Hence, the final result is an empty vector of empty character matrices with three rows:

(0⍴⊂3 0⍴'')≡0⊂3 2⍴⎕A

1
• 0⊂3 4 5⍴⍳60

The train of thought for this example is very similar to the previous one, except we are working with an array of rank three. The basic premise is the same, though, and that’s that modifying the pattern of the left argument of partitioned enclose is going to alter the dimension of the last axis of each sub-result, but each sub-result will have a shape that starts with 3 4.

Hence, the final result is an empty vector of empty cuboids with shapes 3 4 0:

(0⍴⊂3 4 0⍴⍬)≡0⊂3 4 5⍴⍳60

1
• 0⊂(1 2 3)(4 5 6)(7 8 9)

The right argument to partitioned enclose is a nested vector, so the result would generally be a vector, where each item would be a vector of triples of integers. So, in this case, the result is an empty vector of empty vectors of triples of integers:

(0⍴⊂0⍴⊂3⍴⍬)≡0⊂(1 2 3)(4 5 6)(7 8 9)
⍝      ↑ the triples of integers
⍝   ↑ the empty vectors of triples of integers
⍝↑ the empty vector of empty vectors of triples of integers

1
• 0⊂('cat')('dog')(7 8 9)

This example is very similar to the previous one, except now we have a mixed vector. However, when working with empty vectors, fill items, and prototypes, what matters is the first item of the vector, which is a 3-item character vector in this case. Thus, the result will be the same as before, except we have triples of characters instead of triples of integers:

(0⍴⊂0⍴⊂3⍴'')≡0⊂('cat')('dog')(7 8 9)
⍝      ↑ the triples of characters ...

1
• 0⊂(14 'cat' 8)('a' 2 'c' 4)(1 2 3)

Solving this final expression requires applying the same thought process as before. The first item is a 3-item vector with an integer, a 3-item character vector, and another integer, so that is exactly what we shall recreate:

(0⍴⊂0⍴⊂0'   '0)≡0⊂(14 'cat' 8)('a' 2 'c' 4)(1 2 3)

1
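
Because these results are empty, nothing is displayed when they are evaluated. One way to look inside is to take the first of the empty result, which returns its prototype, and then ask for its shape. The following is a sketch that should agree with the matches shown above:

```apl
⍴0⊂'Partition'        ⍝ 0: the result itself is an empty vector
⍴⊃0⊂3 2⍴⎕A           ⍝ 3 0
⍴⊃0⊂3 4 5⍴⍳60        ⍝ 3 4 0
```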

Solution to Exercise 10.8

To check if a word starts and ends with the same character, we can take a character from the front, one from the back, and compare those:

word ← 'area'
(1↑word)=¯1↑word

1

However, we can also make use of the first primitive that we just learned about. First ⊃ picks the first character of a character vector, so we are just left with picking the last element of the word. An interesting way to look at it is by realising that the last element of a (character) vector is the first element of its reverse:

⊃⌽'last'

t

Thus, given a vector of words, we can use each to work on each word separately:

StartAndEnd ← { {(⊃⍵)=⊃⌽⍵}¨⍵ }
StartAndEnd 'area' 'banana' 'shoes'    ⍝ 1 0 1

1 0 1

The function appears to be working, so let us test it on the other examples:

StartAndEnd 'cape'                     ⍝ 0

1 1 1 1

Ok, clearly the function is not working yet. The issue is that the argument ⍵ is a 4-item vector and {(⊃⍵)=⊃⌽⍵}¨ is going to traverse each of the characters of that vector. The fix for this is using the primitive function nest to preprocess the argument, to guarantee that the argument is always nested.

StartAndEnd ← { {(⊃⍵)=⊃⌽⍵}¨⊆⍵ }

StartAndEnd 'area' 'banana' 'shoes'    ⍝ 1 0 1

1 0 1
StartAndEnd 'cape'                     ⍝ 0

0
StartAndEnd 'z'                        ⍝ 1

1
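
The effect of nest on the two kinds of argument can be observed directly. This short sketch only uses primitives already met in this chapter:

```apl
(⊆'cape')≡⊂'cape'                     ⍝ 1: a simple vector gets enclosed
(⊆'area' 'banana')≡'area' 'banana'    ⍝ 1: an already nested vector is left alone
≢⊆'cape'                              ⍝ 1: one word, not four characters
```

This is why the each in StartAndEnd now sees a single word when given 'cape', instead of four separate characters.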

Solution to Exercise 10.9

In order to extract the sub-vectors from the big character vector, we need to take the starting indices and count five indices starting from there:

start ← 6 27 52

start + ¯1+⊂⍳5

┌──────────┬──────────────┬──────────────┐
│6 7 8 9 10│27 28 29 30 31│52 53 54 55 56│
└──────────┴──────────────┴──────────────┘

Now, the most pragmatic thing to do is to enlist all those indices and index directly into the character vector:

text ← 'This boring text has been typed just for a little experiment.'
text[∊start+¯1+⊂⍳5]

borintypedxperi

Putting this in a function gives:

Extract ← { ⍺[∊⍵+¯1+⊂⍳5] }
text Extract start
⍝ 'borintypedxperi'

borintypedxperi
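
If the extracted pieces are wanted as separate sub-vectors rather than one flat vector, the enlist can be replaced by indexing with each set of indices in turn. ExtractEach is a hypothetical variant, not part of the exercise:

```apl
text ← 'This boring text has been typed just for a little experiment.'
start ← 6 27 52
ExtractEach ← {txt←⍺ ⋄ {txt[⍵]}¨⍵+¯1+⊂⍳5}   ⍝ hypothetical variant: one sub-vector per start
text ExtractEach start                       ⍝ 'borin' 'typed' 'xperi'
∊text ExtractEach start                      ⍝ borintypedxperi
```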

Solution to Exercise 10.10

This exercise is very similar to the previous one, except that now we don’t extract sub-vectors of fixed length: the lengths depend on the right argument. Fixing this can be done by using iota each:

length ← 3 8 4

]dinput
ExtractL ← {
(start length) ← ⍵
⍺[∊start+¯1+⍳¨length]
}

text ExtractL start length
⍝ 'bortyped juxper'

bortyped juxper
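
As a sanity check, asking ExtractL for five characters at every position should reproduce the result of the previous exercise. The sketch below writes ExtractL on a single line, with ⋄ separating the two statements, so that it can be pasted without ]dinput:

```apl
text ← 'This boring text has been typed just for a little experiment.'
start ← 6 27 52
ExtractL ← {(start length)←⍵ ⋄ ⍺[∊start+¯1+⍳¨length]}
text ExtractL start (3 8 4)     ⍝ bortyped juxper
text ExtractL start (5 5 5)     ⍝ borintypedxperi
```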