Tuesday, 11 June 2013

[Deep Breath] For-Comprehensions

As I already noted, for-comprehensions are the point at which I lost track of the Fast Track to Scala course.  I think the word “for” had lulled me into a false sense of security.  It didn’t last long.

So now I’ve hit them again, in Chapter 2 of Scala in Action, and this time I’m determined to understand them.  The rest of this post will comprise a running commentary on my progress towards this goal.

The Syntax

Already I can see why I got so confused:

‘for’ (‘(‘Enumerators’)’ | ‘{‘Enumerators’}’) {nl} [‘yield’] Expr

Scala in Action, Nilanjan Raychaudhuri, Chapter 2

But having just typed it in (no cut-and-paste for me), I can already see that, as before with functions and function-literals, that either parentheses and curly braces can be used to wrap the ‘Enumerators’. What else can I learn?

A Small First Step

On top of this, it must be my lucky day, as Nilanjan decides to tackle this from a perspective of tradition, and seeing as Java is the new Cobol, tradition is something that makes me very comfortable.  Lets iterate through some collections:

val files = new java.io.File(“.”).listFiles
for(file <- files) {
    val filename = file.getName
    if(filename.endsWith(“.scala”)) println (file)
}

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

This is so damn legible that I’m already one step ahead of Nilanjan when he says that this looks very much like the equivalent foreach construct in Java.  (Mmmmmm, the warm glow of familiarity.) But lets not jump ahead. we’re getting some terminology too – the “file <- files” element of all this is called a generator, and it’s job is to iterate through a collection.  Why not call it an “Enumerator” like the previous syntax snippet?  Don’t get too far ahead of yourself. I’m sure we’ll get to that.

OK. Continue.

Next, Add Some Definitions and Guard Clauses

Right, now we’re going to do a lot more within the generator.

for(
   
file <- files;
    filename = file.getName;
    if(filename.endsWith(“.scala”))

) println (file)

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

This is precisely one of the concepts I now realise I’d missed during the course.  Armed with my knowledge of the Scala-ite’s love of reducing things, I can begin to unpick what has happened here.  Nilanjan tells me we can add definitions and guard clauses within a for loop. I also notice that we’ve had to add semi-colons (first sighting of them in Scala-land). The former (filename = file.getName;) simply defines a new val and the latter (if(filename.endsWith(“.scala”)) runs a check against it. If the guard clause is true, then the body of the loop executes, in this case, println (file).

Before we move on, I wonder if we can inline it even more?:

for(
   
file <- files;
    if(file.getName.endsWith(“.scala”))

) println (file)

Sure can.  I think one trap I need not to fall into here is to think of this as being like a standard Java for loop.  The semi-colons (I am guessing) are there to help the compiler and that is all.  I now know I can inline the definitions. But can I have more of them?:

for(
   
file <- files;
    filename = file.getName;
    timeNow = new java.util.Date()

    if(filename.endsWith(“.scala”))

) println (“time now: ” + timeNow + “, file: “ + file)

Yup. This works fine, even with the just-noticed missing semi-colon (try it yourself).

But before I inadvertently fall off the deep end again, lets reel in the off-piste enthusiasm and get back to the book.

Multiple Generators

Now we’re getting complicated. Apparently we can specify more than one generator, and the loop is executed across both of them, just as if the latter was inside the former, and in light of that, the Scala syntax seems surprisingly simple (simpler than any way I can think of doing it in Java anyway):

scala> val aList = List(1, 2, 3)
aList: List[Int] = List(1, 2, 3)
scala> val bList = List(4, 5, 6)
bList: List[Int] = List(4, 5, 6)
scala> for { a <- aList ; b <- bList } println(a + b)
5
6
7
6
7
8
7
8
9

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

Nice.

One more thing before we turn to the next page, Nilanjan threw in that curly-braces-instead-of-parens trick I spotted earlier.  Tricksy.  He also put everything on one line. I was waiting for that, so that wasn’t so much of a surprise.

Interim Summary – The Imperative Form

Summary-time. Apparently all of these examples of the for-comprehension are in the imperative form. In this form we specify statements that will get executed by the loop (e.g. println (file) and println(a + b)) and nothing gets returns.  OK. I get that.  However, mapping back to the syntax definition at the top of this post, what I still don’t get is why we just talked about “statements” instead of “expr(essions)”, and “generators” instead of “enumerators”. Perhaps that will become clear later on. Lets keep on trucking.

Onward! The Functional Form (aka “Sequence Comprehension”)

Now we’re working with values rather than executing statements.  That means they’re objects, and remembering that everything in Scala is an Object I’ve got my wits about me.

scala> for {a <- aList ; b <- bList} yield a + b
res3: List[Int] = List(5, 6, 7, 6, 7, 8, 7, 8, 9)

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

Now luckily for me, (although this was the point during the course when I was lulled into a false sense of security) I have come across the yield keyword before, in Ruby-land.  But is it really lucky? After some back-and-forth reading / comparison, I’ve decided to not think about Ruby’s yield too much.  All I want to take is the warm glow I get from a familiar keyword, but tackle the Scala meaning on it’s own terms. Lets see if I manage.

Nilanjan deftly expresses the difference between this and the previous example.  Whereas originally we were just using the a and b values within the loop, now we’re returning the values from the loop.  Unsurprisingly with this in mind the result is a List[Int].  We can then store this result and use it:

scala> val result = for {a <- aList ; b <- bList} yield a + b
res3: List[Int] = List(5, 6, 7, 6, 7, 8, 7, 8, 9)

scala> for (r <- result) print(r)
5
6
7
6
7
8
7
8
9

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

He then points out that while this new form is more verbose (yup) it separates the computation (adding a to b) from the use of the result (in our case, simply printing it).  Does his claim that this improves reusability and compatibility?  I can see how it does.  And now I understand it, and have a nice syntax style I can parse, both styles seem very legible too.

Here We Are & This Is It

That was all surprisingly sane.  As usual, taking it slowly, reading around a little, and trying out a few edge cases to cement my understanding helped a great deal.  I think I’ll try the Koan related to this now to see how I get on, and if there is anything else I can learn.