Wednesday, July 27, 2011

Odersky talk (16 minutes) at OSCON...

Typesafe and Martin Odersky are working very hard on the parallel and concurrency problems. And all in Scala. This talk is so exciting as it shows how wide and deep they are approaching the problem. And how Scala's unique OO + FP approach to software engineering is ideal for this problem domain.

Sunday, July 24, 2011

Project Euler - Problem 002 - Learning to think the Scala way...

This post continues my journey to learn Scala by using the problems off of the web-site projecteuler.com (descriped in my original post here: http://fromjavatoscala.blogspot.com/2011/05/project-euler-problem-001-learning-to.html)

Problem Statement - 002:


Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:
By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.



Visualization/Examples:
  1. Natural numbers in Fibonacci series not exceeding 100 with a multiple of 2
    1. Numbers not exceeding 100 = 1 2 3 ... 98 99 100
    2. Fibonacci Series = 1 2 3 5 8 13 21 34 55 89
    3. Multiples of 2 = 1 2 3 5 8 13 21 34 55 89
    4. Sum = 2 + 8 + 34 = 44


Mathematical Notation:
TBD

Test Cases:

  1. floor = (0 and 1 as 1), ceiling = 100, multiple = 2 , answer = 2 + 8 + 34 = 44 (produced by hand)
  2. floor = (0 and 1 as 1), ceiling = 4000000, multiple = 2, answer = ? (produce from code)

Code Implementation Approaches:
The first and most intuitive approach for me at this time is to go one by one for each two-number sum starting from floor (0 and 1) to ceiling (100) and test a number for whether it is divisible by 2. If so, add it to a total I am keeping. And when the loop is finished, I will have my answer. Here's the small Scala Object class I created to perform the work:

object Problem002 extends App {
  //http://projecteuler.net/index.php?section=problems&id=2
  override def main(args : Array[String]) : Unit = {
    javaWithoutSemicolons
  }
  def javaWithoutSemicolons : Unit = {
    var first = 0   //floor
    var second = 1  //floor
    val ceiling = 100
    val multiple = 2
    var sum = 0
    var next = 0
    while (second <= ceiling) {
      next = first + second
      first = second
      second = next
      if (first % multiple == 0)
        sum = sum + first
    }
    Console.println("javaWithoutSemicolons.sum=" + sum)
  }
}

Just like for problem 1, this code looks almost identical to the Java code I would write, essentially Java without semicolons. Note the presence of the "var"s which is a strong indication I am using the "imperative" approach again (as contrasted with the "functional" approach) to composing a solution.

The next step, again like last time, is to "refactor" this code to move towards the functional paradigm. Let's start by creating a function which returns a list of Fibonacci numbers less than a passed in value. And then use our now familiar filter and sum to obtain our value.

def headedTowardsFunctional : Unit = {
  def fibList (ceiling: Int, floor: Int = -1): List[Int] = {
    var values = List[Int]()
    var first = 0
    if (first > floor)
      values = first :: values
    var second = 1
    var next = 0
    while (second <= ceiling) {
      next = first + second
      first = second
      if (first > floor)
        values = first :: values
      second = next
    }
    values.reverse
  }
  val ceiling = 100
  val multiple = 2
  val fibs = fibList(ceiling)
  val fibsFilter = fibs.filter(n => (n % multiple == 0))
  val sum = fibsFilter.sum
  Console.println("headedTowardsFunctional.sum=" + sum)
}

I have successfully removed all the vars from the main method. However, there are still several in my fibList function. There are a number of approaches possible here. The first that comes to mind is to collect a list of all the numbers in the series which fit in an Int. While it sounds huge, it isn't. There are less than a fifty (47 to be precise). Because the series is exponential in nature, not linear, the entire thing fits in a tiny list.

So, here's a version which separates the production of the whole series up to the highest value an Int can hold. And then that list is easily handled with a for comprehension.

def functional1 : Unit = {
  val fibsAll = {
    var fibs = 0 :: List()
    var current = fibs.head
    var next = 1
    var continue = true
    while (continue) {
      current = fibs.head
      fibs = next :: fibs
      continue = ((0.0 + next + current) <= Int.MaxValue)
      if (continue)
        next = next + current
    }
    fibs.reverse
  }
  def fibList (ceiling: Int, floor: Int = -1): List[Int] = {
    val values =
      for (i <- fibsAll if ((i > floor) && (i < ceiling))) yield i
    values.reverse
  }
  val ceiling = 100
  val multiple = 2
  val fibs = fibList(ceiling)
  val fibsFilter = fibs.filter(n => (n % multiple == 0))
  val sum = fibsFilter.sum
  Console.println("functional1.sum=" + sum)
}

And it is just a very simple step from there to take this fully functional (excluding the function used by fibAll). And while we're at it, let's move the ceiling and multiple constants into being root method parameters:

def functional2(ceiling: Int, multiple: Int) : Unit = {
  val fibsAll = {
    //generate all 47 for an Int
    var fibs = 0 :: List()
    var current = fibs.head
    var next = 1
    var continue = true
    while (continue) {
      current = fibs.head
      fibs = next :: fibs
      continue = ((0.0 + next + current) <= Int.MaxValue)
      if (continue)
        next = next + current
    }
    fibs.reverse
  }
  def fibList (ceiling: Int, floor: Int = -1): List[Int] = {
    val values =
      for (i <- fibsAll if ((i > floor) && (i < ceiling))) yield i
    values.reverse
  }
  val sum = fibList(ceiling).filter(n => (n % multiple == 0)).sum
  Console.println("functional2.sum=" + sum)
}

I don't particularly like the function I have created for fibsAll. It "smells" imperative with it's ample var-ness. How might I go about making it better, more in the Scala style? I've googled other implementations of Fibonacci in Scala, but all are fixated on producing the sum for n elements (and several are incorrect in that they seem to skip the first two root elements (0, 1) starting with (1, 2) instead. The correct sequence starts as [0, 1, 1, 2, 3, 5...] and not [1, 2, 3, 5...]. Even the projecteuler.com Problem 2 states the beginning of the sequence incorrectly at [1, 2, 3...].

After posting my current implementation of fibAll to CodeReview (like StackOverflow, but more specifically for code reviews) and going several comment cycles with Leonardo Lucena there, I finally have a much better solution for for implementing fibsAll. Here it is:

def functional3(ceiling: Int, multiple: Int) : Unit = {
  def fibs(a: Int = 0, b: Int = 1): Stream[Int] = Stream.cons(a, fibs(b, a + b))
  val fibsAll = fibs().takeWhile(n => (n > -1)).toList
  def fibsList (ceiling: Int, floor: Int = -1): List[Int] = {
    for (i <- fibsAll if ((i > floor) && (i < ceiling))) yield i
  }
  val sum = fibsList(ceiling).filter(n => (n % multiple == 0)).sum
  Console.println("functional3.sum=" + sum)
}

So what is the fibs function doing? And what is Stream.cons? And while it's recursive, it's not able to be optimized using the @tailrec annotation. The fibs function is simulating an infinite series, in this case based on Fibonacci. Stream is an object that enables temporary state (via recursion) for a function. It behaves similar to a List, but waits to evaluate the passed function until it is required to do so which in this case does not happen until the takeWhile call when assigning the val fibsAll. As to the inability to do the @tailrec optimization; the actual depth required by this series is relatively shallow at 47. However, if the takeWhile had been replaced with a take(1000) or the like, the function would likely abort with a stack overflow. At this time, I don't exactly know how to approach refactoring the function so as to remain consistent with the functional/recursive approach and enable the @tailrec annotation.

So, this we are at the most functional we can be at this time. Now, for the inevitable - change...

Requirements change (keeping it real):
Or said a slightly different way, slightly changing and expanding the original problem so as to explore code design, refactoring and performance. Using the same basic model in the original problem, find the answer for the following set of problems:

  1. floor = 0, ceiling = 100, multiple 2
  2. floor = 0, ceiling = 4000000, multiple 2
  3. floor = 0, ceiling = 4000000, multiple 3
  4. floor = 0, ceiling = 4000000, multiples 2, 3, 5, 7
  5. floor = 100, ceiling = 4000000, multiple 2


Code Refactoring (fancy way of saying "adapting"):
While I was developing the solution pathway above, I did so with an eye towards the likelyhood that the requirements might change (or in this case, expand). Hence, my placing the floor in the fibsList function. Given this new set of problems, we are most of the way there. We just need to adjust for a set of multiples, as opposed to a single one. And since we already solved that in Problem 1, we just have to graft the solution there into this code. Here's what it looks like:

def functional4(ceiling: Int, multiples: List[Int]) : Unit = {
  def fibs(a: Int = 0, b: Int = 1): Stream[Int] = Stream.cons(a, fibs(b, a + b))
  val fibsAll = fibs().takeWhile(n => (n > -1)).toList
  def fibsList (ceiling: Int, floor: Int = -1): List[Int] = {
    for (i <- fibsAll if ((i > floor) && (i < ceiling))) yield i
  }
  val sum = fibsList(ceiling).filter(n => {multiples.exists(x => (n % x) == 0)}).sum
  Console.println("functional4.sum=" + sum)
}

It's very pleasing to be able to pluck out and reuse part of what we learned from Problem 1. And, we're done! On to Problem 3...soon. {smirk}