For the last 36 hours, off and on, I’ve been working on solutions to Project Euler problems, just to learn some F#. I started at the beginning and have been working my way up, as the difficulty increases.
So far, I’ve solved problems 1-10, 14, 47, and 243. Problem 9 doesn’t really count as I basically borrowed the code verbatim from another site, and problem 5 almost doesn’t as I solved it with pencil and paper.
I may cover a few of the others later, but for now I’ll focus on 14.
Problem 14 was a bit trickier than it looked. I knew from the start that I’d have to resort to some sort of memoization to get runtime down to a reasonable level. As noted on the Project Euler site:
Each problem has been designed according to a “one-minute rule”, which means that although it may take several hours to design a successful algorithm with more difficult problems, an efficient implementation will allow a solution to be obtained on a modestly powered computer in less than one minute.
A brute-force solution without memoization took way too long on the virtual machine I use to run Visual Studio 2010. (I don’t know how long, because I killed it after about ten minutes.) So, I borrowed some code from Programming F# (page 199):
open System.Collections.Generic
let memoize (f : 'a -> 'b) =
let dict = new Dictionary<'a, 'b>()
let memoizedFunc (input : 'a) =
match dict.TryGetValue(input) with
| true, x -> x
| false, _ ->
let answer = f input
dict.Add(input, answer)
answer
memoizedFunc
The project statement uses a starting point of 13, and gives a sequence length of 10; I realized that if the sequence ever reached 13, for instance, it would always be at 1 ten iterations later. My first attempt at a solution copied around the actual sequence of numbers for each starting point. Ideally, if I always tacked the new number on as the head, the tail would eventually be a list that had already been generated, and wouldn’t need to be recreated or even copied, as the new list could just point to the old one.
While this wasn’t a horrible idea, it seemed a lot slower than I’d like. So, I switched to a simple tally of sequence length. Still slow.
After I let the program run all the way through, I realized my problem: the debugging output. What started as a ten-minute process shortened itself to about ten seconds when I removed the printfn from the inner function. I probably could have kept the sequence after all.
let rec collatz =
// The only reason to create an inner function here
// is to memoize it. You have to call the outer
// function recursively or the memoize function
// won't catch it.
let collatz_ (x : int64) =
if x = 1L then 0
elif x % 2L = 0L then
1 + collatz (x / 2L)
else
1 + collatz (3L * x + 1L)
memoize collatz_
// Candidates
[ 1L..999999L ]
// Calculate path length for each, decorate
|> List.map (fun x -> (collatz x, x))
// Sort by path length...
|> List.sortBy fst
// ... long paths first
|> List.rev
// Give me just the longest path
|> List.head
// Give me just the starting point
|> snd;;
PS: if you’re asking why it’s called “collatz”?
So, what did I learn?
I learned that console output is slow. I also learned that decorating an input list with tuples (a la the “Schwartzian Transform”) is a useful technique in F#. Function pipelining is incredibly handy. And I’m addicted to Project Euler.
This post couldn’t be more on the money.