Would I get hired as a programmer?

In a recent posting on Coding Horror regarding the difficulty of hiring programmers who can actually program, a couple of typical interview questions were discussed as elicting baffled responses. The first one concerned a solution to Fizz Buzz, which I figured I could do without problem, However, the following interview question, posted by Paul Jungwirth, caused me to pause:

You have the numbers 123456789, in that order. Between each number, you must insert either nothing, a plus sign, or a multiplication sign, so that the resulting expression equals 2002. Write a program that prints all solutions. (There are two.) This is a pretty tricky problem, actually, if you don't think to use recursion. I was satisfied with pseudocode, but the Perl Mongers version was to fit the whole thing into 80 characters!

Note: the original problem statement used 2001 as the target number, but this was later corrected to 2002.

Assuming the hiring process would ignore my long-in-the-toothness, how would I tackle this in Clojure? OK, first I'd have to create every combination of the operators (dot being the "nothing", i.e. concatenation, operator). In Clojure, this could look like:

  (defn comb [n ls]
    (if (= n 0)
      (list ls)
      (apply concat (for [op '(+ * dot)] 
                      (comb (dec n) (cons op ls))))))  

Which works like so:

  user=> (comb 2 nil)
  ((+ +) (* +) (dot +) (+ *) (* *) (dot *) (+ dot) (* dot) (dot dot))
  user=>

The sequences of operators generated could then be interleaved with the numbers to create the full set of expressions, for example:

  user=> (interleave [1 2 3] '[* * end]) ; end added so seqs are same length 
  (1 * 2 * 3 end)
  user=> 

This is when I realised that choosing Clojure may not have been such a good idea. While Clojure has an eval function, it naturally expects the normal Lisp prefix (Polish) notation. In the real world, I'd probably change my implementation language to Python, which has eval for infix notation built in. Alternatively, I could work out how to generate the Clojure expressions in prefix notation. However, I couldn't figure out a simple algorithm for that (but I'm sure there is one), so as a learning experience, I chose to write a parser to evaluate the generated infix expressions in Clojure.

First attempt: simple recursive descent parser which builds an abstract syntax tree (AST) as a Clojure form. This is based on the following grammer:

  plusexp ::= mulexp | mulexp '+ mulexp
  mulexp ::= dotexp | dotexp '* dotexp
  dotexp ::= number | number 'dot number
  number ::= 1..9

Here's the Clojure code, which includes a simple lexer to consume, (get-next), or peek at, (peek-next), the next token in an expression.

  (let [stream (atom nil)]
    (defn init-stream [ls]
      (reset! stream ls))
    (defn peek-next []
      (first @stream))
    (defn get-next []
      (let [elt (first @stream)]
        (reset! stream (rest @stream))
        elt)))
  
  (defn dot [i j]
    (+ (* i 10) j))

  (defn dotexp []
    (let [v (get-next)
          op (peek-next)]
      (if (= op 'dot)
        (do (get-next) (list 'dot v (dotexp)))
        v)))

  (defn mulexp []
    (let [v (dotexp)
          op (peek-next)]
      (if (= op '*)
        (do (get-next) (list '* v (mulexp)))
        v)))

  (defn plusexp []
    (let [v (mulexp)
          op (peek-next)]
      (if (= op '+)
        (do (get-next) (list '+ v (plusexp)))
        v)))

   (defn parse1 [exp]
    (init-stream exp)
    (plusexp))

This code works, but the Clojure expressions generated (a representation of an abstract syntax tree) are incorrect; like operators have a right associativity rather than the left association they are supposed to have. For example:

  user=> (parse1 '(1 + 2 * 3 + 4))
  (+ 1 (+ (* 2 3) 4))
  user=> 

You will note that the last + operator is executed before the first. While not a problem for + and *, as they are commutative, the dot operator is not. Of course, I could always redefine dot but that's too easy.

Second attempt: uses the same method of acquiring tokens from the expression, is more complicated, but produces a correct AST.

  (declare plusarg mularg dotarg expr1)

  (defn dotarg [larg]
    (let [rarg (get-next)
          next-op (peek-next)]
      (condp = next-op 
        'dot (do (get-next) (dotarg (list 'dot larg rarg)))
        '* (do (get-next) (mularg (list 'dot larg rarg)))
        (list 'dot larg rarg))))
 
  (defn mularg [larg]
    (let [rarg (get-next)
          next-op (peek-next)]
      (condp = next-op
        'dot (do (get-next) (list '* larg (dotarg rarg)))
        '* (do (get-next) (mularg (list '* larg rarg)))
        (list '* larg rarg))))

  (defn plusarg [larg]
    (let [rarg (get-next)
          next-op (peek-next)]
      (condp = next-op
        'dot (do (get-next) (list '+ larg (dotarg rarg))) 
        '* (do (get-next) (list '+ larg (mularg rarg)))
        (list '+ larg rarg))))

  (defn expr1 [larg]
    (let [op (get-next)]
      (condp = op
         'dot (expr1 (dotarg larg))
         '* (expr1 (mularg larg))
         '+ (expr1 (plusarg larg))
        larg)))

   (defn parse [exp]
    (init-stream exp)
    (expr1 (get-next)))

Here's the same expression as above, but evaluated with the new parser:

  user=> (parse '(1 + 2 * 3 + 4))
  (+ (+ 1 (* 2 3)) 4)
  user=> 

For more information on RDP's, see this article. The second version of the parser, above, appears to be a variant on the Precedence Climbing algorithm he describes.

Note that both these approaches share a common problem; they are recursive, but do not/cannot make use of tail-recursion, so are liable to blow the stack if passed a large expression to evaluate. It's not a problem for these toy expressions, of course, but the approach would need to change for production code (maybe use the trampoline function).

OK, with those building blocks, let's put together some code to solve the original problem:

  (def whole-numbers (iterate inc 1))

  (defn calc [n target]
    (loop [ops-list (comb (dec n) nil) sols nil]
      (let [exp (interleave 
                 (take n whole-numbers) 
                 (concat (first ops-list) '(end)))
            tot (eval (parse exp))]  
        (if (empty? ops-list)
          sols
          (recur (rest ops-list) 
                 (if (= tot target)
                   (cons (list exp tot) sols)
                   sols))))))

Hmm, I find it hard to forget my imperative programming past. Really shouldn't need the loop/recur. Here's a (probably) more idomatic version of calc.

  (defn calc [n target]
    (filter #(= (first %) target)
            (for [ops (comb (dec n) nil)]
              (let [exp (interleave (take n whole-numbers) (concat ops '(end)))]
                [(eval (parse exp)) exp]))))

Now we can find out the two solutions mentioned in the problem:

  user=> (calc 9 2002)
  ([2002 (1 * 2 + 3 dot 4 * 5 dot 6 + 7 + 8 dot 9 end)] 
   [2002 (1 * 2 dot 3 + 4 dot 5 * 6 * 7 + 8 dot 9 end)])
  user=> 

There are a couple of problems with this approach. One, the parsing functions are not functional (i.e. the use of (get-next)) and two, the calc function took a long time, over 12 seconds:

  user=> (time (calc 9 2002))
  "Elapsed time: 12224.948701 msecs"
  (((1 * 2 dot 3 + 4 dot 5 * 6 * 7 + 8 dot 9 end) 2002) 
   ((1 * 2 + 3 dot 4 * 5 dot 6 + 7 + 8 dot 9 end) 2002))
  user=> 

How does this compare to a solution in python? Here's the python code:

  def combine(n):
      exp = ""
      exps = list()
      def comb(d,e,es):
          if d == 0:
              es.append(e+str(n))
          else:
              for op in ["+","*",""]:
                  comb(d-1,e+str(n-d)+op,es)

      comb(n-1,exp,exps)
      return exps


  for exp in combine(9):
      tot = eval(exp)
      if tot == 2002:
          print exp

And timing is:

  [~/dev/clojure/src]$ time python comb.py
  1*2+34*56+7+89
  1*23+45*6*7+89

  real    0m0.391s
  user    0m0.031s
  sys     0m0.062s
  [~/dev/clojure/src]$

Much faster. After a little investigation, the bottleneck in the Clojure code is the (eval) function. Converting the parser to perform the computation, rather than generating an AST, made the speed of the Clojure version comparable to that of the python solution.

To fix the non-functional approach, here's a version that is functional. Essentially, every RDP function passes back both the left argument and the remaining expression when it finally returns, as a vector. However, I think it's even more obtuse than the original.

  (defn rrest [ls]
    (rest (rest ls)))

  (defn dotarg [larg exp]
    (let [rarg (first exp)
          next-op (second exp)]
      (condp = next-op 
        'dot (dotarg (list 'dot larg rarg) (rrest exp))
        '* (mularg (list 'dot larg rarg) (rrest exp))
        [(list 'dot larg rarg) (rest exp)])))

  (defn mularg [larg exp]
    (let [rarg (first exp)
          next-op (second exp)]
      (condp = next-op
        'dot (let [argexp (dotarg rarg (rrest exp))]
               [(list '* larg (first argexp)) (second argexp)])
        '* (mularg (list '* larg rarg) (rrest exp))
        [(list '* larg rarg) (rest exp)])))

  (defn plusarg [larg exp]
    (let [rarg (first exp)
          next-op (second exp)]
      (condp = next-op
        'dot (let [argexp  (dotarg rarg (rrest exp))]
               [(list '+ larg (first argexp)) (second argexp)])
        '* (let [argexp (mularg rarg (rrest exp))]
             [(list '+ larg (first argexp)) (second argexp)])
        [(list '+ larg rarg) (rest exp)])))

  (defn expr1 [[larg exp]]
    (let [op (first exp)]
      (condp = op
        'dot (expr1 (dotarg larg (rest exp)))
        '* (expr1 (mularg larg (rest exp)))
        '+ (expr1 (plusarg larg (rest exp)))
        [larg nil])))

   (defn parse [exp]
     (first (expr1 [(first exp) (rest exp)])))

After all that, I suspect I would not have passed the interview, as they'd have been a bit impatient after watching me hack code for two days...

Addendum

Much later, with some time on my hands, I tried a different approach to parsing the infix expressions. This time, I specified explicit operator priorities with a recusive parser. The end result is:

  (def prio {'+ 1 '* 2 'dot 3 nil 0})

  (defn parse [pending-op larg]
    (let [op (get-next)
          rarg (get-next)
          next-op (peek-next)]
      (if (nil? op)
        larg
        (if (>= (prio op) (prio next-op))
          (if (<= (prio next-op) (prio pending-op))
            (list op larg rarg)
            (parse pending-op (list op larg rarg)))
          (parse nil (list op larg (parse op rarg)))))))

This took a while to develop; it wasn't until I twigged that I had to pass the pending operator to the parse invocation that constructed the right argument, so it knew when to return the sub-expression.