Lazy Evaluation and Dataflow

This chapter is about lazy evaluation and dataflow programming in Owl. Different from Haskell, OCaml eagerly evaluates the expressions in a program. However, laziness can be very helpful in numerical applications which involve complicated computation graphs and large data chunks such as matrices. Laziness can potentially reduce unnecessary memory allocations, optimise computation graph, incrementally update results, and etc.

Owl is able to simulate the lazy evaluation atop of OCaml using its Lazy functor. This functor has been optimised for memory allocation. Moreover, Lazy functor also provides a dataflow programming model to explicitly construct a computation graph.

The examples used in this chapter can be found in [].

Core APIs

The Lazy functor is specifically designed for numerical computing which uses ndarray as its core data structure. It does not aim to provide a general framework beyond aforementioned numerical scope. To generate a new module, you need to pass in Owl’s ndarray module.

module M = Owl.Lazy.Make (Dense.Ndarray.D);;
module N = Owl.Lazy.Make (Dense.Ndarray.S);;

The code above generates two lazy modules supporting single and double precision float numbers respectively. In the generated modules, there are several core APIs you need to know.

  • val variable : unit -> t : this function creates a placeholder for a variable in the computation graph. This is where you can start constructing a computation graph. The variables are the inputs of the graph, and you can plug in actual values later.
  • val assign_arr : t -> A.arr -> unit : this function assigns a specific ndarray value to the variable previously created by variable function. Note that you can only assign values to those created by variable function, otherwise an exception will be thrown.
  • val assign_elt : t -> A.elt -> unit : similar to assign_arr, this function assigns a value of type A.elt to a variable. A.elt depends on the module you plug into Lazy functor, but it refers to a float number in our current context.
  • val eval : t -> unit : this function evaluates the associated computation graph for a given variable of type t.
  • val to_arr : t -> A.arr : this function unpacks the value of an ndarray from a variable of type t. You can only use this function after you call eval.
  • val to_elt : t -> A.elt : Similarly, to_elt is used for unpacking the value of float number out of a variable.

Note: assign_arr and assign_elt can only be applied to those variables previously created by the variable function. There are other ways to plug in the inputs into a computation graph, e.g. the following two functions.

  • val of_arr : A.arr -> t : pack an ndarray and return it as a constant of type t.
  • val of_elt : A.elt -> t : pack an element and return it as a constant of type t.

The difference is that inputs created by of_arr and of_elt are treated as constants and their values cannot be changed in the evaluation of a computation graph. Therefore, you cannot use assign_arr and assign_elt to change the value of inputs created by these two functions.

A Simple Example

The following code snippet shows how to perform a sequence of computation on an ndarray x with eager evaluation. This is how you write normal application in Owl normally.

let eager_eval x () =
  Arr.(add x x |> log |> sin |> neg |> cos |> abs |> sinh |> round |> atan)

How can we perform the same computation using lazy evaluation? The following code demonstrates how Lazy functor can help us to achieve this.

let lazy_eval x () =
  let a = M.variable () in
  let z = M.(add a a |> log |> sin |> neg |> cos |> abs |> sinh |> round |> atan) in
  M.assign_arr a x;
  M.eval z

As you can see, we first define a placeholder for variable a. Then we use the predefined math operators in Lazy functor to construct the computation graph, and the code looks exactly the same as that in eager evaluation.

Comparing to eager evaluation, we need some extra steps: first, assign value x to variable a; second, evaluate the computation graph by calling eval function. z is the output of the computation graph, you can also think z represents the expression that produces the final value.

We did not really extract the final value from z, but it is trivial by calling M.to_arr z.

Visualise Computation Graph

Lazy functor provides two APIs that help you to visualise the computation graph corresponding to an expression.

  • val to_trace : t list -> string : the function returns a string which can be printed out on the terminal directly.
  • val to_dot : t list -> string : the function returns a dot-formatted string to represent the graph structure. Because it is in dot format, you can save it to a file then use another tool (such as GraphViz) to produce the figure.

For example, the following code prints out the trace using to_trace function.

let print_trace a =
  let x = M.variable () in
  let y = M.variable () in
  let z = M.(add x y |> sin |> abs |> log) in
  M.assign_arr x a;
  M.assign_arr y a;
  M.eval z;
  M.to_trace [z] |> print_endline

You can probably see the similar output on your screen if you try it out in utop. “valid” and “invalid” states relate to the internal mechanism of Lazy functor, we can simply ignore them at the moment.

[ #5 name:abs state:invalid ] -> [ #6 name:log state:valid ]
[ #4 name:sin state:invalid ] -> [ #5 name:abs state:invalid ]
[ #3 name:add state:invalid ] -> [ #4 name:sin state:invalid ]
[ #1 name:variable state:valid ] -> [ #3 name:add state:invalid ]
[ #2 name:variable state:valid ] -> [ #3 name:add state:invalid ]

What if you want to actually visualise the graph in to a picture, then you need to use to_trace function as the following code snippet does.

let print_graph () =
  let x = M.variable () in
  let y = M.variable () in
  let z = M.(add x y |> dot x |> sin |> abs |> sum' |> add_scalar x |> log |> atan2 y |> neg |> relu) in
  print_endline "visualise computation graph to dot file ...";
  M.to_dot [z] |> Utils.write_file "";
  Sys.command "dot -Tpdf -o plot_lazy.pdf"

With the code above, you can generate a nice figure as below. Note you need to install GraphViz on you local computer because Sys.command calls dot command which is a tool in GraphViz.

computation graph

Incremental Computation

What is incremental computation? Many numerical applications involve calculations on a large amount of variables which further creates complicated computation graphs. Quite often, we only change the value of several variables then we have to re-evaluate the whole graph. However, re-evaluating the whole computation graph is very expensive and is not necessary in many cases.

Incremental computation refers to the situation we only re-evaluate the subgraph depending on the updated variables. This can also be referred to as Self-Adjusting Computing (SAC), or dataflow programming. It has many names but the general idea is the same.

The flowing code shows how to perform incremental computation with Lazy functor.

let incremental x =
  let a = M.variable () in
  let b = M.variable () in
  let c = M.(a |> sin |> cos |> abs |> log |> sum' |> mul_scalar a |> scalar_add b) in
  M.assign_arr a x;
  M.assign_elt b 5.;
  Printf.printf "Incremental#0:\t%.3f ms\n" (Utils.time (fun () -> M.eval c));
  M.assign_elt b 6.;
  Printf.printf "Incremental#1:\t%.3f ms\n" (Utils.time (fun () -> M.eval c));
  M.assign_elt b 7.;
  Printf.printf "Incremental#2:\t%.3f ms\n" (Utils.time (fun () -> M.eval c))

In the code, a and b are variables and c represents the expression we want to evaluate. After the first evaluation, we changed the assignment of b twice, then re-evaluate the expression.

Because of incremental computation, Owl does not have to re-calculate everything in the following evaluations. In fact, as we can see in the following figure, only the part of the computation graph in the red-dotted rectangular will be re-evaluated if we update variable b. This mechanism can significantly reduce the computation time. On my computer, the first full evaluation takes about 275ms whereas the latter two incremental ones takes only 30ms each, a big win!

incremental computation

Behind the Scene

In this section, I want to show you something under the hood. It is not critical if you just want to use Lazy functor, but knowing these details help you understand better about Owl’s design.

Lazy functor is optimised for memory allocation in numerical computing, because keep allocating large chunk of memory is really expensive. To reuse the allocated one, you can use in-place modification function in Owl. E.g., Arr.sin is a pure function and always returns a new ndarray whilst Arr.sin_ performs in-place modification and overwrites the original ndarray. However, this is not safe and you should avoid using in-place modification unless you really know what you are doing.

Lazy functor is an ideal solution in this case. It automatically tracks the reference number of each node in the computation graph and tries to reuse the allocated memory as much as possible. By so doing, Lazy module often brings us a huge benefit in both performance and memory usage when large ndarrays and matrices are involved in computation. You can safely construct very complicated computation graphs without worrying about blowing up your memory. For example, you can keep transforming a big ndarray using sin function one million times and only constant memory will be used (plus some additional overhead in maintaining the graph structure).

One thing worth noting is that Lazy functor will never overwrite the value you have assigned on a variable, so you can always safely reuse them later, the functor only tries to re-use the memory of internal nodes which are opaque to the outside. In short, you do not need to worry about the issues that something might get overwritten then later cause problems in your application, Owl will take care of it.

Regarding incremental computation, every time you re-assign a new value on a variable, Lazy functor will invalidate all the nodes in the subgraph that depends on this variable, i.e. its descendant nodes. When you evaluate an expression, only the nodes in the subgraph this expression depends on will be evaluated, i.e. its ancestor nodes. As you see, the directions are different.

All right, I think you have learnt enough today in order to be Lazy in Owl.