My Experience With TDD In Clojure
I am spending more and more time writing code in functional-style languages, like Clojure and Scala. Even though most of those languages are mature enough to offer a complete development toolset, I am still finding ways to adapt techniques that are bread and butter in Object-Oriented and Procedural languages to this old new world.
One of them is Test-Driven Development, the technique of specifying and verifying the expected behaviour of a piece of code before writing the code itself. This is not unique to Object-Orientation; programs written in a functional style also benefit a lot from the design aid provided by tests.
Tests Driving Functional Design
Quite often, while reading a code base, I find something like this:
    (def sample "Wir laden unsere Batterie
    Jetzt sind wir voller Energie
    Wir sind die Roboter
    Wir funktionieren automatik
    Jetzt wollen wir tanzen mechanik
    Wir sind die Roboter
    Wir sind auf Alles programmiert
    Und was du willst wird ausgefuehrt")

    (def char-count-file "char-count.txt")

    (defn write-char-count [text]
      (spit char-count-file
            (reduce + (map #(.length %) (seq (.split text "\\W+"))))))

    (fact "writes the number of chars to the file"
      (write-char-count sample)
      (slurp char-count-file) => "190")
Unsurprisingly, this is hard to test at the unit level. Often char-count-file will be some hard-coded path, and to run my test I will have to make sure that the directory it tries to write to exists, that the path is supported by my operating system, that it holds no old data from previous test runs, etc.
As mentioned, using tests to drive your design leads to writing functions which are easier to test. One obvious way to make the previous function easier to test would be to send in the file name as a parameter, something like this:
    (defn write-char-count [target-file text]
      (spit target-file
            (reduce + (map #(.length %) (seq (.split text "\\W+"))))))

    (fact "writes the number of chars to indicated file"
      (let [test-output "test.txt"]
        (write-char-count test-output sample)
        ;; read back the file we just wrote, not the hard-coded one
        (slurp test-output) => "190"))
This is good enough for many cases, but passing in the dependencies of a function every time we want to call it easily leads to dependency management hell.
In Functional Programming, we tend to pass functions around. If we keep the code like it is now, when we call this function from multiple parts of our code we need all of those different and possibly unrelated parts to know about all dependencies, which could even be other functions with their own dependencies! In order to avoid spaghetti code, we need more closure (pun intended).
One way is to use a function that leverages lexical closures to make dependencies available to the function, but not let them leak out:
    (defn make-write-char-count [target-file]
      (fn [text]
        (spit target-file
              (reduce + (map #(.length %) (seq (.split text "\\W+")))))))

    (fact "writes the number of chars to indicated file"
      (let [test-output "test.txt"
            write-char-count-fn (make-write-char-count test-output)]
        (write-char-count-fn sample)
        (slurp test-output) => "190"))
Now the writer function always knows where to write to; when it is invoked, we just need to supply the contents to be written. You can pass this function anywhere in your system, and nothing but the code that invoked make-write-char-count has to know about file names.
A bad thing about all these examples is that we are still relying on the spit function, which requires us to write and read files during our tests. Real unit tests should not depend on any external resource, so we need to find a way to break this dependency.
One way of doing that is to use Midje’s prerequisites feature to mock out the call. I used this style for a while, but I always felt something wasn’t right about this approach.
These days my coding style has changed: even though I still use stubs a lot, I rarely find myself using mocks. I tend to build on the idea of higher-order functions introduced a couple of examples ago.
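For readers who want to see roughly what mocking out the call looks like without pulling in Midje, here is a minimal sketch that uses `with-redefs` from `clojure.core` instead. This is a swapped-in technique, not Midje's prerequisites, and the `calls` atom and the `stubbed-write` helper are hypothetical names introduced only for this illustration.

```clojure
(def char-count-file "char-count.txt")

(defn write-char-count [text]
  (spit char-count-file
        (reduce + (map #(.length %) (seq (.split text "\\W+"))))))

;; Hypothetical recorder for the arguments the replaced spit receives.
(def calls (atom []))

(defn stubbed-write
  "Runs write-char-count with spit temporarily replaced, so no file is written.
  Returns the recorded [target content] pairs."
  [text]
  (reset! calls [])
  (with-redefs [spit (fn [target content & _]
                       (swap! calls conj [target content])
                       nil)]
    (write-char-count text))
  @calls)

(stubbed-write "Wir sind die Roboter")
;; => [["char-count.txt" 17]]
```

Note that `with-redefs` changes the root binding of the var for every thread while the body runs, so a stub like this is best kept to single-threaded test code.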
First, we explicitly separate our char counting code from writing to a file system:
    (defn char-count [text]
      (reduce + (map #(.length %) (seq (.split text "\\W+")))))

    (defn make-writer [target-file]
      (fn [contents]
        (spit target-file contents)))

    (facts "about counting characters"
      (fact "we count all characters but empty spaces"
        (char-count " a b c d ") => 4)
      (fact "an empty string has 0 characters"
        (char-count "") => 0))

    ;; this should be run during integration testing
    (facts "about writing to the file system"
      (fact "it writes contents to the specified file"
        (let [test-output "some-file.txt"
              expected-content "banana"]
          ((make-writer test-output) expected-content)
          (slurp test-output) => expected-content))
      (fact "writing to a non-existing path fails"
        ((make-writer "/some/path/that/does/not/exist/blah.txt") "something")
        => (throws java.io.FileNotFoundException)))
By doing this, we were able to test the two different responsibilities in isolation. We can also have the integration test required by the writing function to run in a separate step from our unit tests, making it easier to follow the principles of the testing pyramid.
We then need to wire up our functions. There are many ways of doing this; we could just manually pipeline them whenever we need:

    ((make-writer "blah.txt") (char-count sample))
But, as said above, passing in dependencies every time you call a function is not good. We should try following our previous strategy of using a closure:
    (defn make-char-count-writer [target-file]
      (fn [text]
        (let [writer (make-writer target-file)]
          (-> text char-count writer))))

    (defn count-my-chars [text]
      (let [counter (make-char-count-writer char-count-file)]
        (counter text)))

    (count-my-chars sample)
This way our dependencies are contained in one place. This makes it easier to test, but also leads to two design styles that I believe are very useful when using Functional Programming:
- Isolating side-effects
- Function composition using Combinators
Isolating side-effects
In Functional Programming we want to work with values and pure functions. I really like how John Hughes explains the difference:
[...] if I give you a twenty pound note, that is not the same as giving you my credit card and instructions how to use my PIN to get twenty pounds from an ATM. What's the difference? What if I do the latter? You can choose not to do it at all, you can choose to do it as many times as you like, and it is clearly something quite different. So it's not hard if you explain that in this way for people to understand that there is a difference between an action and the value that it produces[...]
Side-effect-free programming has many advantages, but some side-effects are required to write useful programs (even reading from or writing to a console is a side-effect!). Most Functional Programming languages try to find a way to keep side-effects away; Erik Meijer has some great presentations on this topic.
As we try to make our code unit-testable, we isolate code that causes side-effects in order to replace it with functions which are easier to verify in our tests. This leads the developer to a style of coding where most of the functions are side-effect free.
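As a rough illustration of that replacement, the sketch below passes the writer in as a function, so a test can hand over a harmless substitute that only remembers what it was given. `make-counting-writer`, `written` and `count-into-atom` are hypothetical names for this example; `char-count` is the function defined earlier, repeated here so the block stands alone.

```clojure
(defn char-count [text]
  (reduce + (map #(.length %) (seq (.split text "\\W+")))))

;; Hypothetical variant of make-char-count-writer: instead of a file name,
;; it receives the side-effecting writer itself.
(defn make-counting-writer [writer]
  (fn [text]
    (writer (char-count text))))

;; In a test, the "writer" just stores its argument in an atom.
(def written (atom nil))

(def count-into-atom
  (make-counting-writer #(reset! written %)))

(count-into-atom "Wir sind die Roboter")
@written
;; => 17
```

In production code the same function could be given `(make-writer "some-file.txt")` instead; only that one call site would then touch the file system.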
Function composition using Combinators
Combinators are functions with no free variables: they depend on nothing but their parameters. In functional languages, the term is commonly used to refer to “functions that take other functions as arguments, returning new functions”. On the usage of the Combinator Pattern, the Haskell wiki says:
Libraries such as Parsec use the combinator pattern, where complex structures are built by defining a small set of very simple 'primitives', and a set of 'combinators' for combining them into more complicated structures. It's somewhat similar to the Composition pattern found in object-oriented programming.
Combinators can be used to decompose big tasks into smaller pieces, which are easier to understand and reason about. Smaller pieces are also much easier to test, and using TDD in Functional Programming can lead to this style.
Years of trying to understand and modify code written by other people taught me that there is a strong correlation between TDD and clean code bases. I am not sure why this happens, but I believe that the reason is that writing tests first forces a developer to think about the code from an outsider’s perspective, and this leads to better design.
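To make the idea a bit more concrete, here is a small sketch, under my own naming assumptions, of the character counter rebuilt from tiny pieces with `comp` and `partial` from `clojure.core`. The names `words`, `lengths`, `sum`, `char-count*` and `counting` are hypothetical and exist only for this illustration.

```clojure
;; Small primitives, each trivially testable on its own.
(def words   #(seq (.split % "\\W+")))   ; text -> sequence of words
(def lengths (partial map count))        ; words -> word lengths
(def sum     (partial reduce +))         ; lengths -> total

;; comp applies right to left: words first, then lengths, then sum.
(def char-count* (comp sum lengths words))

;; Hypothetical combinator: takes any one-argument function and returns a
;; new function that counts characters and passes the count on to it.
(defn counting [f]
  (comp f char-count*))

;; Combined with str instead of a file writer, the pipeline stays pure.
(def count-as-string (counting str))

(char-count* "Wir sind die Roboter")
;; => 17
(count-as-string "Wir sind die Roboter")
;; => "17"
```

Swapping `str` for a writer closure such as the one returned by `make-writer` would give the file-writing version without changing any of the pieces.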
The same anti-patterns that make code hard to test make it hard to reuse, read and modify. So far, my experience with TDD in functional programming is that it is just as beneficial in this paradigm as it is in more mainstream ways to think about software.