What is Clojure?

In broad strokes

The programming language Clojure is an effort to bring the elegance, flexibility, and brevity of Lisp (which has been around since the fifties!) to the JVM platform, gaining the many benefits that come with that:

  • Performance (don't laugh, it's true!)
  • Deployability
  • Access to the many excellent Java libraries

It adds to the mix several improvements on the basic ideas and idioms of both Lisp and Java, fine-tuned with the wisdom gained in the years since each was released.

All those dang parentheses

If you've ever considered learning any language in the Lisp family, chances are you balked at the parentheses. I did too! They look so alien to someone who's used to the C family of languages, which are filled instead with semicolons and curly braces. But really it's just an emotional reaction to unfamiliarity, and an overreaction: why should the punctuation of a language deter you from learning it?

And I promise, the more you get into Lisp, the more you will realize there are lots of exciting benefits to having those parentheses. Here's an example of a trade-off, where parentheses are easier in one instance, and "unnecessary clutter" in another very similar situation.

            (+ x (* 10 y)) | x + 10 * y
              (+ a b c 10) | a + b + c + 10
(.update node parent data) | node.update(parent, data)

The C/Java versions of math on the right look a little more natural because that's how you're used to seeing them from math classes, and the implicit operator grouping of 10 * y is convenient.

On the other hand, Lisp's syntax is very regular, and this comes with a lot of benefits of its own. For example, there is nothing special about arithmetic operators: you call them just like any other function in the language. In example 2, we didn't need to use the + sign more than once: until the close-paren of the + form, everything is added to the final sum. And in example 3, we didn't need any special punctuation to distinguish the target object, nor to separate the two arguments - in fact, in Clojure commas are just another form of whitespace!
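If you'd like to see that regularity in action, here's a small sketch you could paste into a REPL. The values of a, b, and c are made up purely for illustration.

```clojure
;; Hypothetical values, just for illustration.
(def a 1)
(def b 2)
(def c 3)

;; + takes any number of arguments, so one operator covers the whole sum.
(def total (+ a b c 10))

;; + is an ordinary function, so it can be handed to another function.
;; Here apply calls + with the elements of the vector as its arguments.
(def total-from-vector (apply + [a b c 10]))

;; Commas are whitespace: these two vectors are the same value.
(def same? (= [1, 2, 3] [1 2 3]))
```

Both totals come out to 16, and same? is true.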

But enough about punctuation. I've listed some of the benefits above, and believe me when I say there are more; let's set that aside and get to the meat of the language.

Clojure is functional

And I don't just mean that it works well! There are two main properties of functional languages:

  1. Functions are first-class data types, and you can do anything with them that you would do with, say, an integer: pass them around as arguments, store them locally, and create them as needed.
  2. It's very hard (or even impossible, in stricter functional languages) to change the value of a variable once you have created it. If you say x is 20, then it's 20 for as long as it's in scope. If you want x to be 21, you create a new scope in which you declare that it is 21. There are plenty of ways to create new scopes, just like in "imperative" (non-functional) languages, but one way that is much more common in functional languages than in imperative ones is recursion: an example will follow.
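Here's a minimal sketch of both properties. The names add-one, twice, adder, and scopes are ones I made up for this example; inc and let are built in.

```clojure
;; 1. Functions are values: store one, pass one, create one as needed.
(def add-one inc)                     ; store an existing function under a new name
(defn twice [f x] (f (f x)))          ; accept a function as an argument
(defn adder [n] (fn [x] (+ x n)))     ; build and return a brand-new function

(def r1 (twice add-one 5))            ; applies add-one two times to 5
(def r2 ((adder 10) 5))               ; calls the freshly made function with 5

;; 2. A binding never changes; an inner scope can only shadow it.
(def scopes
  (let [x 20
        inner (let [x 21] x)]         ; a new scope in which x is 21
    [x inner]))                       ; the outer x is still 20
```

r1 is 7, r2 is 15, and scopes is [20 21]: the inner let saw 21, but the outer x never stopped being 20.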

The problem is, if you're used to imperative languages, then (1) doesn't seem very powerful, and (2) seems incredibly limiting! Neither of these could be further from the truth: let's dive right in and I'll show you!

Functions for president

First-class functions are astonishingly powerful, in ways that will surprise you if you aren't familiar with them. One of the most crucial, but also most vague, benefits of first-class functions is that they increase the amount of abstraction in your program, pushing more book-keeping details down to the computer and leaving only the meaningful actions for you to deal with. This also means that if you understand how two separate functions behave, then you "automatically" know what it means when they interact. If this is too meta for you, here are a couple concrete examples.

Searching a list for the object with the largest "size" property

// In C, as concisely as possible but maybe with bad style
struct data_object *largest_size(struct data_object *list) {
    struct data_object *result = list, *curr = list;
    while (curr && (curr = curr->next))
        if (curr->size > result->size)
            result = curr;
    return result;
}

;; In Clojure, using max-key
(defn largest-size [list]
  (apply max-key :size list))

;; In Clojure, by hand
(defn largest-size [list]
  (reduce (fn [a b]
            (if (> (:size a) (:size b))
              a
              b))
          list))

Look how much tedious book-keeping I had to do in C. Keep track of the current winner myself, check for nulls, dictate exactly how to find the next element of the linked list...holding the computer's hand every step of the way.

In contrast, Clojure has a built-in function max-key, which takes a function followed by the items to compare. It calls the function (here :size) on each item and returns the item that produced the largest result. (max-key wants the items as separate arguments rather than as one list, so I use apply to spread the list out into arguments; you can ignore that and imagine it's just a direct call to max-key.) You may feel that max-key is an unfair comparison: I picked a function Clojure has a built-in implementation for! Therefore, I've also included code for doing it by hand, with the more primitive reduce function. Reduce is a generalization of max-key: it walks the list keeping track of "something"; for every element it calls its function-argument with [tracked-thing, next-element], and stores the result as the new tracked thing. When it's done with the list, it gives you the thing it was tracking.
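If the "tracked thing" idea is still fuzzy, this sketch may help. The sample data and the names items, direct, applied, total-size, and steps are my own inventions for illustration; max-key, apply, reduce, and reductions are all built in.

```clojure
;; Hypothetical sample data: maps with a :size key.
(def items [{:name "a" :size 3} {:name "b" :size 9} {:name "c" :size 5}])

;; max-key takes the key function followed by the items as separate
;; arguments, so these two calls ask for the same thing.
(def direct  (max-key :size (items 0) (items 1) (items 2)))
(def applied (apply max-key :size items))

;; reduce tracks one value while walking the collection. Summing the
;; sizes shows the [tracked-thing, next-element] shape of its function.
(def total-size
  (reduce (fn [total item] (+ total (:size item))) 0 items))

;; reductions returns every intermediate tracked value, which makes
;; the walk visible step by step.
(def steps
  (reductions (fn [total item] (+ total (:size item))) 0 items))
```

Both max-key calls return the "b" map, total-size is 17, and steps is (0 3 12 17): the starting value, then the tracked total after each element.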

This time, instead of passing a built-in function like :size, I gave reduce a function that I defined right there: it takes two arguments, and returns the one with the larger :size value. This is a very basic case of being able to combine two simple functions, reduce and my anonymous "larger of two" function, into one more powerful function. If you know how reduce behaves and how the anonymous function behaves, it's very easy to understand how largest-size behaves. Compare that to the C, where you have to carefully examine all of the book-keeping overhead to make sure the function works right: I made at least three serious errors before getting a working version of largest_size, and I've made a living writing C. The Clojure code almost assembles itself, from two or three simple pieces into one cohesive whole that you don't need to debug.

A recurring nightmare

If you've written any software, you're probably at least familiar with the concept of recursion, although you may not use it very much, and may even regard it as an advanced or complex topic. Because Clojure data types are immutable, it leans heavily on recursion[1] to solve problems that imperative languages would instead tackle by modifying variables in some sort of loop. Basically, instead of working on a task and modifying your results in place, Clojure functions typically break an operation down into smaller and smaller tasks until each one is trivially easy, and then combine them all into the final goal. Let's reimplement largest-size from earlier, without using even reduce.

(defn largest-size [list]
  (let [item (first list)
        more (next list)]
    (if-not more
      item
      (let [sub-largest (largest-size more)]
        (if (> (:size item) (:size sub-largest))
          item
          sub-largest)))))

The "let" form is Clojure's basic variable-assignment mechanism: it creates a binding with the name given on the left, and the value given on the right. So we set "item" to the first element of the list, and "more" to the rest. If more is empty, the answer is pretty clear: item must be the largest element in the list, since there are no other elements! But if there are more elements, then we simply call largest-size on the remaining elements, and see whether item has a larger size than whatever was left.
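To see how those nested calls play out, here's the same function run on a tiny, made-up list, with the chain of calls written out as comments. The names items and result are just for this sketch.

```clojure
;; Same definition as above, repeated so this snippet runs on its own.
(defn largest-size [list]
  (let [item (first list)
        more (next list)]
    (if-not more
      item
      (let [sub-largest (largest-size more)]
        (if (> (:size item) (:size sub-largest))
          item
          sub-largest)))))

;; Hypothetical sample data.
(def items [{:size 3} {:size 9} {:size 5}])

;; How the calls unfold:
;;   (largest-size [{:size 3} {:size 9} {:size 5}])
;;     compares {:size 3} with (largest-size ({:size 9} {:size 5}))
;;       compares {:size 9} with (largest-size ({:size 5}))
;;         only one element left, so the answer is {:size 5}
;;       9 > 5, so the answer is {:size 9}
;;     3 > 9 is false, so the answer is {:size 9}
(def result (largest-size items))
```

Each call waits for the smaller call beneath it to finish, and result ends up as {:size 9}.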

And all of that was done without ever having to modify a variable once we created it: we use the magic of recursion and the stack to do all our bookkeeping[2]. It's very easy to live without mutating variables once you're used to thinking that way, and doing so has a lot of benefits. For example, largest-size doesn't depend on any global variables, and it doesn't affect the state of the program in any way other than its return value. If (largest-size x) returns 10 in one test run, then it will always return 10 when given that same x, regardless of any other circumstances. If you're curious, that's called "referential transparency".

[1] But usually you should let the built-in library functions do the recursion for you: they're well-tested and optimized, and you rarely need something so specialized that you have to do any recursion yourself.
[2] Note to observant readers: yes, this version of largest-size will run out of stack memory on sufficiently-large lists. It's not hard to fix that problem, but it involves introducing more language constructs than are really necessary.
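For the curious, here's one possible shape such a fix could take. Treat it as a sketch rather than the definitive version: largest-size-iter is a name I made up, and loop and recur are exactly the extra constructs the main text avoids introducing.

```clojure
;; A sketch only: largest-size-iter is a hypothetical name, and
;; loop/recur are constructs the main text deliberately leaves out.
(defn largest-size-iter [coll]
  (loop [best (first coll)
         more (next coll)]
    (if-not more
      best
      (let [item (first more)]
        ;; recur jumps back to loop with new bindings instead of
        ;; adding a stack frame, so long inputs are not a problem.
        (recur (if (> (:size item) (:size best)) item best)
               (next more))))))

;; Made-up input, long enough that the plain recursive version would
;; likely overflow the stack on a default JVM.
(def big (map (fn [n] {:size n}) (range 100000)))
(def biggest (largest-size-iter big))
```

Notice that nothing is mutated here either: each trip around the loop simply starts over with new values for best and more.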

But how do I actually write programs?

You may have noticed that I haven't gone into detail on how to write a program from start to finish, how to run it, or even really described what defn does (hint: it defines a function). And that's on purpose: my goal here is to motivate you, to give you a taste of how beautiful Clojure (and Lisp in general) can be, not to teach you how to write it yourself. There are lots of resources available for that purpose, but a lot of people don't research them because they're afraid of Lisp, or not aware of the awesome powers it grants you. So hopefully I've gotten you interested: check out some of the links, maybe join the mailing list, or download a copy and try it out yourself. In the meantime, watch this space, as I plan to write more hubs about various interesting Clojure-related things.

Comments

Ari Lamstein 5 years ago from San Francisco, CA

Thank you for this wonderful answer to my question.


nicomp 5 years ago from Ohio, USA

Wow. Very well done. Great stuff!


ezhang 5 years ago from Bay Area, CA

(define (square x) (* x x))

I have seen the above line of Scheme code enough times to last me a lifetime...


Simone Smith 5 years ago from San Francisco

Another great Hub - and it's a total two-for-one, since I had never heard of Lisp before either! Gosh... those parentheses...

While once again, most of this went over my head, I found your explanations to be truly top drawer. I really am motivated to learn more now!


jsa 5 years ago

I like your enthusiasm! It's nice to see this new wave of people promoting the beauty, power and elegance of functional programming, Lisp in general and now Clojure in particular.

A couple things. While :size is a function (as are all keywords), it takes an indexed collection (various kinds of maps, vectors, et al.) and will return the value of the entry whose "key" is the keyword :size.

max-key takes a "key" function, kfn, and an open set of items, applies kfn to these (and kfn must return a number) and returns the item with largest kfn value. So, your max-key version would be:

(defn largest-size [kfn l]
  (apply max-key kfn l))

Also, note that Clojure is a Lisp-1, so it is not typically a good idea to rebind RTE names like list as it will not have its original meaning while so bound. In a Lisp-n (like Common Lisp) this is not an issue.

As you note, your last largest-size will stack overflow on larger collections (that's actually another thing worth noting: these largest-size functions will work on any seqable collection: various maps, vectors, sets, and lists!). The reason for the stack overflow is that the JVM has no tail call optimization, so Clojure (unlike Scheme or most any good CL implementation) can't do that automagically. That's where the special operator recur comes in (as you hint at). Here's the preferred canonical version of something like this:

(defn largest-size [kfn l item]
  (if (empty? l)
    item
    (if (nil? item)
      (recur kfn (next l) (first l))
      (let [f (first l)]
        (recur kfn
               (next l)
               (if (> (kfn f) (kfn item)) f item))))))

(largest-size identity '(1 3 8 5 77 33 3 99 22) nil)
=> 99

(largest-size identity [1 3 8 5 77 33 3 99 22] nil)
=> 99

(largest-size val {:one 1 :two 3 :three 8 :four 5 :five 77 :six 33 :seven 3 :eight 99 :nine 22} nil)
=> [:eight 99]


amalloy 5 years ago from Los Angeles Author

jsa: code doesn't format very well in comments here. I've gisted it to https://gist.github.com/902251 to make it more readable. I realize I could have generalized to key functions other than :size, but it's not really necessary for a comparison, as doing so in C is rather complicated; it also makes largest-size fairly uninteresting: it's become just (partial apply max-key). Thanks, though, for reminding me that max-key needs an apply - I'll work that in as soon as I find a good explanation for what it means.

I don't actually care for your implementation of largest-size "by hand"; I intentionally left the example simple, introducing few concepts. If I were going to do it "right", I would use cond instead of nested ifs, and probably let a (seq l) at the beginning, as well as make an effort to cache the results of (kfn item) as I iterate.

You make a fair point about shadowing globals like "list". Personally I haven't had trouble with this, and I prefer to use descriptive binding-names, but "coll" is probably a better choice (especially for readers of this Hub, who might not know list is a built-in).


jsa 5 years ago

Yes, I see the code not formatting at all.

I think the key point about :size is that it only works as you intend if your items are collections (including structures and/or defrecords) using keyword fields. If they are vectors or numbers or hash entries or whatnot, then this won't work. It may also give readers the idea that :size is some magical function that knows the size of anything, when actually it knows the size of nothing.

Using cond in the by hand largest-size seems fine. You don't need a (seq l) as next and first know how to implicitly "seq" any collection. And (if (empty? l) ...) is rather more descriptive than (if-not (seq l) ...) or some such. I'm not convinced caching would buy you anything in this sort of scenario even in real world examples. The kfn functions are either identity or simple indexers like :size. It could well make sense for more complex kfns to memoize _them_: (def mykfn (memoize mykfn))
