Posts Tagged: clojure


13
Jan 10

The Point of WebDSL

 Jay asks in the comments:

I don’t mean to be mean, because I really like your blog and I read it all the time… but could you just explain to me the point of WebDSL? I honestly don’t mean it in any kind of negative way, I’m just wondering why you’re dedicating your valuable time to building something like that when RoR, php, et al. already exist.

Just from a cursory look, it seems like the syntax is halfway between Visual Basic (yikes!) and C. Why do you prefer that type of syntax to the Lisp-style syntax of Clojure/Compojure?

Good questions!

WebDSL was started about 3 years ago by Eelco Visser as an exercise in the design and implementation of domain-specific languages. His focus up to then had been on parsing and meta-programming, but it was time to focus on a new domain: the web. When he started, Eelco had never built a web application. He investigated a number of Java frameworks as a basis and eventually decided to use JBoss Seam as a target.

The goal of WebDSL is to get rid of the boilerplate code you would have to write when building a Java application and raise the level of abstraction. The vision was to have simple, domain-specific sub-languages that allow a programmer to specify a certain aspect of the application and the WebDSL compiler would generate all the implementation code for that aspect. Initially there were three sub-languages: a data modeling language, a user interface language and a simple action language to specify logic. As others joined the project (including myself), we added more sub-languages and more features: access control, workflow, data validation, ajax support and more recently search. Work is also being done in the area of data evolution (i.e. migrating databases as you change your data model).

Although WebDSL is mainly a research project, we are increasingly working to make it usable by anybody with some programming experience. We currently have a few websites in production built using WebDSL (researchr, tweetview, webdsl.org and pil-lang.org) and the manual is growing.

The idea of building abstractions for the web itself is hardly novel. As Jay mentions, there are many web frameworks that already do this: Rails, Django and so on. There are a few things that we do differently in WebDSL, compared to existing frameworks:

  • We create our own custom syntax. Whereas Rails and Django are struggling to express everything using Ruby and Python, respectively, we designed our own clean syntax. Whether you like this syntax is a matter of taste. Personally I like it, although, indeed, it is inconsistent here and there.
  • WebDSL is a statically typed and checked language. I wrote a number of posts about this issue and its advantages.
  • WebDSL compiles to low-level Java code, which has good performance characteristics. The code we generate does not rely on run-time meta-programming and reflection features of the language which are typically rather slow.
  • WebDSL is platform independent. We generate Java code now, but it can be ported relatively easily to .NET, Python or PHP. We have prototypes of this utilizing the PIL language that I developed.
  • Within the next few months WebDSL will have excellent IDE support for Eclipse, built using Spoofax/IMP. My colleagues are working on this. It will feature syntax highlighting, as-you-type error reporting, code completion and eventually refactoring support.

A drawback that WebDSL has today is that it’s not trivial to install, but with the IDE plug-in and Java-version that should become a lot easier soon.

So, why am I putting so much effort into this? As you may be aware I’m doing a Ph.D. in the area of domain-specific languages, so we investigate how to best build them. WebDSL is a case study for us. Soon I intend to work on another DSL, in the domain of mobile applications (yes, a DSL to build iPhone and Android applications, people!). It’s interesting from a research perspective to see how to best do this.

In addition I regularly experiment with alternative ways of creating DSLs, like in Clojure and Scala. I’d like to see how far you can push these languages to build the DSLs you like. Clojure allows you to define your own custom syntax, in some sense, as long as you adhere to the rule of the parentheses. Static error checking is much more problematic. Clojure is also rather tied to one platform. Sure, there’s also ClojureCLR, but writing programs that work on both the CLR and the JVM is, well, challenging. IDE support for a Clojure DSL is also non-trivial.

On the other hand, the flexibility of a DSL like WebDSL also has its downsides. Basically you can design the language any way you like, both its syntax and semantics, but you don’t get much for free. An internal DSL built on Clojure or Scala, by contrast, gets a lot for free: some error reporting, support for namespaces (something we still don’t have in WebDSL), a type system (in Scala’s case), an escape to a powerful language (Clojure or Scala) and a rich set of libraries you can use. In WebDSL we have to design all of this from scratch.

So in the end both approaches have their advantages and disadvantages. I intend to continue to explore them both.


5
Jan 10

On Language Design: Magic Variables in Compojure

The Perl language is riddled with special variables. Consider the following example:

open(FILE, "bla.txt");
while(<FILE>) { print; }

In case you don’t speak Perl, this is equivalent to:

open(FILE, "bla.txt");
while(<FILE>) { print $_; }

Still unclear? Alright, once more:

open(FILE, "bla.txt");
while($line = <FILE>) { print $line; }

Perl was developed by linguist Larry Wall, who likes to put all kinds of natural language things into Perl. $_ refers to the subject of the sentence; it’s "it", as it were. print $_, or simply print, means "print it". How do we know this variable exists? We don’t, unless we read the manual. Although they might be useful, magic variables like these are generally bad practice; they are one of many reasons that Perl code is often called write-only code.

PHP also has some magic variables whose behavior does not adhere to the usual rules; they are called superglobals. In particular there are $_POST and $_GET, which do not adhere to the scoping rules defined in PHP.

I don’t know about you, but I like to see my variables declared. I like to see where they come from, are they local variables, parameters, or imported from a module that I can lookup? Magic variables are confusing, because they have different behavior than other variables in a language, they do not adhere to the usual rules.

As part of Adia, I played around with Compojure, a Clojure web framework. Generally, Compojure is a well-designed framework, but I came across one instance where it uses magic variables: in the defroutes macro. An example:

(defroutes webservice
  (GET "/"
       (str "Hello, "
         (:name (:query-params request)))))

This defines a mapping from the "/" URI to code to be executed when that URI is requested. As you can see, I use the variable request in this example. Upon inspection, it is not clear where it comes from. At first, you may assume it’s a global that is dynamically bound (Clojure has dynamically scoped variables). But some refactoring of the code shows that this is not the case:

(defn say-hello []
  (str "Hello, " (:name (:query-params request))))

(defroutes webservice
  (GET "/"
       (say-hello)))

This results in a "symbol cannot be resolved" error for the request variable. So clearly, request is not a dynamically bound variable either. So what is it, and where does it come from? I had to dig into the Compojure source code to find out. It is defined in the with-request-bindings macro, which defines some magic symbols (request, params, cookies, session and flash) that are only accessible from inside the macro’s body:

(defmacro with-request-bindings
  "Add shortcut bindings for the keys in a request map."
  [request & body]
  `(let [~'request ~request
         ~'params  (:params  ~'request)
         ~'cookies (:cookies ~'request)
         ~'session (:session ~'request)
         ~'flash   (:flash   ~'request)]
     ~@body))
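
To see why the refactored say-hello could not resolve request, here is a stripped-down sketch of the same technique. The macro name with-magic-request and both greet functions are made up for illustration and are not part of Compojure. The only point is that ~'request introduces an ordinary lexical let binding, which is visible inside the macro body and nowhere else:

```clojure
;; Hypothetical macro for illustration only, not Compojure code.
;; ~'request injects the plain symbol `request` into the expansion,
;; so the body sees a local that it never declared itself.
(defmacro with-magic-request
  [request-map & body]
  `(let [~'request ~request-map]
     ~@body))

;; Works: `request` is used lexically inside the macro body.
(defn greet-inline [req]
  (with-magic-request req
    (str "Hello, " (:name (:query-params request)))))

;; The explicit alternative: the reader can see where `request` comes from.
(defn greet-explicit [request]
  (str "Hello, " (:name (:query-params request))))

(greet-inline   {:query-params {:name "Jay"}})
(greet-explicit {:query-params {:name "Jay"}})
```

Both calls produce the same string. The difference is that a function defined outside the with-magic-request body, like the refactored say-hello, has no way to reach the injected local.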

I find this confusing and therefore bad practice. So, what is the alternative? It turns out there’s no perfect solution here, it’s all about trade-offs.

The problem magic variables try to solve is enabling quick access to data that is otherwise not accessible in a concise manner. In this case it could have been solved by declaring the request and other variables somewhere, for instance as parameters to defroutes:

(defroutes webservice [request params cookies session flash]
  ...)

A reason not to go with this solution is likely to be its verbosity: nobody likes to write functions with a large number of arguments. A second alternative is only passing the request parameter and letting the user pull the others out of that map every time they need them. A reason not to go with that solution is user inconvenience: users likely want quick access to all five of these values and don’t want to look them up in the request map every time. A third alternative is using dynamically bound global variables. In Clojure these can be bound to a new value only for a certain thread within a certain code execution path:

(def request nil) ; root binding is nil

(defn print-request []
  (println request))

...
(binding [request (build-request ...)] ; rebind
  (print-request) ; prints request
  ...)

I use them in Adia for access to the request, post and get parameters. For this I declare 3 variables in the adia.web module:

(def *request* nil)
(def *form* nil)
(def *query* nil)

By the time they are used within a webfn, these variables have been dynamically bound to their respective values:

(defwebfn say-hello []
  (str "Hello, " (:name *query*)))

My thinking is that by using the *name* convention, it is at least clear these are not normal locally defined variables; they are different. Secondly, they are only available if you import the adia.web module and are part of that module’s interface, i.e. you can look up their declaration. What is still not clear is where their value comes from, but I suppose the user will have to abstract from that. The alternative to this approach would be to pass these variables as parameters to every webfn, but I decided against this because it would make writing webfn definitions too verbose. It was also not very clear what the syntax should have been: (defwebfn fn-name request form query [arg1 type arg2 type] ...)? (defwebfn fn-name [request form query] [arg1 type arg2] ...)? Neither of these options seemed very elegant to me, which is why I decided to be pragmatic and go with dynamically scoped globals.

I think this is an acceptable compromise. While the three globals are essentially still input parameters, they do not have to be passed around all the time. However, if you invoke a webfn outside a path where values are bound to the *request*, *form* and *query* variables, you are likely to receive NullPointerExceptions.
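
The behavior inside and outside a binding can be sketched as follows. This assumes a newer Clojure (1.3 or later), where a var has to be marked ^:dynamic before it can be rebound; the greeting function is a stand-in for a webfn and is not Adia’s actual code:

```clojure
;; Illustrative stand-in for one of the adia.web variables.
;; ^:dynamic is an assumption about the Clojure version in use.
(def ^:dynamic *query* nil)

(defn greeting []
  (str "Hello, " (:name *query*)))

;; Inside a binding, the value is visible to every function called
;; on this thread, without being passed along as an argument.
(def bound-result
  (binding [*query* {:name "zef"}]
    (greeting)))

;; Outside any binding, the root value nil is used.
(def unbound-result (greeting))
```

Here the unbound call quietly yields "Hello, " because a keyword lookup on nil returns nil. Calling a Java method on that nil value is what would raise a NullPointerException.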

As a general rule: to keep programs readable and comprehensible, it should be easy to see where your variables and symbols come from. Are they local variables, parameters, globals? Magic variables should be avoided if possible.


4
Jan 10

On Language Design: My Problem With ClojureQL

Every programming language comes with a certain syntax, a certain feel for what feels like native use of that syntax, and the semantics of the syntax. Escapes and mixing with a completely different feeling language are generally not a good idea. My favorite example of this is Objective-C, which is a really strange mixture of C and Smalltalk. C as you will know is a curly brace language, it has a way of doing things. It’s a low-level system programming language. Smalltalk is a high-level programming language that feels very different and looks very different.

Objective-C is C with some Smalltalk bolted on to it, which gives it a strange feel:

MyObject* o = [[MyObject alloc] initWithNum: 20 andString: @"Hello world!"];

In C, a function call has the syntax function_name(arg1, arg2). However, when moving into Objective-C object land, a method call looks like: [object aMethodCall: arg1 andArg: arg2]. Alien, if you ask me.

In Lisp land, an example of this is the Common Lisp loop macro:

(loop for x in '(a b c d e)
      for y from 1
      if (> y 1) do
        (format t ", ~A" x)
      else do
        (format t "~A" x))

If you’ve ever written any Lisp code, you’ll see that although this is very readable and concise, like Objective-C, by the way, it feels completely weird in a Lisp-style language.

I have a similar problem with ClojureQL, a query language for Clojure. Queries expressed in ClojureQL change the meaning of Clojure in a way that I feel is bad language design because it breaks assumptions that hold true for the rest of Clojure.

Consider the following snippet of code:

(let [first-name "zef"]
   ...
   (= first-name "zef") ...)

This piece of code binds the value "zef" to the symbol first-name. The programmer’s expectation is that when the first-name symbol is used anywhere within the let, its value will be "zef", unless it is rebound to something else with another let. However, this assumption breaks when using ClojureQL:

(let [first-name "zef"]
  (query users * (= fname first-name)))

This is legal in ClojureQL, although it is a bit unclear where fname would come from. It comes from the * there, and we can make this more explicit:

(let [first-name "zef"]
  (query users [fname lname] (= fname first-name)))

This is perfectly valid ClojureQL code, except it doesn’t do what you would expect it to do. It does not find all users with first name "zef", no, it will throw an SQL exception saying that the table users does not have a field "first-name". Huh?

It turns out that when we use the query macro, we step into a different world, a world where we have to let our previous assumptions go. When first-name is used, it no longer refers to the value bound to it before; instead it’s simply a name referring to a column in a table. It is still possible to get back to "normal" Clojure semantics by escaping into the Clojure world with a ~ prefix:

(let [first-name "zef"]
  (query users [fname lname] (= fname ~first-name)))

I’m not very fond of this type of language design. It would probably be better if a ~ were not necessary. In that case you could read the query as a kind of for loop where each result row is destructured and bound to [fname lname], which are then used in the body expression. Still, intuitively, in this interpretation the names fname and lname should not refer to column names in the users table, but should only be used for binding in the code, referring to the first and second column in the result set. Still confusing.
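
How a macro ends up treating a bare symbol as a column name and a ~-prefixed one as a Clojure value can be illustrated with a toy macro. This is not ClojureQL’s implementation: where-clause and unquoted? are hypothetical names, and the string it builds is a simplified stand-in for SQL generation (pr-str is not proper SQL quoting). It relies on the fact that the reader turns ~x into the list (clojure.core/unquote x), which the macro can inspect before anything is evaluated:

```clojure
;; True when the reader produced (clojure.core/unquote x) from ~x.
(defn unquoted?
  [form]
  (and (seq? form)
       (= 'clojure.core/unquote (first form))))

;; Toy macro: bare symbols become column names, ~forms are evaluated
;; as regular Clojure at runtime, other literals are printed as-is.
(defmacro where-clause
  [[op lhs rhs]]
  (let [render (fn [term]
                 (cond
                   (unquoted? term) `(pr-str ~(second term))
                   (symbol? term)   (name term)
                   :else            (pr-str term)))]
    `(str ~(render lhs) " " ~(name op) " " ~(render rhs))))

(let [first-name "zef"]
  [(where-clause (= fname first-name))    ; symbol taken as a column name
   (where-clause (= fname ~first-name))]) ; value of the local is used
```

In the first call the local first-name is never looked at, which mirrors the surprising SQL error described above.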

A syntax that is more Clojuresque, if you will, albeit more verbose would be:

(let [first-name "zef"]
  (query [u users] (= (:fname u) first-name)))

Intuitively, the query iterates over all users, binding each user to u and filtering on the value of the :fname key of each user entry. I’m still not comfortable with the use of users there, which seems to be some type of magic symbol, but I suppose that could be fixed too, maybe by having a (deftable users) statement somewhere else in the code, or by replacing it with (table :users), which, again, would make it slightly more verbose:

(let [first-name "zef"]
  (query [u (table :users)] (= (:fname u) first-name)))
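
To show that this reading keeps normal Clojure scoping intact, here is a minimal sketch of such a query macro working over an in-memory vector of maps rather than a database. The users data and the macro itself are hypothetical and only illustrate the binding behavior:

```clojure
;; Hypothetical in-memory stand-in for the users table.
(def users
  [{:fname "zef" :lname "hemel"}
   {:fname "ada" :lname "lovelace"}])

;; The row symbol is an explicit binding chosen by the caller, and the
;; predicate is ordinary Clojure code, so locals keep their usual meaning.
(defmacro query
  [[row table] pred]
  `(filter (fn [~row] ~pred) ~table))

(def zefs
  (let [first-name "zef"]
    (query [u users] (= (:fname u) first-name))))
```

Because first-name is simply evaluated inside the generated function, no escape character is needed. A real implementation would of course have to translate the predicate to SQL instead of filtering in memory.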

The point is that with great power comes great responsibility. The macro facilities of Lisps give you enormous power to create your own language extensions, which is great. It makes experimenting with languages very easy. However, it turns out that language design is very difficult. The language syntax is the user interface of your language. Whereas typical languages like Java and C# evolve very slowly and are designed by experienced language designers, in a Lisp anybody can do it, which can result in very confusing abstractions.

Abstractions like these have to be designed very, very carefully.


30
Dec 09

Adia: A Week With Clojure And MongoDB

I spent last week with my wife and her family in Poland (my wife is Polish). Her parents do not speak English, or any other language than Polish so communication is problematic beyond the thank you, you’re welcome, yes and no thank yous. My wife also spends a lot of time meeting with her friends, so I typically spend quite some time staring at the wall among people who are speaking a language I do not know well enough yet.

So, recently I’ve been coming up with little one-week programming projects for the weeks we spend there. These projects do not have to lead to anything in particular, but give me a good amount of time to take a deep dive into something I do not ordinarily have time for.

Last week my project was building a web application and framework with Clojure and MongoDB. I already had a plan for a web application in mind before, and already read up on Clojure (through the excellent Programming Clojure book) and played with it a little bit. I have also been interested in non-relational databases for quite some time and before have played with Google AppEngine’s DataStore and CouchDB. Similar to CouchDB, MongoDB is a document-oriented database, but it has more "conventional" querying methods than CouchDB, still not SQL though.

For a week on-and-off I went to work.

There were basically two components to this project: first getting to know MongoDB, and second learning to build a nice internal DSL in a functional language, specifically a Lisp (I never seriously learned a Lisp before, only supervised practical sessions for a class using Scheme). As a case study I came up with a nice not-too-complicated web application to build. I won’t go into that application in this post; it’s still secret (wooh!).

Alright. MongoDB is written in C++, has precompiled binaries available for most platforms and is very easy to install (simply extract and run). It is known for its good performance and is used by many companies, including Sourceforge and Disqus (the comment system I use on this website). After starting the server, the easiest way to start interacting with the system is the mongo JavaScript console.

Let me demonstrate it by simply showing you a sequence of commands (prefixed with >) and outputs:

> use people
switched to db people
> db.Person.save({name: "Zef Hemel", age: 26})
> db.Person.save({name: "Justyna Hemel", age: 26})
> db.Person.find()
{"_id" :  ObjectId( "4b3b51c24905573d69b9bd67")  , "name" : "Zef Hemel" , "age" : 26}
{"_id" :  ObjectId( "4b3b51d64905573d69b9bd68")  , "name" : "Justyna Hemel" , "age" : 26}

Note that the people database did not exist yet, and was in fact created when it was first used, similarly, the Person collection ("mongoose" ;-) for table) was automatically created when I saved a first record to it. Like other document databases, collections are schema-less. Now that we have some data, we can start querying:

> db.Person.find({name: "Zef Hemel"})
{"_id" :  ObjectId( "4b3b51c24905573d69b9bd67")  , "name" : "Zef Hemel" , "age" : 26}

So querying happens by passing the find function a map of keys and values that must match in a document. This notation gets slightly weird when looking for age ranges, for instance between 20 and 30:

> db.Person.find({age: {$gt: 20, $lt: 30}})
{"_id" :  ObjectId( "4b3b51c24905573d69b9bd67")  , "name" : "Zef Hemel" , "age" : 26}
{"_id" :  ObjectId( "4b3b51d64905573d69b9bd68")  , "name" : "Justyna Hemel" , "age" : 26}

So there, as value of the property age, we give it another map with special operators $gt and $lt, which stand for… greater than and less than! It’s a bit odd, but it’s easy to get used to (and screaming to be wrapped in some nicer syntax on a language level).
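
As a rough idea of what such a wrapper could look like on the Clojure side, the sketch below builds plain maps with the same shape as the query document above. The helper names between and where are made up for this example; they do not talk to a database and are not part of any MongoDB driver:

```clojure
;; Hypothetical helpers that only construct query maps.
(defn between
  "Range condition using MongoDB's $gt and $lt operators as keywords."
  [lo hi]
  {:$gt lo :$lt hi})

(defn where
  "Pairs field names with conditions, e.g. (where :age (between 20 30))."
  [& field-conditions]
  (apply hash-map field-conditions))

(where :age (between 20 30))
;; => {:age {:$gt 20, :$lt 30}}
```

Whether a given driver accepts keyword keys like :$gt directly is something to check in its documentation.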

Because no indexes have been defined on the collection yet, this lookup is still rather slow. However, indexes can easily be defined:

> db.Person.ensureIndex({name: 1})
true

This defines an index on the name property in ascending order (-1 would be descending). The index order only matters when putting indexes on multiple columns and sorting on some of them, or so the manual tells me.

So, MongoDB is fairly straightforward to play with: you can easily create new collections, add properties and so on. Intuitively it feels like a good match for a dynamic language such as Clojure.

Clojure is a dynamic functional language for the JVM. As mentioned, it is a Lisp. It comes with a nice interactive REPL to experiment with. The most interesting thing about Clojure from my point of view, as somebody doing research into domain-specific languages, is the ability to create domain-specific languages with it. As you will know, the syntax of Lisp is extremely simple and mostly defined by its functions and macros.

There is already a simple Clojure web framework called Compojure, which is basic but quite powerful. For my application I decided to build some layers on top of Compojure. First of all, Compojure only deals with the web side of things and not with database stuff. For MongoDB there is CongoMongo, a simple Clojure interface to MongoDB. This turned out to be far from complete, however, so I branched it and added a bunch of functions to it.

I decided to call my little framework Adia. It’s available for download from github; see the Readme there for installation instructions. There’s no documentation yet, but there is a simple wiki application in the examples directory.

Although MongoDB does not enforce any schema, it seemed like a useful thing to define a simple entity language anyway, if not for the database itself, for me, as documentation and possibly for automatic form generation and data validation, later. This is what it looks like:

(defent Page
  [:title    :string {:unique true}]
  [:author   :string]
  [:text     :text])

As can be guessed, this defines a Page entity with three properties: title, author and text.

As you will be aware, I’m a developer of WebDSL, a DSL for building web applications, and came to appreciate its simple page and template abstractions. Although implementing actions in a WebDSL fashion would be against the functional character of the language, I did add a page abstraction, except I call them webfns, defined with defwebfn (similar to defn, to define a Clojure function): 

(defwebfn say-hello [nam str]
  (str "Hello, " nam))

This defines a web function with one parameter: nam, which is coerced to a string value through the str function. Similarly, every entity definition also defines a function with the same name that can coerce the URL representation (identifier) and retrieve its value from the database, e.g.:

(Page "31108a33ee093a4bdd7b5900")

Retrieves the page object with ID "31108a33ee093a4bdd7b5900". This can be taken advantage of in web functions as follows:

(defwebfn show-title [p Page]
  (str "Title: " (:title p)))

These webfns are available through a URI based on their name and the last part of the namespace they were defined in. For instance, when a webfn show is defined in namespace myapp.user, it will be available through "/user/show". Any namespace ending with .index, or webfn named index, are bound to the root, e.g. webfn index in myapp.user results in "/user" and webfn index in myapp.index is bound to "/".
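
The mapping rule can be written down as a small function. This is a sketch of the rule as described here, not Adia’s actual routing code; webfn-uri is a hypothetical name and takes the namespace and webfn names as strings:

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper: derives the URI from the last namespace segment
;; and the webfn name, dropping any segment that is called "index".
(defn webfn-uri
  [namespace-name fn-name]
  (let [ns-part  (last (string/split namespace-name #"\."))
        segments (remove #{"index"} [ns-part fn-name])]
    (str "/" (string/join "/" segments))))

(webfn-uri "myapp.user"  "show")   ; "/user/show"
(webfn-uri "myapp.user"  "index")  ; "/user"
(webfn-uri "myapp.index" "index")  ; "/"
```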

Templates are, of course, simply functions with parameters. Compojure comes with a rather nice alternative HTML representation using Clojure vectors:

[:a {:href "/"} "Link text"]

Which can be used to define a main template:

(defn main-layout [title & body]
  (html
    (doctype :html4)
    [:html
     [:head
      [:title title]]
     [:body
      [:h1 "Header"]
      [:hr]
      body
      [:hr]
      "&copy; Zef Hemel"]]))

Of course, vectors like these can easily be combined with regular function calls, to build pages. Here is an example of an index page with title "Wiki home", displaying a list of current pages and a form to add a new one:

(defwebfn index []
  (main-layout
    "Wiki home"
    [:h1 "All wiki pages"]
    [:ul
     (for [p (query model/Page)]
      [:li (navigate [show p] (:title p))])]
    (form [handle-add]
          [:h1 "Add a page"]
          [:div "Title: " (input-string :title)]
          [:div (input-text :text)]
          (submit-button "Add page"))))

This will be rendered roughly as follows: [screenshot of the rendered index page not preserved]

The actual adding happens in the handle-add function:

(defwebfn handle-add []
  (let [p (databind
             (model/Page
               :author (get-session :username)) 
             *form* [:title :text])]
    (redirect [show (persist! p)])))

Additionally, I can define an access control rule for handle-add:

(defac handle-add (get-session :username))

Which says that a page can only be added if the session key :username has a value, i.e. the user is logged in, so handle-add is only available to logged-in users. In addition, the form on the index page will be hidden when the user is not logged in, similar to WebDSL’s navigates.

Lessons learned

MongoDB is a nice and simple NoSQL database system and when you’re in the right document-vs-row no-join-required mindset, it’s easy to work with. I found that it also works well with Clojure. Initially I played with Clojure and MySQL a bit, which also works fine. Still, I found it a bit slower to iterate because you keep creating and dropping tables and adding, modifying and removing columns in your table as you’re developing the application. Database migration is a pain. In MongoDB this is less of a problem, I have found.

Clojure is quite a nice, elegant, simple language and macros are a very powerful way of defining new "syntax" for your own little domain-specific languages. The syntax of the language is the user interface to the developer and is therefore important. Not everybody is a fan of the Lisp syntax and it definitely takes some getting used to. Lisp programmers say you should see through the parentheses and instead look at indentation to extract meaning from programs. That works, but bites you in the ass when you do some s-expression manipulation, misplace a parenthesis and do not let your editor (I used vim with vimclojure) re-indent your code. This happened to me a few times and on occasion took me quite some time to debug.

The Clojure syntax is concise, maybe too concise. Sometimes I find it hard to see, for example, which piece of code is the true branch and which is the false branch of an if-statement; an else keyword can be useful to make code easier to read.

Homoiconicity is cool. The defwebfn macro, in addition to defining a function and doing some other stuff, also keeps the original list structure that defines the web function in memory (the source code, as it were). The access control module takes advantage of this by taking this code, wrapping an if statement around it and recompiling it at runtime. Model transformations at runtime! Potentially more advanced program transformations can happen in this way.
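
The idea of keeping the source around and recompiling it with a check wrapped around it can be sketched in a few lines. This is not the code from Adia’s access control module: say-hello-source, wrap-with-check and the "Access denied" result are all made up for illustration:

```clojure
;; The "source code" of a function, kept as an ordinary list.
(def say-hello-source
  '(fn [user] (str "Hello, " user)))

;; Takes such a list apart and builds a new function form whose
;; body only runs when check-form evaluates to a truthy value.
(defn wrap-with-check
  [[_ params & body] check-form]
  `(fn ~params
     (if ~check-form
       (do ~@body)
       "Access denied")))

;; Compile the transformed form at runtime.
(def guarded-say-hello
  (eval (wrap-with-check say-hello-source '(= user "zef"))))

(guarded-say-hello "zef")  ; "Hello, zef"
(guarded-say-hello "jay")  ; "Access denied"
```

Since the function definition is just a list, the transformation is plain list manipulation followed by a call to eval.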

Are Clojure DSLs as flexible as external DSLs? Not really. First off, you’re locked into the Lisp syntax. Second, checking is rather limited and error messages are not always extremely helpful, similar to DSLs in other languages, such as Ruby. Clojure does do compile-time symbol lookups, which is helpful, but beyond that a lot of errors are detected at runtime. Macros are evaluated at compile time and can therefore check a few things then, but this checking is limited to the "AST" representation of their arguments. Clojure is a dynamically typed language, so checking in general is problematic. And third, you’re bound to the JVM (or, with more effort, the CLR); you can’t target multiple platforms.

Meta-programming in Clojure is cleaner than in many other languages such as Ruby and Python. Most of it happens using macros which are fairly clean, if used well.

As I suggested before, libraries and internal DSLs like these are great ways of prototyping abstractions. They’re easy and quick to implement. Access control was added in about 20-30 lines of extra code, OpenID authentication took about 30 (through use of JOpenID). This makes Clojure a great language to try out and play with abstractions. I find that in Stratego, which we use to implement WebDSL, this is still problematic due to the fact that (1) it is a separate language, so you have to make mental jumps between Stratego and the target language, e.g. Java, and (2) long compilation times of the ever growing WebDSL compiler.

As mentioned, if you’re interested in Adia, you can download it and play with it yourself. Documentation is essentially non-existent as of yet, but the wiki example demonstrates its basic features.


19
Nov 09

Building Clojure Projects with Leiningen

Everybody who has ever used Java has struggled with Java’s classpath at some point during their career. You have to put all the right paths in there, the right .jar files and so on, both when compiling and running your Java project. To make this somewhat simpler you typically end up doing it either in an IDE, or using a tool like Ant or Maven. These are pretty heavyweight tools, and the latter two involve writing XML, which hardly anybody does for fun anymore.

Leiningen is a simple build tool for Clojure, based on Maven (I’m pretty sure). It offers a simple, Clojuresque way of constructing build files for your Clojure projects (which run on the JVM).

To install Leiningen you only have to download one file and put it in some directory that’s on your PATH:

cd ~/bin
wget http://github.com/technomancy/leiningen/raw/stable/bin/lein
chmod +x lein

You then do a self-install:

lein self-install

This will download a number of jar files, including Clojure itself, so you do not even have to have Clojure installed at this point.

To make a new project, create a directory for it, e.g. helloworld:

mkdir helloworld
mkdir helloworld/src

In the source directory you put your source files, for instance a helloworld/src/helloworld.clj:

(ns helloworld
  (:gen-class))
 
(defn -main [& args]
  (println "Hello world!"))

Then, in the helloworld/ directory, create a project.clj file:

(defproject helloworld "0.1"
    :dependencies [[org.clojure/clojure
                      "1.1.0-master-SNAPSHOT"]
                   [org.clojure/clojure-contrib
                      "1.0-SNAPSHOT"]]
    :main helloworld)

The :main there defines the namespace containing your -main function (analogous to the typical public static void main(...)), if any. Then, from the helloworld directory you run Leiningen:

$ lein compile

     [copy] Copying 2 files to /.../helloworld/lib

Compiling helloworld

And subsequently we can build a .jar for it, or even an uberjar, which will create a big jar file for easy distribution, also containing all of its dependencies (including Clojure itself): 

$ lein uberjar
Unpacking clojure-1.1.0-alpha-20091113.120145-2.jar
Unpacking clojure-contrib-1.0-20091114.050149-13.jar
Compiling helloworld
      [jar] Building jar: helloworld.jar
$ java -jar helloworld.jar 
Hello world!

Leiningen has some other tasks as well:

  • lein deps, installs dependencies in lib/
  • lein test [PRED], runs the project’s tests, optionally filtered on PRED
  • lein compile, ahead-of-time compiles into classes/
  • lein repl, launches a REPL with the project classpath configured
  • lein clean, removes all build artifacts
  • lein jar, creates a jar of the project
  • lein uberjar, creates a standalone jar that contains all dependencies
  • lein pom, outputs a pom.xml file for interop with Maven
  • lein install, installs in local repo (currently requires mvn)
  • lein help [TASK], shows a list of tasks or help for a given TASK

Enjoy!

 


11
Nov 09

Interesting Clojure Projects

Some pointers for new explorers of Clojure

IDE-related:

DSLs/libraries built for Clojure:

Other:

 


6
Nov 09

Brief Introduction to Clojure

Clojure (pronounced "Closure") is a relatively new programming language which runs on the Java Virtual Machine. This is roughly what it looks like:

defn say-hello-to [name]
  println "Hello," name

Neat, huh? Well, ok, I was lying a little bit in order not to scare you, because… psst, Clojure is a Lisp!

You’re still here? Alright. I suppose I can show you the real program, which is what I just showed with a few parentheses added (not that many, in fact):

(defn say-hello-to [name]
  (println "Hello," name))

Being a LISP has its advantages and disadvantages. A disadvantage is that people get confused by the parentheses, or at least by their placement, because, let’s face it, in a C-style language, this definition would look as follows:

void say_hello_to(const char* name) {
  printf("Hello, %s\n", name);
}

Now, if you’re willing to extend your definition of parentheses to include other types of brackets and braces, you will see that the number is actually the same. But the truth is, it turns out that parentheses in unusual places scare people.

A nice but potentially confusing feature of LISPy languages is that they are homomoronic… I mean, homoiconic, which, Wikipedia says, means that the primary representation of program code is the same type of list structure that is also used for the main data structures. Great. What does that really mean?

Let’s see what happens when you type in a Clojure expression, for instance the following expression to add two numbers:

(+ 2 1)

The first thing that is invoked is the reader, which parses the expression and turns it into, in this case, a list containing 3 items: +, 2 and 1. Because we asked Clojure to evaluate the expression for us, this list is going to be evaluated by first evaluating all of its elements. It turns out that ‘+’ is in fact a defined function, so we get back a function object; 2 evaluates to itself, as does 1. Evaluating a list in a Lisp means calling its first element as a function with the rest of the elements as arguments. So in this case the + function will be invoked with 2 and 1 as arguments, resulting, not surprisingly, in 3. So lists are Clojure’s representation of function calls, except that the function name is put inside the parentheses rather than before them.

Note that this makes Clojure, and Lisps in general, languages with an extremely simple and concise syntax. All there is are literals (such as characters, strings, symbols and numbers) and lists (although Clojure also adds special syntax for sets, maps and vectors). There is no special syntax for operators or keywords.

Fantastic. So, in a LISP all function calls (including operator calls) are written in prefix notation, i.e. instead of writing 2 + 1 you write (+ 2 1). You can also compose them, e.g. (+ 1 (* 2 3)) results in 7. This may seem a little confusing to read, but consider its "regular" infix counterpart, 1 + 2 * 3, where you always have to take the language’s precedence rules into account: is it (1 + 2) * 3, or 1 + (2 * 3)? Using the Lisp notation it’s always clear. Now let’s look at the following expression:

'(+ 2 1)

Note the quote there. What does that do? Well, it quotes the expression, telling the interpreter: "do not interpret this, but return it literally". So this expression results not in the value 3, but in a list containing three items: the + symbol, 2 and 1. We can now manipulate this list as we wish:

(second '(+ 2 1))

results in the value 2, because the second function returns the second item of the list that is passed to it. What we can also do is eval it:

(eval '(+ 2 1))

What eval does is *drumroll* evaluate the data structure that is passed to it as if it were an expression, resulting in… 3! 
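The two steps described above, reading and then evaluating, can also be run separately at the REPL. This is only a small sketch; the name form is made up for the illustration, while read-string, list?, symbol?, fn? and eval are standard clojure.core functions.

```clojure
;; read-string runs only the reader: text goes in, a data structure comes out.
(def form (read-string "(+ 2 1)"))

(list? form)               ; true: the reader produced a plain list
(count form)               ; 3
(symbol? (first form))     ; true: + is still just a symbol at this point
(fn? (eval (first form)))  ; true: evaluating the symbol yields the function
(eval form)                ; 3
```

Nothing is computed until eval is called; up to that point the program is an ordinary list that can be inspected like any other.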

We can also manipulate lists a bit. Let’s say we want to get rid of the + and replace it with -:

(cons '- (rest '(+ 2 1)))

The cons function builds a new list with its first argument as the head and its second argument as the tail (the rest of the list). The quote before - means "don’t resolve it, just give me the literal symbol -". The rest function returns the tail of the list that is passed to it (so all items except the first one). The result of this expression therefore is:

'(- 2 1)

We can then eval this expression, resulting in 1:

(eval (cons '- (rest '(+ 2 1))))

As you will have noticed, data and programs are represented using the same structure in Lisp languages: lists. Therefore, meta-programming is very natural and easy. Meta-programs are programs that manipulate (other) programs. And who doesn’t want to write meta programs?

I do. It’s my job. Plus, it’s awesome.
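To give a slightly bigger taste of a meta-program, here is a sketch that does the same operator swap as before, but at any nesting depth. The function name swap-operator and the sample program are made up for this illustration; postwalk-replace comes from the clojure.walk namespace that ships with Clojure.

```clojure
(require '[clojure.walk :as walk])

(defn swap-operator
  "Returns form with every occurrence of the symbol old replaced by
  the symbol new, at any nesting depth."
  [form old new]
  (walk/postwalk-replace {old new} form))

(def program '(+ 2 (+ 3 4)))
(def rewritten (swap-operator program '+ '*))

(eval program)    ; 9
rewritten         ; (* 2 (* 3 4))
(eval rewritten)  ; 24
```

The program being rewritten is just a quoted list, so the "compiler pass" is an ordinary function over data.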

Lisp originally came out of the artificial intelligence world, where the belief was that a homoiconic language could at some point transform and improve itself: programs that rewrite themselves and, at some point, make themselves smarter. Although it’s a cool idea, it never really happened. Shocker.

However, a great benefit Lisps get from their homoiconic nature is their language extension features. Have you ever wished that your favorite programming language had feature X? That you could use LINQ-style queries in Java, or PHP, for instance? The usual way to "extend" your language with features like this is to develop APIs that look somewhat like queries, but this will only get you so far. For instance, here’s an example of LIQUidFORM in Java:

Person p = LiquidForm.use(Person.class, "p");
List people = em.createQuery(
    select(p).from(Person.class).as(p).where(eq(p.getSurname(), "Smith")).toString())
    .getResultList();

It’s nicer than simply typing in "SELECT p.* FROM Person AS p WHERE p.surname = 'Smith'" as a string, I guess, but it’s kind of a hack. Here’s the same query written using ClojureQL, a similar framework for Clojure:

(def people-query (query * employees (= surname "Smith")))

Now, you may argue it’s not pretty (because you find Lisp syntax ugly), but it clearly fits in well with the rest of the language and it’s way less verbose.
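The mechanism behind this kind of extension in Clojure is the macro: a function that receives code as lists and returns new code. As a minimal sketch, here is a made-up unless construct (the name and behaviour are chosen purely for illustration, it is not part of Clojure or ClojureQL):

```clojure
(defmacro unless
  "Evaluates body only when test is false; otherwise returns nil."
  [test & body]
  `(if ~test nil (do ~@body)))

(unless (> 1 2) :ran)  ; :ran
(unless (< 1 2) :ran)  ; nil

;; The macro is just a list-to-list transformation, which we can inspect:
(macroexpand-1 '(unless (> 1 2) :ran))  ; (if (> 1 2) nil (do :ran))
```

Once defined, unless is used exactly like a built-in form, which is why extensions like this blend in with the rest of the language.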

Similarly, when playing with program transformation in Clojure, I needed a pattern matching feature, which Clojure itself only offers in a very limited form (destructuring). So I built a pattern-rewrite macro that enables me to define pattern matching in a way quite similar to Stratego. I gave identifiers starting with "?" a special meaning: those are the variables that are matched in the pattern. The syntax of the pattern-rewrite macro:

(pattern-rewrite expr
  (lhs-pattern rhs-pattern)+
  no-match-result?)

So, let’s rewrite our replace-addition-with-subtraction example using this macro:

(pattern-rewrite '(+ 2 3)
    (+ ?x ?y) (- ?x ?y))

Resulting in:

'(- 2 3)

Of course it also works with more complex patterns.
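For the curious, a macro with this surface syntax can be sketched in a few lines. The code below is a hypothetical stand-in, not the actual implementation: the helper names pattern-var?, match, substitute and rewrite-with are all invented here, and it only handles list patterns.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical sketch: every name below is an assumption for illustration.

(defn pattern-var?
  "True for symbols such as ?x, which act as pattern variables."
  [x]
  (and (symbol? x) (= \? (first (name x)))))

(defn match
  "Matches form against pattern. Returns a map from pattern variables
  to the sub-forms they matched, or nil when the match fails."
  ([pattern form] (match pattern form {}))
  ([pattern form bindings]
   (cond
     (nil? bindings) nil                      ; an earlier element failed
     (pattern-var? pattern)
     (if (contains? bindings pattern)
       (when (= (bindings pattern) form) bindings)  ; repeated variable
       (assoc bindings pattern form))
     (and (seq? pattern) (seq? form))
     (when (= (count pattern) (count form))
       (reduce (fn [b [p f]] (match p f b))
               bindings
               (map vector pattern form)))
     (= pattern form) bindings                ; literals must be equal
     :else nil)))

(defn substitute
  "Replaces the pattern variables in template with their bound values."
  [template bindings]
  (walk/postwalk #(if (pattern-var? %) (get bindings % %) %) template))

(defn rewrite-with
  "Instantiates the right-hand side of the first rule whose left-hand
  side matches form; returns default when no rule matches."
  [form rules default]
  (or (some (fn [[lhs rhs]]
              (when-let [b (match lhs form)]
                (substitute rhs b)))
            rules)
      default))

(defmacro pattern-rewrite
  "Takes an expression, lhs/rhs pattern pairs and an optional trailing
  no-match result. The patterns are quoted so the caller need not."
  [expr & clauses]
  (let [rules   (vec (map vec (partition 2 clauses)))
        default (when (odd? (count clauses)) (last clauses))]
    `(rewrite-with ~expr '~rules ~default)))

(pattern-rewrite '(+ 2 3)
    (+ ?x ?y) (- ?x ?y))            ; (- 2 3)
(pattern-rewrite '(+ (* 2 3) 4)
    (+ ?x ?y) (- ?x ?y))            ; (- (* 2 3) 4)
(pattern-rewrite '(* 2 3)
    (+ ?x ?y) (- ?x ?y)
    :no-match)                      ; :no-match
```

The macro itself does very little: it quotes the patterns and hands them to ordinary functions, which keeps the matching logic easy to test on its own.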

The idea of being able to add features like this to a language really appeals to me. You can do stuff like this in a nice way in other modern languages as well, like Scala and Ruby, but Lisps still offer more power due to their simple syntax.

You can learn more about Clojure on its website. The Pragmatic Programmers have a nice Clojure book: "Programming Clojure".