Why I am so excited about Clojure and why I think it’s going to explode.


I am by no means an expert on this matter, but I recently interviewed at a company that works with Clojure, and while preparing I picked up a lot of material that I wanted to write about.

It’s kind of funny, because it is never easy to identify emerging technologies, but it is possible to make a prediction by looking at a variety of changes and how these changes come together.

There is very little doubt that big data is going to take off. Why wouldn’t the National Health Service want to know which conditions affect the likelihood of cancer or high blood pressure, or an advertising company want to know which of its advertising campaigns are most successful? The meaningful information we can retrieve is endless, and it just makes sense.

In addition, we need to look at our current hardware. Companies such as Intel and AMD are finding it harder and harder to develop faster processors. The exponential growth in clock speed we saw in the 90s is over; it is becoming impractical to make a single core switch much faster. So instead we are living in an era of multi-core processors, and will continue to. We can see this with quad cores in computers and even dual-core processors in phones. Don’t be surprised to see 1000 cores in your computer one day. On top of this, Google’s server strategy is horizontal, not vertical, meaning they expand by creating more server farms across which they distribute their processing workload. They’re not buying in more powerful servers, but instead have the challenge of coordinating computers across different networks.

Over the past 20 years or so OO has dominated the landscape, and rightly so. It provides structure and a simplified way of organising your code. OO design patterns such as MVC are not going away. But today we live in a world of events: mobile phones, more intricate interaction with UIs, and apps all trigger a vast number of events, and traditional OO doesn’t embrace this style of interaction as well. Functional programming copes better (see http://www.ibm.com/developerworks/library/j-ft20/). The ability to attach a load of functions that are triggered on events, and to call a function which returns a function that can then be used as an argument to another function, allows for more dynamic, flexible programming. I am not saying this is the end of OO by any means, but rather that functional programming seems to be a better fit for many programming challenges.
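To make the “function that returns a function” idea a bit more concrete, here is a minimal sketch in Clojure. The names prefixer, handlers and fire, and the :tap event, are made up for illustration and are not part of any library; only str, count, comp, get and mapv come from Clojure itself.

```clojure
;; Hypothetical example: prefixer, handlers and fire are illustrative names.

(defn prefixer
  "Returns a NEW function that prepends prefix to whatever it is given."
  [prefix]
  (fn [s] (str prefix s)))

;; A plain map from event type to the functions that should run for it.
;; One handler is produced by calling prefixer; the other is built by
;; composing two existing functions with comp.
(def handlers
  {:tap [(prefixer "tapped: ")
         (comp count str)]})

(defn fire
  "Calls every handler registered for event-type with payload
  and returns the results as a vector."
  [event-type payload]
  (mapv (fn [handler] (handler payload))
        (get handlers event-type [])))

(fire :tap "button")   ; => ["tapped: button" 6]
(fire :swipe "button") ; => [] because nothing is registered for :swipe
```

The handlers are ordinary values, so adding behaviour for a new event is just a matter of adding another function to the map.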

Now you may ask how this all fits in with Clojure.

Clojure has an amazing ability to deal with data. Since in essence everything in Clojure is data, it becomes very easy to handle and manipulate. For example, the core functions used to manipulate data work the same way across the different data structures, so you can transform data from one structure to another and still apply the same functions. This is, in essence, perfect for big data, as the data mined by companies can come in all sorts of formats and structures. In many instances it may be totally unstructured.
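As a rough illustration of the same functions working across different structures, here is a small sketch. The data and the sum-of-evens helper are invented for the example; filter, reduce, map, vals and into are standard Clojure functions.

```clojure
;; The same numbers held in four different data structures.
(def as-vector [1 2 3 4])
(def as-list   '(1 2 3 4))
(def as-set    #{1 2 3 4})
(def as-map    {:a 1 :b 2 :c 3 :d 4})

(defn sum-of-evens
  "Keeps the even numbers in coll and adds them up.
  Works on anything that can be treated as a sequence."
  [coll]
  (reduce + (filter even? coll)))

(sum-of-evens as-vector)     ; => 6
(sum-of-evens as-list)       ; => 6
(sum-of-evens as-set)        ; => 6
(sum-of-evens (vals as-map)) ; => 6, vals pulls the values out of the map

;; Moving data from one structure to another is a one-liner with into.
(def vector->set (into #{} as-vector))          ; => #{1 2 3 4}
(def list->vector (into [] (map inc as-list)))  ; => [2 3 4 5]
```

Nothing in sum-of-evens knows or cares which kind of collection it was handed.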

Traditional languages use mutable data, meaning you change the state of the data held in objects. Clojure uses immutable data structures: instead of changing an object in place, you get back a new version with the specific change applied, leaving the original untouched. For example, if you had an object Person with a name, then with mutable data structures you would call setName and the object’s state would now be changed. In Clojure you would make a “copy” of the object with the new name, leaving every other attribute the same. Of course, that sounds like a problem: if you were to make thousands of changes you would have thousands of copies of Person taking up a lot of memory. Clojure deals with this in a way that resembles a version control system such as Git. The Person object is not fully copied when a change is made; rather, the new version shares the unchanged parts with the old one and only stores what differs, and each version can still be accessed as a complete object.
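Here is a minimal sketch of the Person example using a plain Clojure map. The field names and values are made up; assoc and identical? are standard Clojure functions.

```clojure
;; An immutable "Person", represented as an ordinary map.
(def person {:name    "Alice"
             :age     30
             :address {:city "London"}})

;; assoc does not modify person. It returns a new map with :name replaced.
(def renamed (assoc person :name "Bob"))

(:name person)  ; => "Alice", the original is untouched
(:name renamed) ; => "Bob"
(:age renamed)  ; => 30, every other attribute carries over

;; The unchanged :address value is not duplicated: both versions
;; point at the very same object in memory.
(def shares-address?
  (identical? (:address person) (:address renamed))) ; => true
```

The last check is a small-scale view of the sharing described above: the new version reuses whatever it did not have to change.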

Clojure has a strong infrastructure for multi-threaded programming. Like I said before, the world we live in is moving strongly towards distributed processing and multi-core processors. Clojure is well suited to this, not only through its multi-threading support but also through the immutable data structures we talked about above. The problem arises when tracking data that is changed while being shared across different threads. If the object Person is accessed from 12 different threads and is changed, handling that change can become very difficult when multiple threads are trying to access or set the same data. Think of a group of people modifying an Excel spreadsheet at the same time. With immutable data, each thread works with its own consistent version, which can be read without the risk of collisions from data being accessed and set at the same time.
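As a sketch of the 12-thread scenario, the example below uses an atom, one of Clojure’s built-in reference types for shared state, together with future to run work on other threads. The Person data, the :visits counter and the record-visit! function are invented for illustration.

```clojure
;; Shared state: an atom holding an immutable map.
(def person (atom {:name "Alice" :visits 0}))

(defn record-visit!
  "Atomically replaces the map in the atom with a new version
  whose :visits count is one higher."
  []
  (swap! person update :visits inc))

;; Start 12 threads that each record 1000 visits, then wait for all of them.
(let [workers (doall (repeatedly 12 #(future (dotimes [_ 1000]
                                               (record-visit!)))))]
  (doseq [worker workers]
    (deref worker)))

(:visits @person) ; => 12000, no update was lost
```

swap! retries an update if another thread got in first, so no explicit locks are needed here, and anyone who dereferences the atom always sees a complete, consistent map. In a standalone script you may want to call (shutdown-agents) at the end so the JVM exits promptly once the futures are done.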

Clojure is a functional language, which brings many fun advantages to coding, namely the dynamic, expressive nature of the code you write. Yet Clojure also has access to the full Java library, because it runs on the JVM. This means Clojure has the best of both worlds: a fun, expressive syntax with the solid backing of a platform like Java.
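A short sketch of what calling into Java looks like from Clojure. The classes used (String, Math, java.util.ArrayList and java.util.Collections) are all part of the standard Java library; the variable names are just for the example.

```clojure
(import '(java.util ArrayList Collections))

;; Calling an instance method on a java.lang.String.
(def shouted (.toUpperCase "clojure")) ; => "CLOJURE"

;; Calling a static method.
(def root (Math/sqrt 16))              ; => 4.0

;; Creating a mutable Java list from a Clojure vector,
;; sorting it in place with a Java utility method,
;; then turning it back into an immutable Clojure vector.
(def names (ArrayList. ["carol" "alice" "bob"]))
(Collections/sort names)
(def sorted-names (vec names))         ; => ["alice" "bob" "carol"]
```

No wrappers or bindings are needed: Java classes and methods are called directly.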

I probably haven’t covered all the advantages of this new language, and I certainly haven’t covered the known disadvantages, but I have explained some of the pros that, considering where things are going, are hard to argue against.