Archives for category: Uncategorized

I’ve toyed with Clojure on and off for the past 6 months or so, and I’ve gotta say I love the language.

A modern lisp on the JVM, built for easy concurrency in today’s multi-core world – how awesome. There’s just one problem if you’re from a usual OO background like me … and that’s getting your head around lisp and functional programming in general!

Once you’ve got your head around the syntax and basic semantics of a language, the best way to learn is to play – and that’s where 4clojure comes in. An extensive list of challenges ranging from the very basic to nicely challenging gives you the opportunity to play with your new found skills in a quick and easy sandbox.

Be warned though – after completing the first few challenges, you’ll have a taste for blood and won’t sleep until you’ve completed them all!!

Check it out @

4clojure.com

Due to my recent use of several dynamic and lisp based languages, most notably Ruby and Clojure, I’ve taken on the challenge of learning Emacs to increase my productivity ten fold! Using my old staple Eclipse to develop in dynamic languages just seemed like too much effort – something akin to using a sledgehammer to engrave some delicate glass.

Anyway, enough rambling, and onto the tip. To try and avoid getting the infamous ‘Emacs pinky’, I’ve remapped my CAPS LOCK key to be another CTRL, and ALTGR to be equivalent to ALT – this makes the key chords so much more comfortable. To achieve this in Linux, just add the following to your .bashrc or something similar :

xmodmap -e 'remove Lock = Caps_Lock'
xmodmap -e 'keysym Caps_Lock = Control_L'
xmodmap -e 'add Control = Control_L'
xmodmap -e 'keycode 108 = Alt_L'


							

Aha! Time to learn some Erlang.

I’ve been meaning to check out Erlang, as another awesome functional language, for some time now.

And now, by the same author as ‘Learn you a Haskell for Great Good’, is a similar tutorial for Erlang.

I can barely wait to start programming some crazy agents!

Learn You Some Erlang for Great Good

Recently released, ‘Seven Languages in Seven Weeks’ attempts to walk you through a number of today’s most interesting programming languages, in the process demonstrating a wealth of different ideas, and in general expanding your horizons.

Week one of the ‘course’ gives a good baseline by starting with the solid, but interesting world of ruby. The author gives ruby the persona of Mary Poppins – solid and practical with that little bit of magic.

I’m sure many readers have used ruby before, but regardless the chapter begins as an excellent refresher (or introduction) to the syntax and ‘everything is an object’ approach of the language. This is later built upon by seeing a glimpse of ruby’s meta-programming magic – concepts that you can see filtering into other languages, such as Groovy.

Overall the chapter is a good start to the book, easing you into the process with a clean, modern language and a number of simple but fun exercises … onto Io.

For my solutions to the exercises in the ruby chapter, check out : https://github.com/lucas1000001/seven-languages-in-seven-weeks/tree/master/ruby

Collaborative filtering is a technique using in machine learning to find similarities and trends between people and items due to choices they have made, items they have looked at and ratings they have given.

Think of Amazon.com – whenever you view an item you get that ‘similar items’ section. You also get emailed the latest offers tailored to your tastes. In fact the whole site morphs into your own personal shopping centre. There is obviously some clever machine learning going on behind the scenes, and that’s what I am hoping to outline.

There are three major sections to collaborative filtering :

Rating

We need some metrics to allow us to measure how similar two people are.

Scoring

How we use the metrics to ‘score’ one person against another.

Ranking

Once we have collected all the scores, how can we use this information?

Rating

Lets make up an example. Say we have a film rental system, that infers from old rentals the movies a customer is most likely to enjoy. The main metric in this case would be some kind of movie rating system, let’s say out of 5. So here below is an example of two members of our film site.

As you can see, these people have watched and rated different films, and this will be true of virtually every member.

Scoring

So the first problem we come up against is ‘how to compare two people?’.

As (we just mentioned) two people will have watched (and rated) different films, then the first thing we have to do is to find the intersection of films rated by them. So simply chop out non-common films :

Now taking the remaining intersection of films, we can grab a unique vector from each customer, describing their personal preferences.

Mr A          Ms B

|2|             |4|

|5|             |2|

Now we have a comparable format, we can simply use one of the ‘out of the box’ scoring algorithms such as :

Euclidean Distance

Pearson Coeffient

These basically take the pair of vectors, work their magic and return a score. The higher the score the closer match the users are.

If you’ve had a look at those above algorithms, you may be a bit overwhelmed by the maths – don’t worry, underneath it all they’re pretty simple, I’ll take Euclid as an example.

We’ll throw a new person in the mix, ‘Dr X’, he’s pretty evil, so loved ‘Aliens’ (5/5), and totally hated ‘ The Notebook’ (1/5). Let us compare these three customers using Euclidean Distance :

From the graph you can see the distance between Ms B and Dr X is smaller than that between Ms B and Mr A – this means their taste is more similar (with respect to these two films). All we need to do now to get a decent score is invert the distance, so that we have more similar people having higher scores (not smaller ones), all we do for this is :

score = 1 / (1 + dist)

and voila – we have our scoring system to compare users.


Ranking

Now the hard work is out of the way all we have to do is :

1) Choose a target user, (to find their closest matches)

2) Score this user against every other using the method above.

3) Order them by their scores.

So, as we saw, to find out Ms B’s matches, you figure out her score w.r.t Mr A and Dr X and simply rank them. So Dr X comes top followed by Mr A.

We now have a ranked list of similar users, allowing our film application to recommend like people, and the movies they have rented – and it’s as simple as that!

You can perform exactly the same actions on the items themselves (by simply swapping places of the users and items). This secondary technique is called item-based filtering as opposed to what we just went through which was user-based filtering.

If you want to check out more of this kind of thing I recommend taking a look at the book http://www.amazon.co.uk/Programming-Collective-Intelligence-Building-Applications/dp/0596529325/ref=sr_1_1?ie=UTF8&s=books&qid=1268602889&sr=8-1

Follow

Get every new post delivered to your Inbox.