I Am Not Sam Newman

It has come to this. After many years of mis-directed mail, I have finally decided to put pen to paper (well, photon to monitor, but you get the idea) and state that I Am Not Sam Newman.

World, hear me now. It is possible – nay likely – that more than one person has the same combination of first and last names as another individual. We know for example that there are at least 54 Dave Gormans in the world. That bloke went to the trouble of creating an entire TV series about the fact that the whole first name/surname thing doesn’t guarantee a unique identifier for human beings. The Chinese, to their credit, worked this out a while ago.

Now I’m trying to be nice about it. I have decided not to publish the emails from people asking me if I want to do documentary voice overs, well-wishers hoping my testicles get better soon (well, I think they meant prostate), or the offers to speak on the corporate circuit about my hilarious non-pc anecdotes about how I once called someone a monkey. Others, in a similar position to me, have very much gone on the offensive in this regard, but I’m not quite as funny as Tony Hawks (the comedian, not the skateboarder).

So, oh blogosphere, hear my cry – I Am Not That Sam Newman – the controversial Australian sports personality. And for the record, despite the fact that I live in the UK, I’m not the other Sam Newman either – the actor known for his voice over work, appearances in Holby City and the forthcoming lead role of Prince Andrei in War & Peace starring Brenda Blethyn and Malcolm McDowell.

I Am this Sam Newman.

But yes, I am related to Paul Newman. Feel free to forward on any royalty cheques my way.

Clojure editor/IDE options – IntelliJ v Emacs

So all the cool Clojure kids keep wanting me to use Emacs. The problem is that I haven’t used Emacs for the last 10 years – since, in fact, I had to support a C application on about 7 different flavours of UNIX. As you can imagine, I’ve since expunged many of those past memories.

My IDE of choice – ever since I joined ThoughtWorks – has been IntelliJ. Yes, I had to spend my time in the wilderness with Eclipse, long enough that I feel well placed to compare the two and consider IntelliJ superior for the languages I use often. La Clojure now seems to play nicely with IntelliJ’s Community Edition, so I’m giving that a try.

Ultimately, I’m learning a new language, one which often requires my brain to work in a quite different fashion than it is used to. As such, I’m trying to limit the number of new things I have to deal with. If, however, I’m missing out on something by not using Emacs, I may be persuaded to give it a go. So can anyone out there tell me what I’m missing?

Build Pattern: Movable Checkin Gate

The Checkin Gate defines a set of tests which need to pass before a developer checks in. Typically, the tests are a subset of the total test suite – selected to provide a good level of coverage, whilst running in a short space of time.

There is an inherent trade-off with a Checkin Gate though – because it only runs a subset of the tests, you may end up with blank spots in the coverage the gate provides, which can increase the frequency of build breakages in your Continuous Build. By applying a Movable Checkin Gate, you attempt to offset this shortcoming by changing what is in the Checkin Gate suite.

Selection Based On Planned Work

Periodically, you assess the kinds of work coming up. If you are using an iterative development process, you may do this at the beginning of each iteration. Based on the kinds of changes the team will be working on during the next period, select tests which cover these areas of code, removing others which cover functionality unlikely to change. The theory is that you are selecting tests that cover areas of code which are most likely to get broken. The tests should be selected such that they don’t exceed your Build Time Limit.

After each movement of the suite driving the Checkin Gate, you can assess the success by looking at the failure rate of the Continuous Build.

The key is to have a series of well categorized tests – tagging could work well here.
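As a rough illustration of tag-based selection, here is a minimal Python sketch. Everything in it is hypothetical: the `TaggedTest` record, the `select_gate` function, the tag names and the timings are mine, and the "cheapest test first" ordering is just one simple way of staying under the Build Time Limit, not a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedTest:
    """Hypothetical record for one test: its name, tags and typical run time."""
    name: str
    tags: frozenset
    seconds: float


def select_gate(tests, planned_tags, time_limit):
    """Pick tests whose tags overlap the planned work, within the time limit.

    Cheapest tests are considered first so that as many relevant tests as
    possible fit; a test that would exceed the limit is skipped.
    """
    relevant = [t for t in tests if t.tags & planned_tags]
    relevant.sort(key=lambda t: t.seconds)

    gate, total = [], 0.0
    for test in relevant:
        if total + test.seconds <= time_limit:
            gate.append(test)
            total += test.seconds
    return gate


tests = [
    TaggedTest("checkout_total", frozenset({"checkout"}), 20),
    TaggedTest("login_ok", frozenset({"auth"}), 5),
    TaggedTest("checkout_tax", frozenset({"checkout", "tax"}), 30),
    TaggedTest("search_basic", frozenset({"search"}), 10),
]

# Next iteration touches checkout and auth; the gate may take 40 seconds.
gate = select_gate(tests, {"checkout", "auth"}, time_limit=40)
print([t.name for t in gate])  # ['login_ok', 'checkout_total']
```

Note that `checkout_tax` is relevant but left out because it would push the gate past the limit – exactly the kind of blank spot you then watch for in the Continuous Build.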

Selection Based On Build Failure

An alternative technique for selecting the makeup of the Checkin Gate can be based on build failures. If tests not in the Checkin Gate start failing in your Continuous Build, put them into the Checkin Gate suite, swapping out other tests to keep you below your Build Time Limit.
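The swap described above could be sketched along these lines in Python. The function name, the test names and the durations are all made up for illustration, and evicting the longest-standing gate members first is an assumption on my part – you might equally evict the slowest, or the ones that have gone longest without failing.

```python
def promote_failures(gate, failures, durations, time_limit):
    """Move tests that failed in the Continuous Build into the Checkin Gate.

    gate       -- ordered list of test names currently in the gate
    failures   -- test names that failed in the Continuous Build
    durations  -- dict mapping test name to run time in seconds
    time_limit -- the Build Time Limit for the gate, in seconds
    """
    new_gate = list(gate)
    # Add any failing test that the gate does not already run.
    new_gate.extend(t for t in failures if t not in new_gate)

    # Swap out older gate members (never the failing ones) until we fit.
    evictable = [t for t in gate if t not in failures]
    while sum(durations[t] for t in new_gate) > time_limit and evictable:
        new_gate.remove(evictable.pop(0))
    return new_gate


durations = {"a": 10, "b": 20, "c": 15, "d": 25}
print(promote_failures(["a", "b", "c"], ["d"], durations, time_limit=60))
# ['b', 'c', 'd']
```

If the failing tests alone exceed the limit, this sketch simply returns an over-budget gate rather than dropping a known-failing test; whether that is the right call is a team decision.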

Updates

Added link to the new Build Time Limit Pattern.

Data Transformation and Language Syntax

I’m currently working on a personal project by way of learning Clojure – it’s actually a program to match up my itemised phone bill against my list of contacts to help me expense my calls. I find it best to have a real-world problem I need to solve to learn a new programming language. The problem itself is rather dull, but it did give me a chance to consider an issue I’ve hit with many other languages.

One of the core parts of my telephone expense program is the process of normalising phone numbers so I can match them up. What I am trying to do is something along the lines of:

Strip spaces, then add the missing area code, then internationalize it

So in Clojure there are a number of functions I’ve written, each of which takes, and returns, a string (the program is nowhere near finished, so consider this to be virtually pseudo code):

;; The parameter is named s rather than str to avoid shadowing clojure.core/str
(defn ^String normalize [s]
  (internationalize (add-missing-areacode (strip-spaces s))))

In Java, this would look like:

public String normalize(String str) {
  return internationalize(addMissingAreaCode(stripSpaces(str)));
}

The problem is that I, and most of the western world, read from left to right – with both Java and Clojure I’m having to read from right to left to determine what is being done. One system I use frequently has a construct which matches what I’m after – UNIX:

strip-spaces "44 1230 9183" | add-area-code | internationalize

So what other languages support this kind of construct? I suspect I could coax Scala into doing something like this, and it seems that it is right up Python’s alley (Django’s excellent templating system has filters which do exactly that). But if I want to use Clojure, am I stuck with this inside-out programming model? What other JVM-based languages would help me here – Ioke perhaps? It seems right up AINC’s alley, but that syntax makes me want to cry…

Update 11 Jan 2010: Thanks to Matt for pointing me towards Clojure’s ‘->’ macro. This looks pretty close to what I’m after. So I *think* I should be able to do something along the lines of:

(-> phone-number strip-spaces add-missing-areacode internationalize)

Which is very cool.
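For comparison, the same left-to-right reading can be had in Python with a tiny helper. This is only a sketch: `pipe` is my own helper rather than anything built in, and the three normalising functions use invented rules (a default area code of 020 and a country code of 44) purely so there is something to run.

```python
from functools import reduce


def pipe(value, *steps):
    """Feed value through each step in turn, reading left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


def strip_spaces(number):
    return number.replace(" ", "")


def add_missing_area_code(number, area_code="020"):
    # Assumed rule: a number starting with 0 or + already has its area code.
    if number.startswith(("0", "+")):
        return number
    return area_code + number


def internationalize(number, country_code="44"):
    # Assumed rule: swap the leading 0 for +<country code>.
    if number.startswith("+"):
        return number
    return "+" + country_code + number[1:]


def normalize(number):
    return pipe(number, strip_spaces, add_missing_area_code, internationalize)


print(normalize("7946 0123"))         # +442079460123
print(normalize("020 7946 0123"))     # +442079460123
print(normalize("+44 20 7946 0123"))  # +442079460123
```

The steps in `normalize` now read in the order they are applied, which is all I was really after.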

The Great Rewrite

There is a rustle in the post-it notes. The water cooler ripples. USB-powered missile launchers inexplicably fire, whilst nerf guns jam mid-battle. There is the smell of sulfur in the air. The Great Rewrite Approaches.

The signs were there. Grumbling from the developers – sometimes new to the project. “This code is horrible!”, “Completely unfit for purpose!”, “If only we could start again…”.

Delays to new functionality are laid at the door of the code. The one and only solution now on offer is to rewrite the entire codebase – nothing short of this will help. Eventually, managers are won over, and The Great Rewrite begins.

It is an epic undertaking. Some poor fools have to stay behind and look after the existing system, whilst others forge ahead into a brave, new world, leaving the horrid, old, decrepit and so uncool system behind.

Morale soars – the developers have a spring in their step. The business, initially, is confident. “Don’t worry – the new version is right around the corner!” they are told. Meanwhile support for the existing system is suffering – the team maintaining the existing codebase is a fraction of the size it used to be, and most of the senior technical people have to be on the rewrite.

The natives grow restless – the system they use, day in, day out, isn’t moving on. Feature requests seem to disappear into a black hole. “Soon” they are promised. “Soon, all your dreams will come true! Once The New System is launched, what you want is top of the list!”.

Months pass. And still, the rewrite continues. But it is closer now – inching towards readiness. Finally, long overdue, The New System is ready. The users are excited – all the recent troubles are to cease, as The Great Rewrite is over.

And now, the launch day.

There are bugs. Things that used to work don’t work any more. There are few, if any, new features. The system is new, but doesn’t offer the users anything new – yet they still have to get to grips with The New System. The disgruntled emails start.

“Don’t worry!” says the Project Manager. “Now The Great Rewrite has finished, the new features will arrive any day now!”

And some of them do. Initially, at least, new features are easier than before to create, and ship. But after time, the same problems with the code base emerge. It turns out that having the same group of people building the same old system without changing their approach or ideas doesn’t lead to a different type of system. They never had to deal with the old issues head-on, they just sidestepped them, pressing on into the greenfield.

More time passes. Features take longer to ship, the code is harder to deal with. And once again, talk turns to another Great Rewrite…

A Brief And Incomplete History Of Build Pipelines

Recently, both Paul Julius and Chris Read pointed out that I was perhaps the first person to document the concept of build pipelines, at least in terms of how it relates to continuous integration and the like. As it turns out, the original posts on the subject are from further back than I remember:

I plan to pull together my previous posts on the subject and update them a little, but in the meantime thought I’d give a bit of background as to where much of this came from.

A Harsh Introduction

My first exposure to continuous integration was by being dropped in at the deep end during my first ThoughtWorks project. The project in question was for an electronic point of sale system, and at its peak had over 50 developers in three countries working on the project. During this time I started reading up on the topic, specifically Martin Fowler & Matt Foemmel’s paper on the topic (Martin has since created an updated version).

Much of the experience at this first, large project was dominated by long, slow build times, caused in part by an inability to separate out activities being performed by individual teams. A full discussion of the things we learnt from that project can certainly wait for another time, but I came out of that experience liking the concept of Continuous Integration, yet feeling incredibly constrained by the actual implementation.

Monkeying Around

Subsequently, I worked as what we used to call a ‘Build Monkey’ at a London-based ISP. My role (which we now tend to call an Environment Specialist) was typically to identify the causes of build failure, keep the build running smoothly, and manage deployments to a number of different environments. Throughout this time, discussions around the theory behind managing Continuous builds for larger software teams were continuing – primarily with colleagues like Julian Simpson, Jack Bolles & others.

The challenge we seemed to face, time and again, was how you balance the various activities associated with getting software from developers’ machines into production, all whilst providing the fastest feedback possible.

Typically, we came at the problem from two different directions – in the first instance from the point of view of how to hammer our tools into supporting the kind of processes involved, but the more important angle was understanding what the pipeline – from developer workstation to production – actually was. This thinking can now be best thought of in terms of Continuous Deployment – although that topic is far more nuanced than the often simplistic thinking regarding systems where 50 deployments a day is possible, or even desirable.

The Present Day

Since I wrote my original articles, many other people have done work in this area, to the extent that tools like ThoughtWorks’ own Cruise build support for build pipelining & visualisation directly into the tool.

Update 1: Corrected spelling of Paul’s surname – sorry Paul!

Moving to Tumblr

I always had this blog to write, not to run a blog. I’ve written less and less over the last couple of years and part of this is down to the overhead of maintaining wordpress. My plan is to switch to a clean, hosted solution – and Tumblr is looking like what I want. I plan to migrate everything over ASAP, but ASAP is proving to be not quite as soon as I’d like.

The migration plan is looking like this:

  1. Set up blog.magpiebrain.com to point to my Tumblr blog
  2. Start posting there, not here
  3. Write a script to export my posts from here to Tumblr.
  4. Write a script to export my comments from here to Disqus.
  5. Set up permanent redirects from the old posts to the new home at Tumblr.

A stub webserver for Scala using Jetty

I knocked this up to help testing on something I’ve been working on in my spare time. It would be a trivial exercise to extend this to build pages for specific URLs – this example returns the same markup for any URL. The old codehaus site for Jetty contains lots of examples of how to configure an embedded server.

import org.eclipse.jetty.server.handler.AbstractHandler
import org.eclipse.jetty.server.Handler
import org.eclipse.jetty.server.Server
import org.eclipse.jetty.server.Request
import javax.servlet.http.HttpServletRequest
import javax.servlet.http.HttpServletResponse
import scala.xml.Elem

class HttpServer {

  val handler = new MutableHandler()

  def run(port: Int) = {
    val server = new Server(port)
    server.setHandler(handler)
    server.start()
  }

  def updateHtml(html: Elem) = {
    handler.html = html
  }
}

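protected class MutableHandler extends AbstractHandler {
  // Markup returned for every request; swapped out via HttpServer.updateHtml
  var html: Elem = <h1>Hello</h1>

  // org.eclipse.jetty's AbstractHandler passes the Jetty Request (baseRequest) as the second argument
  override def handle(target: String, baseRequest: Request, request: HttpServletRequest, response: HttpServletResponse): Unit = {
    response.setContentType("text/html")
    response.setStatus(HttpServletResponse.SC_OK)
    response.getWriter().println(html.toString())
    // Mark the request as handled so Jetty doesn't pass it on to any other handler
    baseRequest.setHandled(true)
  }
}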

Genius is Genius

It just threw up this playlist for me on the iPhone – it seems my own personal DJ has finally arrived:

  • Beast Of Burden – Rolling Stones
  • Idioteque – Radiohead (live)
  • Born Under A Bad Sign – Jimi Hendrix
  • I Got Mine – The Black Keys
  • It’s Hard To Be a Saint In the City – Bruce Springsteen & The E Street Band
  • The E Street Shuffle – Bruce Springsteen & The E Street Band
  • Who’s Gonna Save My Soul – Gnarls Barkley
  • Play With Fire – Rolling Stones
  • I Taught Myself How To Grow Old – Ryan Adams
  • House Of Cards – Radiohead
  • Delirious Love – Neil Diamond
  • Old Enough – The Raconteurs
  • You Don’t Know What Love Is – The White Stripes
  • Shake Appeal – The Stooges
  • Shattered – The Rolling Stones
  • Oh Yoko – John Lennon
  • True Love Way – Kings Of Leon
  • Hear My Train A Comin’ (Acoustic) – Jimi Hendrix
  • Tell Me Why – Neil Young
  • Spirit In The Night – Bruce Springsteen & The E Street Band
  • Psychotic Girl – The Black Keys
  • Surprise – Gnarls Barkley
  • New York Serenade – Bruce Springsteen & The E Street Band
  • All I Need – Radiohead
  • Rest My Chemistry – Interpol

Wordpress Site Hacked

I’ve been neglecting things here at Magpiebrain Towers since my move to San Francisco. Blame the sunshine, a demanding new client, or just a relapse of extreme apathy. Whatever the cause, it seems that my inattention has been rewarded.

Simon sent me an email the other day, wondering where the blog had gone. I brought up Firefox and checked – “No, the site is still there” I said. But it wasn’t as far as Google search was concerned.

Redirection based on Referer

It turned out that via Google search, when you clicked on links for Magpiebrain then you were redirected to a suspected malware site called ‘your-needs.info’. I immediately blamed Google. Luckily, calmer heads prevailed, and someone far more knowledgeable than me pointed me at these interesting results:

$ curl -I http://magpiebrain.com
HTTP/1.1 200 OK
Date: Thu, 29 May 2008 18:00:52 GMT
Server: Apache
X-Pingback: http://www.magpiebrain.com/xmlrpc.php
Vary: Accept-Encoding
X-Powered-By: The blood, sweat and tears of the fine, fine TextDrive staff
Served-By: TextDrive
Content-Type: text/html; charset=UTF-8
$ curl -I -H "Referer: http://www.google.com/search?q=sam+newman" http://magpiebrain.com
HTTP/1.1 302 Found
Date: Thu, 29 May 2008 18:00:57 GMT
Server: Apache
Location: http://your-needs.info/search/index.php?q=sam+newman
Vary: Accept-Encoding
X-Powered-By: The blood, sweat and tears of the fine, fine TextDrive staff
Served-By: TextDrive
Content-Type: text/html; charset=UTF-8

So, it seems that if Google is the referer, then the browser is redirected to some shitty spam site.
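If you want to run the same two-request comparison from a script rather than by hand, something like the following Python sketch would do it. The function names are mine, and `head` is only a bare-bones helper for plain http:// URLs built on the standard library’s `http.client`; treat it as a starting point rather than a proper scanner.

```python
import http.client
from urllib.parse import urlsplit


def head(url, headers):
    """Send a HEAD request (without following redirects).

    Returns (status, value of the Location header or None).
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/", headers=headers)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def referer_redirect(fetch, url, referer):
    """Return the redirect target if only the Referer request is redirected.

    fetch is any callable taking (url, headers) and returning
    (status, location), such as head above.
    """
    plain_status, _ = fetch(url, {})
    ref_status, ref_location = fetch(url, {"Referer": referer})
    if plain_status == 200 and ref_status in (301, 302, 303, 307, 308):
        return ref_location
    return None


# Example call (needs network access, so it is left commented out):
# referer_redirect(head, "http://magpiebrain.com",
#                  "http://www.google.com/search?q=sam+newman")
```

Passing the fetcher in as an argument means the comparison logic can be exercised with canned responses, without hitting a live site.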

Cunning barstweards.

Lax Wordpress Upkeep to blame?

When I found out that it was my site that was to blame, I immediately started poking around. I checked my .htaccess file – thankfully this was clear. Next, I disabled all the plugins in Wordpress, but still the redirect worked. Finally, I moved index.php out of the way – thereby stopping Wordpress from being invoked – and sure enough the redirection stopped. So Wordpress was to blame.

However, my inaction in writing posts for the site has also extended to not actually keeping Wordpress up to date. So my first course of action was to upgrade from 2.3.3 to 2.5.1. The upgrade process was seamless as always, but the referer hack remained.

Fix

A recent thread over at wordpress.org helped me find the solution. By poking around in wp_options and removing a row with an option_name of rss_f541b3abd05e7962fcab37737f40fad8 the problem went away. Right now it isn’t clear which exploit was used, or how many sites were affected, but the thread I found was pretty recent which implies this may be a new issue.

Insidious Hack

This really is quite a good use of what appears to be a Wordpress exploit. The only way in which this hack becomes apparent is if you check your analytics frequently (which I don’t – my ego is already big enough without stoking it by looking at hits) or if you perform a Google search for your own content, which happens rarely. What is in it for the hackers is harder to see, other than driving traffic to a bogus search engine that pushes prescription drugs.