Friday, 24 January 2014

A Ranty and Dogmatic Troll Masquerading as Coding Guidelines

This document represents a series of guidelines for writing code that will swiftly pass code review with my team. It doesn't attempt to be fair. It doesn't attempt to listen to your opinion. It does present a series of guidelines. You may chose to ignore those guidelines (after all, if they had to be obeyed, we'd have called them "laws") We may choose to point back to these when reviewing your code. We'll attempt to avoid being (passive) aggressive when we do so.

Hugs and cuddles.

Test first
We're working on a framework for writing automated tests. It's probably a good idea for us to lead by example and write some tests. It hasn't been proven — but it's a scientific fact — that writing tests after the fact is as boring as boring can be. So we write the tests first. This has the added advantage that we can think about the kind of functionality we want independent of how it's implemented.

KISS, YAGNI
Or "Keep It Stupidly Simple, You Ain't Gonna Need It". When writing new code, just write the code you need. Don't attempt to write some ├╝ber-generic, ultra-lightweight, whizz-bang sub-framework for handling every conceivable edge case when you only need to do "this thing" once. It's another Scientific Fact that engineers are bloody awful at spotting patterns before the patterns emerge. Wait until the third or even fourth time you repeat something before attempting to extract a common API.

A good book for this? "Refactoring to Patterns".

Prefer composition over inheritance
Let's admit this up front: Java's a deeply flawed language. If I had to write this crap in vi or emacs, I'd be sitting in a corner, rocking backwards and forwards, weeping gently and pulling at my hair. Fortunately, we have IDEs, so It's All Okay. One area where Java is flawed is that it's really easy to inherit from a base class, and really clumsy to do proper composition. Nevertheless, as a rule of thumb: inheritance of interfaces is cool, subclassing a concrete base class is not.

Now, there will be times where it'd be Really Useful to be able to share some common functionality between things that appear to be related. For some reason, this always seems to happen with base test classes. It's a weird fetish, and it's one we should be disabused of. There are better ways to handle this, possibly through JUnit's Rules, or by extracting the common functionality into its own class and just newing it up on demand.

Point is: if you're using inheritance to share some common functionality between otherwise unrelated classes (and "it's a test" doesn't make them related) you're not doing inheritance right. You are, in fact, Doing It Wrong.

Role-based interfaces #ftw
Role-based interfaces: we like them, we use them, and we encourage other people to use them. Yes, this will mean that your classes might implement a lot of interfaces. That means that they collaborate with a lot of other classes. That tells you something. It probably tells you your class has too many roles or responsibilities.

Remember, kids, inside every fat class are at least two classes waiting to climb out.

Expose Collaborators
We take the use of Dependency Injection as an article of faith. This does not mean that we whole-heartedly embrace the need for a "DI Container", but it does mean that we're huge fans of exposing the collaborators of a class via its constructor (because "constructor-based DI")

Having said that, "new" is not a dirty word. It's totally pukka to use it, particularly for those handy utility classes we mentioned above. Having said that….

Utility Classes: Just Say No
A utility class represents a severe failure of imagination. They tend to become dumping grounds for only loosely related methods, which are typically static. If you have one of these, a fun exercise is to decompose it into what I like to refer to as "Objects" and use those instead. If you're having trouble with the traditional "noun-based" approach to identifying classes, try a bit of London School TDD and see what shakes loose.

Singletons? Static Methods? Also No
Singletons (in the traditional "implemented as a static field in a class" sense, not in the "ideally we'd only have one of these" sense) destroy our ability to have fun and write tests that can run in parallel, slashing our potential productivity. Also, it leads people to start using the Service Locator pattern instead of Dependency Injection, and we take DI as an article of faith (see above), mainly because it facilitates TDD by making collaborators clear, like we (also) said above.

Use strong types where possible
We're using a high ceremony language. Might as well embrace that properly. We dislike Stringly-typed code, and we like Tiny Types. Why do we like them? Because they allow our code to express intent as clearly as possible, and we can do things like "hang behaviour" off them as the need arises.

BTW, this means that we really should never be returning "WebElement" from a Page Object. Return a class that models the thing the user would expect to be returned, even if that leads to a class with nothing but a constructor.

Use the most abstract type that conveys intent for variables and fields
The most abstract type explains to the reader what you do. The concrete type is how you're going to do it. You should be able to change your mind about "how" without needing to change the "what". The most abstract type that conveys intent for a variable ("Map" instead of "HashMap", "WebDriver" instead of "DroidDriver")

I've been asked whether a method should return an ImmutableSet or just a Set when it returns a set of things that both sorted and immutable. Redundant as this question may seem, I'll still have a bash at answering it. Return the ImmutableSet, as that conveys the intent of the return value. If the caller doesn't care about the mutability of the set, they can assign it to a Set. Everyone's happy.

Use the right naming convention
The naming convention in our codebase is a hangover from the Android coding style, which was created by people who wrote C++ for a living (which also explains why there are so many static methods in the framework too) We don't write C++ for a living, and using a foreign language's coding conventions in Java code makes you look like a clown and an arsehole.

However, it appears we're perfectly happy to present ourselves as people who find it hard to get dressed in the morning without hurting ourselves. Consequently, when writing code that's just for us, use the coding standard of the rest of the codebase and choke back the waves of nausea. If you're writing code that we'll put in front of a public who believe we're competent engineers (that is, OSS) use the Google coding standard (effectively Oracle's, but with a two space indent)

Use the Ubiquitous Language, Luke
If everyone calls it a "Self Aggrandizsing Wattle", don't name the class "AndroidPoweredMultiflexViewPort". We want people to find the classes we write, and we want them to understand how they relate to other classes in the system. Using the names that people call things as class names is Totally Cool.

Also, if you've not done so, go and grab a copy of Domain Driven Design and attempt to wade through as much of it as you can bear. Then skip to the end and read the bit about Anti-Corruption Layers. That bit's good.

Naming Things
Design Patterns are a means of communication, not blueprints. Similarly, the thing that makes classes interesting isn't what pattern they happen to implement, it's the role that they play in our system. Leave the pattern name off the class name, ok? The exception to this is the "Builder" pattern. Everyone expects the "Builder of Thing" to be called "ThingBuilder". We might as well go with the flow on this one and buck our own contrarian ways.

Similarly, every concrete class is an implementation of something, so using the postfix "Impl" (presumably if you're too lazy to name something properly, you're too lazy to type "Implementation") as a class name is a Dumb Thing To Do. Name the class for the particular thing that makes it interesting within the system, or prefix the name with "Default" if there is genuinely nothing interesting about it. Try and avoid naming the class around some obscure implementation detail that no-one using the class cares about.

BTW, it's acceptable to append the name of the interface being implemented to the class name, but it's better to try and name the class for the role it plays in the system.

Keep It SOLID
  • Single Responsibility Principle
  • Open/Closed Principle
  • Liskov Substitution Principle
  • Interface Segregation Principle
  • Dependency Inversion Principle
Presumably you're a Software Engineer. If any of those are hard to understand, then I'd suggest using your favourite search engine to read up on them.

Document in Proper English That Which Needs Documenting
It's a safe assumption that anyone actually reading your code is a professional software developer. Telling them stuff that they can see just by reading your code in javadocs and comments is Not Helping, so don't do it. Use comments to explain the reasoning behind particular design decisions, or to alert people to odd corner cases that might actually need explanation.

Consider replacing one line comments of a block of code with a sensibly named method containing that block of code. After all, when it comes to maintaining this shit, only the most dedicated of developers will actually update the code and the docs.

It's cool to use correct grammar and punctuation. Please do so, and try and end sentences with a period cretaceous

Monday, 4 November 2013

On the Naming of Tests

I'd thought that this was part of the automated testing canon already, but apparently not, so a quick note on the naming of tests appears to be in order. Well, how I think tests should be named. :)

When using an xUnit-style framework, the common pattern is to test class Foo in another class called FooTest. Within this test class, there are several methods. The principle I like to follow is that if you took the name of the test class, stripped off the "Test" postfix, and then listed the names of the tests as bullet points, you'd end up with a list of roles and responsibilities of the class under test. You'd end up with something like:

Foo
  * Should eat cheese
  * Should not consider cake as cheese
  * Should handle null cheese by throwing a SpecificException

And so on.

Put another way, if someone were to delete the class under test and the bodies of the tests, could they recreate something functionally identical to the class under test using just the test names? 

Monday, 9 September 2013

Time Keeps Ticking

A note so that I never forget again: the time used by a ZipEntry instance in Java appears to keep ticking.

ZipEntry entry = new ZipEntry("foo");
long expected = System.currentTimeMillis();
entry.setTime(expected);
Thread.sleep(3000);
long seen = entry.getTime()

// This fails
assertEquals(expected, seen);

Update: it turns out that the problem turns out to be that DOS timestamps only store seconds with a precision of 2 seconds. The above could be reduced to:

ZipEntry entry = new ZipEntry("foo");

// Note: we set the seconds to an odd number
long expected = Calendar.getInstance()
    .set(2013, SEPTEMBER, 10, 12, 14, 1)
    .getTimeInMillis();
entry.setTime(expected);
long seen = entry.getTime()

// This fails
assertEquals(expected, seen);

More on MSDN.

Monday, 22 July 2013

On Managing Time and Direct Emails About Selenium

I tend to get a few requests a day asking for help with a problem with Selenium. I almost never reply to these. It's not (just!) because I'm an evil minded, grumpy so-and-so, but because it's not a great use of anyone's time.

I am just one person. I have a full time job, and family and friends that I like to spend time with. I work on Selenium as a volunteer, and that means fitting it in where I can. Fortunately, work are understanding about this, and support my role in the project, which means I do far more than if I only had the occasional evening free. Still, it does mean that I prioritise my time spent on the project. This is how I do so:

  1. Writing code. That comes first.
  2. Spend time on the IRC channel. I often just lurk here, but it's a handy way to talk to the core development team and keep an eye on what's going on.
  3. Answer emails to the selenium-developers group. This is where we run the project and hold design discussions. It's not a good place to ask for help, unless that help is about implementing Selenium itself.
  4. I scan the webdriver and selenium-users lists, answering questions where I can and provided I have time.
Now, the nice thing with this ordering is that the further down the list you go, the more people there are who are able to help you with your issue. Put another way: asking for help in the user lists means that you're far more likely to get the help you want. It also means that if someone else runs into the same problem, Google can come to their rescue. A private email doesn't have that benefit.

I know that may be frustrating for you. I know that it seems to make sense to contact prominent people on the project directly. I understand your particular issue is urgent and important to you. I really do, and that's why I don't answer your emails.

Friday, 19 July 2013

Being Part of a Distributed Team

At work I'm part of a distributed team. Two colleagues and I are based in London, and the rest of the team is in Menlo Park, California. It's been reminding me of some lessons that I've learnt over the years about working as part of a team that's spread across time zones, and I thought it might be nice to share some of them. Without further ado:

1. Attempt to get code reviewed by the person most familiar with the area but also by someone who's on the same site as you. This suggests that singletons on a site of their own are a suboptimal thing. Which leads to....

2. ... be tolerant of clean up diffs. If the local reviewer approves a change in the code review tool the chances are high it'll be landed. The author of the code is responsible for not being a clown: if there are fundamental design decisions that are still unresolved then landing the change, even if the local reviewer is ok with it, isn't a winning move.

Code is a plastic thing and we have source control. We can fix things up. Taking advantage of that is a Good Thing.

One useful pattern I've seen is to check in a failing but ignored test. It a lovely way of moving things forward without jamming the works, and leaves a clear trail of intent.

Another example of being good at this: the code reviews which get approved with a list of nits and changes to make before landing. I think that shows great trust in your team, and I like it. It goes without saying that if you're using a system like Gerrit (update: which submits the code when it sees that a diff has been approved), then this "ok but please fix" approach won't work as well :)

3. Time zones are evil but we must live with them. A great habit for the Brits to get into is to have done code reviews for USian teammates by 3pm BST. For those teammates in America, reviewing code by the British first thing when you get to the office in the morning is an extremely helpful thing do. If you're working with colleagues in Australia, India, China or elsewhere on that side of the globe, be aware of when they come into the office and have your code reviews done by then.

The reasoning is clear: it gives everyone as much time as possible to turn code around and get it reviewed again, as this is when the team's hours overlap. That, in turn, helps the team as a whole move fast.

4. Bear in mind the hierarchy of communication:
nothing -> email group -> email -> IM -> VC -> in person

The further to the right you are, the lower the latency and the higher the bandwidth. The higher the bandwidth, the quicker misunderstandings can be resolved and design choices made.
Corollary 1: if a review is dragging on, hop on to VC, Skype or a Google Hangout, or go and chat to people.

Observation: the further to the left you are, the more asynchronous communication becomes. If time isn't of the essence, then head left.

5. There's nothing like being in the same place. Travel is bad because it means you're away from family, your regular routines and all the things you love about home. Travel is great because you get to go to new places, hang out with the locals and get to see how other offices work. Although it's probably more financially astute to make the smaller part of the team travel to the larger, it's also unfair to expect the travel to always be done by one part of the team. It turns out that a sense of fairness is the thing that will keep the spirits up in the team and keep everything ticking along nicely.

I'm sure there's more that needs to be covered. Things like scheduling meetings and alternating times so that people don't always need to stay late or get up stupidly early, or having a useful glossary of ambiguous terms ("let's table this discussion") and other issues and hiccups, but I've been writing for a while now and this is getting long. So I'll stop.

Monday, 24 June 2013

The UNIX Philosophy, WebDriver and HTTP Status Codes

The UNIX philosophy can be described in many ways (and the Wikipedia page has plenty), but I've always admired its practical application in the wealth of shell commands available to a user. Rather than having a single command that Does Everything, the UNIX shell is a place of small commands focused on doing one thing well, yet which are easy to link together.

For example, I recently needed to compare the contents of two JAR files and remove class files that were duplicated from one of those jars. I ended up generating the list of shared files via:

comm -12 <(jar tf first.jar | sort | uniq) <(jar tf second.jar | sort | uniq) | grep -v META

I doubt very much whether the authors of any of those tools thought that this is what I'd be doing, yet because the tools are carefully focused and are easy to chain together, this is a trivial thing to do.

How, you may ask, does this apply to Selenium? And specifically issue 141? For those of you who can't be bothered to read the incredibly long list of comments on that issue (now at over 100), this is the one about being able to get HTTP status codes from the WebDriver API. The comments are split between those saying that this functionality doesn't belong in the API, and those who (occasionally very vociferously) claim that it does. 

From a philosophical perspective, the WebDriver API is attempting to model a user interacting with their browser. We attempt to limit the APIs we offer to just those that meet this need, only allowing ourselves to extend it to those very clear cases where the browser is the Source of Truth about a particular thing (such as with cookies), or where there's no rational way to cleanly offer a facility (such as executing Javascript --- incidentally, something that I spent a lot of time keeping out of the API)

HTTP status codes don't fall into either category. The browser isn't the the source of truth about these codes, as that's the originating web server. The user may not be aware of them either; a 404 from a .js file? That'd most likely go unnoticed. A 500 from even the main page? That may be returned as a 200 by some app servers in certain configurations. 

So that leaves our users out to dry, right? Well, it would if it wasn't for the UNIX Philosophy. You see, it's ridiculously simple to hook up a proxy that will capture this information for the user if you can't obtain the information by instrumenting the server. You can do it like this:

// Explain where your proxy lives
Proxy proxy = new Proxy();
proxy.setHttpProxy("your_proxy:8080");

// Now tell the webdriver instance about it
DesiredCapabilities caps = new DesiredCapabilities();
caps.setCapability(CapabilityType.PROXY, proxy);

WebDriver driver = new RemoteWebDriver(caps);
That's 5 lines of code in enormously verbose Java. 

Separating the concerns of "browser automation" from "logging network" traffic allows the Selenium developers (most of whom are not paid to work on Selenium) to focus on the problem of driving the browser. It means that they're not working on writing their own HTTP proxy, which is a sufficiently taxing tax that there are many projects out there working to write something solid and stable.

Great options for users looking for a powerful and capable proxy include Fiddler and Charles. Another option is the BrowserMob Proxy, which started as being a fork from the original Selenium RC codebase (Oh! The irony!) but has since matured and grown. This is amazingly simple to integrate with a WebDriver instance, as shown in their docs. For brevity, the integration can be done like so:
ProxyServer server = new ProxyServer(4444);
server.start();

// get the Selenium proxy object
Proxy proxy = server.seleniumProxy();
Following the UNIX approach, we make it easy to use a proxy with the WebDriver API. That means that we're not implementing an API for getting HTTP status codes in the Selenium project not only because it's out of scope, but there are already people doing a great job of offering that capability elsewhere. 

Wednesday, 22 May 2013

Why I Care About Automated Testing

I was reading a blog the other day that highlighted the fact that I've only got a limited number of keystrokes to use up in my working life. I can use those keystrokes on anything: email, writing code, futzing around with the git command line, facebook status messages, posts on Plus, anything....

That got me thinking about why I think that testing is an important part of software development: not an afterthought, but something that's as vital as considering API design or how to structure methods. It's because I only have a certain number of keystrokes. I'd rather spend those working on new features and moving the bits of the world I care about forward rather than fixing bugs or chasing down regressions.

It's undeniable that writing a test and the code itself takes more time. I'm having to write more code. I'm burning keystrokes. But each of those tests may be helping to prevent regressions, or providing insight into the structure and usage of my code. And that additional insight, and those prevented regressions, mean that cumulatively I have more time to spend hacking on features, and that's what I love to do.

So that's why I care about automated tests. That's also why I think you should care too.

Tuesday, 26 March 2013

Speaking Engagements

One of the things that I love most about working in tech is the chances I get to speak in public and share some of the knowledge I've somehow managed to accumulate (even, sometimes, about work that I've actually done myself!)

For someone who enjoys speaking in public, I find it hard to actually promote the fact that I'll be in places, but some friends have recently asked when and where I'll be, so without further ado here are my next confirmed appearances:

7th-10th April: DroidCon, Berlin
23rd-24th April: GTAC, NY
2nd May: Facebook Mobile Developer Conference (a brief stint in London!)
10th-12th June: Selenium Conf (tickets on sale!) (It looks like it'll be great fun!)

For DroidCon, GTAC and the Facebook MobDevConf, I'll be talking about various aspects of my day job at Facebook, though each conference, because of their different focuses, will get to hear about different parts of it!

Selenium Conf is something that I always look forward to. It's a great chance to meet the people who are using the tool that has meant so much to my professional career, and it's also a fantastic opportunity to meet up with the rest of the selenium developer team and buy them a steak dinner. We've yet to have a veggie join us as a core team member, but we'll figure out a suitable way to reward their efforts too one day! If you're interested in becoming that first vegetarian contributor, then you can always start by contributing some code!

I've also recently become Facebook's W3C AC representative, and will be attending the TPAC in November in China.

Tuesday, 8 January 2013

Clue

Note to self: as an OSS project becomes more successful, the level of clue on the mailing lists drops.

Corollary: a successful project will have a mailing list containing several knuckle-bitingly painful posts.

Why is this? I believe that it's all to do with the motivation of the person emailing the list. To begin with, an OSS project only has people who are very interested in using it posting to the lists. These people tend to be technically savvy and can identify what might be causing problems. The signal to noise ratio is therefore very favourable to a sensible discussion.

As a project gets used more widely used, the people using it move from those who are actively interested to the people who are sitting around them. They may not be quite so engaged, but they're probably still relatively tech savvy. They have far less incentive or desire to dig into the code and understand why things are failing. The level of clue on the mailing lists appears to drop.

This process continues until the project is so successful that it's a mandated part of people's jobs. They have no choice but to use it. At this point, there is absolutely no incentive to understand why things are not working as intended, and every incentive to try and see whether someone else has already solved the problem. The level of clue in the mailing lists appears to drop through the floor.

I remind myself of this on a regular basis, and you know what? I'm okay with it too. In fact, sometimes I even smile when I see my project's lists filled with poorly researched, ill considered emails. It's a sign of success.

Where Computing Lives

It's hardly an original thought, but it's interesting to me that the location of a majority of someone's computing power has moved from their personally controlled space to datacentres. It's allowing companies to do some extraordinary data processing, but just surface the results to the user.

To prove my case, I cite Google Now. Which is awesome, in the traditional meaning of the word. Now, if I could just make it stop accurately telling me that it's time to "head home" and then highlight a nearby pub.

The other thought that occurs to me is that the constraining factor on someone's ability to utilise computational resources is going to become wireless bandwidth. I wonder whether there's a Moore's Law for that....