Rocket-Powered Jet Pants

We're Going to Make Out Like Bandits

Sun, 12 Apr 2026 00:00:00 +0000

Here’s the plan. I reckon it’ll take about five years in all, and I think we’re about 1.5 to 2 years in already.

Our starting point is that the AI models are now good enough at coding. Not necessarily “perfect”, just “good enough”. Often, they’re as good as (or better than!) a junior developer. I don’t know about you, but I’m finding the new models pretty impressive. So are other people.

Next, let’s “eat the seed corn”. Junior developers can now be considered (by some) to be superfluous to our needs; for the same cost, we could get soooo many tokens, and get so much more done! Job openings for junior devs will dry up. This is already happening.

One of the interesting things about LLMs is that they love to generate new code. Given their vast corpus of training data, they also know how to write those supporting functions and methods we commonly get from third party libraries. The result? We’ll start writing more and more code. The size of our repos will balloon. This is already happening.

Now, we also know that the tendency of the AIs isn’t to go and clean up code, reduce duplication, or focus on maintainability. Context windows (even the large ones!) can’t hold entire modern repos. They miss things. Most AI written code is additive, and is frequently duplicative. This is already happening.

Machines have a higher “tolerance” for complexity than people, which means we can now bear higher complexity budgets and tech debt. That is, modern AIs tend to be very good at reading code and following the flow of control, often faster and better than people. However, “debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?” still holds true. At some point the tech debt and complexity will be so high even the AIs won’t be able to deal with it. We’ve already frequently blown past human levels of code complexity. This is already happening.

And code is never bug free. Defect rates in AI generated code are typically above those of human-written code, but even if they drop below that, because so much code is being written, the overall number of defects will climb. Because code is seldom properly factored, fixing a bug in one place won’t fix it everywhere. We’ll play whack-a-mole with bugs. This is already happening.

So, we’re going to end up with (we already have!) complex, poorly structured codebases, rife with bugs and duplicated code. Who has the judgment to decide what to delete, which abstraction is wrong, or whether the whole approach needs rethinking? Who do we call in to fix this kind of mess?

Senior developers.

However, there’s attrition in the industry. Senior developers leave, not only because of the regular attrition that occurs over time, but also driven by a 22% spike in critical burnout as they are forced to manage the massive influx of AI-generated complexity. Because we’ve stopped hiring juniors, there are few new seniors coming up to replace them. The ones who do make it through haven’t experienced life before AI; they’ll be good, but they won’t necessarily be great at writing maintainable code. In a market where senior, production-ready engineers are already the primary bottleneck, I think we’ll need seasoned developers with good taste more than ever.

In any market where there’s increased demand and reduced supply, prices go up. This happened for the COBOL programmers of yore during and shortly after Y2K. It’ll happen for senior developers soon, and just like our COBOL-wielding brethren before us, there will be a shining window of opportunity. We just need to hunker down and survive the storm of AI-driven layoffs.

Then, my friends, we’re going to make out like bandits.

The Dorodango Technique: Agents in Agents Driving Scripts

Thu, 29 Jan 2026 00:00:00 +0000

TL;DR

“Agents in agents driving scripts” is a pattern for handling large-scale, mechanical changes using AI. Each layer of the pattern is developed iteratively, using failures from previous runs to refine either the prompts or the scripts being used.

The long version

I’m giving a worked example in this post to demonstrate how to apply the idea. This post isn’t about either bazel or gazelle. With that out of the way….

I’ve been exploring how I might generate build files in a large Bazel-based repo which currently uses hand-crafted build files. The theory of the migration is simple. We know the code compiles now, and that information is encoded in the build graph. We can use bazel query to inspect this graph, and the graph output format makes it possible to topologically sort these by their dependencies. Once that’s done, we can start the migration, beginning at the directories with no dependencies and ending with those which need the entire build graph migrated.
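To make the ordering step concrete, here is a minimal sketch in Python. It assumes the dependency edges have already been pulled out of the bazel query output into a plain dictionary (that parsing isn't shown), and the directory names are made up for illustration. It leans on the standard library's graphlib.TopologicalSorter, which is a convenient stand-in rather than a claim about what the real migration used.

```python
from graphlib import TopologicalSorter


def migration_order(deps):
    """Order directories so each one comes after everything it depends on.

    `deps` maps a directory to the set of directories it depends on.
    """
    sorter = TopologicalSorter()
    for directory, depends_on in deps.items():
        # graphlib treats the extra arguments as predecessors, so they
        # are emitted before `directory`.
        sorter.add(directory, *depends_on)
    return list(sorter.static_order())


# Hypothetical build graph: //app needs both libraries, //lib/net needs //lib/util.
example_deps = {
    "//app": {"//lib/net", "//lib/util"},
    "//lib/net": {"//lib/util"},
    "//lib/util": set(),
}

order = migration_order(example_deps)
```

The directories with no dependencies come out first, which is where the migration would start. A cycle in the input raises graphlib.CycleError, which is a useful early warning in its own right.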

We can tell whether a build file is being managed by gazelle by checking that neither the exclude nor ignore annotation is present.
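As a rough illustration of that check, the sketch below assumes the opt-out is expressed as a Gazelle directive comment (a line beginning "# gazelle:exclude" or "# gazelle:ignore") inside the build file. The function name is hypothetical, and the exact form of the annotation in any given repo may differ.

```python
import re

# Matches a directive comment such as "# gazelle:ignore" or
# "# gazelle:exclude some/path" at the start of any line.
_OPT_OUT = re.compile(r"^#\s*gazelle:(exclude|ignore)\b", re.MULTILINE)


def is_managed_by_gazelle(build_file_text):
    """Return True when the build file carries neither opt-out directive."""
    return _OPT_OUT.search(build_file_text) is None
```

Reading the file contents and passing them in keeps the function easy to test without touching the filesystem.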

With this information in mind, the process starts by setting up gazelle in the repo, and updating all the build files to indicate that gazelle shouldn’t process them by adding the exclude directive. Once that’s done the fun can begin.

Ultimately, I know the process of migrating a build file to gazelle is pretty mechanical. I also know that running a script is often faster, cheaper, and more deterministic than using an LLM for the same task.

But we don’t have the script yet. So we use an LLM to write one.

To begin with, I start an interactive LLM and work with it to migrate a single directory. Once I’m happy with that, I ask the LLM to create a (python!) script to perform the same actions. The key thing here is to insist that if there were any errors or problems, the script must fail and report the problem.
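The "fail and report" requirement is the part worth sketching. The example below is not the migration script itself, just a minimal Python illustration of the shape: every step is an external command, and the first non-zero exit stops everything with the step's name and its stderr. The names run_step, main, and MigrationError are hypothetical, and the real steps (bazel and gazelle invocations) are deliberately left out.

```python
import subprocess
import sys


class MigrationError(Exception):
    """Raised as soon as any migration step fails."""


def run_step(description, command):
    """Run one command; raise with the captured stderr if it fails."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise MigrationError(
            f"{description} failed (exit {result.returncode}):\n"
            f"{result.stderr.strip()}"
        )
    return result.stdout


def main(steps):
    """Run (description, command) pairs in order; stop at the first failure."""
    try:
        for description, command in steps:
            run_step(description, command)
    except MigrationError as error:
        print(error, file=sys.stderr)
        return 1
    return 0
```

A real script would finish with sys.exit(main(steps)) so that the calling prompt sees the non-zero exit code and stops too.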

The next step is to write a prompt that uses this script and write that to a file. The prompt checks that we’re ready to start (is the build still green, is running gazelle a no-op, etc) and then runs the script to migrate the directory, and then is told how to validate that the change is safe (normally very similar to the start conditions, with some extra checks). The end goal is to “one shot” an update, so the prompt is told to stop if there are any issues at all.

This is where the iterative improvement happens. Each time an issue stops the process, we use that failure to immediately update the script, add additional scripts, or refine the original prompt to avoid future problems. This continuous feedback loop ensures that the next iteration will be smoother.

Once we achieve this single-directory ‘one-shot’ success, we move up a level, and start on an orchestration prompt. Again, we start this manually. My approach is to tell the orchestrator to start a number of subagents, each of which should use the polished prompt and script we’ve developed above. Again, if there are any problems, the orchestrator should stop everything and report the issues, grouping them as required. Each time we resolve a problem, we either ask for the scripts or the prompts to be updated. It’s an iterative process.
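The orchestrator in this post is a prompt, not a program, but its control flow can be sketched in ordinary Python. In the illustration below, migrate_one is a hypothetical stand-in for "start a subagent on this directory", and the thread pool stands in for the pool of subagents. On the first failure it stops handing out new work and groups the problems by error message.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor, as_completed


def orchestrate(directories, migrate_one, max_workers=4):
    """Run migrate_one per directory; after a failure, cancel unstarted work.

    Returns (failures, skipped): failures maps each error message to the
    directories that hit it; skipped lists directories that never started.
    """
    failures = defaultdict(list)
    skipped = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(migrate_one, d): d for d in directories}
        for future in as_completed(futures):
            directory = futures[future]
            if future.cancelled():
                skipped.append(directory)
                continue
            error = future.exception()
            if error is not None:
                failures[str(error)].append(directory)
                # Only work that has not started can be cancelled;
                # tasks already running are allowed to finish.
                for other in futures:
                    other.cancel()
    return dict(failures), skipped
```

Grouping by message is the crude version of "grouping them as required": ten directories failing for the same reason usually point at one fix to a script or prompt, not ten.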

Then we get to the fun bit.

We start the orchestration agent using our orchestration prompt, that fires up the subagents using the prompt for the migration, and that uses the scripts we’ve been refining. Agents in agents driving scripts.

Why “Dorodango”?

Dorodango is the Japanese art of polishing mud, taking something that’s rough and unformed and, by iterating and smoothing, making a beautiful sphere out of the most unlikely starting points. In the same way, this pattern takes a rough and unformed idea, and by iterating and smoothing ends up with an efficient way of managing change.

Comparison with Wiggum Loops

While both the Dorodango Technique and Wiggum Loops are methodologies for using LLMs in iterative, self-correcting development, they differ in their structure and approach to failure. The Dorodango technique is about iterating on the design (the scripts and prompts) after a failure in order to build a robust system. It uses the LLM to write the system, but you are in the loop to refine it. Wiggum Loops are about iterating on the code/output until the LLM successfully completes a well-defined task autonomously. It is designed for maximum automation once the loop is started.

Principles of software development

Tue, 20 May 2025 00:00:00 +0000

I’ve written code which is still in widespread use today that’s old enough to be considered an adult in the UK. I’m pretty proud of that. I’ve also written plenty of code that was absolutely terrible. Sometimes these bits of code are the same. I’ve learned from that.

I’ve had the pleasure of working with some brilliant software engineers who opened my eyes to better ways of working. I’ve also worked with developers who were still learning the craft, or who didn’t care for the thing they were working on. That taught me things too.

As I write this, I’m on a plane armed with only my phone, so perhaps that will help me keep things concise, but I think it might be useful if I outlined some of my principles for software development. So, without further ado, here they are, pulled from the aether at 30,000 feet!

The best code is the code you don’t write
Fast feedback loops are vital
Assume no one reads the docs
Favour simplicity
Do the simplest thing that can possibly work
Fear of code is a sign of where to start
Collocate things that are related
Long lived code needs tests
Source control gives you freedom
Don’t let “perfect” be the enemy of “good”

The best code is the code you don’t write

The advice is simple: avoid writing code if you can.

Why? Because every line of code you write is one that needs to be maintained, tested, and otherwise cared for. If you can avoid needing to do that, you can focus your limited time on other, more exciting, things.

If there’s an existing tool that does what you want, use it. If there’s a library out there which supports the features you want, go with that. If a shared calendar would work, use that instead of creating some bespoke booking system.

We all know that not everything will be a perfect fit for you. That’s fine. Sometimes you can adjust your processes or approach. Alternatively, you can try and modify or extend whatever it is that needs to be changed. And you know what? Sometimes, it might just be better to write something of your own, but that should be a last resort after considering the alternatives. As a rule of thumb: don’t write that code!

Attempting to make changes is especially important when dealing with Open Source software. If it’s Open Source, contribute upstream instead of maintaining a fork.

It may be painful thanks to whatever combination of corporate and OSS politics and personalities you need to wrestle with, but contributing back is the Right Thing to do. Most importantly (for me!) it means that you’ve improved part of the shared commons, and that allows us all to move forward. However, if an appeal to altruism isn’t something that appeals to you, contributing your patches means you won’t need to maintain a patch or fork moving forwards. Contribute enough and you can build a body of work to advertise yourself with.

Just remember, if someone else’s solution is better than yours, it’s fine to abandon your efforts for theirs. We want to solve problems, not massage our egos.

Fast feedback loops are vital

This is the biggie for me.

You can’t know if you’re heading in the right direction without feedback. The sooner you get it, the sooner you can course correct (or carry on doing more of the same). If feedback loops are long, by the time you discover something went wrong you’ve often lost the context needed to fix it quickly or cleanly.

This is one reason why I like compiled languages. There’s no faster feedback than the IDE telling you “this thing won’t even compile”.

It’s also why I’ve spent the past 10 years or so working on build systems. I want the distance between finishing a thought and finding out whether it works to be as short as possible. Dan Bodart coined the term “10 second build”, and I think that’s something to strive for.

Smaller changes being pushed to production more quickly also lowers the risk of each change. Faster feedback is safer feedback too.

I’m willing to give up a lot for a tight feedback loop, but with modern tooling, distributed builds, and careful design choices, I usually don’t need to.

Assume no one reads the docs

“I’d rather spend an afternoon debugging and struggling than 10 minutes reading the docs”

Documentation is read at most twice: once when we start using something, and once when we’re really lost and can’t find the answer anywhere else. That’s certainly true for me. My experience tells me that’s true for many of you too.

This obviously impacts how I design APIs that I expect others to use. This is also why I like uniformity and consistency around how things work.

Put another way, you can rephrase this as “don’t make me think”. As someone solving a problem, I really don’t want to be taken out of the flow to deal with figuring out some weird little detail. It’s one of the reasons why I like tooling that works the same way for the same kinds of tasks across different languages.

This should also colour how documentation is written. If you assume the reader hasn’t read the rest of the documentation or the source code, but comes with a half-formed query (and probably a certain amount of frustration if they’ve been working on a problem for a while), how would you want the documentation to look? How would it best support them?

Maybe in the future this won’t matter so much. After all, the AIs have read the docs, and they’re excellent at retrieving knowledge. Perhaps the consistency I crave will be the friendly prompt of my AI assistant, one level removed from the daily grind of the tooling and APIs I regularly use.

But for now…

Favour simplicity

There’s a quote from Kernighan that states “Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?” That alone makes a good case for simplicity.

But for me, simplicity isn’t just about debugging. It’s more than that. As someone trained to spot patterns, I’m predisposed to create premature abstractions. Favouring simplicity reminds me to resist that urge and just solve the problem in front of me.

This is good for any number of reasons, not least of which is that I tend to get done sooner; a faster feedback loop.

I think this is also part of being a good neighbour. After all, I’m sure we’ve all wandered into codebases where simplicity was an afterthought. It’s seldom a joyful experience.

Having said this, I should acknowledge the idea of “essential complexity”. Sometimes things are complex by nature. I’m okay with that, just so long as we’ve boiled away as much of the “accidental complexity” as possible. After all, complexity is the enemy.

Do the simplest thing that can possibly work

More simplicity? Yes! It’s so important, I mention it twice!

Doing the simplest thing helps rein in my tendency to over-abstract early. Abstractions reveal themselves with time. Let me repeat that: abstractions reveal themselves with time. You’d need a Muad’Dib-level gift for prescience to accurately anticipate where the real seams will appear in a problem you’ve yet to solve.

When writing code, I take this to mean writing something that’ll work now, but which I know can be changed later. After all, code is infinitely plastic, and if we allow it we can reshape it indefinitely.

Now, I’ve seen this applied maliciously too as “do the stupidest thing that will possibly work”. Don’t do that. That’s the kind of thing foolish people with little care for others do.

The core idea is to give yourself options. Chris Matts expressed this well with the concept of Real Options: options have value, they expire, and you should never commit early unless you know why. You may have heard this expressed as “making decisions at the last responsible moment.”

Fear of code is a sign of where to start

You own your code – it shouldn’t own you.

If there’s a place in your code base that you fear to tread, view that as an invitation to tame the horror and deal with the complexity, not as a reason to shy away and retreat.

Fear spreads if you don’t address it. What starts as a small area that troubles us leads to bigger, more terrifying things.

I’ve seen this manifest so many times as sclerotic and moribund code bases, where people are scared to change anything lest the whole thing collapse. That leads to more bureaucracy, more caution, and delays implementing new patterns and approaches. That, in turn, kills your feedback loop.

Which, as you now know, is a thing I can’t abide.

So! Use fear as a guide for where to go next. It may not be comfortable, but it’ll be a step towards simplicity and faster feedback loops, and that’s an unalloyed good.

Collocate things that are related

This principle is foundational to object-oriented programming: keep data and behavior close together. The thing is, this principle applies to more than just a style of programming.

For example, if two codebases are tightly coupled, they should live in the same repository. Yes, that might mean fewer, larger repos. That’s fine because by collecting coupled things together, we front-load the moment where we integrate them. It may make an individual change seem more painful, as suddenly all the integration points are known and need to be fixed, but all we’ve done is move those fixes earlier in the development cycle, and made them visible. Earlier fixes mean tighter feedback loops.

(See? I told you I liked those.)

Long lived code needs tests

How do you know whether a change you’ve made is safe and correct? As Nat Pryce once quipped “I can make changes really fast if I don’t have to prove they work”, so one way is to YOLO it and hope for the best.

But relying on production to tell you whether something’s broken is the longest possible feedback loop. And the bigger your system gets, the more fragile that becomes.

So: write tests.

What kind of tests? Context matters, but my rule of thumb is: the smaller, the better.

I like Google’s test size model, which I once wrote about on their testing blog. However you choose to name your tests, aim for speed and precision. Small tests run fast and isolate failures well, even if their individual coverage is limited. In aggregate, they’re powerful.

Of course, you also need some larger tests too, but it’s likely your CI will run for too long and will be prone to flakiness if they form the bulk of your testing. It’s why I’m still a fan of the testing pyramid, even though it’s a model that some people find dated.

But doesn’t writing tests mean you can’t write as much production code? While that’s true in the short term, I’m firmly convinced that a lack of tests means more bugs, which means more bug fixing, which means less time writing production code in the long term.

This suggests that all code should have tests, right? Not really.

If you don’t expect the code to live long, or if it’s an exploration of a problem that you’re using to expand your knowledge, it’s quite alright not to have any tests at all. Dan Terhorst-North talks about the pattern of “spike and stabilise”, which is a useful way of figuring out which tests are needed, and as Liz Keogh points out, it’s a powerful way of getting fast feedback from stakeholders.

One last thing: I find writing tests after the fact boring and thankless. That’s why I’m a fan of TDD. Even if I throw some tests away later, starting with them helps me focus and clarifies what I’m trying to achieve.

Source control gives you freedom

I’ve been on a number of projects where dead code is kept around, sometimes in an “archive” directory, or (more frequently) still in place, untouched and unloved. This has always puzzled me. You use source control (you do use source control, right?) and that means it’s perfectly safe to delete code: it’s still there in our source control system if we need it.

Similarly, if you want to try out something new or a risky change, you can do that with complete confidence that you can go back to a known good state without needing any fancy shenanigans. A single “git checkout” and you’re back where you started.

By eliminating dead code entirely, you reduce the maintenance cost of your code (coincidentally allowing faster feedback loops, as there’s less code to compile and test). By being able to experiment freely, you can reduce your fear of the codebase, which can help you tame it.

Source control is your safety net. Use it. Trust it. It won’t let you down.

Don’t let “perfect” be the enemy of “good”

The temptation when working on a system is to want to be able to make it work for all cases, all the time. But often, solving a few cases well – without making others worse – is already a huge improvement.

Until code is in production, it has no value. So ship something. Learn from how people use it. Iterate.

A couple of examples of this spring to mind.

Meta was notorious for its mantra of “move fast and break things”. Many people focused on the second half of that slogan, but that was merely emphasising how important it was to be able to move fast. By putting something out there and seeing how people responded to it, Meta were able to nimbly adjust a project’s direction, even if that sometimes meant that not everything was working as well as it should.

Or take the example of how we generate Bazel build files for Java using a tool. To do this perfectly would have been to have special-case handling for cases that seldom occur, and to generalise for all manner of source code layouts. However, it turns out that just having anything in place offers value, and makes people’s lives better.

Don’t worry about solving every problem. Start by solving one, then take it from there. Doing so tightens your feedback loops, and that can only be a good thing.

Wrapping up

If I gave this more thought, I’m sure the list would be longer, and if I did so I am confident you’d lose patience reading this!

I’m also sure that reasonable people would make different choices, or have entirely different principles. That’s good. The gentle tension between approaches often leads to better outcomes. All we need to do is assume the best of each other.

In a way, a lot of these principles are self-reinforcing. Simplicity and not letting “perfect” be the enemy of “good” can be seen as two sides of the same coin. Having tests makes it easier to be fearless. Colocating things really does lead to faster feedback loops. That begs the question: how short could this list be? If I reduced this down to the very smallest number, I think I’d end up with:

Be fearless
Fast feedback loops are essential
Code is worthless until it’s being used

Software development is more than just a set of principles. I am utterly convinced that writing software is a team sport and the most important thing on any software project is the people. Finding ways to work together effectively is far better than sticking dogmatically to The One True Way. Maybe that’s a blog post for another day….

But for now, it’s pretty hard to type this much on a phone keyboard while jetting through the air, so I’m going to stop here.

rules_jvm_external's New Maven-based Resolver

Fri, 26 Apr 2024 20:16:00 +0000

The code you’ve written means nothing until it’s in the hands of users.

Normally, the way I like to write features is to put together the smallest thing I can think of, release that, and then incrementally add features and fix bugs. This lowers risk, and makes it far easier to move quickly, but sometimes that’s just not possible.

With the new release of rules_jvm_external 6.1, I finally managed to ship a new Maven-based dependency resolver. I started work on this at the tail end of 2022, with the first proper commit in late January 2023, so it’s been in development a loooong time. I’ve not actually spent that much time hacking on it. It’s taken a long time because it’s something that I’ve hacked in when I have a little time to spare.

The main motivation behind this was to add support for using Maven BOMs in dependency resolutions for Bazel projects, and the initial version included not only a Maven-based resolver, but also a Gradle-based resolver too. There are some lessons I’ve learned along the way, and maybe it would be useful to share some of them.

I realise that these aren’t particularly reflective or considered. I’ve literally just pressed the buttons and thrown the switches for the release, so I’ve not yet sat back and thought about things, but it’s always nice to get some thoughts down while they’re fresh in the mind.

With that said….

A little is better than nothing

It was deeply frustrating to have progress be so slow, but the thing that helped make it bearable was being able to break the problem down into bite-size pieces that I could work on. Being able to see slow progress, rather than nothing at all, was a huge boon.

Small tests #FTW

The way I was able to break things into bite-size pieces was to make use of relatively fine-grained tests. These were hooked up into my test suite and could be run at the drop of a hat, whenever a change was made. Because of careful choices, running the entire test suite takes about 3 seconds, which means I could run it whenever I felt like it. Tightening the feedback loop like this was essential.

Software is collaborative art

Along the way, I got stuck several times, and needed help figuring things out. There were other times when I thought things were done and I could relax. All along the way, there were fellow developers and geeks who were happy to help. Jin and Chris, who are my co-conspirators in the rules_jvm_external repo, have provided endless patient feedback and support. Jon Tirsen helped me find bugs and issues, and even supplied patches (thank you, Jon!).

Now that the software is out, I’m expecting bug reports and feature requests to come rolling in. That’s fine. That’s half the fun, because I know that the people asking for things will be doing so looking to improve things.

Patience is a virtue

Gotta be patient if you’re going to take over a year to ship a feature!

A New Approach to CI

Tue, 05 Sep 2023 12:00:00 +0000

In one of the groups I’m part of, someone recently asked “Does anyone have opinions on monorepos? It seems like they should be beneficial to CI in the same way that trunk-based development is, but I’ve never used one in anger.” This is an edited version of my response.

I’m a huge fan of co-locating code as a driver for making feedback loops tighter. Taking that co-location to its logical extreme naturally leads you towards a monorepo. Very few people go to the extreme, but it does suggest moving to fewer, larger repos.

There’s a whole bunch of issues that surround larger repos, but my view is that they can transform CI entirely. Why is that?

The main question we’re trying to answer with CI is “is this change safe to land into production?” If we believe the change is safe, we can push ahead. If we don’t, then we need to weigh our options, but typically we won’t push to production.

Let’s think about how traditional CI attempts to answer the question of “is this safe to land into production?”. It takes a “belt and braces” approach: we can’t prove what has been impacted by this change, so instead we’ll start by running a series of pipelines to try and make sure all the bases are covered.

The traditional CI pipeline looks something like:

Run formatters or linters that run almost instantly.
Run the small tests (or unit tests, if you prefer that terminology). This gives fast feedback.
Run the medium tests (or integration tests, if you prefer).
Fan out and run the large tests in buckets.
Fan back in, and build artifacts for deployment.
…
Profit!

Of course, there are almost as many ways of creating this as there are projects, but the general pattern is to front-load fast tests, and then fan out to run slower tests in parallel where possible.

What’s notable is that we’re generally running everything. That’s because we don’t really know what has been affected by a change, and for the sake of safety, we just run everything we can, hoping that it’ll catch any problems. For a small repo, this approach is probably fine, but as the repo grows, the feedback loops lengthen, and it becomes more likely that something in a project unrelated to yours will cause your builds to fail.

Now, I’ve seen plenty of teams attempting to write sophisticated tooling to use machine learning (or just good old fashioned statistics) to try and figure out which tests need to be run for which change. The results are never completely reliable, so there’s always the fallback of running everything.

The problem is that the repo is too large, and has become unwieldy to work on with the tools we’ve grown used to. You definitely need the right tooling to make a monorepo (or larger repo) work, and my tool of choice at the moment is Bazel.

That’s not because Bazel is an amazing tool (it has a wickedly steep learning curve, and its demand to completely enumerate inputs is deeply frustrating), but because it’s great at handling larger repos in a way that other tools just can’t, and of the new generation of build tools out there, it’s the one with momentum (meaning that you can find help on Stack Overflow).

One thing that Bazel allows is the ability to query the build graph. You can do some really nice things with this ability. For example, using a tool like Target Determinator, you can identify every single test that needs to be re-run, or library or binary that needs to be rebuilt for each change.

So your CI build stops being “run everything” and starts being “run just what needs to be run”, and that can save astonishing amounts of time, if you can determine what that is reliably and at speed. At a high level, your CI run becomes:

Use Target Determinator to identify the targets to rebuild and test.
Rebuild and test those targets.
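To show why a queryable build graph makes this possible, here is a deliberately simplified Python sketch. It is not how Target Determinator is implemented; it just walks reverse dependencies in a hypothetical, hand-written graph to find everything downstream of a change. The target names and the affected_targets function are made up for illustration.

```python
from collections import defaultdict, deque


def affected_targets(deps, changed):
    """Return the changed targets plus everything that transitively depends on them.

    `deps` maps each target to the targets it depends on.
    """
    # Invert the graph: for each target, who depends on it?
    reverse = defaultdict(set)
    for target, depends_on in deps.items():
        for dep in depends_on:
            reverse[dep].add(target)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependant in reverse[current]:
            if dependant not in affected:
                affected.add(dependant)
                queue.append(dependant)
    return affected


# Hypothetical graph: a change to //lib/net touches //app but not //docs.
example_graph = {
    "//lib/util": [],
    "//lib/net": ["//lib/util"],
    "//app": ["//lib/net"],
    "//docs": [],
}

to_rebuild = affected_targets(example_graph, {"//lib/net"})
```

Everything outside the returned set can be skipped for this change, which is where the time saving comes from.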

One nice side-effect of this is that there’s no need to keep the entire tree green all the time. We all know that flaky tests sometimes creep in, or a test starts failing because some external system is down. Using target determination allows us to know that our change is fine, even if the rest of the repo is on fire.

Better yet, that irksome habit Bazel has of requiring you to list all your inputs has the handy side-effect of making remote builds far simpler, and not constrained to a single language (as distcc is). After all, the build is just taking inputs, laying them out on disk, running a command, and collecting outputs. If those inputs are specified in enough detail, there’s no reason to be constrained to a single machine.
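One way to see why fully listed inputs matter: if a build step depends only on its command line and its declared inputs, you can fingerprint it, and any machine holding the same inputs can run it or reuse a cached result. The Python below is a toy sketch of that idea. It is not Bazel's actual hashing scheme, and action_key is a hypothetical name.

```python
import hashlib


def action_key(command, inputs):
    """Fingerprint a build step from its command and its declared inputs.

    `command` is a list of arguments; `inputs` maps each input path to
    that file's contents as bytes.
    """
    digest = hashlib.sha256()
    for arg in command:
        digest.update(arg.encode() + b"\0")
    # Sort by path so the key doesn't depend on dictionary ordering.
    for path in sorted(inputs):
        digest.update(path.encode() + b"\0")
        digest.update(hashlib.sha256(inputs[path]).digest())
    return digest.hexdigest()
```

Two calls with the same command and the same file contents give the same key regardless of where they run, while changing a single byte of any input gives a different one. An undeclared input is invisible to the key, which is exactly why the enumeration has to be complete.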

Being able to do distributed builds, either on locally managed infrastructure such as BuildBarn, or using a “build as a service” provider such as EngFlow or BuildBuddy, is another way of tightening feedback loops by scaling the build horizontally (though this relies on builds being broad, rather than a single, narrow critical path). You’ve got 300 tests to run? Just run them all at the same time. It takes the same amount of CPU, but the wall clock time drops dramatically.

Combine distributed caches, builds, and target determination, and your CI pipeline becomes a lot easier to manage. In many cases, it will look like the pipelines of the old days: just a straight list of steps that are carried out in sequence, without any fan-out or fan-in.

Of course, there are a host of problems that come with the approach of using larger repos, and the two I see people get most incensed by are:

Single version requirements of dependencies.
Having to fix other people’s builds when you break them.

I’ve blogged a little about both of these here, and a little more about the cost savings that monorepos can represent here, but going into these issues in depth will need to wait for another day.

However, in short, I strongly believe that monorepos are beneficial to CI in the same way that trunk-based development is.

Sotto voce

I should really blog about the single version thing. It’s a pain, but largely because it surfaces incompatibilities and makes more visible the amount of work that needs to be done to make an update stick everywhere it should. I liken it to how Agile used to be compared to other methodologies (at least, how they were compared when I was at ThoughtWorks back before 2010).

All software development projects start in a relatively chaotic way, with uneven progress, and unforeseen hiccups. After a while, they settle down into their own rhythms, and become more predictable. The problem was that if a project reports progress every week or two, that initial chaos is far more visible than something that reports progress every month or even longer. It’s not that there’s a difference, it’s that the visibility is far higher (we shall set aside that I hope that most people here believe that the visibility is something that is ultimately useful and leads to better outcomes). I think the same applies for the single-version thing too: it surfaces incompatibilities so much sooner, and front-loads a pile of engineering effort that would otherwise have to be spent (with interest!) later on in the process, where change is harder.

There's No Such Thing as a Free Lunch

Mon, 12 Jun 2023 20:16:00 +0000

One of the things I like to talk about with my teams is the goal of getting a new line of code into production as quickly, as safely, and as cheaply as possible.

The first part of the goal – getting code into production quickly – is something every team strives for.

The second part? In my experience, it tends to be eclipsed by the first part. After all, everyone is under pressure to get the feature they’re working on into people’s hands, and there’s always a backlog of work to do that’s longer than the time available to do it in.

That third part, about the cost? It’s very seldom considered.

But for any organisation, cost is a vital part of software development. More importantly, “cost” is something that needs to be measured at the team, organisation, and company levels: just looking at one isn’t enough. Allow me to explain….

I like to loosely define safety as “confidence that a change doesn’t break anything”. That doesn’t mean that the change is entirely perfect; after all, defects always slip through. It just means that we have a sense of confidence that a change won’t make our systems worse. It’s not a perfect description, but it helps to guide some of my thinking.

This definition also leaves open the question of “what can break?” with a change.

There are the obvious things that spring instantly to mind, such as the feature not working at all, or having unforeseen and undesirable side-effects (using too many resources, accidentally wiping data, or being slower than is useful, for example). There are also less obvious concerns, such as how we interact with other parts of the system, or how changes in APIs may cause code that depends on ours to fail to compile. If you’re using services (micro, macro, I don’t mind. Whatever makes you happy), then the contracts between those services are also places where we can expect – and frequently find – breakages.

The worst possible time to find out about breakages is in production. For applications and services, what “production” means is clear. For libraries and shared utilities, “production” may be the point where someone else takes a dependency on that code (that is, when someone updates the version of the library to the recently released version and tries to recompile). Depending on how frequently dependencies are updated, there may be a lag of months before there’s proper confidence that a change is safe.

For all changes, it’s a wise idea to depend on some level of automated tests. Pull the update in, compile and run any small tests, deploy if necessary to some environment, and then run the medium and large tests. If we’re in a single company with many repositories, it may be possible to identify other repos that depend on the artefacts you’re producing, and to “grind and fix” each of them with the latest change.

So far, we’ve been considering this from a pure engineering perspective, but now we need to don the hat of some kind of manager-crossed-with-an-accountant, and consider costs. How do we make sure our changes are delivered quickly, safely, and at as low a cost as possible?

Back in the Old Days, we used to talk a lot about the cost of change curve, which posits that finding and fixing issues earlier in the development life cycle is cheaper than doing so later on. I think that’s an axiomatic truth, even if the exact details might be something we can quibble over. Put another way, the longer the feedback loop, the more expensive it is to react to the results of that feedback loop; shorter feedback loops are cheaper.

With a compiled language, the earliest point we can get feedback about a change is at compile time. Change an API, and the code won’t even compile. Magical!

The next cheapest way is to run the tests in our repo. Assuming those tests pass, we then need to publish snapshots, and try to coordinate changes between multiple downstream repos (maybe pulling in the snapshot, recompiling, and running all those other tests). Of course, each of those downstream projects needs to be updated and tested in a specific order. Your repos all have a graph of dependencies, and we need to follow that graph, so each repo tested needs to publish more snapshots that can be consumed further down the line, and so on, and so on.
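Following that graph is a topological sort. Here's a small sketch using Python's standard `graphlib` module (available from Python 3.9). The repo names are hypothetical, and it assumes you already know the dependencies between repos, which is the hard part in practice.

```python
from graphlib import TopologicalSorter

# Hypothetical repos: each key depends on the repos in its value.
REPO_DEPS = {
    "billing-service": {"payments-client", "common-utils"},
    "payments-client": {"common-utils"},
    "common-utils": set(),
}


def rebuild_order(repo_deps, changed):
    """List the changed repo and everything downstream, dependencies first."""
    affected = {changed}
    order = []
    # static_order() yields each repo only after all of its dependencies.
    for repo in TopologicalSorter(repo_deps).static_order():
        if repo == changed or affected & set(repo_deps.get(repo, ())):
            affected.add(repo)
            order.append(repo)
    return order


print(rebuild_order(REPO_DEPS, "common-utils"))
print(rebuild_order(REPO_DEPS, "payments-client"))
```

Each entry in the result is another snapshot to publish and another test suite to wait for, in sequence, before the next repo can start.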

The Apache folks tried this with Gump, for Ant, Maven, and other build tools they own. Gump “builds and compiles software against the latest development versions of those projects”. It is relatively limited in scope, but it’s already pretty complicated. It’s not a cheap thing to do. Coordinating between the Apache projects is done on a “best efforts” basis, rather than being something that’s mandated, which mirrors what happens in organisations: if you identify something that needs fixing in someone else’s repo, often you have to report it as an issue rather than delving in to fix it yourself. I’m sure we’ve all experienced how slow that process can be.

Attempting to detect and follow the graph of dependencies between repos in a company would be challenging, especially if the dependencies are indirect (for example, if a URL for something is hard-coded somewhere, and that’s how the dependency between components is expressed).

The complexity and cost of building and maintaining infrastructure to test and detect this has to be factored into the cost of making the change. You might take a shortcut, and say that you’re only interested in specific downstream consumers of your change, but even then, there’s a cost to be borne, and it’s higher than making a change in a single repo. How come? Because there’s more coordination to manage, and longer feedback loops. As I’ve already mentioned, the inference from the cost of change curve is that the longer feedback loop is more expensive.

In the “farm or grind” blog post, the missing first step is “find out where the changes need to be made”. In the post, Jesse says, “you use a combination of GitHub search, ripgrep and zoekt to find the impacted codebases”, which sounds like something that might work for a majority of cases, but I’m also confident that things would be missed (if, for example, the repos weren’t public, or accessible to the person making the change). Worse, you’ve still got to figure out the graph of dependencies between repos to increase the safety of the change. It ain’t cheap.

So, how do we reduce the cost of building our confidence?

Co-locating code helps an awful lot. Running “ripgrep and zoekt” in a single repo is cheaper than doing so over dozens. Taken to an extreme, this leads you to a monorepo (Yay! Monorepo! Yay!), but there may be perfectly sensible reasons why that’s impractical. In any case, reducing the number of repos reduces the cost of a change. The downside is that the cost of a change becomes more readily visible, and the visibility of future pain is seldom something that excites developers, but from the perspective of the organisation as a whole, the cost of the change has reduced.

A second strategy is to reduce the number of dependencies, and where that’s not possible to have clear and explicit tests in each repo which describe the contract between dependencies. Nat Pryce talks about simplicators, Eric Evans about anti-corruption layers, and Alistair Cockburn introduced the world to hexagonal architectures. All of these help provide that insulation and isolation.

Put another way, the looser the coupling between repositories in an organisation, the cheaper a change in one is likely to be, since it’s less likely to affect the others. Conversely, tight, implicit coupling between repositories is an argument for merging those repos — a change in one is very likely to require a change in another, and inter-repo testing is expensive.

A third strategy is to use a modern build tool which understands the build graph within a single repository, supports caching, and which can identify the subset of targets that need to be built for each change. Right now, I advocate for something like Bazel to support this, but really any tool that properly supports caching and which avoids unnecessary rebuilds that you and your team are happy with is a great choice.

Finally, we need to be conscious that someone needs to pay the cost of each change. As an engineer on a team, the smaller the repository, the cheaper the change appears to me. However, all we’ve done is distribute, delay, and escalate the cost of validating a change because we’ve extended the feedback loop to production. So, while our cost appears reduced at the team level, the cost to the company is larger.

Worse, I don’t believe that the cost to the team is really as small as we believe. Bug reports coming in from other teams many months after a feature has landed are later on the cost of change curve, and so more expensive to fix. Worse, the context for a change is no longer readily available, so it takes extra engineering effort to properly respond to those bug reports and requests to change when they do eventually arrive.

Does this all mean that a single repository with everything in it is “cheap”? Absolutely not. It’s astonishingly expensive, and the tooling required is cutting edge. However, the alternative is more complex, requires an array of tooling that doesn’t even exist yet, and has lengthened feedback loops. It’s definitely not cheaper in the long run.

Your Roots Are Showing

Wed, 10 May 2023 20:16:00 +0000

To know where we’re going, it helps to know where we’ve come from. While this is true for ourselves, it’s also true of software. It helps explain why some of the things that look unusual or unlikely have been done that way, and helps demonstrate some of the forces that have acted on the design of an app’s UI and UX.

Why do I mention this? Because when I introduce people to Bazel I find it helpful to explain where the tool came from in order to understand why it has the UI that it does. A lot of this is hinted at in the post that Mike Bland wrote over 10 years ago, but perhaps now is the time to flesh out the story a little more. I’d suggest going to read his post before carrying on here. It’s an interesting read, and we’re not in a rush. Go for it.

The first place where Bazel’s roots are showing is in its own name. Looking at the source of Bazel, it should come as no surprise that it’s derived from blaze: Google’s own build tool. Indeed, “bazel” is an anagram of “blaze”.

But blaze wasn’t created in a vacuum. When it was introduced at Google, it replaced the older build system, which relied on a two-step process to perform a build. The first step was to run a tool that took build files and converted them into a Makefile. The second step was to run the build itself.

The build files were an amazing abstraction. Rather than describing the individual steps required to build an artifact, they simply described the kinds of artifacts to be built. If you saw one now, it would feel remarkably familiar. Without that abstraction, I’m not sure how easy it would have been to keep growing the Google monorepo.

The build files needed to be written in a language of some sort, and at Google there were (notoriously) four “blessed” languages for writing code: C++ for performance critical code and if you liked it, Java for other server-side code, JS because that’s what you needed to run in a browser, and Python for everything else. Clearly, the sensible choice from this list if you needed a programmatic way of describing your build was Python.

And, sure enough, originally the build files were interpreted using Python.

As a little historical note, this is the same approach we took when we were developing Buck too, and that shouldn’t come as a surprise since the team working on Buck were largely part of the Xoogler diaspora. But I digress….

However, there’s one huge problem with interpreting user-supplied build files written in fragments of Python in a build tool that’s meant to be deterministic and reproducible: you can do just about anything, including futzing with the file system, or reaching out to network resources. Worse, there was no way of determining whether “parsing” the build files would ever finish, or could be done without undue computational load on the machine doing the build.

So, it was decided that it was better to use a tightly constrained subset of Python. By providing a different interpreter, it would be possible to avoid accidentally relying on modules that were only installed on a handful of machines. It would also be possible to prove that parsing the build files would complete (yay! No halting problem!)

And if you go and read the goals of Starlark, you’ll see that this is exactly what happened. Put another way, Starlark is another place where the roots of Bazel shine through: it looks like Python because at one stage it was Python, and it was simpler to slowly tighten the constraints of what was allowed in build files over time than to rewrite every build file in the whole of Google’s monorepo. Fortunately, originally most of the build files weren’t doing anything fancy to begin with, and so could be interpreted using this new subset of Python.
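To give a feel for what “tightening the constraints” means, here's a toy checker that uses Python's own `ast` module to flag a few constructs Starlark leaves out: `while` loops, `import` statements, classes, and `try` blocks. This is purely an illustration of the idea. It is not how any Starlark interpreter is built, the list is far from complete, and it doesn't catch recursion, which Starlark also forbids.

```python
import ast

# A few Python constructs that Starlark does not support.
DISALLOWED = (ast.While, ast.Import, ast.ImportFrom, ast.ClassDef, ast.Try)


def disallowed_constructs(source):
    """Return (line, construct) pairs for anything outside the toy subset."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DISALLOWED):
            found.append((node.lineno, type(node).__name__))
    return sorted(found)


# A typical declarative build file passes untouched...
print(disallowed_constructs('java_library(name = "foo", srcs = ["Foo.java"])'))

# ...while one that reaches for the wider language gets flagged.
print(disallowed_constructs("import os\nwhile True:\n    pass\n"))
```

Without `while` loops or recursion, the only loops left are `for` loops over finite collections, which is what makes it possible to say that evaluation will finish.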

But we’re not done yet! There’s one other thing that Mike mentions in his post that is pertinent to this discussion of Bazel’s past leaking into its UX, and that is that Google used Perforce for source control.

Now, if you’ve been fortunate enough to be introduced to source control in the modern age, you may not be aware of just how many source control systems there used to be. In the Open Source world, the move from RCS to CVS allowed us to group changes to multiple files into a single commit. The move from CVS to Subversion made those commits atomic (prior to Subversion, if two people used CVS to commit a change at the same time, it was possible for two separate commits to get the same revision number, and that led to plenty of hilarity).

But there weren’t just Open Source source control systems out there. For example, the well-known falling out of the Linux kernel developers with BitKeeper led Linus Torvalds to create git, and also led to the creation of mercurial.

But not all source control tools are, or were, Open Source. There were many commercial ones, and Google had settled on Perforce, which had a reputation for being flexible, fast, and capable.

The way that Perforce works is that you create a Perforce client. This is done by specifying paths within the repo that you want to check out, and then running the p4 tool to get everything dragged down from the Perforce server to your local disk.

These paths will look familiar to anyone who’s used Bazel because they look exactly like the label syntax that is used for specifying targets, //they/look/like/this/...

The original tooling at Google took advantage of this by providing a utility that allowed a developer to clone a minimal but sufficient part of the larger monorepo. They did this by specifying the build targets to build, finding the relevant build files by converting build target paths to the perforce equivalents (a trivial transformation), and then parsing those to extract more paths, and so on, until you had everything you needed.
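As a sketch of just how trivial that transformation could be, here's a function that turns a label into the path of the build file that would need to be fetched. The `//depot` root and the `BUILD` file name are assumptions I've made for the example, not a description of the original tooling.

```python
def label_to_build_file(label, depot="//depot"):
    """Map a build label to the path of the build file that defines it.

    The depot root and the BUILD file name are assumed for illustration.
    """
    if not label.startswith("//"):
        raise ValueError("expected an absolute label, got %r" % label)
    # Drop the leading slashes and anything after the target separator.
    package = label[2:].split(":", 1)[0]
    parts = [depot] + ([package] if package else []) + ["BUILD"]
    return "/".join(parts)


print(label_to_build_file("//they/look/like/this:target"))
```

Parse that build file, collect the labels it mentions, and repeat until nothing new turns up, and you have the set of paths to put in the client.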

So, this has been a lot of words to describe three places where Bazel’s history has leaked into its current incarnation:

The name is an anagram of blaze.
Starlark looks like Python because it once was Python.
Bazel labels look like Perforce paths because they were originally Perforce paths.

The Social Expectations of Source Repos

Mon, 24 Oct 2022 12:53:00 +0000

I have Keanu Reeves to thank for one of my favourite life mottos: Be excellent to each other. It seems to be something that I can apply so often, in so many ways.

Take “working with other people on source code” for example.

For the sake of this post, I’m talking about a large repository with many people working semi-independently on it. Those are most often found in companies, but there are some examples in the world of Open Source.

Setting the scene

There are always social expectations anywhere you have groups of people attempting to collaborate and get along with one another. You and whoever you live with? There are social expectations. You and the people you work with? There are social expectations. You and whoever you share a code repository with? There are social expectations.

One of the things that I’ve noticed is that people seldom think about the social expectations of the repos they inhabit. And I guess that’s only natural.

Maybe we don’t share our repo with anyone else. What we say should happen, happens, because we’re the only ones that are impacted by it.

For some of us, we share a repo with only a few people, and we tend to work fairly closely with them. It’s normally pretty easy to come to an agreement on whatever the hot-button contentious issue of the day is.

But, as teams grow, and the repository becomes larger, we end up at a point where we don’t necessarily know all the people. You’ll see things like code owners files appearing at this point, and the number of people who can commit at the root of the repository tends to be dramatically reduced.

And this is where we should be excellent to each other.

But, if we have the freedom to do whatever we want in a small repo, why would we possibly want to be in a larger one, let alone taking things to the extreme of a monorepo?

The more, the merrier

Well, for the same reason that living in cities is so popular: for the trade-off of some freedom and some additional costs, there are a whole heap of advantages, and these come from the scale of what you’re sharing.

For example, let’s take the classic bugbear of anyone working in a large repo: updating shared dependencies.

You might think that updating a dependency in a small repo isn’t really that much work, and most of the time you’d be right. Someone gets assigned the task to do the update, they do the work, fix any issues, and they’re done. Simple.

Except, when you’re doing large scale programming, it’s not just one repo. There might be dozens (hundreds!) of repos. For each of those repos, an engineer must handle the update. Cumulatively, the aggregated engineering hours required to perform the update tend to be higher than changing a similar number of projects in a larger repo.

It’s all to do with context: if the update doesn’t cause any problems, and there are no weird bugs, then the cumulative effort of getting one engineer in one place to do one update is clearly lower than the cost of getting dozens of engineers to do the same work across multiple repos. When things go wrong, the experience gained fixing one project can be applied to others in the same repo.

That is, in the best case scenario, the larger repo is cheaper overall to update than the smaller repos, even if it requires more work for the engineer doing the heavy lifting.

Put another way, smaller repos optimise for the micro-case. Larger repos allow optimisations for the macro-case. When you have limited engineering capacity and many projects, it most often makes sense to optimise for the macro-case.

But no repo is an island. We deploy our software. At some point, our code will need to integrate with someone else’s code. It’s only then that we find out whether our assumptions about how other projects will act are right or wrong.

With a small repo, we can iterate quickly. We can do so because we delay the point at which we integrate with others. But! It’s a truism in software development that the later a defect is found, the more expensive it is to fix. By delaying the point of integration, we’ve increased the cost of fixing any integration.

OTOH, while it’s deeply frustrating for someone working in a larger repo to find out that they’ve broken something, we’re front-loading the cost of integration. Intuitively, this means that the overall cost of this integration will be lower.

Put another way, smaller repos optimise for local changes, trading that for increased integration costs. Larger repos optimise for reduced overall integration costs, trading that for more effort being required to land a single change in the tree.

And then, there’s the costs of CI and build infrastructure. Unless a repo is particularly simple, there is likely to be some kind of build process, and some kind of CI pipeline. As the amount of code grows, the CI pipelines may slow down. We tend to end up with someone being assigned to “make the build faster”, or with dedicated build engineers.

Again, with many small repos, the individual costs may not be high (after all, these repos tend to be simpler by definition), but aggregated over an entire organisation, the total engineering cost tends to be higher. Larger repos can aggregate this cost into experts in the build process, who can focus on improving feedback loops for significantly more engineers.

Put another way, we can view the build as the fulcrum on which we’re trying to move the world. The larger the codebase, the longer our lever, and the more impact an improvement can have on people.

And these are only some of the ways that working in a larger codebase can be more efficient than working with the same amount of code spread across multiple repos.

But it hurts

There are major downsides to large repos: some technical, some social. I clearly have an agenda, but it would be wrong to ignore them entirely 😁

Technically, larger repos need a build tool that can cope with large repos. They take longer to clone. Thought needs to be put into making CI processes more efficient for repos with multiple projects, and not all CI tools are set up for this challenge. But the build tool is the kicker: over a certain size, most tools end up weeping gently and not working well at all. That’s one of the reasons I’m a fan of tools such as Bazel.

But the social issues are the important ones. Most of us have only ever spent time in smaller repos. It chafes and hurts to have to meet the social contract of working in a larger repo.

We want to update a dependency for ourselves: it seems like wasted effort on our part to have to upgrade everyone else. And, guess what? You’re right! It is more effort for you to do that update for everyone, but it’s still more efficient overall. And it ignores all the times when someone else has jumped through the upgrade hoops and silently updated something you depend on.

Our PR takes longer to land because the CI build tells us some project we know nothing about now has failing tests. What a pain! I don’t care about them! Figuring out what’s wrong with their code is slowing me down! And, guess what? You’re right! It is slowing you down. But, here’s the thing, overall it’s more efficient. You have the context for your change right there in your mind. While fixing a failing build or test is seldom fun, it’s still easier to fix it with better context. So, yes, it’s slower for you, but it’s faster overall.

A brief discussion about updating shared dependencies

It’s worth spending a bit of time thinking about updating dependencies. When talking to people who are sceptical about working in a large repo, this is normally presented as the number one problem to solve.

One argument against shared repos is that the difficulty of making dependency updates can lead to the ossification of the repo. Imagine someone wants to try an experiment in production with some fancy new library, but pulling this in will mean that a transitive dependency needs to be updated, and this causes some other service that they don’t care about to need work done. Do you try the experiment, or not? It takes engineering effort to do this work, and because the cost is higher in a shared repo, the possibility that the investigation isn’t worth that effort is higher.

So this is a good time to have a thoughtful conversation about the risk/reward trade-off that needs to be made. If I were working on this, the first thing I’d do is a quick spike to see if the work was as complicated as I feared.

This is why most large repos I’ve seen have had a mechanism in place to allow multiple versions of the same dependency to live in the same repo for a very short period of time, or allow people to release experiments from a relatively short-lived branch. In both of these cases, the choice is made thoughtfully and carefully. Yes, it does mean that sometimes less experimentation happens, but the question remains of the value of those experiments.

The other argument against a shared repo when discussing updating deps is what to do when you can’t clean up someone else’s code because you don’t have enough context about how it works. There’s a simple answer to this, but it’s not one that goes down very well: have A Conversation.

Now, this implies that the social contract in a large repo is that teams are aware that they’re sharing the repo with others, and they’re willing to be good neighbours. If someone comes to you asking for help in a part of the tree that you’re familiar with, then offering the help they need is a neighbourly thing to do.

Some people may recoil from this because they don’t like talking to people, and that’s unfortunate. Other people may shy away from having a conversation because they know the team they want to talk to has absolutely no capacity or time to help them. I’d suggest that running a team ragged like this isn’t necessarily in the best interests of the long-term health of the codebase. Another reason not to have this conversation is because the work culture precludes this, for whatever reason. If that’s the case, then the social pressures against having a small number of large repos will cause fragmentation and separation into smaller repos, no matter the engineering costs.

Social considerations almost always end up trumping technical concerns.

As a final note, I’ve observed that dependency updates (no matter the size of the repo) tend to be bi-modal: most tend to be pretty easy and straight-forward, but some turn out to be absolute monsters. At some point, I should blog some strategies for dealing with these.

Choose what to optimise for

Really, the social contract of a larger repo is that you accept that there will be times where what you want to do is slower and more difficult, because that discomfort will lead to reduced effort overall.

Conversely, the social contract of a smaller repo is that we’re optimising for our smaller team’s comfort, at the price of higher integration costs, and needing to be responsible for all updates to our dependencies ourselves.

Which really means that you have a choice: do you optimise for the smaller or larger repo? Do you choose to spend more engineering effort overall in a less visible way, or do you spend less engineering effort overall, but because integration happens sooner, in a more visible way?

To me, it’s obvious which approach I’d pick in almost all cases.

bzlmod Modules

Mon, 24 Jan 2022 00:00:00 +0000

Recently, Bazel 5 was released. Hidden behind a flag is the new bzlmod tool. This is effectively a package manager for Bazel rulesets that’s baked into Bazel itself, the goal being to replace complicated WORKSPACE stanzas with a simple and purely declarative model. Ultimately, there should be no need for a workspace file at all.

At work, we maintain a relatively complicated suite of choreographed rulesets, and simplifying maintaining that seemed like too good an opportunity to miss, so I dove in to figure out how to get the most from bzlmod. Here’s what I learnt.

Quick Overview

Although bzlmod is actually baked into Bazel itself and not a standalone tool, I still refer to it by a separate name because that’s how my brain works. If that confuses you, I apologise. If it helps, you can think of bzlmod as the tool that does dependency resolution for the rulesets you’re using before it hands off the build to Bazel “proper”.

bzlmod reads a MODULE.bazel file. To begin with, this is relatively simple, consisting of an opening call to a module function, and then a series of calls to bazel_dep to declare a dependency on another ruleset.

At resolution time, bzlmod will check in the bazel central registry for the lowest version of each dependency declared within the MODULE.bazel file (much like Go does). The nice thing here is that each module (including your project!) need only declare its first order dependencies. That’s different from the approach taken in the regular WORKSPACE-based projects, where you’re responsible for ensuring that the transitive deps of the rulesets you use are also loaded.
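The Go comparison refers to minimal version selection. As I understand it, “lowest” here means the lowest version that satisfies everything that was asked for: in other words, the highest version any module in the graph requested, and never something newer just because it exists. Here's a simplified sketch of that idea. The module names and the in-memory `REGISTRY` are hypothetical, versions are assumed to be plain dotted integers, and real resolution handles much more than this.

```python
def parse(version):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def select_versions(root_deps, registry):
    """Pick one version per module from the requested dependency graph.

    root_deps maps a module name to the version the root project asks for.
    registry maps (name, version) to that release's own direct dependencies.
    """
    # Walk every (module, version) pair that anything in the graph requests.
    seen = set()
    pending = list(root_deps.items())
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        pending.extend(registry.get(node, {}).items())

    # For each module, keep the highest version that was actually requested.
    selected = {}
    for name, version in seen:
        if name not in selected or parse(version) > parse(selected[name]):
            selected[name] = version
    return selected


# Hypothetical registry. rules_c 1.3 exists, but nobody asks for it.
REGISTRY = {
    ("rules_a", "1.0"): {"rules_c": "1.1"},
    ("rules_b", "1.0"): {"rules_c": "1.2"},
    ("rules_c", "1.1"): {},
    ("rules_c", "1.2"): {},
    ("rules_c", "1.3"): {},
}

print(select_versions({"rules_a": "1.0", "rules_b": "1.0"}, REGISTRY))
```

The root project only lists `rules_a` and `rules_b`; `rules_c` comes along for the ride at 1.2, the newest version something requested, rather than 1.3.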

If you’re on a corporate network or don’t want to depend on the regular central registry, you can override the location of the registry by using the --registry flag. This takes a URL as its argument, and that URL can be a file:// URL.

Just as regular rulesets can declare repository rules, bzlmod modules can declare “extensions”. These are regular bzl files, written in Starlark, that contain a combination of “tags” and module_extensions. They’re loaded using a call to use_extension. More on this later!

One consequence of this design is that a ruleset can be both a regular “workspace”-based ruleset, as well as a module.

I’m calling the process of converting a ruleset to be a bzlmod module “modularisation” (that’s with a “z” in the middle if you’re using US English 😀)

Preparing for `bzlmod`

By default, bzlmod isn’t enabled. To opt into using it, the following needs to be added to a project’s .bazelrc:

common --experimental_enable_bzlmod

And, of course, you should pin the repo to using Bazel 5 or above. If you’re using bazelisk this is as easy as echo 5.0.0 >.bazelversion. If you’re reading this in the future (one hopes you are), then just use the version of Bazel that’s current at the moment.

The Wrong Module Development Workflow

My original attempt to work with modules involved making a local clone of the bazel central registry, and adding the ruleset I wanted to modularise by running the //tools:add_module.py script (by hand: there’s no bazel build file here, and you may need to install some python dependencies to get the thing to work).

Once the module was added, I created a fork in the ruleset I wanted to modularise, created a branch in that and pushed to a private GitHub repo. That’s because I’d used the branch’s URL as the location of the module when calling add_module.py.

Within the Central Registry clone, there’s an integrity code. This is a base64 encoded sha256, and every update to the ruleset needs to also be matched with an update to that integrity code.
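For reference, here's a small sketch of how such a code can be computed by hand. I'm assuming the `sha256-` prefix used by the Subresource Integrity format; check the value against whatever your registry tooling generates before relying on it.

```python
import base64
import hashlib


def integrity(data):
    """Return the base64 encoded sha256 of data, with an assumed sha256- prefix."""
    digest = hashlib.sha256(data).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


# In practice the bytes would come from the archive the registry points at,
# for example: integrity(open("path/to/archive.tar.gz", "rb").read())
print(integrity(b"example archive contents"))
```

Because the code is derived from the bytes of the archive, any change to the ruleset, however small, produces a different value.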

My workflow was therefore:

1. Make a local change to my fork.
2. Commit the change and push to the GitHub repo.
3. Update the integrity code in the Central Registry clone.
4. Kill the running bazel instance in the project that uses the module I’m working on, because Bazel stores the resolution.
5. Attempt to use the change, find a typo, go back to step 2.

Needless to say, this process was slow and very tedious.

The Right Module Development Workflow

What I should have done is modify the MODULE.bazel file in the project I was working on that used the ruleset I was modularising by adding a stanza like this:

# I was working on `rules_jvm_external`. This version hasn't been
# released yet!
bazel_dep(name = "rules_jvm_external", version = "5.0.0")

# And then later
local_path_override(
    module_name = "rules_jvm_external", # matches the name of the `bazel_dep`
    path = "../path/to/my/clone/of/rules_jvm_external",
)

Now, every time I made a change in the ruleset I was modularising (rules_jvm_external in this case), it was picked up automatically, without needing to restart the bazel daemon. This sped up development an awful lot.

The only caveat with this approach is that the local_path_override only works in the “top level” project. That is, the override takes effect when it’s declared in the project that’s importing the modularised ruleset, but a similar call inside the ruleset’s own MODULE.bazel would be ignored when that ruleset is used as a dependency.

Tags are Strongly Typed Macros

Let’s take an example from rules_jvm_external in a workspace-based project:

maven_install(
    artifacts = [
        maven.artifact(
            group = "com.google.guava",
            artifact = "guava",
            version = "27.0-android",
            exclusions = [],
        ),
        "junit:junit:4.12",
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
)

The way to handle this in a “MODULE.bazel” is to use tags. These are like stripped down rules, in that they have no implementation function, but they do have a set of attrs, each of which is defined as being one of the entries in the attr module.

One limitation of the MODULE.bazel file is that you’re not allowed to use functions or load an external resource. This means that you can’t use macros in the way that we’re used to, and this caused me some serious head-scratching. Fortunately, after a conversation with Xudong Yang, it became clear there was another way to think about this problem.

The trick is that bzlmod will aggregate all the tags defined transitively in a module, and the module implementation function can iterate over them. That means that the above stanza in a MODULE.bazel could be written as:

maven.install(
    artifacts = [
        "junit:junit:4.12",
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
)

maven.artifact(
    group = "com.google.guava",
    artifact = "guava",
    version = "27.0-android",
    exclusions = [],
)

There are two things to note here:

maven is the value returned by module_extension, and install and artifact are both tag classes.
The module extension’s implementation function aggregates the data from both of these into a single data structure, which is then resolved.
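The aggregation step can be modelled in a few lines of plain Python. This is only a sketch of the idea, not Starlark and not the actual rules_jvm_external code: the `Tags` class, the `collect_artifacts` function, and the dict-shaped tags are all hypothetical stand-ins for what the real implementation function sees.

```python
from dataclasses import dataclass, field


@dataclass
class Tags:
    """Hypothetical stand-in for the tags that one module declared."""
    install: list = field(default_factory=list)   # each entry: dict of attrs
    artifact: list = field(default_factory=list)  # each entry: dict of attrs


def collect_artifacts(modules):
    """Merge every `install` and `artifact` tag into sorted, de-duplicated lists."""
    coords = set()
    repos = set()
    for tags in modules:
        for install in tags.install:
            coords.update(install.get("artifacts", []))
            repos.update(install.get("repositories", []))
        for art in tags.artifact:
            # Flatten the structured tag into a group:artifact:version string.
            coords.add("%s:%s:%s" % (art["group"], art["artifact"], art["version"]))
    return sorted(coords), sorted(repos)


if __name__ == "__main__":
    root = Tags(
        install=[{
            "artifacts": ["junit:junit:4.12"],
            "repositories": ["https://repo1.maven.org/maven2"],
        }],
        artifact=[{
            "group": "com.google.guava",
            "artifact": "guava",
            "version": "27.0-android",
        }],
    )
    print(collect_artifacts([root]))
```

The point of the model is that the two tag classes are declared separately but end up in one data structure, which is what lets the artifact tag play the role the maven.artifact macro used to.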

While this allows rules_jvm_external to express what needs to be said, it’s pretty clear that if the artifact tag needed a macro itself, we’d rapidly be in a whole world of pain. Fortunately, in my case, we don’t, so that’s fine 😀

Module Implementation Functions Replace Workspace Stanzas

A module_extension’s implementation function gets access to a pretty anaemic module_ctx. This would be a problem, but the implementation function is free to call as many repository_rules as it wants to. You can also rely on the rulesets declared as a bazel_dep to be present.

This allows the module implementation function to effectively contain the bulk of what would normally be in the stanzas of code that get added to a workspace file.

Care must be taken to avoid the need to call load in the implementation function: although the implementation is a lot like a subset of a workspace file, it’s not exactly the same. In the case of what I wanted to do for rules_jvm_external, this means that the lock file gets parsed at least twice: once so that I can generate a series of http_file dependencies, and once so that the actual @maven workspace can be set up.

`bzlmod` Lock Files

At some point in the future, bzlmod will have its own lock file. I’m not entirely sure what this will look like, but my belief is that this will be structured in such a way that your module implementation function should only be called when one of its inputs (including attributes of tags) has changed. In the case of rules_jvm_external, this will most likely be when the artifacts being imported into your project change.

This will be incredibly useful. One of the painful things when working with workspace files is waiting for all the transitive deps to download and be set up so Bazel can figure out which bits of them it needs to use. On larger projects, this can take a long time. Being able to start building faster can only be a Good Thing, and I look forward to it!

Managing Your `MODULE.bazel`

Right now, you can’t. Because there’s no way of calling load in a MODULE.bazel file, there’s no way to segment the thing in a meaningful way. I’ve raised an issue to do with this, and I know it’s something the Bazel developers are aware of, so I hope that once bzlmod is no longer hidden behind a flag, this will be possible.

Final Thoughts

Overall, after kicking the tyres and trying it out, I think that I like bzlmod, and it’ll be fun to see how it grows and changes, especially as rulesets migrate to using it.

Right now, it’s usable, but there are some corner cases where it’s not quite there yet (notably when a module declares a dependency on a repository via a generated build file). Having seen how quickly the Bazel team have leaped on the issues I’ve filed, I’m very confident that problem will be resolved.

My advice? Try migrating your ruleset to bzlmod, and see what works for you and what doesn’t. I suspect there’s enough there for it to work just as you’d expect in many cases.

My thanks to the Bazel developers, Xudong Yang and Alex Eagle for their help as I delved into bzlmod. Alex’s blog post gave me the incentive to start digging into bzlmod and provided enough scaffolding for me to get started. It proved invaluable!

Gravity

Tue, 30 Nov 2021 00:00:00 +0000

Our solar system formed from a cloud of dust and gas. Sometimes, gravity pulled spectacularly large quantities of matter together, and we ended up with the Sun and the gas giants. Sometimes, the pull of gravity ended up forming something far smaller, like Pluto or Mercury. The asteroid belt? It’s the place where gravity wasn’t quite enough to coalesce matter into Yet Another Planet.

Gravity: it’s a big deal.

Our code has gravity. Each repository begins with a single line of code in a single file. Code accumulates. We write more to extend the functionality of our programs. And then, we find that we need to share some code between two repositories, and we have a choice.

We go one way, and each repository pretends it’s independent of the other, pushing and pulling that shared code to a central place such as a package manager. But they still perturb each other’s orbit. A backward-incompatible change in one repo causes the other to flex and change too. A feature request from one leads the other to change. They are independent, but tied, inextricably, by an invisible force.

We go the other way, and colocate code until we have a giant repository. A monorepo, or one of several überrepos. In a larger repo that invisible force is far stronger, because the code and the things that depend on it are far closer. The same backward incompatible change that took a long time to spot when the repos were separate becomes visible instantly.

You see? Code has gravity.

Just like our solar system, that gravity leads to different outcomes. Sometimes, the gravitational pull is sufficient to pull almost everything into a single place. Sometimes, there are other forces at play that lead to the exact opposite outcome. Frequently, I’ve seen repositories start as independent bodies, and then, as their sizes increase and their interactions multiply, they come ever closer together, sometimes merging. I’ve seen the opposite happen too. Repos figuring out ways of overcoming the gravity between the components and breaking out of the larger body to stand alone.

Our code has gravity. Our repos are caught in an endless dance, drawing closer, fragmenting, pulling apart, and then coming back together again.

Cloud Workstation

Wed, 17 Mar 2021 00:00:00 +0000

Sometimes, just sometimes, I need to work on a machine that isn’t running macOS. Previously, I’ve done this by installing either a local Docker image, or by running a VM. Neither of these approaches was terribly satisfactory, and both became less viable when I switched to an M1 MacBook Air (which, let it be known, is an absolutely lovely piece of kit). Today, I explored a new approach: using a cloud-based workstation.

The downside of this approach is that the workstation costs actual money to use, but the advantage is that I now have a machine that I can use whenever I need it, from wherever I happen to be. Because I’m only using this occasionally, as long as I remember to shut down the image, the costs are pretty reasonable.

Setting up my first one was ridiculously easy using GCP!

Since I’m not the sort of person who’s great at reading docs, I just followed the steps from a YouTube video. Summarised, this is:

Create a new VM instance in GCP. I picked an instance with 8 CPU cores and 32GB of RAM. That may be a little much, but the CPU is important for me, since I want my compile and test runs to be as fast as possible.
Set the OS to Ubuntu, and choose which version you want to use.
Make sure you pick the right kind of disk. For development work, using an SSD is a really good idea, and making sure that there’s enough disk space is vital.

Once that’s done, you can start up the VM and use gcloud to SSH on to it. Once you’ve done that, just go ahead and install everything you need to get your development done. We can use pinned browsers running headless in the Selenium build, so there’s less need for an X Windows environment, but setting one of those up seems easy enough.

Of course, I had no idea how much disk I actually needed, and had selected a “balanced” disk that was waaaay too small. Fixing that was pretty simple:

Stop the VM.
Create a snapshot of the disk.
Create a new disk from the snapshot, choosing both SSD and a reasonable amount of disk space.
Edit the VM instance that you’re using as a workstation, and scroll down to “boot disk”.
Nuke the existing disk, and then select to use the newly created disk image using Add Item

After this, restart the VM, give it a chance to reboot, and then you’re all good to go!

Migration Complete?

Mon, 31 Aug 2020 00:00:00 +0000

If everything has gone according to plan, my blog should now be hosted properly on the new site. Fingers crossed, eh?

Regular and Infrequent Performance Reviews

Sat, 22 Aug 2020 00:00:00 +0000

Ah! Performance reviews! Love ’em or loathe ’em, they’re a feature of corporate life. I’ve only worked in places that do performance reviews every six-to-twelve months. Maybe you work somewhere where performance reviews are done differently. If that’s true, I’m happy for you, and this post won’t be useful for you.

If that’s not true, and you’re in a place that does these “regular but infrequent” perf reviews, let me explain why I hate them with a passion, and then suggest some changes that might make them more tolerable and fit for purpose.

First of all, why do we do performance reviews? There are two main reasons:

To help us figure out what our areas of strength are, and which areas we should be focusing on improving or to avoid doing.
To help our manager (and reporting chain) identify the same.

That is, performance reviews aren’t just a feel-good (feel-bad?) exercise. Not only do they give us a chance to pause and reflect about what we’re doing in a way that we seldom do, but they also serve a purpose for the company that we’re part of. Frequently, performance reviews feed into things like promotions and whether or not to put someone on a “performance improvement plan”.

The problem is (IMHO) performance reviews run at “regular but infrequent” intervals are incredibly unhelpful. How come?

If the only time I’m encouraged to really stop and think how I’m doing in my job is once every six months (or, worse, twelve months) then something has gone horribly awry. The feedback we get as part of a performance review allows for “course correction”. Our 1:1s with our managers should be providing some of this, but getting feedback from peers is absolutely essential. Without that feedback, we can go for an excessive length of time accidentally making things worse.

So, the first reason I hate “regular and infrequent” perf reviews? They’re too little, too late.

Honestly, I could stop here. That’s the biggest reason I hate them. I could stop, but I’m not going to.

What would I prefer? Some places offer “continuous” feedback (Goldman Sachs, for example. Goldman Sachs!). That is, when there’s a suitable point for feedback to be collected and given, it’s collected and given. That may be at the end of a project, at particular milestones, or at some finer grain than “every six months”.

Another problem with “regular and infrequent” feedback loops is that it doesn’t foster a culture of people giving helpful feedback at the point where it could be most effective. Instead, people end up feeling ambushed by feedback that people have been harbouring, holding, and (possibly) festering on for up to a year. The best feedback I ever got was mid-way through helping to organise a conference, where one of the other people helping took me aside and told me exactly how they perceived my work, and what I could do to improve. It wasn’t comfortable to hear, and it can’t have been comfortable to give, but I listened and changed, and that helped everyone around me, and helped us smooth the work of getting that conference sorted out.

What would I prefer? At the very least, having feedback after milestone events in a project, and ideally when I (or my manager) think it would be helpful. Even better would be to be in a culture where people felt able to provide feedback as they deemed it necessary and when it would be helpful. It’s probably not a revelation that timely, helpful feedback is preferred to untimely, ambush-style feedback, so it puzzles me why I’ve seen so much of the latter and so little of the former.

Giving and receiving feedback can be very difficult. Pat Kua has a series of blog posts on this subject, but I find this post a good jumping off point.

However! Perf reviews aren’t just for the individual! They’re for the company too, and “regular but infrequent” perf reviews are a disaster for them.

Consider the way that feedback is normally gathered. Junior engineers typically ask people in their own team, and their tech leads. Senior engineers ask folks on other teams they interact with, frequently the other senior engineers. Managers seeking to support people looking for promotions tend to ask tech leads of teams and senior engineers for more detailed feedback too.

What does this mean?

It means that during “perf review season” a company’s most senior and influential engineers are no longer writing code and guiding teams. Instead, they’re writing walls of feedback for people, at best only being able to focus part of their attention on the projects they’re working on. The junior engineers tend to have less feedback to write, and get back to the grindstone sooner. This is clearly deleterious to the quality and progress of the projects.

That’s compounded by the fact that management is also soaked up in a massive effort to collect, collate, and standardise the feedback that’s coming in. Frankly, it leads to a huge uptick in stress and chaos, neatly targeted at a company’s leadership.

That can’t be good.

What would be better? Spreading the load over the course of the year would help. Doing one or two pieces of feedback every week is far less of a chore for the people providing that feedback. People who have just reached a milestone and have some space to breathe, with lessons still fresh in their minds, provide more useful feedback.

The downside with continuous feedback like this is that more rigour needs to be put in place to ensure that people are judged fairly and consistently. I don’t know the best way of doing that (sampled feedback, collected and reviewed using the current techniques? A subset of people asked for feedback using the “regular and infrequent” process?), but I bet you someone smarter than me has been thinking about it already.

Rebuilding the Site

Mon, 27 Jul 2020 00:00:00 +0000

The last time I updated this site was sometime in 2012. I put up a placeholder page, meaning to do far more, started using Blogger for my blog rather than a home-grown solution (again, meaning to find a way to import the old content), and then… and then… and then life happened, and I never seemed to do anything about it.

Then came the pandemic, and eventually, encouraged by my wife, I decided to refresh the whole thing.

The old site had been hosted on Google App Engine and I was pretty happy with it: SSL was zero-hassle, setup was simple, and it seemed like a safe bet. Eight years later, and App Engine isn’t the shiniest, spiffiest thing on the block right now, but it has shown itself to be reliable and unfussy. I like reliable and unfussy in my tech. I like boring. So it seemed like a good idea to continue using GAE.

However, I wanted to move off Blogger, and I wanted the site design to be a little more interesting, to work on mobile, and to be as fast and light as possible. So I had a look around, and chose the following:

Google App Engine
Hugo with the Minimal theme
A private repo on GitHub
Deployed using GitHub actions

That last point is particularly nice: I just need to push, and the site gets updated automatically at some not-too-distant point in the future. So, how did I do it?

Hugo

Hugo is pretty easy to set up. The docs are excellent, and after browsing through the Hugo Themes site, it looked like the “Minimal” theme hit many of my requirements and was also pretty light. Getting the skeleton set up was an absolute breeze.

Rather than using a git submodule, I used GitHub’s Subversion support and grabbed the theme using svn export: I knew that I’d be making tweaks, and I really do want to understand CSS and modern HTML better, so being able to do an update wasn’t so important. This also meant that I avoided the full horror of git submodules, which have always seemed an inelegant way of expressing modularity to me.

Migrating from Blogger

I had thought that this would be a nightmare. I wanted the old URLs to continue to work, and it was this that had stopped me dead in my tracks. I need not have worried. A little hunting found blog2md, which is a useful tool to take a Blogger (or, it turns out, WordPress) backup, and convert that into Markdown with front-matter that Hugo can consume.

My blog on Blogger was pretty minimal, and there were no images, so running node index.js b blogger-exoport.xml out did what I wanted. The only problem was the comments, but since I’d done a pretty poor job keeping spammers out of them, I felt it okay (though a little sad) to delete them.

The hardest bit was navigating the Blogger admin console to find out how to do the backups. If you’re following in my footsteps, that’s currently hidden under Settings -> Manage blog -> Back up content. It’s quite a long way down the settings, and I managed to overlook it at first. D’oh!

Hosting on App Engine

The nice thing with a static site is that it’s really simple to host. Anywhere with a working web server will do. However, I’ve used App Engine for a while now and I quite like it, so I don’t see a need to jump on to the Fastly or Netlify trains yet. Besides, I’ve already spent part of my “interesting” budget on Hugo (which we also use for the Selenium website), and I want to spend the rest on the deployment pipeline. App Engine it is, then.

The app.yaml is remarkably simple:

runtime: python27
api_version: 1
threadsafe: true

handlers:

- url: /
  static_files: public/index.html
  upload: public/*
  secure: always

# Standard index.html check
- url: /(.*)/
  static_files: public/\1/index.html
  upload: public/*
  secure: always

- url: /(.*\..*)
  static_files: public/\1
  upload: public/(.*)
  secure: always

Most of this is deeply unexciting (yay!). We’re using a lightweight runtime (Python), but we could have used anything. We force all the handlers to use TLS where possible, and we check for an index.html file every time you visit a directory (which has been standard practice since the dawn of the Web).
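If the handler patterns look opaque, this small Python sketch models how a request path could map to a file. It assumes each url pattern must match the whole path and that the first matching handler wins; `HANDLERS` and `resolve` are hypothetical names for illustration, and this is a model of the config rather than App Engine’s own code.

```python
import re

# (url pattern, static_files template) pairs, in the same order as the app.yaml.
HANDLERS = [
    (r"/", r"public/index.html"),
    (r"/(.*)/", r"public/\1/index.html"),
    (r"/(.*\..*)", r"public/\1"),
]


def resolve(path):
    """Return the file a request path maps to, or None if no handler matches."""
    for pattern, template in HANDLERS:
        match = re.fullmatch(pattern, path)
        if match:
            # Substitute \1 in the template with the captured part of the URL.
            return match.expand(template)
    return None


if __name__ == "__main__":
    for path in ["/", "/posts/", "/css/site.css", "/about"]:
        print(path, "->", resolve(path))
```

Under those assumptions, a path such as /about (no trailing slash, no dot) falls through all three handlers, so every generated link needs to end in a slash or name a file.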

One thing to note is that the content is all served from public. That’s the directory generated by Hugo when it’s run.

Having set up the basic app, it was possible to deploy from the command line, and everything looked pretty good, except I needed to configure TLS in order to have everything work on a custom domain. To do this, I needed to prove that I was the owner of the domain, and then the App Engine console has a handy option in Settings -> Custom Domains to Enable Managed Security. This will take care of creating and renewing SSL certs for you, so it becomes a no-brainer to just flip the switch.

Deploying via GitHub Actions

The final piece of the puzzle was to use GitHub Actions to deploy the site when I push to the repo hosting the content. I didn’t want to have to check in generated content (because “whhhhyyyy!?”). In the end I ended up with:

name: CI

on:
  push:
    branches: [ trunk ]
  workflow_dispatch:
    branches: [ trunk ]

jobs:
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2

    - name: Setup Python
      uses: actions/setup-python@v2
      with:
        python-version: "3.8"

    - name: Hugo setup
      uses: peaceiris/actions-hugo@v2.4.12
      with:
        hugo-version: "0.71.1"

    - name: Build site
      run: hugo 

    - id: Login
      uses: GoogleCloudPlatform/github-actions/setup-gcloud@master
      with:
        version: '290.0.1'
        project_id: "mysite"
        service_account_key: ${{ secrets.GCP_SA_KEY }}
        export_default_credentials: true
    - run: gcloud info

    - id: Deploy
      uses: GoogleCloudPlatform/github-actions/appengine-deploy@master
      with:
        project_id: "mysite"

The hardest bit was figuring out that I needed to install Python into the pipeline, or the Google tools would all get very grumpy. That, and remembering to set up a service account to do the deployments for me, and remembering to store the secrets in GitHub Secrets.

Actually, I lied. The hardest bit was figuring out what IAM roles were needed for the service account to use. I’ve ended up with:

App Engine Deployer
App Engine Service Admin
Cloud Build Service Account
Storage Object Creator
Storage Object Viewer

After granting those roles and a few manual reruns (which explains the workflow_dispatch section at the top), I finally got everything up and running.

In the end, the actual effort was a pleasant evening’s worth of mucking around with Hugo, App Engine (it’s been a while), and GitHub Actions. I’m pretty happy with the result :)

Writing Again

Mon, 27 Jul 2020 00:00:00 +0000

A friend of mine messaged me a while ago, and reminded me that my blog had been dormant for a very long time. They were right. I’ve not sat and written down my thoughts for a long time. It’s time to get started again.

Now, what to write about….

Many Months in Selenium: to November

Mon, 26 Nov 2018 16:36:00 +0000

Well, it’s been a long time since I sat down and wrote a post about the adventures in Selenium-land. Time for an update!

Since I last wrote, the work of updating the JSON handling code in the java tree has been completed, and it appears to be stable. However, it would be a terrible waste of time if that was all that we had done, and fortunately it’s not :)

The big thing is that we’ve now finished the 3.x release cycle, and we’re getting ready for 4.0. Someone foolishly let me pick version numbers, so the last few releases of 3 tended to get ever closer to π. Judging by some of the bug reports, the initial jump from 3.14 to 3.141 appears to have confused some folk, but now that we’ve reached 3.141.59 I think the point has been made (and, just maybe, the joke is wearing thin).

The 4.0 release is going to be a lot of fun. My main focus has been the new Selenium Grid, which features a more modern design, for use with things such as AWS and Kubernetes (and docker compose). Of course, maintaining a simple user experience has been high on our list of goals, so people used to the existing approach of “hub” and “node” will continue to be able to run the system like that. The biggest change is that under-the-covers, the new standalone server and the new Grid are exactly the same software, which is a huge change from the current approach where we have two not-terribly-well-integrated codebases in the same binary.

Another big change is that we’re exploring the move from using Buck to build much of Selenium to Bazel. This hasn’t been something I’ve been keen on doing, since I used to be the tech lead on Buck, and I think it has a number of useful properties. Despite this, Bazel has a comparatively huge amount of community support, and that means that people wanting to hack on the project have a smaller learning curve to climb.

A Month in Selenium: March

Sat, 31 Mar 2018 17:56:00 +0100

The month from February to March has been a fun one. At the beginning of March, I attended SauceCon, and gave a keynote on “Lessons From a Decade in Selenium”. While the original talk had focused on milestones such as when we first started shipping code, or when we switched to Git, or when someone joined the project, as I sat in the airport waiting to fly, I realised that this was an incredibly dull talk; surely the point of a keynote is to give people something to think about and consider?

This explains why I was busy rewriting the entire thing at 12km above the ground in a metal tube zipping along at a smidge over 900km/h.

In the end, I spoke about what makes working on Selenium so rewarding, focusing on the themes of “Joy”, “Serendipity”, “Thankfulness”, “Community”, “Growth”, and “Striving”. I’ve yet to see the official feedback, but I believe that the talk was well received, as people kept returning to the main themes throughout the conference.

SauceCon itself was a lot of fun. We were lucky to have some of the Selenium committers (old and new), and also supporters of the project from companies such as Sauce Labs itself, and Applitools (who are providing almost all the effort going into the new Selenium IDE). In addition, the Appium developers were well represented too. It was great to be surrounded by so many people who have spent so much time pouring energy into Open Source Software, and to catch up with some of my favourite people. There’s a lovely photo of Jonathan Lipps and myself in matching bowling shirts, which I’m happy to see he tweeted.

Since we had so many people in the same place, we decided to release Selenium 3.10. The main highlights of this release were behind the scenes for most users, as we focused on the continued clean up of the internals of Grid, and the continued use of our own abstractions to handle HTTP and JSON. Having said this, there were user-supplied patches, notably moving us from Selenium’s own “Duration” class to the one that ships with the JRE. Deleting code is a lot of fun.

One reason for shipping 3.10 was to lay the groundwork for a terrible dad-joke: releasing Selenium 3.11 on the 11th March (3/11 in US date format). Jim Evans and I had noticed that 3.11 was also one of the most famous of the Windows releases, so we decided to lean into the joke, and shipped “Selenium for Workgroups” as well in March. The Selenium server even reports this to users. 3.12 won’t have this feature.

In a bid to help our Windows developers ship the Selenium jars, I merged a ton of upstream changes to our fork of Buck, and then spent some time attempting to resolve the issue where zips created on Windows create unreadable directories when unpacked. My fix doesn’t resolve the issue, so I filed an issue with the upstream Buck project in the hope that they’d fix it for me. If I get some free time, I’ll try this as well.

I’ve also been working on replacing GSON within our tree (though on my local machine). By the end of the month of work, I had a forked version of Selenium that didn’t use GSON at all for outputting JSON. Sadly, I was a little over-ambitious when attempting to finish the work by also deserialising from JSON to proper types. It turns out that there’s a bunch of code in Grid that relies on the current semantics of GSON to function. Stepping back, it looks like most of this is because GSON isn’t aware of our own types, and it should be relatively easy to replace some of this. At least I know what I should be working on next….

Well, that, and a new way of starting sessions that allows users to properly make use of all the features that the W3C New Session command offers.

A Month in Selenium: February

Tue, 13 Mar 2018 18:53:00 +0000

January was a quiet month for Selenium hacking, but it laid the groundwork for February’s efforts. These largely centred around code cleanup in the Grid server, and migrating the project to make better use of our own abstractions over JSON and HTTP.

Why do we have our own abstractions for these incredibly common tasks? There are two main reasons. The first is that we’d like freedom to be able to choose our underlying implementation for these things, without needing to extensively rework our own first-party code. The second is that third party libraries offer generalised APIs that need to meet the needs of all users, whereas we have very specific needs met by these APIs and may need to work around some of the sharp edges (for example, in the java code, lots of classes that need to be serialised to JSON have a toJson method that GSON knows absolutely nothing about). This is typically done by writing adapters.
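The adapter idea is easy to show in miniature. Selenium’s code here is Java and GSON, so treat the following Python as an analogy only: the `Cookie` class, its `to_json` method, and the `adapt` and `dumps` helpers are hypothetical names, and the standard library’s json module stands in for the third-party library.

```python
import json


class Cookie:
    """Hypothetical first-party type that knows how to describe itself."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def to_json(self):
        return {"name": self.name, "value": self.value}


def adapt(obj):
    """Adapter hook: json.dumps calls this for objects it cannot serialise itself."""
    to_json = getattr(obj, "to_json", None)
    if callable(to_json):
        return to_json()
    raise TypeError("Cannot serialise %s" % type(obj).__name__)


def dumps(obj):
    """Our own thin abstraction; callers never touch the json module directly."""
    return json.dumps(obj, default=adapt, sort_keys=True)


if __name__ == "__main__":
    print(dumps({"cookie": Cookie("session", "abc123")}))
```

Because callers only see `dumps`, the library underneath can be swapped without touching first-party code, which is the first of the two reasons given above.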

We started using the Apache HttpClient by default as it’s the HTTP library used by HtmlUnit, which we used to ship as part of the core Selenium distribution. In keeping with the other drivers out there, the HtmlUnit team now work on the HtmlUnitDriver, so it’s no longer kept in the main project source repo. The interesting thing is that since we made the choice a long, long time ago to use the HttpClient, the HTTP standard has moved forward. HTTP/2 is now a thing. HTTP/2 support is coming as part of HttpClient 5. In order to take advantage of the new options and capabilities, we’d have to rework our existing abstractions anyway, so why not take a look around for something else to use? Better yet, if we use an HTTP library that isn’t a dependency of one of our dependencies, we’re less likely to end up with clashing versions.

One of the reasons that Java has a terrible reputation for start up speed is because people have massively bloated classpaths. As it stands, the Selenium standalone server weighs in at a portly 24MB. The Apache HttpClient weighs in at about 1.4MB of this total, before we do the update. After the update, the beta of 5 is a touch under 1MB. In comparison, OkHttp (which already supports HTTP/2) with its dependencies is approximately 500KB. In other words, OkHttp is smaller, already supports HTTP/2, and isn’t a dependency of our dependency.

So, we switched the project to use OkHttp instead of the Apache HttpClient.

Within the client code, making this change was relatively trivial. The problem is that the server-side code had leaked Apache’s APIs into the code. Before we can replace the Apache HttpClient, we need to first of all replace all those usages. That’s made somewhat harder by the fact that it’s exposed as part of the public APIs of various classes that other libraries extend.

Fortunately, we have a process for deprecating and deleting APIs. First of all, we mark the methods to be deleted as “deprecated” for at least one release. And then we delete them. Of course, if you’re going to deprecate a method, you really should provide an alternative and migrate as many uses as can be found to use the replacements. The bulk of my work this month was spent making these changes.

Of course, we needed to do a release, so we lined up 3.9 to start the process. In order to do the release, we needed to actually build it. There had been reports of some issues building the release artefacts on Windows. To resolve this, I had to update our fork of Buck to pull in the latest changes from Facebook, and then to try and work around those issues. Naturally, the Buck developers aren’t aware of our fork, so merging in their changes was a somewhat time-consuming affair. Once that was done, I wrote what I thought was a fix and pushed a new version of our fork of Buck.

It didn’t work. Oh well.

The final step in doing a release is trying to get our CI builds green. These take an incredible amount of time to run, and I wondered whether we could speed them up. Travis has support for caching, so it would be nice to use that. My attempts to use caching were foiled because the cache takes into account environment variables, which we use to separate our builds. There’s a bug open in the Travis tracker to allow us to name builds, which would have allowed us to work around this, but it’s still open. Ho hum. As a workaround, I wrote a simple wrapper around Buck that we can call within our CI servers. This makes better use of Buck’s ability to parallelise work automatically, and this has helped bring our build times down. Hurrah!

Two Months in Selenium - November and December

Sun, 21 Jan 2018 18:19:00 +0000

You may have noticed a distinct lack of an update last month. It’s because I was focused on client work, Christmas, and the New Year, and took some time away from the keyboard. But I’m back now!

The W3C WebDriver spec is now at the stage where we need to demonstrate multiple compatible implementations. Realistically, this means that we need two passes for each test in our test suite. The browser vendors are working hard to get things working, and progress is being made. There’s not been a huge amount for me to do here, so this is more of a waiting game than anything else from my perspective. Having said that, I’m on the hook for some sections in Level 2, so I should really sit down and write those (and the matching tests).

The main thing I’ve been focused on has been the Selenium Grid. There are a couple of things that we really need to solve with Grid. The first is that the code is complex and hard to deal with. When we originally released it, it took a huge amount of work to review the code for thread-safety and to debug many of the issues. That code has not become easier to reason about, which makes it harder to foster Open Source contributions.

Of course, that’d be fine if we didn’t care about making any changes, but we do. When Grid came into being, it was normal to have a physical server for each node in the grid. If you were lucky, you might have a massive server with VMWare running on it, which you’d cycle virtual machines on to keep the Grid healthy. The world has changed. Docker is now A Thing, and there are multiple “Selenium as a Service” (SaaS) cloud providers.

There are some projects out there that implement some of the functionality of Grid. For example, selenoid makes use of Docker, but it doesn’t use the W3C dialect of the webdriver protocol, which means it doesn’t do protocol conversion, and it doesn’t natively support cloud providers. Zalenium builds on top of Grid, and provides support for Docker and SaaS, but they’ve had to work within the existing architecture, and there are obvious rough edges.

Finally, we’ve wanted the selenium server to be a “Grid of one”. If you go into the code of the server, you’ll see that there are two fairly separate trees that live side-by-side. When you start the server, it picks one and then goes with it. It’s made things like supporting the W3C protocol harder than it should be, and it’s not an elegant way to run things.

As a solution to this, it seems obvious that there should only be one code path. The problem is that the standalone server is too simplistic about how it assigns work, and (as discussed) the Grid code is too complex. Over the past few releases, I’ve been landing code to help resolve this:

The pass through mode: this makes the server proxy requests without doing any parsing or changes unless necessary.
The ActiveSession abstraction has been added. This makes adding new types of provider (SaaS, Docker) far easier to write.
A “new session pipeline” has been added, and this is being used to handle things like multiple versions of the webdriver protocol.

The most recent thing I’ve been working on has been a new scheduler. This will be rolled into the new session pipeline, and is composed of a number of pieces:

The “Scheduler”: this is responsible for queuing new session requests, handling retries, and waiting until nodes become available. This class is thread-safe and designed to be the main entry point.
The “Distributor”, which is solely responsible for ranking and ordering available hosts, and the sessions that can run on them.
The “Host” abstraction, which represents a physical place where sessions can be run. Each of these has a number of….
The “Session Factory”, which is responsible for creating a new session.

The scheduler will sit within the new session pipeline. For the standalone server, we just add a single host. For the grid, we can add an arbitrary number of hosts (and therefore session factories).
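To make the division of labour between those four pieces concrete, here is a minimal sketch in Python. It is not the Selenium implementation: every class body, method name and the "most free slots first" ranking rule is an assumption made for illustration. Only the four roles come from the description above.

```python
import threading
from collections import deque


class SessionFactory:
    """Creates sessions for one kind of browser (hypothetical interface)."""

    def __init__(self, browser_name, max_sessions=1):
        self.browser_name = browser_name
        self.free_slots = max_sessions

    def can_create(self, capabilities):
        return self.free_slots > 0 and capabilities.get("browserName") == self.browser_name

    def create(self, capabilities):
        self.free_slots -= 1
        return {"browserName": self.browser_name}


class Host:
    """A place where sessions can run; owns a number of session factories."""

    def __init__(self, name, factories):
        self.name = name
        self.factories = factories

    def load(self):
        # Assumed ranking rule: the more free slots, the lower the load.
        return -sum(factory.free_slots for factory in self.factories)


class Distributor:
    """Only ranks hosts and their factories; it owns no queue and no locks."""

    def candidates(self, hosts, capabilities):
        for host in sorted(hosts, key=Host.load):
            for factory in host.factories:
                if factory.can_create(capabilities):
                    yield host, factory


class Scheduler:
    """Thread-safe entry point: queues requests and retries unmatched ones."""

    def __init__(self, hosts, distributor):
        self._hosts = hosts
        self._distributor = distributor
        self._pending = deque()
        self._lock = threading.Lock()

    def submit(self, capabilities):
        with self._lock:
            self._pending.append(capabilities)

    def run_once(self):
        """Try each queued request once; requests with no match stay queued."""
        started = []
        with self._lock:
            for _ in range(len(self._pending)):
                capabilities = self._pending.popleft()
                match = next(self._distributor.candidates(self._hosts, capabilities), None)
                if match is None:
                    self._pending.append(capabilities)  # retried on a later pass
                    continue
                host, factory = match
                started.append((host.name, factory.create(capabilities)))
        return started


# "Grid of one": the standalone server is just a scheduler with a single host.
standalone = Scheduler([Host("localhost", [SessionFactory("chrome")])], Distributor())
standalone.submit({"browserName": "chrome"})
standalone.submit({"browserName": "chrome"})
print(standalone.run_once())
```

The second request stays in the queue because the only factory has no free slot left. A grid would differ only in the number of hosts passed to the scheduler.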

As well as the new scheduler, we’re preparing the 3.9 release. It should be out next week, if everything goes according to plan. :)

The Selenium Server & Creating New Sessions

Mon, 15 Jan 2018 11:18:00 +0000

I’ve had the pleasure of being a co-editor of the W3C’s WebDriver spec, as well as the original author and one of the current maintainers of Selenium’s Java bindings, and one of the main authors of the current Selenium Server, particularly the pieces to do with implementing the W3C spec. So, as one of the few people on the planet who knows how all the pieces fit together, and why they fit together that way, I thought it might be helpful to explain how and why the Selenium Server handles a request to create a new session.

For this discussion, I’ll use the terminology from the spec. A “remote end” refers to the Selenium Server, and a “local end” are the language bindings you’re probably familiar with — there’s some in all the major programming languages, and about a million of them in the JS space too.

First of all, it’s advisable for the local end to send a single request for a new session that includes the expected payloads for both the W3C and the JSON Wire Protocol dialects at the same time.

Consider the case where you just send the w3c payload ({“capabilities”: {“browserName”: “chrome”}}). In this case, a w3c server would correctly attempt to start a chrome session. However, a server that only obeys the JSON Wire Protocol will see an empty payload, in which case it’s free to do whatever it wants.

Sending just the JSON Wire Protocol payload ({“desiredCapabilities”: {“browserName”: “firefox”}}) will create a firefox session in a server that understands the JSON Wire Protocol, but will cause a “no session created” error in a W3C compliant server (since that expects at least {“capabilities”: {}} to be set).

So, we have the expected, legal behaviour of the remote end laid out.
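A minimal sketch of a request body that carries both dialects at once. The function name is hypothetical. The W3C half uses the spec's `alwaysMatch` key, a nesting that the inline examples above abbreviate. Copying the same capabilities into both halves is only safe for keys, such as `browserName`, that mean the same thing in both dialects.

```python
import json


def new_session_payload(capabilities):
    """Build a New Session body that either kind of remote end can read."""
    return {
        # Read by W3C compliant remote ends.
        "capabilities": {"alwaysMatch": dict(capabilities)},
        # Read by remote ends that only speak the JSON Wire Protocol.
        "desiredCapabilities": dict(capabilities),
    }


print(json.dumps(new_session_payload({"browserName": "chrome"}), sort_keys=True))
```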

For historical reasons, most bindings only accept a “desired capabilities” hash as the argument when creating a new driver instance. Converting the old-style payloads to legal W3C ones is a non-trivial exercise (for example, {“firefox_profile”: “sdfgh”} is now {“moz:firefoxOptions”: {“profile”: “sdfgh”}}, but what happens if both are set? Also “platform” has become “platformName”, but do the values match? Probably only at the OS family level, according to the note in the spec).

Most local end bindings get this mapping wrong, but the user doesn’t care why their session isn’t as they’d expected it to be, they just know it’s not right. What to do? What, my friends, do we do?

The answer is to be generous about what we receive from the user and attempt to do what they want. Knowing that most local ends have at least a few problems converting the old format to the new format, the selenium server creates an ordered list of capabilities, putting the OSS ones at the front of the list to ensure maximum compatibility.
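The ordering can be sketched roughly as follows. This is an illustration rather than the server's actual code: the function name is made up. The merge of `alwaysMatch` with each `firstMatch` entry is deliberately simplified, since the spec rejects overlapping keys where this sketch quietly lets the `firstMatch` value win.

```python
def candidate_capabilities(payload):
    """Yield capability sets in the order a lenient server might try them."""
    # JSON Wire Protocol ("OSS") capabilities go to the front of the list.
    if "desiredCapabilities" in payload:
        yield dict(payload["desiredCapabilities"])

    w3c = payload.get("capabilities")
    if w3c is None:
        return
    always = w3c.get("alwaysMatch", {})
    # The spec treats a missing firstMatch as a list holding one empty object.
    for first in w3c.get("firstMatch", [{}]):
        yield {**always, **first}


payload = {
    "desiredCapabilities": {"browserName": "firefox", "firefox_profile": "sdfgh"},
    "capabilities": {
        "alwaysMatch": {"browserName": "firefox"},
        "firstMatch": [{"moz:firefoxOptions": {"profile": "sdfgh"}}],
    },
}
for candidate in candidate_capabilities(payload):
    print(candidate)
```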

So, now you know.

Why Use a Monorepo?

Mon, 15 Jan 2018 10:56:00 +0000

A monorepo helps reduce the cost of software development. It does this in three different ways: by being simpler to use, by providing better discoverability, and by allowing atomicity of updates. Taking each of these in turn….

Simplicity

In the ideal world, all you’d need to do is clone your software repository, do a build, make an edit, put up a pull request, and then repeat the last three steps endlessly. Your CI system would watch the repository (possibly a directory or two within it), and kick off builds as necessary. Anything more is adding overhead and cost to the process.

That overhead starts being introduced when multiple separate codebases need to be coordinated in some way. Perhaps there’s a protocol definition file that needs to be shared by more than one project. Perhaps there’s utility code that’s shared between more than one project.

In many organisations developers may not have the ability to set up a repo on demand, so there’s a time and political cost in creating one. Then there’s the ongoing cost of maintaining them, backing them up, and so on. Especially if data is being duplicated between repositories, the aggregate total space used by these repos will also be larger.

Multiple repositories are not necessarily “simple”.

One straw man solution to the problems of coordination is to copy all required dependencies into your own repo, but then we’ve a huge pile of duplicated work that opens up the possibility of parallel but incompatible changes being made at the same time.

A better solution is to build binary artefacts that are stored in some central location, and grab those when required. Bad experiences with storing binaries in the VCS make many people shy of just checking in the artefacts, so this storage solution seems attractive. But the alternatives introduce complexity. Where previously we only had to worry about maintaining the uptime of the source control system, there’s now the additional cost of maintaining this binary datastore, and ensuring its uptime too. Worse, in order to preserve historical builds, the binary datastore needs to be immutable after a write. In my experience, rather than being a directory served using nginx or similar, people turn to commercial solutions even when free alternatives are available. The cost of building and running this infrastructure raises the total cost of development.

Another area where monorepos bring simplicity is when a package or library needs to be extracted from existing code. This process is simple in a monorepo: just create the new directories, possibly after asking permission from someone, and check in. Every other user receives that change with their next update, without needing to re-run tooling to ensure that their patchwork clients are up to date. Outside of a monorepo, the process can be more painful, especially if a new repository is needed for the freshly extracted code.

Identifying every place that is impacted by such a code change is also easy in a monorepo, even if you’re not using a graph-based build tool such as bazel or buck, but doing something like “maven in a monorepo”. The graph-based build tools typically have a mechanism to query the build graph, but if the tree is one place and you don’t have code-insight tools, then even “grep” can get you quite far.

There are arguments about monorepos stressing source control software, requiring different tool chains, or not being compatible with CI systems. I addressed those concerns in an earlier post, but the TL;DR is “modern DVCS systems can cope with the large repos, you don’t need to change how you build code, and your CI pipelines can be left essentially ‘as is’.”

Discoverability

One of the ways that monorepos drive down the cost of software development is by reducing duplication of effort.

It’s a truism that the best code is the code that is never written. Every line of code that’s written imposes an ongoing cost of maintenance that needs to be paid until the code is retired from production (at the very earliest!). Fortunately, a good software engineer is a lazy software engineer — if they’re aware of a library or utility that can be used, they’ll use that.

In order to function properly, a monorepo needs to be structured to ease discoverability of reusable components, as covered in the post about organising a monorepo. One of the key supporting mechanisms is to separate the tree into functional areas. However, just because a monorepo is structured to aid discoverability, it doesn’t do anything to prevent “spaghetti dependencies” from appearing. What it does do is help surface these dependencies, which would exist in any case, without fancy additional tooling.

Naturally, a monorepo isn’t the only way of solving the problem of discovering code. Good code insight tooling can fill the same role (go Kythe!), as do central directories where people can find the code repositories that house useful shared code. Even hearsay and guesswork can suffice; after all, the Java world has coped with Maven Central for an incredibly long time.

Discovering code has other benefits. As a concrete example, it becomes possible to accurately scope the size of refactorings to APIs within an organisation: simply traverse the graph of everything impacted by a change, and make the change. What used to be a finger-in-the-air guess, or would require coordination across multiple repositories, becomes a far simpler exercise to measure. To actually perform the change? Well, there’s still politics to deal with. Nothing stops that.

Being able to identify all the locations that are impacted by any change makes CI infrastructure easier to write and maintain. After all, we use CI to answer the questions “is our code safe to deploy? And if not, why is it not safe?” In a monorepo, the graph of dependencies is easier to discover, and that graph can (and should!) be used to drive minimally-sized but provably correct builds, running all necessary builds and tests and not a single thing more. Needless to say, this means that less work is done by the CI servers, meaning tighter feedback loops, and faster progress. Do you need a monorepo to build this graph? Of course not. Is building that infrastructure to replicate this something you’ve time to do? Probably not.
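A minimal sketch of how a dependency graph can drive a minimally-sized build. The target names and the shape of `deps` are invented for the example. In practice the graph would come from a build tool's query mechanism, or from whatever bookkeeping you already have.

```python
from collections import defaultdict, deque


def affected_targets(deps, changed):
    """Return every target that must be rebuilt when `changed` targets change.

    `deps` maps a target to the targets it depends on.
    """
    # Invert the graph so we can walk from a dependency to its dependents.
    dependents = defaultdict(set)
    for target, requirements in deps.items():
        for requirement in requirements:
            dependents[requirement].add(target)

    # Breadth-first walk outwards from the changed targets.
    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


deps = {
    "//java/client": ["//java/json"],
    "//java/server": ["//java/client", "//java/json"],
    "//py/client": [],
}
print(sorted(affected_targets(deps, ["//java/client"])))
```

A change to the Java client means rebuilding the client and the server, while the Python client is left alone.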

There is also nothing about using a monorepo that precludes putting useful metadata into the tree at appropriate points. Individual parts of the tree can include license information (particularly when importing third party dependencies), or READMEs that provide human-readable information about the purpose of a directory or package, and where to go for help. However, the need for some of this metadata (“how do I get the dependencies?”, “what’s the purpose of this package?”) can be significantly reduced by structuring the monorepo in a meaningful way.

Atomicity

Occasionally there are components that need to be shared between different parts of the system. Examples include IDL files, protobuf definitions, and other items that can be used to generate code, or must exist as a shared component between client and server.

Now, there’s reams to be written about how to actually manage updating message definitions in a world where there might be more than one version of that protocol in the wild, and having a monorepo doesn’t prevent you from needing to follow those rules and suggestions. What a monorepo allows is a definitive answer to the question of where these shared items should be. Traditionally, the answer has been:

In one of the clients
In the server
In a central location, referenced by everything
Gadzooks, we’ll copy the damn thing everywhere

Needless to say, the last approach is remarkably painful, since all changes to the definitions need to be tracked across all repositories. In the first two cases, you may end up with unwanted dependencies on either client or server-side code. So the sensible thing to do is to store the shared item in a different repository. This will lead you to the horror of juggling multiple repositories, or, if you’re lucky, taking a dependency on a pre-built binary that someone else is responsible for building.

Interesting things happen when the shared item needs to be updated. Who is responsible for propagating the changes? Without a requirement to update, teams seldom update dependencies, so there’s out-of-band communication that needs to happen to enforce updates.

Using a monorepo resolves the problem. There’s one place to store the definition, everyone can depend on it as necessary, and updates happen atomically across the entire codebase (though it may take a long time for those changes to be reflected in production). The same logic applies to making small refactorings: the problem is easy to scope, and completion can be done by an individual working alone.

Summary

Monorepos can reduce the cost of software development. They’re not a silver bullet, and they require an organisation to practice at least a minimal level of collective code ownership. The approach worked well at Google and Facebook because those companies fostered an attitude that the codebase was a shared resource, that anyone could contribute to and improve.

For a company which prevents people from viewing everything and having a global view of the source tree, for whatever reason (commercial? Social? Internally competing teams?) a monorepo is a non-starter. That’s a pity, because there are considerable cost savings to be made as more and more share a monorepo. It’s also possible to implement a monorepo where almost everything is public, with selected pieces being made available as pre-compiled binaries or otherwise encrypted for most individuals.

Monorepos help reduce the cost of software over the lifetime of the code by simplifying the path to efficient CI, lowering the overhead of ensuring changes are propagated to dependent projects, and by reducing the effort required to extract new packages and components. As Will Robertson pointed out, they can also help reduce the cost of developing development support tooling by providing a single-point “API” to the VCS tool and the source code itself.

Complementary practices

Monorepos solve a whole host of problems, but, just as with any technical solution, there are tradeoffs to be made. Simply cargo-culting what Google, Facebook, or other public early adopters of the pattern have done won’t necessarily lead you to success. On the flip side of the coin, sticking with “business as usual” within a monorepo may not work either.

Although complex branching strategies might work in a monorepo, the sheer number of moving pieces means that the opportunity for merge conflicts increases dramatically. One of the practices that should be most strongly considered is adopting Trunk Based Development. This also suggests that developers work on short-lived feature branches, hiding work in progress behind feature flags.

Software development is a social activity. Merging many small commits without describing the logical change going in makes the shared resource of the repo’s logs harder to understand. This leads to a model that is less common than it used to be — squashing the individual steps that lead to a logical change to a single commit, which describes that logical change. This makes using the commit logs a useful resource too. Code review tools such as Phabricator help make this process simpler.

Most importantly: stop and think. It is unlikely your company is Google, Facebook, Twitter, Uber, or one of the other high-profile large companies that have already adopted monorepos (but if you’re reading from one of those places, “Hi!”). A monorepo makes a lot of sense, but simply aping the big beasts and hoping for the best won’t lead to happiness. Consider the advantages to your organisation for each step of the path towards a monorepo, and take those steps with your eyes open.

Thanks

Thank you to Nathan Fisher, Josh Graham, Paul Hammant, Felipe Lima, Dan North, Will Robertson, and Chris Stevenson for the suggestions and feedback while I was writing this post.

Organising a Monorepo

Wed, 22 Nov 2017 15:39:00 +0000

How should a monorepo be organised? It only takes a moment to come up with many competing models, but the main ones to consider are “by language”, “by project”, “by functional area”, and “nix style”. Of course, it’s entirely possible to blend these approaches together. As an example, my preference is “primarily language-based, secondarily by functional area”, but perhaps I should explain the options.

Language-based monorepos

These repos contain a top-level directory per language. For languages that are typically organised into parallel test and source trees (I’m looking at you, Java) there might be two top-level directories.

Within the language specific tree, code is structured in a way that is unsurprising to “native speakers” of that language. For Java, that means a package structure based on fully-qualified domain names. For many other languages, it makes sense to have a directory per project or library.

Third party dependencies can either be stored within the language-specific directories, or in a separate top-level directory, segmented in the same language specific way.

This approach works well when there aren’t too many languages in play. Organisation standards, such as those present in Google, may limit the number of languages. Once the number of languages becomes too many, it becomes hard to determine where to start looking for the code you may depend on.

Project-based monorepos

One drawback with a language-based monorepo is that it’s increasingly common to use more than one language per project. Rather than spreading code across multiple locations, it’s nice to co-locate everything needed for a particular project in the same directory, with common code being stored “elsewhere”. In this model, therefore, there are multiple top-level directories representing each project.

The advantage with this approach is that creating a sparse checkout is incredibly simple: just clone the top-level directory that contains the project, et voila! Job done! It also makes removing dead code simple — just delete the project directory once it’s no longer needed, and everything is gone. This same advantage means that it’s easy to export a cell as an Open Source project.

The disadvantage with project-based monorepos is that the top level can quickly become bloated as more and more projects are added. Worse, there’s the question of what to do when projects are mostly retired, or have been refactored to mostly slivers of their former glory.

Functional area-based monorepos

A key advantage of monorepos is “discoverability”. It’s possible to organise a monorepo to enhance this, by grouping code into functional areas. For example, there might be a directory for “crypto” related code, another for “testing”, another for “networking” and so on. Now, when someone is looking for something they just need to consider the role it fulfills, and look at the tree to identify the target to depend on.

One way to make this approach fail miserably is to make extensive use of code names. “Loki” may seem like a cool project name (it’s not), but I’ll be damned if I can tell what it actually does without asking someone. Being developers, we need snazzy code names at all times, and by all means organise teams around those, but the output of those projects should be named in as vanilla a way as possible: the output of “loki” may be a “man in the middle ssl proxy”, so stick that in “networking/ssl/proxy”. Your source tree should be painted beige: the least exciting colour in the world.

Another problem with the functional area-based monorepos is that considerable thought has to be put into their initial structure. Moving code around is possible (and possible atomically), but as the repo grows larger the structure tends to ossify, and considerable social pressure needs to be overcome to make those changes.

Nix-style monorepos

Nix is damn cool, and offers many capabilities that are desirable for a monorepo being run in a low-discipline (or high-individuality) engineering environment, incapable of managing to keep to only using (close to a) single version of each dependency. Specifically, a nix-based monorepo actively supports multiple versions of dependencies, with projects depending on specific versions, and making this clear in their build files.

This differs from a regular monorepo with a few alternate versions of dependencies that are particularly taxing to get onto a single version (*cough* ICU *cough*) because multiple versions of things are actively encouraged, and dependencies need to be more actively managed.

There are serious maintainability concerns when using the nix-style monorepo, especially for components that need to be shared between multiple projects. Clean up of unused cells, mechanisms for migrating projects as dependencies update, and stable and fast constraint solving all need to be in place. Without those, a nix-style monorepo will rapidly become an ungovernable mess.

The maintainability issue is enough to make this a particularly poor choice. Consider this the “anti-pattern” of monorepo organisation.

Blended monorepos

It’s unlikely that any monorepo would be purely organised along a single one of these lines; a hybrid approach is typically simpler to work with. These “blended monorepos” attempt to address the weaknesses of each approach with the strengths of another.

As an example, project-based monorepos rapidly have a cluttered top-level directory. However, by splitting by functional area, or language and then functional area, the top-level becomes less cluttered and simultaneously easier to navigate.

For projects or dependencies that are primarily in one language, but with support libraries for other languages, take a case-by-case approach. For something like MySQL, it may make sense to just shovel everything into “c/database/mysql”, since the java library (for example) isn’t particularly large. For other tools, it may make more sense to separate the trees and stitch everything together using the build tool.

Third party dependencies

There is an interesting discussion to be had about where and how to store third party code. Do you take binary dependencies, or pull in the source? Do you store the third party code in a separate third party directory, or alongside first party code? Do you store the dependencies in your repository at all, or push them to something like a Maven artifact repository?

The temptation when checking in the source is that it becomes very easy to accidentally start maintaining a fork of whichever dependency it is. After all, you find a bug, and it’s sooo easy to fix it in place and then forget (or not be allowed) to upstream the fix. The advantage of checking in the source is that you can build from source, allowing you to optimise it along with the rest of the build. Depending on your build tool, it may be possible to only rely on those parts of the library that are actually necessary for your project.

Checking in the binary artifacts has the disadvantage that source control tools are seldom optimised for storing binaries, so any changes will cause the overall size of the repository to grow (though not a snapshot at a single point in time). The advantage is that build times can be significantly shorter (as all that needs to be done is link the dependency in).

Binary dependencies pulled from third parties can be significantly easier to update. Tools such as maven, nuget, and cocoapods can describe a graph of dependencies, and these graphs can be reified by committing them to your monorepo (giving you stable, repeatable historical builds) or left where they lie and pulled in at build time. As one of the reviewers of this post pointed out, this requires the community the binaries are being pulled from to be well managed: releases must not be overwritten (which can be verified by simple hash checks), and snapshots should be avoided.
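The “simple hash checks” mentioned above could look something like this minimal sketch. The lockfile is assumed to be a plain mapping from artifact name to SHA-256 digest that gets committed to the monorepo. The artifact name and its bytes are made up.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_artifact(name, data, lockfile):
    """Raise if `data` is not byte-for-byte what was pinned for `name`."""
    expected = lockfile[name]
    actual = sha256_of(data)
    if actual != expected:
        raise ValueError(f"{name}: expected sha256 {expected}, got {actual}")


# Pin the digest when the dependency is first added...
original = b"pretend these are the bytes of a jar"
lockfile = {"example-lib-1.0.jar": sha256_of(original)}

# ...and check every later download against it.
verify_artifact("example-lib-1.0.jar", original, lockfile)
try:
    verify_artifact("example-lib-1.0.jar", b"a release that was overwritten", lockfile)
except ValueError:
    print("overwritten release detected")
```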

Putting labels on these, there are in-tree dependencies and externally managed dependencies, and both come in source and binary flavours.

Thanks

My thanks to Nathan Fisher, Josh Graham, Will Robertson, and Chris Stevenson for their feedback while writing this post. Some of the side conversations are worth a post all of their own!

Tooling for Monorepos

Mon, 20 Nov 2017 14:43:00 +0000

One argument against monorepos is that you need special tooling to make them work. This argument commonly gets presented in a variety of ways, but the most frequent boil down to:

Code size: a single repo would be too big for our source control system!
Requirement for specialised tooling: we’re happy with what we have!
Reduces the ability of teams to move fast and independently
Politics and fiefdoms

Let’s take each of these in turn.

Code size

Most teams these days are using some form of DVCS, with git being the most popular. Git was designed for use with the Linux kernel, so initially scaled nicely for that use-case, but started to get painful after that. That means that we start with some pretty generous limits: a fresh clone of the linux repo at depth 1 takes just shy of 1GB of code spread across over 60K files (here’s how they make it work!). Even without modifying stock git, Facebook was able to get their git repo up to 54GB (admittedly, with only 8GB of code). MS have scaled Git to the entire Windows codebase: that’s 300GB spread between 3.5M files and hundreds of branches. Their git extensions are now coming to GitHub and non-Windows platforms.

Which is good news! Your source control system of choice can cope with the amount of code a monorepo contains. Hurrah!

But how long does that take to check out? I’ll be honest, checking out a repo that’s 1GB large can take a while. If, that is, you check out the whole 1GB. Git, Mercurial, Perforce, and Subversion support “sparse” working copies, where you only clone those directories you need. The sparse checkout declarations can either be declared in files stored in source control, or they can be computed. They likely follow cell boundaries within the monorepo. It should be clear that in the ideal case, the end result is a working copy exactly the same size as a hand-crafted repository containing just what’s needed, and nothing more. As a developer moves from project to project, or area to area, they can expand or contract their current clone to exactly match their needs.
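As a minimal sketch of what “computed” could mean: start from the cells you want to work on and follow their dependencies. The cell names and the `deps` mapping are invented. The output is simply one directory per line, a shape that could be handed to whichever sparse-checkout mechanism your source control tool provides.

```python
def sparse_checkout_dirs(deps, roots):
    """Return the sorted directories needed to build the `roots` cells.

    `deps` maps a cell (a directory in the monorepo) to the cells it needs.
    """
    needed = set()
    stack = list(roots)
    while stack:
        cell = stack.pop()
        if cell in needed:
            continue  # already handled, which also guards against cycles
        needed.add(cell)
        stack.extend(deps.get(cell, ()))
    return sorted(needed)


deps = {
    "java/server": ["java/client", "third_party/java/okhttp"],
    "java/client": ["java/json"],
    "py/client": [],
}
print("\n".join(sparse_checkout_dirs(deps, ["java/server"])))
```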

So your checkouts don’t necessarily get larger. They may even get smaller.

But, what if you do have everything checked out? Your source control tool needs to know which files have changed. As the size of the repository grows, the slower these operations become, impacting developer performance. Except both Git and Mercurial have support for filesystem watching daemons (notably “watchman”). These allow file checking operations to scale linearly with the number of files changed, rather than with the number of files in the repository (I’d hope that even those using a “normal” large checkout would consider using this).

So everything is fine with the raw tooling. But what about your IDE?

I mean, yeah, if you’ve checked out the entire source tree, surely your IDE will grind to a halt? First of all, don’t do that — use a sparse clone — but if you insist on doing it, update your tooling. Facebook spent a chunk of resources to help make IntelliJ more efficient when dealing with large projects, and upstreamed those changes to Jetbrains, who accepted the patches. It was possible to pull in the source code for every Facebook Android app at the same time in IntelliJ. You may have a lot of code, but it’s unlikely to be that much. Other editors can also happily work with large source trees.

So, code size isn’t the problem you might imagine it is.

Requirement for specialised tooling

Quite often when people talk about monorepos, they also talk about the exotic tooling they use, from custom build systems and tricked-out source control servers to custom CI infrastructure. Perhaps a giant company has the time and resources to build that, but you’re too busy doing your own work.

Except a monorepo doesn’t require you to do any of those things. Want to use a recursive build tool you’re already familiar with? Go ahead. Paul Hammant has done some interesting work demonstrating how it’s possible to use maven (and, by extension, gradle and make) in a monorepo.

Switching to a build tool such as buck or bazel does make using a monorepo simpler, because these tools provide mechanisms to query the build graph, and can be simply configured to mark various parts of the tree as being visible or not to particular rules, but using one of these isn’t required. One nice thing? You don’t need to write buck or bazel yourself — they’re both already out there and available for you to use.

Similarly, if you’re comfy with jenkins or travis, continue using them. Admittedly, you’ll need to configure the CI builds to watch not just a repo, but a subdirectory within a repo, but that’s not too hard to do. If you’re using a graph-based build tool, then you can even use jenkins or buildbot to identify the minimal set of items to rebuild and test, but, again, there’s no need to do that. Just keep on trucking the way you do now.

Reduces the ability of teams to move fast and independently

Having a repository per-project or per-team allows them to operate entirely independently of one another. Except that’s not true unless you’re writing every single line of code yourself. It’s likely you have at least a few first and third party dependencies. At some point, those dependencies really should be updated. Having your own repo means that you can pick the timing, but it also means you have to do the work.

Monorepos naturally lead people to minimising the number of versions of third party dependencies towards one, if only to avoid nasty diamond dependency issues, but there’s no technical reason why there can’t be more than one version of a library in the tree. Of course, only a narcissist would check in a library without making an effort to remove the old versions. There are a pile of ways to do this, but my preferred way is to say that the person wanting the update manages the update, and asks for help from teams that are impacted by the change. I’ll cover the process in a later post. No matter how it’s done, the effect of having a single atomic change amortises the cost of the change over all the repos, reducing the cost of software development across the entire organisation by front loading the cost of making the change.

But perhaps it’s not the dependencies you enjoy freedom on. Perhaps it’s the choice of language and tooling? There’s no reason a properly organised monorepo can’t support multiple languages (pioneers such as Google and Facebook have mixed language repos). Reducing the number of choices may be an organisation-level goal, in order to allow individuals to cycle quickly and easily between teams (which is why we have code style guidelines, right?), but there’s nothing about using a monorepo that prevents you from using many different tool chains.

As a concrete example of this, consider Mozilla. They’re a remote-first, distributed team of iconoclasts and lovely folks (the two aren’t mutually exclusive :) ). Mozilla-central houses a huge amount of code, from the browser, through extensions, to testing tools, and a subset of the web-platform-tests. A host of different languages are used within that tree, including Python, C/C++, Rust, Javascript, Java, and Go, and I’m sure there are others too. Each team has picked what’s most appropriate and run with that.

Politics and fiefdoms

There’s no getting away from politics and fiefdoms. Sorry folks. Uber have stated that one of the reasons they prefer many separate repositories is to help reduce the amount of politics. However, hiding from things is seldom the best way to deal with them, and the technical benefits of using a monorepo can be compelling, as Uber have found.

If an organisation enthusiastically embraces the concept of collective code ownership, it’s possible to avoid anything other than purely social constructs to prevent ego being bruised and fiefdoms being encroached on. The only gateways to contribution become those technical gateways placed to ensure code quality, such as code review.

Sadly, not many companies embrace collective code ownership to that extent. The next logical step is to apply something like GitHub’s “code owners”, where owners are notified of changes before they are committed (ideally; using post-commit hooks for after-the-fact notification isn’t as efficient). A step further along, and OWNERS files (as seen in Chromium’s source tree) list individuals and team aliases that are required to give permission to land code.
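As a rough illustration of how an OWNERS-style gate might work, here is a small Python sketch. It is deliberately simplified and is not Chromium’s actual format or tooling: the OWNERS mapping stands in for files on disk, the addresses are invented, and the rule that one approving owner per file is enough is an assumption made for the example.

```python
import posixpath

# Hypothetical contents of OWNERS files, keyed by directory ("" is the root).
OWNERS = {
    "": ["build-team@example.com"],
    "libs/network": ["alice@example.com", "bob@example.com"],
}


def owners_for(path, owners=OWNERS):
    """Find the owners of the nearest enclosing directory that declares any."""
    directory = posixpath.dirname(path)
    while True:
        if directory in owners:
            return owners[directory]
        if directory == "":
            return []
        # Walk one level up the tree and try again.
        directory = posixpath.dirname(directory)


def may_land(changed_files, approvers, owners=OWNERS):
    """Allow a change only if every touched file has an approving owner."""
    return all(
        set(owners_for(path, owners)) & set(approvers) for path in changed_files
    )
```

A pre-commit hook or a code review bot could call something like may_land with the files in a change and the people who have approved it.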

If there is really strong ownership of code, then your source control system may be able to help. For example, perforce allows protection levels to be set for individual directories within a tree, and pre-commit hooks can be used for a similar purpose with other source control systems.

Getting the most out of a monorepo

Having said that you don’t need to change much to start using a monorepo, there are patterns that allow one to be used efficiently. These suggestions can also be applied to any large code repository: after all, as Chris Stevenson said, “any sufficiently complicated developer workspace contains an ad-hoc, informally specified, bug-ridden implementation of half a monorepo”.

Although it’s entirely possible to use recursive build tools with a monorepo (early versions of Google’s still used make), moving to a graph-based build tool is one of the best ways to take advantage of a monorepo.

The first reason is simply logistical. The two major graph-based build tools (Buck and Bazel) both support the concept of “visibility”. This makes it possible to segment the tree, marking public-facing APIs as such, whilst allowing teams to limit who can see the implementations. Who can depend on a particular target is defined by the target itself, not by its consumers, preventing uncontrolled growth in access to internal details. An OOP developer is already familiar with the concept of visibility, and the same ideas apply, scaled out to the entire tree of code.
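The idea of visibility can be sketched in a few lines of Python. This is only loosely modelled on how the real tools express it: the target labels, the “PUBLIC” marker and the “/...” pattern handling below are simplified assumptions for illustration, not the exact syntax or semantics of Buck or Bazel.

```python
# Each target lists who may depend on it. "PUBLIC" means anyone; a pattern
# ending in "/..." matches everything at or below that directory.
VISIBILITY = {
    "//libs/network:api": ["PUBLIC"],
    "//libs/network:impl": ["//libs/network/..."],
    "//libs/payments:core": ["//apps/checkout/...", "//libs/payments/..."],
}


def package_of(target):
    """Return the directory part of a label such as //libs/network:api."""
    return target.split(":", 1)[0]


def is_visible(dependency, consumer, visibility=VISIBILITY):
    """Check whether `consumer` is allowed to depend on `dependency`."""
    package = package_of(consumer)
    for pattern in visibility.get(dependency, []):
        if pattern == "PUBLIC" or pattern == consumer:
            return True
        if pattern.endswith("/..."):
            prefix = pattern[: -len("/...")]
            if package == prefix or package.startswith(prefix + "/"):
                return True
    return False
```

Note that the list lives with the dependency, not the consumer, which is what stops a team from quietly granting itself access to someone else’s internals.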

The second reason is practical. The graph-based build tools frequently have a query language that can be used to quickly identify targets given certain criteria. One of those criteria might be “given this file has changed, identify the targets that need to be rebuilt”. This simplifies the process of building a sensible, scalable CI system from building blocks such as buildbot or GoCD.
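The “given this file has changed” question boils down to walking the dependency graph backwards. Here is a minimal Python sketch of that walk, using a made-up graph and a made-up mapping from source files to targets; the real tools answer this sort of question through their own query languages, so treat this as an illustration of the idea rather than of any particular tool.

```python
from collections import deque

# Hypothetical build graph: each target maps to the targets it depends on.
DEPS = {
    "//apps/web:site": ["//libs/ui:ui"],
    "//apps/android:app": ["//libs/ui:ui", "//libs/network:network"],
    "//libs/ui:ui": ["//libs/base:base"],
    "//libs/network:network": ["//libs/base:base"],
    "//libs/base:base": [],
}

# Hypothetical record of which target owns each source file.
SRCS = {
    "apps/web/index.js": "//apps/web:site",
    "libs/network/Http.java": "//libs/network:network",
    "libs/base/Strings.java": "//libs/base:base",
}


def affected_targets(changed_files, deps=DEPS, srcs=SRCS):
    """Return every target that needs rebuilding after `changed_files` change."""
    # Invert the graph so we can walk from a target to its dependents.
    dependents = {}
    for target, target_deps in deps.items():
        for dep in target_deps:
            dependents.setdefault(dep, set()).add(target)

    queue = deque(srcs[f] for f in changed_files if f in srcs)
    affected = set()
    while queue:
        target = queue.popleft()
        if target in affected:
            continue
        affected.add(target)
        queue.extend(dependents.get(target, ()))
    return affected
```

A CI job could feed the list of files in a commit into a function like this and build and test only what comes back.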

Another pattern that’s important for any repository that has many developers hacking on it simultaneously is having a mechanism to serialise commits to the tree. Facebook have spoken about this publicly, and do so with their own tooling, but something like gerrit, or even a continuous build could handle this. Within a monorepo, this tooling doesn’t need to be in place from the very beginning, and may never be needed, but be aware that it eases the problem of commits not being able to land in areas of high churn.

A final piece in the tooling puzzle is to have a continuous build tool that’s capable of watching individual directories rather than the entire repository. Alternatively, using a graph-based build tool allows a continuous build that watches the entire repository to at least target the minimal set of targets that need rebuilding. Of course, it’s entirely possible to place the continuous build before the tooling that serialises the commits, so you always have a green HEAD of master….

Thanks

My thanks to Nathan Fisher, Josh Graham, Paul Hammant, Will Robertson, and Chris Stevenson for their feedback and comments while writing and editing this post. Without their help, this would have rambled across many thousands of words.

Some Useful Monorepo Definitions

Sun, 19 Nov 2017 23:14:00 +0000

The concept of a monorepo seems so self-evident that there is little need to define it. Just co-locate all your code in one place, and you’re done, right?

The problem is that this doesn’t capture lots of the nuance of the term. After all, if all you have is a single project, then, by this definition, you have a monorepo. While technically correct (the best kind of correct!) this doesn’t feel right. There has to be more to it than that.

Monorepo

Summary:

A monorepo represents the body of code and supporting digital assets owned by an organisation. Within that body of code, it’s possible to draw logical boundaries around certain areas, either shared libraries, individual projects, or other groupings.

Discussion:

Previously, I’ve written that a monorepo is “a unified source code repository used by an organisation to host as much of its code as possible.” That does the job, but I think it falls short of succinctly describing the goals of a monorepo in favour of an implementation of the pattern. Oh well, exploration of an idea is an iterative process, with each iteration being able to use the insights from previous iterations. Let’s iterate again!

Cell

Summary:

A cell is an atomic unit representing a single logical piece within the monorepo.

Discussion:

When we were working on Buck, we struggled for a long time to come up with the best name for the logical areas within the monorepo. Initially, they were formed from the individual repositories we were coalescing into the monorepo. However, “repository” was an overloaded term, and so one we wanted to avoid. Similarly, “module” already has an established meaning in some of the languages we wanted to support.

In the end, we settled on using a biological metaphor. Because a monorepo represents a body of code, and these logical groupings represent the atomic units that the monorepo is constructed from, we called them cells. In many organisations, pre-monorepo, a cell represents a single repository.

Because of this mapping to a conceptual repository, a cell is a great candidate for Open Sourcing. Should this happen, it’s entirely possible that there needs to be some tooling to map file structure from the shape used within the monorepo to the shape expected by the OSS library. Ideally, that tooling would allow code to be both imported and exported to and from the monorepo, rather than only allowing a push in a single direction.

Projected Monorepo

Summary:

A set of repositories presented as if they were a monorepo, typically via additional tooling.

Discussion:

Monorepos may be classified by the way that the code within is organised, but there is another approach: the projected monorepo. This isn’t a monorepo in the (umm…) traditional sense, where all the code is in the same code repository, but something that acts as if it were a monorepo through external tooling. An example would be the Android Open Source project, which uses “repo” to stitch together multiple separate repositories into something that acts as a single cohesive whole. To a lesser extent, things like git submodules also fulfill the same role of creating projected monorepos.

In a projected monorepo, it is clear where the cells lie — they’re the individual repositories that are being stitched together to form the new whole.

Target

Summary:

The individual units addressable by the build tool, which are used to declare dependencies.

Discussion:

Within a monorepo there are targets. These are units that are addressable by the build tool, and are also typically used to declare dependencies. They typically have concrete outputs, such as libraries or binaries. Targets are human-readable, and are most commonly given as a path within the repository.

A cell is typically composed of many targets. As an example, perhaps a cell consists of a single library. There might be targets within that cell that would allow the library to be built, the tests for that library to be built, and (perhaps) another to allow those tests to be run.

Graph-based build tool

Summary:

A build tool designed for use within a monorepo where build files are located throughout the source tree and used in a non-recursive manner.

Discussion:

It’s common to use a graph-based build tool with monorepos. These are tools that are natively designed for a monorepo, and operate on the directed acyclic graph of dependencies between targets. They typically provide the ability to build polyglot projects, and the ability to query the build graph. The two major examples are Google’s bazel and Facebook’s buck. Both of these tools can trace their user-facing design to Google’s “Blaze” build tool.

Admittedly, behind the scenes almost every build tool makes use of basic graph theory in order to work: after all, most tools do a topological sort of targets in order to work their magic, and they frequently have commands that allow that graph to be queried. The major difference between these other tools and what I’m terming a “graph-based build tool” is the use of build files throughout the tree that are used in a non-recursive way. This encourages the creation of relatively small compilation units.
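For anyone who hasn’t met a topological sort before, here is a tiny Python sketch using the standard library’s graphlib module (Python 3.9 or later). The target names are invented, reusing the single-library cell from the earlier example, and this says nothing about how any real build tool schedules its work.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical targets in a single-library cell: target -> its dependencies.
DEPS = {
    "//libs/example:lib": [],
    "//libs/example:tests": ["//libs/example:lib"],
    "//libs/example:run_tests": ["//libs/example:tests"],
}


def build_order(deps):
    """Return the targets ordered so that dependencies come first."""
    # TopologicalSorter raises CycleError if the graph is not acyclic.
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for target in build_order(DEPS):
        print(target)
```

The library is built first, then its tests, and only then are the tests run. A dependency cycle is reported as an error rather than silently producing a nonsense order.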

Hopefully these terms, and the various ways of organising a monorepo, give us a common language to discuss monorepos in a meaningful way.

Thanks

My thanks to Kent Beck, Nathan Fisher, Josh Graham, Paul Hammant, Will Robertson, and Chris Stevenson for their comments and feedback while writing this post. The conversations have definitely helped clarify and improve this post.

A Month in Selenium - October

Sun, 19 Nov 2017 20:58:00 +0000

Another month, another update, you lucky people. The highlights:

W3C TPAC.
I attended the W3C’s TPAC meeting in California. This is the main get-together for many of the “working groups” that are working on standards as part of the W3C. It’s also where the Browser Tools and Testing Working Group met to discuss progress on the WebDriver spec.

Good news! Once we clean up the implementation report, we’re ready to move to “Proposed Recommendation”, which is the last step before becoming a standard (or “Recommendation” in W3C parlance).

More good news! The “Level 2” version of WebDriver will have a new logging infrastructure added. This will make it easier for you (yes, you!) to figure out where failures have occurred. Better insight should lead to more stable tests.

Even more good news! Some of the folks from Sauce Labs attended the face-to-face meeting. They help bring an additional perspective to the design and use cases of the protocol. Until now, the group has been mostly composed of browser implementors and people from the Selenium project. The more people involved with the spec, the better it’s going to be.

The minutes for this face-to-face session are available, as are the minutes for the other face-to-face sessions.

Hacking on Selenium
Last month, we were closing in on the Selenium 3.7 release. This month we shipped 3.7.0 and then, because of a small oversight where we missed a jar file in the downloadable artefacts, 3.7.1. There are some nice changes in there. As mentioned last month, one of the areas of focus has been improving how we handle the New Session command when dealing with a local end that might speak both the W3C and JSON Wire Protocol dialects of the webdriver protocol. One of the things that the spec says we’re meant to do is pass through additional top-level fields in the new session payload. 3.7.1 now does this (hurrah!).

One of the nice things from the work in 3.7 is that we’ve laid the groundwork for a clean up of the Selenium Grid code. As part of that, we restored a behaviour where a Grid Node, configured with a path for the Firefox or Chrome binary would have this path injected into any capabilities when starting a session. Making the nodes even more configurable is something that’s on the road map for a later release.

More next month!

A Month in Selenium - September

Wed, 18 Oct 2017 10:46:00 +0100

I realise that this blog has been pretty quiet. Part of the reason for that is that I’m terrible at sitting down and just writing. What I really need is an incentive. That incentive arrived this month in the form of the Selenium Fellowship, which takes the form of a stipend to fund work hacking on Selenium. Part of the agreement is a monthly blog post. So, you all have the Software Freedom Conservancy to thank :)

So, what contributions have I been making to the Selenium project this month?

There are two major highlights. The first of these is Selenium Conf, which was in Berlin. I gave the State of the Union keynote (so called because the first one was an update of how the merger of the Selenium and WebDriver projects was going). Over the past few Selenium Conferences, the theme has slowly been building that Open Source Software depends on people to move it forward. This time, the message was far starker, as I counted the number of people who contribute to key parts of the project — for some pieces, we depend on one person alone. I also covered the various moving pieces in the project, using Kent Beck’s “3X” model as a framework to hold the talk together.

As well as being part of the show at SeConf, I also had the pleasure of helping out Jim Evans in the “Fix a Bug, Become a Committer” workshop. He did a great job explaining how the pieces fit together, and by the end of the workshop, we had everyone building Selenium and running tests in their IDEs of choice (provided that choice wasn’t “Eclipse”), which is a testament to the hard work he’d put into preparing the session. It did highlight that the “getting started” docs probably need a bit of a polish to become usable. I was also invited to do a Q&A with the folks in the “Selenium Grid” workshop, where I broke from theme to talk about the role of QA in a team. Thanks for being patient, everyone!

In terms of code, as I write this, I’ve landed 57 commits since September 17th. Part of this was to help shape the 3.6 release. For Java, the theme of this release was the slow deprecation of the amorphous blob of data that is “DesiredCapabilities” to the more strongly-typed “*Options” classes (eg. FirefoxOptions, ChromeOptions, etc). The idea behind the original WebDriver APIs was to lead people in the right direction: if they could hit the “autocomplete” keyboard combination in their IDE of choice, then they’d be able to figure out what to do next. The strong typing is a continuation of this concept, and is something that all the main contributors are fans of.

One implementation choice we made in the Java tree is that each of the Options classes is also a Capabilities. I made this choice for two reasons. The first is philosophical. We don’t know ahead of time what new features will land in browsers (headless running for Chrome and Firefox are examples), so we’ll always need an “escape hatch”, to allow people to set additional settings and capabilities we’re not aware of. The second is pragmatic. The internals of Selenium’s java code are set up to deal with Capabilities, and people extending the framework have been dealing with them as an implicit contract of the code.

In the wild, there are two major, and one very minor, “dialects” of the JSON-based protocol spoken by the various implementations. The first is the original “JSON Wire Protocol”, and the second is the version of that protocol that has been standardised as part of the W3C “WebDriver” specification. We took pains when standardising to make sure that a JSON Wire Protocol response is almost always a valid W3C response (technical note: because all values are returned as a JSON Object with a “value” entry, which contains the return value), but there are two areas where the dialects diverge wildly.

One area is around the “Advanced User Interactions” APIs. The end point offered by the W3C spec is significantly more flexible and nifty than the original version in the Selenium project, but it is also a lot more complex to implement.

The other area is around “New Session”, which is the command used to create a new Selenium session. The JSON Wire Protocol demands that the user place the set of features that they’re interested in using into a “desiredCapabilities” JSON blob. This was originally designed as part of a “resource acquisition is initialisation” pattern — you’d load up the blob with everything you might want (a chrome profile, an equivalent firefox profile, the proxy you’d like to use), mashing together items that theoretically only belonged to one browser into a single unit. The remote end was then to do a “best effort” attempt to meet those requirements, and then report back what it had provided. The local end (the driver code) was then to test whether or not the returned driver was suitable for whatever it was that users wanted to do. Which is why they were called desired capabilities — you made a wish, and then could look to see if it came true. If nothing matched, it was legit for a selenium implementation to just start up any driver and give you that.

The W3C protocol is a lot more structured. It provides for an ordered series of matches that can be made, with capabilities that must be present in all cases. For our example above, the proxy would be used for any driver, and then there’d be an ordered set of possible matches for chrome and then firefox (or vice versa). Each driver provider gets a chance to fulfill that request, and if it can, then we use that driver. If nothing matches, then we fail to initialise the session and return an exception to the users.

The more structured data used by the W3C New Session command is sent in a different key in the JSON blob, and this is by design. In theory, it’s possible to map a JSON Wire Protocol “New Session” payload to the W3C one, and to map the W3C structure to something close to the JSON Wire Protocol payload. Sadly, this process is complex and error prone, and there are language bindings that have been released that get this wrong to one degree or another (and, indeed, some that don’t even make the effort). All this means that the Selenium Server has to try and discern the user’s intent from the blob of data sent across the wire. Getting this right, and flexible, has been the focus of the forthcoming 3.7 release. It’s fiddly work, but it’ll be worth it in the end.
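To show the two shapes side by side, here is a small Python sketch. The key names (“desiredCapabilities” for the JSON Wire Protocol, and “capabilities” holding “alwaysMatch” and “firstMatch” for W3C) come from the respective protocols; the helper functions and the proxy address are invented for the example, and this is not how the Selenium Server itself is written.

```python
import json


def json_wire_payload(desired):
    """JSON Wire Protocol: one blob holding everything you might want."""
    return {"desiredCapabilities": desired}


def w3c_payload(always_match, first_match):
    """W3C: capabilities needed in all cases, plus ordered alternatives."""
    return {
        "capabilities": {
            "alwaysMatch": always_match,
            "firstMatch": first_match,
        }
    }


def candidates(payload):
    """Expand a W3C payload into the ordered capability sets to try."""
    caps = payload["capabilities"]
    always = caps.get("alwaysMatch", {})
    merged = []
    for option in caps.get("firstMatch", [{}]):
        clash = set(always) & set(option)
        if clash:
            # The spec treats a key in both places as an invalid argument.
            raise ValueError(f"capabilities set twice: {sorted(clash)}")
        merged.append({**always, **option})
    return merged


if __name__ == "__main__":
    proxy = {"proxyType": "manual", "httpProxy": "proxy.example.com:8080"}
    payload = w3c_payload(
        {"proxy": proxy},
        [{"browserName": "chrome"}, {"browserName": "firefox"}],
    )
    print(json.dumps(payload, indent=2))
```

Each entry returned by candidates is what a driver provider would be offered, in order, with the proxy present every time.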

Another common problem we see is that some servers out there speak the W3C protocol natively (eg. IEDriverServer, geckodriver, the Selenium Server) and others don’t yet (eg. safaridriver, chromedriver, and services such as Sauce Labs). A big part of the 3.5 release was the “pass through” mode, which means that if the Selenium Server detects that both ends speak the same “dialect” of the wire protocol, it’ll just shuttle data backwards and forwards. However, if it detects that the two ends don’t speak the same protocol, it’ll do “protocol conversion”, mapping JSON Wire Protocol calls to and from W3C ones. This has been made easier by the fact that the W3C spec is congruent with the JSON Wire Protocol – the two have identical end points for many commands.
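The decision between the two modes can be sketched like this in Python. The response shapes are the ones the two dialects use for New Session (the JSON Wire Protocol puts “sessionId” and “status” at the top level, while W3C nests “sessionId” and “capabilities” inside “value”), but the function names are made up and the real server’s detection logic is more involved than this.

```python
def response_dialect(response):
    """Guess the dialect from the shape of a New Session response."""
    value = response.get("value")
    if isinstance(value, dict) and "sessionId" in value and "capabilities" in value:
        return "w3c"
    if "sessionId" in response and "status" in response:
        return "jsonwire"
    return "unknown"


def choose_mode(local_dialect, remote_dialect):
    """Shuttle data untouched when both ends agree, otherwise convert."""
    if local_dialect == remote_dialect:
        return "pass through"
    return "protocol conversion"


if __name__ == "__main__":
    remote = response_dialect(
        {"value": {"sessionId": "abc123", "capabilities": {"browserName": "firefox"}}}
    )
    print(choose_mode("jsonwire", remote))
```

A JSON Wire Protocol client talking to a W3C remote end lands in the conversion branch, which is exactly the case where every command needs mapping in both directions.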

But not all commands. The main ones that have been causing grief have been the advanced user interaction commands, particularly when a local end speaks the JSON Wire Protocol, and the remote end speaks the W3C one. Just such a situation arises for users of some cloud-based Selenium servers, and it’s been a constant source of questions from users. To help address this, I’ve landed some code that does emulation of the common JSON Wire Protocol advanced user interaction commands (things like “moveTo”). Hopefully this will address the majority of headaches that people are experiencing using this new functionality.

Let’s see what the next month brings. Hopefully, we’ll ship 3.7 :)

The Poetry of Code

Fri, 03 Jun 2016 09:37:00 +0100

Write a poem about a sunrise. Perhaps you’ll leap straight in, and start writing freeform verse. Perhaps you’ll choose a style; a haiku, or a limerick? Something using iambic pentameter or rhyming couplets? Your choice of approach places tangible constraints on how you express yourself.

What aspect of the sunrise will you write about? The sun itself, or the environment it rises over? Maybe there’s a seascape to be evoked, or mountains. Maybe a city?

Now ask a friend to write a poem about a sunrise. I promise you, it won’t be the same. To the outside observer watching you work, both of you will look alike — scratching words on a page with a pen — but the results are wildly different.

You both work alone. Your art is your own. It’s wonderful.

Write a program to sort some numbers. Perhaps you’ll leap straight in, and start writing freeform code. Perhaps you’ll choose a style; Object Orientation perhaps, or a functional approach? Something using Java or Python? Your choice of approach places tangible constraints on how you express yourself.

What algorithm will you choose to write? A bubble sort, or a quick sort? Maybe there’s a shell sort to be implemented, or a sleep sort? Maybe there’s some other approach?

Now ask a friend to write a program to sort some numbers. I promise you, it won’t be the same. To the outside observer watching you work, both of you will look alike — typing words on a keyboard — but the results are wildly different.

You seldom work alone. Your art is a collaborative exercise. It’s wonderful.

Sometimes I'm an Idiot

Fri, 04 Dec 2015 15:00:00 +0000

Recently, I’ve had a few too many things on my plate to deal with, and have been flirting with burn out, so it’s time to take stock and stop being an idiot. In order to stop doing something, one must understand the things that are being done. In light of that, let me enumerate some of the ways in which I am an idiot:

Recently, for only the second time this year, I went to one of the many social events organised by work. They’re a great chance to hang out with people I’d not normally see. I’ve been an idiot for prioritizing sitting at a keyboard over spending time getting to know other people and getting a better view of the company I work for. More broadly, I spend too much time at work, and it doesn’t bring happiness. Don’t you be an idiot too. Go and talk to someone.
I’ve still got about half my annual leave to take even though it’s now December. I’ve been an idiot for not prioritizing resting and looking after myself. I’m in touching distance of finishing a big project, and once that’s done I’ll be taking all my remaining holiday. You should also take your leave. No project has failed because someone went on holiday.
For months, I was “too busy at work” to go to the doctor about a nagging pain in my foot. After it became so chronic walking to the office every morning was painful, I finally caved and went to seek help. It’s going to take months to sort this shit out. If I’d taken the time to go to the doctor sooner I’d be better already, and things wouldn’t be as painful or complicated as they are. I was an idiot for not prioritizing my own health. How can I work when I’m sick? If you’re feeling rubbish, or you need treatment, go and get it. Then, once you’ve done that, come back and be busy.
Switching off from work is a must. Drawing a line between the office and home is vital. I’ve been an idiot for looking at work stuff after hours, when I can’t really do anything about it, and never properly disconnecting. Facebook have an app called @Work and a recently-launched @Work Chat app, and I use these extensively. Your work may use a different email or calendar server from the one you use for your personal life. When someone who’s not an idiot leaves the office, they mute work-related conversations, calendars and emails. Not doing that is a great way to burn out. Don’t be an idiot.
At both Google and Facebook, I’ve had regular international travel in order to talk to people and collaborate with teams. This has meant I’ve not felt able to book myself into after-work courses even though I’ve wanted to. I am an idiot for letting work stop me from improving myself and my life. As a concrete example, my girlfriend was Turkish (she’s still Turkish. Figure the rest out for yourself). I thought it would be nice to learn the language so that we could go on holiday, visit her family, or just chat at home (and — hey! — learning a new language is always fun). I tried a combination of Rosetta Stone and text books, but I always let work get on top of me, so I never put in the work required. I knew I needed to take a structured class, and I knew that would need to be in the evening. I never signed up, because I was “too busy”. Now I deeply regret that, and kick myself routinely for being an idiot. Yesterday, I finally signed up for a Turkish class even though it’s a ten week course, and even though we’re no longer going out. At least I can still enjoy learning. I really am an idiot for not doing this already.

The main lesson I’m (finally!) learning is that I’ve been an idiot because I’ve let work dominate my life. Neither Facebook nor Google made me work these hours, or worry this much, or stress about all the things that I do and have done. I let work do that to me. I’ve had enough of being an idiot.

I have a horrible feeling that once I’m well rested, de-stressed, and feel like there’s more to life than a constant grind of work, I’ll be better at my job, and happier too. I’ll have no way of dealing with not feeling like shit, but it’ll be fun to find out. I don’t know whether this will be the case for sure, but doing the same-old, same-old isn’t working well.

I’ll keep you posted.

Monorepo --- One Source Code Repository to Rule Them All

Tue, 21 Apr 2015 12:11:00 +0100

What is a monorepo? It’s a unified source code repository used by an organisation to host as much of its code as possible.

This is the pattern followed by companies such as Google, Facebook and the BBC, and is the way that I prefer to structure large scale code, as discussed in my post ruminating on codebases I have known.

I’m not entirely sure where I first heard the term but I like the way it demonstrates an intentional approach, rather than being the result of happenstance, so I use it when I can.

Of course, you need some tooling around a monorepo of any significant size. At some point, there should be more posts about that, but for now, take a look at the video of how Facebook handles this from F8 in 2015.

Android: Forking Java by Mistake

Thu, 26 Feb 2015 21:46:00 +0000

Java has been forked, and Google is the reason. Allow me to explain.

Back in the days of Cupcake and Donut, when Android was new and shiny, one of the things that made it attractive to developers was that they could use a language they were familiar with on this new platform. That language was Java, and of course the version used was a modern one. The most recent version of Java at the time was version 6.

Of course, Java moved forward. 6 was end-of-lifed in February, 2013, and version 7 is now the oldest version supported by Oracle. Java 7’s end of public updates is looming, coming as it does in April 2015. Java 7 introduced a bunch of new APIs and useful features. Some of these, such as better generics inference, are syntactic sugar and provided by the compiler, but some of these (notably “try-with-resources”) need support from the runtime. That support only appeared in Android KitKat.

Versions of Android before KitKat still account for just under 60% of the market according to Google’s own dashboards. That means that someone who wants to target as much of the Android market as possible has a choice to make: use Java 7’s fancy new features, or stick with Java 6. Android’s toolchain supports taking Java 7 bytecode, so all the syntactic sugar provided by the compiler is available, but you can’t use the features that need runtime support. Things are only going to get worse as Java 8 gets wider adoption: features such as lambdas look like they’re going to be widely used, especially as the functional paradigm becomes more widespread.

Java has a vast collection of OSS and commercial libraries out there for Doing Useful Things. If an app chooses to target Java 6, every library it depends on must also make that same choice.

This means that the Java market is now forking. Server-side Java is forging ahead, and the libraries that it uses are increasingly starting to use modern Java features, unless enough of their users ask for Android compatibility.

So, what can Google do to keep the world moving forward? How can developers use a more modern Java whilst still reaching the largest part of the Android market?

The simplest thing would be to release a shim that developers can optionally load on pre-KitKat Android. Oracle’s lawsuit about API infringement may make this a deeply undesirable route for them to follow.

Alternatively, individual developers can pack any required classes and APIs into their own apps. That seems like an error-prone way of doing things: it’s way too easy to accidentally use these new APIs, and someone who only tests on a recent device will miss the versioning problem.

Finally, I guess some benevolent third party could create the required shim, but getting widespread usage may be difficult, and it’s hardly ideal.

Once Java 8 features come into widespread use (or the use of invokedynamic gets more traction), the situation won’t be so simple. I seriously hope the big brains running Android are getting ahead of this problem — we’ll need platform support to solve this problem properly, and we’ll need it soon.

Until that support arrives, Java has been forked, with two family trees each with Java 6 as their common ancestor.

Pioneers and Settlers

Wed, 01 Oct 2014 13:27:00 +0100

I enjoyed breaking the world up into two kinds of developers so much, I thought I’d do it again.

The problems that we ask developers to solve are many and varied, but they all contain some mixture of the known and the unknown (sorry to come over all Rumsfeld this early in the post). A new project, the kind of which the company has never undertaken before, is riddled with the unknown to start with, whereas rewriting the legacy system in another language with a team who know the system and both the old and new languages is a pretty solid slab of known nastiness.

The kind of person you need for each type of problem differs. I used to think of them as “Starters” and “Finishers”, but those terms are anodyne and lack the opportunity to be grossly misinterpreted. I call them “Pioneers” and “Settlers” now.

A pioneer is the kind of person you want to tackle a problem rife with the unknown. They’re undaunted by the lack of a map, and positively enjoy the uncertainty. They tend to operate on their own, or in very small groups, and explore a problem with gusto. There’s a strong chance code will be written, discarded, and written again, many times over. They might stand at a whiteboard and argue about design, covering it with boxes, arrows and misleading labels, before deciding the best thing to do is to build both approaches and see which one works.

But they get the job done. And once they know that the unknowns have been worked on, reduced to a manageable level and understood, they lose interest. The thing that drives them is taking a challenge that no-one else has overcome and showing that it’s not really _that_ hard.

A pioneer is just the kind of person you want to get a project off the ground. And then you probably want them off the team. They’ve served their purpose, and now they’re going to look for trouble.

What you want after the original skeleton is in place are Settlers. They take the rough trail laid out, and they make it habitable, maintainable and a Nice Place To Be. The problems that they solve tend not to be the “what the heck are we trying to do?” ones, but the “how the heck are we going to make this work?” ones. They’re qualitatively different types of question, with very different challenges.

It’s entirely possible that the pioneer, hotshot iconoclast that they were, has blazed a trail through the least pleasant route. They were only interested in getting things working, not doing it the best way possible. The settlers may be forced to throw out everything that the pioneer has done, leaving just the faint whiff of the original scheme in the air. I suspect that they do this more than most teams would like to admit.

It is vital to note that the challenges and design questions a settler faces are no less taxing than those faced by pioneers; they’re just different.

The pioneer finds “what” satisfying, and the process of solving “how” an anathema once there’s working code. A settler finds iterating on the “how” a deeply rewarding experience, but the “what” may not hold much interest at all.

Of course, it depends on the project to determine what the correct mix of pioneers to settlers is. Sometimes, you just need a small team of pioneers. Sometimes you need a room full of settlers. Sometimes you need to start with pioneers, and then replace them with settlers. It’s okay. The types of problems that need to be solved on a project vary over time. If they didn’t, software development wouldn’t be this much fun.

If you ever work with me, the chances are that you’ll find I like the pioneer work an awful lot. I like to go shooting off into the darkness, meandering with glee into rough edges and emerging, triumphant and bleeding into the light having shown that, yes, yes it is possible to do something crazy. It takes a force of will for me to be a settler, and I rapidly get uneasy and unhappy when I try it.

Cavemen and Plainsmen

Fri, 26 Sep 2014 15:37:00 +0100

It is, of course, a gross overgeneralisation to say that there are two kinds of developers in the world. In this post I plan to grossly overgeneralise by discussing the two kinds of developers that are out there in the world.

Software developers like to write code, but the way that a developer likes to write code can vary wildly. Take, for example, the caveman. When given a problem, the caveman likes to retreat into their cave. There, in the dark, and away from the prying eyes of the people around them, they can take the rough stone of the basic problem and fashion it into a thing of beauty. The art of creation takes a long time, so for those of us watching from the outside, it’s as if the developer has ceased to exist.

Finally, the caveman emerges from the gloom, clutching the gorgeous artifact you’ve been waiting to see for so long. Or rather, they sometimes emerge from the gloom clutching a precious artifact. Sometimes, in the dark, they drop the rough stone they’re working on and accidentally pick up a coprolite. This fossilised turd has now been thoroughly polished and shaped. It’s perfect, but there’s no mistaking that it’s a turd.

Oh well. At least while it was happening the manager could relax and rest easy: although they had no idea what their caveman was working on, it was clear that they were working on something because they were in the cave.

Contrast this with the plainsman.

Give a plainsman a problem, and they will gleefully leap about, showing it to everyone and anyone. It’s amazing to watch one of these in action. Ideas fizz about them, and new approaches to tackle the problem are tried and discarded. You’ll know that a plainsman is working on a problem because you’ll see it. They might be vocal, they might be chatty on groups. Who knows? But you’ll see and feel the heat of creation.

Eventually, the plainsman will come off the savannah and show you the end product. It may be the glorious artifact you hoped for, or, as with the caveman, they may have become distracted and hand over some sort of steampunk turd.

Of course, in the process of doing this, they may well have given their manager a few scares and worrisome moments: progress may have appeared slow, or a dead end may have been investigated for too long. Their manager may well be ever-so-slightly balder than before. Stress does that.

Taking a step back, and only looking at the starting point and the end point, the two kinds of developer are identical. They’re both given a problem, and they both sometimes solve it, and sometimes they royally screw it up. It turns out that they both probably follow the same techniques and processes to figure out how to build the software they’re crafting. It’s just that the caveman keeps this quiet, and doesn’t like people to know how things are going until they’re done, whereas a plainsman has never managed to figure out how to turn off the noise, or has consciously dialed up the volume.

Managers may actually prefer a caveman. If the end result is going to be the same anyway, it’d be nice to have a quiet life, free of stress, so that the manager can get on with the important work of whatever it is that managers do (Gantt Charts? Going to meetings? Browsing the web? Who knows? They’re a mystery to me, much like cats).

The manager is wrong, of course. It’s infinitely preferable to work with a plainsman.

The reason is that it’s a rare developer who has to work entirely alone and isolated. They tend to work in teams (as an aside, what is the collective noun for developers? A confusion? A multi-faceted-opinion?) Within that team, there’s normally someone who needs the code that our hypothetical developer is working on. Being able to see progress allows others to prioritise their own work. And that moment, where the idea is dropped, and the turd picked up? That moment may not go unobserved in a group.

Managers know this too. I poke fun at them because I can. Not cats, though. Never poke fun at a cat.

Now, although I present this in black-and-white terms, it should go without saying (though I’ll say it anyway) that it’s a rare developer indeed who sits at one or other of these extremes. You can spot a caveman by the feedback from their peers. Things like “needs to work on communication”, or “where the did this highly polished turd come from?” A plainsman might have feedback saying that they’re noisy.

If you ever work with me, I’m a plainsman if you’re within earshot. Ask anyone who’s worked with me, that’s quite a distance. However, if you’re not on the IRC channel I’m on and out of earshot, I’m a pretty effective caveman. Which means that I should never be left on my own. Or fed after midnight. No. Hang on. That’s Gremlins, right?

“I’m not trapped in here with them. They’re trapped in here with me.”

Update: I really like the terms “caveman” and “plainsman”, mainly because I find that they’re ones that people remember easily, and which fit the premise of the analogy well, but I’m aware that they’re not gender-neutral. Suggestions for replacements have been “morlocks” and “eloi”, or “troglodytes” and “herders”, but both of those cast the caveman in a pretty negative light, which isn’t really what I want to say. “Cavedweller” and “plains-dweller” are probably the best alternatives.

A Ranty and Dogmatic Troll Masquerading as Coding Guidelines

Fri, 24 Jan 2014 13:55:00 +0000

This document represents a series of guidelines for writing code that will swiftly pass code review with my team. It doesn’t attempt to be fair. It doesn’t attempt to listen to your opinion. It does present a series of guidelines. You may choose to ignore those guidelines (after all, if they had to be obeyed, we’d have called them “laws”). We may choose to point back to these when reviewing your code. We’ll attempt to avoid being (passive) aggressive when we do so.

Hugs and cuddles.

Test first

We’re working on a framework for writing automated tests. It’s probably a good idea for us to lead by example and write some tests. It hasn’t been proven — but it’s a scientific fact — that writing tests after the fact is as boring as boring can be. So we write the tests first. This has the added advantage that we can think about the kind of functionality we want independent of how it’s implemented.

KISS, YAGNI

Or “Keep It Stupidly Simple, You Ain’t Gonna Need It”. When writing new code, just write the code you need. Don’t attempt to write some über-generic, ultra-lightweight, whizz-bang sub-framework for handling every conceivable edge case when you only need to do “this thing” once. It’s another Scientific Fact that engineers are bloody awful at spotting patterns before the patterns emerge. Wait until the third or even fourth time you repeat something before attempting to extract a common API.

A good book for this? “Refactoring to Patterns”.

Prefer composition over inheritance

Let’s admit this up front: Java’s a deeply flawed language. If I had to write this crap in vi or emacs, I’d be sitting in a corner, rocking backwards and forwards, weeping gently and pulling at my hair. Fortunately, we have IDEs, so It’s All Okay. One area where Java is flawed is that it’s really easy to inherit from a base class, and really clumsy to do proper composition. Nevertheless, as a rule of thumb: inheritance of interfaces is cool, subclassing a concrete base class is not.

Now, there will be times where it’d be Really Useful to be able to share some common functionality between things that appear to be related. For some reason, this always seems to happen with base test classes. It’s a weird fetish, and it’s one we should be disabused of. There are better ways to handle this, possibly through JUnit’s Rules, or by extracting the common functionality into its own class and just newing it up on demand.

Point is: if you’re using inheritance to share some common functionality between otherwise unrelated classes (and “it’s a test” doesn’t make them related) you’re not doing inheritance right. You are, in fact, Doing It Wrong.
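As a rough sketch of the “extract it into its own class and new it up” approach, assuming some invented names (`EventRecorder`, `LoginPageCheck`, `CheckoutCheck`), two otherwise unrelated classes can share behaviour without a common base class:

```java
import java.util.ArrayList;
import java.util.List;

// The shared functionality lives in its own small class...
final class EventRecorder {
    private final List<String> events = new ArrayList<>();

    void record(String event) {
        events.add(event);
    }

    List<String> events() {
        // Hand back a copy so callers cannot modify our state.
        return new ArrayList<>(events);
    }
}

// ...and each unrelated class simply holds one, rather than both
// extending some hypothetical "BaseRecordingCheck".
final class LoginPageCheck {
    private final EventRecorder recorder = new EventRecorder();

    List<String> run() {
        recorder.record("opened login page");
        recorder.record("submitted credentials");
        return recorder.events();
    }
}

final class CheckoutCheck {
    private final EventRecorder recorder = new EventRecorder();

    List<String> run() {
        recorder.record("added item to basket");
        return recorder.events();
    }
}
```

Neither check knows or cares that the other exists, and `EventRecorder` can change its internals without either of them needing to be touched.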

Role-based interfaces #ftw

Role-based interfaces: we like them, we use them, and we encourage other people to use them. Yes, this will mean that your classes might implement a lot of interfaces. That means that they collaborate with a lot of other classes. That tells you something. It probably tells you your class has too many roles or responsibilities.

Remember, kids, inside every fat class are at least two classes waiting to climb out.
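A small illustration of what a role-based interface might look like. All of the names here (`MessageSink`, `MessageSource`, `InMemoryQueue`, `Announcer`, `Listener`) are made up for the example:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Two narrow roles, each named for what a collaborator needs.
interface MessageSink {
    void accept(String message);
}

interface MessageSource {
    String next();
}

// One class may play both roles...
final class InMemoryQueue implements MessageSink, MessageSource {
    private final Deque<String> messages = new ArrayDeque<>();

    @Override
    public void accept(String message) {
        messages.addLast(message);
    }

    @Override
    public String next() {
        // Returns null when the queue is empty.
        return messages.pollFirst();
    }
}

// ...but each collaborator only sees the role it depends on.
final class Announcer {
    private final MessageSink sink;

    Announcer(MessageSink sink) {
        this.sink = sink;
    }

    void announce(String name) {
        sink.accept("hello " + name);
    }
}

final class Listener {
    private final MessageSource source;

    Listener(MessageSource source) {
        this.source = source;
    }

    String receive() {
        return source.next();
    }
}
```

If `InMemoryQueue` started picking up a third and fourth interface, that would be the hint that a second class is trying to climb out.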

Expose Collaborators

We take the use of Dependency Injection as an article of faith. This does not mean that we whole-heartedly embrace the need for a “DI Container”, but it does mean that we’re huge fans of exposing the collaborators of a class via its constructor (because “constructor-based DI”).
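Something like the following is what constructor-based DI tends to look like in practice. `TimeSource`, `SystemTimeSource` and `SessionTimer` are hypothetical names, and no container is involved:

```java
// The role the class depends on.
interface TimeSource {
    long nowMillis();
}

// The production implementation, named for what makes it interesting.
final class SystemTimeSource implements TimeSource {
    @Override
    public long nowMillis() {
        return System.currentTimeMillis();
    }
}

final class SessionTimer {
    private final TimeSource timeSource;
    private final long startedAt;

    // The collaborator is visible right here in the constructor,
    // so a test can hand in a fake without any lookup machinery.
    SessionTimer(TimeSource timeSource) {
        this.timeSource = timeSource;
        this.startedAt = timeSource.nowMillis();
    }

    long elapsedMillis() {
        return timeSource.nowMillis() - startedAt;
    }
}
```

Production code would pass in `new SystemTimeSource()`, while a test can pass a lambda that returns whatever time it likes.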

Having said that, “new” is not a dirty word. It’s totally pukka to use it, particularly for those handy utility classes we mentioned above. Having said that….

Utility Classes: Just Say No

A utility class represents a severe failure of imagination. They tend to become dumping grounds for only loosely related methods, which are typically static. If you have one of these, a fun exercise is to decompose it into what I like to refer to as “Objects” and use those instead. If you’re having trouble with the traditional “noun-based” approach to identifying classes, try a bit of London School TDD and see what shakes loose.
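By way of a hedged example, here is the sort of decomposition I have in mind. `NameUtils` and `PersonName` are invented; the “before” class is only there for contrast:

```java
// Before: a static dumping ground.
final class NameUtils {
    private NameUtils() {
    }

    static String initials(String first, String last) {
        return first.substring(0, 1) + last.substring(0, 1);
    }

    static String display(String first, String last) {
        return first + " " + last;
    }
}

// After: the Object those methods were really about.
final class PersonName {
    private final String first;
    private final String last;

    PersonName(String first, String last) {
        if (first == null || first.isEmpty() || last == null || last.isEmpty()) {
            throw new IllegalArgumentException("both names are required");
        }
        this.first = first;
        this.last = last;
    }

    String initials() {
        return first.substring(0, 1) + last.substring(0, 1);
    }

    String display() {
        return first + " " + last;
    }
}
```

The validation now lives in one place, and callers can no longer pass the two strings in the wrong order on every single call.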

Singletons? Static Methods? Also No

Singletons (in the traditional “implemented as a static field in a class” sense, not in the “ideally we’d only have one of these” sense) destroy our ability to have fun and write tests that can run in parallel, slashing our potential productivity. They also lead people to start using the Service Locator pattern instead of Dependency Injection, and we take DI as an article of faith (see above), mainly because it facilitates TDD by making collaborators clear, like we (also) said above.

Use strong types where possible

We’re using a high ceremony language. Might as well embrace that properly. We dislike Stringly-typed code, and we like Tiny Types. Why do we like them? Because they allow our code to express intent as clearly as possible, and we can do things like “hang behaviour” off them as the need arises.

BTW, this means that we really should never be returning “WebElement” from a Page Object. Return a class that models the thing the user would expect to be returned, even if that leads to a class with nothing but a constructor.
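A Tiny Type can be as small as this sketch. `Username` and `Greeting` are hypothetical names, and the `initial()` method is just an example of behaviour hanging off the type:

```java
import java.util.Locale;

final class Username {
    private final String value;

    Username(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("username must not be empty");
        }
        this.value = value;
    }

    // Behaviour can hang off the type as the need arises.
    String initial() {
        return value.substring(0, 1).toUpperCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Username && ((Username) other).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return value;
    }
}

final class Greeting {
    // Taking a Username rather than a String means the compiler will
    // refuse a password or a URL passed in by mistake.
    static String welcome(Username username) {
        return "Welcome, " + username;
    }
}
```

The same idea applies to Page Objects: a method returning a small class of its own tells the reader what they have got, where a raw `WebElement` tells them nothing.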

Use the most abstract type that conveys intent for variables and fields

The most abstract type explains to the reader what you do. The concrete type is how you’re going to do it. You should be able to change your mind about “how” without needing to change the “what”. So use the most abstract type that conveys intent for a variable (“Map” instead of “HashMap”, “WebDriver” instead of “DroidDriver”).

I’ve been asked whether a method should return an ImmutableSet or just a Set when it returns a set of things that is both sorted and immutable. Redundant as this question may seem, I’ll still have a bash at answering it. Return the ImmutableSet, as that conveys the intent of the return value. If the caller doesn’t care about the mutability of the set, they can assign it to a Set. Everyone’s happy.
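Here is a minimal sketch of both halves of that advice using only the JDK. ImmutableSet comes from Guava, so this example leans on `SortedSet` to make the same point about return types. `WordCounts` is an invented class:

```java
import java.util.Collections;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

final class WordCounts {
    // Declared as Map: the rest of the class only cares about "what".
    // Swapping TreeMap for HashMap changes "how" on this line alone.
    private final Map<String, Integer> counts = new TreeMap<>();

    void add(String word) {
        Integer current = counts.get(word);
        counts.put(word, current == null ? 1 : current + 1);
    }

    int countOf(String word) {
        Integer current = counts.get(word);
        return current == null ? 0 : current;
    }

    // The return type says "sorted"; a plain Set would hide that intent.
    // Callers who do not care can still assign the result to a Set.
    SortedSet<String> words() {
        return Collections.unmodifiableSortedSet(new TreeSet<String>(counts.keySet()));
    }
}
```

The returned set is a read-only snapshot, so later calls to `add` do not change a set that a caller is already holding.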

Use the right naming convention

The naming convention in our codebase is a hangover from the Android coding style, which was created by people who wrote C++ for a living (which also explains why there are so many static methods in the framework too). We don’t write C++ for a living, and using a foreign language’s coding conventions in Java code makes you look like a clown and an arsehole.

However, it appears we’re perfectly happy to present ourselves as people who find it hard to get dressed in the morning without hurting ourselves. Consequently, when writing code that’s just for us, use the coding standard of the rest of the codebase and choke back the waves of nausea. If you’re writing code that we’ll put in front of a public who believe we’re competent engineers (that is, OSS) use the Google coding standard (effectively Oracle’s, but with a two space indent).

Use the Ubiquitous Language, Luke

If everyone calls it a “Self Aggrandizing Wattle”, don’t name the class “AndroidPoweredMultiflexViewPort”. We want people to find the classes we write, and we want them to understand how they relate to other classes in the system. Using the names that people call things as class names is Totally Cool.

Also, if you’ve not done so, go and grab a copy of Domain Driven Design and attempt to wade through as much of it as you can bear. Then skip to the end and read the bit about Anti-Corruption Layers. That bit’s good.

Naming Things

Design Patterns are a means of communication, not blueprints. Similarly, the thing that makes classes interesting isn’t what pattern they happen to implement, it’s the role that they play in our system. Leave the pattern name off the class name, ok? The exception to this is the “Builder” pattern. Everyone expects the “Builder of Thing” to be called “ThingBuilder”. We might as well go with the flow on this one and buck our own contrarian ways.

Similarly, every concrete class is an implementation of something, so using the postfix “Impl” (presumably if you’re too lazy to name something properly, you’re too lazy to type “Implementation”) as a class name is a Dumb Thing To Do. Name the class for the particular thing that makes it interesting within the system, or prefix the name with “Default” if there is genuinely nothing interesting about it. Try and avoid naming the class around some obscure implementation detail that no-one using the class cares about.

BTW, it’s acceptable to append the name of the interface being implemented to the class name, but it’s better to try and name the class for the role it plays in the system.

Keep It SOLID

Single Responsibility Principle
Open/Closed Principle
Liskov Substitution Principle
Interface Segregation Principle
Dependency Inversion Principle

Presumably you’re a Software Engineer. If any of those are hard to understand, then I’d suggest using your favourite search engine to read up on them.

Document in Proper English That Which Needs Documenting

It’s a safe assumption that anyone actually reading your code is a professional software developer. Telling them stuff that they can see just by reading your code in javadocs and comments is Not Helping, so don’t do it. Use comments to explain the reasoning behind particular design decisions, or to alert people to odd corner cases that might actually need explanation.

Consider replacing one line comments of a block of code with a sensibly named method containing that block of code. After all, when it comes to maintaining this shit, only the most dedicated of developers will actually update the code and the docs.
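For instance, in this made-up `Basket` class (the discount rule is invented purely for the example), the comment in the first method turns into the method name in the second:

```java
import java.util.ArrayList;
import java.util.List;

final class Basket {
    private final List<Integer> pricesInPence = new ArrayList<>();

    void add(int priceInPence) {
        pricesInPence.add(priceInPence);
    }

    // Before: a one line comment explains a block of code.
    int totalWithComment() {
        int total = 0;
        for (int price : pricesInPence) {
            total += price;
        }
        // take 10% off orders over 100 pounds
        if (total > 10_000) {
            total = total - total / 10;
        }
        return total;
    }

    // After: sensibly named methods carry the explanation instead.
    int total() {
        return applyBulkDiscount(sumOfPrices());
    }

    private int sumOfPrices() {
        int total = 0;
        for (int price : pricesInPence) {
            total += price;
        }
        return total;
    }

    private int applyBulkDiscount(int totalInPence) {
        return totalInPence > 10_000 ? totalInPence - totalInPence / 10 : totalInPence;
    }
}
```

Both versions compute the same thing, but only the second one will still read correctly after someone changes the discount and forgets the comment.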

It’s cool to use correct grammar and punctuation. Please do so, and try and end sentences with a period cretaceous

On the Naming of Tests

Mon, 04 Nov 2013 13:32:00 +0000

I’d thought that this was part of the automated testing canon already, but apparently not, so a quick note on the naming of tests appears to be in order. Well, how I think tests should be named. :)

When using an xUnit-style framework, the common pattern is to test class Foo in another class called FooTest. Within this test class, there are several methods. The principle I like to follow is that if you took the name of the test class, stripped off the “Test” postfix, and then listed the names of the tests as bullet points, you’d end up with a list of roles and responsibilities of the class under test. You’d end up with something like:

Foo  
  * Should eat cheese  
  * Should not consider cake as cheese  
  * Should handle null cheese by throwing a SpecificException

And so on.

Put another way, if someone were to delete the class under test and the bodies of the tests, could they recreate something functionally identical to the class under test using just the test names?
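As a sketch of how those bullet points map onto method names, here is a plain-Java stand-in for an xUnit test class. `Foo.eat` is an invented method, `IllegalArgumentException` stands in for the “SpecificException” above, and the hand-rolled `check` helper is only there so the example does not depend on a test framework:

```java
// Hypothetical class under test.
final class Foo {
    boolean eat(String food) {
        if (food == null) {
            throw new IllegalArgumentException("cheese must not be null");
        }
        return food.equals("cheese");
    }
}

// Strip "Test" from the class name and read the methods as bullet points.
final class FooTest {
    void shouldEatCheese() {
        check(new Foo().eat("cheese"));
    }

    void shouldNotConsiderCakeAsCheese() {
        check(!new Foo().eat("cake"));
    }

    void shouldHandleNullCheeseByThrowingAnIllegalArgumentException() {
        try {
            new Foo().eat(null);
        } catch (IllegalArgumentException expected) {
            return;
        }
        throw new AssertionError("expected an IllegalArgumentException");
    }

    private static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("check failed");
        }
    }
}
```

With a real framework the method bodies would use its own assertions, but the names would stay exactly the same.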

Time Keeps Ticking

Mon, 09 Sep 2013 14:15:00 +0100

A note so that I never forget again: the time used by a ZipEntry instance in Java appears to keep ticking.

ZipEntry entry = new ZipEntry("foo");  
long expected = System.currentTimeMillis();  
entry.setTime(expected);  
Thread.sleep(3000);  
long seen = entry.getTime();  
  
// This fails  
assertEquals(expected, seen);

Update: it turns out that the problem is that DOS timestamps only store seconds with a precision of 2 seconds. The above could be reduced to:

ZipEntry entry = new ZipEntry("foo");  
  
// Note: we set the seconds to an odd number  
Calendar calendar = Calendar.getInstance();  
calendar.set(2013, Calendar.SEPTEMBER, 10, 12, 14, 1);  
long expected = calendar.getTimeInMillis();  
entry.setTime(expected);  
long seen = entry.getTime();  
  
// This fails  
assertEquals(expected, seen);
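One way to write a comparison that tolerates this is to compare both values at 2 second granularity. The sketch below is pure arithmetic with an invented `DosTime` helper. It assumes the stored value is rounded down, and that the local time zone offset is a whole multiple of 2 seconds so that the boundaries line up with epoch time. Exactly what `getTime()` hands back may also vary between JDK versions, so treat this as a starting point rather than a guarantee.

```java
final class DosTime {
    // DOS timestamps store seconds in 2 second steps.
    private static final long DOS_PRECISION_MILLIS = 2000L;

    private DosTime() {
    }

    // Rounds an epoch-millisecond timestamp down to a 2 second boundary.
    static long truncate(long epochMillis) {
        return epochMillis - Math.floorMod(epochMillis, DOS_PRECISION_MILLIS);
    }

    // True when both timestamps fall into the same 2 second slot.
    static boolean sameSlot(long first, long second) {
        return truncate(first) == truncate(second);
    }
}
```

In the test above, the final assertion could then become a check that `DosTime.sameSlot(expected, seen)` is true.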

On Managing Time and Direct Emails About Selenium

Mon, 22 Jul 2013 08:57:00 +0100

I tend to get a few requests a day asking for help with a problem with Selenium. I almost never reply to these. It’s not (just!) because I’m an evil minded, grumpy so-and-so, but because it’s not a great use of anyone’s time.

I am just one person. I have a full time job, and family and friends that I like to spend time with. I work on Selenium as a volunteer, and that means fitting it in where I can. Fortunately, work are understanding about this, and support my role in the project, which means I do far more than if I only had the occasional evening free. Still, it does mean that I prioritise my time spent on the project. This is how I do so:

Writing code. That comes first.
Spend time on the IRC channel. I often just lurk here, but it’s a handy way to talk to the core development team and keep an eye on what’s going on.
Answer emails to the selenium-developers group. This is where we run the project and hold design discussions. It’s not a good place to ask for help, unless that help is about implementing Selenium itself.
I scan the webdriver and selenium-users lists, answering questions where I can and provided I have time.

Now, the nice thing with this ordering is that the further down the list you go, the more people there are who are able to help you with your issue. Put another way: asking for help in the user lists means that you’re far more likely to get the help you want. It also means that if someone else runs into the same problem, Google can come to their rescue. A private email doesn’t have that benefit.

I know that may be frustrating for you. I know that it seems to make sense to contact prominent people on the project directly. I understand your particular issue is urgent and important to you. I really do, and that’s why I don’t answer your emails.

Being Part of a Distributed Team

Fri, 19 Jul 2013 13:07:00 +0100

At work I’m part of a distributed team. Two colleagues and I are based in London, and the rest of the team is in Menlo Park, California. It’s been reminding me of some lessons that I’ve learnt over the years about working as part of a team that’s spread across time zones, and I thought it might be nice to share some of them. Without further ado:

Attempt to get code reviewed by the person most familiar with the area but also by someone who’s on the same site as you. This suggests that singletons on a site of their own are a suboptimal thing. Which leads to….
… be tolerant of clean up diffs. If the local reviewer approves a change in the code review tool the chances are high it’ll be landed. The author of the code is responsible for not being a clown: if there are fundamental design decisions that are still unresolved then landing the change, even if the local reviewer is ok with it, isn’t a winning move.
Code is a plastic thing and we have source control. We can fix things up. Taking advantage of that is a Good Thing.

One useful pattern I’ve seen is to check in a failing but ignored test. It’s a lovely way of moving things forward without jamming the works, and leaves a clear trail of intent.

Another example of being good at this: the code reviews which get approved with a list of nits and changes to make before landing. I think that shows great trust in your team, and I like it. It goes without saying that if you’re using a system like Gerrit (**update:** which submits the code when it sees that a diff has been approved), then this “ok but please fix” approach won’t work as well :)

Time zones are evil but we must live with them. A great habit for the Brits to get into is to have done code reviews for USian teammates by 3pm BST. For those teammates in America, reviewing code by the British first thing when you get to the office in the morning is an extremely helpful thing to do. If you’re working with colleagues in Australia, India, China or elsewhere on that side of the globe, be aware of when they come into the office and have your code reviews done by then.

The reasoning is clear: it gives everyone as much time as possible to turn code around and get it reviewed again, as this is when the team’s hours overlap. That, in turn, helps the team as a whole move fast.

Bear in mind the hierarchy of communication:

nothing -> email group -> email -> IM -> VC -> in person

The further to the right you are, the lower the latency and the higher the bandwidth. The higher the bandwidth, the quicker misunderstandings can be resolved and design choices made.

Corollary 1: if a review is dragging on, hop on to VC, Skype or a Google Hangout, or go and chat to people.

Observation: the further to the left you are, the more asynchronous communication becomes. If time isn’t of the essence, then head left.

There’s nothing like being in the same place. Travel is bad because it means you’re away from family, your regular routines and all the things you love about home. Travel is great because you get to go to new places, hang out with the locals and get to see how other offices work. Although it’s probably more financially astute to make the smaller part of the team travel to the larger, it’s also unfair to expect the travel to always be done by one part of the team. It turns out that a sense of fairness is the thing that will keep the spirits up in the team and keep everything ticking along nicely.

I’m sure there’s more that needs to be covered. Things like scheduling meetings and alternating times so that people don’t always need to stay late or get up stupidly early, or having a useful glossary of ambiguous terms (“let’s table this discussion”) and other issues and hiccups, but I’ve been writing for a while now and this is getting long. So I’ll stop.

The UNIX Philosophy, WebDriver and HTTP Status Codes

Mon, 24 Jun 2013 21:11:00 +0100

The UNIX philosophy can be described in many ways (and the Wikipedia page has plenty), but I’ve always admired its practical application in the wealth of shell commands available to a user. Rather than having a single command that Does Everything, the UNIX shell is a place of small commands focused on doing one thing well, yet which are easy to link together.

For example, I recently needed to compare the contents of two JAR files and remove class files that were duplicated from one of those jars. I ended up generating the list of shared files via:

comm -12 <(jar tf first.jar | sort | uniq) <(jar tf second.jar | sort | uniq) | grep -v META

I doubt very much whether the authors of any of those tools thought that this is what I’d be doing, yet because the tools are carefully focused and are easy to chain together, this is a trivial thing to do.

How, you may ask, does this apply to Selenium? And specifically issue 141? For those of you who can’t be bothered to read the incredibly long list of comments on that issue (now at over 100), this is the one about being able to get HTTP status codes from the WebDriver API. The comments are split between those saying that this functionality doesn’t belong in the API, and those who (occasionally very vociferously) claim that it does.

From a philosophical perspective, the WebDriver API is attempting to model a user interacting with their browser. We attempt to limit the APIs we offer to just those that meet this need, only allowing ourselves to extend it to those very clear cases where the browser is the Source of Truth about a particular thing (such as with cookies), or where there’s no rational way to cleanly offer a facility (such as executing Javascript; incidentally, something that I spent a lot of time keeping out of the API).

HTTP status codes don’t fall into either category. The browser isn’t the source of truth about these codes, as that’s the originating web server. The user may not be aware of them either; a 404 from a .js file? That’d most likely go unnoticed. A 500 from even the main page? That may be returned as a 200 by some app servers in certain configurations.

So that leaves our users out to dry, right? Well, it would if it wasn’t for the UNIX Philosophy. You see, it’s ridiculously simple to hook up a proxy that will capture this information for the user if you can’t obtain the information by instrumenting the server. You can do it like this:

// Explain where your proxy lives  
Proxy proxy = new Proxy();  
proxy.setHttpProxy("your_proxy:8080");  
  
// Now tell the webdriver instance about it  
DesiredCapabilities caps = new DesiredCapabilities();  
caps.setCapability(CapabilityType.PROXY, proxy);  
  
WebDriver driver = new RemoteWebDriver(caps);

That’s 5 lines of code in enormously verbose Java.

Separating the concerns of “browser automation” from “logging network traffic” allows the Selenium developers (most of whom are not paid to work on Selenium) to focus on the problem of driving the browser. It means that they’re not working on writing their own HTTP proxy, which is a sufficiently taxing task that there are many projects out there working to write something solid and stable.

Great options for users looking for a powerful and capable proxy include Fiddler and Charles. Another option is the BrowserMob Proxy, which started life as a fork of the original Selenium RC codebase (Oh! The irony!) but has since matured and grown. This is amazingly simple to integrate with a WebDriver instance, as shown in their docs. For brevity, the integration can be done like so:

ProxyServer server = new ProxyServer(4444);  
server.start();  
  
// get the Selenium proxy object  
Proxy proxy = server.seleniumProxy();

Following the UNIX approach, we make it easy to use a proxy with the WebDriver API. That means that we’re not implementing an API for getting HTTP status codes in the Selenium project, not only because it’s out of scope, but also because there are already people doing a great job of offering that capability elsewhere.

Why I Care About Automated Testing

Wed, 22 May 2013 09:43:00 +0100

I was reading a blog the other day that highlighted the fact that I’ve only got a limited number of keystrokes to use up in my working life. I can use those keystrokes on anything: email, writing code, futzing around with the git command line, facebook status messages, posts on Plus, anything….

That got me thinking about why I think that testing is an important part of software development: not an afterthought, but something that’s as vital as considering API design or how to structure methods. It’s because I only have a certain number of keystrokes. I’d rather spend those working on new features and moving the bits of the world I care about forward rather than fixing bugs or chasing down regressions.

It’s undeniable that writing a test and the code itself takes more time. I’m having to write more code. I’m burning keystrokes. But each of those tests may be helping to prevent regressions, or providing insight into the structure and usage of my code. And that additional insight, and those prevented regressions, mean that cumulatively I have more time to spend hacking on features, and that’s what I love to do.

So that’s why I care about automated tests. That’s also why I think you should care too.

Speaking Engagements

Tue, 26 Mar 2013 10:26:00 +0000

One of the things that I love most about working in tech is the chances I get to speak in public and share some of the knowledge I’ve somehow managed to accumulate (even, sometimes, about work that I’ve actually done myself!)

For someone who enjoys speaking in public, I find it hard to actually promote the fact that I’ll be in places, but some friends have recently asked when and where I’ll be, so without further ado here are my next confirmed appearances:

7th-10th April: DroidCon, Berlin
23rd-24th April: GTAC, NY
2nd May: Facebook Mobile Developer Conference (a brief stint in London!)
10th-12th June: Selenium Conf (tickets on sale!) (It looks like it’ll be great fun!)

For DroidCon, GTAC and the Facebook MobDevConf, I’ll be talking about various aspects of my day job at Facebook, though each conference, because of their different focuses, will get to hear about different parts of it!

Selenium Conf is something that I always look forward to. It’s a great chance to meet the people who are using the tool that has meant so much to my professional career, and it’s also a fantastic opportunity to meet up with the rest of the selenium developer team and buy them a steak dinner. We’ve yet to have a veggie join us as a core team member, but we’ll figure out a suitable way to reward their efforts too one day! If you’re interested in becoming that first vegetarian contributor, then you can always start by contributing some code!

I’ve also recently become Facebook’s W3C AC representative, and will be attending the TPAC in November in China.

Clue

Tue, 08 Jan 2013 09:41:00 +0000

Note to self: as an OSS project becomes more successful, the level of clue on the mailing lists drops.

Corollary: a successful project will have a mailing list containing several knuckle-bitingly painful posts.

Why is this? I believe that it’s all to do with the motivation of the person emailing the list. To begin with, an OSS project only has people who are very interested in using it posting to the lists. These people tend to be technically savvy and can identify what might be causing problems. The signal to noise ratio is therefore very favourable to a sensible discussion.

As a project gets more widely used, the people using it move from those who are actively interested to the people who are sitting around them. They may not be quite so engaged, but they’re probably still relatively tech savvy. They have far less incentive or desire to dig into the code and understand why things are failing. The level of clue on the mailing lists appears to drop.

This process continues until the project is so successful that it’s a mandated part of people’s jobs. They have no choice but to use it. At this point, there is absolutely no incentive to understand why things are not working as intended, and every incentive to try and see whether someone else has already solved the problem. The level of clue in the mailing lists appears to drop through the floor.

I remind myself of this on a regular basis, and you know what? I’m okay with it too. In fact, sometimes I even smile when I see my project’s lists filled with poorly researched, ill considered emails. It’s a sign of success.

Where Computing Lives

Tue, 08 Jan 2013 09:26:00 +0000

It’s hardly an original thought, but it’s interesting to me that the location of a majority of someone’s computing power has moved from their personally controlled space to datacentres. It’s allowing companies to do some extraordinary data processing, but just surface the results to the user.

To prove my case, I cite Google Now. Which is awesome, in the traditional meaning of the word. Now, if I could just make it stop accurately telling me that it’s time to “head home” and then highlight a nearby pub.

The other thought that occurs to me is that the constraining factor on someone’s ability to utilise computational resources is going to become wireless bandwidth. I wonder whether there’s a Moore’s Law for that….

Removing Java 7 From OS X

Mon, 07 Jan 2013 18:28:00 +0000

As a follow-up from “Switching between Java 6 and 7 on OS X”, here’s how to remove Java 7 entirely:

# First the JDK
cd /Library/Java/JavaVirtualMachines/
sudo rm -rf jdk1.7*

# And now the JRE
cd /Library/Internet\ Plug-Ins/
sudo rm -rf JavaAppletPlugin.plugin

cd ../PreferencePanes/
sudo rm JavaControlPanel.prefpane

And that should be that.

Help Me Help You

Mon, 07 Jan 2013 13:18:00 +0000

You’ve finally got fed up enough with whatever your current problem is to ask for help from someone. That help could come from a number of sources, but for the sake of argument, let’s imagine that it’s a mailing list. Further, let’s imagine that it’s a mailing list that I’m on, and that I genuinely want to help you. Here’s how you can help me help you:

Tell me what it is you’re actually trying to do, at a high level. Why? Because there might just be another way of approaching the problem that’s just as valid. An example? Sure: “I want to log into gmail via the web UI.” That’s an inadvisable thing to do, especially when what you really want to do is verify that some email was sent by your system.
Remember, you’ve been pushing hard to fix this problem for a while now. You’re probably reaching out to the group because you’ve run out of ideas; either that, or you’ve no idea how to begin solving the problem and really need some pointers. In both cases, you’re loaded with context that I’m lacking. Help provide that context. Tell me why other ideas haven’t worked. Perhaps it’d help to talk to the teddy.
Information is vital. Without it, I can’t help you. A reproducible test case is one of the best ways to let me help you. Yes, yes, yes, I know that there’s absolutely no way that anyone, anywhere could possibly see your Super Sekret company intranet. You’re a smart chap. Creating a reproducible test case is not normally that hard.

Those are the key things, and they can be encapsulated as “I’m not you.” There are other things that are more likely to get me interested in helping you:

Run a spell checker across your email. Just quickly.
Read your email to yourself before sending it. You were most likely frustrated when you wrote it, and missed out something that might help provide more context to me.
Read your email aloud before sending it. You were most likely very frustrated when you wrote it. If you’ve stumbled over things and not clearly explained them, I’ll just move on.
I know you’re frustrated and annoyed. Calling the tool you’re working with “shit” or “appalling” or relying on gross over-generalisations (“clicking never works”), insults, or, better yet, combining any random two (“only a flea bitten moron would think of using this pile of crap”) isn’t a way to make friends and influence people.

And, that, my friends, is all I have to say on that for now.

Talk to the Teddy

Mon, 07 Jan 2013 13:09:00 +0000

Ever hear of “talking to the teddy”? That’s the time honoured technique of explaining a particularly troubling and hasslesome technical issue to a colleague only to realize half-way through the explanation that the answer is obvious and the problem far tamer than you thought.

You can save face and others’ time by replacing a colleague with a small teddy bear. Explain your woes to that first. Otherwise, there’s a greater than even chance I might make strange noises and sit there, staring blankly at you the first time we talk about a problem.

Switching Between Java 6 and 7 on OS X Lion

Wed, 02 Jan 2013 15:21:00 +0000

This took me too long to figure out, so I’m writing this up now.

To switch from Java 7 to Java 6 on Lion:

sudo rm /System/Library/Frameworks/JavaVM.framework/Versions/Current  
sudo ln -sf /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents  /System/Library/Frameworks/JavaVM.framework/Versions/Current

To switch from Java 6 to Java 7:

sudo rm /System/Library/Frameworks/JavaVM.framework/Versions/Current  
sudo ln -sf /System/Library/Frameworks/JavaVM.framework/Versions/A  /System/Library/Frameworks/JavaVM.framework/Versions/Current

There’s something deeply suspicious about this, but at least it works.

Ruminations on Code Bases I Have Known

Sun, 04 Nov 2012 19:24:00 +0000

Code bases, eh? Can’t live with ’em. Can’t live without ’em. Though that’s not strictly true, as many generations have seen. Wait. Hang on. This isn’t quite the articulate start I pictured in my head. Let’s have another go at this.

*ahem*

The style of architecture used for a large code base largely depends on how the third party dependencies of that code base are managed. Specifically, whether those dependencies are handled at the global level or at the team/project level.

Contentious statement out of the way. Let’s see if I can explain what the heck I just meant.

Let’s take the case of a large code base where dependencies are handled “globally”. For the sake of this discussion, let’s imagine that this means that there’s only ever one version of a particular dependency in use at any particular revision of that code base. When the code base is tiny, this approach is what teams tend towards (IME) and updating a dependency is relatively straightforward. However, as the code base grows, or more projects are rolled into it, updating a third party dependency becomes increasingly difficult and time consuming.

When a code base reaches a certain size, a choice has to be made. Should third party dependencies be handled at the project level or for the code base as a whole? The downside of handling them for the code base as a whole is that updates appear to become increasingly costly. It therefore seems reasonable to split the code base up in some way and then let the chunks handle themselves.

And this is where the problems start….

If your code base is anything like many I’ve seen, there’s a suite of utility functions and library code that’s shared between projects. These are often referred to by their code names, but everyone knows that they’re reusable components that should be (ummm…) reused by other projects in the code base (let’s call them “client projects”). The problem is that each client project is now handling its own set of dependencies, so this suite of library functions must, by necessity, only have the bare minimum number of third party dependencies (to minimise the chances of accidentally requiring multiple versions of the same library in a client project) or be more permissive about dependencies at the cost of more painful integration with client projects later.

Given the evolution of a code base, the former is preferred and the latter is what gets done, at least to start with. Put another way: by the time you realise that there’s a need for a shared set of utility functions, the helper dependencies have their tentacles well and truly wrapped round the code to share.

But we know that it’s possible to reduce the third party dependencies on core library code to a minimum. The Guava libraries, which Google uses, demonstrate this. (Pro tip: to get there, never use XML)(only kidding)(not really: hi xalan and xerces and xml-api.jar) So it’s demonstrably possible to have a tiny subset of your code base be sharable without causing a ton of grief when integrating new versions. Hadanza!

That example with the common library of common code (let’s call it Odin or something suitably grandiose) is a microcosm of the horror that awaits when reintegrating different projects that have been managing their dependencies separately for some time. Each time there’s a third party dependency shared by each project being integrated but at a different version there’s more than just code integration to do. Some parts of one or both projects need to be reworked or rewritten, which increases the amount of time and testing required to do the reintegration.

Given the pain that updating multiple lagging dependencies can cause, it’s far more likely that the client projects, once split out of trunk, will remain split out indefinitely. Which is fine, until they need to communicate.

One approach to resolving the problem of communicating between different projects at runtime is to use XML. This approach is great, unless schema validation is turned on. Then at least one side or the other is going to claim that a perfectly valid message is complete garbage. Well, nuts! So, let’s not validate our XML, but use something like XPath to pull out the bits that we find interesting (using something like Schematron perhaps, though it’s been a looong time since I looked at it). The alternative is to use something like Protocol Buffers or Thrift, which are used by Google and, given the latter’s origin, Facebook.

OK. So the various client projects now have a robust mechanism for communicating: by passing some sort of message between instances. That message may take the form of a document (recommended, as this minimises the surface area of API that needs to be agreed between client and server) or an RPC call (less recommended, as it’s all too easy to accidentally tightly bind two communicating projects together). As with all things IT-related, it’s typically easier to do the less recommended thing.

The astute reader will already be seeing that this is describing an SOA-style architecture. The astute reader would be correct. The astute reader may now have a cookie. Unless they’re in the UK, in which case they’ll have to give permission for cookies to be used. (joke)(not really). And yes, I know that expanding “SOA-style architecture” leads to a nonsensical phrase. I can live with that.

So: a large code base consisting of lots of independent projects each managing their own third party dependencies likely leads to an SOA-style architecture. Or integration through the database. I know which one I’d prefer.

But what if we stick with the original plan of managing the third party dependencies globally? This leaves us with more options (side note: I like having more options). We’ll still need some mechanism for old versions of a project within this one tree to communicate with newer versions of itself (if only across updates to the software, but also if you’re scaling horizontally), and, again, XML, protobufs, Thrift or some other data interchange format will help a lot here. So, we could also make use of SOA-style architecture, and that may be advisable.

Alternatively, given that everything is using the same third party dependencies, it’s likely to be a lot easier to share code between projects within the same tree. Depending on the rigour of code reviews, or of the tools put in place to prevent teams delving into “not public really” APIs, another style would be to just use other projects within the same tree in exactly the same way as third party dependencies.

Groovy. So we could use SOA or have a mass of tangled projects? Well, neither sounds that appealing to be honest. So which approach should be chosen?

Let’s take our thought experiment through a required update of a third party dependency (let’s imagine that not updating the library will accidentally cause a portal to Help Desk to be opened through which The Angry Users may reach you directly)

In the case of everyone being in the same tree it’s likely you’re already close to the latest release, as there’s always someone who wants that latest whizz-bang feature. If you’ve lots of dependencies — and you probably do — then an update to a third party dependency may be a well practised thing. Glibly speaking, the chances are that the update, whilst not necessarily smooth, can be managed with only minimal wailing and gnashing of teeth.

In the other world, of lots of separate projects each managing their own dependencies, things might well be very different. If each project has assiduously been updating their third party dependencies to the latest and greatest release, then things might be easier than in the case of a single, unified tree as you’re attempting to update a smaller code base. Hurrah!

However, it was the pain of these library updates that caused the code base to fracture into a world where SOA made sense. Some poor bastard is going to have to do a massive update, and that’s going to hurt. IME, the pain of an update grows non-linearly and at a multiple greater than 1 in relation to the number of intervening revisions skipped. That is, skipping one revision hurts, skipping (say) five really hurts and skipping a major version may well be justifiable cause for murder when it comes to integrating the latest and greatest into your tree (aside: that’s why people who are gung-ho about branching recommend integrating branches as frequently as makes sense). To make matters more painful, this integration is not being done at a time of the project’s choosing: it’s being crammed into an already too full release schedule. Yay! A recipe for success if there ever was one.

Taking this back to the start of this post, I hope you can now see that, in my view, how third party dependencies are handled really does have an impact on the architecture of your system.

Now, I’ve written this “in the Yegge style” (lots of words, aided by a glass or two of wine), so I won’t be offended if you all pull this apart in the comments :)

Time Off

Wed, 03 Oct 2012 20:10:00 +0100

It’s been quiet here, and it’s going to get quieter here for a bit longer: I’m taking some time off to spend with The Boy and The Wife. I’ll be back in time for the W3C TPAC. Have fun, everyone. Try not to break anything while I’m not paying attention :)

The Future of Mobile Testing

Tue, 18 Sep 2012 17:33:00 +0100

Six or so years ago, Jason Huggins and I were talking about the next generation of web testing tools. This wasn’t a conversation about the then-unreleased Selenium 1.0 or even about WebDriver, which was a new and shiny thing I was working on at ThoughtWorks. This was about the next generation of testing tools.

The fact that we can do automated testing on the Web is a happy accident. When MS and Netscape put Javascript into browsers and standardised the DOM they didn’t do so with an eye to making it easy to write tests. They wanted new and whizzy features added that only worked in their browser in a fight to win the browser wars. Each browser implemented the features of the other and then added more in a bid to gain the edge. The fact that it was possible to build something like Selenium on top of this work was never meant to happen.

That wasn’t the conversation that Jason and I were having. We were talking about what the next generation of testing tools would use; the ones that would make selenium and webdriver totally redundant. It was obvious to both of us that accessibility APIs would be The Way Forward. After all, users with one form of disability or another make up a small, but important, percentage of web users. Their equal access to information and applications is enshrined in law. Not only is making an app accessible a groovy and lovely thing to do, it’s also often a legal requirement.

The next generation of tools, we reasoned, would build upon this accidental automation infrastructure in the same way that we used the DOM and JS: to provide an API that can be used to drive and query an application from outside that application.

Microsoft led the way, when .NET 3.0 contained an API called UI Automation. I was working on a project with Mike Two, who hacked together a proof of concept against the desktop app we were working on before flying to India to be closer to the dev team there. Some time later, White appeared, which took the concepts and followed through. Brilliant stuff.

Then it went quiet.

Until, that is, the mobile revolution started. For an amazing number of users, their primary contact with the Web will be a mobile device, probably either Android or iOS. The problem is that neither of these platforms has “making it easy to write an automated end to end test” baked in as a concept. Increasingly, however, they do provide the keys to accidental testability: their accessibility frameworks, which are often called something useful like “UI Automation”.

All this means that the next generation of tools is coming.

There is, however, a missing piece. We have the WebDriver APIs for testing web-based content, and for testing native content we have the accessibility APIs (which can be wrapped to look webdriver-ish if desired)(and I think it is desired)(but I’m biased). But how do these two gel? How do we test a “hybrid” app, composed of both native and web-based content? In my view, this gap can best be bridged in two ways: by augmenting the accessibility API so that a webdriver instance can be returned from any WebViews found via the accessibility APIs, and by allowing returned WebElement instances to also implement the equivalent of UIAElement, so that they can be the target of OS-level simulated user inputs.
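To make the shape of that proposal a little more concrete, here is a minimal sketch in Java. Everything in it is hypothetical: NativeElement, HybridWebElement, WebSession and WebViewElement are names invented for illustration (they are not part of WebDriver or of any platform's accessibility API), and the fake web view exists only so that the sketch runs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a node in the native accessibility tree (a "UIAElement" equivalent). */
interface NativeElement {
    String label();
    void tap(); // an OS-level simulated user input
}

/** Hypothetical: a web element that is also a valid target for OS-level input. */
interface HybridWebElement extends NativeElement {
    String tagName();
}

/** Hypothetical, heavily cut-down stand-in for a WebDriver session scoped to one web view. */
interface WebSession {
    HybridWebElement findElementById(String id);
}

/** Hypothetical: a native node hosting web content can hand back a session for that content. */
interface WebViewElement extends NativeElement {
    WebSession webSession();
}

final class HybridAppSketch {
    /** Records the OS-level inputs that the fake elements receive. */
    static final List<String> inputLog = new ArrayList<>();

    /** A fake web view containing a single fake button, just enough to exercise the interfaces. */
    static WebViewElement fakeWebView() {
        HybridWebElement button = new HybridWebElement() {
            public String label() { return "Submit"; }
            public String tagName() { return "button"; }
            public void tap() { inputLog.add("tap:Submit"); }
        };
        return new WebViewElement() {
            public String label() { return "Checkout web view"; }
            public void tap() { inputLog.add("tap:Checkout web view"); }
            public WebSession webSession() {
                // Only one element exists in this fake page; anything else is "not found".
                return id -> "submit".equals(id) ? button : null;
            }
        };
    }

    public static void main(String[] args) {
        WebViewElement webView = fakeWebView();                                    // found via the accessibility tree
        HybridWebElement submit = webView.webSession().findElementById("submit");  // located web-style
        submit.tap();                                                              // driven with a native-style input
        System.out.println(inputLog);
    }
}
```

The point of the sketch is the type relationships rather than the fakes: the web view is an ordinary native element that can also produce a web session, and the elements that session returns are themselves native elements.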

There. Problem solved.

What My Leaving Google Means for Selenium

Sun, 16 Sep 2012 15:54:00 +0100

TL;DR: nothing.

Longer version:
One of the questions I’ve received most since announcing that I was leaving Google was “what does this mean for the Selenium project?”

My answer is “not much”. The project itself is Open Source, and more than half the commits are coming from developers outside of Google. These developers range from people working at other browser vendors (notably Mozilla and Opera) to people who are just interested in the project and write amazing code. Those people aren’t going away. The OSS project has already demonstrated that if I’m not around things still get done (though the releases slow waaaay down :), but I’ll still be reading every commit and still contributing where I can. Better still, I’m just one person, and the project is vibrant and humming with activity.

What about the browser vendors? One obvious impact might be that the ChromeDriver stops moving forward. That’s deeply unlikely to happen: the chrome driver is maintained as part of the chromium OSS project by members of the Chrome team itself, rather than the team I was TL of (Browser Infrastructure at the last name change). This is also true of the OperaDriver, which is maintained by Opera Software, and will be true of the FirefoxDriver once the Marionette project is available on release builds. The trend we’ve been encouraging is that browser vendors should be responsible for their drivers (given the complexity of the task, this is the best way to ensure that what our users want is what our users can actually do), and the browser vendors are rising to the challenge.

Of course, this does raise the question of what my team at Google does. The answer is “a heck of a lot”. Google has made a massive investment in browser automation. A lot of that investment is visible in the contributions to the Selenium project (where most of the team have earned the commit bit), but it can also be seen in other projects such as Wicked Good XPath, Web Puppeteer and the Browser Automation Atoms. The team also works on integrating these APIs with Google’s infrastructure, and providing support and guidance to teams, and they’re constantly striving to make writing web tests so stable and easy that even a software engineer can write them :) I may have left that team, but that investment continues unabated.

Sotto voce: if you’d like to join that team, I can pass on your CV….

Which leaves my involvement with the W3C spec. It’s true, over the next couple of months I intend to spend a lot of time with my family, but I’m also planning on spending time working on the spec. Once I join Facebook, I fully intend to continue co-editing it. That work is going to continue.

So overall, my take is that my leaving Google isn’t going to have an appreciable effect on the OSS project or the spec.

Which is nice.

Hidden Options in javac

Sat, 25 Aug 2012 17:07:00 +0100

You probably already know that the default Java compiler (javac) will print a list of its options if you run it as “javac -help”. You can also get the extended list of Oracle specific command switches using “javac -X”. Until today, I’d not realised that javac has even more command line switches, called “hidden options”. It’s possible to pass them using a syntax like so: -XD-hidden-option-name.
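As a small illustration, the sketch below passes one hidden option to javac through the standard javax.tools API. The option used, -XDrawDiagnostics, is one that OpenJDK's own compiler tests rely on: it makes javac print message keys (such as compiler.err...) instead of localized text. The HiddenOptionDemo class and the deliberately broken source are made up for the example, and because hidden options are unsupported, the behaviour may differ between JDK releases.

```java
import java.io.StringWriter;
import java.net.URI;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class HiddenOptionDemo {

    /** Compiles one in-memory source file with the given javac options and returns the compiler output. */
    static String compile(String className, String source, List<String> options) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null when no compiler is available
        if (javac == null) {
            throw new IllegalStateException("No system Java compiler available; run on a JDK");
        }
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        StringWriter output = new StringWriter();
        // With no diagnostic listener supplied, javac writes its diagnostics to the given writer.
        javac.getTask(output, null, null, options, null, List.of(file)).call();
        return output.toString();
    }

    public static void main(String[] args) {
        // Deliberately broken source, so that the compiler has something to report.
        String source = "class Broken { int x = undefinedName; }";
        System.out.println(compile("Broken", source, List.of("-XDrawDiagnostics")));
    }
}
```

Running the same compile without the option should print the usual human-readable "cannot find symbol" style message instead, which makes the effect of the hidden switch easy to see.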

Now, I wonder if ecj has the same sort of thing. It’d be handy….

Update: it doesn’t look like it. *unhappy face*

Ant's JUnit is a Horse's Arse

Sun, 12 Aug 2012 04:37:00 +0100

There are insightful posts, and then there’s this….

There’s a story “somewhere on the Web” that argues that the size of the boosters for NASA’s shuttle was limited by the smallest train tunnel that they needed to pass through. That tunnel was slightly wider than a train track, and, ultimately, the width of that train track was determined by the width of a horse’s arse.

I was reminded of this in a meeting recently when someone asked about how continuous build systems provide results of tests to users. Almost all of them do so by looking for XML files generated by JUnit. Specifically, they look for those files in the format generated by Ant’s JUnit runner. It doesn’t matter how sophisticated that continuous build server is, it still needs to read files in that format. In a way, the JUnit runner in Ant is a horse’s arse too.
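For anyone who hasn't looked inside one of these files, here is a rough sketch of what reading one involves, using only the JDK's built-in XML parser. The sample report is made up (the suite and test names are illustrative), and it shows only the commonly used parts of the format: a testsuite root element containing testcase elements, each of which may hold a failure or error child.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class JUnitReportSummary {
    /** A made-up report in the Ant JUnit XML style; the names and timings are illustrative. */
    static final String SAMPLE =
            "<testsuite name='com.example.LoginTest' tests='3' failures='1' errors='0' time='0.42'>"
          + "<testcase classname='com.example.LoginTest' name='acceptsValidPassword' time='0.10'/>"
          + "<testcase classname='com.example.LoginTest' name='rejectsEmptyPassword' time='0.02'>"
          + "<failure message='expected rejection' type='java.lang.AssertionError'>stack trace</failure>"
          + "</testcase>"
          + "<testcase classname='com.example.LoginTest' name='locksAfterThreeAttempts' time='0.30'/>"
          + "</testsuite>";

    final String suite;
    final int tests;
    final int failures;
    final int errors;

    JUnitReportSummary(String suite, int tests, int failures, int errors) {
        this.suite = suite;
        this.tests = tests;
        this.failures = failures;
        this.errors = errors;
    }

    /** Counts test cases, failures and errors by walking the testcase elements of one report. */
    static JUnitReportSummary parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            NodeList cases = root.getElementsByTagName("testcase");
            int failures = 0;
            int errors = 0;
            for (int i = 0; i < cases.getLength(); i++) {
                Element testCase = (Element) cases.item(i);
                if (testCase.getElementsByTagName("failure").getLength() > 0) failures++;
                if (testCase.getElementsByTagName("error").getLength() > 0) errors++;
            }
            return new JUnitReportSummary(root.getAttribute("name"), cases.getLength(), failures, errors);
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable JUnit-style XML report", e);
        }
    }

    public static void main(String[] args) {
        JUnitReportSummary s = parse(SAMPLE);
        System.out.println(s.suite + ": " + s.tests + " tests, " + s.failures + " failures, " + s.errors + " errors");
    }
}
```

The sketch counts the child elements rather than trusting the attributes on the root element, which is a design choice of this example and not necessarily what any particular build server does.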

Routes to Faster Java Compilation

Sat, 28 Jul 2012 18:39:00 +0100

My recent post comparing the compilation speed of the ecj batch compiler and javac hints at the topic that I’ve been starting to explore in my copious free time: how to get a faster build out of a java project. This matters to me a lot, since I spend the majority of my day (the part not spent answering emails) coding in Java for an OSS project. Here’s where I’m at now:

Develop as much as possible in the IDE, including running tests. If there’s some way to avoid doing a command line build, then avoiding it allows faster progress. On the Selenium project there’s a reasonable amount of effort invested to make this possible. It’s mostly successful, but not as fast as I’d like because…
Minimize the dependencies in your code base. The IDE based test runs I do often need to shell out to build things like firefox extensions. That kills the performance of the build and makes doing end-to-end testing far less pleasant.
Delete dead code. The less there is to compile, the better.
Evaluate the tool chain. Past assumptions don’t always hold true, so revisit them from time to time.
Going from Java to a JAR is almost always the right approach, particularly if you’re dealing with a large number of files. Running “stat” on each java file and class file and output JAR file is often less efficient than running “stat” on just the java files and the output JAR. However…
Use an SSD. Compiling code is an exercise in small and random reads and writes. A spinning platter disk isn’t the best choice for this. I need to rerun my checks to see if the statting of class files is now worth the extra effort.
Get more memory. Your OS will cache things pretty aggressively, but hitting swap will murder your build times.
More memory and an SSD offset many of the disadvantages of a slower CPU, particularly in a single-threaded build.
Build in parallel. If you don’t, then make sure the clock speed of your CPU is as high as possible.

Now, I’d love to say that I do all of these things, but I can’t. In particular, I don’t build in parallel, which is a real waste of 7 of the cores in my personal machine (or even more for my machine at work). The main reason for that is that the build tool that I use (rake) has fairly blunt support for running in parallel and the layers of abstraction wrapped around it make figuring out when to use multitasks harder.

The thing that sticks with me the most, however, is that compilation speeds are fastest when the compiler only has a few files to deal with. With Java, that leads to having lots of small targets rather than one massive glob of the entire file system. Or it would if whether a JAR had changed were determined by something other than the last modified time of the file. There’s an additional wrinkle: how do you prevent a small change in a method (say adding a logging statement) from causing a complete recompilation of everything that depends on the JAR that contains that method? Hmmm… I wonder….
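One candidate for “something other than the last modified time” is a fingerprint of the file’s content. The sketch below is only an illustration of that idea, not something rake or the Selenium build does: ContentFingerprint is a made-up helper that hashes a file with SHA-256, so that touching a file leaves the fingerprint alone while editing it changes the fingerprint.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ContentFingerprint {

    /** Returns the SHA-256 digest of the given bytes as a lowercase hex string. */
    static String of(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** Fingerprints a file by its content, ignoring its timestamps entirely. */
    static String of(Path file) {
        try {
            return of(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = Files.createTempFile("library", ".jar"); // stand-in for a build output
        Files.write(jar, "version 1".getBytes(StandardCharsets.UTF_8));
        String before = of(jar);

        // Touch the file: the timestamp moves, the content does not.
        Files.setLastModifiedTime(jar, FileTime.fromMillis(System.currentTimeMillis() + 60_000));
        System.out.println("unchanged after touch: " + before.equals(of(jar)));

        // Change the content: now the fingerprint differs.
        Files.write(jar, "version 2".getBytes(StandardCharsets.UTF_8));
        System.out.println("unchanged after edit: " + before.equals(of(jar)));
        Files.delete(jar);
    }
}
```

One caveat: JAR entries carry their own timestamps, so a rebuilt JAR with identical classes can still hash differently unless those timestamps are normalised. A content hash also does nothing for the logging-statement wrinkle, since any change to a method body changes the bytes.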

Compilation Speed of ecj vs javac

Tue, 24 Jul 2012 13:44:00 +0100

For the longest time, I’ve taken it as an article of faith that the eclipse compiler (ecj) was significantly faster than the default Sun java compiler (javac). The last time I measured this properly was when I compared the Java5 compiler against the equivalent ecj release. After a recent conversation, I realised that I was still making the same assertion, despite over two years having elapsed.

So, time to measure again. And this time, the results surprised me. What used to be a whitewash for ecj has now changed to a slight edge for javac.

The test was done by repeating a build of the java components of the selenium codebase 20 times on an OS X machine running Lion and the latest released version of Java 7 from Oracle. The machine itself has 12GB of RAM and the development wasn’t being done on an SSD. The results were:

javac: avg. 29.9s with a standard deviation of 0.80s
ecj: avg. 31.1s with a standard deviation of 0.82s
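To put a number on “slight edge”, the figures above can be run through a quick calculation. The sketch below makes some assumptions that the post doesn’t state: that each compiler was timed 20 times, that the runs were independent, and that the reported deviations are sample standard deviations. BuildTimeComparison is a made-up name for the example.

```java
final class BuildTimeComparison {

    /** How much slower the second mean is than the first, as a percentage of the first. */
    static double percentSlower(double fasterMean, double slowerMean) {
        return 100.0 * (slowerMean - fasterMean) / fasterMean;
    }

    /** Welch's t statistic for two independent samples, computed from their summary statistics. */
    static double welchT(double mean1, double sd1, int n1, double mean2, double sd2, int n2) {
        double standardError = Math.sqrt(sd1 * sd1 / n1 + sd2 * sd2 / n2);
        return (mean2 - mean1) / standardError;
    }

    public static void main(String[] args) {
        // javac: 29.9s (sd 0.80), ecj: 31.1s (sd 0.82), assumed 20 runs each
        System.out.printf("ecj slower by %.1f%%%n", percentSlower(29.9, 31.1));
        System.out.printf("Welch t = %.2f%n", welchT(29.9, 0.80, 20, 31.1, 0.82, 20));
    }
}
```

Under those assumptions ecj comes out about 4% slower, with a t statistic of roughly 4.7. That is large enough that the gap is unlikely to be run-to-run noise, even though it is small in absolute terms.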

Hadanza! First Public Working Draft

Sun, 22 Jul 2012 21:16:00 +0100

Hadanza! The webdriver spec reached First Public Working Draft on the 10th of July, 2012. Now all we need to do is finish it :)

/me does happy dance.

Explaining the Speed of Selenium Updates

Sun, 22 Jul 2012 20:49:00 +0100

With the release of Selenium 2, we moved from incredibly slow updates and releases to far more regular ones. We were aiming for weekly releases, but have settled into something that averages between two and three weeks. That’s a pretty fast update schedule. Why do we do it?

The first reason is that there are a lot of checkins every week on the project. Last week, which admittedly was pretty busy, saw 94 separate revisions in the 7 days between the 14th and the 21st of July 2012. The previous week was quieter, with 84 checkins. Each of these checkins represents either a refinement of existing functionality, a bug fix or (increasingly rarely) a new feature. We’d like to get those features and fixes in front of the users as quickly as possible.

The second reason is that we need to continue working with the latest and greatest versions of browsers that are out there. Although Internet Explorer is on a comparatively leisurely yearly release cycle, Chrome and Firefox release a new major version once every six weeks. Six weeks! Each of these releases tends to call for a new selenium release too.

One of the nice side-effects of the design of the webdriver APIs is that we can, in theory, decouple the release cycle of the individual browser drivers from the release cycle of the client libraries. Since the Chrome team maintain the chromedriver, this has been the case for a long time with Chrome (hurrah!). Opera were the ones that blazed this particular trail, and the opera driver releases are independent too. With the next release of Selenium, this will also be true for IE, across all client languages.

Which leaves Firefox.

If we were happy with only using synthesized events, we could relax a little with the selenium releases. The problem is, we’re not happy with only using synthesized events, and neither are our users. Right now, the OS-level events are fired using a binary component built by the selenium project. The way that these components work means that we need to do a new release for every major number release of firefox. Back in the day, this meant “almost never”. Today, that means “once every six weeks”. It’s tiring for us. I’m sure it’s tiring for Mozilla. It’s definitely tiring for our users.

We’re working with Mozilla to try and get a mechanism for injecting events at the OS level built into Firefox itself. Once this mechanism is in place, we should be able to slow our release schedule a little (though we’d still like regular releases in order to get bug fixes in front of you).

The other thing that Mozilla are working on is the Marionette project. This is an implementation of the webdriver APIs embedded within Firefox (and mobile, and Firefox OS) and maintained by Mozilla themselves. Once this is in place, the selenium project will be able to completely decouple releases of the client libraries from the browser drivers.

This decoupling should be a Good Thing for users. For a start, it’ll be possible to do far more focused updates. For example, let’s imagine that we discover and fix a bug that only affects IE users. We could release a new IE driver server as soon as the fix is in and users can grab that, without also needing to grab a new version of the client libraries (or the Firefox driver….)

Oh well. We’ll see.

Getting the show back on the road

Sun, 22 Jul 2012 20:27:00 +0100

My old domain name — pubbitch.org — lapsed due to some pretty unfortunate circumstances. It’s taken me longer than I wanted to get a new domain name and set up a site, but I took a few hours out today to get something basic up and running.

It’s actually been remarkably painless. I’m using Google services for basically the entire thing, with the main site being hosted on App Engine (well, I say “main site”, but it’s actually just an index page right now) and this blog on Blogger. I’ll be ironing out the rough edges as I go along, but this should be my new home on the Net from now on.

Yay!

Working on Specs

Tue, 03 Jan 2012 16:24:00 +0000

Well, that makes it real:

http://dvcs.w3.org/hg/webdriver/raw-file/d2a77f663f2e/webdriver-spec.html
