Code and Coordinates
engineering at Geocodio
Everything we got wrong about UK addresses
For twelve years, Geocodio has turned addresses into coordinates and back again, first in the US, then Canada, then Mexico. We process about 2 billion lookups a month and maintain a database of more than 300 million address points in the US alone. Along the way we'd met postal codes with a space in the middle, French-language addresses in Quebec, Spanish-language addresses in Mexico, and street names that arrive before the house number. I'm Danish, and I grew up writing my address with the number after the street, so I knew the American way of writing an address was only one way of writing an address, not a law of nature. In short… we thought we understood addresses.
Then we started working on UK support.
The country kindly spent the next several months showing us how much thought it had put into addressing, and how little of it we truly understood. Some of this surfaced as bugs we had to fix. More of it made us admire the system we were plugging into. This post is the tour.
The UK has one of the most precise, most thoughtfully engineered addressing systems in the world. We showed up with a parser that had already learned to bend for three countries, and most of it held up fine. But the UK runs on a handful of ideas that have no North American equivalent, and each one landed on an assumption we hadn't thought to question.
We learned a lot about addresses along the way.
Here is what Britain taught us.
"Every address has a house number"
Our entire geocoding engine is organized around the idea that an address is a number on a street.
Then we met addresses like this:
Rose Cottage, High Street, Teddington
The Old Barn, Rectory Lane, WR6 6TN
And… no number.
And none needed, either.
In the UK, a building name is not a nickname or a vanity plate. It is the address, recognized by Royal Mail and present as a first-class field in the national address data.
Our parser's first attempt at "The Old Barn" was to quietly fold "Old" into the street name and throw the rest away. (It did this with great confidence.) We had to teach it that the first part of a UK address might be the name of a building, and that the name is load-bearing.
And it goes further: about 2.4% of British addresses have a building name and a postcode and no street at all. "Spyders Castle, WR6 6TN" is a complete, deliverable address. Our US-trained engine looked at that the way you'd look at a sentence with no verb.
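To make the idea concrete, here is a minimal sketch of a parser that treats a leading building name as a real field. It is not our production parser: the ParsedAddress fields, the parse_uk_address function, and the loose postcode pattern are all made up for illustration, and it only copes with tidy, comma-separated input.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Loose UK postcode shape; an assumption for this sketch, not a full validator.
POSTCODE_RE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}$", re.IGNORECASE)


@dataclass
class ParsedAddress:
    building_name: Optional[str] = None
    number: Optional[str] = None
    street: Optional[str] = None
    locality: Optional[str] = None
    postcode: Optional[str] = None


def parse_uk_address(text: str) -> ParsedAddress:
    """Split a comma-separated UK address, keeping building names intact."""
    parts = [p.strip() for p in text.split(",") if p.strip()]
    parsed = ParsedAddress()

    # Peel a trailing postcode off first, if there is one.
    if parts and POSTCODE_RE.match(parts[-1]):
        parsed.postcode = parts.pop().upper()
    if not parts:
        return parsed

    # A leading number means a classic "number + street" line.
    numbered = re.match(r"^(\d+[A-Za-z]?)\s+(.+)$", parts[0])
    if numbered:
        parsed.number, parsed.street = numbered.group(1), numbered.group(2)
        rest = parts[1:]
    else:
        # No number: the whole first component is the building name.
        parsed.building_name = parts[0]
        rest = parts[1:]
        if rest:
            parsed.street, rest = rest[0], rest[1:]

    if rest:
        parsed.locality = rest[-1]
    return parsed
```

With this sketch, "Spyders Castle, WR6 6TN" comes back with a building name, a postcode, and no street, which is exactly the shape our US-trained engine could not hold. It would still misread "Rose Cottage, Teddington" by taking the town for a street; telling those apart needs real place data, not a regex.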
The upside is that this work travels. Building-name addressing is not a British quirk so much as a British specialty, and the handling we built for it should help us in places like Puerto Rico, where US mainland-style numbered addresses are not a given either.
A bug named after a cottage today is infrastructure for somewhere else tomorrow.
"The town in your address is the town you live in"
This one is my favorite, because it sounds like it can't be true.
In the UK, the town in your postal address is the "post town": the town whose delivery office serves you. It is a fact about how your mail travels, not about where you live. If you live in the village of Whitgift in East Yorkshire, your address says GOOLE, the nearby town that sorts your post. People in Kingswood, a town with its own identity and its own town centre, get mail addressed to BRISTOL.
So a UK address contains two town concepts that only sometimes agree: the place you would say you live, and the place Royal Mail routes your letters through. In the national address data, these are two separate fields, and they differ for roughly 17% of postcodes.
We had to rebuild our UK place lookup to store both, so that geocoding "Kingswood" and geocoding "Bristol" both land where the user expects.
This destroyed our geocoder's assumption that each address belongs to exactly one city, and we had to model address-to-city as a one-to-many relationship: one address, more than one valid town name.
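A rough sketch of what storing both can look like, assuming a simple in-memory index. The UkPlace and PlaceIndex names are hypothetical, and the postcode in the usage note is a made-up placeholder, not real data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class UkPlace:
    postcode: str
    locality: str   # where residents would say they live
    post_town: str  # town whose delivery office serves the postcode


class PlaceIndex:
    """Index each place under both of its town names."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Set[UkPlace]] = defaultdict(set)

    def add(self, place: UkPlace) -> None:
        # Using a set of names avoids double-indexing when both fields agree.
        for name in {place.locality, place.post_town}:
            self._by_name[name.casefold()].add(place)

    def lookup(self, name: str) -> Set[UkPlace]:
        return self._by_name.get(name.casefold(), set())
```

Adding UkPlace("ZZ1 1ZZ", "Kingswood", "BRISTOL") to the index makes both lookup("Kingswood") and lookup("Bristol") return the same record, which is the behavior a user typing either name would expect.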
"A postcode is like a ZIP code"
They look similar. But it turns out, they are not remotely the same thing.
A US ZIP code covers a neighborhood, sometimes a whole town. ZIP codes are basically the equivalent of the post town in the UK: each post office has its own ZIP code, and that's where all of the mail for that area is sorted. The US has about 41,000 ZIP codes for 340 million people.
By contrast, the UK has roughly 1.8 million postcodes for 68 million people, which works out to a different unit of measurement entirely: a full UK postcode is around 15 letterboxes. Not a town. More like a handful of front doors.
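The arithmetic behind that comparison, using only the rounded figures quoted above, comes out to roughly 8,300 people per ZIP code against about 38 people per postcode:

```python
# Rounded figures as quoted in this post.
US_ZIP_CODES = 41_000
US_POPULATION = 340_000_000
UK_POSTCODES = 1_800_000
UK_POPULATION = 68_000_000

people_per_zip = US_POPULATION / US_ZIP_CODES
people_per_postcode = UK_POPULATION / UK_POSTCODES

print(f"People per US ZIP code: {people_per_zip:,.0f}")
print(f"People per UK postcode: {people_per_postcode:,.1f}")
print(f"Ratio: about {people_per_zip / people_per_postcode:,.0f}x")
```

A few dozen people per postcode is consistent with a unit of around 15 letterboxes once you allow for more than one person per household.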
The closest US equivalent is ZIP+4, which narrows things down to that same block-face precision. But ZIP+4 was built for mail-sorting machines, gets appended by software, and shows up on roughly zero envelopes addressed by a human.
The UK took that precision and made it the part everyone memorizes. The reason it's memorable is the reason it's useful: a postcode like WR6 6TN is two meaningful halves. The outward code (WR6) points to a postal area and district, roughly a town or a slice of a city. The inward code (6TN) narrows that to a sector and then a unit of around 15 neighboring addresses. Six or seven characters carry you almost to the doorstep. Add a house number and you have, for most purposes, a complete address.
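Here is a small sketch of that structure in code. The PostcodeParts type and split_postcode function are illustrative names, not part of our API, and the pattern checks only the general shape of a postcode, not whether it exists.

```python
import re
from typing import NamedTuple


class PostcodeParts(NamedTuple):
    outward: str   # e.g. "WR6"
    inward: str    # e.g. "6TN"
    area: str      # letters at the start of the outward code, e.g. "WR"
    district: str  # the full outward code, e.g. "WR6"
    sector: str    # outward code plus the first inward digit, e.g. "WR6 6"
    unit: str      # final two letters, e.g. "TN"


def split_postcode(postcode: str) -> PostcodeParts:
    """Break a UK postcode into its nested parts (shape check only)."""
    compact = re.sub(r"\s+", "", postcode).upper()
    # The inward code is always the last three characters.
    outward, inward = compact[:-3], compact[-3:]
    out_match = re.fullmatch(r"([A-Z]{1,2})(\d[A-Z\d]?)", outward)
    if not out_match or not re.fullmatch(r"\d[A-Z]{2}", inward):
        raise ValueError(f"not shaped like a UK postcode: {postcode!r}")
    return PostcodeParts(
        outward=outward,
        inward=inward,
        area=out_match.group(1),
        district=outward,
        sector=f"{outward} {inward[0]}",
        unit=inward[1:],
    )
```

For WR6 6TN this yields area WR, district WR6, sector "WR6 6", and unit TN: each field is a narrower slice of the same place.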
That changes what a postcode is for. In a US address, the ZIP is a routing hint, and a geocoder does fine without it. In a UK address, the postcode is the closest thing there is to the address itself, and everything else is elaboration. It's the index.
Ask a British person for their postcode and you get it instantly, correctly formatted, because they use it the way the rest of us use a street name.
Ask an American for their ZIP+4 and you get the first five digits and a shrug.
We learned how load-bearing it is from our own response times. While testing, "215 King Street, Aberdeen" with no postcode took around two seconds to resolve, because the engine had to weigh every King Street across a wide area (Britain has a great many King Streets). The same query with a postcode came back in about 80 milliseconds. Leave the postcode out and you've asked a much harder question.
We also shipped a bug where WR6 6TN failed while WR66TN worked, because our normalization didn't expect the space to matter. It does: the space splits the outward code from the inward code, and Britain writes it so consistently that the space is effectively part of the spelling. We already handle Canadian postal codes, which are always the same shape: six characters split by a single space. UK postcodes, by contrast, aren’t always the same number of characters, and the first part can be anywhere from two to four characters.
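The difference between the two formats is easy to see side by side. This is a sketch, not our actual normalization code: the function names are hypothetical and the patterns only check shape. The trick for the UK is to count from the right, because the inward code is always three characters while the outward code varies.

```python
import re


def normalize_gb_postcode(raw: str) -> str:
    """Return the canonical spaced form, whether or not the input had a space."""
    compact = re.sub(r"\s+", "", raw).upper()
    if not re.fullmatch(r"[A-Z]{1,2}\d[A-Z\d]?\d[A-Z]{2}", compact):
        raise ValueError(f"not shaped like a UK postcode: {raw!r}")
    # Outward code is 2 to 4 characters, so split three from the end.
    return f"{compact[:-3]} {compact[-3:]}"


def normalize_ca_postal_code(raw: str) -> str:
    """Canadian codes are fixed-width, so the split point never moves."""
    compact = re.sub(r"\s+", "", raw).upper()
    if not re.fullmatch(r"[A-Z]\d[A-Z]\d[A-Z]\d", compact):
        raise ValueError(f"not shaped like a Canadian postal code: {raw!r}")
    return f"{compact[:3]} {compact[3:]}"
```

Normalizing both "WR6 6TN" and "WR66TN" to the same string before any lookup is the kind of step that would have caught our bug.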
"'Fl' means floor"
Here is a two-letter abbreviation that cost us real engineering time.
Our address normalization vocabulary comes from the USPS, where FL is the standard abbreviation for Floor. Floor 2, second floor. Sensible.
In the UK, the dominant unit type is the flat. When we first imported the UK data, 1.23 million British flats flowed through our North American-shaped pipeline and came out the other side labeled Fl. Which means our system was prepared to tell you, with a straight face, that Flat 2 was on the second floor.
Worse: when a user searched for "Flat 2, Wisteria House", our parser, having never met a flat, tried to read "Flat" as part of the street name.
The fix is a country-aware vocabulary, where unit designators mean what they mean locally.
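In its simplest form, a country-aware vocabulary is a lookup keyed by country before it is keyed by token. The table below is a tiny illustrative sample, and unit_type is a hypothetical helper, not our real normalization code.

```python
from typing import Dict, Optional

# Illustrative sample only; a real vocabulary would be far larger.
UNIT_DESIGNATORS: Dict[str, Dict[str, str]] = {
    "US": {"fl": "Floor", "floor": "Floor", "apt": "Apartment", "ste": "Suite"},
    "GB": {"flat": "Flat", "floor": "Floor"},
}


def unit_type(token: str, country: str) -> Optional[str]:
    """Resolve a unit designator using the vocabulary of the given country."""
    vocabulary = UNIT_DESIGNATORS.get(country, {})
    return vocabulary.get(token.strip(" .").casefold())
```

Here unit_type("FL", "US") resolves to Floor, while unit_type("Flat", "GB") resolves to Flat and the ambiguous "Fl" is deliberately left unresolved for GB, so a flat can no longer turn into a floor on its way through the pipeline.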
"One country, one set of political districts"
Geocodio's claim to fame in the US is appending congressional districts to addresses. Naturally we wanted the British equivalent: give us an address, we tell you the Westminster constituency, the ward, and so on.
What we found is that the United Kingdom is four countries in a trench coat. England, Scotland, Wales, and Northern Ireland are not regions or provinces, and they are definitely not states (more on that mistake in a moment). They are constituent countries, and each answers the question "who represents this address?" differently.
Every address has a Westminster constituency. So far so good.
A Scottish address also has a Scottish Parliament constituency and region.
A Welsh address has Senedd representation, and while we were building this, Wales reorganized the entire system: 40 constituencies and 5 regions became 16 new constituencies, with the regions abolished outright.
An English address has none of these, as there is no devolved English parliament, and returning nothing is the correct answer. We had to write test cases asserting an empty result, which felt wrong and was right.
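A sketch of that per-nation logic, covering only the three cases described above. The field names are hypothetical stand-ins, not our actual response schema, and Northern Ireland is left out because this post doesn't describe it.

```python
from typing import Dict, List

WESTMINSTER = "westminster_constituency"

# Devolved representation by nation; field names are illustrative only.
DEVOLVED_FIELDS: Dict[str, List[str]] = {
    "Scotland": ["scottish_parliament_constituency", "scottish_parliament_region"],
    "Wales": ["senedd_constituency"],  # regions were abolished in the reorganization
    "England": [],  # no devolved parliament, so empty is the correct answer
}


def district_fields(nation: str) -> List[str]:
    """List the district fields an address in this nation should carry."""
    if nation not in DEVOLVED_FIELDS:
        raise KeyError(f"nation not covered by this sketch: {nation!r}")
    return [WESTMINSTER, *DEVOLVED_FIELDS[nation]]
```

The test we found so uncomfortable is then a one-liner: assert that the devolved list for England is empty, and treat a non-empty result as the bug.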
Two details from this work that I genuinely treasure:
First: in the Scottish Parliament, three constituencies are protected by statute and cannot be redrawn. Orkney and Shetland have each been ring-fenced as their own seat since the Scotland Act 1998, and Na h-Eileanan an Iar, the Western Isles, was added in 2018. The other 70 Holyrood seats are fair game for boundary reviews, but the islands are forever.
Second: the official national statistics encode the Channel Islands and the Isle of Man with the sentinel codes L99999999 and M99999999, which is the government's way of saying "these have postcodes that look British, but they are not in the United Kingdom, please stop asking."
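In code, honoring those sentinels can be a very small lookup. This sketch assumes the country code arrives as a plain string from the statistics data; the helper name is hypothetical.

```python
from typing import Dict

# Sentinel country codes for postcodes that look British but are outside the UK.
NON_UK_SENTINELS: Dict[str, str] = {
    "L99999999": "Channel Islands",
    "M99999999": "Isle of Man",
}


def is_in_united_kingdom(country_code: str) -> bool:
    """True unless the code is one of the Crown Dependency sentinels."""
    return country_code.strip().upper() not in NON_UK_SENTINELS
```

Checking this before any district lookup avoids going looking for a Westminster constituency that was never going to exist.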
One last piece of vocabulary we made our peace with. We say UK in conversation, but in our code and our API we say GB, because GB is the ISO country code for the United Kingdom, even though the code is the name of an island, not the name of the country. The standard-setters reached for the famous part and called it close enough, and now we type GB, say UK, and try not to think about it too hard.
"Our API field names are fine"
This was the most humbling one, because the thing that broke wasn't our parser or our data model. It was our vocabulary.
Geocodio's response schema was designed around the US and subsequently expanded to Canada and Mexico. As a result, it returned zip and state fields.
Then we started testing UK addresses, and there it was: our API returning Scotland as a state. Scotland is a nation with its own parliament, its own legal system, and a thousand years of history. Calling it a state is not a rounding error. It's wrong. And zip is a purely American term (a USPS trademark, at that). The rest of the English-speaking world says postcode or postal code.
We were not going to launch in the UK with field names that misclassified an entire nation. So before the UK ever went live, we shipped v2 of the Geocodio API, and I can tell you the hardest part was not the code. It was the naming. We spent a long time studying how everyone else handles this. Google Maps solves it with administrative_area_level_1 through administrative_area_level_3, which is technically correct for every country on Earth and a genuine misery to read and parse. (Quick: which level is a county?)
We wanted something a human could understand without a decoder ring, which meant arguing about questions like "should we say postal_code or postcode?" for longer than I'd like to admit.
And the rename couldn't happen in one place. The API, the spreadsheet output columns, the upload header detection, the Lists API, six client libraries, and every code sample in the docs all had to move in lockstep. Otherwise we'd ship the absurd situation where the API says postal_code and the spreadsheet you download still says Zip.
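One way to keep that many surfaces in lockstep is to drive them all from a single rename table. The sketch below is an assumption about how that could look, not a description of our codebase: the function names are hypothetical, and the table holds only the zip to postal_code rename mentioned in this post.

```python
from typing import Any, Dict, List

# Single source of truth for renames; only the one named in this post is listed.
FIELD_RENAMES: Dict[str, str] = {"zip": "postal_code"}


def rename_keys(record: Dict[str, Any],
                renames: Dict[str, str] = FIELD_RENAMES) -> Dict[str, Any]:
    """Apply the rename table to an API-style record."""
    return {renames.get(key, key): value for key, value in record.items()}


def rename_headers(headers: List[str],
                   renames: Dict[str, str] = FIELD_RENAMES) -> List[str]:
    """Apply the same table to spreadsheet headers, ignoring case."""
    return [renames.get(h.strip().casefold(), h) for h in headers]
```

Because both functions read the same table, a spreadsheet header of "Zip" and an API key of "zip" cannot drift apart: change the table once and both outputs follow.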
One field name, designed in 2013 for American addresses, and the bill came due twelve years later.
And we are not finished paying it. To this day, if you hand Geocodio an address with no country attached, it assumes the United States. That made sense when the US was the only thing we did. It is not great UX for a geocoder that supports multiple countries, and fixing it properly is its own project that had to wait while the rest of this got built. Internationalization is not a feature you finish. It's a debt you keep servicing.
The part where we got jealous
Here is the thing that reframed the whole project for us.
To build our 300+ million US address points, we stitch together almost 3,000 separate data sources: counties, cities, states, each with their own formats, quirks, and update schedules. It works, and we're proud of it, but it is a quilt.
Great Britain — England, Scotland, and Wales — has one list. Ordnance Survey's AddressBase, built on Royal Mail's Postcode Address File, covers every addressable building in Great Britain: 33 million addresses, updated weekly, each one with rooftop-precision coordinates. Every property has a UPRN, a Unique Property Reference Number, a stable open identifier that follows the building through its entire life. Two government-adjacent institutions maintain, between them, a single authoritative answer to the question "what addresses exist?"
The United States has nothing like this. No national address list, no universal property ID, no single source of truth. As an American geocoding company, looking at AddressBase for the first time felt like a medieval scribe being handed a laser printer.
There is a catch, and it is a real one. The US patchwork is messy, but most of it is public data we are free to use. AddressBase is the opposite: pristine, and licensed, with terms attached that reshaped more than our code. Some of the hardest work of this whole project had nothing to do with the geocoder, and everything to do with how we are allowed to bill for data this tightly held. The quality is worth it. It was not a dance on roses.
The second catch, one we realized later than we'd like to admit, was that our address sources covered Great Britain, not the United Kingdom. (The GB vs UK distinction strikes again.) It turns out Northern Ireland has its own Ordnance Survey, with its own licensing terms, which proved utterly unworkable for us. As a result, we had to launch with postcode-level accuracy in Northern Ireland.
What we took away
Address formats are culture. You cannot port assumptions between countries, only data models flexible enough to hold both.
The postcode is the atom of British geography. Respect it and everything downstream gets easier.
When the data is this good, most "data problems" turn out to be your-code problems.
So yes, we arrived with North American instincts and left with a long list of bugs named after cottages. But the real takeaway runs the other way: the UK's addressing system is a quietly remarkable piece of national infrastructure, and most of the work of supporting it was rising to its standard, not the reverse.
Moving forward
UK geocoding is now available on Geocodio, with the same API you already use for US, Canadian, and Mexican addresses. And if you happen to be at Laravel Live UK in London next week, come find me. I will happily talk about post towns for longer than anyone should.
One more thing: everything in this post—the parsing, the data modeling, the four-nations district logic—turned out to be the straightforward part of launching in the UK.
The hard part took four months and had almost nothing to do with geocoding. Royal Mail's licensing terms sent us into open-heart surgery on the part of the codebase that gives any SaaS engineer shivers: the billing system. We had to move to per-user billing (and add multi-currency support!), and all of it had to be rebuilt while the lights stayed on.
But that's a story for another day.