Mapping UK Startups

Since I first washed up on the chalky (more peaty, I guess) British shores, I've been doing my best to get an overview of the geography of UK startup activity. That's my job, after all: to figure out where the entrepreneurship hot spots are and why those places are great areas for startups. I lost track of this for a while after being buried in other work and teaching, but a recent report by Startup Britain about the UK's entrepreneurial hotspots reminded me of it. They were kind enough to release the underlying dataset, which was produced by Companies House. The data reports how many new firms were registered in every postal code area in the UK.

This data set helped me rediscover the joy and the pain of making maps while watching re-runs of Law and Order.

Plugging the data from Startup Britain into QGIS (a nice, open source GIS platform that actually runs on OS X!) produces a nice visualization of where the UK's entrepreneurs are.

[Figure: UK Startups]

This is a pretty diverse geography of startups, but it's about what we'd expect. High levels of entrepreneurship in the Southwest and up into the Midlands, lower levels of entrepreneurship in the Northeast and in the Highlands.

We can make this a bit simpler to get an even broader overview of the UK's entrepreneurial geography. This is an equal-area map of the average number of startups in the postal code areas contained within 25 km hexes.

[Figure: I think this is the prettiest map I've ever made.]

With this, you can see a very clear pattern of high levels of startup activity in the area between London and Manchester, with less activity elsewhere.

But XKCD teaches us that most maps just map population.

[Figure: XKCD teaches us every lesson.]

So, we've got to control for population. This is where I ran into the wall of horrible data collection. It's pretty dang easy to get population for postal code areas in England and Wales from NOMIS. But, because of Events over the past 700 years, Scotland gets its own census, and it's not very good at showing what data it has and letting you have it. After several hours of yelling at the computer, I finally found what I needed and could make a map of the number of startups per 1000 people in every postal code area in the UK (except for Northern Ireland, Gibraltar, and the Channel Islands, because I just couldn't bring myself to care).

[Figure: Startups per 1000 people]
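For anyone who wants to repeat the per-capita step, here is a minimal sketch of the calculation. It is not the workflow I actually used (that happened in QGIS); the function name, the made-up postal codes, and the numbers are all placeholders, and I'm assuming both datasets have already been reduced to simple lookups keyed by postal code area.

```python
def startups_per_1000(startups, population):
    """Return startups per 1,000 residents for areas found in both inputs.

    Both arguments are dicts keyed by postal code area (hypothetical layout).
    Areas with a missing or zero population are skipped, since the rate is
    undefined there.
    """
    rates = {}
    for area, n_startups in startups.items():
        people = population.get(area)
        if not people:  # None (no census match) or 0 residents
            continue
        rates[area] = 1000 * n_startups / people
    return rates


# Made-up illustrative numbers, not the real Companies House or census data.
startups = {"AA1": 200, "BB2": 50, "CC3": 10}
population = {"AA1": 10000, "BB2": 5000, "CC3": 0}

rates = startups_per_1000(startups, population)
print(rates)
```

Skipping unmatched areas is the simplest choice; it also quietly drops places like the ones I gave up on, so it's worth counting how many areas fall out of the join.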

This is... ummm... less interesting. London is really the only place where we see huge deviations from the mean of 20.66 new firms per 1000 people. Indeed, if we look at a histogram of the log of startups per capita, we see it's really concentrated around the mean.

[Figure: Has anyone written a history of histograms?]
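The log histogram is easy to rough out without any plotting library. This is only a sketch under my own assumptions: the bin width of 0.5 is arbitrary, the input rates are invented, and zero rates are dropped because their log is undefined.

```python
import math
from collections import Counter


def log_histogram(rates, bin_width=0.5):
    """Count how many rates fall in each bin of the natural log of the rate.

    Keys are the lower edge of each bin. Rates that are zero or negative
    are ignored because log() is undefined for them.
    """
    counts = Counter()
    for rate in rates:
        if rate <= 0:
            continue
        lower_edge = math.floor(math.log(rate) / bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


# Invented startups-per-1000 values, clustered near 20 with one big outlier.
example_rates = [1, 20, 21, 22, 1000, 0]
hist = log_histogram(example_rates)

for lower_edge, count in hist.items():
    print(f"{lower_edge:5.1f} | {'#' * count}")
```

Taking the log first is what keeps a handful of extreme areas from squashing everything else into a single bar.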

This is because there is a very clear relationship between the population of a postal code area and the number of startups. The correlation coefficient is 0.78! This is very apparent when you graph population against startups.

[Figure: The colors!]

From the graph, it's clear that there are very few regions that have exceptionally high rates of startups per capita, but there are plenty of regions in the North and the North West which have very low rates.
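For reference, the correlation here is the ordinary Pearson coefficient between population and startup counts. A minimal sketch of that calculation follows; the function name and the sample numbers are mine and purely illustrative, and I'm assuming two equal-length lists, one value per postal code area.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented numbers standing in for population and startup counts per area.
population = [1000, 2000, 3000, 4000]
startups = [20, 45, 55, 90]

r = pearson_r(population, startups)
print(round(r, 2))
```

Squaring a coefficient of 0.78 gives roughly 0.61, so in a simple linear fit population alone accounts for about 61% of the variation in startup counts.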

This is even more apparent when we make a box plot of startups per capita by region.

[Figure: I guess it's more of a violin plot than a boxplot.]

London does have a lot of areas with exceptionally high levels of entrepreneurship per capita. Of the 6 postal code areas that have more than 1 reported startup per person, 5 are in London (EC1V, SW1Y, EC4A, W1B, W1S) and one is in Birmingham (B2). I imagine these codes are some weird corporate or historical zones where no one actually lives (maybe just the Queen and her Dogs), which totally throws off the per capita calculation. But even with that, the average startups per capita in London is still significantly higher than the national mean.
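One way to keep those no-one-lives-here codes from dominating a regional comparison is to flag them and compare medians instead of means. This is a sketch of that idea, not what the plot above does; the record layout, the postal codes, and the numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def summarize_by_region(records, outlier_threshold=1.0):
    """Return (median startups per person by region, list of outlier codes).

    Each record is (postal_code, region, startups, population); this layout
    is an assumption for the example. A code is flagged as an outlier when
    it has more than `outlier_threshold` startups per resident.
    """
    rates_by_region = defaultdict(list)
    outliers = []
    for code, region, n_startups, people in records:
        rate = n_startups / people
        rates_by_region[region].append(rate)
        if rate > outlier_threshold:
            outliers.append(code)
    medians = {region: median(rates) for region, rates in rates_by_region.items()}
    return medians, outliers


# Made-up records; "X1" mimics a business district with almost no residents.
records = [
    ("X1", "London", 500, 100),
    ("X2", "London", 30, 1000),
    ("X3", "London", 20, 1000),
    ("Y1", "North East", 10, 1000),
    ("Y2", "North East", 12, 1000),
]

medians, outliers = summarize_by_region(records)
print(medians)
print(outliers)
```

The median barely moves when one area reports several startups per resident, which makes it a fairer yardstick than the mean for this kind of data.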

So, where do we go from here? The first thing I want to do is try to break this down by industry. In terms of economic development, not all new firms are created equal. A consulting LLC will likely never employ more than a few people, but a new manufacturing firm can employ many people and export products abroad. We also need to look at firm births as well as deaths. Which regions are gaining startups and which are losing them? We also need more data to figure out what's driving entrepreneurship. High populations do mean more economic activity, but this doesn't help policy makers figure out how to encourage entrepreneurship. We need to look at things like education, levels of immigration and migration, and that fun stuff.

So, I've got a lot of librarians and statisticians to yell at. I want to thank everyone on the twitter-sphere who encouraged me to make these maps; it was a great excuse to learn some new tools and data sources.