UK maps showing linguistic clusters of placenames

I remember reading a book in my teens that showed the distribution of placenames in the UK. It showed how different peoples in the UK had distinct naming styles. The distribution of these names can still be seen in modern place names.

Around 2001 I acquired a database of place names and wrote a simple CGI script to generate these maps. The database wasn't very good and the maps were crude. I always meant to improve them.

A few years ago the UK Ordnance Survey released much of its data for public use. In the US it is normal for public bodies to publish data that has been gathered at public expense. In the UK this is still, sadly, quite rare. So, it is good to see the OS do this.

Ordnance Survey, Great Britain, 1:50 000 Scale Gazetteer

I also needed an outline of the UK. I found a county boundary map at

County Data For the British Isles

For historic reasons I store these datafiles on my server at '/usr/local/data/books'. My code expects to find them here.

That gives us the raw data we need to generate the maps. I also needed a way to convert between OS map reference data (usually called Eastings and Northings) and latitude and longitude. There are different schemes for expressing latitude and longitude, so I needed a library to convert these.

Some years ago I came across a Python library by Nick Burch that contains all the functions required. It is a nice, compact, standalone library.

There are other ways to do this, including the mpl_toolkits.basemap toolkit, see, for example

Easily change coordinate projection systems in Python with pyproj by John A. Stevenson

I also had some code for converting OS map references into eastings and northings. UK map references divide the country up into a series of tiles, each of which has a pair of numbers.
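As an illustration of that conversion, here is a minimal sketch of the standard National Grid scheme, in which each 100 km tile is labelled with two letters and the digits that follow give the offset within the tile. This is not the code from my repository; the function name `gridref_to_en` is made up for this example, and it does not check that the tile actually lies over Great Britain.

```python
def gridref_to_en(gridref):
    """Convert an OS grid reference such as 'TQ 30 80' to (easting, northing).

    Returns the south-west corner of the referenced square, in metres.
    Hypothetical helper for illustration only.
    """
    ref = gridref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or not (letters.isascii() and letters.isalpha()) or "I" in letters:
        raise ValueError("expected two grid letters (the letter I is not used)")
    if len(digits) % 2 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError("expected an even number of digits, at most ten")

    def letter_index(c):
        # A=0 ... Z=24, skipping I, which the grid does not use.
        i = ord(c) - ord("A")
        return i - 1 if i > 7 else i

    l1, l2 = letter_index(letters[0]), letter_index(letters[1])

    # The first letter picks a 500 km square, the second a 100 km tile
    # inside it. Both are laid out on 5x5 grids, lettered from the top left.
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = 19 - (l1 // 5) * 5 - (l2 // 5)

    # Split the digits in half and scale each half up to metres.
    half = len(digits) // 2
    scale = 10 ** (5 - half)
    east = int(digits[:half] or 0) * scale
    north = int(digits[half:] or 0) * scale

    return e100km * 100000 + east, n100km * 100000 + north


print(gridref_to_en("TQ 30 80"))   # a 1 km square in central London
print(gridref_to_en("NN166712"))   # a 100 m square at Ben Nevis
```

The tile 'SV' is the origin of the grid, so `gridref_to_en("SV")` gives `(0, 0)`.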

My code is on GitHub at DaveBerkeley/uk_maps

It can be run on the command line, passed one or more regexes to specify the placenames to plot on the map. The output is the file 'map.png'. Or it can be used as a part of a library to produce image files.

I've used fixed record size data entries and sorted indexes to allow for fast searching. The data set is read-only, so I didn't need a sophisticated database. The database and indexes are cached, so they get created the first time you run the code. Subsequent search operations, particularly the binary search, are fast.
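The idea of fixed-size records plus a sorted index can be sketched as follows. The record layout (a 32-byte name followed by two unsigned integers), the function names and the coordinates are all assumptions made for this example and are not taken from the real code. The sketch only shows a prefix lookup, which is the case where a binary search applies directly.

```python
import struct

# Hypothetical layout: 32-byte name, then easting and northing.
RECORD = struct.Struct("<32sII")


def pack_records(places):
    """Pack (name, easting, northing) tuples into one bytes blob."""
    return b"".join(RECORD.pack(name.encode("utf-8"), e, n) for name, e, n in places)


def read_record(blob, i):
    """Read record number i. Fixed-size records make this a simple offset."""
    name, e, n = RECORD.unpack_from(blob, i * RECORD.size)
    return name.rstrip(b"\0").decode("utf-8"), e, n


def build_index(blob):
    """Return record numbers sorted by lower-cased name."""
    count = len(blob) // RECORD.size
    return sorted(range(count), key=lambda i: read_record(blob, i)[0].lower())


def find_prefix(blob, index, prefix):
    """Binary search for the first name >= prefix, then scan while it matches."""
    key = prefix.lower()
    lo, hi = 0, len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        if read_record(blob, index[mid])[0].lower() < key:
            lo = mid + 1
        else:
            hi = mid
    found = []
    while lo < len(index):
        record = read_record(blob, index[lo])
        if not record[0].lower().startswith(key):
            break
        found.append(record)
        lo += 1
    return found


# Placeholder coordinates, not real gazetteer values.
PLACES = [("Tresco", 1, 2), ("Llandudno", 3, 4), ("Tregony", 5, 6), ("Whitby", 7, 8)]
blob = pack_records(PLACES)
index = build_index(blob)
print([name for name, _, _ in find_prefix(blob, index, "Tre")])
```

Because every record is the same size, the blob and the index can both be written to disk once and reused, which is the caching described above.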

Maps showing linguistic clusters of placenames in the UK

The UK has had a series of inward migrations in the last 2000 years.

The pre-Roman areas had several distinct but related cultures. The people of Scotland spoke Gaelic. This is still prevalent in the Western Highlands and Isles. Most people in Scotland now speak English.

In Wales the people spoke Welsh. Wales was invaded and eventually conquered by the English, starting with an enclave in Pembrokeshire, but always retained its distinct language and culture. There is still a strong Welsh speaking region in the North West of Wales, but most Welsh speak English. Welsh is taught in all schools in Wales.

Cornwall has a traditional language, Cornish, but the last native speakers died years ago. There are groups trying to revive the language in Cornwall.

Each of these regional and racial groups left its mark on the country in the form of placenames.

I'll start with some simple ones. There are many names in Cornwall starting with 'Tre'. Tre in the Cornish language means a settlement or homestead. You can see that the 'Tre' region follows the modern Cornish boundary very closely. There are a number of Welsh places too. This may be because of the similarity between the Welsh and Cornish languages.

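A very distinctive Welsh place name form is the 'Llan' beginning, e.g. Llandudno. We can clearly see this is a name unique to Wales. 'Llan' originally meant an enclosure and came to mean a church or the parish around it, so these names usually mark a church settlement.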

To show the commonality between Cornish and Welsh, try 'Porth', which means port. This neatly outlines parts of the Welsh and Cornish coasts.

Compare this to 'port' (use regex '.*[Pp]ort$'), which, surprisingly, shows that most ports are inland, except in Scotland, where they have the good sense to put them on the coast. Looking back at the data I discovered that this was because I was showing airports. The gazetteer is a modern one, and has lots of modern placenames which we don't want.
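One possible way to keep the airports out, sketched here as a suggestion rather than something the program does, is a negative lookbehind in the regex. The sample names are illustrative and the helper `matching` is made up for this example. It assumes the pattern is applied from the start of the name, as with Python's `re.match`.

```python
import re

# The pattern from the text, and a variant that rejects '...airport'.
PORT = re.compile(r".*[Pp]ort$")
PORT_NOT_AIRPORT = re.compile(r".*(?<![Aa]ir)[Pp]ort$")

# Illustrative names only.
NAMES = ["Newport", "Bridport", "Manchester Airport", "York"]


def matching(pattern, names):
    """Return the names that the compiled pattern matches from the start."""
    return [name for name in names if pattern.match(name)]


print(matching(PORT, NAMES))
print(matching(PORT_NOT_AIRPORT, NAMES))
```

This only removes names ending in 'airport'. Other modern names, such as heliports, would still need their own exclusions.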

Now for some later invaders. The Vikings settled many areas along the coast (also in Ireland, which I don't have any data for), but the centre of Viking England was Danelaw, with its capital in York. Distinctive Viking placenames are 'thorp' and the ending 'by'. These can be passed to the program as regular expressions '.*by$' and '.*thorp'. The 'by' endings are in white, 'thorp' in red. You can see the rough extent of Danelaw and the Viking settlements there. The Isle of Man was also a Viking settlement.
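To make the two patterns concrete, here is a small sketch of how a name could be assigned a colour by the first regex it matches. The function `colour_for` and the first-match rule are assumptions for this example; the real program may combine patterns differently.

```python
import re

# Patterns and colours as described in the text.
PATTERNS = [
    (re.compile(r".*by$"), "white"),
    (re.compile(r".*thorp"), "red"),
]


def colour_for(name):
    """Return the colour of the first pattern that matches, or None."""
    for regex, colour in PATTERNS:
        if regex.match(name):
            return colour
    return None


for place in ["Whitby", "Grimsby", "Scunthorpe", "Mablethorpe", "Thorpe", "York"]:
    print(place, colour_for(place))
```

Note that '.*thorp' is case sensitive, so a name that begins with a capital 'Thorp' is not matched. A pattern such as '.*[Tt]horp' would catch those as well.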

On the Irish coast, most of the major ports were Viking settlements. Their longships were able to travel up river systems into the heart of the country. York was the furthest navigable point on the river for their boats and an ideal location for the capital. It would be interesting to compare the Danelaw names with the river catchment areas.

Now for the dominant culture, Anglosaxon. Little is known about the period during which the Saxons and others invaded from the East. It is known, mysteriously, as The Dark Ages. Neither the Saxon invader, nor the indigenous Britons were literate at first, so there is no written record. What we can say is that the Anglosaxon placenames almost completely replaced those of the earlier settlers. We don't know if the people suffered the same fate as their placenames.

One of the few Celtic words to survive the Anglosaxon invasion is 'avon'. You still find this in Wales, but it also occurs as the name of several rivers. It means 'river', so the invaders adopted the name by calling the feature "River River".

The most obvious Anglosaxon placename is 'ton', meaning 'town'. This is an interesting distribution. It is more prolific than any others we've looked at. It covers the whole of England, much of Lowland and East Scotland. Wales is largely 'ton' free, except for a patch in Pembrokeshire. This was invaded by the Normans in the late 11th century, so the 'ton' form must have been adopted there on or after this time. There are a few other anomalies. Cornwall has a low 'ton' density. Contrast with neighbouring Devon. There is also a ring around London with a lower 'ton' density. I have no idea why this is. This is a map of Anglosaxon dominance.

There are other less prominent clusters. 'holme', meaning an island, seems to be a Viking word. Here is a map showing '.*[Hh]olme' in white and '.*[Hh]olm[^e]' in red. The white looks like Danelaw. The red looks like some other influence. Look at Shetland and the Orkneys (both settled by Norse).

A mysterious (to me) cluster is 'shield', which I think is a Northumbrian word meaning summer pasture. Generate with '.*[Ss]hield' to catch all of them. I don't know what the significance of the region is. It could be geographical as well as linguistic.

For those not familiar with the UK, I've also added a population map. This gives a clear picture of where people live. It was generated using the same code, but it requires a postcode database, which is not in the public domain. You can see clearly areas of moorland (eg. Dartmoor, Bodmin Moor), central Wales, the Peak District, the Dales, all of which are sparsely populated. Scotland has far fewer people than England, and the Western highlands and Isles are particularly sparse. In Southern England you can see the area used by the military around Salisbury Plain, where people are unable to live.

I've used palette normalisation to produce the greyscale.
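One plausible reading of that normalisation, given here only as a sketch, is a linear rescale of the per-pixel counts so that the emptiest pixel becomes black and the densest becomes white. The function name `normalise_to_grey` and the small grid of counts are made up for this example, and the real code may use a different mapping.

```python
def normalise_to_grey(counts):
    """Rescale a 2D grid of counts linearly to grey levels 0..255."""
    flat = [c for row in counts for c in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        # Every pixel has the same count, so there is nothing to stretch.
        return [[0 for _ in row] for row in counts]
    # Integer arithmetic keeps the result exact; values are floored.
    return [[(255 * (c - lowest)) // span for c in row] for row in counts]


# Made-up counts of postcodes per pixel.
grid = [[0, 4], [8, 20]]
print(normalise_to_grey(grid))
```

With a linear scale a few very dense city pixels can leave the rest of the map nearly black, so a logarithmic scale is a common alternative.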

The Norse word for waterfall, 'foss', is mostly expressed as 'force', so here we have a map of Norse occupied places with waterfalls. Lots around Cumbria.

I tried the same thing with plain 'foss' and got something unexpected. Here we have a very clear map of the line of the Fosse Way, the Roman road running from Exeter to Lincoln. Same word, different origin - from 'fosse', ditch. Plus a few Norse waterfalls of course. A 2000 year old road, marked out in Saxon place names.

One last map. Places beginning with 'A' in white and places beginning with 'H' in red. Even though the Western Highlands of Scotland are sparsely populated, they are rich in placenames. This 'A|H' must be showing some linguistic bias in Gaelic. Contrast with the population map above it. Wales also seems to have an 'A|H' bias, but not the Shetland and Orkney Islands.