Harmonizing Swiss commune data

Swiss communes are the smallest administrative spatial unit in the country and their borders change over time as units merge, split or exchange territories. In order to create balanced panels that are consistent over time for longitudinal analysis, municipal-level data should be harmonized to the territorial boundary definitions for a given year. For a recent project, I mapped all Swiss communes from 1994 onwards to their present borders as of January 2020. Relevant files are posted on my Github repo.

Background

The blog has been quiet the past couple of months. Like a lot of people, my planning for the year was completely derailed by COVID, and after the spring lockdown, I spent all summer furiously working to finish up the remaining requirements for my Master’s, which, while completed later than expected, is finally done. So to get the blog running again, I’ve decided to, first, share a few simple things from my thesis work that might be of interest to others.

My Master’s thesis looked at the relationship between migration, refugees, and voting outcomes for different social policies in Switzerland. For this analysis, I examined the voting patterns of Swiss communes (i.e. municipalities) over a certain time period.

Municipal-level analysis

On the one hand, the great number of Swiss communes (2000+) provides a lot of variation that can be exploited for econometric analysis, but on the other, longitudinal analysis must account for the dynamic nature of these small spatial units, as borders continually change from year to year.

The figure above shows that, until the mid-1990s, the number of communes was fairly steady at just over 3000, but since that time, the total number has declined rapidly as mergers between small units have become more and more common. As of January 2020, the official number of communes in Switzerland stand at 2,202.

Current borders and other official documentation

The official list of Swiss commune names and identifiers can be found at on the website of the Swiss Federal Statistics Office (OFS). The actual file of interest, the Réperatoire official des communes Suisses is available as an Excel file, as well as in the form of a searchable app. Also included on this page is a historical list of commune names.

Despite my best efforts to find a better solution, I was not able to figure out an easy way to map historical communal borders to a given year’s borders, so I ended up doing most of this work manually. The result of this exercise is a simple crosswalk file that maps all examples old/retired commune numbers to their 2020 equivalents based on the various official datasets that I worked with so far. For reference, this current file should catch all historical listings from the following:

  • Swiss Labor Force survey from 1994-2019 (not including the 2019 special module on platform workers)
  • Swiss land use statistics data files for the two most recent survey periods (2004/09 and 2013/18)
  • Swiss Secretariat for Migration (SEM) data on refugees and asylum seekers by nationality and permit from 1994 onwards (*commune-level data available by request only)
  • All voting data for national parliamentary elections and popular initiatives, as well as any commune-level descriptive data I could find on the excellent STAT-TAB online data portal hosted by the OFS. (*STAT-TAB data is already harmonized to the current year)

Using the crosswalk file

I’m certain there has to be an easier way to do this. I never figured it out, nor did my conversations with the OFS lead to better ideas than this. As this crosswalk file was created to handle only the different official data that I worked with, anyone trying to use this file with datasets outside of the ones I described above will inevitably encounter old commune numbers that I did not catch here. In this section, I detail the process I used to create this file so that any new cases can be easily identified and mapped to current borders as needed.

Before getting started, two files are needed, both available in the Github repo:

cw_commune2020.csv

This is a crosswalk file that can be used to map old, disappeared Swiss communes to their 2020 territorial equivalents. In this file the column commune_ID is the original OFS commune ID number associated with the old, disappeared communes, while commune2020_ID represents the actual commune number that the old unit has been merged or changed to. The commune names columns provided for reference only and should not be used as a unique key to map old to new communes.

crosswalk_commune.xlsx

This is a slightly modified version of the Réperatoire official des communes Suisse, where I have renamed the columns and sheets into English and removed white spaces from column and sheet names. Up-to-date as of 13 January 2020.

Steps:

To harmonize any given commune-level dataset to their 2020 names and ID numbers the following steps can be used:

  1. In the un-harmonized dataset, rename the column with the official commune ID number (often listed as GDENR, Gemeindenummer, GMDNR, GEM_CD, No de la commune, etc.) to commune_ID.

  2. Merge the crosswalk file to this un-harmonized dataset (e.g. left-join) by commune_ID.

  3. Looking at the new merged dataset, check to see if any of the original commune numbers do not have a modern match by looking for missing values in the commune2020_ID column. If all old communes are correctly matched to 2020 communes, there should be no missing values in commune2020_ID.

  4. If there are missing values, find the original commune names that do not have a match (“commune”) and find their current name and ID number using the communes_disappeared and the communes2020 tables in the “crosswalk_commune.xlsx” file. Update the mapping file accordingly, and repeat steps 1-3.

Look up old name, find current commune name
Look up current ID number of new commune
  1. When all old communes in the dataset have a corresponding match, aggregate (or “collapse”) the variables of interest in the dataset on the commune2020_ID number to get the final harmonized dataset.

Please note that the old commune names (commune and commune2020) in the crosswalk file are listed purely as a reference and should not be used as a unique key to match old to new communes as there can be discrepancies in how commune names are spelled or listed at different years in the past, and whether or not the short- or long-name convention is used. Only the old and new commune ID numbers should be used to map (i.e. commune_ID –> commune2020_ID).

Also note that these steps are written according to how the crosswalk file is currently constructed, using my own naming conventions. If you prefer to use the original column names from the Swiss Federal Statistics office, please rename the columns in the crosswalk file so that they match accordingly.

In exceptional cases, there are sometimes communes with no matches where you only have the old commune number, but you do not have the old commune name. When this is the case, the old commune name can be found by using the “liste historisée des communes” (.txt or .xml file, or the corresponding online app).

Pull requests welcome

This crosswalk file is not exhaustive and only captures the cases in the data that I’ve worked with up to now. Regardless, I believe it covers a very large majority of the changes that have happened since 1994, and I hope this be of use to others and can help to save some time. To anyone using this file and who finds additional un-mapped communes, pull requests to update this crosswalk file are most welcome. And of course, if you know of an easier solution, please let me know!

COVID-19 Tracker: Days since N The Swiss Labour Force Survey: Working with labeled data in R
comments powered by Disqus