Exploring United Nations Migrant Stock Data With the untools Package

Joshua Brinks, ISciences, LLC

Highlights:

  • untools automates retrieval and cleanup of migrant stock data that was not designed for analysis.
  • It provides multiple automated visualizations.

Introduction

Note: This vignette was updated August, 2020 to reflect changes to the UNHCR Data API, release of the 2019 International Migrant Stock Population dataset, and subsequent streamlining of the untools package. This vignette still uses the 2017 version of the Migrant Stock data.

In addition to the Population Statistics, the United Nations provides data sets detailing international migrant statistics through the Department of Economic and Social Affairs: Population Division. The most widely used of these is the International Migrant Stock. This data is featured in several recent academic papers; one of the more prominent examples is Guy Abel and Nikola Sander's Quantifying Global International Migration Flows (Abel and Sander 2014). The most recent version is entitled The 2017 Revision, and is available by total international migrant stocks, by age and sex, and by destination and origin. For this vignette we will focus on the destination and origin version.

Acquiring the Data

The International Migrant Stock: 2017 Revision by Destination and Origin Country is available as a direct download. Unlike the UNHCR Population Statistics time series data featured in our Exploring United Nations Refugee & Asylum Data With the untools Package vignette, the migrant stock data is not analysis-ready. It is organized for viewing in complex Microsoft Excel spreadsheets: the data sits in matrices with multiple aggregations listed together, under complex headers, and spread across several tabs:

Figure: Screenshot of the UN Migrant Stock spreadsheet, showing large headers, offset matrices, multiple aggregations, and multiple tabs.

The getUNstocks() Function

The untools package can streamline the acquisition and processing of UN migrant stock data using the getUNstocks() function. The getUNstocks() function will automatically retrieve the dataset over HTTP, remove aggregations above the national level, remove columns that are unnecessary for analysis, introduce ISO country codes, and melt the matrix into long form. You may select either the 2017 or 2019 version of the Migrant Stock data with the version argument, for example version = '2017'.

stocks <- untools::getUNstocks(version = '2017')

year  host       host_iso3  origin       origin_iso3  stock
2017  Argentina  ARG        Afghanistan  AFG              9
2015  Argentina  ARG        Afghanistan  AFG              9
2010  Argentina  ARG        Afghanistan  AFG              9
2005  Argentina  ARG        Afghanistan  AFG             15
2000  Argentina  ARG        Afghanistan  AFG             20
1995  Argentina  ARG        Afghanistan  AFG             20
1990  Argentina  ARG        Afghanistan  AFG             20
2017  Australia  AUS        Afghanistan  AFG          39297
2015  Australia  AUS        Afghanistan  AFG          37482
2010  Australia  AUS        Afghanistan  AFG          30880
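To make the "melt the matrix into long form" step concrete, here is a minimal base R sketch of the general technique. It is not the untools implementation: the matrix, the placeholder country codes (AAA, BBB, CCC), and all values are invented for illustration.

```r
# Toy destination-by-origin matrix for a single year.
# Values and country codes are made up, not real UN figures.
# Rows are host countries, columns are origin countries.
stock_matrix <- matrix(
  c(NA, 12, 30,
     5, NA, 44,
     7, 21, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(host   = c("AAA", "BBB", "CCC"),
                  origin = c("AAA", "BBB", "CCC"))
)

# Treating the matrix as a table and converting it to a data frame
# yields one row per host/origin cell.
long <- as.data.frame(as.table(stock_matrix),
                      responseName = "stock",
                      stringsAsFactors = FALSE)

# Add the year and drop empty cells (a country hosting itself).
long$year <- 2017L
long <- long[!is.na(long$stock), c("year", "host", "origin", "stock")]
```

Each remaining row of long is one host/origin pair, which is the same shape as the output shown above, minus the ISO code columns.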

Additionally, the getUNstocks() function provides arguments to return the data in wide form (wide = TRUE) and to filter the data by year (range = c(2000, 2017)). Migrant stock data is available in 5-year intervals starting from 1990, in addition to the revision years (2017, 2019). By default, getUNstocks() includes all data. Wide format is convenient for examining time series data outside of a figure. Because getUNstocks() automatically retrieves the Excel spreadsheet from the UN server, it's advisable to acquire the data once using the default settings and then perform additional subsetting or casting into wide format using base R, dplyr, data.table, or reshape2. If you don't have internet connectivity, or already have the Excel spreadsheet, place the unaltered Excel file in your working directory and getUNstocks() will import and process it.
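As a rough illustration of subsetting and casting after download, the sketch below uses base R only. To stay self-contained it rebuilds a small data frame from the rows printed earlier rather than calling getUNstocks(); the object names stocks_long, recent, and wide are arbitrary.

```r
# A few rows copied from the long-form output shown earlier
# (Afghan-origin stocks hosted in Argentina and Australia).
stocks_long <- data.frame(
  year        = c(2017, 2015, 2010, 2005, 2000, 1995, 1990,
                  2017, 2015, 2010),
  host_iso3   = c(rep("ARG", 7), rep("AUS", 3)),
  origin_iso3 = "AFG",
  stock       = c(9, 9, 9, 15, 20, 20, 20, 39297, 37482, 30880),
  stringsAsFactors = FALSE
)

# Subset by year, comparable in spirit to a year range filter.
recent <- stocks_long[stocks_long$year >= 2010 &
                      stocks_long$year <= 2017, ]

# Cast to wide form: one row per host/origin pair,
# one stock column per year (stock.2017, stock.2015, ...).
wide <- reshape(recent,
                idvar     = c("host_iso3", "origin_iso3"),
                timevar   = "year",
                direction = "wide")
```

Working from a single downloaded copy this way avoids repeated requests to the UN server while you experiment with different year ranges and layouts.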

Visualizing UN Stock Data

The untools package provides several default plotting functions designed to create modern visualizations for static or time series UN migrant stock data. By default, untools will plot a time series of migrant stocks for the 5 greatest populations in the final year. For convenience, the plot() function will automatically adjust the color palette, title, and subtitle based on the chosen country and the number of positive stock populations for the given time period. With no additional arguments, the default plotting function will produce a time series of the top 5 stock populations in 2017 for the United States:

usa.stocks <- plot(stocks)
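The "top 5 populations in the final year" selection can be approximated by hand. The following base R sketch shows one way to do it. It is an assumption about the general approach, not the package's internal code, and the host code XXX, the origin codes, and all values are invented.

```r
# Hypothetical long-form stocks for a single host (invented values).
toy <- data.frame(
  year        = rep(c(2010, 2017), each = 7),
  host_iso3   = "XXX",
  origin_iso3 = rep(c("AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG"),
                    times = 2),
  stock       = c(10, 80, 30, 60, 50, 20, 70,
                  15, 90, 25, 65, 40, 35, 75),
  stringsAsFactors = FALSE
)

# Rank origins by their stock in the final year of the data.
final_year <- max(toy$year)
final      <- toy[toy$year == final_year, ]
top5       <- head(final$origin_iso3[order(final$stock, decreasing = TRUE)], 5)

# Keep the full time series for those five origins only.
top5_series <- toy[toy$origin_iso3 %in% top5, ]
```

The ranking uses only the final year, so an origin that was large in 2010 but small in 2017 (EEE and FFF swap places here) is judged on its 2017 value.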

You can specify additional countries using the country = argument. Let's view stock populations in Russia.

rus.stocks <- plot(stocks, country = 'RUS')

Lastly, you can view a static barplot for a given year using the mode = 'static' and yr = arguments. Let's plot the top stock populations in Italy during 2005.

ita.static.stocks <- plot(stocks, country = 'ITA', mode = 'static', yr = 2005)

References

Abel, Guy J., and Nikola Sander. 2014. “Quantifying Global International Migration Flows.” Science 343 (6178): 1520–2. https://doi.org/10.1126/science.1248676.

Additional Meta-Data

Data Categories:

tabular, wide form dyadic
