https://images.unsplash.com/photo-1621632361333-4649f0b59adc?ixlib=rb-4.0.3&q=85&fm=jpg&crop=entropy&cs=srgb

Africa is a significant origin of irregular migration, playing a crucial role in global policy discussions. A key concern within these debates is the effectiveness of return orders issued to individuals and their subsequent compliance in returning to their home countries.

The primary objective of this assignment is to visually represent the number of people who were instructed to return to their countries by EU nations, as well as those who actually complied with these orders, with a focus on sub-regional data. I want to answer three questions:

  1. Which regions have the highest number of return orders and actual returns between the years of 2013 and 2022?
  2. Which year has the highest return rate?
  3. Which region has the highest return rate in 2022?

To accomplish this, I utilised two datasets from Eurostat: "Third Country Nationals Ordered to Leave" and "Third Country Nationals Returned Following an Order to Leave." In this project, I aim to create visualisations that can be utilised to facilitate a comprehensive analysis of the dynamics and outcomes of return policies within the African context.

You can access the Tableau dashboard for this project here, and find the complete SQL script here.

Data Wrangling

After downloading both datasets, I proceeded to create empty tables on my local PostgreSQL server to store and manage the data effectively.

CREATE TABLE ordered (
    dataflow VARCHAR(50),
    last_update TIMESTAMP,
    freq CHAR(1),
    citizen VARCHAR(10),
    age VARCHAR(10),
    sex VARCHAR(5),
    unit VARCHAR(10),
    geo VARCHAR(10),
    time_period INTEGER,
    obs_value INTEGER
);
CREATE TABLE returned (
    dataflow VARCHAR(50),
    last_update TIMESTAMP,
    freq CHAR(1),
    citizen VARCHAR(10),
		c_dest VARCHAR(20),
    age VARCHAR(10),
    sex VARCHAR(5),
    unit VARCHAR(10),
    geo VARCHAR(10),
    time_period INTEGER,
    obs_value INTEGER,
		obs_flag VARCHAR(50)
);

Through the shell I copied the downloaded files into my server:

\\COPY returned (dataflow, last_update, freq, citizen, 
c_dest, age, sex, unit, geo, time_period, obs_value, obs_flag)  
FROM '/.../migr_eirtn_linear.csv'  WITH (FORMAT csv, HEADER true, DELIMITER ',');
\\COPY ordered (dataflow, last_update, freq, citizen, age, sex, unit, geo, time_period, obs_value) 
FROM '.../migr_eiord_linear.csv' 
WITH (FORMAT csv, HEADER true, DELIMITER ',');

Here is the breakdown of relevant columns:

citizen nationality of the individual
c_dest destination category
geo The EU country that issued the return order or the country that the individual is returning from
time_period year
obs_value number of return orders or returns

The Eurostat datasets lack sub-regional categorisation, so I added a file with the required classifications:

create table africa (
	code varchar(2)
	sub_region varchar(50)
)
\\COPY africa (code, sub_region)  FROM '.../african_countries.csv' 
 WITH (FORMAT csv, HEADER true, DELIMITER ',');