Covering the materials of Chapter 11.
Topics: vector spatial data management with geopandas
In the attached data folder the following datasets are given for this assignment:
hungary_admin_8.shp, containing the city-level administrative boundaries of Hungary. (Data source: OpenStreetMap)
hungary_population_2020.csv, containing the population of Hungarian cities on 1 January 2020. (Data source: Hungarian Government)
hungary_population_2011.csv, containing the population of Hungarian cities on 1 January 2011. (Data source: Hungarian Government)
Note: in the CSV files the columns are delimited with ; characters (instead of the default ,).
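To illustrate the delimiter note, here is a minimal sketch that parses a small in-memory sample instead of the assignment files. The rows and values are made up for demonstration; only the column names City, County and All population are taken from the solution code in this assignment.

```python
import io
import pandas as pd

# Hypothetical sample data (not taken from the real CSV files).
sample = "City;County;All population\nAlpha;North;1200\nBeta;South;850\n"

# With the default ',' separator each line is read as one single column.
wrong = pd.read_csv(io.StringIO(sample))

# Passing sep=';' splits the line into the three expected columns.
right = pd.read_csv(io.StringIO(sample), sep=';')

print(wrong.shape)
print(right.shape)
print(list(right.columns))
```

The `delimiter` keyword used in the solution is an alias of `sep` in `pd.read_csv`, so either spelling should work.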
Write a program that creates a thematic map of Hungary based on the administrative boundaries of the cities and their population in 2020.
(Use the All population field from the CSV file.)
import pandas as pd
import geopandas as gpd
# Read the datasets
cities = gpd.read_file('../data/hungary_admin_8.shp')
cities = cities[['NAME', 'geometry']]
cities.set_index('NAME', inplace=True)
population_2020 = pd.read_csv('../data/hungary_population_2020.csv', delimiter = ';')
population_2020.set_index('City', inplace=True)
# Add the population Series to the cities "manually"
df = cities.copy()
df['All population'] = [None] * len(cities)
# Get the index labels which are present in both DataFrames
indexes = set(cities.index) & set(population_2020.index)
for index in indexes:
    # Select row and column in a single .loc call instead of chained indexing
    df.loc[index, 'All population'] = population_2020.loc[index, 'All population']
display(df)
# This can be done in an easier and more efficient way with pandas' merge() function
df = cities.merge(population_2020, left_index=True, right_index=True)
display(df)
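Note that merge() performs an inner join by default, so cities whose name does not appear in the population table are silently dropped from the result. The sketch below uses made-up placeholder tables rather than the assignment data, and shows one possible way to list such unmatched rows with how='left' and indicator=True.

```python
import pandas as pd

# Hypothetical stand-ins for the cities and population tables.
cities_demo = pd.DataFrame(
    {'geometry': ['g1', 'g2', 'g3']},
    index=pd.Index(['Alpha', 'Beta', 'Gamma'], name='NAME'))
population_demo = pd.DataFrame(
    {'All population': [1200, 850]},
    index=pd.Index(['Alpha', 'Beta'], name='City'))

# Default inner join: 'Gamma' has no population row and disappears.
inner = cities_demo.merge(population_demo, left_index=True, right_index=True)

# Left join keeps every city; the _merge column tells where each row was found.
left = cities_demo.merge(population_demo, left_index=True, right_index=True,
                         how='left', indicator=True)
unmatched = left[left['_merge'] == 'left_only'].index.tolist()

print(len(inner), len(left), unmatched)
```

Checking the unmatched names this way can reveal spelling differences between the two data sources before a map is drawn.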
import matplotlib.pyplot as plt
%matplotlib inline
# Create the plot
df.plot(column='All population', figsize=[20,10], legend=True, cmap='YlOrRd', scheme='quantiles', k=7)
plt.show()
Write a program that adds the population data for 2011 and 2020 to the Shapefile as new scalar fields for each city, and saves the result as a new Shapefile.
population_2011 = pd.read_csv('../data/hungary_population_2011.csv', delimiter = ';')
population_2011.set_index('City', inplace=True)
df = df.merge(population_2011, left_index=True, right_index=True, suffixes=[' 2020', ' 2011'])
df.rename(columns={'County 2020':'County'}, inplace=True)
del df['County 2011']
display(df)
# Save it to file
df.to_file('hungary_population.shp')
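One caveat when saving: field names in a Shapefile's attribute table are limited to 10 characters, and 'All population 2020' and 'All population 2011' share their first 10 characters, so the names cannot be stored unchanged. A possible workaround is to rename the columns before writing. In the sketch below the short names pop_2020 and pop_2011 are hypothetical choices, and the small DataFrame is a made-up placeholder for the merged df.

```python
import pandas as pd

# Placeholder for the merged table; the values are made up.
demo = pd.DataFrame({'County': ['North'],
                     'All population 2020': [1200],
                     'All population 2011': [1100]})

# Hypothetical short names that fit the 10-character limit.
short_names = {'All population 2020': 'pop_2020',
               'All population 2011': 'pop_2011'}
demo_short = demo.rename(columns=short_names)

# Check that every column name is now short enough.
too_long = [c for c in demo_short.columns if len(c) > 10]
print(list(demo_short.columns), too_long)
```

With the real GeoDataFrame the same rename() call could be applied to a copy right before to_file(), since the later steps still refer to the long column names.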
Write a program that creates a thematic map of Hungary based on the administrative boundaries of the cities and their population change between 2011 and 2020.
df['Population difference'] = df['All population 2020'] - df['All population 2011']
ax = df.plot(column='Population difference', figsize=[20,10], legend=True, cmap='bwr', vmin=-5000, vmax=5000)
ax.set_facecolor("lightgray") # background color
plt.show()
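The fixed limits vmin=-5000 and vmax=5000 are symmetric, which keeps zero change at the white midpoint of the bwr colormap. If the limits should follow the data instead, one option is to derive a symmetric bound from a quantile of the absolute differences. The sketch below uses made-up numbers and an arbitrarily chosen 0.75 quantile; both are assumptions, not part of the assignment.

```python
import pandas as pd

# Made-up population differences with one extreme outlier.
diff = pd.Series([-300, -120, -40, 0, 25, 90, 400, 60000])

# Symmetric bound taken from a quantile of the absolute values,
# so a single outlier does not stretch the color scale.
bound = diff.abs().quantile(0.75)
vmin, vmax = -bound, bound

# Values outside [vmin, vmax] are drawn with the end colors of the colormap.
print(vmin, vmax)
```

The resulting vmin and vmax could then be passed to df.plot() in place of the hard-coded values.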
Optional: add a raster basemap with contextily.
# How to install: conda install -c conda-forge contextily
# How to use: https://contextily.readthedocs.io/en/latest/
import contextily as ctx
# Verify CRS, must be Web Mercator (EPSG:3857) to add a base map with the contextily module.
print(df.crs)
if df.crs == 'epsg:3857':
    ax = df.plot(column='Population difference', figsize=[20,10], legend=True, cmap='bwr', vmin=-5000, vmax=5000, alpha=0.85)
    ctx.add_basemap(ax)
    ax.set_axis_off()
    plt.show()
else:
    print('CRS must be EPSG:3857, instead {0} was given'.format(df.crs))
Write a program that creates a thematic map of Hungary based on the administrative boundaries of the cities and their population density in 2020.
df_eov = df.to_crs('EPSG:23700') # EOV is EPSG:23700
df['Area'] = df_eov.area / 10**6 # EOV coordinates are in meters, so convert m^2 to km^2
df['Density 2020'] = df['All population 2020'] / df['Area']
display(df)
df.plot(column='Density 2020', figsize=[20,10], legend=True, cmap='YlOrRd', scheme='quantiles', k=7)
plt.show()