Read the hungary_cities.shp
shapefile located in the data
folder. This dataset contains both scalar and spatial data of the Hungarian cities:
Source: ELTE FI, Institute of Cartography and Geoinformatics
import geopandas as gpd
cities = gpd.read_file('../data/hungary_cities.shp')
display(cities)
The correct encoding of the file should be automatically detected. In case the Hungarian characters are displayed incorrectly, you may specify the encoding manually:
cities = gpd.read_file('../data/hungary_cities.shp', encoding='latin1')
Plot the location of all Hungarian cities:
import matplotlib.pyplot as plt
%matplotlib inline
cities.plot(figsize=[15,10], color='red', markersize=4)
plt.show()
Add a raster base map to it with the contextily package:
import contextily as ctx
# Display the CRS
print(cities.crs)
# Set the CRS to EOV projection (EPSG:23700) if None
if(cities.crs == None):
cities.set_crs('epsg:23700', inplace=True)
# Display the CRS
print(cities.crs)
# Transform the GeoDataFrame to Web Mercator projection (EPSG:3857) to display correctly with the base map
ax = cities.to_crs('epsg:3857').plot(figsize=[15,10], color='red', markersize=4)
ax.set_axis_off()
ctx.add_basemap(ax)
plt.show()
NetworkX supports both the Prim and the Kruskal algorithm for building a minimum / maximum spanning tree, with a uniform interface. The default is Kruskal.
nx.minimum_spanning_tree(graph, weight, algorithm)
Step 1: Create an undirected graph with the towns as the nodes.
import networkx as nx
# Create empty, undirected graph
graph = nx.Graph()
for index, row in cities.iterrows():
graph.add_node(row['City'],
county = row['County'],
status = row['Status'],
ksh_code = row['KSH'],
location = row['geometry']
)
# Check results
print(graph.nodes['Esztergom'])
Display the location in WKT format:
print(graph.nodes['Esztergom']['location'].wkt)
Fetch the (X,Y) coordinates form the location:
print(graph.nodes['Esztergom']['location'].x)
print(graph.nodes['Esztergom']['location'].y)
Calculate the location between 2 cities with the Pythagoras theorem:
import math
def dist(loc_a, loc_b):
return math.sqrt(math.pow(loc_a.x - loc_b.x, 2) +
math.pow(loc_a.y - loc_b.y, 2))
print(dist(graph.nodes['Esztergom']['location'], graph.nodes['Budapest']['location']))
The Point type has a built-in distance()
method to do that:
print(graph.nodes['Esztergom']['location'].distance(graph.nodes['Budapest']['location']))
Step 2: Create a complete graph (add all possible edges).
import math
for city_from in graph.nodes:
location_from = graph.nodes[city_from]['location']
for city_to in graph.nodes:
location_to = graph.nodes[city_to]['location']
if city_from < city_to: # we do not need to add all edges twice
# Add edge to the graph with distance as its cost
graph.add_edge(city_from, city_to,
distance = graph.nodes[city_from]['location'].distance(graph.nodes[city_to]['location']))
# Check results
print(graph['Esztergom']['Debrecen'])
Step 3: Calculate the minimum spanning tree as a new graph.
print('Number of nodes in original graph: {0}'.format(graph.order()))
print('Number of edges in original graph: {0}'.format(graph.size()))
spanning_tree = nx.minimum_spanning_tree(graph, weight = 'distance')
print('Number of nodes in spanning tree: {0}'.format(spanning_tree.order()))
print('Number of edges in spanning tree: {0}'.format(spanning_tree.size()))
Step 4: Visualize results.
# Start new plot figure
plt.figure(figsize=[15,10])
# Plot all edges as black lines in the MST
for edge in spanning_tree.edges:
city_from = edge[0]
city_to = edge[1]
location_from = spanning_tree.nodes[city_from]['location']
location_to = spanning_tree.nodes[city_to]['location']
plt.plot([location_from.x, location_to.x], [location_from.y, location_to.y], color='black')
# Plot all cities as red dots
for city in spanning_tree.nodes:
location = spanning_tree.nodes[city]['location']
plt.plot(location.x, location.y, color='red', marker='o', markersize=2)
# Display plot
plt.show()
Alternative approach: use NetworkX to draw the plot.
# Add all city coordinates a tuples to the nodes of the graph.
for node in spanning_tree.nodes:
spanning_tree.nodes[node]['coords'] = spanning_tree.nodes[node]['location'].coords[0]
# Visualize the spanning tree, using the positions in the coords field.
plt.figure(figsize=[15,10])
nx.draw_networkx(spanning_tree, nx.get_node_attributes(spanning_tree, 'coords'), with_labels=False, node_size=0)
plt.show()