import geopandas as gpd
import matplotlib.pyplot as plt
import os
%matplotlib inline
# Filepaths
= "../files/data/L6"
folder
= os.path.join(folder, "population_square_km.shp")
grid_fp = os.path.join(folder, "roads.shp")
roads_fp = os.path.join(folder, "schools_tartu.shp")
schools_fp
# Read files
= gpd.read_file(grid_fp)
grid = gpd.read_file(roads_fp)
roads = gpd.read_file(schools_fp) schools
Static maps with matplotlib
Download datasets
Before we start you need to download (and then extract) the dataset zip-package used during this lesson from this link.
You should have following Shapefiles in the Data
folder:
- population_square_km.shp
- schools_tartu.shp
- roads.shp
population_square_km.shp.xml schools_tartu.cpg
population_square_km.shx schools_tartu.csv
roads.cpg schools_tartu.dbf
roads.dbf schools_tartu.prj
L6.zip roads.prj schools_tartu.sbn
roads.sbn schools_tartu.sbx
roads.sbx schools_tartu.shp schools5.csv
population_square_km.cpg roads.shp schools_tartu.shp.xml
population_square_km.dbf roads.shp.xml schools_tartu.shx
population_square_km.prj roads.shx schools_tartu.txt.xml
population_square_km.sbn
population_square_km.sbx
population_square_km.shp
Static maps in Geopandas
We have already seen during the previous lessons quite many examples how to create static maps using Geopandas.
Thus, we won’t spend too much time repeating making such maps but let’s create one with more layers on it.
Let’s create a static neighbourhood map with roads and schools line on it and population density as a background.
First, we need to read the data.
Then, we need to be sure that the files are in the same coordinate system. Let’s use the crs of our travel time grid.
= grid.crs
gridCRS 'geometry'] = roads['geometry'].to_crs(crs=gridCRS)
roads['geometry'] = schools['geometry'].to_crs(crs=gridCRS) schools[
Let’s check the histogram first:
# Plot
="Population", bins=10)
grid.hist(column# Add title
"Amount of population km2 Tartumaa histogram")
plt.title('Population', fontsize='x-large')
plt.ylabel( plt.tight_layout()
This don’t look very clear. Let’s improve the styling of the plot a bit and also look at the log-transformed distribution:
# we use numpy to log transform
import numpy as np
# Plot the original histogram
=(9, 4))
plt.figure(figsize1, 2, 1)
plt.subplot('Population'], bins=50, alpha=0.3, color="red")
plt.hist(grid['Population', fontsize='x-large')
plt.ylabel("Original Data")
plt.title(
# The data has very strong positive skew and in order to better see what is happening in the data, we can use log-transform
# Plot the log-transformed histogram
1, 2, 2)
plt.subplot('Population']), bins=50, alpha=0.3) # np.log1p(x) is log(1 + x)
plt.hist(np.log1p(grid['Population', fontsize='x-large')
plt.ylabel("Log-Transformed Data")
plt.title(
plt.tight_layout()
Finally we can make a visualization using the .plot()
-function in Geopandas. The .plot()
function takes all the matplotlib parameters where appropriate. For example we can adjust various parameters
ax
if used, then can indicate a joint plot axes onto which to plot, used to plot several times (several layers etc) into the same plot (using the same axes, i.e. x and y coords)column
which dataframe column to plotlinewidth
if feature with an outline, or being a line feature then line widthmarkersize
size of point/marker element to plotcolor
colour for the layers/feature to plotcmap
colormaps (cmap - parameter)alpha
transparency 0-1legend
True/False show the legendscheme
one of 3 basic classification schemes (“quantiles”, “equal_interval”, “fisher_jenks”), beyond that use PySAL explicitlyk
number of classes for above scheme if used.vmin
indicate a minimal value from the data column to be considered when plotting (also affects the classification scheme), can be used to “normalise” several plots where the data values don’t aligh exactlyvmax
indicate a maximal value from the data column to be considered when plotting (also affects the classification scheme), can be used to “normalise” several plots where the data values don’t aligh exactly
import matplotlib.lines as mlines
= plt.subplots(figsize=(9, 7))
fig, ax
# Visualize the population density into 5 classes using "Quantiles" classification scheme
# Add also a little bit of transparency with `alpha` parameter
# (ranges from 0 to 1 where 0 is fully transparent and 1 has no transparency)
="Population", ax=ax, linewidth=0.03, cmap="Reds", scheme="quantiles", k=5, alpha=0.8, legend=True)
grid.plot(column
# Add roads on top of the grid
# (use ax parameter to define the map on top of which the second items are plotted)
# use the zorder to define the order of the layers.
#Try to remove zorder from roads and schools and see what happens
=ax, color="grey", linewidth=1.5, zorder=2)
roads.plot(ax
# Add schools on top of the previous map
=ax, marker='o', color="black", edgecolor="white", linewidth=0.7, markersize=14.0, zorder=3)
schools.plot(ax
# Remove the empty white-space around the axes
"Population in Tartumaa in relation to schools and major roads")
plt.title(
# Create legend line for roads
= mlines.Line2D([], [], color='gray', label='Roads', linewidth=1)
gray_line
# Create legend marker for schools
= mlines.Line2D([], [], color='black', marker='o', linestyle='None', markersize=4,
black_marker ='white', markeredgewidth=1, label='Schools')
markeredgecolor
# add annotations example (some "gymnasium" aka highschools names only, because too crowded)
for idx, row in schools.iterrows():
if "gümnaasium" in row['name'].lower() and not "Tartu linn" in row['Aadress']:
# take only first part of the name
= row['name'].split(" ")[0]
label 'X']+700, row['y']+700, label, size=10, color='black', bbox=dict(boxstyle='round', pad=0, fc='white', ec='none'), zorder=5)
plt.text(row[
# Combine legend elements
= list(ax.get_legend().legend_handles) + [gray_line, black_marker]
legend_elements = [t.get_text() for t in ax.get_legend().texts] + ['Roads', 'Schools']
legend_labels
# Add the combined legend with bottom right position
='lower right')
ax.legend(legend_elements, legend_labels, loc
'Northing in EPSG:3301 Estonian grid (metres)')
ax.set_ylabel('Easting in EPSG:3301 Estonian grid (metres)')
ax.set_xlabel(
plt.tight_layout()
= "static_map.png"
outfp =300) plt.savefig(outfp, dpi
This kind of approach can be used really effectively to produce large quantities of nice looking maps (though this example of ours isn’t that pretty yet, but it could be) which is one of the most useful aspects of coding and what makes it so important to learn how to code.
Special plots types with Geoplot
https://residentmario.github.io/geoplot/gallery/index.html
From the website:
Geoplot is a high-level Python geospatial plotting library designed for data scientists and geospatial analysts that just want to get things done. It is an extension to cartopy and matplotlib which makes mapping easy.
We might have to install additional packages.
micromamba install -c conda-forge contextily geoplot
Geoplot provides a few handy plot types, that would be very complicated to manually describe in matplotlib. In the following some examples are shown, that use data from the earlier lessons.
Using web tile layer as background and mapclassify directly
We want to create a choropleth plot and use one of those shiny web basemaps in a static plot.
Then of course we want to provide a reclassified visualisation of our polygon data, in this case the population in administrative units. Let’s load the dataset and, again, convert that column into numbers.
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
= "../files/data/L4/population_admin_units.shp"
fp
= gpd.read_file(fp)
acc 'pop_int'] = acc['population'].apply(pd.to_numeric, errors='coerce').dropna()
acc[= acc.to_crs(4326) acc_wgs84
Now let’s import a couple useful helpers here and get going with Geoplot, which ties these things nicely together:
For the background layer we can use a small package, named contextily that provides that functionality. For the classification we use of course mapclassify.
https://contextily.readthedocs.io/en/latest/
import geoplot as gplt
import geoplot.crs as gcrs
import mapclassify as mc
import contextily as ctx
The Geoplot library works a bit different in that way, that it takes over the axes object. We sort of have to pass it around downstream:
# receive the **ax** object from a ``geoplot gplt.*`` function
= gplt.webmap(df=acc_wgs84, projection=gcrs.WebMercator(), zoom=8, provider=ctx.providers.CartoDB.Positron, figsize=(10,8))
ax
# create classifier
= mc.FisherJenks(acc_wgs84[['pop_int']], k=5)
scheme
# create choropleth plot, inserting the classifieras scheme and the ax object like we are used to
gplt.choropleth(='pop_int', projection=gcrs.WebMercator(),
acc_wgs84, hue='white', linewidth=1, alpha=0.7,
edgecolor='viridis', legend=True,
cmap=ax,
ax=scheme
scheme
)
# set a title
"Population per municipality", fontsize=20) plt.title(
Text(0.5, 1.0, 'Population per municipality')
GeoPlot cartogram, using a variable as scale for size of polygon
I’d like to demonstrate the cartographic method of a cartogram.
A cartogram distorts (grows or shrinks) polygons on a map according to the magnitude of some input data. A basic cartogram specifies data, a projection, and a scale parameter.
https://residentmario.github.io/geoplot/api_reference.html
Here I also show, that you can have a dark background, is a option to set in matplotlib. Strong colors and contrasts work well on a dark background.
# set dark background for matplotlib-based plots
'dark_background')
plt.style.use(
# again, we need to take the ax object with us to the next plot
# we also reuse the scheme mc classifier
# hue tells from which column to take the value
= gplt.cartogram(
ax ='pop_int', projection=gcrs.AlbersEqualArea(),
acc_wgs84, scale=True, legend_kwargs={'bbox_to_anchor': (1, 0.9)}, legend_var='hue',
legend='pop_int',
hue=scheme,
scheme='viridis',
cmap=(0.5, 1),
limits=(9, 7)
figsize
)# for comparison we also plot the original size boundaries
='white', edgecolor='white', alpha=0.8, ax=ax)
gplt.polyplot(acc_wgs84, facecolor
"Population per municipality as cartogram", fontsize=20) plt.title(
Text(0.5, 1.0, 'Population per municipality as cartogram')
GeoPlot making a static KDE plot as contour or filled
Geoplot provides a variety of practical use-case-based plot types. The last one I’d like to show here is the KDEplot.
Kernel density estimation (KDE) is a technique that non-parameterically estimates a distribution function for a sample of point observations. It is not a heatmap as such, as we don’t scale based on parameter values from the point data, but only the number of points close together. KDEs are a popular tool for analyzing data distributions.
A basic kdeplot takes pointwise data as input. For spatial context, let’s also plot an underlying basemap. For completeness we also set title and save the plot.
https://residentmario.github.io/geoplot/api_reference.html
# reset the background to default / white again
'default') plt.style.use(
The underlying webmap plot function has a few limitations, for example that it interpretes geomtries as WGS84 rto calculate extent etc. Also, the webmap only plots tiles in WebMercator projection. So we reproject the data to WGS84.
import geopandas as gpd
# porijogi_sub_catchments
= "../files/data/L3/porijogi_sub_catchments.geojson"
catchments_fp = gpd.read_file(catchments_fp, driver='GeoJSON')
catchments
# protected species under class 3 monitoring sightings
= "../files/data/L3/category_3_species_porijogi.gpkg"
species_fp = gpd.read_file(species_fp, layer='category_3_species_porijogi', driver='GPKG')
species_data
= catchments.to_crs(4326)
c4 = species_data.to_crs(4326) s4
/usr/lib64/python3.13/site-packages/pyogrio/raw.py:198: RuntimeWarning: driver GeoJSON does not support open option DRIVER
return ogr_read(
/usr/lib64/python3.13/site-packages/pyogrio/raw.py:198: RuntimeWarning: driver GPKG does not support open option DRIVER
return ogr_read(
Now we build up the whole plot example. We can initialize our matplotlib plt handle, but not the ax object. As you see, we have to take the ax object from each function and pass it to the next. With the plt handle we have the context to save the image to a file.
= plt.figure()
fig
= gcrs.WebMercator()
proj
= gplt.webmap(df=gpd.overlay(c4, s4, how='union'), projection=proj,
ax =10, provider=ctx.providers.CartoDB.Voyager,
zoom=(9, 7))
figsize
= gplt.pointplot(s4, projection=proj, color='green', edgecolor='white', linewidth=0.5, alpha=0.8, ax=ax)
ax
= gplt.kdeplot(s4, projection=proj, cmap='YlOrRd', fill=False, linewidths=5, alpha=0.7, ax=ax)
ax
"Distribution of recorded specimen in south Tartu county", fontsize=18)
ax.set_title(
plt.tight_layout()# plt.savefig("species-kde.png", bbox_inches='tight')
<Figure size 640x480 with 0 Axes>
The KDEplot function provides a fill parameter, that allows us to switch between contour lines and filled contours.
= gcrs.AlbersEqualArea()
proj
= gplt.pointplot(s4, projection=proj, color='green', edgecolor='white', linewidth=0.5, alpha=0.8, figsize=(9,7))
ax
= gplt.polyplot(c4, projection=proj, edgecolor='black', linewidth=2, alpha=0.7, ax=ax)
ax
= gplt.kdeplot(s4, projection=proj, cmap='YlOrRd', fill=True, alpha=0.7, ax=ax)
ax
"Distribution of recorded specimen in south Tartu county", fontsize=18)
ax.set_title(
plt.tight_layout()# plt.savefig("species-kde-shaded.png", bbox_inches='tight')
Task:
Try to change your plotting parameters, colors and colormaps and see how your results change! Change the order of plotting the layers and vector plotting criteria and see how they change the results.
Download the notebook:
Launch in the web/MyBinder: