Introduction to Python GIS
Why Python for GIS?
Python is an extremely useful language to learn for GIS, since many (or most) GIS software packages (such as ArcGIS, ArcGIS Pro and QGIS) provide an interface for doing analysis with Python scripting. During this course, we will mostly focus on doing GIS without any third-party software such as ArcGIS. Why? There are several reasons for doing GIS using Python without any additional software:
- Everything is free: you don’t need to buy any expensive licenses (e.g. for ESRI software)
- You will learn and understand much more deeply how different geoprocessing operations work
- Python is quite efficient and is used for analysing big data
- Python is highly flexible and supports a very wide range of data formats
- Using Python (or any other open-source programming language) supports open-source software and open science by making it possible for everyone to reproduce your work free of charge
- You can plug in and chain together different third-party software to build e.g. a fancy web-GIS application (using e.g. GeoDjango with PostGIS as a back-end)
Learning objectives
At the end of the course you should:
- know basic concepts, skills, and tools for working with the Python and R scripting environments
- receive an overview of practical Python (and R) libraries for everyday scientific and professional GIS use
- understand how to use the Python (and R) environments that are integrated into other software packages
- be able to apply Python (and R) to solve common data-related tasks in concrete GIS projects
- be competent in using spatial and non-spatial data to answer a research question
- know how to conduct and automate different standard GIS-related tasks that support clear documentation of methods in the Python (and R) scripting environments
In particular, that translates to the following concrete tasks for Python during the next lessons:
- Read / write spatial data from/to different file formats for vector and raster data
- Deal with different projections
- Do different geometric operations and geocoding
- Reclassify your data based on different criteria
- Do spatial queries
- Do simple spatial analyses and processing
- Visualize data and create (interactive) maps
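To give a feel for two of the tasks listed above, projections and spatial queries, here is a minimal sketch that uses only the Python standard library. It is not how the course libraries implement these operations; in practice you would rely on packages such as pyproj and shapely. The function names, the rectangular study area and the approximate city coordinates are assumptions made up for this illustration. The projection is the spherical Web Mercator formula, and the query is a simple ray-casting point-in-polygon test.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (WGS84 semi-major axis)


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def point_in_polygon(point, polygon):
    """Ray-casting test: True if point is inside polygon (a list of (x, y))."""
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


# Hypothetical study area (lon/lat corners) and two approximate city locations
study_area = [(24.0, 60.0), (26.0, 60.0), (26.0, 61.0), (24.0, 61.0)]
helsinki = (24.94, 60.17)
stockholm = (18.07, 59.33)

# Project everything to metres first, then run the spatial query
area_m = [lonlat_to_web_mercator(lon, lat) for lon, lat in study_area]
inside_flags = [
    point_in_polygon(lonlat_to_web_mercator(*pt), area_m)
    for pt in (helsinki, stockholm)
]
print(inside_flags)  # prints [True, False]
```

The point of the sketch is the workflow: bring all data into the same coordinate reference system before asking spatial questions of it. Dedicated libraries handle many more projections and edge cases, such as points that lie exactly on a boundary.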
What sort of tools are available for doing GIS in pure Python?
You might have already used a few Python modules for different tasks, such as numpy for doing mathematical calculations or matplotlib for visualizing your data. From now on, we will familiarize ourselves with a number of other Python modules that are useful when working with spatial data in different GIS analysis tasks.
One drawback compared to using a dedicated GIS software package such as ArcGIS Pro is that the GIS tools are spread across different Python modules created by different developers. This means that you need to familiarize yourself with many different modules (and their documentation), whereas e.g. in ArcGIS everything is packaged under a single module called arcpy.
If you use QGIS (highly recommended), you might want to check out how to use Python in QGIS via the Python console.
Below we have listed the most crucial modules (and links to their docs) that help you get going when doing data analysis or GIS in Python. If you are interested, or when you start using these modules in your own work, you should read the documentation on the web pages of the module that you need:
GIS, Geospatial Data analysis & visualization packages used in this course:
- Numpy - Fundamental package for scientific computing with Python
- Pandas - High-performance, easy-to-use data structures and data analysis tools
- Matplotlib - Basic plotting library for Python
- Shapely - Python package for manipulation and analysis of planar geometric objects (based on the widely deployed GEOS library).
- Geopandas - Working with geospatial data in Python made easier; combines the capabilities of pandas and shapely.
- Fiona - Reading and writing spatial data (lower-level API used as a basis by geopandas).
- Pyproj - Performs cartographic transformations and geodetic computations (based on PROJ).
- GDAL - Fundamental package for processing vector and raster data formats (many modules below depend on this). Used for raster processing.
- Rasterio - Clean and fast geospatial raster I/O for Python, together with the Rasterstats library, which is built on top of Rasterio.
- Pysal - Library of spatial analysis functions written in Python.
- EarthPy - A higher-level API package that integrates tools for working with Earth data
- Cartopy - Makes drawing maps for data analysis and visualisation as easy as possible.
- Geoplot - High-level geospatial data visualization library for Python.
- GeoViews - Interactive maps for the web.
- Holoviews Panel - Interactive dashboards for Python.
- Folium - Makes it easy to visualize data that’s been manipulated in Python on an interactive Leaflet map.
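Shapely, listed above, works with planar geometric objects. As a rough illustration of what a planar geometric operation involves, the sketch below computes the area and centroid of a simple polygon with the shoelace formula, using plain Python only. This is not Shapely's API or implementation; the function name and the example rectangle are hypothetical and chosen just for this illustration.

```python
def polygon_area_and_centroid(coords):
    """Return (area, (cx, cy)) of a simple polygon given as a list of (x, y).

    Uses the shoelace formula; works for either vertex orientation.
    """
    twice_area = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]  # wrap around to close the ring
        cross = x1 * y2 - x2 * y1
        twice_area += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid denominator is 6 * signed area = 3 * twice_area
    cx /= 3.0 * twice_area
    cy /= 3.0 * twice_area
    return abs(twice_area) / 2.0, (cx, cy)


# Hypothetical 4 x 3 rectangle in planar (projected) coordinates
rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
area, centroid = polygon_area_and_centroid(rectangle)
print(area, centroid)  # prints 12.0 (2.0, 1.5)
```

Note that the result is only meaningful in a planar coordinate system: running the same calculation on raw longitude/latitude values would give an "area" in square degrees.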
Additional packages you could explore independently after the course:
- OSMnx - Python for street networks. Retrieve, construct, analyze, and visualize street networks from OpenStreetMap
- Networkx - Network analysis and routing in Python (e.g. Dijkstra and A* algorithms), see this post.
- Geopy - Geocoding library: address to coordinates and coordinates to address.
- OWSLib - Data access with Open Geospatial Consortium (OGC) web services (hence OWS)
- Bokeh - Interactive visualizations for the web (also maps)
- Plotly - Interactive visualizations (also maps) for the web (commercial - free for educational purposes)
- Scipy - A collection of numerical algorithms and domain-specific toolboxes, including signal processing, optimization and statistics
- Scipy.spatial - Spatial algorithms and data structures.
- Datashader - Graphics pipeline system for creating representations of very large datasets quickly and flexibly, building on Holoviews/Geoviews
- Rtree - Spatial indexing for Python for quick spatial lookups.
- Dash - Python framework for building analytical web applications.
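The routing mentioned for Networkx above relies on shortest-path algorithms such as Dijkstra's. The following is a minimal standard-library sketch of that algorithm, not Networkx's API or implementation. The function name, the dictionary-based graph layout and the tiny street network are assumptions invented for this illustration.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (total_cost, path) for the cheapest route from source to target.

    graph maps each node to a dict of {neighbour: edge_cost}.
    Returns (inf, []) if the target cannot be reached.
    """
    queue = [(0.0, source, [source])]  # (cost so far, node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue  # a cheaper route to this node was already processed
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical street network: nodes are junctions, weights are lengths in metres
streets = {
    "A": {"B": 100, "C": 250},
    "B": {"A": 100, "C": 120, "D": 300},
    "C": {"A": 250, "B": 120, "D": 90},
    "D": {"B": 300, "C": 90},
}
cost, route = dijkstra(streets, "A", "D")
print(cost, route)  # prints 310.0 ['A', 'B', 'C', 'D']
```

In a real project the graph would typically come from actual street data (for example retrieved with OSMnx) and the routing would be done with the library's own functions rather than a hand-written loop.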