Getting to Know the SciPy Stack

by Eric van Rees | Aug 4, 2017

This blog post introduces the Python packages that make up the SciPy stack, which is now shipped with ArcMap and available through Conda for ArcGIS Pro users.

SciPy is a Python-based ecosystem of open-source software for mathematics, science, and engineering. This stack can be used to extend the capabilities of ArcGIS and lets you do scientific computing in Python. The seven most important packages found in the SciPy stack are covered below, along with some popular use cases. The SciPy website offers extensive documentation, examples, links and tutorials for each package, so it is a great place to start. The most commonly used libraries in data science are NumPy, SciPy and Matplotlib, and these are also recommended as a starting point for novices.

ArcGIS Pro users can access the SciPy stack through the Conda package manager, while ArcMap users have all packages (except for IPython) available as these are shipped with all versions above ArcMap 10.4. The ArcGIS Python API also offers access to the SciPy stack.

  1. NumPy

NumPy adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. NumPy has been part of the ArcGIS software installation since 9.2. For this reason, you can find many NumPy entries in the ArcGIS Help documentation, such as “Working with NumPy in ArcGIS”. Together, ArcGIS and NumPy can interoperate on raster, table and feature data: for example, you can convert features, tables and rasters to NumPy arrays and back. Here's a link to the official NumPy documentation.
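To give an idea of what working with arrays looks like, here is a minimal sketch that uses only NumPy itself. The elevation values are made up for illustration and simply stand in for the cells of a small raster; no ArcGIS functions are involved.

```python
import numpy as np

# A made-up 3x4 grid of elevation values in meters (a stand-in for raster cells).
elevation = np.array([
    [12.0, 15.5, 14.2, 13.1],
    [16.8, 18.3, 17.9, 15.0],
    [11.4, 13.7, 19.6, 20.2],
])

# Whole-array statistics without writing a loop.
mean_elevation = elevation.mean()

# A boolean mask marks every cell above 16 meters.
high_cells = elevation > 16.0
count_high = int(high_cells.sum())

# Arithmetic is applied element by element: convert meters to feet.
elevation_feet = elevation * 3.28084

print(elevation.shape, mean_elevation, count_high)
```

The same pattern (build an array, mask it, apply arithmetic to every cell at once) is what makes NumPy a natural fit for raster-style data.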

  2. The SciPy library

Not to be confused with the SciPy stack itself, the SciPy library is a collection of numerical algorithms and domain-specific toolboxes, including signal processing, optimization, statistics, interpolation and much more. SciPy builds on NumPy and extends the functions NumPy provides. One subpackage in particular, scipy.spatial, offers spatial computational methods such as triangulations, Voronoi diagrams, and convex hulls of a set of points. Also of interest for spatial data users is scipy.ndimage, which enables multidimensional image processing. Here's a link to the official SciPy library documentation.
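As a rough sketch of the two subpackages mentioned above, the example below computes the convex hull of a handful of made-up points with scipy.spatial and counts connected regions in a tiny made-up grid with scipy.ndimage. The coordinates and the grid are illustrative assumptions only.

```python
import numpy as np
from scipy import ndimage
from scipy.spatial import ConvexHull

# Six made-up 2D points: four rectangle corners and two interior points.
points = np.array([
    [0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0],
    [2.0, 1.0], [1.0, 2.0],
])

hull = ConvexHull(points)
# Indices of the input points that lie on the hull boundary.
hull_indices = sorted(hull.vertices.tolist())
# For 2D input, `volume` holds the enclosed area (and `area` the perimeter).
hull_area = hull.volume

# A tiny binary grid; ndimage.label numbers each connected group of 1s.
grid = np.array([
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
])
labeled, region_count = ndimage.label(grid)

print(hull_indices, hull_area, region_count)
```

Note that ndimage.label treats only edge-sharing cells as connected by default; diagonal neighbors count as separate regions unless you pass a different structure.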

  3. Matplotlib

Matplotlib is a plotting package and API for NumPy data, which provides publication-quality 2D plotting as well as rudimentary 3D plotting. Matplotlib can be used in Python scripts, the Python and IPython shell, the Jupyter notebook, web application servers, and four graphical user interface toolkits. For code examples and an extensive image gallery to give you some ideas of what's possible, see the Matplotlib gallery.
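A minimal plotting script might look like the sketch below. It assumes you want to write the figure to a file rather than open a window, so it selects the non-interactive Agg backend; the output file name sine_wave.png is an arbitrary choice for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to a file, no window needed
import matplotlib.pyplot as plt
import numpy as np

# 200 evenly spaced x values between 0 and 2*pi.
x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x (radians)")
ax.set_ylabel("sin(x)")
ax.set_title("A simple line plot")
ax.legend()

# Write the figure to disk (file name chosen for illustration).
fig.savefig("sine_wave.png", dpi=150)
```

In a Jupyter notebook or an interactive shell you would normally leave out the matplotlib.use line and let the figure display inline or in a window.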

  4. IPython

IPython is an interactive computational environment in which you can combine code execution, rich text, mathematics, plots, and rich media. It offers an interactive Python shell and a Jupyter kernel for working with Python code in Jupyter notebooks and other interactive frontends. With this last option, your web browser acts as an IDE that gives you more control over the layout of a script or code snippet. These scripting documents are called Jupyter notebooks and were quickly adopted for presentations and tutorials. They used to be called IPython notebooks, but Jupyter became a separate project over time. Both IPython and Jupyter are installed as part of Anaconda3. For more information, see the most recent IPython documentation, the IPython website and the Jupyter website.

  5. pandas

The name pandas is derived from “panel data” (and is also a play on “Python data analysis”), and the package is built for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. A fundamental feature of the library is the DataFrame object, which treats tabular (and multi-dimensional) data as a labeled, indexed series of observations, somewhat comparable to an Excel spreadsheet. pandas aims to provide much of the data manipulation and analysis functionality that people use R for. For more information, see this link.
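To make the DataFrame idea a little more concrete, here is a small sketch. The city names and population figures are invented for the example and are not real statistics.

```python
import pandas as pd

# A made-up table: one row per city and year (values are illustrative only).
df = pd.DataFrame({
    "city": ["Utrecht", "Utrecht", "Delft", "Delft"],
    "year": [2016, 2017, 2016, 2017],
    "population": [338000, 343000, 101000, 102000],
})

# Filter rows with a boolean condition, much like a filter in a spreadsheet.
recent = df[df["year"] == 2017]

# Group by a column and aggregate, similar to a pivot table.
mean_population = df.groupby("city")["population"].mean()

print(recent)
print(mean_population)
```

Columns are addressed by label rather than by position, which is what makes the table feel closer to a spreadsheet than to a plain NumPy array.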

  6. SymPy

SymPy offers symbolic mathematics and computer algebra. Symbolic computation works with mathematical objects in symbolic form. This means that the objects are represented exactly, not approximately, and mathematical expressions with unevaluated variables are left in symbolic form. With SymPy, the square root of the number 8 does not yield an approximate decimal value, but an exact symbolic result that is also simplified symbolically:

    >>> import sympy
    >>> sympy.sqrt(8)
    2*sqrt(2)
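Building on the square-root example, the short sketch below shows a few other common symbolic operations. The expressions are arbitrary choices made for illustration.

```python
import sympy

# Declare a symbol; expressions built from it stay exact until evaluated.
x = sympy.symbols("x")

# Solve x**2 - 4 = 0 exactly.
roots = sympy.solve(x**2 - 4, x)

# Differentiate x*sin(x) with respect to x (product rule, done symbolically).
derivative = sympy.diff(x * sympy.sin(x), x)

# Ask for a numeric approximation only when you actually need one.
approx_sqrt8 = sympy.sqrt(8).evalf()

print(roots, derivative, approx_sqrt8)
```

Keeping results symbolic until the last step avoids accumulating rounding error in intermediate calculations.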

 

For more information, see this link.

  7. Nose

Nose is a framework for testing Python code that aims to improve your productivity and help you write robust code. Nose builds on the unittest framework, which is part of the Python standard library, and extends it to make testing easier. It includes a number of plugins and can be extended with third-party plugins. For more information, see the official documentation through this link and this blog post on testing code.
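Since Nose builds on unittest, a test written for one looks much like a test for the other. The sketch below deliberately uses only the standard-library unittest module so that it runs without Nose installed; the rectangle_area function is a hypothetical example, not part of any library. Nose's test discovery would normally collect both the TestCase class and the plain test_ function shown here.

```python
import unittest


def rectangle_area(width, height):
    """Hypothetical function under test: area of a rectangle."""
    if width < 0 or height < 0:
        raise ValueError("dimensions must be non-negative")
    return width * height


class RectangleAreaTest(unittest.TestCase):
    """Classic unittest style: methods starting with test_ are run as tests."""

    def test_area(self):
        self.assertEqual(rectangle_area(3, 4), 12)

    def test_negative_dimension(self):
        with self.assertRaises(ValueError):
            rectangle_area(-1, 4)


def test_zero_area():
    # Plain function style: a bare assert in a function whose name starts with test_.
    assert rectangle_area(0, 5) == 0


# Run the TestCase programmatically with the standard-library runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RectangleAreaTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With a test runner that does discovery, you would usually drop the last two lines and let the runner find the tests for you.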

Eric van Rees
Eric van Rees is a freelance writer and editor. His specialty is GIS technology. He has more than eight years of proven expertise in editing, writing and interviewing as editor and editor-in-chief for the international geospatial publication GeoInformatics, as well as GIS Magazine and CAD Magazine, both published in Dutch. Currently, he writes about geospatial technology for various clients, publications and blogs.
