Multidimensional Arrays: Reference

Key Points

datasets for the xarray tutorial	refer to this page for access to the tutorial data
Introduction to multidimensional arrays	unlabelled, N-dimensional arrays of numbers (e.g. NumPy’s ndarray) are the most widely used data structure in scientific computing. These arrays lack meaningful metadata, so users must track indices in an arbitrary fashion. In-memory operations, needed to process and visualize large arrays, are reaching limits as datasets grow in size.
xarray architecture	xarray is built on the netCDF data model. xarray has two main data structures: DataArray and Dataset. DataArrays store the multi-dimensional arrays. Datasets are the multi-dimensional equivalent of a Pandas DataFrame.
label-based indexing	xarray’s labeled dimensions free the user from having to track the positional ordering of dimensions when accessing data, which simplifies the workflow
plotting	xarray has plotting functionality that is a thin wrapper around the Matplotlib library. xarray uses syntax and function names from Matplotlib whenever possible.
arithmetic and aggregation	xarray’s labeled dimensions simplify arithmetic and data aggregation, because operations can refer to dimensions by name rather than by axis position
groupby processing	xarray provides Pandas-like methods for performing data aggregation over defined groupings in the data
out-of-core computation	dask integration with xarray allows you to work with large datasets that “fit on disk” rather than having to “fit in memory”. It is important to chunk the data correctly for this to work.
masking	xarray provides tools for creating and analyzing masked data.
Wrap-Up	A summary of everything so far
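
Several of the key points above (DataArray and Dataset, label-based indexing, arithmetic and aggregation, groupby, and masking) can be illustrated with a minimal sketch. It uses a small synthetic array rather than the tutorial data, so the variable name temp, the coordinate values, and the threshold are all assumptions made for illustration. Plotting and dask are left out because they need optional dependencies.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic data (hypothetical values, not the tutorial dataset).
time = pd.date_range("2000-01-01", periods=4, freq="15D")  # three dates in January, one in February
lat = [10.0, 20.0]
lon = [100.0, 110.0, 120.0]
values = np.arange(24, dtype=float).reshape(4, 2, 3)

# A DataArray stores one multi-dimensional array with named dimensions and coordinates.
temp = xr.DataArray(
    values,
    dims=("time", "lat", "lon"),
    coords={"time": time, "lat": lat, "lon": lon},
    name="temp",
)

# A Dataset groups one or more DataArrays that share coordinates.
ds = xr.Dataset({"temp": temp})

# Label-based indexing: select by coordinate value, not by axis position.
point = ds["temp"].sel(lat=20.0, lon=110.0)

# Aggregation: reduce over a dimension by name.
time_mean = temp.mean(dim="time")

# Arithmetic: operands are aligned and broadcast by dimension name.
anomaly = temp - time_mean

# Groupby: Pandas-like split-apply-combine over calendar month.
monthly = temp.groupby("time.month").mean(dim="time")

# Masking: keep values above a threshold; everything else becomes NaN.
masked = temp.where(temp > 10)

print(point.values)
print(monthly.sizes)
print(int(masked.count()))
```

Because every operation names the dimension it acts on, the same code would keep working if the array were stored with its dimensions in a different order.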

FIXME: more reference material.