Reading and Writing Files

Xarray supports direct serialization and I/O for several file formats, from simple pickle files to the more flexible netCDF and HDF5 formats (recommended).

You can read different types of files with xr.open_dataset by specifying which engine to use:

xr.open_dataset("example.nc", engine="netcdf4")

The “engine” provides a set of instructions that tells xarray how to read the data and pack them into a Dataset (or DataArray). These instructions are stored in an underlying “backend”.
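As a rough illustration of naming the engine explicitly, the sketch below writes a small dataset to a temporary netCDF file and reads it back. The dataset contents, the variable names, and the order in which engines are tried are assumptions made for this example, not something xarray prescribes.

```python
import tempfile
from pathlib import Path

import numpy as np
import xarray as xr

# Small illustrative dataset; the variable and coordinate names are made up.
ds = xr.Dataset(
    {"temperature": (("time", "x"), np.arange(6.0).reshape(2, 3))},
    coords={"time": [0.0, 1.0], "x": [10.0, 20.0, 30.0]},
)

# Use whichever netCDF-capable engine is installed in this environment.
installed = xr.backends.list_engines()
engine = next(
    (name for name in ("netcdf4", "h5netcdf", "scipy") if name in installed),
    None,
)
if engine is None:
    raise RuntimeError("No netCDF engine found; install netCDF4, h5netcdf, or scipy.")

with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / "example.nc"
    ds.to_netcdf(path, engine=engine)

    # Name the engine explicitly when reading, then load the data into
    # memory so the file can be closed before the directory is removed.
    with xr.open_dataset(path, engine=engine) as on_disk:
        roundtrip = on_disk.load()

print(f"engine={engine}, identical values: {roundtrip.equals(ds)}")
```

Loading inside the `with` block matters here because xarray reads lazily by default; without it, the data would still point at a file that no longer exists.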

Xarray comes with several backends that cover many common data formats. Many more backends are available via external libraries, or you can write your own. This diagram aims to help you determine, based on the format of the file you’d like to read, which type of backend you’re using and how to use it.

Text and boxes are clickable for more information. Following the diagram is detailed information on many popular backends. You can learn more about using and developing backends in the Xarray tutorial JupyterBook.

        ---
config:
  theme: base
  themeVariables:
    fontSize: 20px
    lineColor: '#e28126'
    primaryBorderColor: '#59c7d6'
    primaryColor: '#fff'
    primaryTextColor: '#fff'
    secondaryColor: '#767985'

---
flowchart LR
    built-in-eng["`**Is your data stored in one of these formats?**
        - netCDF4
        - netCDF3
        - Zarr
        - DODS/OPeNDAP
        - HDF5
        `"]

    built-in("`**You're in luck!** Xarray bundles a backend to automatically read these formats.
        Open data using <code>xr.open_dataset()</code>. We recommend
        explicitly setting engine='xxxx' for faster loading.`")

    installed-eng["""<b>One of these formats?</b>
        - <a href='https://github.com/ecmwf/cfgrib'>GRIB</a>
        - <a href='https://tiledb-inc.github.io/TileDB-CF-Py/documentation'>TileDB</a>
        - <a href='https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html#rioxarray'>GeoTIFF, JPEG-2000, etc. (via GDAL)</a>
        - <a href='https://www.bopen.eu/xarray-sentinel-open-source-library/'>Sentinel-1 SAFE</a>
        """]

    installed("""Install the linked backend library and use it with
        <code>xr.open_dataset(file, engine='xxxx')</code>.""")

    other["`**Options:**
        - Look around to see if someone has created an Xarray backend for your format!
        - <a href='https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html'>Create your own backend</a>
        - Convert your data to a supported format
        `"]

    built-in-eng -->|Yes| built-in
    built-in-eng -->|No| installed-eng

    installed-eng -->|Yes| installed
    installed-eng -->|No| other

    click built-in-eng "https://docs.xarray.dev/en/stable/get-help/faq.html#how-do-i-open-format-x-file-as-an-xarray-dataset"


    classDef quesNodefmt font-size:12pt,fill:#0e4666,stroke:#59c7d6,stroke-width:3
    class built-in-eng,installed-eng quesNodefmt

    classDef ansNodefmt font-size:12pt,fill:#4a4a4a,stroke:#17afb4,stroke-width:3
    class built-in,installed,other ansNodefmt

    linkStyle default font-size:18pt,stroke-width:4
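
Before choosing a branch of the diagram, it can help to check which engines are actually usable in the current environment. The sketch below uses `xr.backends.list_engines()`; the mapping from the formats in the first box to engine names (`scipy` for netCDF3, `pydap` for DODS/OPeNDAP, `h5netcdf` for HDF5-based netCDF) is this example's reading of the diagram and should be treated as an assumption.

```python
import xarray as xr

# Engines that xarray can use right now, keyed by engine name.
installed = xr.backends.list_engines()

# Assumed engine names for the formats in the first box of the diagram.
builtin = ["netcdf4", "scipy", "zarr", "pydap", "h5netcdf"]

status_by_engine = {
    name: ("available" if name in installed else "not installed")
    for name in builtin
}
for name, status in status_by_engine.items():
    print(f"{name}: {status}")

# Anything else in the listing comes from xarray itself or from
# externally installed backend libraries.
others = sorted(set(installed) - set(builtin))
print("other engines:", others)
```

An engine reported as "not installed" usually means its underlying library is missing, not that xarray lacks support for the format.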
    

Organization

This documentation is organized into separate sections for each major file format:

  • netCDF and HDF5: netCDF and HDF5 file formats, including complex data types

  • Zarr: Zarr format for cloud-optimized array storage

  • Other File Formats: Additional formats including Iris, pickle, and various other backends

Each section provides detailed examples and best practices for working with that specific format.