NetCDF: Code Example#
This notebook demonstrates four common methods for reading NetCDF climate datasets using xarray. These examples correspond to the approaches outlined in the Method Overview page. The dataset files are stored in the ../data/ directory.
Included Approaches:#
Reading a single NetCDF file using xr.open_dataset()
Reading multiple files with wildcards using xr.open_mfdataset()
Reading files by year range using a filename template (e.g., YYYY)
Downsampling high-resolution data using either isel() or coarsen() (see the short sketch just below)
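The difference between the two downsampling routes matters: isel() keeps every Nth grid point, while coarsen() averages each block of points. A minimal, self-contained sketch on a synthetic array (the values and grid here are purely illustrative):
import numpy as np
import xarray as xr

# Synthetic 8 x 8 field on a regular lat/lon grid
demo = xr.DataArray(
    np.arange(64.0).reshape(8, 8),
    dims=("lat", "lon"),
    coords={"lat": np.linspace(-70, 70, 8), "lon": np.linspace(0, 315, 8)},
)

# Subsample: keep every 4th point in each direction
sub = demo.isel(lat=slice(None, None, 4), lon=slice(None, None, 4))

# Block-average: mean over each 4 x 4 block
blk = demo.coarsen(lat=4, lon=4, boundary="trim").mean()

print(sub.shape, blk.shape)  # both (2, 2), but the values differ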
import xarray as xr
import matplotlib.pyplot as plt
import os
# === PARAMETERS ===
data_dir = "../data"
file_slp = "slp.ncep.194801-202504.nc"
file_hgt_all = "hgt_ncep_daily.*.nc"
file_hgt_year = "hgt_ncep_daily.YYYY.nc"
file_sst = "sst.oisst_high.198109-202504.nc"
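Before opening anything, it can help to confirm that the expected files are actually present under data_dir; an optional sanity check using Python's standard glob module:
import glob

# Files matched by the wildcard pattern used in Approach 2
print(sorted(glob.glob(os.path.join(data_dir, file_hgt_all))))

# Single-file inputs used in Approaches 1 and 4
print(os.path.exists(os.path.join(data_dir, file_slp)),
      os.path.exists(os.path.join(data_dir, file_sst)))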
Approach 1: Reading a Single File with open_dataset()#
# Define the start and end years
ystr, yend = 1991, 2020
# === Step 1: Construct full path and open dataset
path_slp = os.path.join(data_dir, file_slp)
ds1 = xr.open_dataset(path_slp)
# === Step 2: Select time range from ystr to yend
slp = ds1["slp"].sel(time=slice(f"{ystr}-01-01", f"{yend}-12-31"))
# === Step 3: Preview result
slp
<xarray.DataArray 'slp' (time: 360, lat: 73, lon: 144)> Size: 15MB
[3784320 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 292B 90.0 87.5 85.0 82.5 ... -82.5 -85.0 -87.5 -90.0
  * lon      (lon) float32 576B 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
  * time     (time) datetime64[ns] 3kB 1991-01-01 1991-02-01 ... 2020-12-01
Attributes:
    long_name:     Sea Level Pressure
    valid_range:   [ 870. 1150.]
    units:         millibars
    precision:     1
    var_desc:      Sea Level Pressure
    level_desc:    Sea Level
    statistic:     Mean
    parent_stat:   Other
    dataset:       NCEP Reanalysis Derived Products
    actual_range:  [ 955.56085 1082.5582 ]
# Plot the first time slice
slp.isel(time=0).plot(cmap="coolwarm")
plt.title(f"Sea-Level Pressure on {str(slp.time.values[0])[:10]}")
plt.show()

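Beyond slicing in time, .sel() can also restrict the spatial domain. Note that this NCEP grid stores latitude in descending order (90 to -90), so the latitude slice must run from north to south; a minimal sketch with an arbitrary mid-latitude box:
# Select 20-60N, 120E-240E (lat is ordered 90 ... -90, hence slice(60, 20))
slp_box = slp.sel(lat=slice(60, 20), lon=slice(120, 240))
slp_box.isel(time=0).plot(cmap="coolwarm")
plt.title("Sea-Level Pressure, Mid-Latitude Box")
plt.show()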
Approach 2: Reading Multiple Files with open_mfdataset()#
This method is useful when data is split into multiple NetCDF files by year or month. Here, we combine geopotential height data from 2018–2020 using a wildcard pattern.
plev = 500
# === Step 1: Build full path pattern
hgt_path = os.path.join(data_dir, file_hgt_all)
# === Step 2: Open multiple files using wildcard
ds2 = xr.open_mfdataset(hgt_path, combine="by_coords", parallel=True)
# === Step 3: Select 500 hPa level
z500 = ds2["hgt"].sel(level=plev)
# === Step 4: Preview result
z500
<xarray.DataArray 'hgt' (time: 1096, lat: 73, lon: 144)> Size: 46MB
dask.array<getitem, shape=(1096, 73, 144), dtype=float32, chunksize=(1, 73, 144), chunktype=numpy.ndarray>
Coordinates:
    level    float32 4B 500.0
  * lat      (lat) float32 292B 90.0 87.5 85.0 82.5 ... -82.5 -85.0 -87.5 -90.0
  * lon      (lon) float32 576B 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
  * time     (time) datetime64[ns] 9kB 2018-01-01 2018-01-02 ... 2020-12-31
Attributes:
    long_name:     mean Daily Geopotential height
    units:         m
    precision:     0
    GRIB_id:       7
    GRIB_name:     HGT
    var_desc:      Geopotential height
    level_desc:    Multiple levels
    statistic:     Mean
    parent_stat:   Individual Obs
    valid_range:   [ -700. 35000.]
    dataset:       NCEP Reanalysis Daily Averages
    actual_range:  [ -523.75 32252.75]
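Note the dask.array in the preview: open_mfdataset() returns lazily loaded, chunked arrays, and nothing is read into memory until a plot or computation forces it. If the selection fits comfortably in memory, it can be loaded explicitly; a minimal sketch:
# Optionally pull the lazy dask-backed selection into memory
z500_mem = z500.load()
print(type(z500_mem.data))  # a plain numpy.ndarray once loaded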
z500.isel(time=0).plot(cmap="viridis")
plt.title("500 hPa Geopotential Height – First Time Step")
plt.show()

Approach 3: Reading Multiple NCEP Files by Year Range#
This method loads daily geopotential height (hgt) data from multiple yearly NCEP files using a specified range of years. It uses xarray.open_mfdataset to combine the data into a single dataset for easy analysis.
# === PARAMETERS ===
ystr, yend = 2018, 2020
plev = 250
years = list(range(ystr, yend + 1))
file_list = [os.path.join(data_dir, file_hgt_year.replace("YYYY", str(y))) for y in years]
ds3 = xr.open_mfdataset(file_list, combine="by_coords")
z250 = ds3["hgt"].sel(level=plev)
z250
<xarray.DataArray 'hgt' (time: 1096, lat: 73, lon: 144)> Size: 46MB
dask.array<getitem, shape=(1096, 73, 144), dtype=float32, chunksize=(1, 73, 144), chunktype=numpy.ndarray>
Coordinates:
    level    float32 4B 250.0
  * lat      (lat) float32 292B 90.0 87.5 85.0 82.5 ... -82.5 -85.0 -87.5 -90.0
  * lon      (lon) float32 576B 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
  * time     (time) datetime64[ns] 9kB 2018-01-01 2018-01-02 ... 2020-12-31
Attributes:
    long_name:     mean Daily Geopotential height
    units:         m
    precision:     0
    GRIB_id:       7
    GRIB_name:     HGT
    var_desc:      Geopotential height
    level_desc:    Multiple levels
    statistic:     Mean
    parent_stat:   Individual Obs
    valid_range:   [ -700. 35000.]
    dataset:       NCEP Reanalysis Daily Averages
    actual_range:  [ -523.75 32252.75]
z250.isel(time=0).plot(cmap="viridis")
plt.title("250 hPa Geopotential Height – First Time Step")
plt.show()

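When the yearly files are large, the level selection can be pushed into the file-opening step so that only one pressure level is combined across files; a minimal sketch using open_mfdataset's preprocess hook (select_level is an illustrative helper name):
def select_level(ds, level=plev):
    # Keep only the requested pressure level from each yearly file
    return ds.sel(level=level)

ds3_lean = xr.open_mfdataset(file_list, combine="by_coords", preprocess=select_level)
ds3_lean["hgt"]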
Approach 4: Downsampling High-Resolution SST Data#
This method demonstrates how to load a high-resolution sea surface temperature (SST) dataset and reduce its spatial resolution to speed up processing and simplify visualization. This is especially useful for global daily SST data.
We use the NOAA OISST dataset sst.oisst_high.198109-202504.nc, which has a spatial resolution of approximately 0.25°.
# === PARAMETERS ===
use_lowres = True
coarsen_factor = 4
method = "isel" # options: "isel" or "coarsen"
# === Paths ===
data_path = os.path.join(data_dir, file_sst)
lowres_file = os.path.join(data_dir, file_sst.replace(".nc", f".low.{method}.nc"))
# === Load SST ===
if use_lowres and os.path.exists(lowres_file):
    print(f"📁 Found existing low-res SST: {lowres_file}")
    ds = xr.open_dataset(lowres_file)
else:
    print(f"⚙️ Creating low-resolution SST from: {data_path}")
    ds_full = xr.open_dataset(data_path)
    if method == "isel":
        ds_lowres = ds_full.isel(
            lat=slice(None, None, coarsen_factor),
            lon=slice(None, None, coarsen_factor)
        )
    elif method == "coarsen":
        ds_lowres = ds_full.coarsen(
            lat=coarsen_factor,
            lon=coarsen_factor,
            boundary='trim'
        ).mean()
    else:
        raise ValueError("`method` must be 'isel' or 'coarsen'.")
    ds_lowres.to_netcdf(lowres_file)
    print(f"✅ Saved low-res file to: {lowres_file}")
    ds = ds_lowres
# === Final variable: sst ===
sst = ds["sst"]
sst
📁 Found existing low-res SST: ../data/sst.oisst_high.198109-202504.low.isel.nc
<xarray.DataArray 'sst' (time: 524, lat: 180, lon: 360)> Size: 136MB
[33955200 values with dtype=float32]
Coordinates:
  * time     (time) datetime64[ns] 4kB 1981-09-01 1981-10-01 ... 2025-04-01
  * lat      (lat) float32 720B -89.88 -88.88 -87.88 ... 87.12 88.12 89.12
  * lon      (lon) float32 1kB 0.125 1.125 2.125 3.125 ... 357.1 358.1 359.1
Attributes:
    long_name:      Monthly Mean of Sea Surface Temperature
    units:          degC
    valid_range:    [-3. 45.]
    precision:      2.0
    dataset:        NOAA High-resolution Blended Analysis
    var_desc:       Sea Surface Temperature
    level_desc:     Surface
    statistic:      Monthly Mean
    parent_stat:    Individual Observations
    actual_range:   [-1.8 32.14]
    standard_name:  sea_surface_temperature
# === Plot the first monthly SST map
sst.isel(time=0).plot(
cmap="coolwarm",
figsize=(10, 5)
)
plt.title(f"SST (Low-Res) — {str(sst.time.values[0])[:10]}")
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.show()

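A common follow-up with the downsampled SST is an area-weighted global-mean time series; a minimal sketch using xarray's weighted reductions with cosine-latitude weights (a standard approximation of relative grid-cell area on a regular lat/lon grid):
import numpy as np

# Cosine-latitude weights approximate relative grid-cell area
weights = np.cos(np.deg2rad(sst.lat))
sst_global = sst.weighted(weights).mean(dim=("lat", "lon"))

sst_global.plot()
plt.title("Area-Weighted Global-Mean SST (Low-Res)")
plt.show()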