Data Analysis (Monthly)#
This chapter introduces common techniques for analyzing monthly climate datasets using Python. These techniques include:
Computing monthly climatology, anomalies, and standard deviation
Calculating seasonal, annual, and water year averages
Calculating long-term trends and removing them (detrending)
Performing correlation and regression analysis
Conducting Empirical Orthogonal Function (EOF) analysis
Required Python Packages#
The following packages are used throughout this monthly data analysis workflow. Make sure they are installed in your Python environment.
Packages#
General Packages#
import pandas as pd
import xarray as xr
import numpy as np
import os
import ipynbname
GeoCAT Tools#
import geocat.comp as gccomp
import geocat.viz as gv
import geocat.viz.util as gvutil
Visualization#
import cmaps
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import shapely.geometry as sgeom
Matplotlib Essentials#
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.patches as mpatches
import matplotlib.dates as mdates
from matplotlib.colors import ListedColormap, BoundaryNorm
from matplotlib.ticker import MultipleLocator
Specialized Layouts#
import matplotlib.gridspec as gridspec
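Before running the rest of the workflow, it can help to confirm that the modules imported above are actually available. The following is a minimal sketch using only the standard library. The `REQUIRED_MODULES` list and the `find_missing` helper are illustrative names, not part of any of the packages above, and the list is simply copied from the import statements in this section.

```python
import importlib.util

# Top-level modules imported in this chapter (copied from the import list above).
REQUIRED_MODULES = [
    "pandas", "xarray", "numpy", "ipynbname",
    "geocat.comp", "geocat.viz", "cmaps",
    "cartopy", "shapely", "matplotlib",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "geocat") is absent
            # or when a module has no usable import spec.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

Note that the name used to install a package can differ from the name used to import it, so check each project's installation instructions for any module reported as missing.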
Reading the Dataset#
Before we begin our analysis, we load the NetCDF dataset containing monthly sea surface temperature.
# --- Parameter setting ---
data_dir = "../data"
fname = "sst.cobe2.185001-202504.nc"
ystr, yend = 1991, 2020
fvar = "sst"
# --- Reading NetCDF Dataset ---
# Construct full path and open dataset
path_data = os.path.join(data_dir, fname)
ds = xr.open_dataset(path_data)
# Extract the variable
var = ds[fvar]
# Ensure dimensions are (time, lat, lon)
var = var.transpose("time", "lat", "lon", missing_dims="ignore")
# Ensure latitude is ascending
if var.lat.values[0] > var.lat.values[-1]:
    var = var.sortby("lat")
# Ensure time is in datetime64 format
if not np.issubdtype(var.time.dtype, np.datetime64):
    try:
        var["time"] = xr.decode_cf(ds).time
    except Exception as e:
        raise ValueError("Time conversion to datetime64 failed: " + str(e))
# === Select time range ===
dat = var.sel(time=slice(f"{ystr}-01-01", f"{yend}-12-31"))
print(dat)
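The orientation checks and the time selection above can also be tried without the data file. The sketch below builds a small synthetic `DataArray` (random values, not real SST) whose dimensions are deliberately out of order and whose latitude runs north to south, then applies the same steps. The names `demo` and `subset`, the grid, and the date range are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the SST variable (random values, not real data):
# 4 years of monthly fields on a coarse grid.
time = pd.date_range("1990-01-01", periods=48, freq="MS")
lat = np.array([60.0, 30.0, 0.0, -30.0, -60.0])  # descending on purpose
lon = np.array([0.0, 90.0, 180.0, 270.0])
rng = np.random.default_rng(0)

demo = xr.DataArray(
    rng.normal(size=(lat.size, lon.size, time.size)),
    coords={"lat": lat, "lon": lon, "time": time},
    dims=("lat", "lon", "time"),  # deliberately not (time, lat, lon)
    name="sst",
)

# Same steps as in the loading code above
demo = demo.transpose("time", "lat", "lon", missing_dims="ignore")
if demo.lat.values[0] > demo.lat.values[-1]:
    demo = demo.sortby("lat")

# Label-based slicing on time includes both end points
subset = demo.sel(time=slice("1991-01-01", "1992-12-31"))

print(subset.dims)
print(subset.lat.values)
print(subset.sizes["time"], "monthly time steps selected")
```

Running the checks on a small array like this makes it easy to see that the dimension order and the latitude direction are normalized before any time slicing takes place.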
Topics Covered in This Module#
This module will walk through the following analysis techniques in sequence:
Monthly climatology
Calculate the average conditions for each calendar month over a baseline period, typically 30 years or more. This forms the foundation for identifying departures from typical conditions.
Monthly anomalies
Determine the difference between observed monthly values and the corresponding climatological mean. Anomalies highlight deviations from normal conditions.
Monthly standard deviation
Quantify the typical spread or variability of monthly anomalies, helping to identify regions or times with greater fluctuation.
Seasonal, annual, and water year averages
Aggregate data across defined periods:
Seasonal (e.g., DJF, JJA) averages capture variability tied to meteorological seasons.
Annual averages summarize conditions over each calendar year.
Water year averages (October to September) are commonly used in hydrology and water resource planning.
Long-term trends
Assess changes over time using linear regression or other methods. Useful for detecting climate change signals.
Detrending
Remove long-term trends from a dataset to focus on interannual or decadal variability. Often a preprocessing step for EOF or correlation analysis.
Correlation and regression analysis
Examine the statistical relationships between variables. Correlation maps can reveal spatial coherence with a reference time series, while regression helps quantify the magnitude of the response.
Empirical Orthogonal Function (EOF) analysis
Decompose spatiotemporal fields into orthogonal modes of variability. Useful for identifying dominant patterns such as ENSO, the North Atlantic Oscillation, or the Pacific Decadal Oscillation.
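As a preview of how the first few steps in this list fit together, here is a NumPy-only sketch on synthetic data. It computes a monthly climatology, anomalies, the monthly standard deviation, and a least-squares linear trend that is then removed. The series, its seasonal amplitude, the imposed trend, and all variable names are assumptions chosen for illustration. This is not the workflow used in the later sections, which operate on the xarray objects loaded above.

```python
import numpy as np

rng = np.random.default_rng(42)
n_years, n_points = 30, 4
n_months = n_years * 12
t = np.arange(n_months)  # time index in months

# Synthetic monthly series (not real data):
# mean + seasonal cycle + linear trend + noise at a few "grid points"
seasonal = 3.0 * np.cos(2 * np.pi * (t % 12) / 12)
trend_per_month = 0.002
signal = 15.0 + seasonal + trend_per_month * t
data = signal[:, None] + rng.normal(0.0, 0.3, (n_months, n_points))

# 1. Monthly climatology: mean of each calendar month over all years
by_year = data.reshape(n_years, 12, n_points)
clim = by_year.mean(axis=0)  # shape (12, n_points)

# 2. Monthly anomalies: observed value minus the matching climatological mean
anom_by_year = by_year - clim
anom = anom_by_year.reshape(n_months, n_points)

# 3. Monthly standard deviation of the anomalies (sample estimate)
std = anom_by_year.std(axis=0, ddof=1)  # shape (12, n_points)

# 4. Linear trend by least squares (one fit per column), then detrending
slope, intercept = np.polyfit(t, anom, deg=1)  # each has shape (n_points,)
detrended = anom - (slope * t[:, None] + intercept)

print("Climatology shape:", clim.shape)
print("Estimated trend per decade:", slope * 120)
```

Reshaping to `(years, 12, points)` works here only because the synthetic record starts in January and contains whole years. With real data, grouping by calendar month is the safer approach.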
Note: Instructions for accessing and preparing datasets are provided in the File I/O chapter.
Next: Climatology