# EOF Analysis

## Empirical Orthogonal Function (EOF) Decomposition

EOF analysis decomposes a spatiotemporal dataset into orthogonal spatial patterns (EOFs) and their corresponding time series (principal components, or PCs). This is useful for identifying dominant modes of variability such as ENSO or the Pacific-North American (PNA) pattern.

This implementation applies latitude weighting and optional standardization to the anomalies, then calls the GeoCAT EOF routines, which are compatible with xarray and tolerate masked data.
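Conceptually, EOF analysis is equivalent to a singular value decomposition of the (weighted) anomaly matrix: the right singular vectors are the spatial patterns and the left singular vectors, scaled by the singular values, are the PCs. A minimal NumPy sketch of this idea, independent of GeoCAT:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy anomaly matrix: 30 "years" x 12 grid points (space flattened to one axis)
X = rng.standard_normal((30, 12))
X -= X.mean(axis=0)  # remove the time mean so rows are anomalies

# SVD of the anomaly matrix: rows of Vt are the EOFs, U*S are the PCs
U, S, Vt = np.linalg.svd(X, full_matrices=False)

eofs = Vt                        # orthonormal spatial patterns
pcs = U * S                      # corresponding time series
var_frac = S**2 / np.sum(S**2)   # fraction of variance per mode

# EOFs are mutually orthonormal, and the full set of modes reconstructs X
assert np.allclose(eofs @ eofs.T, np.eye(12))
assert np.allclose(pcs @ eofs, X)
```

GeoCAT's routines add the bookkeeping (metadata, missing-data handling) on top of this same decomposition.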


### 📌 Function: compute_eof_analysis()

```python
import numpy as np
import geocat.comp as gccomp


def compute_eof_analysis(datA, latlonEOF=(30.0, 50.0, -125.0, -100.0),
                         neval=5, normalize=False, min_coverage=0.9):
    # Ensure longitude is in the 0-360 convention
    if datA.lon.min() < 0:
        datA = datA.assign_coords(lon=((datA.lon + 360) % 360))

    # Convert the requested longitude bounds to 0-360 as well
    lon_min = latlonEOF[2] if latlonEOF[2] >= 0 else latlonEOF[2] + 360
    lon_max = latlonEOF[3] if latlonEOF[3] >= 0 else latlonEOF[3] + 360

    # Flip latitude to ascending order if needed (required for .sel slicing)
    if datA["lat"][0] > datA["lat"][-1]:
        datA = datA.isel(lat=slice(None, None, -1))

    # Subset to the analysis domain
    datA_eof = datA.sel(lat=slice(latlonEOF[0], latlonEOF[1]),
                        lon=slice(lon_min, lon_max))

    # Latitude weighting: sqrt(cos(lat)) so covariance contributions
    # scale with grid-cell area
    rad = np.pi / 180.0
    clat = np.sqrt(np.cos(datA_eof["lat"] * rad))
    clat = clat.broadcast_like(datA_eof)
    datY = datA_eof * clat

    # Optional standardization -> correlation-matrix EOFs
    if normalize:
        dat_std = datY.std(dim="year")
        datY = datY / dat_std

    datY = datY.transpose("year", "lat", "lon")

    # Mask grid points with insufficient temporal coverage
    valid_count = datY.notnull().sum(dim="year")
    required_count = int(datY.sizes["year"] * min_coverage)
    mask = valid_count < required_count
    datY = datY.where(~mask)

    # GeoCAT expects a 'time' dimension
    datY = datY.rename({"year": "time"})

    # Perform the EOF decomposition
    eofs = gccomp.eofunc_eofs(datY, neofs=neval, meta=True)
    pcs = gccomp.eofunc_pcs(datY, npcs=neval, meta=True)

    # Normalize PCs to unit variance and restore the 'year' dimension
    pcs = pcs.rename({"time": "year"})
    pcs = pcs / pcs.std(dim="year")
    pcs.attrs["varianceFraction"] = eofs.attrs.get("varianceFraction", None)

    print(f"Variance fraction explained per mode: {pcs.attrs['varianceFraction']}")

    return pcs, eofs
```

### ⚙️ Parameters

- `datA`: Input gridded anomaly field (dimensions: `year`, `lat`, `lon`)

- `latlonEOF`: Latitude/longitude domain of analysis as `(lat_min, lat_max, lon_min, lon_max)`; western longitudes may be given as negative degrees

- `neval`: Number of EOF modes to return

- `normalize`: If `True`, standardize spatial anomalies before decomposition (correlation-matrix EOFs)

- `min_coverage`: Minimum fraction of valid years required at a grid point; points below this threshold are masked out

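The `min_coverage` screening can be illustrated on a small array. This is a plain NumPy sketch of the same logic the function applies with xarray's `notnull` and `where`:

```python
import numpy as np

min_coverage = 0.9
data = np.ones((10, 3))   # 10 years x 3 grid points
data[0:2, 1] = np.nan     # point 1 has only 8/10 valid years (80%)

valid_count = np.sum(~np.isnan(data), axis=0)   # valid years per point
required = int(data.shape[0] * min_coverage)    # 9 valid years required
mask = valid_count < required                   # True where coverage is too low

data[:, mask] = np.nan  # drop under-sampled grid points entirely
print(mask)  # [False  True False]
```

Points that fail the threshold are set to NaN for all years, so they drop out of the covariance estimate rather than biasing it with a short record.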

### 📌 Example: Covariance and Correlation Matrix EOFs

```python
# Define EOF domain (Western US)
latlonEOF = (30.0, 50.0, -125.0, -100.0)

# EOF based on the covariance matrix
pcs_cov, eofs_cov = compute_eof_analysis(
    dat_YR, latlonEOF=latlonEOF, neval=5,
    normalize=False  # use raw anomalies
)

# EOF based on the correlation matrix
pcs_std, eofs_std = compute_eof_analysis(
    dat_YR, latlonEOF=latlonEOF, neval=5,
    normalize=True  # use standardized anomalies
)
```

🔍 Note: Covariance-based EOFs emphasize regions of high variance; correlation-based EOFs weight all locations equally.
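The distinction is easy to see on a hypothetical two-point field where one point has much larger variance. In this NumPy sketch, the covariance EOF is dominated by the high-variance point, while the correlation EOF splits the weight evenly:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
shared = rng.standard_normal(n)

# Two grid points driven by the same signal; point 0 has 10x the amplitude
X = np.column_stack([10.0 * shared + rng.standard_normal(n),
                     shared + rng.standard_normal(n)])
X -= X.mean(axis=0)

# Covariance EOF: decompose the raw anomalies
_, _, Vt_cov = np.linalg.svd(X, full_matrices=False)

# Correlation EOF: standardize first, so both points count equally
_, _, Vt_corr = np.linalg.svd(X / X.std(axis=0), full_matrices=False)

print(np.abs(Vt_cov[0]))   # leading pattern ~ [1.0, 0.1]: tilted toward point 0
print(np.abs(Vt_corr[0]))  # leading pattern ~ [0.71, 0.71]: equal weight
```

This is exactly what `normalize=True` changes in `compute_eof_analysis()`: dividing by the temporal standard deviation before the decomposition.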


This method is useful for understanding large-scale coherent patterns and reducing dimensionality for further analysis (e.g., regression, clustering).
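For instance, retaining only the leading modes gives a low-rank reconstruction that captures most of the variance, which is what makes PCs convenient low-dimensional inputs to regression or clustering. A NumPy sketch on synthetic anomalies with one dominant mode:

```python
import numpy as np

rng = np.random.default_rng(1)

# Anomalies with a dominant low-rank structure plus weak noise
t = np.linspace(0, 4 * np.pi, 40)
pattern = rng.standard_normal(25)
X = np.outer(np.sin(t), pattern) + 0.1 * rng.standard_normal((40, 25))
X -= X.mean(axis=0)

U, S, Vt = np.linalg.svd(X, full_matrices=False)

k = 2  # keep only the two leading modes
X_k = (U[:, :k] * S[:k]) @ Vt[:k]   # rank-k reconstruction of the field

explained = np.sum(S[:k] ** 2) / np.sum(S ** 2)
print(f"variance captured by {k} modes: {explained:.2%}")
```

Downstream analyses can then operate on the `k` PC time series instead of every grid point.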