File IO
=======

Examples
--------

Download Sample Files
^^^^^^^^^^^^^^^^^^^^^

Read CSV Example

* :download:`read_csv-v1.py `
* :download:`GHCN_sample.csv `

.. image:: SRC/read_csv-v1.png
   :width: 200

Read/Write CSV Example

* :download:`rw_csv-v2.py `
* :download:`GHCN_sample.csv `

.. image:: SRC/rw_csv-v2.png
   :width: 200

Read NetCDF Example

* :download:`read_netcdf-v2.py `
* :download:`oisst_monthly.nc `
* :download:`lsmask.nc `

.. image:: SRC/read_netcdf-v2.png
   :width: 200

Read multiple NetCDF Example

* :download:`read-multi_netcdf-v2.py `
* :download:`hgt_ncep_daily.2018.nc `
* :download:`hgt_ncep_daily.2019.nc `
* :download:`hgt_ncep_daily.2020.nc `

.. image:: SRC/read-multi_netcdf-v2.png
   :width: 200

Read/Write NetCDF Example

* :download:`rw_netcdf-v2.py `
* :download:`oisst_monthly.nc `
* :download:`lsmask.nc `

.. image:: SRC/rw_netcdf-v2.png
   :width: 200

.. _ioCSV:

CSV file
--------

CSV is a typical format for weather station data. The name stands for "comma-separated values", and the file is stored as plain text (ASCII). A CSV file generally consists of a data description (called a header) followed by the data values. The Pandas package provides an easy way to read CSV files.

An example of CSV file
^^^^^^^^^^^^^^^^^^^^^^

Let's try to read the CSV file :download:`GHCN_sample.csv `. This CSV file includes a note on Lines 1-15 and a header on Line 16. From Line 17 onward, the data values are separated by commas. The value "M" indicates a missing value. The data contain day, precipitation, multi-day precipitation, snow depth, snowfall, minimum temperature, maximum temperature, and reference evapotranspiration.

.. literalinclude:: SRC/GHCN_sample.csv
   :language: text
   :lines: 1-20
   :linenos:

An example of Python script
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let's check the sample file, :download:`rw_csv-v2.py `, to read and write a CSV file.

General packages
""""""""""""""""

.. 
literalinclude:: SRC/rw_csv-v2.py
   :language: python
   :lines: 1-7
   :linenos:
   :lineno-start: 1

This script reads the CSV file using Pandas and then converts the data to Xarray (and NumPy) objects. Weather station data use several date formats. For example, October 21, 2021, may be written as 10-21-2021 or 10/21/21. The "datetime" package is convenient for handling such date formats. The "os" package is also helpful to obtain the filename. This script also displays a plot using the "Matplotlib" package, but that part is optional here.

Read a CSV file
"""""""""""""""

.. literalinclude:: SRC/rw_csv-v2.py
   :language: python
   :lines: 10-48
   :linenos:
   :lineno-start: 10
   :emphasize-lines: 24-26

This script uses the "pandas.read_csv" function to read the CSV file (Lines 33-35). The function takes several options. The delimiter option should be "," for a CSV file. This script defines the file name (Line 17), the number of header lines (Line 24), and the missing value (na_values). It also defines the variable names on Line 21. The variable name in the first column is "date" here. The CSV file writes October 1, 1998, as "1998-10-01". We therefore define the date format as "%Y-%m-%d" on Line 12 and pass it to the function through the "parse_dates" and "date_parser" options.

.. note::

   Here are examples of date formats.

   * %Y-%m-%d (e.g., 2021-10-13)
   * %m/%d/%y (e.g., 10/13/21)
   * %Y-%m-%d %H:%M:%S (e.g., 2021-10-13 00:00:00)

Write a CSV file
""""""""""""""""

.. literalinclude:: SRC/rw_csv-v2.py
   :language: python
   :linenos:
   :lines: 54-56
   :lineno-start: 54

To write a CSV file, we use the Pandas function `pandas.DataFrame.to_csv `_. The output file has the same name as this script, as defined on Line 27, but with the CSV file extension (".csv"). The output CSV file looks like the following.

.. literalinclude:: SRC/rw_csv-v2.csv
   :language: text
   :lines: 1-5
   :linenos:

.. 
_ioNetcdf:

NetCDF file
-----------

Read and write a single NetCDF file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let's check the sample file, :download:`rw_netcdf-v2.py `, to read and write a NetCDF file. This script reads monthly sea surface temperature provided by NOAA (:download:`oisst_monthly.nc `). It also requires the land-sea mask file (:download:`lsmask.nc `).

General packages
""""""""""""""""

.. literalinclude:: SRC/rw_netcdf-v2.py
   :language: python
   :lines: 1-6
   :linenos:
   :lineno-start: 1

This script uses Xarray to read and write the NetCDF file. In its default installation, Xarray may not support NetCDF. In that case, you need to install other packages, such as Dask, NetCDF4, or PyNIO. Please see :ref:`SetupPack` for how to install NetCDF.

Read a NetCDF file
""""""""""""""""""

.. literalinclude:: SRC/rw_netcdf-v2.py
   :language: python
   :lines: 8-35
   :linenos:
   :lineno-start: 8
   :emphasize-lines: 14, 19

To read a single NetCDF file, we use the Xarray function `xarray.open_dataset `_, as shown on Lines 21 and 26. The resulting variable "ds" is a *Dataset* object. You can extract the sea surface temperature *DataArray* from "ds" on Line 29. We also apply the land-sea mask with the `DataArray.where `_ function to keep the sea surface temperature over the ocean and discard values over land.

Read multiple NetCDF files
""""""""""""""""""""""""""

To read multiple NetCDF files, we use a different Xarray function, `xarray.open_mfdataset `_. The following example reads multiple NetCDF files and extracts the geopotential height at 250 hPa as the *DataArray* "dat".

Sample file :download:`read-multi_netcdf-v2.py `

.. literalinclude:: SRC/read-multi_netcdf-v2.py
   :language: python
   :lines: 23-30
   :linenos:
   :lineno-start: 23
   :emphasize-lines: 2, 4, 7
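
As a rough illustration of the same idea, the sketch below writes three small synthetic yearly files to a temporary directory and reads them back as one dataset with ``xarray.open_mfdataset``. It is not the content of ``read-multi_netcdf-v2.py``. The variable name ``hgt``, the coordinate name ``level``, the file names, the data values, and the two helper functions are assumptions made only for this example. It also assumes that Dask and a NetCDF backend such as NetCDF4 are installed.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
import xarray as xr


def make_yearly_file(path, year):
    """Write a tiny synthetic height file for one year (hypothetical data)."""
    time = pd.date_range(f"{year}-01-01", periods=3, freq="D")
    level = [500.0, 250.0]  # pressure levels in hPa (assumed coordinate name)
    lat = [30.0, 40.0]
    lon = [120.0, 130.0]
    # Fill every grid point with the year so the files are easy to tell apart.
    hgt = np.full((time.size, len(level), len(lat), len(lon)), float(year))
    ds = xr.Dataset(
        {"hgt": (("time", "level", "lat", "lon"), hgt)},
        coords={"time": time, "level": level, "lat": lat, "lon": lon},
    )
    ds.to_netcdf(path)


def read_hgt250(paths):
    """Open several files as one Dataset and return the 250 hPa height."""
    # combine="by_coords" orders and joins the files using their coordinates,
    # so the three years are concatenated along the time axis.
    with xr.open_mfdataset(paths, combine="by_coords") as ds:
        # load() reads the values into memory before the files are closed.
        return ds["hgt"].sel(level=250).load()


years = (2018, 2019, 2020)
with tempfile.TemporaryDirectory() as tmp:
    paths = [str(Path(tmp) / f"hgt_demo.{year}.nc") for year in years]
    for path, year in zip(paths, years):
        make_yearly_file(path, year)
    dat = read_hgt250(paths)

print(dat.dims, dat.shape)
```

Selecting the level before calling ``load()`` means only the 250 hPa slice is read into memory, which matters when the real daily files are large.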