GEOS 597e: Spatiotemporal Data
Analysis Workshop
Homework 4: Empirical orthogonal function calculation
Last updated 9/21/06. To be completed prior to class
session Weds., Oct. 6th. Note you have two weeks to work on this
rather central homework assignment.
Introduction: In this
homework
we'll learn and practice the calculation of EOFs, and
we'll calculate the empirical orthogonal functions and principal
components of the covariance matrix of GOSTA sea surface temperatures
for 1961-1990.
- Paper and pencil.
Let F be a 2 x 3 matrix whose rows are station locations and
whose columns give observations over time, such that the rows of F are
3
time series measurements of some variable measured at two
locations:
F = [ 1 -2 1
2 -1 -1 ]
Calculate the eigenvalues, eigenvectors, and principal components of
the covariance R of the following dataset, F. Report your
eigenvectors normalized to unit length. Calculate also normalized
eigenvalues. Check that your eofs and principal components are
correct by using them to reconstruct F exactly. Check that your
eofs are orthogonal.
- Code it. Open a
script (call it hw4_lastname.m). Calculate the
eigenvalues, normalized eigenvectors, and principal components of F
using the matlab function svd and confirm that you get the same results
as in Problem 1. Note that in the special case that the
input to the svd calculation is a square symmetric matrix like R, the
SVD gives us R=ELET, where E is the matrix whose columns are
the eigenvectors of R, and L is the matrix whose diagonal elements are
the eigenvalues of R. Calculate and check the normalized
eigenvalues as well. Helpful Matlab functions: length, svd, diag,
sum, zeros.
- Test it. Use your
script to perform an EOF analysis of the 4 timeseries
sea surface temperature matrix ts from
HW3.
(To retrieve those calculations, you can load your hw3 matlab workfile
or rerun your script, using save filename.mat ts ts_covariance,
replacing filename and ts_covariance
with the appropriate names. Now you can use load filename.mat
in your HW4 script to access those variables without recalculating
them.
- Calculate the eigenvectors, eigenvalues, normalized
eigenvalues, and principal components of the covariance of ts. Important note: In
general, be sure to remove the mean from each time series in ts, as you
did to construct the covariance, to create SST anomaly timeseries,
prior
to calculating the principal components. Hint: to calculate the
principal components, make a copy of the SST dataset, replacing all
NaNs
in it with zeros. In this case, when you multiply the EOF matrix
by this matrix to get the principal components, you will be adding
zeros
where you have no data for the inner products; this give an unbiased
result, as you removed the mean of each data series (leaving out
missing values) over exactly this period as well. (Make the
calculation by hand for a 1x3 eof matrix and a 3x3 data series matrix,
with NaNs in it, to see what I mean.)
- Verify that you can reconstruct the anomaly series ts from
the
eigenvalues and eigenvectors of the covariance matrix.
- Make a plot to report your results. Your plot should have
3 panels: eigenvalue amplitude vs. eigenvalue number; eigenvector
amplitudes vs. station, and principal component amplitude vs. time, for
the first principal component only. Have Matlab make a color
postscript copy of your plot. (You can use the commands
subplot(221), subplot(222), and subplot(212) before each of your
plotting commands to put these three plots all on a single page.)
Helpful Matlab commands: subplot, plot,
title, legend, xlabel, ylabel, grid, print. Print your
result to a postscript file by issuing the command: print -dpsc
hw4q3c.ps once
you have the plot you like.
- Notice that the first EOF [ e.g. subplot(222);plot(eof(:,1))
] loads most strongly (really, loads only) on the first station
location. Explain why the four EOFs you calculated load most
strongly on stations 1, 4, 2, and 3, respectively.
- Were we able to find strong patterns of common variance among
the time series, as did Weare et al. (1976)? Why isn't this a
very
useful EOF analysis?
- Calculate the global SST covariance
matrix. Use your script to
calculate the covariance, eigenvectors, eigenvalues, and principal
components of the full field dataset for January 1961- December 1990
from the MOHSST5 dataset. Find the
dataset here. This is a binary
file
with specific structure, so use the following code
pasted into your
script for how to read the data in properly and plot it (there're some
quirks).
To run this code, you'll also need the land area "mask" you can
download here. You can also see
also the following workspace which
results from running ld.m.
Here's a suggested roadmap to help you with your script:
- If you work on your own computer and not the CLUE PCs, you'll
need to install the m_map
plotting routines (including modified m_pcolor.m function) built and
distributed by Rich Pawlowicz from the University of British Columbia.
Find a zipped copy of this package here
and install it to your system in a place matlab knows where to look for
it. (See here for
the original source of these routines.)
- Before you set your script loose on this calculation (it
could take a while), be very sure it is doing what you wanted it to do,
for instance by testing it on a small subset of the full matrix.
We may still bring the PCs to their knees.
- Find the time series which have at least 25 years of monthly
values (this way, covariances are guaranteed to have at least 10 years
of overlap; why?). Be sure to save the indices of these time
series, which will allow you later to plot them on the full original
spatial grid. Remove the mean from each time series in the ocean-only
SST anomaly dataset.
- Calculate the covariance. Don't forget you can roughly
halve the number of calculations by exploiting the symmetry of the
covariance matrix! Verify that there are no NaNs in the
covariance
matrix.
- Analyze this.
Perform the eigenvector decomposition of the covariance matrix
you just calculated.
- Calculate the eigenvalues, EOFs, and principal components of
the covariance matrix you just calculated. Hint: make a copy of
the SST dataset with only the time series with at least 25 years of
monthly values, replacing all NaNs in it with zeros. When you
multiply the EOF matrix by this matrix to get the principal components,
you will be adding zeros where you have no data for the inner products;
this give an unbiased result, as you removed the mean of each data
series over exactly this period as well. (Make the calculation by
hand for a 1x3 eof matrix and a 3x3 data series matrix, with NaNs in
it,
to see what I mean.)
- As before, verify that you can reconstruct the SST anomaly
series
from the eigenvalues and eigenvectors of the covariance matrix.
- Make three plots to report your results. Your first plot
is of normalized eigenvalue amplitude vs. eigenvalue number. Use
the command axis
to focus on the range [0 10 0 0.2]. The second plot should
show principal component amplitudes vs. time, for the first four
principal components only. Use the Matlab commands subplot(411),
subplot(412), ... subplot(414) to make these time series plots. Your
third plot should show the first four EOF patterns in space (use Matlab
plotting commands orient landscape and subplot(221),
subplot(222),
... subplot (224) to put these plots on the same page. Use the
code for plotting maps iusing the m_map routine m_pcolor.m.
Have Matlab make color postscript copies of your plots using print -dpsc and call the
three figures hw4q5c1.ps, hw4q5c2.ps,
hw4q5c3.ps.
Helpful Matlab commands: reshape
(see
ld.m for examples), squeeze
(see
ld.m for example), zeros (for
creating a blank matrix for which you'll then fill in specific elements
with useful numbers).
- How much of the variance in the covariance matrix do the first
four EOF patterns explain? Is this a relatively large or small
number? Why is this the case?
- Do the first four EOF patterns and their principal components
have large-scale structures? Why/Why not? If yes, do
the spatial patterns of the EOFs resemble SST anomaly patterns you have
seen before, in this week's reading, or in other papers you have seen
in
the literature?
- Save your work. As
in HW2, save your calculations by issuing the following matlab
command, replacing lastname
with your last name, but saving only key matrices:
>> save hw4soln_lastname.mat R evals eofs pcs
replacing R
eofs evals pcs with
the names of the variables you assigned to the calculation of the SST
covariance matrix, eigenvalues, eigenvectors and
principal components you calculated in Problem 5. If
necessary, clean up or compress older files not used this week.
But we will be returning to results from this week's exercise, so don't
delete anything precious or time-consuming to recreate.
- Products. Please be
sure you've handed in a copy of your answers to Prework
4. Please write "Prework 4" and your name on it. Please hand in
your answers to Homework 4 questions 1, 3de, and 5de above.
Please write "Homework 4" and your name on it. Print a copy of
your HW4 script; call it hw4_lastname.m. Print
your carefully labeled plots for questions 3c and 5c and hand them in
your HW4 script.
Back
to Schedule/Syllabus.