GEOS 597e: Spatiotemporal Data Analysis Workshop

Homework 5:
EOF analysis: Error, stability, truncation and filtering

Last updated 10/10/06.
To be completed prior to class session Weds., Oct. 11th.

Introduction:
This week we'll analyze our EOF calculations from Homework 4: how many of the EOF structures are robust?  How can we use the EOF analysis to improve the reliability of subsequent analyses of the results?  How stable are the results to additional observational errors?
  1. Rule N estimation of significant EOFs.  Apply Rule N to the SST dataset we studied in HW4.  Here's a roadmap.  Steps a through d go inside a loop that repeats 100 times; step e follows the loop:
    a. Create a synthetic dataset consisting of m rows ("gridpoints") and nt columns ("time points") of random values chosen from a Gaussian distribution with mean zero and unit variance, where m and nt are identical to the dimensions of the "reduced" SST dataset for which you performed your EOF calculations.  (Helpful MATLAB function: randn.)
    b. Give the synthetic data the variance of the real dataset.  (Where have you already calculated the variances of the real data, in Homework 4?)  Helpful MATLAB functions: diag, sqrt.  Check that the synthetic data now has roughly the same variances as the real data.
    c. Calculate the covariance matrix of your synthetic dataset.
    d. Calculate the first 40 eigenvalues of your synthetic covariance matrix.  HINT: use the function svds to calculate only the first 40 eigenvalues of the covariance matrix, and save only the eigenvalues into a 100 x 40 array (why dimensions 100 x 40?).
    e. Once you have assembled your 100 x 40 matrix of synthetic eigenvalues, find the 95th-highest one for each eigenvalue number (1-40).  Helpful MATLAB commands: sortrows, flipud.  On the same figure, plot the 95th-highest eigenvalue vs. eigenvalue number, and plot the first 40 eigenvalues of the SST covariance matrix you calculated for HW4.  Print this plot.  Write a few sentences explaining how you interpret this result.
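
The Rule N roadmap above can be sketched roughly as follows.  This is a minimal illustration on stand-in data, not the HW4 SST data: the sizes m and nt, the stand-in matrix F, and every variable name (ntrial, neig, lam_syn, thresh, Mf) are assumptions made for the example.  It also assumes that the 95% level is taken as the 95th value in an ascending sort of the 100 synthetic eigenvalues; if you read "95th-highest" as rank 95 from the top, index a descending sort instead.

```matlab
% Rule N sketch on stand-in data; every variable name here is hypothetical.
rng(0);                                   % fixed seed so the run is repeatable
m = 60;  nt = 200;                        % stand-in sizes; use your reduced SST dimensions
ntrial = 100;  neig = 40;                 % loop count and eigenvalues kept per trial

% Stand-in for the real data F (m x nt): one coherent pattern plus white noise.
F = 3*sin(linspace(0, pi, m))' * randn(1, nt) + randn(m, nt);
F = F - repmat(mean(F, 2), 1, nt);        % remove each row's time mean
R = F*F' / (nt - 1);                      % "real" covariance matrix (m x m)
sd = sqrt(diag(R));                       % standard deviation at each gridpoint
lam_real = svds(R, neig);                 % leading eigenvalues of symmetric R

lam_syn = zeros(ntrial, neig);            % one row per trial
for k = 1:ntrial
    X = diag(sd) * randn(m, nt);          % unit-variance noise, scaled row by row
    X = X - repmat(mean(X, 2), 1, nt);
    Rs = X*X' / (nt - 1);                 % synthetic covariance matrix
    lam_syn(k, :) = svds(Rs, neig)';      % singular values equal eigenvalues here
end

lam_sorted = sort(lam_syn, 1);            % ascending within each column
thresh = lam_sorted(95, :);               % assumed reading of the 95% level

% Count the leading real eigenvalues that exceed the synthetic threshold.
first_fail = find(lam_real(:)' <= thresh, 1);
if isempty(first_fail)
    Mf = neig;
else
    Mf = first_fail - 1;
end
```

To make the comparison figure, one option is to plot thresh and lam_real against 1:neig on the same axes (plot, hold on); a logarithmic y axis (semilogy) may make the crossover easier to see.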
  2. EOF filtering.  Filter the SST dataset according to the results you just found:
    a. Calculate F_f = E_f C_f, where E_f is the subset of the eigenvector matrix E containing the first M_f columns of E (M_f being the number of EOFs you found significant in problem 1), and C_f = E_f^T F.
    b. Now use the gridding information in the comments of ld.m to find a grid point of interest to you, and extract its time series from the filtered dataset F_f.  Also extract the same time series from the original dataset.  (For our purposes here, keep working within the weighted, reduced-dimension dataset of relatively well-observed gridpoints.)  In any way you wish, show how the gridpoint time series from the EOF-filtered dataset is similar to and/or different from the gridpoint time series from the unfiltered dataset.  Make a plot showing your results.  Write a few sentences describing the results of your comparison.
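
A minimal sketch of the filtering step, again on stand-in data rather than the SST dataset.  The matrix F, the truncation Mf = 3, the gridpoint index ig, and the two comparison measures (correlation and variance ratio) are all assumptions chosen for illustration; the assignment leaves the comparison method up to you.

```matlab
% EOF filtering sketch on stand-in data; names and sizes are hypothetical.
rng(1);
m = 50;  nt = 300;
Mf = 3;                                   % assumed number of retained EOFs

% Stand-in data: two smooth space-time patterns plus white noise.
x = linspace(0, pi, m)';                  % m x 1 "space" coordinate
t = 1:nt;                                 % 1 x nt "time" coordinate
F = 2*sin(x)*cos(2*pi*t/50) + 1.5*sin(2*x)*sin(2*pi*t/20) + 0.5*randn(m, nt);
F = F - repmat(mean(F, 2), 1, nt);        % remove each row's time mean

R = F*F' / (nt - 1);                      % covariance matrix
R = (R + R') / 2;                         % enforce exact symmetry before eig
[E, D] = eig(R);
[lam, idx] = sort(diag(D), 'descend');    % largest eigenvalue first
E = E(:, idx);

Ef = E(:, 1:Mf);                          % first Mf eigenvectors
Cf = Ef' * F;                             % truncated principal components (Mf x nt)
Ff = Ef * Cf;                             % filtered dataset (m x nt)

ig = 25;                                  % hypothetical gridpoint of interest
ts_raw  = F(ig, :);
ts_filt = Ff(ig, :);
c = corrcoef(ts_raw, ts_filt);
r = c(1, 2);                              % correlation of raw vs. filtered series
var_ratio = var(ts_filt) / var(ts_raw);   % fraction of variance kept at this point
```

Overlaying ts_raw and ts_filt on one set of axes, or plotting their difference, is one simple way to show what the truncation removes at the chosen gridpoint.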
  3. Stability of results in the presence of white noise.  Examine how robust the EOF patterns are with respect to observational errors.
    a. Go back to the SST dataset you studied in HW4, and add random noise with variance equal to that of the SST data.  Helpful MATLAB commands: diag, randn, sqrt.
    b. Now recompute the eigenvalues, eigenvectors, and principal components of the covariance matrix Rn of this noisy dataset.  Use your code from HW4.  Your eigenvalues will be larger (why and how?) but the EOF patterns are normalized as before such that E^T E = I.
    c. On a single page, plot the first and last of the significant eigenvectors from the EOF analysis of R, and the corresponding two eigenvectors from the EOF analysis of Rn.  Helpful MATLAB plotting commands: subplot(221), subplot(222), ... subplot(224); m_map routines; again, adapt the code you used to plot eigenvectors in HW4.  Make a plot showing your results.  Do the two pairs of eigenvectors look different?  Why or why not?
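
The noise experiment can be sketched along these lines.  As before, the stand-in matrix F and all variable names are assumptions; the pattern-match measure (absolute inner product of corresponding unit-length eigenvectors) is one possible way to quantify similarity and is not something the assignment prescribes.

```matlab
% Noise-stability sketch on stand-in data; names and sizes are hypothetical.
rng(2);
m = 50;  nt = 300;

% Stand-in data: one smooth pattern plus weak white noise.
x = linspace(0, pi, m)';
t = 1:nt;
F = 2*sin(x)*cos(2*pi*t/50) + 0.5*randn(m, nt);
F = F - repmat(mean(F, 2), 1, nt);

R = F*F' / (nt - 1);
R = (R + R') / 2;                         % enforce exact symmetry before eig
sd = sqrt(diag(R));                       % standard deviation at each gridpoint

% Add noise whose variance at each gridpoint matches that of the data.
Fn = F + diag(sd) * randn(m, nt);
Fn = Fn - repmat(mean(Fn, 2), 1, nt);
Rn = Fn*Fn' / (nt - 1);
Rn = (Rn + Rn') / 2;

% Eigen-decompose both covariance matrices, largest eigenvalue first.
[E, D]   = eig(R);
[lam, i1]  = sort(diag(D), 'descend');
E  = E(:, i1);
[En, Dn] = eig(Rn);
[lamn, i2] = sort(diag(Dn), 'descend');
En = En(:, i2);

k = 1;                                    % which EOF to compare
pattern_match = abs(E(:, k)' * En(:, k)); % 1 means identical up to sign
```

Because eigenvectors are only defined up to sign, you may need to flip one pattern (multiply it by -1) before plotting the pair side by side.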
  4. Products.  Please be sure you've handed in a copy of your answers to Prework 5.  Please write "Prework 5" and your name on it.  Print and turn in a copy of your HW5 script; call it hw5_lastname.m.  Turn in your answers to questions 1e, 2b, and 3c, and your printed plots for questions 1e, 2b, and 3c.
