GEOS 597e: Spatiotemporal Data Analysis Workshop

Homework 8:
SSA: Error, stability, truncation and filtering

Last updated 10/24/06.
To be completed prior to class session Weds., Nov. 1st.

Introduction:
This week we'll test our SSA of the century-long NINO3 SST anomaly time series from the GOSTA dataset in two ways: against a simple null model of a time series with memory (AR(1) noise), and for sensitivity to parameter choices. We'll also filter the data to isolate the robust signal.
  1. Use the Monte Carlo SSA (MC-SSA) algorithm described by Ghil et al. (2002) to compare the SSA eigenvalues of the NINO3 SST time series against a null hypothesis of AR(1) noise.
    a. For this exercise we will work only with the time interval January 1947-December 1990, which has no missing values. Select only this time interval for further analysis, and remove the mean from the data. Use the code you developed for HW7 to perform SSA on this time series with an embedding dimension of 60 months.
    b. Load a set of 100 realizations of red noise time series having the same 528-month length as, and AR(1) characteristics similar to, the NINO3 time series, from here. These synthetic data were produced using the ARfit package published by Tapio Schneider and Arnold Neumaier. (Because of the high degree of autocorrelation in the monthly NINO3 data series, this rather sophisticated algorithm, rather than the simple one we saw in HW0, is necessary to construct reasonably stationary synthetic AR(1) series.) I also had to adjust the synthetic time series to account for the skew in the real time series. If you're interested, code for working with ARfit to do this is here.
    c. For each of the 100 red noise realizations, construct the trajectory matrix and the covariance matrix corresponding to that trajectory matrix. Don't forget to remove the mean of each time series before calculating the covariance. Use these 'noise' covariance matrices to calculate the MC-SSA eigenvalues following Ghil et al. (2002), Eq. 17 and the accompanying text. In other words, construct the 'noise' eigenvalue distribution as the diagonal elements of the projection of each of the 100 AR(1) noise covariance matrices onto the NINO3 SSA EOFs. Save your eigenvalues into a 100 x 60 matrix.
    d. Sort your 100 x 60 'noise'-projected eigenvalues twice: sort them by row, then by column. Then flip the matrix vertically and horizontally to get your noise eigenvalues in order from largest to smallest, left to right and top to bottom. Helpful MATLAB functions: sort, flipud, fliplr, help.
    e. Plot the 5th and 95th percentile 'noise' eigenvalues vs. eigenvalue number (use subplot(221)) as a pair of lines. Plot on the same graph the eigenvalues of the NINO3 time series SSA, as a series of markers. Using the full set of Monte Carlo results, at approximately what level is each of the first 6 EOFs statistically different from red noise? Helpful MATLAB commands/operators: find, <=, >=. Label your plot with a title and axis labels.
    f. Plot the EOFs corresponding to the significant eigenvalues (use subplot(222)). How would you describe these EOFs? Label your plot with a title and axis labels.
    g. Plot the sum of just the RCs corresponding to your significant EOFs vs. time (use subplot(212)). On the same time axis, plot the original NINO3 time series. How much variance in the NINO3 time series is explained? How similar is this "significant" summed RC to the sum of the first 6 RCs?
    h. Print your figure, and be sure to hand in your answers to the questions in 1e-g.
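The noise-projection, sorting, and percentile steps of Problem 1 can be sketched as follows. This is a minimal Python/NumPy sketch rather than a reference solution (in MATLAB, eig, sort, flipud and fliplr play the same roles). All function names are hypothetical, and both the simple AR(1) generator and the sinusoid-plus-red-noise "data" series are stand-ins that only make the sketch runnable. The actual assignment uses the NINO3 anomalies and the ARfit-based surrogates provided above.

```python
import numpy as np

def lag_covariance(x, M):
    """M x M covariance of the trajectory matrix of the mean-removed series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n_rows = x.size - M + 1
    D = np.array([x[i:i + M] for i in range(n_rows)])  # trajectory matrix
    return D.T @ D / n_rows

def ssa_eofs(x, M):
    """Eigenvalues (descending) and EOFs (columns) of the lag covariance."""
    evals, evecs = np.linalg.eigh(lag_covariance(x, M))  # ascending order
    return evals[::-1], evecs[:, ::-1]

def noise_eigenvalues(eofs, surrogates):
    """Row k holds diag(E' * C_k * E) for the k-th surrogate series."""
    M = eofs.shape[0]
    return np.array([np.diag(eofs.T @ lag_covariance(s, M) @ eofs)
                     for s in surrogates])

def sort_and_flip(lam):
    """Sort by row, then by column, then flip both ways (largest first)."""
    return np.sort(np.sort(lam, axis=1), axis=0)[::-1, ::-1]

def ar1(n, phi, rng):
    """Hypothetical stand-in red noise: x[t] = phi * x[t-1] + white noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

# Stand-in inputs (assumptions): 528 "months", embedding dimension 60.
rng = np.random.default_rng(0)
data = ar1(528, 0.9, rng) + np.sin(2 * np.pi * np.arange(528) / 48)
surrogates = [ar1(528, 0.9, rng) for _ in range(100)]

evals, eofs = ssa_eofs(data, 60)
lam = sort_and_flip(noise_eigenvalues(eofs, surrogates))  # 100 x 60
upper, lower = lam[4], lam[94]  # roughly the 95th and 5th percentiles
```

Plotting `upper` and `lower` against eigenvalue number, with `evals` overlaid as markers, gives the comparison asked for in the first subplot. With 100 realizations, rows 5 and 95 of the sorted matrix only approximate the 95th and 5th percentiles.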
  2. Test the sensitivity of your SSA of the NINO3 time series to the choice of embedding dimension M.
    a. Thought experiment: What would you expect the eigenvalue trace to look like for small embedding dimension, relative to the length of the time series? How about for large embedding dimension relative to the length of the time series? What trade-off do you make in opting for large vs. small embedding dimension?
    b. Choose a wide range of embedding dimensions over which to test the sensitivity of the results of the NINO3 SSA to choice of M. Divide your range up into P reasonable choices for M, where P is about 10. Be sure to include the case M=60.
    c. For each value of M, recompute the SSA, form the sum of the first six RCs, and compute the correlation of this summed RC with the sum of the first six RCs you formed for the case M=60.
    d. Make a two-panel plot (using subplot(211), subplot(212)). In the top panel, plot the correlation with the sum of the first six RCs from the M=60 case vs. embedding dimension. In the second panel, plot the variance explained by the first six EOFs vs. embedding dimension. How sensitive is this SSA to the choice of M? Why? How sensitive is the total variance explained by the first six EOFs to choice of M? Why? Do your results fit with your thought experiment results (question 2a)? Label your plots with titles and axis labels, and print the figure. Be sure to turn in answers to the questions in 2a and 2d.
  3. Please be sure you've handed in a copy of your answers to Prework 8. Please write "Prework 8", the date, and your name on it. Hand in a copy of the code you wrote to solve HW8 and your written answers to the questions in Problems 1e-g and 2a,d.
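The embedding-dimension test in Problem 2 amounts to repeating one SSA reconstruction for several values of M and comparing each result with the M=60 case. A minimal Python/NumPy sketch follows, again not a reference solution. The function name `summed_rc`, the list `M_values`, and the synthetic series standing in for the NINO3 anomalies are all assumptions made for illustration.

```python
import numpy as np

def summed_rc(x, M, K=6):
    """Sum of the first K reconstructed components and their variance fraction."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n_rows = x.size - M + 1
    D = np.array([x[i:i + M] for i in range(n_rows)])  # trajectory matrix
    evals, evecs = np.linalg.eigh(D.T @ D / n_rows)    # ascending order
    evals, E = evals[::-1], evecs[:, ::-1][:, :K]      # leading K EOFs
    part = (D @ E) @ E.T                               # rank-K part of D
    total, count = np.zeros(x.size), np.zeros(x.size)
    for j in range(M):                                 # average over lags
        total[j:j + n_rows] += part[:, j]
        count[j:j + n_rows] += 1
    return total / count, evals[:K].sum() / evals.sum()

# Hypothetical stand-in for the 528-month NINO3 anomaly series.
rng = np.random.default_rng(1)
t = np.arange(528)
data = np.sin(2 * np.pi * t / 48) + 0.5 * rng.standard_normal(528)

M_values = [12, 24, 36, 48, 60, 90, 120, 180, 240]  # assumed range, P = 9
reference, _ = summed_rc(data, 60)                   # the M=60 case
corrs, fracs = [], []
for M in M_values:
    rc, frac = summed_rc(data, M)
    corrs.append(np.corrcoef(rc, reference)[0, 1])   # top-panel quantity
    fracs.append(frac)                               # bottom-panel quantity
```

Plotting `corrs` and `fracs` against `M_values` gives the two panels. As a sanity check on the reconstruction, setting K equal to M should return the mean-removed input series exactly.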
