Multivariate Data Analysis
Previously known as multivariate statistical analysis


There are essentially only four steps here:

  1. Low-pass filtration
  2. Alignment in two dimensions
  3. Dimension-reduction -- expression of an m x n image using only a few terms, i.e., eigenvectors
  4. Classification

The low-pass filtration is optional, but if you plan to look at individual particles, this step will help.
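As a rough illustration of what low-pass filtration does to a noisy particle image, here is a minimal NumPy sketch of a Gaussian low-pass filter applied in Fourier space. It is not the SPIDER operation used in the batch files; the function name `gaussian_lowpass`, the `cutoff` parameter (in cycles per pixel), and the synthetic test image are all assumptions made for this example.

```python
import numpy as np

def gaussian_lowpass(image, cutoff):
    """Gaussian low-pass filter in Fourier space.

    cutoff is a hypothetical parameter: the Gaussian width in cycles
    per pixel (the Nyquist frequency is 0.5).
    """
    fy = np.fft.fftfreq(image.shape[0])[:, None]  # vertical frequencies
    fx = np.fft.fftfreq(image.shape[1])[None, :]  # horizontal frequencies
    # Transfer function: 1 at zero frequency, falling off with radius.
    transfer = np.exp(-(fx**2 + fy**2) / (2.0 * cutoff**2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

# Synthetic stand-in for a particle: a bright square buried in noise.
rng = np.random.default_rng(0)
particle = np.zeros((64, 64))
particle[24:40, 24:40] = 1.0
noisy = particle + rng.standard_normal(particle.shape)

filtered = gaussian_lowpass(noisy, cutoff=0.05)
```

The filter leaves the image mean untouched and suppresses high-frequency noise, which is why individual particles become easier to see after this step.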

For the classification below to be sensible, the images must be aligned. The alignment step here can be skipped if the images have been aligned already.

In theory, even the dimension-reduction step is optional: one could classify the raw images directly (which is what SPIDER operation 'AP C' does). As an example here, I'm using correspondence analysis for the dimension-reduction. A similar method is principal-component analysis (PCA); to run PCA, one needs to change an option under SPIDER operation 'CA S' (here, in the batch file ca-pca.spi).
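To make the idea of expressing each image with only a few terms concrete, here is a minimal sketch of PCA on a stack of aligned images using NumPy's singular value decomposition. It illustrates PCA only, not correspondence analysis, and it is not how 'CA S' is implemented; the function name `pca_reduce` and the synthetic two-class image stack are assumptions made for this example.

```python
import numpy as np

def pca_reduce(images, n_factors):
    """Project aligned images onto their leading principal components.

    images: array of shape (n_images, m, n).
    Returns per-image coordinates, eigenimages, and eigenvalues.
    """
    n_images, m, n = images.shape
    data = images.reshape(n_images, m * n).astype(float)
    centered = data - data.mean(axis=0)  # subtract the average image
    # Rows of vt are the eigenvectors of the pixel covariance matrix.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigenimages = vt[:n_factors].reshape(n_factors, m, n)
    coords = centered @ vt[:n_factors].T  # a few numbers per image
    eigenvalues = s[:n_factors] ** 2 / (n_images - 1)
    return coords, eigenimages, eigenvalues

# Synthetic stack: two alternating "views" plus noise (hypothetical data).
rng = np.random.default_rng(0)
view_a = np.zeros((8, 8))
view_a[2:6, 2:6] = 1.0
view_b = np.zeros((8, 8))
view_b[1:7, 3:5] = 1.0
stack = np.array([
    (view_a if i % 2 == 0 else view_b) + 0.1 * rng.standard_normal((8, 8))
    for i in range(40)
])

coords, eigenimages, eigenvalues = pca_reduce(stack, n_factors=3)
```

Each 8 x 8 image (64 pixel values) is reduced to three coordinates, and in this synthetic case the first coordinate alone separates the two views. Classification can then operate on these coordinates instead of on the raw pixels.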

For classification, there are three methods illustrated here: Diday's method, Ward's method, and K-means. The individual classification operations are described in more depth in the classification tutorial.
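Of the three methods, K-means is the simplest to sketch. The following is a minimal NumPy illustration of the general K-means idea applied to per-image factor coordinates; it is not the SPIDER implementation, and the function name `kmeans`, its parameters, and the synthetic two-cluster coordinates are assumptions made for this example.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain K-means: alternate between assigning points to the nearest
    center and moving each center to the mean of its members."""
    rng = np.random.default_rng(seed)
    # Seed the centers with k distinct, randomly chosen points.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    labels = None
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty class's center in place
                centers[j] = members.mean(axis=0)
    return labels, centers

# Hypothetical factor coordinates: two well-separated groups of images.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=(-2.0, 0.0), scale=0.2, size=(20, 2))
group_b = rng.normal(loc=(2.0, 0.0), scale=0.2, size=(20, 2))
coords = np.vstack([group_a, group_b])

labels, centers = kmeans(coords, k=2)
```

Note that the number of classes `k` must be chosen in advance and the result can depend on the random seed, whereas a hierarchical method such as Ward's builds a tree that can be cut at different levels afterwards.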


Getting started


Procedure

  1. Low-pass filtration

  2. Reference-free alignment -- choose one of these two options:
    1. Using 'AP SR'
      • BATCH FILE: apsr4class.spi
      • INPUT PARAMETER: object diameter (pixels, after decimation)
      • INPUTS: unaligned particles, selection file
      • OUTPUTS: aligned particles, averages
      • There may be a memory limit in 'AP SR'.
      • If you get a core dump, truncate the selection file and try again.
    2. Using pairwise alignment
      • BATCH FILE: pairwise.spi
      • INPUT PARAMETER: object diameter (pixels, after decimation)
      • INPUTS: unaligned particles, selection file
      • OUTPUTS: aligned particles, averages
      • Conceptually, this alignment first aligns pairs of images and averages them. Then it aligns pairs of those averages and averages them, and so forth. This type of alignment appears to be less random than 'AP SR', which chooses seed images as alignment references.

  3. Dimension-reduction

  4. Classification -- choose one of three options:

Miscellaneous tools:


Source: techs/MSA/index.htm     Page updated: 8/03/09     Tanvir Shaikh