Multivariate Data Analysis
Previously known as multivariate statistical analysis
There are essentially only four steps here: low-pass filtration, alignment, dimension reduction, and classification.
The low-pass filtration is optional, but if you plan to look at individual particles, this step will help.
For the classification below to be sensible, the images will need to have been aligned. The alignment step here is optional if the images have been aligned already.
Even the dimension-reduction step is optional in theory: one could classify the raw images (which is what SPIDER operation 'AP C' does). As an example here, I'm using correspondence analysis for the dimension reduction. A similar method is principal-component analysis (PCA); to run PCA, change an option under SPIDER operation 'CA S' (here, in the batch file ca-pca.spi).
For classification, there are three methods illustrated here: Diday's method, Ward's method, and K-means. The individual classification operations are described in more depth in the classification tutorial.
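To make the classification step concrete, here is a minimal K-means sketch in plain Python. It is not the SPIDER implementation; the function name kmeans, the seed handling, and the six made-up factor coordinates are all assumptions for illustration. Each "particle" is represented by its coordinates along a couple of factors, as it would be after dimension reduction.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain K-means on a list of equal-length coordinate tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a class ends up empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical factor coordinates for six particles (two obvious groups).
coords = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = kmeans(coords, k=2)
print(labels)
```

The class numbers themselves are arbitrary and depend on the random starting centers; only the grouping is meaningful. Unlike the hierarchical methods, K-means needs the number of classes up front.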
Getting started
Procedure
mkfilenums.py listparticles.dat win/ser*.dat
There may be a memory limit in 'AP SR'. If you get a core dump, truncate the selection file and try again.
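Truncating the selection file can be done in any text editor, but a small script is handy for long lists. The sketch below assumes a SPIDER-style document file in which comment lines start with ';' and every other non-blank line is one entry. The function name truncate_selection and the example contents are hypothetical.

```python
def truncate_selection(lines, max_entries):
    """Keep comment lines plus the first max_entries data lines."""
    kept, n_data = [], 0
    for line in lines:
        if line.lstrip().startswith(";"):  # assumed comment marker
            kept.append(line)
        elif line.strip() and n_data < max_entries:
            kept.append(line)
            n_data += 1
    return kept

# Hypothetical selection-file contents: key, register count, particle number.
example = [
    " ; hypothetical header line",
    "    1 1     1",
    "    2 1     2",
    "    3 1     3",
    "    4 1     4",
]
shorter = truncate_selection(example, max_entries=2)
print("\n".join(shorter))
```

To apply it to a real file, read the lines with open(...).read().splitlines() and write the result to a new file, so that the full list is kept.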
Conceptually, this alignment first aligns pairs of images and averages them. Then, it aligns pairs of averages of those pairs and averages them, and so forth. This type of alignment appears to be less random than does 'AP SR', which chooses seed images as alignment references.
Reference: Marco S, Chagoyen M, de la Fraga LG, Carazo JM, Carrascosa JL (1996) Ultramicroscopy 66: 5-10.
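The pairwise scheme described above can be sketched on a toy problem. This is only an illustration of the tree-like order of operations, not the SPIDER procedure: one-dimensional signals stand in for images, a circular shift stands in for the rotational and translational alignment, and all function names are made up.

```python
def shift(sig, s):
    """Circularly shift a signal to the right by s samples."""
    n = len(sig)
    return [sig[(i - s) % n] for i in range(n)]

def best_shift(ref, sig):
    """Circular shift of sig that maximizes its correlation with ref."""
    n = len(ref)
    def score(s):
        return sum(ref[i] * sig[(i - s) % n] for i in range(n))
    return max(range(n), key=score)

def align_and_average(a, b):
    """Align b to a, then average the two."""
    b_aligned = shift(b, best_shift(a, b))
    return [(x + y) / 2 for x, y in zip(a, b_aligned)]

def pairwise_average(signals):
    """Average pairs, then pairs of averages, until one signal is left."""
    level = list(signals)
    while len(level) > 1:
        nxt = [align_and_average(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd one out is carried up unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]

# Four copies of one hypothetical template, each shifted differently.
template = [0, 0, 1, 3, 1, 0, 0, 0]
shifted = [shift(template, s) for s in (0, 2, 5, 7)]
average = pairwise_average(shifted)
print(average)
```

No single image is chosen as the reference here: every input enters the tree in the same way, which is the property contrasted with seed-based alignment above. In this simplified version an input that is carried up unpaired ends up with more weight in the final average.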
To switch to PCA (or iterative PCA), change the register x28 in ca-pca.spi to 2 (PCA) or 3 (iterative PCA).
After running, examine the eigenimages and decide which ones to use. Typically, all but the first few are noisy. If even the last eigenimages still show signal, increase the number of eigenfactors to calculate, and re-run this batch file.
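For readers who want to see what an eigenimage is in the PCA case, the sketch below computes the leading principal component of a handful of tiny flattened "images" by power iteration. It is a plain-Python illustration under stated assumptions, not what 'CA S' does internally, and it does not cover correspondence analysis, which uses a different metric. The function name first_eigenimage and the four-pixel images are made up.

```python
import math

def first_eigenimage(images, iterations=50):
    """Leading principal component of flattened images, via power iteration."""
    n, d = len(images), len(images[0])
    mean = [sum(img[j] for img in images) / n for j in range(d)]
    centered = [[img[j] - mean[j] for j in range(d)] for img in images]
    # Deterministic, non-uniform starting vector (an arbitrary choice).
    v = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        # Multiply v by the covariance matrix without forming it explicitly.
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) / n for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along the current direction
            break
        v = [x / norm for x in w]
    return mean, v

# Four hypothetical 4-pixel images that differ only in pixels 0 and 1.
images = [[2, 0, 1, 1], [0, 2, 1, 1], [3, -1, 1, 1], [-1, 3, 1, 1]]
mean_image, eigenimage = first_eigenimage(images)
print(mean_image)
print(eigenimage)
```

The eigenimage is nonzero only in the two pixels that vary across the set, with opposite signs, which is the kind of interpretable pattern to look for in the first few factors. Its overall sign is arbitrary.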
After running, decide how many classes to include, using WEB/JWEB (Commands -> Dendrogram) and clicking on Show averaged images.
After running, decide how many classes to use. The PostScript file may be highly branched, and nodes may be unreadable.
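When the dendrogram is hard to read, it can help to remember what it encodes: the order and cost of successive merges. The sketch below is a small plain-Python agglomerative clustering that uses Ward's merging cost. It is an illustration only, not the SPIDER operation, and the function name ward_clusters and the example coordinates are hypothetical.

```python
def ward_clusters(points, n_classes):
    """Agglomerative clustering with Ward's merging cost, stopped at n_classes."""
    clusters = [[i] for i in range(len(points))]  # lists of member indices
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][j] for i in members) / len(members)
                for j in range(dim)]

    def cost(a, b):
        # Ward: increase in within-class sum of squares caused by merging a and b.
        ca, cb = centroid(a), centroid(b)
        d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    heights = []  # merge costs, in the order the merges happen
    while len(clusters) > n_classes:
        pairs = [(cost(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        h, i, j = min(pairs)  # cheapest merge first
        heights.append(h)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, heights

# Hypothetical factor coordinates for six particles (two obvious groups).
coords = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
_, all_heights = ward_clusters(coords, 1)   # merge all the way to one class
two_classes, _ = ward_clusters(coords, 2)   # cut the tree at two classes
print(all_heights)
print(two_classes)
```

A large jump in the merge cost, here at the very last merge, marks a natural place to cut the tree. With real, noisy data the jumps are usually far less clear-cut.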
It can be informative to look at the individual particles from a class.
You can use WEB/JWEB or montagefromdoc.py.
Usage:
./montagefromdoc.py KM/docclass001.dat
If you have requested too many classes, there will be similar-looking class averages.
If you have requested too few, there will be dissimilar particles within a class.
Miscellaneous tools:
Source: techs/MSA/index.html Page updated: 4/13/12 Tanvir Shaikh