Principal Component Analysis
Summary
These handwritten digit images live in a high-dimensional space. However, we can exploit pixel intensity covariance patterns to reduce the dimensionality of the data. PCA provides a principled way to find a low-dimensional subspace in which most of the image variability is retained.
We will use PCA to explore whether digits 1 and 7 form distinguishable clusters in this lower-dimensional representation of the data.
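As a rough illustration of the intended workflow (the variable name X and the matrix orientation are assumptions for this sketch, not the handout's actual conventions), the core computation in MATLAB might look like:

    % Minimal PCA-via-SVD sketch (illustrative only; X is a placeholder name).
    % Assume X is an nPixels-by-nImages matrix of digit images, one image per column.
    Xc = X - mean(X, 2);                  % center each pixel across images
    [U, S, V] = svd(Xc, 'econ');          % columns of U are the principal components (eigenimages)
    scores = U(:, 1:2)' * Xc;             % project every image onto the first two components
    scatter(scores(1, :), scores(2, :));  % look for separate clusters of 1s and 7s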
Learning Goals
The main objective of this exercise is to get students to use all the linear algebra they have learned so far.
The exercise will allow them to visualize a relatively complex and large data set. In doing so they will put into practice the concepts of dot product, projections, orthonormal basis sets, dimensionality reduction, singular value decomposition, and principal component analysis, among others.
It also offers the opportunity to introduce basic machine learning concepts such as linear separability, single layer perceptrons and clustering.
Additionally, it opens the door for a discussion of how the brain may perform these operations and what the biological basis of PCA might be.
For those in the neuroscience/computational neuroscience domain, it also presents an opportunity to discuss principal component analysis in the context of neural circuits (1). In fact, Oja's rule is closely related to the power method for obtaining eigenvectors.
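A minimal sketch of that connection (assuming a centered data matrix Xc as in the sketch above; the learning rate eta and the number of iterations are arbitrary choices, not values from the handout):

    % Oja's rule sketch: the weight vector w converges toward the first
    % principal component, much like power iteration on the covariance matrix.
    w = randn(size(Xc, 1), 1);  w = w / norm(w);
    eta = 0.01;                           % learning rate (arbitrary choice)
    for t = 1:200
        x = Xc(:, randi(size(Xc, 2)));    % pick a random data sample
        y = w' * x;                       % neuron output
        w = w + eta * y * (x - y * w);    % Oja's update (Hebbian term + normalization)
    end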
Finally, the MATLAB skills needed for this activity are (a short sketch combining several of them follows the list):
- making plots and matrix visualizations
- matrix operations and indexing
- logical indexing
- reshaping matrices
- for-loops.
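For instance, several of these skills come together when selecting and displaying a few digit images. The variable names X and labels and the 28-by-28 image size below are assumptions for illustration and may differ from the provided .mat file:

    % Illustrative sketch only; variable names (X, labels) and image size (28x28)
    % are assumptions and may not match the data set provided with the activity.
    ones_and_sevens = X(:, labels == 1 | labels == 7);    % logical indexing
    for k = 1:4
        subplot(1, 4, k);
        imagesc(reshape(ones_and_sevens(:, k), 28, 28));  % reshape a column into an image
        axis image off; colormap gray;
    end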
(1) Oja, E. (1982). "Simplified neuron model as a principal component analyzer." Journal of Mathematical Biology, 15(3), 267–273.
Context for Use
I use this exercise as a problem set in 9.40 (Introduction to Neural Computation), which is a sophomore/junior subject.
We have covered linear algebra, PCA/SVD, and perceptrons before students work on this particular problem set.
Alternatively, I use this exact same exercise as an in-class activity for a graduate-level subject (9.014 Quantitative Methods and Computational Models in Neurosciences). There, we have presented linear algebra, the covariance matrix, and SVD beforehand.
Description and Teaching Materials
This activity can be set as homework/problem set or as an in-class activity.
The in-class activity takes roughly 80 minutes. For this format, the best approach is to break the class into small groups and have them work through each question. After 35 minutes, hold a brief class discussion to summarize results and any programming difficulties that have come up. Students usually get through question 5 by this time.
Five to ten minutes before the end of class, wrap up by discussing other applications of PCA.
Materials included:
- Problem set / handout for the activity in PDF and .doc format: Problem set / handout (Zip Archive 253kB Sep29 16)
- Data set as a .mat file: Data set (Matlab .MAT File 121kB Sep29 16)
- Two solution sets as .m, .mlx, and PDF from the Live Editor, one using svd and one using eig:
  - svd: Solution SVD (Zip Archive 167kB Sep29 16)
  - eig: Solution EIG (Zip Archive 166kB Sep29 16)
Teaching Notes and Tips
If you don't cover SVD in your class but you do cover eigenvectors and eigenvalues, you can use the alternative solution with eig.
You will also need to change the handout to use eig rather than svd.
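The two routes give the same components up to sign flips. A minimal sketch of the eig version, again using the placeholder centered matrix Xc from above rather than the handout's actual variable names:

    % PCA via eig on the covariance matrix (sketch; Xc is the centered data as above).
    C = cov(Xc');                     % pixel-by-pixel covariance matrix
    [W, D] = eig(C);                  % eigenvectors/eigenvalues (not sorted by size)
    [~, idx] = sort(diag(D), 'descend');
    W = W(:, idx);                    % reorder so the first column is the top component
    scores = W(:, 1:2)' * Xc;         % same projection as with svd, up to sign flips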
Students usually struggle to understand projections, rotations (the essence of orthonormal basis sets) and reshaping matrices.
I usually show a simple 2-D example at the beginning of class to set the problem up.
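One possible version of such a 2-D warm-up (the point cloud here is entirely synthetic, generated only for the demonstration):

    % 2-D warm-up: a correlated point cloud, its principal axes, and the projection
    % onto the first component (synthetic data, for illustration only).
    P = randn(2, 500);  P = [2 1; 1 1] * P;           % correlated 2-D cloud
    Pc = P - mean(P, 2);
    [U, ~, ~] = svd(Pc, 'econ');
    proj = U(:, 1) * (U(:, 1)' * Pc);                 % projection onto the first axis
    plot(Pc(1, :), Pc(2, :), '.'); hold on;
    plot(proj(1, :), proj(2, :), 'r.');               % projected points fall on a line
    axis equal;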
Make sure that your TAs know the problem in advance and are willing to help students get through the exercise.
There is also a follow-up problem for extra credit when the exercise is used as a problem set.
Assessment
At the end of class you can discuss what the V matrix in the SVD decomposition (which we did not use) represents. Try to get students to think about what doing PCA in the other space represents.
If they get this, you can rest assured that the exercise has worked quite well.
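One way to make the point concrete during the discussion, using the same placeholder Xc as in the sketches above:

    % With Xc = U*S*V', the columns of U are patterns over pixels (eigenimages),
    % while the columns of V are patterns over images: transposing the data swaps them.
    [U,  S,  V ] = svd(Xc,  'econ');
    [U2, S2, V2] = svd(Xc', 'econ');  % PCA "in the other space"
    % Up to sign flips, U2 matches V and V2 matches U, and S2 equals S.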