math Morgan

Iterative Eigenvector Computation

There are fancy algorithms for computing eigenvectors, but sometimes you just want to compute a small number of eigenvectors without any hassle -- maybe your matrix is too big to compute eigenvectors directly.

Fortunately, there is an iterative algorithm for computing the eigenvector with the largest eigenvalue.

def largest_eigenvector(A): v = np.random.normal(0, 1, (A.shape[0], 1)) for i in range(32): v =, v) v /= np.sqrt((v**2).sum()) return v

It's not necessary that the vector come from a Gaussian distribution -- any random vector will do. The goal is to not accidentally choose some other eigenvector, and sampling from a Gaussian is one way to make that "very unlikely".

Then, by repeatedly applying the matrix to this vector, it converges to the largest eigenvector. The speed of this depends on the relative sizes of the largest and second largest eigenvalues: the larger the ratio the faster the convergence.

A problem with this approach is that the vector will also get larger or smaller exponentially (depending on how big the eigenvalue is). To stop the vector from exploding or vanishing, we normalize it after every step.

Finally, one optimization we can make is to multiply by some power of A, rather than just A, which speeds up convergence. For instance, if we do this:

def largest_eigenvector(A): A =, A) # added line v = np.random.normal(0, 1, (A.shape[0], 1)) for i in range(32): v =, v) v /= np.sqrt((v**2).sum()) return v

Then the algorithm will converge twice as fast.

Finally, if you want the second, third, etc. eigenvectors, you can "subtract out" the previously computed eigenvectors:

def largest_eigenvector(A, k): V = [] # Populate with the first k eigenvectors. for j in range(k): v = np.random.normal((A.shape[0], 1)) for i in range(1000): v =, v) v /= np.sqrt((v**2).sum()) # Magic line "subtracts out" previous eigenvector. A -=, v), v.T) # Add j-th eigenvector to "V" V.append(v) return np.concatenate(V, 1)

The intuition is that we want to modify A, so that Av = 0 (where 'v' is the largest eigenvector).

This is equivalent to making each row of A orthogonal to v, which means we want to subtract out (A[i] · v) v from each row, which happens to be equivalent to

A -=, v), v.T)