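A worked example of principal component analysis?
Question:
Are there any examples available that walk through principal component analysis on a dataset? The articles I have been reading discuss only the theory, and I am looking for something that shows how to use PCA, interpret the results, and transform the original dataset into the new one. Any suggestions?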
Answer
If you know Python, here is a short hands-on example:
import numpy as np
import matplotlib.pyplot as plt

# Generate correlated data from uncorrelated data.
# Each column of X is a 3-dimensional feature vector.
# Z is zero-mean, so X needs no explicit centering here; real data
# should have the per-feature mean subtracted first.
Z = np.random.randn(3, 1000)
C = np.random.randn(3, 3)
X = C @ Z
# Visualize the correlation among the features.
plt.figure()
plt.scatter(X[0, :], X[1, :])
plt.scatter(X[0, :], X[2, :])
plt.scatter(X[1, :], X[2, :])
# Perform PCA. It can be shown that the principal components of the
# matrix X are equivalent to the left singular vectors of X, which are
# equivalent to the eigenvectors of X X^T (up to indeterminacy in sign).
U, S, Vh = np.linalg.svd(X, full_matrices=False)
# eigh is meant for symmetric matrices; it returns eigenvalues in ascending
# order, so the columns of Q match those of U in reverse order.
W, Q = np.linalg.eigh(X @ X.T)
print(U)
print(Q)
# Project the original features onto the eigenspace.
Y = U.T @ X
# Visualize the absence of correlation among the projected features.
plt.figure()
plt.scatter(Y[0, :], Y[1, :])
plt.scatter(Y[1, :], Y[2, :])
plt.scatter(Y[0, :], Y[2, :])
plt.show()
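To interpret the result and actually reduce the data, one common approach is to look at the fraction of variance each component explains and keep only the leading ones. The following is a minimal sketch under that assumption; the helper name pca_reduce, the fixed random seed, and the choice k=2 are illustrative and not part of the example above.

```python
import numpy as np

def pca_reduce(X, k):
    """Reduce X (features x samples) to k principal components.

    Returns the k x n projected data, the explained-variance ratio of
    every component, and the rank-k reconstruction of X.
    """
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean                                  # center each feature
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    ratio = S**2 / np.sum(S**2)                    # share of variance per component
    Y = U[:, :k].T @ Xc                            # new dataset, k x n
    X_hat = U[:, :k] @ Y + mean                    # map back to the original space
    return Y, ratio, X_hat

# Same kind of synthetic correlated data as above (seed chosen arbitrarily).
rng = np.random.default_rng(0)
Z = rng.standard_normal((3, 1000))
C = rng.standard_normal((3, 3))
X = C @ Z

Y, ratio, X_hat = pca_reduce(X, k=2)
print("explained variance ratio:", ratio)
print("reduced shape:", Y.shape)
print("relative reconstruction error:",
      np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```

A ratio close to 1 for the first one or two components suggests the remaining ones can be dropped with little loss; the reconstruction error quantifies what was discarded.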
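Answer
You can check out http://alias-i.com/lingpipe/demos/tutorial/svd/read-me.html. SVD and LSA are very similar to PCA; all of them are dimensionality reduction methods. They differ only in how the basis is computed.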
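The relationship can be checked numerically. PCA amounts to an SVD of mean-centered data, while LSA usually applies a truncated SVD to the raw term-document matrix without centering. The sketch below assumes a small made-up count matrix (not taken from the linked tutorial) and an arbitrary k=2.

```python
import numpy as np

# Hypothetical toy term-document counts: rows are terms, columns are documents.
A = np.array([
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
])
n = A.shape[1]

# PCA: center each row (feature), then take the SVD.
Ac = A - A.mean(axis=1, keepdims=True)
U_pca, S_pca, _ = np.linalg.svd(Ac, full_matrices=False)
pca_var = S_pca**2 / (n - 1)

# The same variances are the eigenvalues of the sample covariance matrix.
eigvals = np.linalg.eigvalsh(np.cov(A))[::-1]      # descending order

# LSA-style: truncated SVD of the raw matrix, no centering.
U_lsa, S_lsa, Vh_lsa = np.linalg.svd(A, full_matrices=False)
k = 2
docs_2d = np.diag(S_lsa[:k]) @ Vh_lsa[:k, :]       # k x n document coordinates

print("PCA variances from SVD: ", pca_var)
print("covariance eigenvalues: ", eigvals[:len(pca_var)])
print("LSA document coordinates:\n", docs_2d)
```

The first two printed vectors agree, which is the SVD/PCA equivalence; the LSA coordinates come from a different basis because the mean was never removed.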
Answer
Since you asked for a hands-on example, you could try an interactive demo.