Local Tangent Space Alignment (LTSA) is a prominent technique in non-linear dimensionality reduction, which preserves the local geometry of high-dimensional data by aligning local tangent spaces. Introduced by Zhenyue Zhang and Hongyuan Zha in their influential work, LTSA stands out for its mathematical elegance and practical effectiveness in capturing the intrinsic geometry of data manifolds. This document delves into LTSA, emphasizing its mathematical foundations, procedural steps, and practical implications, closely adhering to the original authors' explanations.
LTSA is predicated on the manifold assumption, which posits that high-dimensional data points often reside on a lower-dimensional manifold embedded within the high-dimensional space. Unlike global linear approaches, LTSA aims to uncover this manifold by exploring the data's local linear structures, subsequently aligning these local views to reveal the manifold's global structure.
The LTSA algorithm can be broken down into several mathematical steps, which we'll explore in detail:
Given a high-dimensional dataset $X = [x_1, \dots, x_N]$ where each $x_i \in \mathbb{R}^D$, the aim is to find a lower-dimensional representation $Y = [y_1, \dots, y_N]$ where each $y_i \in \mathbb{R}^d$ and $d < D$. For each point $x_i$, identify its $k$ nearest neighbors and construct the local covariance matrix:

$$
C_i = \tilde{X}_i \tilde{X}_i^\top
$$

where $X_i$ is the matrix whose columns are the $k$ neighbors of $x_i$ and $\tilde{X}_i = X_i - \bar{x}_i \mathbf{1}^\top$ is the mean-centered version of $X_i$. The local tangent space at $x_i$ is then approximated by the $d$ leading eigenvectors of $C_i$, corresponding to the $d$ largest eigenvalues.
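The tangent-space estimate above is just a local PCA. As a minimal sketch (toy random data, not the luma implementation), the covariance of one mean-centered neighborhood can be eigendecomposed to recover a tangent basis:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))  # toy data: 100 points in R^3

k, d = 10, 2   # neighborhood size and tangent-space dimension
i = 0          # examine the neighborhood of the first point

# k nearest neighbors of x_i by Euclidean distance (x_i is its own neighbor)
dists = np.linalg.norm(X - X[i], axis=1)
idx = np.argsort(dists)[:k]
Xi = X[idx]                         # k x D matrix of neighbors

Xi_centered = Xi - Xi.mean(axis=0)  # mean-centered neighborhood
Ci = Xi_centered.T @ Xi_centered    # D x D local covariance matrix

# the d leading eigenvectors of C_i span the estimated tangent space
eigvals, eigvecs = np.linalg.eigh(Ci)  # eigenvalues in ascending order
Qi = eigvecs[:, -d:]                   # D x d orthonormal tangent basis
```

`np.linalg.eigh` is used because $C_i$ is symmetric; it returns eigenvalues in ascending order, so the last $d$ columns are the leading eigenvectors.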
For each point $x_i$ and its neighbors, project the neighbors onto the tangent space to obtain local coordinates $\Theta_i = [\theta_1^{(i)}, \dots, \theta_k^{(i)}]$:

$$
\theta_j^{(i)} = Q_i^\top \left( x_{i_j} - \bar{x}_i \right), \qquad j = 1, \dots, k
$$

where $\bar{x}_i$ is the mean of the neighbors of $x_i$ and $Q_i$ is the $D \times d$ matrix of leading eigenvectors spanning the local tangent space.
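In practice the projection can be computed in one shot from an SVD of the centered neighborhood, since the right singular vectors coincide with the eigenvectors of the local covariance. A minimal sketch on toy data (variable names are illustrative, not luma's API):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 3))  # toy data: 100 points in R^3
k, d, i = 10, 2, 0

# neighborhood of x_i and its mean
idx = np.argsort(np.linalg.norm(X - X[i], axis=1))[:k]
Xi = X[idx]
x_bar = Xi.mean(axis=0)

# SVD of the centered neighborhood: rows of Vt are the
# eigenvectors of the local covariance matrix
U, s, Vt = np.linalg.svd(Xi - x_bar, full_matrices=False)
Qi = Vt[:d].T                    # D x d tangent basis

# local coordinates: project each centered neighbor onto the basis
Theta_i = Qi.T @ (Xi - x_bar).T  # d x k matrix of local coordinates
```

Algebraically, $\Theta_i = Q_i^\top \tilde{X}_i$ equals the top-$d$ left singular vectors scaled by their singular values, which is why the SVD route is the common implementation choice.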
The core of LTSA lies in aligning these local tangent spaces to construct a global coordinate system. This involves minimizing the discrepancy between the local coordinates in their respective tangent spaces and the corresponding coordinates in the global space. Formally, we seek to minimize:

$$
\sum_{i=1}^{N} \left\| Y_i \left( I - \tfrac{1}{k} \mathbf{1}\mathbf{1}^\top \right) - L_i \Theta_i \right\|_F^2
$$

where $Y$ is the $d \times N$ matrix of low-dimensional embeddings for all points (with $Y_i$ its columns restricted to the neighborhood of $x_i$), $L_i$ is a $d \times d$ transformation that maps the local coordinates $\Theta_i$ into the global coordinate system, and $\|\cdot\|_F$ denotes the Frobenius norm.
The solution to the optimization problem is found by constructing an alignment matrix $B$ from the local contributions of each point and its neighbors and then solving an eigenvalue problem. Specifically, $B$ is defined such that the total alignment error equals $\operatorname{tr}(Y B Y^\top)$, with $B = \sum_{i=1}^{N} S_i W_i W_i^\top S_i^\top$, where $S_i$ selects the neighborhood of $x_i$ and each local alignment matrix $W_i$ is constructed from the corresponding $\Theta_i$. The lower-dimensional embeddings $Y$ are then given by the eigenvectors corresponding to the $d$ smallest non-zero eigenvalues of $B$.
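The whole pipeline fits in a few dozen lines of NumPy. The sketch below follows the standard LTSA formulation (each neighborhood contributes $I - GG^\top$ to the alignment matrix, where $G$ stacks the constant vector and the local coordinates); it is an illustrative implementation, not the luma source:

```python
import numpy as np

def ltsa_sketch(X, d=2, k=8):
    """Minimal LTSA: returns an N x d embedding of the N x D array X."""
    N = X.shape[0]

    # k nearest neighbors per point (each point is its own nearest neighbor)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    nbrs = np.argsort(D2, axis=1)[:, :k]

    B = np.zeros((N, N))
    for i in range(N):
        idx = nbrs[i]
        Xi = X[idx] - X[idx].mean(axis=0)       # centered neighborhood

        # left singular vectors give the local coordinates' directions
        U, _, _ = np.linalg.svd(Xi, full_matrices=False)

        # orthonormal basis: constant vector plus top-d singular vectors
        G = np.hstack([np.full((k, 1), 1.0 / np.sqrt(k)), U[:, :d]])

        # accumulate this neighborhood's alignment contribution
        B[np.ix_(idx, idx)] += np.eye(k) - G @ G.T

    # eigenvectors for the d smallest non-zero eigenvalues of B
    # (the smallest eigenvalue is 0 with a constant eigenvector; skip it)
    _, vecs = np.linalg.eigh(B)
    return vecs[:, 1:d + 1]

# usage on a synthetic spiral in R^3
rng = np.random.default_rng(0)
t = rng.uniform(0, 4 * np.pi, 200)
X = np.column_stack([t * np.cos(t), rng.uniform(0, 5, 200), t * np.sin(t)])
Y = ltsa_sketch(X, d=2, k=8)
```

Because `np.linalg.eigh` returns orthonormal eigenvectors, the resulting embedding columns are orthonormal, matching the constraint $YY^\top = I$ used to fix the trivial scaling of the objective.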
`n_components` : int

`n_neighbors` : int, default = 10

Test on the Swiss roll dataset:
```python
from luma.reduction.manifold import LTSA
from sklearn.datasets import make_swiss_roll
import matplotlib.pyplot as plt

# Generate a noisy Swiss roll and embed it into two dimensions
X, y = make_swiss_roll(n_samples=500, noise=0.2)

model = LTSA(n_components=2, n_neighbors=8)
Z = model.fit_transform(X)

# Plot the original 3D data next to its 2D embedding
fig = plt.figure(figsize=(10, 5))
ax1 = fig.add_subplot(1, 2, 1, projection="3d")
ax2 = fig.add_subplot(1, 2, 2)

ax1.scatter(X[:, 0], X[:, 1], X[:, 2], c=y, cmap="rainbow")
ax1.set_xlabel(r"$x$")
ax1.set_ylabel(r"$y$")
ax1.set_zlabel(r"$z$")
ax1.set_title("Original Swiss-Roll")

ax2.scatter(Z[:, 0], Z[:, 1], c=y, cmap="rainbow")
ax2.set_xlabel(r"$z_1$")
ax2.set_ylabel(r"$z_2$")
ax2.set_title(f"After {type(model).__name__}")
ax2.grid(alpha=0.2)

plt.tight_layout()
plt.show()
```
LTSA is widely applied across domains where high-dimensional data can be assumed to lie near a low-dimensional manifold, such as image processing, pattern recognition, and the visualization of high-dimensional datasets.
- Zhang, Zhenyue, and Hongyuan Zha. "Principal Manifolds and Nonlinear Dimension Reduction via Local Tangent Space Alignment." Journal of Shanghai University (English Edition), 2004.