Metric Multidimensional Scaling (Metric MDS) is a form of Multidimensional Scaling (MDS) that preserves the pairwise distances between points in a high-dimensional space as closely as possible when mapping them to a lower-dimensional space. Unlike Nonmetric MDS, which preserves only the rank order of distances, Metric MDS attempts to reproduce the actual distance values. This makes it useful in applications where the magnitudes of the distances matter for the analysis, as in some research in psychology, biology, and the physical sciences.
Metric MDS is based on classical scaling theory, which assumes that the distance matrix used as input is derived from points in a Euclidean space. The primary goal is to find a configuration of points in a lower-dimensional space that minimizes the difference between the distances in this space and the original distances in the high-dimensional space.
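To make the classical scaling idea concrete, here is a minimal NumPy sketch of the textbook procedure (double centering of the squared distances followed by an eigendecomposition). The function names `classical_mds` and `pairwise_distances` are illustrative assumptions and are not part of `luma`; how `MetricMDS` is implemented internally is not shown here.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(D, n_components=2):
    """Embed points from a distance matrix D (n x n) via classical scaling."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Centering matrix J = I - (1/n) * 1 1^T
    J = np.eye(n) - np.ones((n, n)) / n
    # Double-centered Gram matrix B = -1/2 * J D^2 J
    B = -0.5 * J @ (D ** 2) @ J
    # B is symmetric, so eigh applies; eigenvalues come back in ascending order
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues (round-off or non-Euclidean input)
    lam = np.clip(eigvals[idx], 0.0, None)
    return eigvecs[:, idx] * np.sqrt(lam)


# Synthetic check: distances between 3-D points are reproduced by a 3-D embedding
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
D = pairwise_distances(X)
Z = classical_mds(D, n_components=3)
print(np.allclose(pairwise_distances(Z), D))
```

When the input distances really are Euclidean and `n_components` matches the intrinsic dimension, the embedding reproduces the distances up to rotation, reflection, and translation. With fewer components it gives the best low-rank approximation of the centered Gram matrix.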
Given a set of $n$ items with an $n \times n$ distance matrix $D$, where $d_{ij}$ represents the distance between items $i$ and $j$, Metric MDS seeks to find a set of points $\{x_1, x_2, \dots, x_n\}$ in a $k$-dimensional space ($k < n$) such that the Euclidean distances between these points, denoted as $\hat{d}_{ij}$, closely match the original distances $d_{ij}$. The objective function, often referred to as the stress or strain, is minimized during the process:

$$
\text{Stress} = \sum_{i<j} \left( d_{ij} - \hat{d}_{ij} \right)^2
$$
This optimization problem is typically solved using iterative methods, such as gradient descent or majorization techniques, to find the best lower-dimensional representation of the data.
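As a rough illustration of the majorization approach, the sketch below minimizes the raw stress (the sum over pairs of squared differences between target and embedded distances) with SMACOF-style Guttman transform updates and unit weights. The names `raw_stress` and `smacof`, the random initialization, and the stopping rule are assumptions made for this example; they are not taken from `luma`.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def raw_stress(D, Z):
    """Sum over i < j of (target distance - embedded distance)^2."""
    iu = np.triu_indices_from(D, k=1)
    return float(((D[iu] - pairwise_distances(Z)[iu]) ** 2).sum())


def smacof(D, n_components=2, n_iter=300, tol=1e-9, seed=0):
    """Minimize raw stress by iterated Guttman transforms (unit weights)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n, n_components))  # random starting configuration
    history = [raw_stress(D, Z)]
    for _ in range(n_iter):
        dist = pairwise_distances(Z)
        # ratio_ij = d_ij / dist_ij, set to 0 where the embedded distance is 0
        ratio = np.divide(D, dist, out=np.zeros_like(D), where=dist > 0)
        B = -ratio
        B[np.diag_indices(n)] = ratio.sum(axis=1)
        Z = B @ Z / n  # Guttman transform
        history.append(raw_stress(D, Z))
        if history[-2] - history[-1] < tol:
            break
    return Z, history


# Synthetic example: embed 4-D points into 2-D
rng = np.random.default_rng(1)
X = rng.normal(size=(15, 4))
D = pairwise_distances(X)
Z, history = smacof(D, n_components=2)
print(f"initial stress: {history[0]:.3f}, final stress: {history[-1]:.3f}")
```

Each Guttman transform is guaranteed not to increase the stress, which is why majorization is often preferred over plain gradient descent: it needs no step size. The result can still depend on the starting configuration, so several random restarts are common in practice.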
The `MetricMDS` estimator accepts the following parameters:

- `n_components` : `int`, default = `None`
- `p` : `float | int`, default = `None`
- `metric` : `Literal`, default = `'euclidean'`

```python
from luma.reduction.manifold import MetricMDS

from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import numpy as np

# Load the Iris data: 150 samples, 4 features, 3 classes
iris_df = load_iris()
X = iris_df.data
y = iris_df.target

# Reduce to two dimensions using Mahalanobis distances
model = MetricMDS(n_components=2, metric='mahalanobis')
X_trans = model.fit_transform(X)

fig = plt.figure(figsize=(11, 5))
ax1 = fig.add_subplot(1, 2, 1, projection="3d")
ax2 = fig.add_subplot(1, 2, 2)

# Left panel: first three features as axes, fourth feature as color
for cl, m in zip(np.unique(y), ["s", "o", "D"]):
    X_cl = X[y == cl]
    sc = ax1.scatter(
        X_cl[:, 0],
        X_cl[:, 1],
        X_cl[:, 2],
        c=X_cl[:, 3],
        marker=m,
        label=iris_df.target_names[cl],
    )

ax1.set_xlabel(iris_df.feature_names[0])
ax1.set_ylabel(iris_df.feature_names[1])
ax1.set_zlabel(iris_df.feature_names[2])
ax1.set_title("Original Iris Dataset")
ax1.legend()

cbar = ax1.figure.colorbar(sc, fraction=0.04)
cbar.set_label(iris_df.feature_names[3])

# Right panel: the two-dimensional embedding, one marker per class
for cl, m in zip(np.unique(y), ["s", "o", "D"]):
    X_tr_cl = X_trans[y == cl]
    ax2.scatter(
        X_tr_cl[:, 0],
        X_tr_cl[:, 1],
        marker=m,
        edgecolors="black",
        label=iris_df.target_names[cl],
    )

ax2.set_xlabel(r"$z_1$")
ax2.set_ylabel(r"$z_2$")
ax2.set_title(
    f"Iris Dataset after {type(model).__name__} (Mahalanobis)"
)
ax2.legend()
ax2.grid(alpha=0.2)

plt.tight_layout()
plt.show()
```
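The example above passes `metric='mahalanobis'`. For readers unfamiliar with that metric, the following NumPy sketch shows one common way to build a pairwise Mahalanobis distance matrix, using the inverse of the sample covariance. The helper `mahalanobis_distances` is hypothetical, and whether `MetricMDS` computes its distances this way internally is an assumption, not something the text confirms.

```python
import numpy as np


def mahalanobis_distances(X):
    """Pairwise Mahalanobis distances between the rows of X."""
    X = np.asarray(X, dtype=float)
    # Inverse of the sample covariance matrix (features are columns)
    VI = np.linalg.inv(np.cov(X, rowvar=False))
    diff = X[:, None, :] - X[None, :, :]
    # d_ij^2 = (x_i - x_j)^T VI (x_i - x_j)
    sq = np.einsum("ijk,kl,ijl->ij", diff, VI, diff)
    # Clip tiny negative values caused by round-off before the square root
    return np.sqrt(np.clip(sq, 0.0, None))


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
D = mahalanobis_distances(X)

# Rescaling individual features leaves Mahalanobis distances unchanged
X_scaled = X * np.array([1.0, 10.0, 0.1, 5.0])
print(np.allclose(D, mahalanobis_distances(X_scaled)))
```

Because the metric accounts for feature scale and correlation, the resulting distance matrix, and therefore the embedding, does not depend on the units in which each feature is measured.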