[Linear Algebra]#1 NORM

Jude's Sound Lab · January 22, 2023

Mathematics for ML

Introduction

torch.nn.functional.normalize is a function that normalizes the input tensor along a specific dimension, whereas torch.linalg.norm is a function that calculates the norm of the input tensor along a specific dimension.

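F.normalize rescales the input so that each vector along the chosen dimension has a norm of 1. With the default arguments (p=2, dim=1) this is the L2 norm, and the result is y = x / max(norm(x), eps), where the small constant eps (default 1e-12) prevents division by zero.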

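torch.linalg.norm returns the norm itself rather than a rescaled tensor. By default it computes the L2 norm for vector inputs (and the Frobenius norm for matrix inputs), that is, the square root of the sum of the squared elements. Other norms, such as L1, are selected with the ord parameter; F.normalize exposes the same choice through its p parameter, which defaults to p=2.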
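To make the difference concrete, here is a minimal pure-Python sketch of the two operations on a single vector. The helper names p_norm and normalize are hypothetical stand-ins, not the PyTorch functions themselves, and the eps guard is an assumption modeled on the eps argument of F.normalize.

```python
def p_norm(x, p=2.0):
    """Return the p-norm of a flat list of numbers: (sum |x_i|^p)^(1/p)."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def normalize(x, p=2.0, eps=1e-12):
    """Rescale x to unit p-norm; eps guards against division by zero."""
    denom = max(p_norm(x, p), eps)
    return [v / denom for v in x]


x = [3.0, 4.0]
print(p_norm(x))             # 5.0
print(normalize(x))          # [0.6, 0.8]
print(p_norm(normalize(x)))  # 1.0 up to floating-point rounding
```

The first function returns a single number (the norm), while the second returns a vector of the same shape whose norm is 1.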

Norm

Norm of a vector

There are several ways to define the norm of a vector:

L1 norm (Manhattan norm): To compute the L1 norm of a vector, simply sum the absolute values of all its elements. In mathematical notation, the L1 norm of a vector x with n elements can be written as ||x||_1 = sum(|x_i|), where |x_i| denotes the absolute value of the i-th element of x.
L2 norm (Euclidean norm): To compute the L2 norm of a vector, you can take the square root of the sum of the squares of its elements. In mathematical notation, the L2 norm of a vector x with n elements can be written as ||x||_2 = sqrt(sum(x_i^2)), where x_i^2 denotes the square of the i-th element of x.
Infinity norm (Chebyshev norm): To compute the infinity norm of a vector, simply take the maximum absolute value of its elements. In mathematical notation, the infinity norm of a vector x with n elements can be written as ||x||_inf = max(|x_i|), where |x_i| denotes the absolute value of the i-th element of x.
p-norm: The p-norm of a vector is a generalization of the L1 and L2 norms, and can be computed using the formula ||x||_p = (sum(|x_i|^p))^(1/p) for any p >= 1. This formula defines a family of norms that depends on the parameter p, where p=1 gives the L1 norm, p=2 gives the L2 norm, and the limit p->infinity gives the infinity norm.
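The four definitions above can be sketched in a few lines of plain Python. The function names are illustrative choices, not part of any library. The loop at the end shows the p-norm approaching the infinity norm as p grows.

```python
import math


def l1_norm(x):
    """Sum of absolute values."""
    return sum(abs(v) for v in x)


def l2_norm(x):
    """Square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in x))


def inf_norm(x):
    """Largest absolute value."""
    return max(abs(v) for v in x)


def lp_norm(x, p):
    """General p-norm for p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


x = [3.0, -4.0, 1.0]
print(l1_norm(x))   # 8.0
print(l2_norm(x))   # sqrt(26), about 5.099
print(inf_norm(x))  # 4.0

# As p increases, the p-norm moves from the L1 norm toward the infinity norm.
for p in (1, 2, 10, 100):
    print(p, lp_norm(x, p))
```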

Norm of a matrix

The norm of a matrix is a quantity that measures the size of the matrix in a certain sense. There are different ways to define the norm of a matrix, depending on the specific application and the properties of the matrix.

Frobenius norm: The Frobenius norm of a matrix is defined as the square root of the sum of the squared absolute values of all its elements. In mathematical notation, the Frobenius norm of a matrix A with m rows and n columns can be written as ||A||_F = sqrt(sum(|A_{i,j}|^2)), where |A_{i,j}| denotes the absolute value of the element in the i-th row and j-th column of A.
Spectral norm (or operator norm): The spectral norm of a matrix is defined as the maximum singular value of the matrix, which is the square root of the largest eigenvalue of the matrix A^T A, where A^T is the transpose of A. In mathematical notation, the spectral norm of a matrix A with m rows and n columns can be written as ||A||_2 = sqrt(max(eig(A^T A))).
Infinity norm (maximum absolute row sum norm): The infinity norm of a matrix is defined as the maximum absolute row sum of the matrix. In mathematical notation, the infinity norm of a matrix A with m rows and n columns can be written as ||A||_inf = max_i(sum_j(|A_{i,j}|)), where the sum is taken over all columns j and the maximum is taken over all rows i.
p-norm: The induced p-norm of a matrix measures the largest factor by which the matrix can stretch a vector, with lengths measured in the vector p-norm. In mathematical notation, the p-norm of a matrix A with m rows and n columns can be written as ||A||_p = max(||Ax||_p / ||x||_p), where the maximum is taken over all nonzero vectors x with n elements. The cases p=1, p=2, and p=infinity give the maximum absolute column sum, the spectral norm, and the maximum absolute row sum, respectively.
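As a rough illustration, the matrix norms can be computed for a small real matrix stored as a list of rows. Everything here is a sketch with hypothetical function names. The one_norm helper (maximum absolute column sum) is an addition for comparison, and the spectral norm is approximated by power iteration on A^T A, which is a swapped-in technique, not a direct eigenvalue solve. It assumes the start vector is not orthogonal to the dominant singular direction.

```python
import math


def frobenius_norm(A):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))


def inf_norm(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def one_norm(A):
    """Maximum absolute column sum."""
    return max(sum(abs(a) for a in col) for col in zip(*A))


def spectral_norm(A, iters=200):
    """Approximate the largest singular value by power iteration on A^T A."""
    v = [1.0] * len(A[0])
    for _ in range(iters):
        Av = [sum(a * x for a, x in zip(row, v)) for row in A]      # A v
        w = [sum(a * y for a, y in zip(col, Av)) for col in zip(*A)]  # A^T (A v)
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        v = [x / norm_w for x in w]  # keep v at unit length
    Av = [sum(a * x for a, x in zip(row, v)) for row in A]
    return math.sqrt(sum(y * y for y in Av))  # ||A v|| for unit v


A = [[1.0, 2.0],
     [3.0, 4.0]]
print(frobenius_norm(A))  # sqrt(30), about 5.477
print(inf_norm(A))        # 7.0
print(one_norm(A))        # 6.0
print(spectral_norm(A))   # about 5.465
```

For this A, the matrix A^T A is [[10, 14], [14, 20]], whose largest eigenvalue is 15 + sqrt(221), so the spectral norm is sqrt(15 + sqrt(221)), which matches the value the iteration converges to.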

determinant

The determinant of a matrix is a scalar value that is used to characterize the matrix and its properties. It is denoted by det(A) or |A|, where A is the matrix. The determinant of a matrix is only defined for square matrices, i.e., matrices with the same number of rows and columns.

There are different ways to define the determinant of a matrix, but one common definition is based on the concept of signed area or volume. Specifically, the determinant of a 2x2 matrix is the signed area of the parallelogram formed by the column vectors of the matrix, and the determinant of a 3x3 matrix is the signed volume of the parallelepiped formed by the column vectors of the matrix.

[Figure: parallelogram]

The sign of the determinant depends on the orientation of the column vectors. For a 2x2 matrix, the determinant is positive if the rotation from the first column vector to the second is counterclockwise and negative if it is clockwise. For a 3x3 matrix, it is positive if the three column vectors form a right-handed system and negative if they form a left-handed one.

Notice that the area of the parallelogram formed by the rows is the same as the area of the parallelogram formed by the columns, and the orientation (counterclockwise or clockwise) is the same as well. This is because a matrix and its transpose have the same determinant: det(A^T) = det(A).
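A small check of these statements for the 2x2 case, using a hypothetical helper det2 that applies the formula ad - bc. The example matrix is an arbitrary choice for illustration.

```python
def det2(A):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c


A = [[2.0, 1.0],
     [0.0, 3.0]]          # columns (2, 0) and (1, 3): counterclockwise order
swapped = [[1.0, 2.0],
           [3.0, 0.0]]    # same columns in the opposite (clockwise) order
transposed = [list(row) for row in zip(*A)]

print(det2(A))           # 6.0
print(det2(swapped))     # -6.0
print(det2(transposed))  # 6.0
```

Swapping the two columns keeps the parallelogram but reverses its orientation, so only the sign changes. Transposing changes neither the area nor the sign.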

The link below contains illustrated proofs that ad - bc is the area of the parallelogram defined by a 2x2 matrix with entries a, b, c, d:
https://math.stackexchange.com/questions/29128/why-determinant-of-a-2-by-2-matrix-is-the-area-of-a-parallelogram

The determinant of a matrix has many important applications in linear algebra, such as:

Computing the inverse of a matrix: A square matrix A is invertible if and only if its determinant is nonzero. Moreover, the inverse of A can be computed using the adjugate matrix, which is a matrix that is obtained by replacing each element of A with its cofactor, and then transposing the resulting matrix. The adjugate matrix is then multiplied by 1/det(A), where det(A) is the determinant of A.
Solving systems of linear equations: A system Ax = b with a square coefficient matrix A has a unique solution exactly when det(A) is nonzero, and in that case x = A^(-1) b. The inverse of the coefficient matrix can be computed using the determinant and the adjugate matrix.
Testing for linear independence: A set of n vectors with n elements each is linearly independent if and only if the determinant of the square matrix formed by these vectors is nonzero.
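The three applications can be sketched for the 2x2 case, where the adjugate of [[a, b], [c, d]] is [[d, -b], [-c, a]]. The function names and the example numbers are illustrative assumptions. In numerical practice, comparing a floating-point determinant to exactly zero is fragile, and a tolerance is normally used.

```python
def det2(A):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c


def inverse2(A):
    """Inverse of a 2x2 matrix as adjugate(A) / det(A)."""
    det = det2(A)
    if det == 0:
        raise ValueError("matrix is singular (det = 0), so it has no inverse")
    (a, b), (c, d) = A
    return [[d / det, -b / det],
            [-c / det, a / det]]


def solve2(A, rhs):
    """Solve A x = rhs by multiplying rhs with the inverse of A."""
    inv = inverse2(A)
    return [sum(m * r for m, r in zip(row, rhs)) for row in inv]


A = [[2.0, 1.0],
     [1.0, 3.0]]
print(inverse2(A))            # [[0.6, -0.2], [-0.2, 0.4]]
print(solve2(A, [3.0, 5.0]))  # approximately [0.8, 1.4]

# The columns (1, 2) and (2, 4) are parallel, so the determinant is zero
# and the two vectors are linearly dependent.
print(det2([[1.0, 2.0], [2.0, 4.0]]))  # 0.0
```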

calculating eigenvalues

# for example
A = [2 1]
    [1 2]
    
det(A - lambda*I) = 0

det([2-lambda 1       ]
    [1        2-lambda]) = 0

(2-lambda)^2 - 1 = 0

lambda^2 - 4lambda + 3 = 0

lambda = (4 +/- sqrt(4^2 - 4*3)) / 2
       = 2 +/- 1

So the eigenvalues of A are lambda1 = 3 and lambda2 = 1.

In summary, to find the eigenvalues of a matrix, you need to solve the characteristic equation of the matrix, which is obtained by setting the determinant of the matrix minus the eigenvalue times the identity matrix equal to zero. Then you can solve the resulting polynomial equation to find the eigenvalues.
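For a 2x2 matrix [[a, b], [c, d]], the characteristic equation expands to lambda^2 - (a + d)*lambda + (ad - bc) = 0, so the quadratic formula gives both eigenvalues directly. The sketch below (with a hypothetical function name) reproduces the worked example. It uses cmath so that matrices with complex eigenvalues also work.

```python
import cmath


def eigenvalues_2x2(A):
    """Solve lambda^2 - trace*lambda + det = 0 for a 2x2 matrix."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)  # square root of the discriminant
    return (trace + root) / 2, (trace - root) / 2


lam1, lam2 = eigenvalues_2x2([[2, 1], [1, 2]])
print(lam1.real, lam2.real)  # 3.0 1.0
```

This closed form only applies to the 2x2 case. For larger matrices the characteristic polynomial has a higher degree and numerical libraries use iterative methods instead.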