Source: www.researchsquare.com
Isometric Projection with Autoencoder
Ruisheng Ran ( rshran@cqnu.edu.cn )
Chongqing Normal University
Qianghui Zeng ( 2021210516092@stu.cqnu.edu.cn )
Chongqing Normal University
Xiaopeng Jiang ( 2021210516042@stu.cqnu.edu.cn )
Chongqing Normal University
Bin Fang ( fb@cqu.edu.cn )
Chongqing University
Research Article
Keywords:
DOI: https://doi.org/
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Isometric Projection with Autoencoder
Ruisheng Ran1*, Qianghui Zeng1†, Xiaopeng Jiang1† and Bin Fang2†
1*The College of Computer and Information Science, Chongqing
Normal University, Chongqing, 401331, China.
2The College of Computer Science, Chongqing University,
Chongqing, 400044, China.
*Corresponding author(s). E-mail(s): rshran@cqnu.edu.cn;
Contributing authors: 2021210516092@stu.cqnu.edu.cn;
2021210516042@stu.cqnu.edu.cn; fb@cqu.edu.cn;
†These authors contributed equally to this work.
Abstract
Isometric Projection (IsoP) is a linear dimensionality reduction method,
which provides the best linear approximation to the true isometric
embedding of data. However, IsoP and all its variants only consider the
one-way mapping from high-dimensional space to low-dimensional space.
The projected low-dimensional data may not “represent” the original
sample accurately and effectively. In this paper, based on the structure
of a linear autoencoder, a new IsoP method called IsoP-AE (Isometric
Projection with Autoencoder) is proposed. In this method, the
conventional projection of IsoP is viewed as the encoding stage, and the
decoder is used to reconstruct the original high-dimensional data from
the projected low-dimensional data. In this way, our algorithm makes
the low-dimensional embedding data “represent” the original data more
accurately and effectively. Experimental results on the Handwritten
Alphadigits, COIL-100, Olivetti Research Laboratory (ORL) and Georgia Tech
face datasets show that the proposed IsoP-AE approach provides a better
representation of the data and achieves much higher recognition accuracy.
Keywords: Isometric Projection, autoencoder, dimensionality reduction,
manifold learning
1 Introduction
The term “curse of dimensionality” [1, 2] was coined by the mathematician
Richard Bellman in his study of dynamic programming problems; it describes
a series of mathematical phenomena that arise in high-dimensional spaces.
In the field of Machine Learning (ML) [3] in particular, it usually refers
to the exponential relationship between the dimensionality of a dataset and
the amount of data required: in general, as the number of features grows,
the number of samples needed to train a model increases exponentially. The
resulting difficulty of training machine learning models on
high-dimensional data is what is known as the “curse of dimensionality”.
Dimensionality reduction (DR) [4, 5] is one of the effective ways to
mitigate the curse of dimensionality. Dimensionality reduction methods are
generally divided into linear and nonlinear methods [6]. Linear techniques
assume that the data structure is linear and use a simple linear function
to project high-dimensional data into a low-dimensional space, thereby
obtaining low-dimensional features of the data. Representative linear
algorithms include Principal Component Analysis (PCA) [7] and Linear
Discriminant Analysis (LDA) [8, 9]. Both assume that the original dataset
is embedded in a global linear structure; consequently, when the data are
nonlinear, their dimensionality reduction performance is poor.
To handle nonlinear data, kernel-based [10] and manifold-based [11]
dimensionality reduction methods have been proposed. Kernel-based methods
map the data into a higher-dimensional space in which it becomes linearly
separable, but choosing the kernel, which is the most critical step, is
difficult and can only be done empirically. Because of this limitation,
manifold learning has emerged in recent years as another important
nonlinear dimensionality reduction technique; representative methods are
Locally Linear Embedding (LLE) [12] and Isomap [13]. The disadvantage of
these nonlinear methods, however, is that the embedding is defined only on
the training set and cannot be applied to test data. Linearized versions of
nonlinear manifold methods have therefore been proposed: Locality
Preserving Projections (LPP) [14] is the linearization of Laplacian
Eigenmap (LE) [15, 16], and Isometric Projection (IsoP) [17] is the
linearization of Isomap.
The IsoP algorithm first constructs the nearest-neighbor graph of the
observed data and then computes the shortest paths between all pairs of
data points in the graph, which yields an estimate of the global structure
of the data. Multi-dimensional Scaling (MDS) [18, 19] is then applied with
the additional requirement that the mapping function be linear, which gives
the objective function of IsoP. IsoP retains the advantages of Isomap while
overcoming its disadvantage of only providing embeddings for the training
data.
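The graph and shortest-path steps described above can be sketched as
follows. This is a minimal NumPy illustration, not the authors'
implementation: the function names `geodesic_distances` and `mds_gram`, the
neighborhood size, and the use of the Floyd-Warshall algorithm are
assumptions made for the example, and the final linear-projection step of
IsoP is omitted because its objective function is not stated at this point
in the text.

```python
import numpy as np


def geodesic_distances(X, n_neighbors=5):
    """Estimate geodesic distances by shortest paths on a k-NN graph.

    X has shape (n_samples, n_features); rows are assumed to be distinct.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    E = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep only the edges to each point's k nearest neighbors
    # (column 0 of the argsort is the point itself, at distance 0).
    G = np.full((n, n), np.inf)
    nearest = np.argsort(E, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    G[rows, cols] = E[rows, cols]
    G = np.minimum(G, G.T)  # treat the graph as undirected
    np.fill_diagonal(G, 0.0)
    # Floyd-Warshall: allow each node k in turn as an intermediate stop.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    return G


def mds_gram(D):
    """Classical MDS step: double-center the squared distance matrix."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return -0.5 * H @ (D ** 2) @ H


# Hypothetical toy data: 20 points sampled along a half circle of radius 1.
theta = np.linspace(0.0, np.pi, 20)
X = np.column_stack([np.cos(theta), np.sin(theta)])
D = geodesic_distances(X, n_neighbors=2)
B = mds_gram(D)
```

On this toy data the estimated geodesic distance between the two endpoints,
`D[0, -1]`, is close to the arc length π, whereas their Euclidean distance
is 2. This gap is the global structure that the shortest-path step is meant
to capture.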
Many variants have been proposed that improve on IsoP. In ML, we are often
faced with high-dimensional data for which the number of samples is much
smaller than the dimension of the samples; in this case a singular-matrix
problem arises when manifold learning algorithms are solved. This is the
so-called small-sample-size (SSS) [20] problem, and the IsoP algorithm
faces it as well.
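The SSS problem can be illustrated numerically. The sketch below assumes,
for illustration only, that the matrix to be inverted has the form X X^T
with samples stored as the columns of X; the function name `gram_rank` and
the sizes used are hypothetical. With fewer samples than dimensions, the
d x d matrix has rank at most n and is therefore singular.

```python
import numpy as np


def gram_rank(d, n, seed=0):
    """Return the numerical rank of X X^T for a random d x n data matrix."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((d, n))  # columns are samples
    S = X @ X.T                      # d x d, but built from only n samples
    return np.linalg.matrix_rank(S)


d, n = 50, 10  # hypothetical sizes: dimension exceeds sample count
rank = gram_rank(d, n)
is_singular = rank < d  # True whenever n < d
```

Because the rank cannot exceed the number of samples, the matrix has no
inverse whenever n < d, which is why a preprocessing or regularization step
is needed before the eigenproblem can be solved.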
The raw data is usually preprocessed using PCA or Singular Value
Decomposition (SVD) [21, 22], which avoids the SSS problem but also
inherits the shortcomings of PCA [23]. To address this issue, other
variants of IsoP have been proposed, such as Tensor based Isometric
Projection (TIsoP) [24] and Isometric Projection based on Maximal Margin
Criterion (IsoP-MMC) [25]. Other improved methods include Orthogonal
Isometric Projection (OIsoP) [26] and Uncorrelated Discriminant Isometric
Projections (UDIsoP) [27]; OIsoP can be regarded as an extension of IsoP,
and UDIsoP is a feature extraction method for face recognition. Applying
the regularization method given in Ref. [28] to IsoP yields Regularized
Isometric Projection (RIsoP), and applying the matrix-exponential embedding
given in Ref. [29] yields Exponential Isometric Projection (EIsoP).
The idea of OIsoP is the same as that of IsoP, but it further requires the
projection matrix to be orthogonal, and its constraints differ from those
of Cai’s orthogonal projection. TIsoP is another extension of IsoP. The
algorithm uses a two-dimensional image matrix instead of a traditional
one-dimensional vector and performs SVD in the tensor space, thereby
avoiding the small-sample-size problem.
However, current IsoP methods and their variants only consider the one-way
mapping from the high-dimensional manifold space to the low-dimensional
space. This mapping enables the embedded low-dimensional data points to
preserve the intrinsic geometry of the original sample, but it may not
“represent” the original sample very accurately and effectively.
In this work, based on the structure of a linear autoencoder, we propose a
new IsoP method called IsoP-AE (Isometric Projection with Autoencoder).
Specifically, while maintaining the geodesic distance information of the
samples, the data points in the high-dimensional manifold space are encoded
into data points in a low-dimensional space using the conventional IsoP
projection model. In addition, a decoder is used to reconstruct the
original high-dimensional data points from the embedded low-dimensional
data points. That is, compared with the conventional IsoP, the new method
has an additional reconstruction stage. This stage enables the embedded
low-dimensional data to retain as much information as possible about the
original high-dimensional data, so that the embedded low-dimensional data
“represent” the original samples more accurately and effectively.
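The encoding and reconstruction stages of a linear autoencoder can be
sketched as follows. This is only an illustration of the encoder-decoder
idea, not the IsoP-AE objective itself, which has not been stated at this
point: the tied-weight form (encoder A^T, decoder A), the function names,
and the choice of A from an SVD of synthetic data are all assumptions made
for the example.

```python
import numpy as np


def encode(A, X):
    """Project d-dimensional samples (columns of X) to r dimensions."""
    return A.T @ X


def decode(A, Y):
    """Map r-dimensional codes back to the d-dimensional space."""
    return A @ Y


def reconstruction_error(A, X):
    """Squared Frobenius norm of X minus its reconstruction."""
    return np.linalg.norm(X - decode(A, encode(A, X)), "fro") ** 2


# Hypothetical data: 30 samples in 20 dimensions lying in a 3-D subspace.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 30))

# Orthonormal projection directions taken from the SVD of X (assumption).
U, _, _ = np.linalg.svd(X, full_matrices=False)
err_r3 = reconstruction_error(U[:, :3], X)  # codes keep all 3 directions
err_r2 = reconstruction_error(U[:, :2], X)  # codes drop one direction
```

With three projection directions the codes retain all the information in
this synthetic data and the reconstruction error is numerically zero; with
two directions some information is lost and the error is strictly larger.
A reconstruction term of this kind is what rewards a projection for
retaining information about the original samples.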
The rest of this paper proceeds as follows. In the second section, we
review the Isomap method, the IsoP method and the autoencoder. In the third
section, we propose the novel IsoP method with the encoder-decoder
paradigm. In the fourth section,