257x Filetype PDF File size 0.48 MB Source: www-personal.umich.edu
LECTURE NOTES ON NON-ASYMPTOTIC THEORY
OF RANDOM MATRICES
MARKRUDELSON
AMSSHORTCOURSEONRANDOMMATRICES
San Diego, January 2013
Partially supported by NSF DMS grant DMS 1161372.
1
2 MARKRUDELSON
1. Introduction
The classical random matrix theory is concerned with asymptotic of various
spectral characteristics of families of random matrices, when the dimensions of
the matrices tend to infinity. There are many examples when these character-
istics, which are random variables themselves, converge to certain limit laws.
This includes the celebrated Wigner semicircle law for the empirical measures
of eigenvalues of random symmetric matrices, Marchenko–Pastur law, which is
the limit of empirical measures of sample covariance matrices, Tracy–Widom
distribution describing the limit of the first singular values of a sequence of
random matrices, etc. [1]. These limits are of paramount importance, yet in
applications one usually needs information about the behavior of such charac-
teristics for large, but fixed n. For instance in problems in convex geometry
one constructs a random section of an N-dimensional convex body by taking
the kernel or the range of a certain random matrix. Random matrices arise
also in analysis of rates of convergence of computer science algorithms. In
both cases, the dimension of the ambient space remains fixed, and one seeks
explicit estimates of probabilities in terms of the dimension. For such problems
knowing the limit behavior is of little help.
The problems involving estimates for a fixed finite dimension arise in the
classical random matrix theory as well. One of the main approaches in deriving
the limit laws is based on analysis of the Stieltjes transform of measures [1].
Toderive the convergence of Stieltjes transforms, one frequently has to provide
explicit bounds on the smallest singular value of a random matrix of a fixed
size, which holds with high probability. This need arises, e.g., in derivation of
the circular law and the single ring theorem.
These questions led to development of non-asymptotic theory of random
matrices, which provides probabilistic bounds for eigenvalues, singular values,
etc. for random matrices of a large fixed size. The situation is roughly parallel
to that arising for the sums of i.i.d. random variables, where the asymptotic
and non-asymptotic results go hand in hand. The asymptotic behavior of the
averages of n i.i.d. random variables is governed by the Strong Law of Large
Numbers establishing the almost sure convergence to the expectation. Yet, to
assert that the average of a large number of random variables is close to the
expectation, we need a non-asymptotic version, e.g. Hoeffding inequality. This
inequality yields a subgaussian bound for the large deviations (see the details
below). Such behavior suggests that the limit distribution of the deviation
should be normal, which leads to an asymptotic result, the Central Limit
Theorem (CLT). To use the CLT in evaluation of probabilities for random
sums, we need its non-asymptotic version, namely the Berry–Esseen Theorem.
This theorem provides in turn a crucial step in deriving another fundamental
asymptotic result, the Law of Iterated Logarithm.
NON-ASYMPTOTIC THEORY 3
These notes discuss the methods of the non-asymptotic approach to the
random matrix theory. We do not attempt to provide an exhaustive list of
references (a reader can check the surveys [5], [27], and [40]). Instead we con-
centrate on three essentially different examples, with the aim of presenting
the methods and results in a maximally self-contained form. This approach
inevitably leaves out several important recent developments, such as invertibil-
ity of random symmetric matrices [42, 19], applications to the Circular Law
[9, 37, 38], and concentration for random determinants [39, 20]. Yet, by re-
stricting ourselves to a few results, we will be able to give a relatively complete
picture of the ideas and methods involved in their proofs. We start with in-
troduction to subgaussian random variables in Section 3. In Sections 5-7 we
obtain quantitative bounds for invertibility of random matrices with i.i.d. en-
tries. As will be shown in Section 6, the arithmetic structures play a crucial
role here. Section 8 studies a question arising in geometric functional analysis.
Here the ambient space is Banach, and the approach combines the methods
of the previous sections with the functional-analytic considerations. We will
also touch upon majorising measures, which are a powerful tool for estimating
suprema of random processes. Section 9 contains another quantitative invert-
ibility result. Here we discuss a random unitary or orthogonal perturbation
of a fixed matrix. Unlike in the first example, the arithmetic structure plays
no role in this problem. The main difficulty is the dependence between the
entries of a random matrix, and the method is based on the introduction of
perturbations with independent entries.
Acknowledgement. Thesenotes are based in part on the material presented
at the workshop “Etats de la Recherche: Probability and geometry in inter-
action” at Paul Sabatier University in Toulouse, the mini-course given at the
Warsaw University, and the Informal Analysis Seminar at Kent State Univer-
sity. The author is grateful to Franck Barthe, Michel Ledoux, Rafal Latala,
Krzystof Oleszkiewicz, Michal Wojchehowski, Fedor Nazarov, Dmitry Ryabo-
gin, and Artem Zvavitch for their hospitality. The author also grateful to
Fedor Nazarov for careful reading of the manuscript and many suggestions,
which led to improvement of the presentation.
2. notation and basic definitions
We shall consider random matrices of high order with independent entries.
For simplicity, we shall assume that the entries are centered (Ea =0) and
j,k
identically distributed (both conditions may be relaxed).
4 MARKRUDELSON
Throughout these notes k·k denotes the ℓp norm
p
!
n 1/p
X p
kxk = |xj| , 1 ≤ p < ∞,
p
j=1
and Bn stands for the unit ball of this norm. The norm of an operator or a
p
matrix will be denoted by k·k. We use Sn−1 for the unit Euclidean sphere. If
F is a finite set, then |F| denotes the cardinality of F. Letters C,C′,c etc.
denote absolute constants.
n
If N ≥ n then an N ×n matrix A can be viewed as a mapping of R into
RN. Thus, a random matrix defines a random n-dimensional section of RN.
For geometric applications we need to know that this matrix would not distort
the metric too much. Let us formulate it more precisely:
Definition 2.1. Let N ≥ n and let A be an N × n matrix. The condition
number of the matrix A is
max n−1 kAxk
σ(A) = x∈S 2.
min n−1 kAxk
x∈S 2
If minx∈Sn−1 kAxk = 0, we set σ(A) = ∞.
2
The condition number of a matrix can be rewritten in terms of its singular
values.
Definition 2.2. Let N ≥ n and let A be an N × n matrix. The singular
∗ 1/2
values of A are the eigenvalues of (A A) , arranged in the decreasing order:
s1(A) ≥ s2(A) ≥ ... ≥ sn(A).
The singular values of A are the lengths of the semi-axes of the ellipsoid
ABn. The first and the last singular values have a clear functional-analytic
2
meaning:
n N
s1(A) = A : R → R ,
and
−1 n n
s (A) = min kAxk = 1/ A : AR →R ,
n n−1
x∈S
whenever A has the full rank. In this notation σ(A) = s (A)/s (A).
1 n
Therefore, to bound the condition number, we have to estimate the first
singular value from above, and the last one from below. For matrices with
i.i.d. random entries the first singular value is the most robust. It can be
estimated using a simple ε-net argument, as will be shown in Proposition 4.4.
The last singular value presents a bigger challenge. We will obtain its bounds
for “tall” rectangular matrices in Section 4, and for square matrices in Sections
5-7.
no reviews yet
Please Login to review.