An introduction to optimization on smooth manifolds

I taught Riemannian optimization at EPFL in spring 2023 for the course MATH-512. The recorded lectures are available below. A semester runs for 14 weeks. Most weeks have a lecture and an exercise session, which each last 90-95 minutes (week 5 had double lectures). The students further learned the material by completing two extensive projects.

Video links bring you to an external hosting service. There, you can change the playback speed of videos (play faster, slower): see controls at the bottom-right of the video panel. You can also download the videos.

I gratefully acknowledge EPFL's CEDE (Matthew Goodman in particular) for video post-production!

#	Slides	Title + videos
exercises
	pdf

Exercises and solutions prepared with Christopher Criscitiello and Timon Miehling in 2023.

Click a question to display a sketch of the answer if available. Click a sketch to display a detailed answer if available. If no sketch is available but a detailed answer is, then clicking the question displays the detailed answer.

Sketches of answers: hide all, show all. Detailed answers: hide all, show all.

To load exercises below, go to the Lectures tab, and click "Exercises" next to a week's header.

You may find the Manopt toolboxes helpful to use Riemannian optimization in your projects:

Manopt in Matlab,
PyManopt in Python,
Manopt.jl in Julia.

For example, in Matlab, to minimize the function $f(x) = x^\top A x$ on the sphere in $\Rn$, you can write:

n = 42;
A = randsym(n);
problem.M = spherefactory(n);
problem.cost = @(x) x'*A*x;
problem.egrad = @(x) 2*A*x;  % or use autodiff: problem = manoptAD(problem);
x = trustregions(problem);   % call an overly fancy algorithm

In Python, you could use JAX (among other backends) and write:

import pymanopt
import jax

n = 42
key = jax.random.key(0)
A = jax.random.normal(key, [n, n])
A = A + A.T

manifold = pymanopt.manifolds.Sphere(n)
@pymanopt.function.jax(manifold)
def cost(x):
    return x.T @ A @ x

problem = pymanopt.Problem(manifold, cost)
optimizer = pymanopt.optimizers.TrustRegions()
result = optimizer.run(problem)

In Julia, you might write:

using Manopt, Manifolds, LinearAlgebra
n = 42
A = Symmetric(randn(n, n))
M = Sphere(n-1)
f(E, x) = x'*A*x
∇f(E, x) = 2*A*x  # define the Euclidean gradient
x = trust_regions(M, f, ∇f; objective_type=:Euclidean)

Contact: nicolas.boumal@epfl.ch.

For general questions about geometry and optimization, please use the Manopt forum, MathOverflow or Math StackExchange.

I gave one of the SIAM Optimization 2023 minitutorials. It introduces the basics of differential geometry and Riemannian geometry for optimization, paralleling the early material in this book.

The two lectures of 90 minutes each are available as video 1 and video 2.

Here are the slides and also the Max-Cut example code.

Earlier (and more condensed) versions of that tutorial are available as a one-hour video (Simons Institute) and a two-hour video (ETHZ/EPFL summer school).

If you enjoy such topics, then you will definitely enjoy the classic book Optimization Algorithms on Matrix Manifolds by Absil, Mahony and Sepulchre (Princeton University Press, 2008), also freely available online.

This book about Riemannian optimization by Nicolas Boumal is published by Cambridge University Press, 2023.

You can also download the pre-publication PDF.

This website further offers recorded lectures (videos + slides) and exercises, as a companion to the book.

Feel free to e-mail me about any mistakes you spot or suspect, be it typos or more serious things. These are added as sticky notes in the pdf above (last updated on Sep. 15, 2023). I appreciate your input, always.

For a shorter video introduction, click the More tab.

About the book

Optimization on manifolds is the result of smooth geometry and optimization merging into one elegant modern framework. This text introduces the differential geometry and Riemannian geometry concepts to help students and researchers in applied mathematics, computer science and engineering gain a firm mathematical grounding to use these tools confidently in their research.

All definitions and theorems are motivated to build time-tested optimization algorithms. Starting from first principles, the text goes on to cover current research on topics including iteration complexity and geodesic convexity. Readers will appreciate the tricks of the trade sprinkled throughout the book, to guide research and numerical implementations.

This book has no prerequisites in geometry or optimization. Chapters 3 and 5 can serve as a standalone introduction to differential and Riemannian geometry, focused on embedded submanifolds of linear spaces, with proofs and an eye towards computability. It then builds from there to cover algorithms and equip the reader for modern research challenges.

Chapter 8 provides the general theory so that we can build quotient manifolds in Chapter 9. The optimization algorithms in Chapters 4 and 6 apply to the general case, but can already be understood after reading Chapters 3 and 5. Chapter 7 details examples of submanifolds that come up in practice. Chapter 10 covers more advanced Riemannian tools, and Chapter 11 introduces geodesic convexity.

Teaching

As a course, this material is popular with applied mathematicians, computer scientists and mathematically inclined engineering students, at the graduate and advanced undergraduate levels.

In a one-semester graduate course of the mathematics department at Princeton University in 2019 and 2020 (24 lectures of 80 minutes each, two projects, no exercises), I covered much of (what became) Chapters 1–6 and select parts of Chapter 7 before the midterm break, then much of Chapters 8–9 and select parts of Chapters 10–11 after the break. Those chapters were shorter at the time, but it still made for a sustained pace.

At EPFL in 2021, I discussed mostly Chapters 1–8 in 13 lectures of 90 minutes, plus as many exercise sessions and two projects. In 2023, the lectures were recorded: see Lectures tab above.

If you (will) teach this topic, feel free to e-mail me: I can share more resources.

How to cite

@Book{boumal2023intromanifolds,
title     = {An introduction to optimization on smooth manifolds},
author    = {Boumal, Nicolas},
publisher = {Cambridge University Press},
year      = {2023},
url       = {https://www.nicolasboumal.net/book},
doi       = {10.1017/9781009166164}
}

It is best to reference sections, theorems, equations, etc., as all such objects are numbered identically in the pre-publication PDF available here and in the published version. In contrast, page numbers differ.

Table of contents (expand/collapse)

Preface
1. Introduction
2. Simple examples
- 2.1 Sensor network localization from directions: an affine subspace
- 2.2 Single extreme eigenvalue or singular value: spheres
- 2.3 Dictionary learning: products of spheres
- 2.4 Principal component analysis: Stiefel and Grassmann
- 2.5 Synchronization of rotations: special orthogonal group
- 2.6 Low-rank matrix completion: fixed-rank manifold
- 2.7 Gaussian mixture models: positive definite matrices
- 2.8 Smooth semidefinite programs
3. Embedded geometry: first order
- 3.1 Reminders of Euclidean space
- 3.2 Embedded submanifolds of a linear space
- 3.3 Smooth maps on embedded submanifolds
- 3.4 The differential of a smooth map
- 3.5 Vector fields and the tangent bundle
- 3.6 Moving on a manifold: retractions
- 3.7 Riemannian manifolds and submanifolds
- 3.8 Riemannian gradients
- 3.9 Local frames*
- 3.10 Notes and references
4. First-order optimization algorithms
- 4.1 A first-order Taylor expansion on curves
- 4.2 First-order optimality conditions
- 4.3 Riemannian gradient descent
- 4.4 Regularity conditions and iteration complexity
- 4.5 Backtracking line-search
- 4.6 Local convergence*
- 4.7 Computing gradients*
- 4.8 Numerically checking a gradient*
- 4.9 Notes and references
5. Embedded geometry: second order
- 5.1 The case for another derivative of vector fields
- 5.2 Another look at differentials of vector fields in linear spaces
- 5.3 Differentiating vector fields on manifolds: connections
- 5.4 Riemannian connections
- 5.5 Riemannian Hessians
- 5.6 Connections as pointwise derivatives*
- 5.7 Differentiating vector fields on curves
- 5.8 Acceleration and geodesics
- 5.9 A second-order Taylor expansion on curves
- 5.10 Second-order retractions
- 5.11 Special case: Riemannian submanifolds*
- 5.12 Special case: metric projection retractions*
- 5.13 Notes and references
6. Second-order optimization algorithms
- 6.1 Second-order optimality conditions
- 6.2 Riemannian Newton's method
- 6.3 Computing Newton steps: conjugate gradients
- 6.4 Riemannian trust regions
- 6.5 The trust-region subproblem: truncated CG
- 6.6 Local convergence of RTR with tCG*
- 6.7 Simplified assumptions for RTR with tCG*
- 6.8 Numerically checking a Hessian*
- 6.9 Notes and references
7. Embedded submanifolds: examples
- 7.1 Euclidean spaces as manifolds
- 7.2 The unit sphere in a Euclidean space
- 7.3 The Stiefel manifold: orthonormal matrices
- 7.4 The orthogonal group and rotation matrices
- 7.5 Fixed-rank matrices
- 7.6 The hyperboloid model
- 7.7 Manifolds defined by $h(x) = 0$
- 7.8 Notes and references
8. General manifolds
- 8.1 A permissive definition
- 8.2 The atlas topology, and a final definition
- 8.3 Embedded submanifolds are manifolds
- 8.4 Tangent vectors and tangent spaces
- 8.5 Differentials of smooth maps
- 8.6 Tangent bundles and vector fields
- 8.7 Retractions and velocity of a curve
- 8.8 Coordinate vector fields as local frames
- 8.9 Riemannian metrics and gradients
- 8.10 Lie brackets as vector fields
- 8.11 Riemannian connections and Hessians
- 8.12 Covariant derivatives and geodesics
- 8.13 Taylor expansions and second-order retractions
- 8.14 Submanifolds embedded in manifolds
- 8.15 Notes and references
9. Quotient manifolds
- 9.1 A definition and a few facts
- 9.2 Quotient manifolds through group actions
- 9.3 Smooth maps to and from quotient manifolds
- 9.4 Tangent, vertical and horizontal spaces
- 9.5 Vector fields
- 9.6 Retractions
- 9.7 Riemannian quotient manifolds
- 9.8 Gradients
- 9.9 A word about Riemannian gradient descent
- 9.10 Connections
- 9.11 Hessians
- 9.12 A word about Riemannian Newton's method
- 9.13 Total space embedded in a linear space
- 9.14 Horizontal curves and covariant derivatives
- 9.15 Acceleration, geodesics and second-order retractions
- 9.16 Grassmann manifold: summary*
- 9.17 Notes and references
10. Additional tools
- 10.1 Distance, geodesics and completeness
- 10.2 Exponential and logarithmic maps
- 10.3 Parallel transport
- 10.4 Lipschitz conditions and Taylor expansions
- 10.5 Transporters
- 10.6 Finite difference approximation of the Hessian
- 10.7 Tensor fields and their covariant differentiation
- 10.8 Notes and references
11. Geodesic convexity
- 11.1 Convex sets and functions in linear spaces
- 11.2 Geodesically convex sets and functions
- 11.3 Alternative definitions of geodesically convex sets*
- 11.4 Differentiable geodesically convex functions
- 11.5 Geodesic strong convexity and Lipschitz continuous gradients
- 11.6 Example: positive reals and geometric programming
- 11.7 Example: positive definite matrices
- 11.8 Notes and references
Bibliography

Last update: February 21, 2025.

an introduction toOptimization on smooth manifolds

About the book

Teaching

How to cite

Table of contents (expand/collapse)

an introduction to
Optimization on smooth manifolds