SFU MOCAD Seminar: Brynjulf Owren
Topic
A dynamical systems approach for designing stable neural networks on Euclidean spaces and Riemannian manifolds
Speakers
Details
Recently, Sherry et al. (2024) revisited the pioneering work of Dahlquist and Jeltsch (1979) on circle-contractivity and applied it to the study of neural networks. This theory can be used to analyse and improve the robustness of architectures that are devised by a dynamical systems approach. The main idea is to start with a continuous dynamical system that satisfies a certain monotonicity condition, and then to discretize the system in a way that preserves the non-expansive behaviour of the associated flow. The theory is old, but not necessarily widely known, because Dahlquist and Jeltsch only published the results in the form of a preprint. The application to neural networks is new as far as we know, and we shall present some results and examples from Sherry et al. (2024). The importance of neural networks set on Riemannian manifolds seems to be increasing, and there is a need to develop the theory of non-expansive numerical methods in such a setting as well. We present some ideas from Arnold et al. (2024), where a few simple numerical methods for Riemannian manifolds are studied. We consider whether these methods can be non-expansive when applied to non-expansive vector fields. For the geodesic implicit Euler method, which also features in the proximal gradient method for optimisation, we find that its behaviour is strongly dependent on the sectional curvature of the manifold. As opposed to the Euclidean case, we now also have to be careful about whether the nonlinear equations to be solved in each time step have a unique solution or not.
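To make the Euclidean idea concrete, the following is a minimal sketch, not the construction presented in the talk. It assumes a vector field of the form f(x) = -Wᵀ relu(Wx + b), which is the negative gradient of a convex potential and therefore satisfies the monotonicity condition ⟨f(x) - f(y), x - y⟩ ≤ 0. The choice of ReLU, the explicit Euler discretization, and all function names (`vector_field`, `euler_layer`, `max_stable_step`) are illustrative assumptions. For such a gradient field with Lipschitz constant L = ‖W‖₂², a standard convex-analysis argument shows that one explicit Euler step is non-expansive whenever h ≤ 2/L.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def vector_field(x, W, b):
    # Assumed form: f(x) = -W^T relu(W x + b), the negative gradient of the
    # convex potential g(x) = 0.5 * sum(relu(W x + b) ** 2).
    return -W.T @ relu(W @ x + b)


def euler_layer(x, W, b, h):
    # One explicit Euler step of x' = f(x), read as a residual layer.
    return x + h * vector_field(x, W, b)


def max_stable_step(W):
    # grad g is Lipschitz with constant ||W||_2^2 (ReLU is 1-Lipschitz),
    # so the step is non-expansive for h <= 2 / ||W||_2^2.
    return 2.0 / np.linalg.norm(W, 2) ** 2


rng = np.random.default_rng(0)
W = rng.standard_normal((8, 5))
b = rng.standard_normal(8)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

# Monotonicity of the continuous system: <f(x) - f(y), x - y> <= 0.
monotonicity = np.dot(vector_field(x, W, b) - vector_field(y, W, b), x - y)

# Non-expansiveness of the discrete layer: ||Phi(x) - Phi(y)|| <= ||x - y||.
h = max_stable_step(W)
ratio = np.linalg.norm(
    euler_layer(x, W, b, h) - euler_layer(y, W, b, h)
) / np.linalg.norm(x - y)

print(f"monotonicity inner product: {monotonicity:.4f}")
print(f"expansion ratio at h = 2/L: {ratio:.4f}")
```

The step-size bound here covers only explicit Euler applied to this particular gradient field; the circle-contractivity theory discussed in the abstract treats more general Runge-Kutta methods.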
Arnold, Celledoni, Çokaj, Owren, Tumiotto: B-stability of numerical integrators on Riemannian manifolds. Journal of Computational Dynamics, 2024, 11(1): 92-107. doi: 10.3934/jcd.2024002
Dahlquist and Jeltsch: Generalized disks of contractivity for explicit and implicit Runge-Kutta methods. Dept. of Numerical Analysis and Computer Science, The Royal Institute of Technology, Stockholm, Report TRITA-NA-7906, 1979.
Sherry, Celledoni, Ehrhardt, Murari, Owren, Schönlieb: Designing Stable Neural Networks using Convex Analysis and ODEs. Physica D: Nonlinear Phenomena, 2024, 463: Paper No. 134159, 13 pp.