This talk concerns the approximation of smooth, high-dimensional functions on bounded hypercubes from limited samples using polynomials. This task lies at the heart of many applications in computational science and engineering -- notably, those arising from parametric modelling and computational uncertainty quantification. It is common to use Monte Carlo sampling in such applications, so as not to succumb to the curse of dimensionality. However, it is well known that such a strategy is theoretically suboptimal. Specifically, there are many polynomial spaces of dimension $n$ for which the sample complexity scales log-quadratically in $n$, i.e., like $c \cdot n^2 \cdot \log(n)$ as $n \rightarrow \infty$. This well-documented phenomenon has led to a concerted effort over the last decade to design improved, in fact, near-optimal sampling strategies, whose sample complexities scale log-linearly, or even linearly, in $n$.
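Schematically, writing $m$ for the number of samples and $c, c' > 0$ for generic constants (this is a summary of the two regimes just described, not a statement of any particular theorem), the contrast is
\[
  m \geq c \cdot n^2 \cdot \log(n) \quad \text{(Monte Carlo sampling)}
  \qquad \text{versus} \qquad
  m \geq c' \cdot n \cdot \log(n) \quad \text{(near-optimal sampling)}.
\]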
Paradoxically, in this talk we demonstrate that Monte Carlo is actually a perfectly good strategy in high dimensions, despite this apparent suboptimality. We first document this phenomenon empirically via several numerical examples. Next, we present a theoretical analysis that resolves this seeming contradiction for the class of $(\bm{b},\varepsilon)$-\textit{holomorphic} functions of infinitely many variables. We show that there is a least-squares approximation based on $m$ Monte Carlo samples whose error decays algebraically fast in $m/\log(m)$, with a rate that is the same as that of the best $n$-term polynomial approximation. This result is non-constructive, since it assumes knowledge of a suitable polynomial subspace (depending on $\bm{b}$) in which to compute the approximation. Hence, we then present a constructive scheme based on compressed sensing that achieves the same rate, subject to a slightly stronger assumption on $\bm{b}$ and a larger polylogarithmic factor. This scheme is practical, and numerically it performs as well as or better than well-known adaptive least-squares schemes.
Finally, while most of our results concern polynomials, we also demonstrate that the same results can be achieved via deep neural networks with standard training procedures.
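For the neural-network counterpart, "standard training procedures" means an off-the-shelf training loop of the following kind; this PyTorch sketch is purely schematic, and the architecture, optimizer and hyperparameters are illustrative assumptions rather than the settings used in our experiments.
\begin{verbatim}
import numpy as np
import torch
import torch.nn as nn

# Illustrative smooth target and uniform Monte Carlo training data (toy choices).
rng = np.random.default_rng(0)
d, m = 2, 200
X = rng.uniform(-1.0, 1.0, size=(m, d)).astype(np.float32)
y = (np.exp(-np.sum(X**2, axis=1)) / (2.0 + X[:, 0])).astype(np.float32)

X_t = torch.from_numpy(X)
y_t = torch.from_numpy(y).unsqueeze(1)

# A small fully-connected network; depth and width are illustrative.
model = nn.Sequential(
    nn.Linear(d, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Standard full-batch training on the mean-squared (least-squares) loss.
for epoch in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(X_t), y_t)
    loss.backward()
    optimizer.step()
\end{verbatim}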
Overall, our findings demonstrate that Monte Carlo sampling is eminently suitable for smooth function approximation tasks on bounded domains when the dimension is sufficiently high. Hence, the benefits of state-of-the-art improved sampling strategies seem to be generically limited to lower-dimensional settings.
This is joint work with Simone Brugiapaglia, Juan M. Cardenas, Nick Dexter and Sebastian Moraga.