Thursday, June 13 |
07:30 - 09:00 |
Breakfast (Restaurant Hotel Hacienda Los Laureles) |
09:00 - 10:00 |
Luana Ruiz: A Poincaré Inequality and Consistency Results for Signal Sampling on Large Graphs ↓ Abstract: Large-scale graph machine learning is challenging as the complexity of learning models scales with the graph size. Subsampling the graph is a viable alternative, but sampling on graphs is nontrivial as graphs are non-Euclidean. Existing graph sampling techniques require not only computing the spectra of large matrices but also repeating these computations when the graph changes, e.g., grows. In this paper, we introduce a signal sampling theory for a type of graph limit---the graphon. We prove a Poincaré inequality for graphon signals and show that complements of node subsets satisfying this inequality are unique sampling sets for Paley-Wiener spaces of graphon signals. Exploiting connections with spectral clustering and Gaussian elimination, we prove that such sampling sets are consistent in the sense that unique sampling sets on a convergent graph sequence converge to unique sampling sets on the graphon. We then propose a related graphon signal sampling algorithm for large graphs, and demonstrate its good empirical performance on graph machine learning tasks. (Conference Room San Felipe) |
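For context on the inequality named in the title: Pesenson-style sampling results on finite graphs take roughly the following form, and the talk's graphon statement generalizes this pattern (the display below is an illustrative finite-graph analogue, not the paper's exact statement; L denotes a graph Laplacian).

```latex
% Illustrative Pesenson-style Poincaré inequality on a finite graph G = (V, E):
% a node set S \subset V is a \Lambda-set if every signal f supported on S satisfies
\[
  \|f\|_2 \;\le\; \Lambda \,\|L f\|_2 ,
\]
% in which case V \setminus S is a uniqueness (sampling) set for the
% Paley--Wiener space PW_\omega(G) whenever \omega < 1/\Lambda.
```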
10:00 - 10:30 |
Coffee Break (Conference Room San Felipe) |
10:30 - 11:30 |
Cristóbal Guzmán: The Role of Sparsity in Differentially-Private Learning ↓ Abstract:
With the increasing use of personally sensitive data in machine learning applications, privacy has become a central concern. In this context, differential privacy (DP) offers a rigorous and quantifiable control of the privacy risk of a machine learning model. One of the main problems of interest in differential privacy is stochastic optimization, where we are interested in computing a model that approximately minimizes the empirical or population excess risk, while satisfying differential privacy with respect to the data used for training the model.
In this talk, we will present various settings where one can obtain nearly dimension independent accuracy rates in differentially private (stochastic) optimization and saddle-point problems. We start with results involving stochastic optimization under polyhedral constraints, popular in sparsity-oriented machine learning. Next, we move to stochastic saddle-point problems, where we study the use of stochastic mirror-descent methods and vertex sampling; these results are applied to problems including DP and group-fair machine learning, and DP synthetic data generation. Finally, I will present results on a "dual" counterpart of the above problems: stochastic optimization with sparse gradients, a setting of high relevance in large embedding models. Here, we provide new matching upper and lower bounds both in the convex and nonconvex settings. (Conference Room San Felipe) |
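For readers unfamiliar with the setting above, the sketch below shows the standard DP-SGD recipe (per-example gradient clipping plus Gaussian noise), which is the baseline mechanism in differentially private optimization; it is illustrative only and does not implement the mirror-descent, vertex-sampling, or sparse-gradient methods from the talk, and the loss and hyperparameters are placeholder assumptions.

```python
import numpy as np

def dp_sgd(grad_fn, w0, data, steps=100, lr=0.1, clip=1.0, noise_mult=1.0, batch=32,
           rng=np.random.default_rng(0)):
    """Generic DP-SGD sketch: clip per-example gradients, then add Gaussian noise."""
    w = w0.copy()
    for _ in range(steps):
        idx = rng.choice(len(data), size=batch, replace=False)
        grads = np.stack([grad_fn(w, data[i]) for i in idx])               # per-example gradients
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))   # clip each to norm <= clip
        noise = rng.normal(0.0, noise_mult * clip, size=w.shape)           # Gaussian mechanism
        w -= lr * (grads.sum(axis=0) + noise) / batch
    return w

# Toy usage: private linear regression on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(256, 5))
y = X @ np.arange(1.0, 6.0) + 0.1 * rng.normal(size=256)
data = list(zip(X, y))
grad_fn = lambda w, ex: (w @ ex[0] - ex[1]) * ex[0]   # squared-loss gradient for one example
w_priv = dp_sgd(grad_fn, np.zeros(5), data)
```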
11:30 - 12:00 |
Soledad Villar: Symmetries in machine learning: point clouds and graphs ↓ Abstract:
In this talk, we give an overview of the use of exact and approximate symmetries in machine learning models. We will focus on (1) mathematical tools to efficiently implement invariant functions on point clouds, and (2) symmetries as model selection for graph neural networks and bias-variance tradeoffs. (Online - CMO) |
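As an illustration of point (1), a common way to implement exactly invariant functions on point clouds is to build features from pairwise inner products (invariant to rotations and reflections) and pool them symmetrically (invariant to reordering the points); the moment pooling in the sketch below is an assumption for illustration, not the specific construction from the talk.

```python
import numpy as np

def invariant_features(points):
    """O(d)- and permutation-invariant summary of a point cloud of shape (n, d).

    Pairwise inner products (the Gram matrix) are unchanged by rotations/reflections,
    and symmetric pooling (moments) makes the summary independent of point ordering.
    """
    gram = points @ points.T
    diag = np.diag(gram)                                    # squared norms
    off = gram[~np.eye(len(points), dtype=bool)]            # pairwise inner products
    return np.array([diag.mean(), diag.std(), off.mean(), off.std()])

rng = np.random.default_rng(0)
pts = rng.normal(size=(50, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))                # random rotation/reflection
perm = rng.permutation(50)
f1 = invariant_features(pts)
f2 = invariant_features(pts[perm] @ Q.T)                    # rotated and reordered cloud
assert np.allclose(f1, f2)                                  # identical invariant features
```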
12:00 - 14:00 |
Lunch (Restaurant Hotel Hacienda Los Laureles) |
14:00 - 15:00 |
Chong You: How many FLOPs is a token worth? ↓ Abstract:
Despite the remarkable capabilities of deep learning models, their training and inference processes are often hindered by exorbitant computational costs, largely attributed to the intensive nature of dense matrix multiplication. This presentation highlights an intriguing observation: trained Transformer models contain a substantial degree of inherent sparsity in their intermediate activation maps, with nonzero activation fractions as low as 3.0% for T5 and 6.3% for ViT. Through rigorous experimentation, we demonstrate the prevalence of such sparsity across diverse tasks, Transformer architectures of varying scales, and at all levels of depth. By harnessing this sparsity, we offer practical strategies to enhance the efficiency of inference for large language models. Concluding the discussion, we delve into theoretical insights elucidating the origins of sparsity within unconstrained feature models. (Conference Room San Felipe) |
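The sparsity phenomenon described above is straightforward to measure; the toy sketch below computes the fraction of nonzero entries after the ReLU in a transformer-style feed-forward block (random weights give roughly 50% nonzero, whereas the talk reports that trained models drop to only a few percent). Dimensions and weights here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, n_tokens = 256, 1024, 128

# Toy feed-forward block of a Transformer layer: x -> ReLU(x W1) W2
W1 = rng.normal(scale=d_model ** -0.5, size=(d_model, d_ff))
W2 = rng.normal(scale=d_ff ** -0.5, size=(d_ff, d_model))
x = rng.normal(size=(n_tokens, d_model))

hidden = np.maximum(x @ W1, 0.0)                 # intermediate activation map
sparsity = (hidden > 0).mean()                   # fraction of nonzero activations
print(f"fraction of nonzero activations: {sparsity:.1%}")
# Random init gives ~50%; in trained Transformers this reportedly falls to a few
# percent, which is what enables the inference-time savings discussed in the talk.
out = hidden @ W2
```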
15:00 - 15:30 |
Coffee Break (Conference Room San Felipe) |
15:30 - 16:00 |
Josue Tonelli-Cueto: Lazy quotient metrics: Approximate symmetries for ML models ↓ Abstract:
In many learning contexts, the target labels do not change much if we apply a small deformation to the input data. Mathematically, this can be formalized by quotienting out approximate symmetries in the design of the hypothesis class of functions. By doing so we obtain a space of functions that are invariant with respect to "small" symmetries but not "large" ones. In this talk, I present a new approach where we formalize the notion of approximate symmetries using what we call the lazy quotient metric and apply it to toy machine learning problems. This is joint work with Dustin Mixon, Soledad Villar, and Brantley Vose. (Conference Room San Felipe) |
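For reference, the exact-symmetry analogue of the construction above is the standard quotient (orbit) metric shown below; the lazy quotient metric introduced in the talk modifies this so that only small deformations are effectively quotiented out (the precise definition is the authors' and is not reproduced here).

```latex
% Standard quotient metric on orbits of a group G acting by isometries on (X, d):
\[
  d_{X/G}\big([x], [y]\big) \;=\; \inf_{g \in G} \, d\big(x,\; g \cdot y\big),
\]
% so that functions on the quotient X/G correspond exactly to G-invariant functions on X.
```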
16:00 - 16:30 |
Young Kyung Kim: Vision Transformers with Natural Language Semantics ↓ Abstract:
Tokens or patches within Vision Transformers (ViT) lack essential semantic information, unlike their counterparts in natural language processing (NLP). Typically, ViT tokens are associated with rectangular image patches that lack specific semantic context, making interpretation difficult and failing to effectively encapsulate fundamental visual information. We introduce a novel transformer model, Semantic Vision Transformers (sViT), which leverages recent progress on segmentation models to design novel tokenizer strategies. sViT effectively harnesses semantic information, creating an inductive bias reminiscent of convolutional neural networks while capturing global dependencies and contextual information within images that are characteristic of transformers. Through validation on real datasets, sViT demonstrates superiority over ViT while requiring less training data. Furthermore, sViT shows significantly better out-of-distribution generalization and robustness to natural distribution shifts, attributed to its scale-invariant semantic characteristics. Notably, the use of semantic tokens significantly enhances the model's interpretability. Lastly, the proposed paradigm facilitates the introduction of new and powerful augmentation techniques at the token (or segment) level, increasing training data diversity and generalization capabilities. To conclude, just as sentences are made of words, images are formed by semantic objects; our proposed methodology leverages recent progress in object segmentation and takes an important and natural step toward interpretable and robust vision transformers. (Conference Room San Felipe) |
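A minimal sketch of the segmentation-based tokenization idea described above: pool pixel features within each segment to obtain one token per semantic region, then hand the token set to a transformer. The `segment_image` callable is a hypothetical stand-in for whatever segmentation model is used; the pooling and projection choices are illustrative assumptions, not the paper's design.

```python
import numpy as np

def segment_tokens(image, segment_image, d_model=64, rng=np.random.default_rng(0)):
    """Turn an (H, W, C) image into one token per segment by average-pooling pixels.

    segment_image: callable returning an (H, W) integer mask of segment ids
                   (hypothetical stand-in for a segmentation model).
    """
    mask = segment_image(image)                            # (H, W) segment ids
    proj = rng.normal(scale=image.shape[-1] ** -0.5,
                      size=(image.shape[-1], d_model))     # pixel-feature projection
    tokens = []
    for seg_id in np.unique(mask):
        pixels = image[mask == seg_id]                     # (n_pixels, C)
        tokens.append(pixels.mean(axis=0) @ proj)          # one token per semantic region
    return np.stack(tokens)                                # (n_segments, d_model)

# Toy usage with a dummy 4-quadrant "segmentation".
H = W = 8
img = np.random.default_rng(1).normal(size=(H, W, 3))
quadrants = lambda im: (np.arange(H)[:, None] // (H // 2)) * 2 + (np.arange(W)[None, :] // (W // 2))
print(segment_tokens(img, quadrants).shape)                # (4, 64)
```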
16:30 - 17:00 |
Samar Hadou: Robust Unrolled Networks ↓ Abstract:
Deep unrolling is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm into the layers of a trainable neural network. However, the convergence guarantees and generalizability of unrolled networks are still open problems. To tackle these problems, we endow deep unrolled architectures with a stochastic descent nature by imposing descending constraints during training. The descending constraints are enforced layer by layer to ensure that each unrolled layer takes, on average, a descent step toward the optimum during training. We theoretically prove that the sequence constructed by the outputs of the unrolled layers is then guaranteed to converge. We also show that our imposed constraints provide the unrolled networks with robustness to perturbations. We numerically assess unrolled architectures trained under the proposed constraints in two applications, sparse coding using the learnable iterative shrinkage and thresholding algorithm (LISTA) and image inpainting using the proximal generative flow (GLOW-Prox), and demonstrate the performance and robustness benefits of the proposed method. (Conference Room San Felipe) |
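A minimal sketch of the LISTA-style unrolling referenced above: each unrolled layer applies an affine step followed by soft-thresholding, mirroring one ISTA iteration for sparse coding. The weights below are simply the classical ISTA values rather than learned per-layer parameters, and the descending constraints that the talk imposes during training are not implemented here.

```python
import numpy as np

def soft_threshold(x, theta):
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def lista_forward(y, D, n_layers=16, lam=0.1):
    """Unrolled ISTA/LISTA sketch for sparse coding: min_x 0.5||y - Dx||^2 + lam||x||_1.

    Each 'layer' computes x <- soft_threshold(We @ y + S @ x, theta); here We, S, theta
    are the classical ISTA values (in a trained unrolled network they would be learnable).
    """
    L = np.linalg.norm(D, 2) ** 2                          # Lipschitz constant of the gradient
    We = D.T / L
    S = np.eye(D.shape[1]) - D.T @ D / L
    theta = lam / L
    x = np.zeros(D.shape[1])
    for _ in range(n_layers):                              # one unrolled layer per iteration
        x = soft_threshold(We @ y + S @ x, theta)
    return x

# Toy usage: recover a 5-sparse code from 20 linear measurements.
rng = np.random.default_rng(0)
D = rng.normal(size=(20, 50)); D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(50); x_true[rng.choice(50, 5, replace=False)] = rng.normal(size=5)
x_hat = lista_forward(D @ x_true, D, n_layers=200)
print(np.linalg.norm(x_hat - x_true))                      # reconstruction error of the unrolled estimate
```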
17:00 - 19:00 |
Adjourn (Hotel Hacienda Los Laureles) |
19:00 - 21:00 |
Dinner (Restaurant Hotel Hacienda Los Laureles) |