Schedule for: 21w5508 - Statistical Methods for Computational Advertising (Online)
Beginning on Sunday, October 3 and ending Friday, October 8, 2021
All times in Banff, Alberta time, MDT (UTC-6).
Monday, October 4 | |
---|---|
09:00 - 09:05 | BIRS Staff Intro/Welcome (Zoom) |
09:05 - 09:55 |
David Banks: The Statistical Challenges of Computational Advertising ↓ Computational advertising is a relatively young field, but it touches on almost every aspect of statistics. This talk frames the purpose of this workshop, and details some of the ways in which computational advertising intersects with statistics. (Zoom) |
10:00 - 10:55 |
Tim Hesterberg: Surveys and Big Data for Estimating Brand Lift ↓ Google Brand Lift Surveys estimates the effect of display advertising using surveys. Challenges include imperfect A/B experiments, response and solicitation bias, discrepancy between intended and actual treatment, comparing treatment-group users who took an action with control users who might have acted, and estimation for different slices of the population. We approach these issues using a combination of individual-study analysis and meta-analysis across thousands of studies. This work involves a combination of small and large data: survey responses and logs data, respectively.
There are a number of interesting and even surprising methodological twists. We use regression to handle imperfect A/B experiments and response and solicitation biases; we find regression to be more stable than propensity methods. We use a particular form of regularization that combines the advantages of L1 regularization (better predictions) and L2 (smoothness). We use a variety of slicing methods that estimate either incremental or non-incremental effects of covariates like age and gender that may be correlated. We bootstrap to obtain standard errors. In contrast to many regression settings, where one may either resample observations or fix X and resample Y, here only resampling observations is appropriate. (Zoom) |
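The last point, that one should resample whole observations rather than fix X and resample Y, can be made concrete. The sketch below is a minimal pairs (case-resampling) bootstrap for a regression slope; all data and names are synthetic and illustrative, not from the Brand Lift pipeline.

```python
# Minimal sketch of the "resample observations" (pairs) bootstrap for a
# regression slope. Synthetic data; illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                  # covariate (e.g., an exposure measure)
y = 1.0 + 2.0 * x + rng.normal(size=n)  # survey response

def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

boot = np.empty(2000)
for b in range(boot.size):
    idx = rng.integers(0, n, size=n)    # resample (x_i, y_i) pairs jointly
    boot[b] = slope(x[idx], y[idx])

print(f"slope = {slope(x, y):.3f}, bootstrap SE = {boot.std(ddof=1):.3f}")
```

A fixed-X residual bootstrap would instead keep the covariates as-is and resample fitted residuals; the abstract argues that variant is inappropriate in this setting.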
11:00 - 11:55 | Art Owen: Efficiency of Tie-Breaking Designs (Zoom) |
12:00 - 12:55 |
Mamadou Yauck: Computational Advertising: A Capture-Recapture Perspective ↓ This work is concerned with the analysis of marketing data on the activation of applications (apps) on mobile devices. Each application has a hashed identification number that is specific to the device on which it has been installed. This number can be registered by a platform at each activation of the application. Activations on the same device are linked together using the identification number. By focusing on activations that took place at a business location one can create a capture-recapture data set about devices, or more specifically their users, that "visited" the business: the units are owners of mobile devices, and the capture occasions are time intervals such as days. In this talk, we will present a new algorithm for estimating the parameters of a capture-recapture model with a fairly large number of capture occasions and a simple parametric bootstrap variance estimator. (Zoom) |
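As a hedged illustration of the capture-recapture setup (the talk handles many capture occasions; this sketch uses just two), the classical Lincoln-Petersen estimator with a parametric bootstrap variance looks as follows. All counts are synthetic.

```python
# Two-occasion capture-recapture (Lincoln-Petersen) with a parametric
# bootstrap variance, assuming independent captures. Synthetic counts.
import numpy as np

rng = np.random.default_rng(1)
n1, n2, m = 420, 380, 130        # captured day 1, day 2, and recaptured (made up)
N_hat = n1 * n2 / m              # Lincoln-Petersen estimate of population size

p1, p2 = n1 / N_hat, n2 / N_hat  # implied capture probabilities
boot = []
for _ in range(2000):
    N = int(round(N_hat))
    c1 = rng.random(N) < p1      # simulated captures on occasion 1
    c2 = rng.random(N) < p2      # simulated captures on occasion 2
    if (c1 & c2).sum() > 0:
        boot.append(c1.sum() * c2.sum() / (c1 & c2).sum())

print(f"N_hat = {N_hat:.0f}, parametric bootstrap SE = {np.std(boot, ddof=1):.0f}")
```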
13:00 - 13:20 | Group Photo (Zoom) |
13:20 - 14:00 | Break (Zoom) |
14:00 - 14:25 | Ben Skrainka: eBay: Triumph and Tragedy in A/B Tests: War Stories from Amazon, eBay, and Startups (Zoom) |
14:30 - 14:55 |
Anru Zhang: High-order Clustering with Application in Click-through Prediction ↓ In e-commerce, predicting click-through for user-item pairs in a time-specific way plays an important role in the online recommendation system. The click-through data can be organized as an order-3 tensor, where each entry is indexed by (user, item, time) and represents whether there is user-item interaction in a time period. The users/items often exhibit clustering structures due to similar preferences/attributes, so it is important to do high-order clustering, i.e., to exploit such high-order clustering structures. The high-order clustering problem also arises from applications in genomics and social network studies. The non-convex and discontinuous nature of the high-order clustering problem poses significant challenges in both statistics and computation.
In this talk, we introduce a tensor block model and two computationally efficient methods, the high-order Lloyd algorithm (HLloyd) and high-order spectral clustering (HSC), for high-order clustering. The local convergence of the proposed procedure is established under a mild sub-Gaussian noise assumption. In particular, for the Gaussian tensor block model, we give a complete characterization of the statistical-computational trade-off for achieving high-order exact clustering based on three different signal-to-noise ratio regimes. We show the merits of the proposed procedures on real online click-through data. (Zoom) |
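To fix ideas, here is a minimal sketch of a high-order Lloyd-style iteration on a synthetic order-3 tensor block model: block means and mode-wise cluster labels are updated alternately. This is one reading of the algorithm's structure for illustration, not the authors' reference implementation.

```python
# High-order Lloyd-style iteration on a synthetic order-3 tensor block model.
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 3                                   # units per mode, clusters per mode
z_true = [rng.integers(0, k, n) for _ in range(3)]
core = rng.normal(scale=3.0, size=(k, k, k))   # block means
Y = core[np.ix_(*z_true)] + rng.normal(size=(n, n, n))

z = [rng.integers(0, k, n) for _ in range(3)]  # random initialization
for _ in range(20):
    # Step 1: update block means given the current labels.
    S = np.zeros((k, k, k))
    for a in range(k):
        for b in range(k):
            for c in range(k):
                block = Y[np.ix_(z[0] == a, z[1] == b, z[2] == c)]
                S[a, b, c] = block.mean() if block.size else 0.0
    # Step 2: update labels one mode at a time, holding the others fixed.
    for mode in range(3):
        Ym = np.moveaxis(Y, mode, 0).reshape(n, -1)
        others = [z[m] for m in range(3) if m != mode]
        Sm = np.moveaxis(S, mode, 0)
        centers = Sm[:, others[0]][:, :, others[1]].reshape(k, -1)
        z[mode] = np.argmin(((Ym[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
```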
15:00 - 15:25 | S. Samadi: Dimension Reduction for Vector Autoregressive Models (Zoom) |
Tuesday, October 5 | |
---|---|
09:00 - 09:55 |
Patrick LeBlanc: An Overview of Recommender System Theory ↓ This talk is a literature survey of approaches that have been taken to various kinds of recommender systems. I discuss both active and passive systems. (Zoom) |
10:00 - 10:55 | Deborshee Sen: Cross-Domain Recommender Systems (Zoom) |
11:00 - 11:55 |
Grace Yi: Unbiased Boosting Estimation for Censored Survival Data ↓ Boosting methods have been broadly discussed for various settings, especially for cases with complete data. This talk concerns survival data, which typically involve censored responses. Three adjusted loss functions are proposed to address the effects of right-censored responses where no specific model is imposed, and an unbiased boosting estimation method is developed. Theoretical results, including consistency and convergence, are established. Numerical studies demonstrate the promising finite sample performance of the proposed method. (Zoom) |
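The abstract does not spell out the three adjusted losses; one standard device for building an unbiased loss under right censoring is inverse-probability-of-censoring weighting (IPCW), sketched below as an illustration. The function names and the squared-error choice are assumptions, not the talk's construction.

```python
# IPCW-adjusted squared-error loss for right-censored responses (sketch).
import numpy as np

def km_censoring_survival(time, event):
    """Kaplan-Meier estimate of the censoring survival G(t) = P(C > t),
    evaluated at each subject's observed time (event=1 marks a failure)."""
    order = np.argsort(time)
    d = event[order]
    at_risk = len(time) - np.arange(len(time))
    # Censored observations (d == 0) are the "events" of the censoring process.
    G = np.cumprod(1.0 - (1 - d) / at_risk)
    out = np.empty(len(time))
    out[order] = G
    return out

def ipcw_squared_error(pred, time, event):
    """Uncensored subjects are up-weighted by 1/G(T_i); censored ones drop out."""
    G = np.clip(km_censoring_survival(time, event), 0.05, None)
    w = event / G
    return np.sum(w * (time - pred) ** 2) / np.sum(w)

time = np.array([2.0, 5.0, 3.0, 8.0, 4.0])
event = np.array([1, 0, 1, 1, 0])
print(ipcw_squared_error(np.full(5, 4.0), time, event))
```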
12:00 - 12:55 |
Xuan Bi: Improving Sales Forecasting Accuracy: A Tensor Factorization Approach with Demand Awareness ↓ Due to accessible big data collections from consumers, products, and stores, advanced sales forecasting capabilities have drawn great attention from many companies, especially in the retail business, because of their importance in decision making. Improvement of the forecasting accuracy, even by a small percentage, may have a substantial impact on companies' production and financial planning, marketing strategies, inventory controls, supply chain management, and eventually stock prices. Specifically, our research goal is to forecast the sales of each product in each store in the near future. Motivated by tensor factorization methodologies for personalized context-aware recommender systems, we propose a novel approach called the Advanced Temporal Latent-factor Approach to Sales forecasting (ATLAS), which achieves accurate and individualized prediction for sales by building a single tensor-factorization model across multiple stores and products. Our contribution is a combination of a tensor framework (to leverage information across stores and products), a new regularization function (to incorporate demand dynamics), and extrapolation of the tensor into future time periods using state-of-the-art statistical (seasonal auto-regressive integrated moving-average) and machine-learning (recurrent neural network) models. The advantages of ATLAS are demonstrated on eight datasets collected by Information Resources, Inc., where a total of 165 million weekly sales transactions from more than 1,500 grocery stores over 15,560 products are analyzed. (Zoom) |
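As a sketch of the tensor-factorization backbone only (ATLAS itself adds demand-aware regularization and temporal extrapolation), the following fits a rank-r CP decomposition to a synthetic store x product x week sales tensor by alternating least squares.

```python
# Rank-r CP factorization of a store x product x week sales tensor via ALS.
import numpy as np

rng = np.random.default_rng(3)
I, J, T, r = 20, 50, 30, 4               # stores, products, weeks, rank
Y = rng.gamma(2.0, 1.0, size=(I, J, T))  # synthetic weekly sales

A, B, C = (rng.normal(size=(d, r)) for d in (I, J, T))

def khatri_rao(U, V):
    """Column-wise Kronecker product, shape (rows(U) * rows(V), r)."""
    return (U[:, None, :] * V[None, :, :]).reshape(-1, U.shape[1])

for _ in range(50):
    A = Y.reshape(I, -1) @ np.linalg.pinv(khatri_rao(B, C).T)
    B = np.moveaxis(Y, 1, 0).reshape(J, -1) @ np.linalg.pinv(khatri_rao(A, C).T)
    C = np.moveaxis(Y, 2, 0).reshape(T, -1) @ np.linalg.pinv(khatri_rao(A, B).T)

# Forecasts for future weeks would extrapolate the rows of the temporal factor
# C (e.g., with a seasonal ARIMA or an RNN per latent factor) and recombine
# with the store and product factors A and B.
```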
13:00 - 14:00 | Break (Zoom) |
14:00 - 14:25 | Phyllis Ju: Towards Cost-Efficient A/B Testing (Zoom) |
14:30 - 15:25 |
Nathaniel Stevens: Modern Design of Experiments for Computational Advertising ↓ Designed experiments have long been regarded as the backbone of the scientific method and the gold standard for causal inference. Although DOE has traditionally been applied in the realms of agriculture, manufacturing, pharmaceutical development, and the physical and social sciences, in recent years designed experiments have become commonplace within internet and technology companies for product development/improvement, customer acquisition/retention, and just about anything that impacts a business’s bottom line. These online controlled experiments, known colloquially as A/B tests, provide an especially lucrative opportunity for modern advertisers to understand market sentiment and consumer preferences. In this talk we provide an overview of A/B testing and online controlled experiments, and we describe ways in which these experiments and this context differ from classical experimentation. Although this modern “backyard” (as Tukey might call it) is somewhat under-appreciated in the field of industrial statistics, we discuss several important and impactful research opportunities that traditional industrial statisticians could and should get involved with. (Zoom) |
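As background for readers new to A/B tests (illustration, not material from the talk), the canonical analysis is a two-sample comparison of conversion rates; a two-proportion z-test on made-up counts looks like this:

```python
# Two-proportion z-test for an A/B test on conversion rates (synthetic counts).
import math

def ab_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)              # pooled rate under H0
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_b - p_a, z, p_value

lift, z, p = ab_ztest(conv_a=480, n_a=10_000, conv_b=540, n_b=10_000)
print(f"lift = {lift:.4f}, z = {z:.2f}, p = {p:.4f}")
```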
Wednesday, October 6 | |
---|---|
09:00 - 09:55 |
Yiyun Luo: Distribution-Free Contextual Dynamic Pricing ↓ Contextual dynamic pricing aims to set personalized prices based on sequential interactions with customers. At each time period, a customer who is interested in purchasing a product comes to the platform. The customer's valuation for the product is a linear function of contexts, including product and customer features, plus some random market noise. The seller does not observe the customer's true valuation, but instead needs to learn the valuation by leveraging contextual information and historical binary purchase feedback. Existing models typically assume full or partial knowledge of the random noise distribution. In this paper, we consider contextual dynamic pricing with unknown random noise in the linear valuation model. Our distribution-free pricing policy learns both the contextual function and the market noise simultaneously. A key ingredient of our method is a novel perturbed linear bandit framework, where a modified linear upper confidence bound algorithm is proposed to balance the exploration of market noise and the exploitation of the current knowledge for better pricing. We establish the regret upper bound and a matching lower bound of our policy in the perturbed linear bandit framework and prove a sub-linear regret bound in the considered pricing problem. Finally, we show the superior performance of our policy on simulations and a real-life auto-loan dataset. (Zoom) |
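The talk's policy modifies a linear upper confidence bound algorithm; below is a sketch of the vanilla LinUCB loop it starts from, with synthetic contexts and an assumed exploration weight alpha. The perturbed-bandit machinery for the unknown noise distribution is not reproduced here.

```python
# Vanilla LinUCB loop (sketch): optimistic scores from a ridge regression.
import numpy as np

rng = np.random.default_rng(4)
d, alpha = 5, 1.0
A = np.eye(d)                 # ridge-regularized Gram matrix
b = np.zeros(d)

def ucb_score(x, A, b, alpha):
    """Optimistic value estimate for a context/action feature vector x."""
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b
    return x @ theta_hat + alpha * np.sqrt(x @ A_inv @ x)

for t in range(1000):
    candidates = rng.normal(size=(10, d))   # e.g., features of candidate prices
    scores = [ucb_score(x, A, b, alpha) for x in candidates]
    x = candidates[int(np.argmax(scores))]
    reward = x @ np.ones(d) + rng.normal()  # unknown true theta = all ones
    A += np.outer(x, x)                     # rank-one Gram update
    b += reward * x
```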
10:00 - 10:55 |
Aiyou Chen: Robust Causal Inference for Incremental Return on Ad Spend with Randomized Paired Geo Experiment ↓ Evaluating the incremental return on ad spend (iROAS) of a prospective online marketing strategy has become progressively more important as advertisers increasingly seek to better understand the impact of their marketing decisions. Although randomized “geo experiments” are frequently employed for this evaluation, obtaining reliable estimates of the iROAS can be challenging as oftentimes only a small number of highly heterogeneous units are used. In this talk, we formulate a novel statistical framework for inferring the iROAS of online advertising in a randomized paired geo experiment design, and we propose and develop a robust and distribution-free estimator “Trimmed Match” which adaptively trims poorly matched pairs. Using numerical simulations and real case studies, we show that Trimmed Match can be more efficient than some alternatives, and we investigate the sensitivity of the estimator to some violations of its assumptions. This is joint work with my colleague Tim Au at Google. (Zoom) |
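A sketch of the trimming idea as described in the abstract: choose theta so that a trimmed mean of the pair residuals dy_i - theta * dx_i equals zero. The grid search and the fixed 10% trim rate below are simplifications; the actual Trimmed Match estimator chooses the trim rate adaptively and comes with inference theory.

```python
# Trimmed-match-style iROAS point estimate on synthetic paired geo data.
import numpy as np

rng = np.random.default_rng(5)
n, theta_true = 40, 3.0
dx = rng.gamma(2.0, 1.0, size=n)                  # paired ad-spend differences
dy = theta_true * dx + rng.standard_t(2, size=n)  # paired response differences

def trimmed_mean_residual(theta, dx, dy, trim=0.10):
    e = np.sort(dy - theta * dx)
    m = int(np.floor(trim * len(e)))
    return e[m:len(e) - m].mean()                 # drop extreme pairs on both ends

# The trimmed mean is decreasing in theta (dx > 0), so a grid root search works.
grid = np.linspace(0.0, 10.0, 10_001)
vals = np.array([trimmed_mean_residual(t, dx, dy) for t in grid])
theta_hat = grid[np.argmin(np.abs(vals))]
print(f"trimmed-match-style estimate: {theta_hat:.2f} (truth {theta_true})")
```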
11:00 - 11:35 |
Jason Poulos: Retrospective and Forward-Looking Counterfactual Imputation via Matrix Completion ↓ I will discuss the matrix completion method for counterfactual imputation in standard and retrospective panel data settings, with applications to the social sciences. This talk is partly based on joint work with Andrea Albanese (LISER), Andrea Mercatanti (Bank of Italy), and Fan Li (Duke). (Zoom) |
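For intuition, here is a soft-impute-style sketch of matrix completion for counterfactual imputation: the treated block of a synthetic panel is masked, then imputed by iterating soft-thresholded SVDs. The penalty lam and the low-rank panel are illustrative assumptions.

```python
# Soft-impute sketch: impute a masked treated block of a panel by
# iterative soft-thresholded SVD. Synthetic data; illustrative only.
import numpy as np

rng = np.random.default_rng(6)
N, T, lam = 50, 40, 5.0
U, V = rng.normal(size=(N, 3)), rng.normal(size=(T, 3))
Y = U @ V.T + 0.1 * rng.normal(size=(N, T))  # untreated potential outcomes
observed = np.ones((N, T), dtype=bool)
observed[:10, 25:] = False                   # treated units after adoption: impute

Z = np.where(observed, Y, 0.0)
for _ in range(200):
    u, s, vt = np.linalg.svd(Z, full_matrices=False)
    L = (u * np.maximum(s - lam, 0.0)) @ vt  # soft-threshold the singular values
    Z = np.where(observed, Y, L)             # keep observed cells, impute the rest

counterfactuals = L[:10, 25:]                # imputed untreated outcomes
```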
11:40 - 12:15 | Yi Guo: Multiparty Auctions without Common Knowledge (Zoom) |
12:20 - 12:55 | Maggie Mao: eBay: The Fight for Best Practices in Experimentation (Zoom) |
13:00 - 14:00 | Break (Zoom) |
Thursday, October 7 | |
---|---|
09:00 - 09:55 |
Guy Aridor: The Effect of Privacy Regulation on the Data Industry: Empirical Evidence from GDPR ↓ Utilizing a novel dataset from an online travel intermediary, we study the effects of the EU's General Data Protection Regulation (GDPR). The opt-in requirement of GDPR resulted in a 12.5% drop in intermediary-observed consumers, but the remaining consumers are trackable for a longer period of time. These findings are consistent with privacy-conscious consumers substituting away from less efficient privacy protection (e.g., cookie deletion) to explicit opt-out, a process that would make opt-in consumers more predictable. Consistent with this hypothesis, the average value of the remaining consumers to advertisers has increased, offsetting some of the losses from consumer opt-outs. (Zoom) |
10:00 - 10:55 | Fiammetta Menchetti: ARIMA Models and Multivariate Bayesian Structural Models for Causal Inference from Sales Data (Zoom) |
11:00 - 11:55 | Ernest Fokoue (Zoom) |
12:00 - 12:55 |
Ron Berman: Latent Stratification for Advertising Experiments ↓ We develop a new estimator of the ATE for advertising incrementality experiments that improves precision by estimating separate treatment effects for three latent strata: customers who buy regardless of ad exposure, those who buy only if exposed to ads, and those who do not buy regardless. The overall ATE computed by averaging the strata estimates has lower sampling variance than the widely used difference-in-means ATE estimator. The variance is most reduced when the three strata have substantially different ATEs and are relatively equal in size. Estimating the latent stratified ATE for five catalog mailing experiments shows a reduction of 36-57% in the posterior variance of the estimate. (Zoom) |
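To see why stratification helps, the sketch below compares the difference-in-means ATE with a stratified ATE on synthetic data in which only one stratum responds to ads. The strata are treated as observed here for illustration; in the talk they are latent and inferred.

```python
# Stratified vs difference-in-means ATE on synthetic data with three strata.
import numpy as np

rng = np.random.default_rng(7)
n = 30_000
stratum = rng.integers(0, 3, size=n)          # 0: never-buy, 1: persuadable, 2: always-buy
treat = rng.random(n) < 0.5
base = np.array([0.00, 0.10, 0.90])[stratum]  # purchase rate without ads
lift = np.array([0.00, 0.20, 0.00])[stratum]  # only "persuadables" respond
y = (rng.random(n) < base + lift * treat).astype(float)

dim_ate = y[treat].mean() - y[~treat].mean()  # difference-in-means estimator

strat_ate = 0.0
for s in range(3):
    w = (stratum == s).mean()                 # stratum share
    ys, ts = y[stratum == s], treat[stratum == s]
    strat_ate += w * (ys[ts].mean() - ys[~ts].mean())

print(f"difference-in-means: {dim_ate:.4f}, stratified: {strat_ate:.4f}")
```

Repeating this simulation many times shows the stratified estimator's smaller sampling variance, which is the precision gain the abstract quantifies for the latent version.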
13:00 - 14:00 | Break (Zoom) |
14:00 - 14:55 |
Edoardo Airoldi: Estimating Peer-Influence Effects Under Homophily: Randomized Treatments and Insights ↓ Classical approaches to causal inference largely rely on the assumption of lack of interference, according to which the outcome of an individual does not depend on the treatment assigned to others, as well as on many other simplifying assumptions, including the absence of strategic behavior. In many applications, however, such as evaluating the effectiveness of health-related interventions that leverage social structure, assessing the impact of product innovations and ad campaigns on social media platforms, or experimentation at scale in large IT organizations, several common simplifying assumptions are simply untenable. Moreover, aspects of these complications, such as the causal effect of interference itself, are often inferential targets of interest rather than nuisances. In this talk, we will formalize issues that arise in estimating causal effects when interference can be attributed to a network among the units of analysis, within the potential outcomes framework. We will introduce and discuss several strategies for experimental design in this context, centered on a useful role for statistical models. In particular, we wish for certain finite-sample properties of the estimates to hold even if the model catastrophically fails, while we would like to gain efficiency if certain aspects of the model are correct. We will then contrast design-based, model-based, and model-assisted approaches to experimental design from a decision-theoretic perspective. (Zoom) |
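One widely used design in this setting, offered as an illustration rather than the talk's own proposal, is graph cluster randomization: assign treatment to whole network clusters so that most of a unit's neighbors share its assignment. The block-structured random graph below is synthetic.

```python
# Graph cluster randomization sketch: randomize at the cluster level and
# check each unit's share of treated neighbors ("exposure").
import numpy as np

rng = np.random.default_rng(8)
n, k = 600, 30
cluster = rng.integers(0, k, size=n)    # stand-in for a graph-clustering step
p_in, p_out = 0.10, 0.002               # assumed within/between edge rates
prob = np.where(cluster[:, None] == cluster[None, :], p_in, p_out)
A = (rng.random((n, n)) < prob).astype(int)
A = np.triu(A, 1)
A = A + A.T                             # undirected, no self-loops

cluster_treated = rng.random(k) < 0.5   # randomize whole clusters
treat = cluster_treated[cluster]

deg = np.maximum(A.sum(axis=1), 1)
exposure = (A @ treat.astype(float)) / deg
print(f"mean treated-neighbor share: {exposure[treat].mean():.2f} (treated), "
      f"{exposure[~treat].mean():.2f} (control)")
```

Pushing each unit's exposure toward 0 or 1 is what lets such designs identify direct and spillover effects that a unit-level randomization would confound.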
15:00 - 15:55 |
George Michailidis: Regularized and Smooth Double Core Tensor Factorization for Heterogeneous Data ↓ Tensor-factorization-based models have been extensively used in developing recommender systems. In this talk, we introduce a general tensor model suitable for data-analytic tasks on heterogeneous datasets, wherein there are joint low-rank structures within groups of observations, but also discriminative structures across different groups. To capture such complex structures, a double core tensor (DCOT) factorization model is introduced, together with a family of smoothing loss functions. By leveraging the proposed smoothing function, the model accurately estimates the model factors, even in the presence of missing entries. A linearized ADMM method is employed to solve regularized versions of the DCOT factorization that avoid large tensor operations and large memory storage requirements. The effectiveness of the DCOT model is illustrated on selected real-world examples, including image completion and recommender systems. (Zoom) |
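As one reading of the "double core" idea (the exact composition below is an assumption for illustration, not the paper's specification): a shared core captures the joint low-rank structure, group-specific core increments capture the discriminative structure, and both combine with common factor matrices through a Tucker-style product.

```python
# Double-core Tucker-style reconstruction (illustrative reading of DCOT).
import numpy as np

rng = np.random.default_rng(9)
I, J, T, r = 20, 20, 10, 3
U1, U2, U3 = (rng.normal(size=(d, r)) for d in (I, J, T))  # shared factors
G = rng.normal(size=(r, r, r))                             # joint core
H = {g: 0.3 * rng.normal(size=(r, r, r)) for g in (0, 1)}  # group increments

def tucker(core, U1, U2, U3):
    """Multilinear product: core x1 U1 x2 U2 x3 U3."""
    return np.einsum('abc,ia,jb,tc->ijt', core, U1, U2, U3)

# Observations in group g are modeled with the combined core G + H[g].
X_group0 = tucker(G + H[0], U1, U2, U3)
X_group1 = tucker(G + H[1], U1, U2, U3)
```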
Friday, October 8 | |
---|---|
09:00 - 09:25 | Simon Mak: TSEC: A Framework for Online Experimentation under Experimental Constraints (Zoom) |
10:00 - 10:25 | Sammy Natour: Programmatic Advertising for Scale, Efficiency, and Success (Zoom) |
12:00 - 12:55 | David Banks: Closing Remarks (Zoom) |