# Schedule for: 16w5091 - Developing a Comprehensive, Integrated Framework for Advanced Statistical Analyses of Observational Studies

Arriving in Banff, Alberta on Sunday, July 3 and departing Friday, July 8, 2016
Sunday, July 3
16:00 - 17:30 Check-in begins at 16:00 on Sunday and is open 24 hours (Front Desk - Professional Development Centre)
17:30 - 19:30 Dinner
A buffet dinner is served daily between 5:30pm and 7:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building.
(Vistas Dining Room)
20:00 - 22:00 Informal gathering (Corbett Hall Lounge (CH 2110))
Monday, July 4
07:00 - 08:45 Breakfast
Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building.
(Vistas Dining Room)
08:45 - 09:00 Introduction and Welcome by BIRS Station Manager (TCPL 201)
09:00 - 09:30 Willi Sauerbrei: STRATOS – Recent developments and aims for the next 12 months (TCPL 201)
09:30 - 09:45 Michal Abrahamowicz: Aims and overview of the BIRS Workshop (TCPL 201)
09:45 - 10:00 General discussion (TCPL 201)
10:00 - 10:30 Coffee Break (TCPL Foyer)
10:30 - 10:55 Gary Collins: Overview of STRATOS Panels - Literature review panel (RP) (TCPL 201)
10:55 - 11:20 Simon Day: Overview of STRATOS Panels - Glossary panel (GP): A glossary of terms for observational studies
This part of STRATOS is intended as an over-arching project to try to standardise common terminology and common meaning of such terminology.

Aside from the content (i.e. the definition given to each specific term included), the areas we are considering are as follows:

1) Scope: this is the “higher level” aspect to what terms get included. Terminology for clinical trials seems relevant (not all trials are randomised), as does survey methods and epidemiology – but should we drill down to the level of pharmaco-epidemiology (drug safety type issues), or stick with environmental epidemiology? The exact scope has been difficult to decide and, to some extent, we are approaching the task from the other end and thinking about which terms we want to include.

2) Structure for each entry: this we have decided on and each term is being described under the headings of:
Term
Definition
Context [some terms may have different meanings in different contexts]
Sources [source reference(s) used, if any]
Date Created
Date last updated.

3) Process: initially the glossary team are making a start but ultimately we want input from as many members of the STRATOS initiative as wish to comment; and we absolutely need input from each of the topic groups.

4) Delivery format: we plan to make available an open access, searchable glossary. A few key people will have editing rights, but there should be opportunities for others to leave comments (i.e. suggest corrections, updates, additional terms, etc.). We have been trying to engage a publisher (Wiley) to collaborate on the project and possibly host the glossary. They have expressed interest but real actions are a little slow in coming.
(TCPL 201)
11:20 - 11:45 Michal Abrahamowicz: Overview of STRATOS Panels - Simulation studies panel (SP) (TCPL 201)
11:45 - 12:00 Willi Sauerbrei: Overview of STRATOS Panels - New membership panel (MP) (TCPL 201)
12:00 - 13:00 Lunch (Vistas Dining Room)
13:00 - 14:00 Guided Tour of The Banff Centre
Meet in the Corbett Hall Lounge for a guided tour of The Banff Centre campus.
(Corbett Hall Lounge (CH 2110))
14:00 - 14:20 Group Photo
Meet in foyer of TCPL to participate in the BIRS group photo. The photograph will be taken outdoors, so dress appropriately for the weather. Please don't be late, or you might not be in the official group photo!
(TCPL Foyer)
14:30 - 15:00 Aris Perperoglou: Overview of STRATOS Panels - Data sets panel (DP) + Discussion (TCPL 201)
15:00 - 15:30 Coffee Break (TCPL Foyer)
15:30 - 15:50 Suzanne Cadarette: Overview of STRATOS Panels - Knowledge Translation panel (TP): If a tree falls in a forest and no one is around to hear it, does it make a sound?
If a tree falls in a forest and no one is around to hear it, does it make a sound? If a statistician publishes a new method that improves the validity of results from observational studies and no one (or few) read it, was the novel method developed? The lack of uptake of novel statistical methods is well documented. The Knowledge Translation Panel will work with Topic Groups and other Panels towards “being heard,” i.e., strategically packaging STRATOS guideline communications to improve diffusion in observational research methods. This talk will briefly review Rogers’ Diffusion of Innovations theory and use the diffusion of two confounder summary scores (disease risk score and high-dimensional propensity score) in pharmacoepidemiology as case examples to illustrate key strategies to help maximize the swift integration of STRATOS guidelines into research practice.
(TCPL 201)
15:50 - 16:10 Jörg Rahnenführer: Overview of STRATOS Panels - Website panel (WP) (TCPL 201)
16:10 - 16:15 Willi Sauerbrei: Overview of STRATOS Panels - Contacts with Societies and Organizations panel (OP) (TCPL 201)
16:15 - 16:30 Overview of STRATOS Panels - General discussion: Are Other Panels needed? (TCPL 201)
16:30 - 16:40 Short break (TCPL 201)
16:40 - 17:30 Frank Harrell: Selection of Variables and Functional Forms in Multivariable Analysis: Current Issues and Future Directions
This talk begins with a contrast of exploratory data analysis (à la Tukey) and formal analysis. Challenges of "too many variables and too few subjects" are briefly discussed in this context. The discussion turns to ways in which variable selection is misleading, contrasting feature selection with successful "kitchen sink" machine learning approaches. This leads to a statistical analogy of Maxwell's demon in which some of the information in the system is "stolen" by feature selection. An example in which the bootstrap is useful in quantifying the difficulty of the task will be shown; this involves getting confidence intervals for importance ranks for predictors. Instead of feature selection, pooled tests of overlapping predictors are advocated for assisting in model interpretation.

Some issues relating to fitting predictor functional form will be addressed, and the statistical advantages of pre-specifying knot locations in regression splines will be outlined. Many statistical analysts are unaware that modern methods for high-dimensional data such as lasso and elastic net frequently trade one set of problems for another, especially related to predictor transformations. This talk attempts to bring these issues more in the open, mentioning how a Bayesian might operate. Finally, some future directions in interaction modeling will be covered.
(TCPL 201)
17:30 - 18:20 Els Goetghebeur: Causal inference at the intersection of many state of the art methods
Methods and techniques available for causal inference have exploded over the past decade. Penetrating into this literature is particularly hard for the practicing statistician, since the material is challenging both at the conceptual and technical level. At the heart of causal inference lies an extra dimension of abstraction in the form of latent variables, also called potential outcomes, representing the possibly counterfactual answers to the question 'what if exposure had been set to (different) level x'. The endeavor is worthwhile, however, since causal claims are often the target and are frequently made in the medical literature, sometimes based on overly simplistic analyses.

The evidence in this setting is not brought in through direct observation of subject-specific potential outcome measures, but is typically derived from assumed links (or independence) between those latent variables and the different observed exposure-response data conditional on covariates. Once assumptions and the target of estimation are clear, most causal effect estimation methods have a specific core, but equally rely on methods and issues which form the topic of other working groups in STRATOS. Specifically:

TG8 outcomes may be of different types; binary, continuous, right censored survival (competing risks), longitudinal…

TG2 outcome regression and propensity score regression involved in causal inference must consider the impact of model selection (and prediction error) etc. while accounting for the special nature of covariates that are confounders

TG1 missing data often occur, and in some sense the 'alternative' potential outcomes can be seen as missing data themselves

TG5 studies are ideally designed with causal inference in mind

In this talk we will present the essence of the causal inference approach of TG7 and point to important 'potential' links with the work of other topic groups in STRATOS.

(TCPL 201)
18:20 - 19:30 Dinner (Vistas Dining Room)
Tuesday, July 5
07:00 - 08:20 Breakfast (Vistas Dining Room)
08:20 - 09:10 James Carpenter: Handling missing data in observational studies: challenges for teaching and research
Missing data present an inevitable, if unwelcome, challenge to analysts of observational data. Such analysts typically come from a variety of backgrounds, often with limited formal statistical training. Furthermore, they are increasingly looking to go beyond standard regression models and perform relatively complex analyses, e.g. using propensity scores, hierarchical models, and non-linear models.

Alongside this, the methodological literature on missing data is vast, and often relatively inaccessible. Despite excellent reviews [e.g. 1, 2, 3], it is often far from clear to practitioners which methods are essentially equivalent, and the relative strengths of different approaches and software. This is even more true when we move to sensitivity analyses.

To move things forward, in this talk I propose some principles for analysts of all levels, and illustrate how they may be implemented in increasingly complex examples. Beginning with analysts with limited statistical training (level 1), I will argue that STRATOS guidance should highlight:
• the necessity of performing and reporting a careful complete records analysis, and in particular guidance around how the mechanisms giving rise to the missing data impact the validity of the results ([4, Ch 1; 5, 6]);
• the importance of awareness of the scientific context, which should be kept in mind when faced with the results of complex statistical analysis [7];
• the value of including information from appropriate additional variables, not in the primary scientific model [8];
• the usefulness of simple sensitivity analysis [e.g., 4, Ch 10; 9];
• the complications which necessitate going beyond a relatively standard analysis and seeking further assistance, and
• how analyses of partially observed data should be reported [10].

I will argue that multiple imputation, though not the ‘best’ solution in all cases, has the widest applicability, and therefore should be considered as the primary approach, indicating how it relates to other approaches, such as direct likelihood and the EM algorithm.
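As an illustration only (it is not taken from the talk), the pooling step that multiple imputation conventionally relies on, Rubin's rules, can be sketched in a few lines of Python. The function name `pool_rubin` and the numbers in the usage note are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Combine m completed-data analyses with Rubin's rules.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

For example, `pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])` pools three made-up estimates; the total variance exceeds the average within-imputation variance because it also reflects the uncertainty due to the missing data.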

As the talk progresses, the examples will become more complex, and I will indicate where I believe guidance would be helpful for level 2 analysts, both in terms of methods and software. I will also briefly discuss how missing data is an example of a broader class of data dependent sampling [7], and the implications of this for developing guidance for researchers.

References:
[1] Little, R.J. (1992). Regression with Missing X's: a Review. Journal of the American Statistical Association, 87, 1227–1237.
[2] Horton, N. J. and Kleinman, K. P. (2007) Much ado about nothing: a comparison of missing data methods and software to fit incomplete data regression models. The American Statistician, 61, 1–12.
[3] Hogan, J. W., Roy, J. and Krokontzelou, C. (2004). Tutorial in biostatistics: handling drop-out in longitudinal studies. Statistics in Medicine 23, 1455–1497.
[4] Carpenter, J. R., and Kenward, M. G. (2013) Multiple Imputation and its Application. Chichester: Wiley.
[5] Little, R. J. and Zhang, N (2011) Subsample ignorable likelihood for regression analysis with missing data. Journal of the Royal Statistical Society, Series C, 60, 591–605.
[6] Bartlett, J. W., Harel, O. and Carpenter, J. R. (2015) Asymptotically unbiased estimation of exposure odds ratios in complete records logistic regression. American Journal of Epidemiology, 182, 730–736.
[7] Morris, T. P., White, I. R., Royston, P., Seaman, S.R., and Wood, A. M. (2014) Multiple imputation for an incomplete covariate that is a ratio. Statistics in Medicine, 33, 88–104.
[8] Spratt, M., Carpenter, J. R., Sterne, J. A. C. et al. (2010) Strategies for Multiple Imputation in Longitudinal Studies. American Journal of Epidemiology, 172, 478–487.
[9] Hogan, J., Daniels, M. J. and Hu, L. (2015) Bayesian Sensitivity Analysis. In Handbook of Missing Data Methodology, eds Molenberghs, G., Fitzmaurice, G., Kenward, M. G., Tsiatis, A. and Verbeke, G., pages 405–431. New York: CRC Press.
[10] Sterne, J. A. C., White, I. R., Carlin, J. B. et al (2009). Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. British Medical Journal, 339, 157–160.
[12] Molenberghs, G., Kenward, M. G., Aerts, M., Verbeke, G., Tsiatis, A. A., Davidian, M. and Rizopoulos, D. (2014) On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data. Statistical Methods in Medical Research, 23, 11–41.
(TCPL 201)
09:10 - 10:00 Per Kragh Andersen: Dealing with Competing Risks in survival analysis
In survival analysis, end of follow-up can be caused by the occurrence of the event of primary interest, by the occurrence of a competing event that prevents the event of primary interest from happening, or by genuine 'right-censoring' (such as loss to follow-up or end of follow-up). When the ambition is to estimate probabilities ('risks') for the event of primary interest, it is crucial to distinguish between occurrence of a competing event and right-censoring.

Nevertheless, in numerous applied medical articles, competing risks are treated as if they were genuine right-censoring, e.g., by using one minus the Kaplan-Meier estimator as a risk estimator (thereby overestimating the risk). In many medical disciplines, including cardiology, hepatology, oncology, and epidemiology, papers explaining these difficulties in a supposedly easy-to-read manner have appeared.
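The overestimation can be seen on a toy data set. The sketch below is not from the talk: the function name `risk_estimates`, the cause coding (0 = censored, 1 = event of interest, 2 = competing event) and the six observations are all hypothetical. It computes one minus the Kaplan-Meier estimator with competing events treated as censoring, next to the Aalen-Johansen cumulative incidence estimator.

```python
from typing import Sequence, Tuple


def risk_estimates(times: Sequence[float],
                   causes: Sequence[int],
                   cause_of_interest: int = 1) -> Tuple[float, float]:
    """Return (one minus Kaplan-Meier, Aalen-Johansen risk) at end of follow-up.

    causes: 0 = right-censored, any other integer = type of event.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    km_naive = 1.0   # Kaplan-Meier that treats competing events as censoring
    surv_any = 1.0   # probability of being free of every event type
    cum_inc = 0.0    # Aalen-Johansen cumulative incidence for the cause of interest
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [c for (s, c) in data if s == t]
        d_interest = sum(1 for c in tied if c == cause_of_interest)
        d_any = sum(1 for c in tied if c != 0)
        # Risk increment: still event-free just before t, then fail from the cause.
        cum_inc += surv_any * d_interest / n_at_risk
        km_naive *= 1 - d_interest / n_at_risk
        surv_any *= 1 - d_any / n_at_risk
        n_at_risk -= len(tied)
        i += len(tied)
    return 1 - km_naive, cum_inc


naive_risk, aj_risk = risk_estimates([1, 2, 3, 4, 5, 6], [1, 2, 1, 2, 0, 1])
```

On these six made-up observations the naive estimate is never below the Aalen-Johansen estimate, because subjects removed by the competing event are wrongly treated as if they could still experience the event of interest.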

For these reasons, competing risks will be a crucial topic for the STRATOS TG8 working with analysis of survival data.

However, dealing with competing risks is also important for other STRATOS topic groups. This includes, among others:

TG1 (missing data) When doing multiple imputation with the response variable included in the imputation model, inclusion of a (possibly right-censored) competing risks response is not trivial.

TG2 (variable selection and functional forms of a dose-response relationship) Special regression models are typically used for survival analysis (possibly with competing risks) though working with models with a linear predictor is quite similar for different types of outcome variable.

TG6 (evaluating diagnostic tests and prediction models) When assessing predictive accuracy, special care is needed for right-censored outcomes, including situations with competing risks.

TG7 (causal inference) Both when using IPTW and when using the g-formula, special techniques are needed in the presence of competing risks.

In the talk, we will discuss both problems in connection with reporting results in the medical literature when competing risks are present and how these points play a role in the work of other STRATOS topic groups. The possible role of so-called pseudo-observations will also be discussed.
(TCPL 201)
10:00 - 10:30 Coffee Break (TCPL Foyer)
10:30 - 11:00 Marianne Huebner: TG3 – Descriptive and initial data analysis: Overview of framework and reporting of initial data analysis (TCPL 201)
11:00 - 11:30 Heike Hofmann: TG3 – Descriptive and initial data analysis: Data visualization for initial data analysis (TCPL 201)
11:30 - 12:15 Gary Collins: TG6 – Evaluating diagnostic tests and prediction models: Evaluating Prediction Models: overview of existing guidance (+ Discussion) (TCPL 201)
12:15 - 13:30 Lunch (Vistas Dining Room)
13:30 - 14:00 Laurence Freedman: TG4 – Measurement error and misclassification: Overview of TG4 activities and progress toward guidance document
I provide a brief overview of the main activities of our topic group. We have three major projects: first, a survey of practice with regard to handling measurement error in four areas of epidemiology; second, a guidance document on measurement error in epidemiology aimed at biostatisticians who have only a vague acquaintance with statistical methods for handling measurement error; and third, a guidance document on handling dietary measurement error aimed at nutritional epidemiologists. I summarize progress and timelines for these three projects. Other members of TG4 will describe the first and third projects in more detail in later talks at Banff.

I then describe the progress made with the guidance document for biostatisticians. We have completed a draft of the first three sections of a document that is planned to comprise 10 sections. The three sections completed are Introduction, The main types of measurement error, and Effects of measurement error on study results. We are trying to be comprehensive in our coverage but concise. Thus the “Types of measurement error” section includes classical error, the linear measurement error model, Berkson error and misclassification, and explains the concepts of differential and non-differential measurement error. The “Effects of measurement error” section includes how these different types of error affect regression coefficients when the error is in one or more explanatory variables, how it affects regression coefficients when the error is in the response variable, and how it affects the estimation of distributions.

To emphasize that not all of these effects are well appreciated, I show how classical and Berkson errors have totally different effects on estimates, and give an example where lack of appreciation of such effects led to erroneous conclusions.
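The contrast between the two error types can be reproduced with a small simulation. This is a sketch under stated assumptions rather than the example from the talk: a simple linear regression with true slope 2, standard normal variables, non-differential error with variance 1, and hypothetical function names `ols_slope` and `simulate_slopes`.

```python
import random
from typing import List, Tuple


def ols_slope(x: List[float], y: List[float]) -> float:
    """Least-squares slope of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_slopes(n: int = 20000, beta: float = 2.0,
                    seed: int = 1) -> Tuple[float, float]:
    """Return (slope under classical error, slope under Berkson error)."""
    rng = random.Random(seed)
    # Classical error: the observed value W scatters around the true value X.
    x_true = [rng.gauss(0, 1) for _ in range(n)]
    w_classical = [x + rng.gauss(0, 1) for x in x_true]
    y_classical = [beta * x + rng.gauss(0, 1) for x in x_true]
    # Berkson error: the true value X scatters around the assigned value W.
    w_berkson = [rng.gauss(0, 1) for _ in range(n)]
    x_berkson = [w + rng.gauss(0, 1) for w in w_berkson]
    y_berkson = [beta * x + rng.gauss(0, 1) for x in x_berkson]
    return ols_slope(w_classical, y_classical), ols_slope(w_berkson, y_berkson)


classical_slope, berkson_slope = simulate_slopes()
```

Under these assumptions the classical-error slope is attenuated towards about half the true value (the ratio of the variance of X to the variance of W is 1/2), while the Berkson-error slope stays close to the true value of 2, with only the residual variance inflated.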

Finally, I propose a basis for further activities of TG4.
(TCPL 201)
14:00 - 14:30 Pamela Shaw: TG4 – Measurement error and misclassification: Literature surveys of awareness of measurement error issues, and use of methods to mitigate its effects in four areas of observational epidemiology
Many variables of interest in epidemiological observational studies are subject to measurement error and misclassification. However, in many fields of epidemiological research the impact of such errors is either not appreciated or is ignored.

Literature surveys about the current practice of accounting for measurement error in epidemiological observational studies were conducted within the work of TG4. The objectives of these surveys were to describe the frequency and the types of statistical methods applied for measurement error correction in order to identify priorities for a guidance paper.

Research articles in four fields of epidemiology were surveyed: 1) nutritional intake cohort studies, 2) dietary intake population surveys, 3) physical activity cohort studies and 4) air pollution cohort studies.

The survey was conducted both as a general search of articles in these areas (part A), to understand current practice with regard to addressing measurement error, and as a specific methods search (part B), to understand which methods were used when adjustments for measurement error were made in the analysis. The survey strategy was adapted for each research field.

In the part A survey for nutritional intake cohort studies, the awareness regarding measurement error issues was highest (94%), compared to the other fields (2: 57%, 3: 79%, 4: 40%). Moreover, methods for measurement error correction (particularly regression calibration) were found to be frequently applied in the part B survey for this field, whereas these methods were rarely applied in the other fields.

Despite general awareness of measurement error problems in statistical analyses, most studies do not use methods for measurement error correction. The surveys revealed some reasons, including lack of knowledge of the sources of measurement error, incomplete understanding of the implications, and missing examples of measurement error correction methods in current practice, especially for more complex situations (several variables with measurement error, correlated predictors, etc.). Guidance documents as well as tutorials and software may encourage practitioners to apply methods for measurement error correction.
(TCPL 201)
14:30 - 15:00 Ruth Keogh: TG4 – Measurement error and misclassification: Error in measurements of dietary intake used in nutritional epidemiology: Impact, corrections, and recommendations
TG4 is currently engaged in several activities to (i) increase the awareness of the implications of measurement error and misclassification for our investigations among biostatisticians and epidemiologists, and (ii) point to methods to address problems arising from measurement error. This presentation will focus on measurement error in nutritional epidemiology and a planned tutorial and guidance paper for researchers working in this area.

Accurately measuring dietary intakes is a major challenge for nutritional epidemiology. Studies usually rely on self-reported dietary intake data, though these are known to be subject to intake-related and person-specific systematic errors of some magnitude. The potential impact of different types of measurement error will be illustrated, emphasising the special nature of measurement error in dietary exposures. Measurement error reduces power to detect diet-disease associations and can result in biased estimates of such associations. Though the prevailing understanding is that this bias is in the form of an attenuation, this is not necessarily the case. Corrections for measurement error can be made if the nature of the error can be ascertained; doing so requires information from a reference measure, for example, from biomarkers. I will give an overview of the above issues and of methods for correcting for measurement error, focusing on regression calibration.
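As a rough illustration of regression calibration (a sketch under simplifying assumptions, not the tutorial's method), the simulation below assumes a linear outcome model, a self-report with intake-related bias, and a reference measure with independent classical error that is available for every participant; in practice the reference is usually collected only in a substudy. All names and parameter values are hypothetical.

```python
import random
from typing import List, Tuple


def slope(x: List[float], y: List[float]) -> float:
    """Least-squares slope of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def regression_calibration_demo(n: int = 20000, beta: float = 1.5,
                                seed: int = 7) -> Tuple[float, float]:
    """Return (naive slope, calibrated slope) for a simulated diet-outcome study."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]                # true intake, unobserved
    w = [0.5 + 0.6 * xi + rng.gauss(0, 0.8) for xi in x]   # self-report, intake-related bias
    r = [xi + rng.gauss(0, 0.5) for xi in x]               # reference measure, e.g. a biomarker
    y = [beta * xi + rng.gauss(0, 1) for xi in x]          # outcome depends on true intake
    naive = slope(w, y)          # regress the outcome on the error-prone self-report
    calibration = slope(w, r)    # slope of E[true intake | self-report], via the reference
    return naive, naive / calibration


naive_slope, calibrated_slope = regression_calibration_demo()
```

With these assumed parameters the naive slope is pulled towards roughly 0.9, and dividing by the calibration slope (about 0.6) brings the estimate back near the true value of 1.5.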

I will summarise what the tutorial paper will include in terms of practical guidance for authors. Finally, I will highlight the challenges for improving practice and summarise some initial recommendations for investigators designing studies and reporting their results, as well as for editors and reviewers.
(TCPL 201)
15:00 - 15:30 Coffee Break (TCPL Foyer)
15:30 - 16:00 Katherine Lee: TG1 – Missing data: Topic to be announced later (TCPL 201)
16:00 - 16:30 Mitchell Gail: TG5 – Study design: TG5 paper on Study Design (TCPL 201)
16:30 - 16:50 General discussion (TCPL 201)
16:50 - 17:00 Short break (TCPL 201)
17:00 - 17:30 Terry Therneau: TG8 – Survival analysis: Overview of issues and challenges in survival analysis (TCPL 201)
17:30 - 17:45 Michal Abrahamowicz: TG8 – Survival analysis: ‘Meta-review’ of selected published reviews and guidance documents in survival analysis (TCPL 201)
17:45 - 18:00 General discussion (re: Links of TG8 with other TGs) (TCPL 201)
18:00 - 18:30 Lisa McShane: TG9 – High-dimensional data: Overview of TG9 progress towards guidance document (TCPL 201)
18:30 - 19:30 Dinner (Vistas Dining Room)
Wednesday, July 6
07:00 - 08:00 Breakfast (Vistas Dining Room)
08:00 - 08:30 Els Goetghebeur: TG7 – Causal inference: Causal questions and principled answers: a guide through the landscape for practicing statisticians. Part I: total effect of baseline exposure (TCPL 201)
08:30 - 09:00 Saskia le Cessie: TG7 – Causal inference: Instrumental Variables (TCPL 201)
09:00 - 09:30 Niels Keiding: TG7 – Causal inference: Generalization from self-selected epidemiological studies
Low front-end cost and rapid accrual make web-based surveys and enrollment in studies attractive. Participants are often self-selected with little reference to a well-defined study base. Of course, high quality studies must be internally valid (validity of inferences for the sample at hand), but web-based sampling reactivates discussion of the nature and importance of external validity (generalization of within-study inferences to a target population or context) in epidemiology. A classical epidemiological approach would emphasize representativity, usually conditional on important confounders. An alternative view held by influential epidemiologists claims that representativity (in a narrow sense) is irrelevant for the scientific nature of epidemiology. Against this background, it is a good time for statisticians to take stock of our role and position regarding surveys and observational research in epidemiology. The central issue is whether conditional effects in the study population may be transported to desired target populations. This will depend on the compatibility of causal structures in study and target populations, and will require subject matter considerations in each concrete case. Statisticians, epidemiologists and survey researchers should work together to develop increased understanding of these challenges and improved tools to handle them.

Reference
Keiding, N. & Louis, T.A. (2016). Perils and potentials of self-selected entry to epidemiological studies and surveys (with discussion). J.Roy.Statist.Soc. A 179, 319-376.
(TCPL 201)
09:30 - 10:00 Aris Perperoglou: TG2 – Selection of variables and functional forms in multivariable analysis: Review of spline function selection procedures in R (TCPL 201)
10:00 - 10:30 Coffee Break (TCPL Foyer)
10:30 - 11:00 Matthias Schmid: TG2 – Selection of variables and functional forms in multivariable analysis: Review of variable selection: issues and methods
In this talk, some popular variable selection methods for explanatory statistical regression modeling will be presented. The focus will be on data-driven techniques for linear covariate effects that are based on information criteria, significance and penalized likelihood. Since each of these techniques has implications for the stability, bias and validity of the final model, many scientists have argued against their use. Nevertheless, variable selection, whether data-driven or not, remains a key issue in observational research that can almost never be avoided in the analysis of non-randomized empirical studies. As a consequence, practitioners are in need of pragmatic recommendations for steps to be taken when variable selection has to be conducted. Specifically, the talk will cover aspects such as the selection of candidate covariates based on background knowledge and choosing an appropriate variable selection method for the problem at hand.
(TCPL 201)
11:00 - 11:30 Time to prepare for the excursion (N/A)
11:30 - 12:15 Lunch (Vistas Dining Room)
12:30 - 18:45 Excursion (Banff National Park)
18:45 - 19:30 Dinner (Vistas Dining Room)
Thursday, July 7
07:00 - 08:30 Breakfast (Vistas Dining Room)
08:30 - 09:15 Stephen Walter: Overview of STRATOS Panels - Publications panel (PP) (TCPL 201)
09:15 - 10:00 General discussion: Further issues relevant for any of the panels? (TCPL 201)
10:00 - 10:30 Coffee Break (TCPL Foyer)
10:30 - 12:00 Discussions re: Links/Collaborations between different TGs (TCPL 201 or smaller meeting rooms)
12:00 - 13:00 Lunch (Vistas Dining Room)
13:00 - 15:00 Separate meetings/discussions of individual TGs and Panels (smaller meeting rooms)
15:00 - 15:30 Coffee Break (TCPL Foyer)
15:30 - 16:50 Separate meetings/discussions of individual TGs and Panels (smaller meeting rooms)
16:50 - 17:00 Short break (TCPL 201)
17:00 - 18:20 General discussion: Relevant steps from primary research to developing guidance documents (TCPL 201)
18:20 - 19:30 Dinner (Vistas Dining Room)
Friday, July 8
07:00 - 08:30 Breakfast (Vistas Dining Room)
08:30 - 10:00 General Discussion - Summary and Outlook: I. Panels, II. Topic Groups, III. General issues & future meetings (e.g. Oberwolfach) (TCPL 201)
10:00 - 10:30 Coffee Break (TCPL Foyer)
10:30 - 11:50 General Discussion - Summary and Outlook: Main focus on future steps toward guidance and integration of activities of different TGs (TCPL 201)
11:50 - 12:00 Checkout by Noon
5-day workshop participants are welcome to use BIRS facilities (BIRS Coffee Lounge, TCPL and Reading Room) until 3 pm on Friday, although participants are still required to check out of the guest rooms by 12 noon.
(Front Desk - Professional Development Centre)
12:00 - 13:30 Lunch (Vistas Dining Room)