Abstracts & Slides
 

April 4, 2017

   
 
"Recurrent Somatic Copy Number Alteration Analysis Identifies Risk Genes that Modulate the Survival of Young Women with Breast Cancer"
by
Pingzhao Hu, University of Manitoba

A breast cancer (BC) diagnosis in young women (<45 years old) has emerged as an independent risk factor, carrying a higher risk of recurrence and death than in older women, and it has been suggested that BC in young women may exhibit its own unique biology. Copy number alterations (CNAs) are increasingly considered an alternate paradigm for the genetic basis of human diseases, as these large alterations may encompass key genes that contribute to carcinogenesis and disease progression. Although many complex diseases have been linked to CNAs in the genomic DNA, prior studies have yet to document age-related changes in somatic CNAs for young women with BC.

We hypothesize that recurrent somatic CNA regions uniquely found in young women with BC harbor cancer susceptibility genes that modulate the survival of these women. We aim to identify recurrent somatic CNA regions from BC microarray data and to associate the CNA status of the genes harbored in these regions with the survival of young women with BC.

We develop a new interval graph-based algorithm for identifying recurrent somatic CNAs in cancer using a maximal clique detection technique. The algorithm guarantees that the identified CNA regions are the most frequent and delineates their minimal regions. By applying the algorithm to the Molecular Taxonomy of Breast Cancer International Consortium CNA data consisting of 2000 breast tumour samples (equally divided into a Discovery set with 130 young women and a Validation set with 125 young women), a total of 38 validated recurrent CNA regions with 39 protein-encoding genes have been identified, along with 68 validated recurrent CNA regions that did not encompass any protein-encoding genes.
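
As a rough illustration of the interval-graph idea only (this is not the speaker's algorithm, which the abstract does not spell out), a maximum clique in an interval graph corresponds to a point covered by the largest number of intervals, so it can be found with a simple sweep over interval endpoints. The sketch below assumes each sample contributes one closed interval `(start, end)` on a single chromosome; the function name and the toy coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]


def most_recurrent_regions(intervals: List[Interval]) -> Tuple[int, List[Interval]]:
    """Return the maximum overlap depth and the minimal regions that reach it.

    Each input interval is a closed (start, end) segment from one sample.
    In an interval graph, every set of intervals sharing a common point is a
    clique, so the deepest overlap is a maximum clique.
    """
    events = []
    for start, end in intervals:
        events.append((start, 0))  # 0 = open; sorts before a close at the same position
        events.append((end, 1))    # 1 = close
    events.sort()

    depth = 0          # number of intervals covering the current position
    best = 0           # deepest overlap seen so far
    regions: List[Interval] = []
    region_start = None

    for position, kind in events:
        if kind == 0:
            depth += 1
            if depth > best:
                # A deeper overlap supersedes everything found earlier.
                best = depth
                regions = []
                region_start = position
            elif depth == best:
                region_start = position
        else:
            if depth == best and region_start is not None:
                # The common region ends where the first covering interval closes.
                regions.append((region_start, position))
                region_start = None
            depth -= 1
    return best, regions


# Hypothetical CNA segments from four samples (coordinates are illustrative).
toy_segments = [(1, 10), (5, 15), (8, 20), (30, 40)]
toy_depth, toy_regions = most_recurrent_regions(toy_segments)
```

Each returned region is the intersection of the intervals in one deepest clique, which is one plausible reading of a "minimal region"; gains and losses would be processed separately under this sketch.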

CNA gain regions encompassing genes CAPN2, CDC73 and ASB13 are the top 3 with the highest occurring frequencies in both the young Discovery and Validation datasets, while gene SGCZ ranked top for the recurrent CNA loss regions. Of particular interest, the mutation status of two of these genes, ASB13 and SGCZ, was also significant in the Kaplan-Meier survival analysis. Patients with a mutated status in both of these genes had a worse survival outcome than patients without the gene mutations. Association and survival analyses demonstrated that the mutated CNA status in ASB13 seems to lead to correspondingly higher gene expression (binarized by mean), which is able to predict patient survival outcome.

Together, identification of the deletion and amplification events that may be prognostic in young women with breast cancer can be used in genomic-guided treatment.

Talk Slides


   

March 28, 2017

   
 
"Analytic Considerations and Challenges in Epigenetic Epidemiology Studies: An Example"
by
France Gagnon and Nora Zwingerman, University of Toronto

What is epigenetic epidemiology? What are the main study design and analytic challenges? Why is it important for analysts to understand the fundamental biology of epigenetic mechanisms before analyzing the data? Why should biostatistics students be interested in epigenetic epidemiology studies? These questions will be discussed in class through the example of an identified bias in a commonly used high-throughput platform for measuring methylation levels.

Talk Slides, Recommended Reading

   

March 21, 2017

   
 
"Statistical Learning and Graph Theory Methods for the Development of Imaging Biomarkers for the Computer-aided Diagnosis of Brain Disorders"
by
Pradeep Reddy Raamana, Baycrest Health Sciences, University of Toronto

Brain disorders such as Alzheimer's disease are challenging to diagnose until late in their progression. Further, the changes they cause in the early stages are subtle and spatially distributed in the brain, which makes clinical diagnosis (via visual inspection of brain MRI scans) even harder. Hence the development of computer-aided techniques for early detection is key. In this talk, I will briefly discuss the associated challenges, how domain expertise helps us in designing better feature extraction methods, and how machine learning techniques are well suited to uncovering the hidden patterns in the early changes caused by different disorders. The techniques that I will discuss are graph theory methods for extracting network-level features, and ways to optimally fuse them via multiple kernel learning and random forests for accurate multi-class classification. If time permits, I will discuss the impact of different methodological choices in feature extraction on predictive power.

Talk Slides, Recommended Readings 1 & 2

   

March 14, 2017

   
 
"Sampling from Hidden Populations: Respondent Driven Sampling"
by
Marcos Sanches, Centre for Addiction and Mental Health

A population is considered hidden, or hard to reach, when no good sampling frame exists for its members, it is rare, its members may suffer discrimination or stigmatization, and acknowledgement of membership is sometimes seen as threatening. Common examples of hidden populations are injection drug users, smokers of contraband cigarettes, sex workers and perhaps biostatisticians! Reaching such populations using traditional probabilistic sampling methodologies is in general too expensive to be feasible. Some of the existing procedures for reaching these populations, such as snowball sampling and other chain-referral methods, targeted sampling, key-informant sampling and other convenience sampling methods, lack statistical validity because they are far from probabilistic. Respondent Driven Sampling (RDS) is a modification of chain-referral sampling that is statistically better grounded and has emerged as a good alternative, being the method behind many scientific publications. We will quickly review probabilistic sampling methods and why they fail to reach hidden populations, go through some existing convenience sampling alternatives, and examine why RDS has been seen as superior, addressing some of its statistical strengths.

Talk Slides, Recommended Readings 1 & 2

   
March 7, 2017    
 
"Temporal Trends in Severe Injuries in Ontario, 2004-2014: A Joinpoint Regression Analysis"
by
Victoria Landsman, Institute for Work & Health

The aim of this study was to explore temporal trends in severe traumatic injuries in Ontario over the last decade. Developing a diagnosis-based definition for severe injuries was one of the objectives of the study. The newly developed case definition was then applied to the administrative records of emergency department (ED) visits in Ontario during the period 2004-2014. We explored trends in severe injuries using Joinpoint regression. Points of change, annual percent change (APC) and average annual percent change (AAPC) were estimated for severe occupational and non-occupational injuries, separately for females and males.
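
For readers unfamiliar with the APC and AAPC summaries, the sketch below shows how they are conventionally derived from a log-linear trend: within one segment, the log rate is regressed on calendar year and APC = 100 * (exp(slope) - 1); the AAPC is the same transformation applied to the segment slopes averaged with weights equal to segment length. It assumes the joinpoints are already known (locating them is the hard part and is not shown), uses plain least squares, and runs on invented rates; none of the names or numbers come from the talk.

```python
import math
from typing import List, Sequence, Tuple


def log_linear_slope(years: Sequence[float], rates: Sequence[float]) -> float:
    """Least-squares slope of ln(rate) on year within a single segment."""
    n = len(years)
    log_rates = [math.log(r) for r in rates]
    mean_year = sum(years) / n
    mean_log = sum(log_rates) / n
    sxy = sum((x - mean_year) * (y - mean_log) for x, y in zip(years, log_rates))
    sxx = sum((x - mean_year) ** 2 for x in years)
    return sxy / sxx


def apc(slope: float) -> float:
    """Annual percent change implied by a log-linear slope."""
    return 100.0 * (math.exp(slope) - 1.0)


def aapc(segments: List[Tuple[float, float]]) -> float:
    """Average annual percent change from (slope, segment_length_in_years) pairs."""
    total_length = sum(length for _, length in segments)
    mean_slope = sum(slope * length for slope, length in segments) / total_length
    return 100.0 * (math.exp(mean_slope) - 1.0)


# Invented example: a rate that grows by exactly 5% per year over one segment.
example_years = [2004, 2005, 2006, 2007, 2008]
example_rates = [100.0 * 1.05 ** i for i in range(5)]
example_apc = apc(log_linear_slope(example_years, example_rates))
```

Because the averaging is done on the log scale, the AAPC is a geometric rather than arithmetic summary of the segment-specific APCs.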

Talk Slides, Recommended Reading

   
February 28, 2017    
 
"Methodology on Human Microbiome Data Analysis"
by
 Wei Xu, Princess Margaret Cancer Centre, University Health Network

Technological advances in genomic sequencing have enabled researchers to unveil the wide variability of bacteria present within different locations of the body, i.e., the microbiome, and how it relates to disease. However, our understanding of how microbiomes affect diseases is still unclear. To improve disease management, it is necessary to better understand how both environmental and host genetic factors impact the composition of the microbiome. Powerful statistical and bioinformatics tools are needed to overcome these knowledge gaps. In this talk, I will introduce the general characteristics of human microbiome data, which can be clustered into operational taxonomic units (OTUs) at each taxonomic level by next-generation sequencing. Several analytic approaches will be introduced to summarize and assess single or multiple OTUs using different computational algorithms. Specific features of microbiome sequencing data will be explored, such as non-negative, highly skewed sequence counts with excess zeros, and clustered taxonomic structure. I will introduce a novel methodology that incorporates the hierarchical nature of bacterial classification to explore interactions both between genes and between genes and the environment, in order to understand their effects on the microbiome. This method can help discover both genetic and environmental factors that influence the microbiome, and the information could ultimately be used to modify the microbiome with the goal of improving human health.

Talk Slides, Recommended Readings 1 & 2

   
February 14, 2017    
 
"Reducing Antibiotic Prescribing for Children at Primary Care Facilities in Rural China: Design, Implementation and Analysis of A Clustered Randomised Controlled Trial"

by
 Xiaolin Wei, University of Toronto

Inappropriate antibiotic prescribing in primary care contributes to generating drug resistance globally. In China, recent national policies have failed to improve the situation. We designed and evaluated an intervention to reduce antibiotic prescribing for upper respiratory tract infections (URTIs) in children, targeting clinicians and caregivers in primary care facilities in rural China. The presenter will discuss the design, implementation and analysis of a clustered randomised controlled trial. The study was one of the first c-RCTs conducted in rural primary care of LMICs with a relatively large sample size. Our intervention included guideline training, monthly peer-review of prescriptions, and concise caregiver education, and was designed to operationalise China's national effort to improve antimicrobial stewardship. Our study showed that these interventions, when embedded in routine practice, can reduce antibiotic prescribing for childhood upper respiratory tract infections by 29% (ARR). This effect size was much higher than those attained in all previous studies, and suggests these interventions are worth implementing in rural China and other similar settings where over-prescribing is severe. Our intervention rapidly and substantially reduced inappropriate prescribing of antibiotics for children with URTIs in Chinese rural primary care facilities. As the intervention was well integrated within routine practice, it is likely scalable within China and in similar settings.

Talk Slides

   
February 7, 2017    
 
"Partial Least Squares Correspondence Analysis: A Framework to Simultaneously Analyze Behavioral and Genetic Data"
by
 Derek Beaton, University of Texas at Dallas & Baycrest

For nearly a century, detecting the genetic contributions to cognitive and behavioral phenomena has been a core interest for psychological research, and that interest is even stronger now. Today, the collection of genetic data is both simple and inexpensive. As a consequence, a vast amount of genetic data is collected across disciplines as diverse as experimental and clinical psychology, cognitive sciences, and neurosciences. However, such an explosion in data collection can make data analyses very difficult. This difficulty is especially relevant when we wish to identify complex relationships within, and between, genetic data and, for example, cognitive and neuropsychological batteries. To alleviate such problems, we have developed a multivariate approach to make these types of analyses easier and to better identify the relationships between multiple genetic markers and multiple behavioral or cognitive phenomena. Our approach--called partial least squares correspondence analysis (PLSCA)--generalizes partial least squares and identifies the information common to two different data tables measured on the same participants. PLSCA is specifically tailored for the analysis of complex data that may exist in a variety of measurement scales (e.g., categorical, ordinal, interval, or ratio scales). In this talk, I first present--in a tutorial format--how PLSCA works, how and why to use it, and how to interpret its results. PLSCA is illustrated with genetic, behavioral, and neuroimaging data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Finally, a large scale (i.e., genome-wide) analysis is presented with both ADNI-1 and ADNI-GO/2 data. R code with examples is available via CRAN and GitHub.

Talk Slides, Recommended Reading

   
January 31, 2017    
 
"Scientific Writing"

by Wendy Lou, University of Toronto

   
January 24, 2017    
 
"Using Knowledge Fusion to Analyze Avian Influenza H5N1 in South and Southeast Asia"
by
 Erjia Ge, University Health Network

Highly pathogenic avian influenza (HPAI) H5N1, a disease associated with high rates of mortality in infected human populations, poses a serious threat to public health in many parts of the world. This article reports findings from a study aimed at improving our understanding of the spatial pattern of HPAI H5N1 risk in East-Southeast Asia, where the disease is both persistent and devastating. Though many disciplines have made important contributions to our understanding of H5N1, it remains a challenge to integrate knowledge from different disciplines. This study applies genetic analysis that identifies the evolution of the H5N1 virus in space and time, epidemiological analysis that determines socio-ecological factors associated with H5N1 occurrence, and statistical analysis that identifies outbreak clusters, and then applies a methodology to formally integrate the findings of the three sets of methodologies. The present study is novel in two respects. First, it makes an initial attempt to use genetic sequences and space-time data to create a space-time phylogenetic tree to estimate and map the virus' ability to spread. Second, by integrating the results we are able to generate insights into the space-time occurrence and spread of H5N1 that we believe have a higher level of corroboration than is possible when analysis is based on only one methodology. Our research identifies links between the occurrence of H5N1 by area and a set of socio-ecological factors including altitude, population density, poultry density, and the shortest path distances to inland water, coastlines, migration routes, railways, and roads. This study seeks to lay a solid foundation for the interdisciplinary study of this and other influenza outbreaks. It will provide substantive information for containing H5N1 outbreaks.

Talk Slides, Recommended Reading

   
January 17, 2017    
 
"Midterm Review"
by Wendy Lou, University of Toronto

   
January 10, 2017    
 
"Evaluation and Feedback"
by Wendy Lou, University of Toronto

   
       
       
       
November 29, 2016    
 
"SPEED Presentations - Part II"

Presentation Schedule

   
November 22, 2016    
 
"SPEED Presentations - Part I"

Talk Slides


   
November 15, 2016    
 
"Investigating Conditional Dependence Structure in Health Utility Instruments Using Graphical Models"
by
 Nicholas Mitsakakis, University Health Network

Economic evaluations of treatments or interventions rely on accurate estimation of health utility, a single global measure of health related quality of life (HRQoL). Health utilities are often measured with the use of questionnaire-type instruments, containing a number of items describing specific domains of HRQoL. The construction of these instruments relies on multi-attribute utility theory, assuming independence among the attributes. This property is rarely tested empirically. Graphical models are statistical tools that can be used for modeling and representing conditional dependences among jointly distributed random variables. In this talk, I will discuss the application of discrete graphical models to patient data for investigating the conditional dependence structure of a prostate cancer-specific utility instrument.      

Talk Slides

   
November 7, 2016    
 
"Statistical Methods in Health Economics"
by
 Eleanor Pullenayegum, Hospital for Sick Children

Health economics affects all of us, and statisticians have an important role to play in generating the evidence upon which economic evaluations are based. This talk will provide a brief introduction to how both cost and effectiveness outcomes are considered when evaluating a new drug or therapy, and will give an overview of some of the challenges with analyzing cost data.

Talk Slides

   
November 1, 2016    
 
"A Career in Biostatistics"
by
Paul Mahoney, Roche Canada

The presentation will focus on some personal perspectives on a career in biostatistics, including advice on applying for job opportunities in industry.

"An Introduction to Pediatric Cancer Drug Development"
by
Matthew Kowgier, Roche Canada

Despite progress in cancer research, cancer remains the leading cause of death from disease in children in Canada and the United States. The development of targeted new molecules offers the prospect of potentially more effective and less toxic therapies for children. The iMatrix Trial was developed to provide early access to innovative drugs for children with cancer, and also to provide seamless decision making for drugs in development. An introduction to the iMatrix Trial design will be given in this talk, including examples of: (1) pharmacokinetic (PK) analysis, and (2) initial efficacy analysis.

Talk Slides

   

October 25, 2016 (2-4pm, HSB 610)

   

Special Event, sponsored by DLSPH, SORA and TABA

"Some Developments and Challenges in Biomedical Data Science"
by
Robert Tibshirani, Stanford University

I will discuss some new developments in the application of statistics and data science to medicine, and some challenges that this exciting field faces. Examples from my own work that I will discuss include cancer diagnosis from DESI mass spec data, estimating the number of units of platelets needed by a hospital each day, and making treatment recommendations from observational data (electronic health records).

Poster & RSVP Link

October 18, 2016

 
"Stratified Regression Analysis of Recurrent Events with Coarsened Censoring Times"
by
Rhonda Rosychuk,  University of Alberta

Emergency Departments (EDs) are crucial health resources for all Canadians. Twenty-four hours per day, every day of the year, EDs care for medically ill and injured patients. Alberta has been one of the few provinces in Canada to have an extensive ambulatory care database that includes both hospital-based and community-based services. Presentations to Emergency Departments are part of the database and, when linked with other provincial databases, these population-based data provide an excellent opportunity for comprehensive statistical analyses. I discuss some of the challenges of data that have coarsened censoring times because of unreleased birthdate information. To evaluate the time-dependent effects of exposures, a stratified Cox regression model with time-varying coefficients is presented.

Talk Slides

   

October 11, 2016

   
 
"Predicting Flu Cases using Dynamic Regression with Google Flu Trend"
by
  Kuan Liu & Michael Moon (with Mohammad Khan, Sangook Kim), University of Toronto


Early prediction of influenza outbreaks can significantly reduce their burden and impact at the population level. Since 2008, Google has used a validated algorithm to provide region-specific estimates (GFT) of influenza activity in real time, with the hope of detecting possible flu trend patterns and subsequently helping researchers to better understand the timing of influenza outbreaks. The first objective of our study was to evaluate the association between GFT and the Canadian national surveillance registry FluWatch. The second objective of our study was to predict the annual peak of influenza cases using GFT. Non-parametric Spearman's correlation and cross-correlations were used to assess the association between GFT and respiratory illness outbreaks in Canada. A number of seasonal ARIMA and dynamic regression models with GFT were applied to forecast the trend of influenza cases. Model comparisons were conducted using cross-validation. The results suggested that GFT was significantly associated with the number of influenza tests in Canada. Compared with prediction using only the FluWatch data, incorporating GFT improved the three-week forecast of the timing of influenza outbreaks.
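
To make the association step concrete, here is a small, dependency-free sketch of Spearman's rank correlation evaluated at several lags between a leading series (standing in for GFT) and a target series (standing in for surveillance counts). It is only an illustration under assumed inputs: the function names, the weekly alignment, and the toy numbers are hypothetical, and the study's actual data and software are not described in the abstract.

```python
import math
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def lagged_spearman(leading: Sequence[float], target: Sequence[float],
                    max_lag: int) -> Dict[int, float]:
    """Correlate leading[t] with target[t + lag] for lag = 0..max_lag.

    Both series are assumed to have the same length and time alignment.
    """
    result = {}
    for lag in range(max_lag + 1):
        x = leading[:len(leading) - lag]
        y = target[lag:]
        result[lag] = spearman(x, y)
    return result


# Toy series: the target repeats the leading series one step later.
toy_leading = [1, 3, 2, 5, 4, 6, 8, 7]
toy_target = [9, 1, 3, 2, 5, 4, 6, 8]
toy_correlations = lagged_spearman(toy_leading, toy_target, max_lag=2)
```

The lag with the highest correlation suggests how far ahead the leading series tends to move, which is the kind of information a dynamic regression with a lagged external regressor would then exploit.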

Talk Slides

 
   

October 4, 2016

 
"Soft Skills and Self Introduction - Part II"
by
Wendy Lou, University of Toronto


Talk Slides

 
   

September 27, 2016

   
 
"Self Introduction - Part I"
by Wendy Lou, University of Toronto


   

September 20, 2016

   
 
"Careers in Biostatistics"
by Wendy Lou, University of Toronto


   

September 13, 2016

   

"Introduction"
by Wendy Lou, University of Toronto


     
     
 


Last updated September 21, 2010
All contents copyright 2005, Dalla Lana School of Public Health, University of Toronto.