
Investigating the Frequency of DNA Words near Loci Associated with Human Complex Traits

Abstract:

Genome-wide association studies and expression quantitative trait loci (eQTL) studies have identified thousands of variants associated with complex diseases and gene expression levels. The frequency of DNA words near these variants has not been extensively evaluated. Such words may help explain the biological role of trait-associated variants and may also aid their identification in future studies.

An exact word-counting method was developed to investigate the hypothesis that short DNA words have different frequencies near single nucleotide polymorphisms (SNPs) associated with (1) Alzheimer’s disease and (2) thyroid eQTLs, compared to the rest of the genome.
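
As an illustration of what exact word counting involves, the minimal sketch below counts every overlapping word of length k in a set of sequence windows and converts the counts to relative frequencies. It is not the thesis code: the function names, the toy sequences, and the choices to count overlapping words and to skip words containing non-ACGT bases are all assumptions made for this example.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_words(sequence, k):
    """Count every overlapping word of length k in one sequence.

    Words containing anything other than A, C, G, or T (for example N)
    are skipped. This handling is an assumption of the sketch.
    """
    sequence = sequence.upper()
    counts = Counter()
    for start in range(len(sequence) - k + 1):
        word = sequence[start:start + k]
        if set(word) <= VALID_BASES:
            counts[word] += 1
    return counts


def word_frequencies(windows, k):
    """Pool word counts over several windows and return relative frequencies."""
    pooled = Counter()
    for window in windows:
        pooled.update(count_words(window, k))
    total = sum(pooled.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in pooled.items()}


# Hypothetical toy windows: sequence around trait-associated SNPs
# versus sequence around control positions.
snp_windows = ["ACGCGT", "GCGCAA"]
control_windows = ["ATATAT", "TTAACG"]

snp_freq = word_frequencies(snp_windows, k=2)
control_freq = word_frequencies(control_windows, k=2)

# Compare the frequency of each word between the two sets.
for word in sorted(set(snp_freq) | set(control_freq)):
    print(word, snp_freq.get(word, 0.0), control_freq.get(word, 0.0))
```

In a real analysis the windows would be extracted from a reference genome around each SNP, and the frequency differences would be assessed with a statistical test rather than compared by eye.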

No significant DNA words were found near SNPs associated with Alzheimer’s disease (AD). Some GC-rich words were significantly more frequent around thyroid eQTLs than around controls. These DNA words were no longer significant when the controls were matched for nucleotide frequency, but this is likely due to over-matching.

GitHub link:
Click to view code

Supervisors:
Dr. Jo Knight (Reader) Lancaster University, United Kingdom.
Dr. Andrew D. Paterson (Senior Scientist) The Hospital for Sick Children, Canada.

Advisors:
Dr. Michael Wilson (Professor) The Hospital for Sick Children, Canada.
Dr. Mario Masellis (Associate Scientist) Sunnybrook Health Sciences Centre, Canada.

Examiners:
Dr. Michael Hoffman (Assistant Professor) The Hospital for Sick Children, Canada.
Dr. Boris Steipe (Associate Professor) University of Toronto, Canada.
Dr. James C. Engert (Associate Professor) McGill University, Canada.

Thesis Defence Date:
27 July, 2018
Link to Thesis: TBA
