Haplotype phasing by multi-assembly of shared haplotypes: Phase-dependent interactions between rare variants

Bjarni V. Halldórsson*, Derek Aguiar, Sorin Istrail

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Citations (Scopus)


In this paper we propose algorithmic strategies, Lander-Waterman-like statistical estimates, and genome-wide software for haplotype phasing by multi-assembly of shared haplotypes. Specifically, we consider four types of results which together provide a comprehensive workflow for GWAS data sets: (1) statistics of multi-assembly of shared haplotypes, (2) graph theoretic algorithms for haplotype assembly based on conflict graphs of sequencing reads, (3) inference of pedigree structure through haplotype sharing via tract finding algorithms, and (4) multi-assembly of shared haplotypes of cases, controls, and trios. The input to the workflows that we consider is any combination of: (A) genotype data, (B) next generation sequencing (NGS), (C) pedigree information. (1) We present Lander-Waterman-like statistics for NGS projects for the multi-assembly of shared haplotypes. Results are presented in Sec. 2. (2) In Sec. 3, we present algorithmic strategies for haplotype assembly using NGS, NGS + genotype data, and NGS + pedigree information. (3) This work builds on algorithms presented in Halldórsson et al. [1] and is part of the same library of tools co-developed for GWAS workflows. (4) Section 3.3.1 contains algorithmic strategies for multi-assembly of GWAS data. We present algorithms for assembling large data sets and for determining and using shared haplotypes to more reliably assemble and phase the data. Workflows 1-4 provide a set of rigorous algorithms which have the potential to identify phase-dependent interactions between rare variants in linkage equilibrium which are associated with cases. They build on our extensive work on haplotype phasing [1-3], haplotype assembly [4,5], and whole genome assembly comparison [6].
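As a rough illustration of what a conflict graph of sequencing reads looks like, the sketch below uses the generic textbook formulation and is not the algorithm from the paper. It assumes each read is a mapping from heterozygous SNP site index to an observed allele (0 or 1), that two reads conflict when they disagree at a site both cover, and that the reads are error free, so a diploid sample yields a bipartite graph. The function names `conflicts`, `build_conflict_graph`, and `two_color` are hypothetical.

```python
from collections import deque
from itertools import combinations


def conflicts(a, b):
    """Return True if reads a and b disagree at any SNP site both cover."""
    return any(a[site] != b[site] for site in a.keys() & b.keys())


def build_conflict_graph(reads):
    """Build an adjacency map with one node per read and one edge per conflict."""
    graph = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        if conflicts(reads[i], reads[j]):
            graph[i].add(j)
            graph[j].add(i)
    return graph


def two_color(graph):
    """Assign each read to haplotype 0 or 1 by BFS 2-coloring.

    Returns None when the graph contains an odd cycle, meaning the reads
    cannot be split into two conflict-free groups without removing or
    correcting some of them.
    """
    color = {}
    for start in graph:
        if start in color:
            continue
        color[start] = 0  # arbitrary choice for each connected component
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return None
    return color


# Assumed toy input: four reads over SNP sites 0..3.
reads = [{0: 0, 1: 1}, {1: 0, 2: 1}, {0: 0, 1: 1, 2: 0}, {2: 1, 3: 0}]
assignment = two_color(build_conflict_graph(reads))
```

In this toy input, reads 0 and 2 end up in one group and reads 1 and 3 in the other. With real data, sequencing errors typically make the graph non-bipartite, which is where optimization formulations of haplotype assembly come in; the sketch only detects that case.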

Original language: English
Title of host publication: Pacific Symposium on Biocomputing 2011, PSB 2011
Number of pages: 12
Publication status: Published - 2011
Event: 16th Pacific Symposium on Biocomputing, PSB 2011 - Kohala Coast, HI, United States
Duration: 3 Jan 2011 to 7 Jan 2011

Publication series

Name: Pacific Symposium on Biocomputing 2011, PSB 2011


Conference: 16th Pacific Symposium on Biocomputing, PSB 2011
Country/Territory: United States
City: Kohala Coast, HI

Other keywords

  • haplotype assembly
  • haplotype inference
  • phase inference
  • phasing
  • rare variants

