Loading...
Please wait, while we are loading the content...
Similar Documents
Haplotyping as Perfect Phylogeny: a Direct Approach Haplotyping as Perfect Phylogeny: a Direct Approach
| Content Provider | Semantic Scholar |
|---|---|
| Author | Yooseph, Shibu Bafna, Vineet Lancia, Giuseppe |
| Copyright Year | 2002 |
| Abstract | A full Haplotype Map of the human genome will prove extremely valuable as it will be used in large scale screens of populations to associate speci c haplotypes with speci c complex genetic in uenced diseases A prototype Haplotype Mapping strategy is presently being nalized by an NIH working group The biological key to that strategy is the surprising fact that genomic DNA can be partitioned into long blocks where genetic recombination has been rare leading to strikingly fewer distinct haplotypes in the population than previously expected In this paper we explore the algorithmic implications of the no recombination in long blocks observation for the problem of inferring haplotypes in populations This assumption together with the standard population genetic assumption of in nite sites justi es a model of haplotype evolution where the haplotypes in a population are assumed to evolve along a coalescent which as a rooted tree is a perfect phylogeny We consider the following algorithmic problem called Perfect Phylogeny Haplotyping problem PPH which was introduced by Gus eld given n genotypes does there exist a set of at most n haplotypes such that each genotype is generated by a pair of haplotypes from this set and such that this set can be derived on a perfect phylogeny The approach taken in to solve this problem reduces it to established very deep results and algorithms from matroid and graph theory Although that reduction is quite simple and the resulting algorithm nearly optimal in speed a linear time lower bound is necessary taken as a whole that approach is quite involved and in particular challenging to program Moreover anyone wishing to fully establish by reading existing literature the correctness of the entire algorithm would need to read several deep and di cult papers in graph and matroid theory A stated open goal in was to nd a simpler more direct yet still e cient algorithm via a self contained approach not needing deep matroid or graph theorems This paper accomplishes that goal for both the rooted and unrooted PPH problems It establishes a very simple easy to program O nm time algorithm that determines whether there is a PPH solution for input genotypes and produces a linear space data structure to represent all of the solutions The approach allows complete although not trivial self contained proofs In addition to algorithmic simplicity the approach here makes the representation of all solutions more intuitive than in and solves another open goal from that paper namely to prove a non trivial upper bound on the number of PPH solutions showing that that number is vastly smaller than the number of haplotype solutions each solution being a set of n pairs of haplotypes that can generate the genotypes when the perfect phylogeny requirement is not imposed |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www.cs.ucdavis.edu/research/tech-reports/2002/CSE-2002-21.pdf |
| Alternate Webpage(s) | http://www.dei.unipd.it/~lancia/psdir/pp.ps.gz |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |