2. Why Patchwork?
Most large-scale phylogenetic studies to date use a "genome reduction" sequencing technique such as transcriptome sequencing or target enrichment. Unfortunately, transcriptomics requires a large amount of fresh material, so most specimens would have to be collected anew. Most research collections preserve their specimens in ethanol, which renders them unusable for such an approach. Likewise, target capture approaches require baits to be designed, which becomes increasingly difficult as the distance between the organisms of interest increases. Furthermore, any subsequent study would have to use the same markers, and the data produced have little utility aside from phylogenetic studies.
Although whole-genome sequencing (WGS) is becoming increasingly affordable, it has typically been underutilized for phylogenetic studies, most likely due to the lack of appropriate software tools for working with such data. Utilizing WGS for genome-scale phylogenetic analyses could dramatically increase taxon sampling and data reusability, as the data produced could potentially be used for many different types of analyses. All of this has motivated us to design Patchwork, a new program for retrieving phylogenetic markers directly from WGS data. More specifically, we wanted to overcome the problems typically associated with assemblies resulting from low-coverage whole-genome sequencing (LC-WGS) approaches. Because such assemblies are fragmented, the target gene may be located in different regions of a contig or spread across two or more contigs. Therefore, the obtained regions are sliced and merged in such a way that none of the hits overlap. For each sequence that you put in, you obtain one continuous, homologous sequence that can be used in a phylogenomic context.
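To make the idea of slicing and merging overlapping hits more concrete, here is a minimal Python sketch. It is not Patchwork's actual implementation: the `Hit` class, the `merge_hits` function, the scoring rule (higher-scoring hits win in overlapping regions) and the use of `-` for uncovered positions are all assumptions made for illustration. It also assumes each hit is ungapped and already placed on reference coordinates.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical alignment hit, already mapped onto the reference."""
    start: int      # 0-based start position on the reference (inclusive)
    residues: str   # one residue per reference position, no gaps
    score: float    # alignment score; higher is better


def merge_hits(hits, ref_length, gap="-"):
    """Merge hits into one continuous sequence without overlaps.

    Hits are visited from best to worst score. Each hit only fills
    reference positions that are still empty, so the overlapping part
    of a lower-scoring hit is effectively sliced away.
    """
    merged = [None] * ref_length
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        for offset, residue in enumerate(hit.residues):
            pos = hit.start + offset
            if 0 <= pos < ref_length and merged[pos] is None:
                merged[pos] = residue
    # Positions that no hit covered are written as gaps.
    return "".join(gap if r is None else r for r in merged)


# Three hits, possibly from different contigs; the second overlaps the first.
example_hits = [
    Hit(start=0, residues="MKTAY", score=50.0),
    Hit(start=3, residues="XXIAK", score=30.0),
    Hit(start=10, residues="QR", score=10.0),
]
print(merge_hits(example_hits, ref_length=12))  # MKTAYIAK--QR
```

In this toy example the second hit loses its first two residues because those positions are already covered by a better-scoring hit, and the result is a single sequence with the same length as the reference.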