Next-generation sequencing is a common and versatile tool for biological and medical research [1]. The Ion Personal Genome Machine™ (PGM™) is a next-generation sequencing platform based on semiconductor sequencing technology. Our laboratory has developed strategies for screening for Haemophilia A, an X-linked disease caused by F8 gene mutations, using PGM™ sequencing [2]. Confirmation of positive findings by Sanger sequencing will still be required. A panel of samples with known F8 gene mutations, together with normal samples, was sequenced on the Ion PGM™ Sequencing platform with an Ion 316 chip. Initial data analysis was performed using Torrent Suite 2.2 and Variant Caller 2.2.3. However, multiple and diverse technology platforms are available, which poses challenges for data processing [3].
In this study, we explore and evaluate data analysis strategies using multiple alignment tools and variant callers. The software and platforms evaluated are Assign™ ATF 454 [4], Avadis NGS [5] and Galaxy, an interactive and reproducible genomics web portal [6]. We describe the analysis steps for next-generation sequencing data obtained from multiple runs of F8 samples, including quality checking and mapping to a reference genome. We evaluate error rates in homopolymer regions, SNP calling and indel identification (true vs. false positives) using multiple web-based portals and alignment tools.
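The true vs. false positive evaluation described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Variant` record and the `variant_type`, `homopolymer_run` and `score_calls` helpers are hypothetical names introduced for illustration, and the sketch assumes that called and known variants are already normalized to the same representation, so that an exact match on chromosome, position, reference and alternate allele is meaningful.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not tied to any tool's output)."""
    chrom: str
    pos: int   # 1-based position of the first reference base
    ref: str
    alt: str


def variant_type(v: Variant) -> str:
    """Classify a variant as a SNP (single-base substitution) or an indel."""
    return "SNP" if len(v.ref) == 1 and len(v.alt) == 1 else "indel"


def homopolymer_run(reference: str, pos: int) -> int:
    """Length of the run of identical bases that contains 1-based `pos`."""
    i = pos - 1
    base = reference[i]
    start = i
    while start > 0 and reference[start - 1] == base:
        start -= 1
    end = i
    while end + 1 < len(reference) and reference[end + 1] == base:
        end += 1
    return end - start + 1


def score_calls(called: Iterable[Variant], truth: Iterable[Variant]) -> Counter:
    """Count true positives, false positives and false negatives per variant type.

    Assumes exact matching on (chrom, pos, ref, alt) is adequate, i.e. both
    sets use the same normalized representation.
    """
    called_set, truth_set = set(called), set(truth)
    counts: Counter = Counter()
    for v in called_set:
        counts[(variant_type(v), "TP" if v in truth_set else "FP")] += 1
    for v in truth_set - called_set:
        counts[(variant_type(v), "FN")] += 1
    return counts


if __name__ == "__main__":
    # Toy data only; these are not results from the study.
    truth = [Variant("chrX", 100, "A", "G"), Variant("chrX", 200, "CT", "C")]
    called = [Variant("chrX", 100, "A", "G"), Variant("chrX", 300, "G", "GA")]
    for key, n in sorted(score_calls(called, truth).items()):
        print(key, n)
    print(homopolymer_run("ACGTTTTTAC", 5))
```

Running `score_calls` once per variant caller against the same set of known mutations gives per-tool counts that can be compared directly, and `homopolymer_run` could be used to stratify indel false positives by the length of the surrounding homopolymer.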