|Abstract||Whole-genome and exome sequencing enable the rapid generation of large volumes of data. The technique has been the subject of extensive research and has broad applications in healthcare and medicine. Next generation sequencing (NGS) has the advantage of far higher throughput than the traditional method of Sanger sequencing. NGS enables the identification of disease-causing genetic variants, thus improving the quality of healthcare, diagnostics and biomedical research.
One of the major challenges of NGS is the analysis of the large volumes of data it generates. The diversity of DNA library preparation methods across the available platforms may introduce data inaccuracies. Furthermore, variant calling accuracy differs between algorithms, which complicates NGS data analysis. As a result, alignment and/or chemistry errors can readily produce false positive and/or false negative results.
In this project, we used the MiSeq platform, selected for its cost-effectiveness and its ability to provide rapid genetic analysis. An autism panel targeting 101 genes linked specifically to autism was used to investigate genomic features associated with the disorder. We hypothesized that we could devise NGS analysis criteria to distinguish false positive and/or false negative sequencing calls and thereby improve the quality of the generated sequencing data. A cohort of four autism patients of Arab descent served as a model for this research. We confirmed the proposed criteria by validating the detected variants with Sanger sequencing.