Evaluation of Automated Next Generation Sequencing Data Analysis Pipeline and Different Enrichment Systems Using Sanger Sequencing

Laboratorinė medicina. 2018, vol. 20, no. 2, p. 123-129

Purpose. To evaluate the automated data analysis pipeline of the SOLiD next generation sequencing platform by determining its accuracy, sensitivity and specificity, and to evaluate the target enrichment systems used (TargetSeq and SureSelect), with Sanger sequencing as the reference method.

Participants and methods. The study included data from 96 individuals. Sixty genome variants detected by next generation sequencing were investigated using Sanger sequencing and other analysis tools. Coverage values of the next generation sequencing data were used to compare the TargetSeq and SureSelect target enrichment systems.

Results. Six variants were identified incorrectly or could not be identified at all because of Sanger sequencing limitations or errors. Two variants were erroneously identified by the automated data analysis pipeline. The mean coverage with the SureSelect and TargetSeq enrichment systems was 32.77 and 31.58, respectively. Based on these values, the coverage of the SureSelect enrichment system is 3.65% higher. More genome variants were also identified using the SureSelect enrichment system. However, the TargetSeq system was able to identify unique variants.

Finally, the accuracy, sensitivity and specificity of the automated SOLiD next generation sequencing data analysis pipeline were estimated to be 99.66%, 99.22% and 99.7%, respectively.
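
For reference, these three measures are conventionally derived from the counts of true and false positive and negative calls. The Python sketch below illustrates the standard definitions only. The function name and the counts are hypothetical and chosen for easy arithmetic; they are not the study's data, and the abstract does not state how its own counts were tallied.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) as fractions.

    tp, fp, tn, fn are counts of true positive, false positive,
    true negative and false negative variant calls.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("each metric needs a non-zero denominator")
    accuracy = (tp + tn) / total      # share of all calls that are correct
    sensitivity = tp / (tp + fn)      # share of real variants that were detected
    specificity = tn / (tn + fp)      # share of non-variant sites called correctly
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (NOT taken from the study).
acc, sens, spec = confusion_metrics(tp=90, fp=2, tn=198, fn=10)
print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
```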

Conclusions. These results indicate that there is no need to verify next generation sequencing data using Sanger sequencing when the accuracy, sensitivity and specificity of the automated analysis algorithm are high.

Comparison of the TargetSeq and SureSelect coverage values showed that the difference between the means is not statistically significant. However, both systems were able to identify unique variants, so the most effective way to accurately identify genome variants is to use both enrichment systems.
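
The abstract does not name the statistical test used to compare the mean coverage values. As one possible illustration, the sketch below computes Welch's t statistic and its approximate degrees of freedom for two independent samples, using only the Python standard library. Both the choice of test and the coverage values are assumptions made for the example, not the study's method or data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)   # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb             # squared standard error of the mean difference
    t = (mean(a) - mean(b)) / sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical per-sample mean coverage values (NOT taken from the study).
coverage_a = [30.0, 34.0, 32.0, 36.0]
coverage_b = [29.0, 33.0, 31.0, 35.0]
t_stat, dof = welch_t(coverage_a, coverage_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A t statistic that is small relative to the critical value for the computed degrees of freedom would be consistent with a non-significant difference between means. Obtaining a p-value requires the t distribution, for example from a statistics package.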

© 2024, Lithuanian Society of Laboratory Medicine