Customization of a DADA2-based pipeline for fungal internal transcribed spacer 1 (ITS1) amplicon data sets
Thierry Rolling, Bing Zhai, John Frame, Tobias M. Hohl, Ying Taur
Resource and Technical Advance Infectious disease Microbiology

Abstract

Identification and analysis of fungal communities commonly rely on internal transcribed spacer–based (ITS-based) amplicon sequencing. There is no gold standard used to infer and classify fungal constituents since methodologies have been adapted from analyses of bacterial communities. To achieve high-resolution inference of fungal constituents, we customized a DADA2-based pipeline using a mix of 11 medically relevant fungi. While DADA2 allowed the discrimination of ITS1 sequences differing by single nucleotides, quality filtering, sequencing bias, and database selection were identified as key variables determining the accuracy of sample inference. Due to species-specific differences in sequencing quality, default filtering settings removed most reads that originated from Aspergillus species, Saccharomyces cerevisiae, and Candida glabrata. By fine-tuning the quality filtering process, we achieved an improved representation of the fungal communities. By adapting a wobble nucleotide in the ITS1 forward primer region, we further increased the yield of S. cerevisiae and C. glabrata sequences. Finally, we showed that a BLAST-based algorithm based on the UNITE+INSD or the NCBI NT database achieved a higher reliability in species-level taxonomic annotation compared with the naive Bayesian classifier implemented in DADA2. These steps optimized a robust fungal ITS1 sequencing pipeline that, in most instances, enabled species-level assignment of community members.

Figure 2

Performance of DADA2 on the mock community data set.

(A) Strain resolution of DADA2 (ASV) compared with UPARSE (OTU). The balanced community has equal 18S rDNA copy number-normalized amounts of DNA per strain. The extreme 1 community includes equal 18S rDNA copy number-normalized amounts of DNA per strain, except for A. fumigatus and M. guilliermondii, which were included at 50-fold dilution. The extreme 2 community includes equal 18S rDNA copy number-normalized amounts of DNA per strain, except for A. fischeri and M. caribbica, which were included at 50-fold dilution. (B) Representative quality profile of raw reads that were denoised into exact sequence matches to A. fumigatus and C. albicans. The line represents the median Phred score at that position, while the shaded area represents the 25th to 75th percentiles. (C) Impact of varying truncQ and maxEE on the number of species-specific reads.
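
To make the two filtering parameters in panel C concrete, the following minimal Python sketch illustrates how truncQ and maxEE are commonly described as acting on a single read: the read is truncated at the first base whose Phred score is at or below truncQ, and the truncated read is discarded if its expected errors, computed as the sum of 10^(-Q/10) over the remaining bases, exceed maxEE. This is an illustration only, not the DADA2 implementation or the authors' code; the function names, the `min_len` cutoff, and the example quality scores are assumptions chosen for the sketch.

```python
from typing import List


def expected_errors(quals: List[int]) -> float:
    """Sum of per-base error probabilities implied by Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def truncate_at_quality(quals: List[int], trunc_q: int) -> List[int]:
    """Keep only the bases before the first score that is <= trunc_q."""
    for i, q in enumerate(quals):
        if q <= trunc_q:
            return quals[:i]
    return quals


def passes_filter(quals: List[int], trunc_q: int, max_ee: float,
                  min_len: int = 20) -> bool:
    """Apply truncation, then a length check, then the expected-error cutoff.

    min_len is an illustrative assumption for this sketch.
    """
    kept = truncate_at_quality(quals, trunc_q)
    if len(kept) < min_len:
        return False
    return expected_errors(kept) <= max_ee


# Hypothetical read: 60 high-quality bases followed by a low-quality tail.
read_quals = [38] * 60 + [15] * 40

for trunc_q, max_ee in [(2, 2.0), (2, 1.0), (15, 1.0)]:
    kept = passes_filter(read_quals, trunc_q, max_ee)
    print(f"truncQ={trunc_q}, maxEE={max_ee}: kept={kept}")
```

In this toy read, the low-quality tail dominates the expected-error total, so whether the read survives depends on how the two thresholds interact. The same interaction is one plausible reason that species whose reads differ in quality profile can be retained at different rates under identical settings.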

Copyright © 2025 American Society for Clinical Investigation
ISSN 2379-3708
