18S: A comprehensive dataset on spatiotemporal variation of microbial plankton communities in the Baltic Sea

Occurrence
Version 1.3 published by KTH Royal Institute of Technology on Oct 9, 2023
Publication date:
9 October 2023
License:
CC0 1.0

Download the latest version of the data as a Darwin Core Archive (DwC-A), or the metadata as EML or RTF:

Data as a DwC-A file: download 99,159 records in English (52 MB) - Update frequency: not planned
Metadata as an EML file: download in English (27 KB)
Metadata as an RTF file: download in English (23 KB)

Description

A dataset covering spatiotemporal variation in eukaryotic microbial communities and physicochemical parameters of the Baltic Sea. Between January 2019 and February 2020, 281 transect-time course samples and 65 samples for protocol testing were collected from 19 stations in the Baltic Sea, Kattegat and Skagerrak. We analysed the samples with 18S ribosomal RNA (rRNA) gene and 16S rRNA gene metabarcoding, to capture the eukaryotic and prokaryotic diversity, respectively. Note that this resource was previously published as 'Metadata-only'. This dataset was published via the SBDI ASV portal (https://asv-portal.biodiversitydata.se/).

Data Records

The data in this occurrence resource have been published as a Darwin Core Archive (DwC-A), a standard format for sharing biodiversity data as a set of one or more data tables. The core data table contains 99,159 records.

Two extension data tables also exist. An extension record supplies additional information about a core record. The number of records in each extension data table is shown below.

Occurrence (core)
99159
ExtendedMeasurementOrFact 
3759729
dnaDerivedData 
99159
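
The relationship between the core table and an extension table can be illustrated with a minimal sketch. It assumes that both tables are tab-separated text with a header row and that the extension's first column holds the identifier of the core record it describes. The column names, record identifiers and values below are hypothetical placeholders, not data from this resource; in a real archive the file names, delimiters and column order are declared in its meta.xml descriptor.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature tables standing in for the Occurrence core and one
# extension. None of these identifiers or values come from the real archive.
CORE_TSV = (
    "id\tscientificName\torganismQuantity\n"
    "occ-1\ttaxon A\t120\n"
    "occ-2\ttaxon B\t45\n"
)
EXT_TSV = (
    "id\tmeasurementType\tmeasurementValue\n"
    "occ-1\tsalinity\t6.8\n"
    "occ-1\ttemperature\t4.2\n"
    "occ-2\tsalinity\t31.0\n"
)


def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attach_extension(core_rows, ext_rows, key="id"):
    """Group extension rows under the core record they refer to."""
    by_core = defaultdict(list)
    for row in ext_rows:
        by_core[row[key]].append(row)
    return {
        row[key]: {"core": row, "extension": by_core.get(row[key], [])}
        for row in core_rows
    }


joined = attach_extension(read_tsv(CORE_TSV), read_tsv(EXT_TSV))
for core_id, record in joined.items():
    print(core_id, record["core"]["scientificName"], len(record["extension"]))
```

A one-to-many grouping like this matches the counts listed above, where the ExtendedMeasurementOrFact table holds many more rows than the core, while dnaDerivedData has exactly one row per core record.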

This IPT archives the data and thus serves as the data repository. The data and resource metadata are available for download in the downloads section. The versions table lists other versions of the resource that have been made publicly available and allows tracking changes made to the resource over time.

Versions

The table below shows only published versions of the resource that are publicly accessible.

How to cite

Please note that this is an older version of the dataset. Users should cite this work as follows:

Latz M, Andersson A, Brugel S, Hedblom M, Jurdzinski K, Karlson B, Lindh M, Lycken J (2023). 18S: A comprehensive dataset on spatiotemporal variation of microbial plankton communities in the Baltic Sea. Version 1.3. KTH Royal Institute of Technology. Occurrence dataset. https://www.gbif.se/ipt/resource?r=prjeb55296-18s&v=1.3

Rights

Users must respect the following usage rights:

The publisher and rights holder of this work is KTH Royal Institute of Technology. To the extent possible under law, the publisher has waived all rights to these data and has dedicated them to the Public Domain (CC0 1.0). Users may copy, modify, distribute and use the work, including for commercial purposes, without restriction.

GBIF Registration

This resource has been registered with GBIF under the following UUID: 6bce37d6-a682-4cca-89c4-7464cefa65e9. KTH Royal Institute of Technology publishes this resource and is registered in GBIF as a data publisher endorsed by GBIF Sweden.

Keywords

Occurrence; Observation; Baltic Sea; brackish water; microbial plankton; eukaryotic plankton; prokaryotic plankton; 16S rRNA metabarcoding; 18S rRNA metabarcoding; marine monitoring; salinity; temporal variation

Contacts

Meike Latz
  • Originator
Postdoc
KTH Royal Institute of Technology
Agneta Andersson
  • Owner
  • Originator
Professor
Umeå universitet
Sonie Brugel
  • Originator
Senior research engineer
Umeå universitet
Mikael Hedblom
  • Originator
Researcher
Swedish Meteorological and Hydrological Institute
Krzysztof Jurdzinski
  • Originator
PhD student
KTH Royal Institute of Technology
Bengt Karlson
  • Owner
Senior researcher
Swedish Meteorological and Hydrological Institute
Markus Lindh
  • Originator
Senior researcher
Swedish Meteorological and Hydrological Institute
Jenny Lycken
  • Originator
Research assistant
Swedish Meteorological and Hydrological Institute
Anders Andersson
  • Owner
  • Originator
  • Point Of Contact
Professor
KTH Royal Institute of Technology
Bengt Karlson
  • Owner
Researcher
Swedish Meteorological and Hydrological Institute

Geographic Coverage

The samples were collected at 19 stations distributed along the Baltic Sea, Kattegat and Skagerrak.

Bounding coordinates: minimum latitude and longitude [54.97, 10.5], maximum latitude and longitude [65.8, 22.45]
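
As a small illustration of how the bounding coordinates can be used, the sketch below checks whether a point falls inside the stated box. It assumes the bracketed pairs are [latitude, longitude] in decimal degrees, as the labels indicate. The class name and the two test points are hypothetical and are not station positions from this dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned latitude/longitude box in decimal degrees."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, lat, lon):
        """Return True if the point lies inside the box, edges included."""
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


# Bounding coordinates as stated in the geographic coverage above.
DATASET_BOX = BoundingBox(min_lat=54.97, min_lon=10.5, max_lat=65.8, max_lon=22.45)

# Hypothetical points used only to exercise the check.
print(DATASET_BOX.contains(58.0, 18.0))  # True
print(DATASET_BOX.contains(52.5, 13.4))  # False
```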

Taxonomic Coverage

Eukaryotic plankton

Domain: Eukaryota

Temporal Coverage

Start Date / End Date: 2019-01-10 / 2020-02-20

Project Data

Metadata also available here: https://figshare.com/s/b2962b2174747c6bc869

Title: A comprehensive dataset on spatiotemporal variation of microbial plankton communities in the Baltic Sea
Funding: This work was supported by the Swedish Agency for Marine and Water Management and the Swedish Environmental Protection Agency under grant number NV-03728-17, and the MACL was additionally supported by a research grant (34442) from VILLUM FONDEN.

Sampling Methods

In total, 281 transect-time course samples and 65 samples for protocol testing were collected from January 2019 to February 2020 at 19 stations in the Baltic Sea, Kattegat, and Skagerrak, during monthly/bi-weekly sampling cruises. The 281 transect-time course samples were collected during cruises that were part of the Swedish National Marine Monitoring Program implemented by the Swedish Meteorological and Hydrological Institute (SMHI), Umeå University (UU) and Stockholm University (SU) on different research vessels, specified for each sample by the vessel's ICES (International Council for the Exploration of the Sea) platform code. Samples were collected and physicochemical parameters measured using a Conductivity Temperature Depth (CTD) profiling instrument (model SBE 911plus/SBE19+, Sea Bird Electronics Inc., Bellevue, Washington, USA) deployed on a rosette (model SBE32). Water for the microbial analyses was sampled with a depth-integrating hose covering the depth of 0-10 m. At stations B1 and BY31 the depth covered was 0-20 m, and at station RÅNEÅ-1 it was 0-5 m. Physicochemical parameters were measured at 0, 5, and 10 m depth. Samples for these measurements were collected using Niskin bottles.

Study Extent: Between January 2019 and February 2020, 281 transect-time course samples were collected from 19 stations in the Baltic Sea, Kattegat, and Skagerrak. The stations covered the salinity gradient of the Baltic Sea towards the opening to the Atlantic through the Kattegat and Skagerrak, with average (over time) salinity ranging from 2 PSU in the Bothnian Bay to 31 PSU in the Skagerrak.
Quality Control: On three sampling occasions (2019-05-06, 2019-08-06, 2019-10-07), an additional 59 and 6 sample replicates were collected at station SLÄGGÖ to test the effect of sample filtering volume (10, 100, 200 and 500 mL) and filter storage temperature (-20 °C and -80 °C), respectively. To control for potential contamination during DNA extraction, DNA was extracted from blank filters.

Method step description:

  1. Sampling. In total, 281 transect-time course samples and 65 samples for protocol testing were collected from January 2019 to February 2020 at 19 stations in the Baltic Sea, Kattegat, and Skagerrak (Fig. 2a), during monthly/bi-weekly sampling cruises. The 281 transect-time course samples were collected during cruises that were part of the Swedish National Marine Monitoring Program implemented by the Swedish Meteorological and Hydrological Institute (SMHI), Umeå University (UU) and Stockholm University (SU) on different research vessels, specified for each sample by the vessel's ICES (International Council for the Exploration of the Sea) platform code. Samples were collected and physicochemical parameters measured using a Conductivity Temperature Depth (CTD) profiling instrument (model SBE 911plus/SBE19+, Sea Bird Electronics Inc., Bellevue, Washington, USA) deployed on a rosette (model SBE32). Water for the microbial analyses was sampled with a depth-integrating hose covering the depth of 0-10 m. At stations B1 and BY31 the depth covered was 0-20 m, and at station RÅNEÅ-1 it was 0-5 m. Physicochemical parameters were measured at 0, 5, and 10 m depth. Samples for these measurements were collected using Niskin bottles. On three sampling occasions (2019-05-06, 2019-08-06, 2019-10-07), an additional 59 and 6 sample replicates were collected at station SLÄGGÖ to test the effect of sample filtering volume (10, 100, 200 and 500 mL) and filter storage temperature (-20 °C and -80 °C), respectively.

DNA extraction and sequencing. For DNA analyses, 500 mL of seawater were filtered onto a 47 mm membrane filter of 0.22 µm pore size (GSWP04700, MilliporeSigma, Burlington, MA, USA) using a filter funnel with a < 270 mbar/200 mm Hg vacuum. The filtration was initiated within one hour after sampling, and the filtration time was kept below one hour unless otherwise noted. Subsequently, the filters were rolled into a 5 mL cryotube, flash-frozen in liquid nitrogen and stored at -20 °C until further processed.
In short, DNA was extracted using a previously established protocol [18], libraries were prepared for metabarcoding of 16S rRNA [11] and 18S rRNA [19,20], and sequenced on Illumina MiSeq flow cells with an average output of 0.13 million paired-end read pairs per sample (0.171 for 16S and 0.095 for 18S). DNA extraction from filters was performed using the ZymoBIOMICS™ DNA Miniprep Kit (Zymo Research Corp, Irvine, CA, USA) following the manufacturer’s instructions with a few modifications [18]: after adding the lysis buffer to the filter (and before bead-beating), 10 µL of spike-in DNA were added to each sample (described in the next section). The bead-beating time was optimised to 10 min, and 50 µL instead of 100 µL were used for elution of DNA from the column. The concentration and quality of the DNA were assessed using the Qubit™ dsDNA HS Assay Kit on a Qubit Fluorometer (Thermo Fisher, Waltham, MA, USA) and an Agilent DNA High Sensitivity Kit on a 2100 Bioanalyzer instrument (Agilent Technologies, Santa Clara, CA, USA). Sequencing libraries for 18S rRNA metabarcoding targeting the hypervariable V4 region of the eukaryotic 18S rRNA gene were prepared using the primers V4F CCAGCASCYGCGGTAATTCC and V4RB ACTTTCGTTCTTGATYRR [19] with the simplified PCR protocol described in [20]. Libraries for 16S rRNA metabarcoding targeting the hypervariable V3-V4 regions of the bacterial 16S rRNA gene were prepared at NGI following the protocol [21,22] with the primers 341F CCTACGGGNGGCWGCAG and 805R GACTACHVGGGTATCTAATCC [11]. The primers were supplemented with 5’-end Illumina sequence adapters (forward: 5’-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-3’, reverse: 5’-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3’) and ordered from IDT DNA (IA, US) at 100 μM in TE buffer. To increase the complexity of the libraries, phased primers [22,23] were used for the 18S forward primer, with equal proportions of primers having ATG, TG, G, or no base inserted between the adapter sequence and the target-binding region.
For 16S, phasing was used on both primers, with CTAGAGT, TAGAGT, etc. for the forward and ACTACTG, CTACTG, etc. for the reverse. The PCR reactions were carried out with the KAPA HiFi HotStart ReadyMix PCR Kit (Kapa Biosystems, MA, USA), according to the manufacturer’s instructions, with the final 25 µL reaction mix containing 1x Kapa HiFi HotStart ReadyMix, 0.3 μM of each primer, and 5 ng template DNA for 18S library preparation and 1 ng for 16S. For 18S rRNA amplification the PCR conditions were 95 °C for 3 min, 30 cycles of 98 °C for 20 s, 52 °C for 15 s and 72 °C for 15 s, followed by a final elongation step of 72 °C for 2 min. For 16S rRNA amplification the following PCR conditions were used: 98 °C for 2 min, 28 cycles of 98 °C for 20 s, 54 °C for 20 s and 72 °C for 15 s, followed by a final elongation step of 72 °C for 2 min. The PCR product was cleaned with magnetic beads using the MagSi-NGS PREP Plus Kit (MDKT00010075, magtivio BV., Nuth, the Netherlands) and indexed through a second PCR with Kapa HiFi HotStart ReadyMix. Equimolar pooling and sequencing on three MiSeq lanes (Illumina Inc, San Diego, CA, US) for 18S and 16S rRNA metabarcoding, respectively, were performed at SciLifeLab/NGI (Solna, Sweden). The PCR conditions for indexing were 95 °C for 2 min, 8 cycles of 98 °C for 20 s, 55 °C for 30 s and 72 °C for 30 s, followed by a final elongation step of 72 °C for 2 min. The Adapterama indexing scheme was used [24,25], with unique forward and reverse indices for every sample sequenced together.

Processing of sequencing data. Initially, sequences of phased primers were removed from the reads using a snakemake pipeline [30] that utilises cutadapt [31].
The pipeline conducts the following steps: it removes read-pairs containing Illumina adapters; removes read-pairs that do not contain the expected primer sequences in the 5’ ends of the reads and removes the primer sequences from the remaining reads; removes read-pairs that contain primer sequences anywhere else on the reads; and trims reads to fixed lengths. Further analyses of sequencing data and plotting of the data were performed in R version 4.0.3 using the packages ‘DADA2’ [32] version 1.18.0, ‘vegan’ [33] version 2.5-7, and ‘ggplot2’ [34] version 3.4.0. The median sequencing depth was 0.13 M read pairs per sample, with >80% of reads having a quality score > 30 for both 18S and 16S rRNA amplicons. The package ‘DADA2’ was used to infer biological sequence variants from amplicon reads; the individual sequencing runs were processed separately and merged after obtaining the sequence tables. Low-quality reads were filtered out. The remaining reads were denoised, and forward and reverse reads were merged. This resulted in 10,293 amplicon sequence variants (ASVs) for 18S rRNA and 40,369 ASVs for 16S rRNA across 346 samples. Taxonomy of the ASVs was inferred with ‘assignTaxonomy’ using PR2 [25] version 4.14.0 as a training set for 18S rRNA amplicons and a curated version of the 16S sequences of GTDB (version R06-RS202-1) [35] for 16S rRNA amplicons. For the analyses of the data presented in this publication, one 18S sample with an unusually high read number was removed, and from the replicated samples one was randomly chosen. The ASVs from the spike-in DNA sequences were identified and removed from the ASV table; sequences assigned to Metazoa were also removed. Finally, read abundance per sample was rarefied to the same counts with the function ‘rrarefy’ from the ‘vegan’ package version 2.5-7, to ~44,000 for 16S and ~8,000 for 18S.

Data Records. The raw sequencing data generated in this study are available at the European Nucleotide Archive (ENA) under the study accession number PRJEB55296.
Processed sequencing data (ASV sequences with taxonomic annotations and counts in samples) are available at our repository [26] (DOI: 10.17044/scilifelab.20751373), along with the contextual, physicochemical, and microscopy data. All physicochemical data can also be downloaded through SHARKweb as described above; detailed instructions on accessing specific parts of the data are available in the repository [26]. Processed sequencing data (ASVs of 18S and 16S rRNA metabarcoding) can also be accessed and viewed interactively through the Swedish Biodiversity Infrastructure (https://biodiversitydata.se).

Technical Validation. Many of the procedures for sampling and measurement of environmental parameters are optimised and routinely performed within the Swedish National Marine Monitoring Program, commissioned by the Swedish Agency for Marine and Water Management, and the countries surrounding the Baltic Sea (HELCOM) [39]. In this study, we performed technical validations of the protocols for sampling, sample storage and processing, sequencing library preparation, and quality of the data. We compared different sample filtration volumes (10, 100, 200, 500 mL) taken in five replicates on three sampling occasions at the Släggö station (Fig. 3) to validate that 500 mL was sufficient to cover the microbial diversity. Both α-diversity measured by Shannon index and richness appeared to reach a plateau at around 200 mL sample volume (Fig. 3a,c), and the variation between the replicates decreased with sample volume up to this point (coloured dots within the violin plots). We further compared the influence of sample storage at -20 °C vs. -80 °C on three replicates over a three-month storage period (data not shown). There were no significant differences in Shannon α-diversity (Wilcoxon rank sum exact test, p-value 1 and 0.1 for 16S and 18S, respectively), while ANOSIM analysis indicated an effect on community composition, although it was not significant (ANOSIM analysis on Bray-Curtis distances, R-value 0.67 and 1, p-value 0.1 and 0.1, for 16S and 18S, respectively). Blanks (filters without sample) were sequenced to detect contamination sources during the DNA extraction procedure; no sequencing data were recovered from those samples. We tested the influence of two DNA extraction kits (Qiagen DNeasy PowerWater Kit and ZymoBIOMICS™ DNA Miniprep Kit, the latter used for the other samples of this study) on the Shannon diversity obtained from 16S and 18S rRNA metabarcoding on six water samples from two stations (N14 Falkenberg and Hanöbukten). We did not find a significant difference in α-diversity between the kits (Shannon α-diversity, Wilcoxon rank sum exact test, p-value 0.96 and 0.10 for 16S and 18S, respectively), while community composition was affected (ANOSIM analysis on Bray-Curtis distances, R-value 0.58 and 0.20, p-value 0.003 and 0.024, for 16S and 18S, respectively) (data not shown, available upon request). This calls for some caution when comparing datasets generated using these two kits. We evaluated the primers most suitable for metabarcoding of eukaryotic plankton in a previously published study [20]. To improve the sequencing quality, we used phased primers to increase the complexity of amplicon sequencing libraries [22,41]; for the 18S primer pair, phasing was only used in the forward primer. The sequencing reads were processed following the DADA2 pipeline [32] to trim and filter low-quality reads, infer true sequence variants taking the error rates of the sequencing run into consideration, and remove chimeras from the dataset.
The validity of the sequencing data is also confirmed by the fact that the salinity gradient and seasonality are reflected (Fig. 2b,c), as shown in previous studies [11,13,29].
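
The phasing of the 18S forward primer described in the library preparation step can be written out as a short sketch. It simply concatenates the forward adapter, one of the four phase inserts (ATG, TG, G, or nothing) and the V4F target-binding sequence, all taken from the text above. The function and variable names are illustrative, and the sketch does not claim to reproduce the exact oligonucleotides that were ordered.

```python
# Sequences quoted in the method description above.
ADAPTER_FWD = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"
V4F = "CCAGCASCYGCGGTAATTCC"  # S and Y are IUPAC degenerate bases
PHASE_INSERTS_18S = ["ATG", "TG", "G", ""]


def phased_primers(adapter, inserts, target):
    """Build one full-length primer per phase insert: adapter + insert + target."""
    return [adapter + insert + target for insert in inserts]


primers_18s_fwd = phased_primers(ADAPTER_FWD, PHASE_INSERTS_18S, V4F)
for primer in primers_18s_fwd:
    print(len(primer), primer)
```

Because each variant shifts the start of the amplicon by zero to three bases, reads from an equimolar primer mix no longer present the same base at the same sequencing cycle, which is the increase in library complexity the text refers to.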
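
The primer-removal step in the processing pipeline (keep only reads with the expected primer at the 5’ end, then strip it) can be illustrated with a simplified sketch. This is not the cutadapt-based snakemake pipeline used in the study: it matches exactly with no mismatch tolerance, handles single reads rather than read pairs, and the function names and the `max_offset` parameter are assumptions made for the example. Degenerate bases in the primer are expanded using the standard IUPAC nucleotide codes.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

V4F = "CCAGCASCYGCGGTAATTCC"  # 18S forward primer from the text


def primer_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer))


def trim_5prime(read, primer, max_offset=3):
    """Strip the primer, plus up to max_offset leading phase bases, from a read.

    Returns the remaining sequence, or None if the primer is not found at
    the 5' end (such reads would be discarded).
    """
    pattern = primer_regex(primer)
    for offset in range(max_offset + 1):
        match = pattern.match(read, offset)
        if match:
            return read[match.end():]
    return None


# Hypothetical read: two phase bases, one concrete V4F variant, then insert.
read = "TG" + V4F.replace("S", "G").replace("Y", "C") + "ACGTACGT"
print(trim_5prime(read, V4F))                          # ACGTACGT
print(trim_5prime("ACGTACGTACGTACGTACGTACGTAC", V4F))  # None
```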
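
Two of the calculations mentioned above, rarefying each sample to a common read count and computing the Shannon index, can be sketched in a few lines. The study used ‘rrarefy’ from the R package ‘vegan’; the Python below is a stand-in that assumes the same idea (random subsampling of reads without replacement) and the natural-log Shannon index H = -Σ p_i ln p_i. The ASV counts and the target depth are hypothetical and much smaller than the ~44,000 (16S) and ~8,000 (18S) reads used in the study.

```python
import math
import random


def rarefy(counts, depth, seed=None):
    """Subsample `depth` reads without replacement from a vector of ASV counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    # One entry per read, labelled with the index of the ASV it belongs to.
    pool = [asv for asv, n in enumerate(counts) for _ in range(n)]
    rarefied = [0] * len(counts)
    for asv in rng.sample(pool, depth):
        rarefied[asv] += 1
    return rarefied


def shannon(counts):
    """Shannon index with natural logarithm; ASVs with zero reads are skipped."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


sample = [500, 300, 150, 40, 10]  # hypothetical ASV read counts for one sample
subsample = rarefy(sample, depth=200, seed=1)
print(subsample, round(shannon(subsample), 3))
```

Rarefying before computing α-diversity keeps samples with very different sequencing depths comparable, at the cost of discarding reads from the deeper samples.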

Additional Metadata

Purpose: Contains all metadata on sequencing experiments and microscopy data.
Alternative Identifiers: 6bce37d6-a682-4cca-89c4-7464cefa65e9
https://www.gbif.se/ipt/resource?r=prjeb55296-18s