About us

The Baltic Sea Mesozooplankton Dataset is a joint, voluntary effort by a number of researchers from institutes that conduct or have conducted zooplankton monitoring in the Baltic Sea. Data harmonization was partly funded by the BONUS INSPIRE and BIO-C3 projects.

The aim of the joint effort was to expand the range of research questions that can be studied by pooling and harmonizing all the available data, thereby also increasing the value of existing data.

At the moment, the dataset holds 672K records of individual counts and biovolume estimates from 30772 samples, representing 22121 full water column profiles. Data have been provided by 10 institutes.

Last update of the dataset: June 21, 2017. You can also follow updates to this project on ResearchGate.


Metadata of all samples, as an "RData" file, can be downloaded from the data providers page.

Spatial distribution of the data (image): the size of each bubble corresponds to the number of samples from one location.

During harmonization, the original data tables were not changed. The harmonization is carried out by an R script that extracts the data from the original tables and reorganizes them into two relational tables: a SAMPLE table with metadata (wide format) and a COUNT table with species composition (long format), linked by the columns "sampleID" and "samplingID" (image).
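To illustrate how the two tables relate, here is a minimal sketch in Python. It is not the project's R script. It assumes that "sampleID" and "samplingID" together identify a row of the SAMPLE table; every other column name and all values are hypothetical.

```python
# Hypothetical rows. Only the column names "sampleID" and "samplingID"
# come from the dataset description; everything else is made up.
sample_table = [  # wide format: one row per sample, metadata in columns
    {"sampleID": "S1", "samplingID": "P1", "station": "A", "date": "2010-06-01"},
    {"sampleID": "S2", "samplingID": "P1", "station": "A", "date": "2010-06-01"},
]
count_table = [  # long format: one row per taxon per sample
    {"sampleID": "S1", "samplingID": "P1", "taxon": "Acartia", "count": 120},
    {"sampleID": "S1", "samplingID": "P1", "taxon": "Bosmina", "count": 45},
    {"sampleID": "S2", "samplingID": "P1", "taxon": "Acartia", "count": 80},
]

# Assumption: the two columns are used together as a composite key.
KEY = ("sampleID", "samplingID")


def join_counts_to_samples(samples, counts, key=KEY):
    """Attach sample metadata to every count row that has a matching sample."""
    index = {tuple(row[k] for k in key): row for row in samples}
    joined = []
    for count_row in counts:
        meta = index.get(tuple(count_row[k] for k in key))
        if meta is None:
            continue  # count row without metadata: skipped in this sketch
        joined.append({**meta, **count_row})
    return joined


joined = join_counts_to_samples(sample_table, count_table)
```

Each row of `joined` then carries both the species count and the metadata of the sample it came from.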

If mistakes are discovered in the final tables, the harmonization script is easily fixed, and the whole process can be rerun from scratch in approximately 10 minutes. The harmonization handles the different ways the source data are organized and also corrects the taxonomy, which includes fixing typos and unifying taxonomic practices. To track the history of taxonomic corrections, the original name of each taxon, as it appears in the source files, is retained in the final tables.
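The idea of correcting a taxon name while retaining the original spelling can be sketched as follows. This is an illustration in Python, not the project's R script; the lookup table, the column names "taxon" and "taxon_original", and the example misspelling are all hypothetical.

```python
# Hypothetical lookup from normalized source spellings to unified names.
CORRECTIONS = {
    "acartia bifillosa": "Acartia bifilosa",  # example typo in a source file
    "acartia spp": "Acartia spp.",            # example of unified notation
}


def harmonize_taxon(record):
    """Return a copy of the record with a corrected name and the original kept."""
    original = record["taxon"]
    cleaned = " ".join(original.split())  # collapse stray whitespace
    corrected = CORRECTIONS.get(cleaned.lower(), cleaned)
    # The source spelling is kept unchanged so every correction stays traceable.
    return {**record, "taxon": corrected, "taxon_original": original}


fixed = harmonize_taxon({"taxon": "Acartia  bifillosa", "count": 3})
```

Because the corrections live in one lookup table, fixing a mistake means editing that table and rerunning the whole process, which matches the workflow described above.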


  • Spatial and temporal variability of zooplankton in a temperate semi-enclosed sea: implications for monitoring design and long-term studies (2016, Journal of Plankton Research, 38: 652-661)


    First publication using the current dataset. Includes a thorough description of the data harmonization. We sought dominant patterns in the variability emerging at scales below 100 km and 90 days, in different hydrological regions (small lagoons, larger gulfs, the Baltic Proper) and in differently sized zooplankton groups (large and small copepods and cladocerans). We show that in most cases, temporal variability at one place exceeds the synoptic spatial variability, and that smaller, fast-reproducing cladocerans vary more in abundance than larger, slow-reproducing copepods. The average abundance differences increased with increasing time and distance between samplings. For copepods, we found a dominant temporal cycle of 60-70 days, which requires sampling every 20-23 days. For cladocerans, we suggest sampling at least every two weeks, the time over which the abundance difference between samples doubled. Long-term trends of copepods and cladocerans should be calculated with caution when samples were taken less often than every 3 or 2 weeks, respectively. From this analysis, we concluded that, for reliable annual estimates, most zooplankton groups should be sampled at least once every 2-3 weeks. A related blog post is available on the BONUS website.