The objective of WP3 is to design, build, deploy and test analytical
pipelines for generating high-quality, harmonised data collections
for selected Essential Ocean and Biological Variables (EOVs and EBVs
respectively). These pipelines are called 'workbenches'. For the physical
and eutrophication workbenches, the resulting data collections will
integrate and harmonise datasets from various Blue Data
Infrastructures (BDIs). This includes removing duplicates from the data
collection and applying additional quality control (QC) checks. The EOVs
considered for the physical workbench (WB1) are temperature and salinity.
The EOVs considered for the eutrophication workbench (WB2) are chlorophyll,
oxygen and nutrients. For the ecosystem workbench, EOVs and EBVs will be
derived from multiple data sources and will provide interoperable and
intercomparable products in a user-friendly format. The EBVs considered for
this workbench are plankton biomass and diversity. The original project
plan was to deliver, at M20, a preliminary aggregated and harmonised
dataset of essential variables for each workbench. This was to involve a
first data homogenisation and integration, at least for some input data
and over small geographical areas. However, when the project started,
there were as yet no means to
efficiently access the identified input datasets and to harmonise them
automatically. Therefore, WP3 has focused fully on designing a
conceptually sound and feasible approach for the analytical workflows
and on formulating requirements for the system components that together
will make up the workbenches. These include components for efficient,
fast access to the different data repositories, for harmonising the
syntax and semantics of the output from these diverse data sources, and
for processing the large data collections to validate them and to
identify and remove duplicate datasets. To this end, WP3 for the
workbenches on Temperature & Salinity and Eutrophication worked closely
with WP2 and WP5 to develop and trial innovative technology for
configuring data lakes with fast sub-setting functionality, and with WP3
partners developing new technology for semantic analysis and mapping
and upgrading the WebODV service from read-only to read-and-write, as
needed for the data processing activities. In addition, WP3 for
these two workbenches has analysed and selected data repositories and
formulated a common metadata profile for supporting harmonisation. In
this way, WP3 has made substantial progress in formulating the analytical
pipelines and architecture of the workbenches, while also providing
requirements to the Blue-Cloud technical developers, which are being
followed up successfully. For the ecosystem workbench, EOVs and EBVs are derived
from different historical and new sources of biological plankton data
through a standardised mapping and extrapolation pipeline using statistical
and machine-learning approaches that aim to compensate for data scarcity
and spatiotemporal biases. Very good progress has been made in developing a
prototype of the novel analysis pipeline. WP3 and the workbenches are on
track.
Obaton, Dominique; Pagano, Pasquale; Iona, Sissy; Leroy, Delphine; Schaap, Dick; Weerheim, Paul; Kooyman, Robin; Krijger, Tjerk; Guidi, Lionel; Irisson, Jean-Olivier; Sonnet, Virginie; Atallah, Chris; Richardson, Lorna; Finn, Rob; Mesaroz, Lilli; Pesant, Stéphane; Schickele, Alexandre; Clerc, Corentin; Vogt, Meike; Moncoiffé, Gwenaëlle; Giorgetti, Alessandra; Reyes, Catalina; Simoncelli, Simona
10.5281/zenodo.13889325