D3.1 First release of aggregated and harmonised EOV datasets

The objective of WP3 is to design, build, deploy and test analytical pipelines for generating high-quality, harmonised data collections for selected Essential Ocean and Biological Variables (EOVs and EBVs, respectively). These pipelines are called 'workbenches'. For the physical and eutrophication workbenches, the resulting data collections will integrate and harmonise datasets from various Blue Data Infrastructures (BDIs). This includes removing duplicates from the data collections and applying additional quality control (QC) checks. The EOVs considered for the physical workbench (WB1) are temperature and salinity. The EOVs considered for the eutrophication workbench (WB2) are chlorophyll, oxygen and nutrients. For the ecosystem workbench, EOVs and EBVs will be derived from multiple data sources and will provide interoperable and intercomparable products in a user-friendly format. The EBVs considered for this workbench are plankton biomass and diversity. In the original formulation of the project, each workbench was planned to generate and deliver a preliminary aggregated and harmonised dataset for essential variables at month 20 (M20). This was to involve a first data homogenisation and integration, at least for some input data and over small geographical areas. However, when the project started, there were not yet means to access the identified input datasets efficiently or to harmonise them automatically. The priority in WP3 was therefore shifted fully to designing a conceptual and feasible approach for the analytical workflows and to formulating requirements for the system components that together will make up the workbenches. These include components for efficient, fast access to the different data repositories, for harmonising the syntax and semantics of output from these diverse data sources, and for processing the large data collections to validate them and to identify and remove duplicate datasets.
To this end, the WP3 teams for the Temperature & Salinity and Eutrophication workbenches worked closely with WP2 and WP5 to develop and trial innovative technology for configuring data lakes with fast sub-setting functionality, and with WP3 partners developing new technology for semantic analysis and mapping and for upgrading the WebODV service from read-only to read-and-write, as needed for the data processing activities. In addition, WP3 has analysed and selected data repositories for these two workbenches and formulated a common metadata profile to support harmonisation. In this way, WP3 has made substantial progress in formulating the analytical pipelines and architecture of the workbenches, while also providing requirements to the Blue-Cloud technical developers, which are being followed up successfully. For the ecosystem workbench, EOVs and EBVs are derived from different historical and new sources of biological plankton data through a standardised mapping and extrapolation pipeline that uses statistical and machine-learning approaches to compensate for data scarcity and spatiotemporal biases. Very good progress has been made in developing a prototype of this novel analysis pipeline. WP3 and the workbenches are on track.
Obaton, Dominique; Pagano, Pasquale; Iona, Sissy; Leroy, Delphine; Schaap, Dick; Weerheim, Paul; Kooyman, Robin; Krijger, Tjerk; Guidi, Lionel; Irisson, Jean-Olivier; Sonnet, Virginie; Atallah, Chris; Richardson, Lorna; Finn, Rob; Mesaroz, Lilli; Pesant, Stéphane; Schickele, Alexandre; Clerc, Corentin; Vogt, Meike; Moncoiffé, Gwenaëlle; Giorgetti, Alessandra; Reyes, Catalina; Simoncelli, Simona
10.5281/zenodo.13889325