The European Marine Board Biennial Open Forum acts as a platform to bring together a wide range of marine science stakeholders to discuss and share knowledge, identify common priorities, develop common positions and collaborate. The 7th EMB Forum fostered discussion and collaboration to advance the role of big data, digitalization and artificial intelligence in marine science to support the European Green Deal, the post-2020 EU Biodiversity Strategy, and the development of a Digital Twin Ocean.
The Blue-Cloud consortium was represented by experts from Seascape Belgium and VLIZ, as Patricia Martin-Cabrera (VLIZ) took part in the panel on the EU Biodiversity Strategy, and Kate Larkin (Seascape Belgium) joined the session on the Digital Twin of the Ocean.
The EU Biodiversity Strategy Session - Patricia Martin-Cabrera (VLIZ)
How might big data, digitalisation and artificial intelligence play a role in regionally and globally integrated data products for developing indicators for marine biodiversity?
As Ward Appeltans (UNESCO-IOC, OBIS) pointed out, the way forward is to interconnect the many nodes of the different European marine data infrastructures so that more complex questions can be answered. This is also the vision at VLIZ, which is already taking steps in this direction. A key starting point is to bring together different marine data infrastructures, and this is exactly the purpose of the Blue-Cloud project: by federating leading European marine data management infrastructures, it already takes a big step towards combining Copernicus satellite data with Argo float and biodiversity data. Blue-Cloud will provide discovery of and access to these multidisciplinary data, and offers the opportunity to run computationally demanding analyses in a Virtual Research Environment, using example workflows in the form of Blue-Cloud demonstrators that users can adapt and reproduce to build further on these products. This will allow researchers to explore existing sources in a more integrated way and to develop indicators for marine biodiversity. It is very important that these results can feed into marine policy initiatives such as the EU Biodiversity Strategy.
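To make this concrete, the sketch below illustrates in Python the kind of multidisciplinary analysis such a federated environment is meant to enable: pulling Argo float profiles for a region and matching them to biodiversity occurrence records. The argopy library and the occurrences.csv export are assumptions made for this illustration only; this is not the Blue-Cloud demonstrator or VRE code.

```python
# Illustrative sketch only: combining Argo float profiles with biodiversity
# occurrence records, the kind of integrated analysis a Virtual Research
# Environment supports. argopy and 'occurrences.csv' are assumptions for
# this example, not part of the Blue-Cloud API.
import pandas as pd
from argopy import DataFetcher  # community Python client for Argo data

# 1. Fetch Argo profiles for a North Atlantic box, 0-50 dbar, first half of 2020
argo = (
    DataFetcher()
    .region([-30.0, -20.0, 45.0, 55.0, 0.0, 50.0, "2020-01-01", "2020-07-01"])
    .to_xarray()
    .to_dataframe()
    .reset_index()
)

# 2. Load species occurrence records (hypothetical CSV export, e.g. from EurOBIS)
#    with Darwin Core-style columns: scientificName, decimalLatitude, decimalLongitude
occ = pd.read_csv("occurrences.csv")

# 3. For each occurrence, find the temperature of the nearest Argo measurement (crude spatial match)
def nearest_temperature(lat, lon):
    d2 = (argo["LATITUDE"] - lat) ** 2 + (argo["LONGITUDE"] - lon) ** 2
    return argo.loc[d2.idxmin(), "TEMP"]

occ["argo_temp_degC"] = [
    nearest_temperature(row.decimalLatitude, row.decimalLongitude)
    for row in occ.itertuples()
]

print(occ[["scientificName", "argo_temp_degC"]].head())
```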
What are the main challenges in using biodiversity data to achieve the EU Biodiversity Strategy objectives, and how are you overcoming them using big data?
Nowadays, with rapid advances in ocean technologies and instruments, as well as in FAIR data and Open Science, the number of observations is growing exponentially and marine data are easier to find and to access. Imaging systems, for instance, are used more and more frequently in the marine domain and generate huge amounts of imagery data. The challenge is the manpower needed to process and manage such large volumes of data. Machine learning and artificial intelligence are key to solving this challenge, through algorithms for automatic species identification and quantification, which allow us to determine the abundance, size and biomass of plankton communities. One of the objectives of the EU Biodiversity Strategy is to restore marine ecosystems, specifically those that play an important role in sequestering carbon. This is therefore a good example of why it matters to study these communities more efficiently: phytoplankton are an essential driver of biogeochemical processes and carbon fluxes, and both phytoplankton and zooplankton biomass and diversity have been designated as Essential Ocean Variables (EOVs) by the Global Ocean Observing System and as Essential Climate Variables (ECVs).
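As an illustration of automatic species identification, the minimal sketch below trains a small convolutional network on a folder of labelled plankton images and uses the predicted classes to produce a first estimate of community composition. The plankton_images folder layout, the network architecture and the use of PyTorch are assumptions for this example; it is not the code behind the Blue-Cloud demonstrator.

```python
# Minimal sketch, assuming labelled plankton images laid out as
# plankton_images/<species_name>/*.png; an illustration of automatic
# species identification with a small convolutional network.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Grayscale(),            # plankton imagery is often greyscale
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("plankton_images", transform=transform)
loader = DataLoader(train_set, batch_size=32, shuffle=True)

n_classes = len(train_set.classes)
model = nn.Sequential(                 # small CNN: two conv blocks + linear classifier
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, n_classes),
)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                 # short training loop, for illustration only
    for images, labels in loader:
        optimiser.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimiser.step()

# Predicted class counts give a first estimate of community composition
with torch.no_grad():
    counts = torch.zeros(n_classes)
    for images, _ in loader:
        counts += torch.bincount(model(images).argmax(dim=1), minlength=n_classes)
print(dict(zip(train_set.classes, counts.tolist())))
```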
Discover the Blue-Cloud demonstrator on Zoo- and Phytoplankton EOV Products
Concerning the objective of protecting at least 30% of European seas, the use of big data can improve our knowledge of key species and of new marine bioindicators. To achieve this, we need to integrate and harmonise datasets from multidisciplinary sources and make them available, interoperable and easily findable for the community. In this respect, the different EMODnet thematic lots, established European Research Infrastructures such as LifeWatch ERIC, and EOSC projects such as Blue-Cloud, EOSC-Life and ENVRI-FAIR all play an important role.
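As a small illustration of what findable and interoperable data mean in practice, the sketch below queries the OBIS web API for occurrence records of a zooplankton species within a bounding box and loads them into a table. The species, the bounding box and the specific request parameters are example choices made here; check the OBIS API documentation for the authoritative interface.

```python
# Illustrative sketch of machine-to-machine access to interoperable
# biodiversity data: querying the OBIS API (api.obis.org) for occurrence
# records of a species in an example bounding box.
import requests
import pandas as pd

resp = requests.get(
    "https://api.obis.org/v3/occurrence",
    params={
        "scientificname": "Calanus finmarchicus",                  # example zooplankton species
        "geometry": "POLYGON((-2 51, 8 51, 8 60, -2 60, -2 51))",  # WKT bounding box (North Sea)
        "size": 500,
    },
    timeout=60,
)
resp.raise_for_status()
records = pd.DataFrame(resp.json()["results"])  # Darwin Core-style occurrence records
print(records[["scientificName", "decimalLatitude", "decimalLongitude", "eventDate"]].head())
```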
Take-home message
During the different sessions of the event, the need to coordinate and unite the efforts of marine scientists and computational scientists was widely acknowledged. This was already highlighted in the opening session by Lionel Guidi (marine scientist at the Laboratoire d’océanographie de Villefranche), who stated clearly: "The ocean is a challenge as well for Marine Research as it is for Computational Research".

Another important point of discussion was the need to improve FAIRness, specifically interoperability and reusability. We already have all the tools needed to make Big Data happen; now we need to bring them together (including data on human activities, the economy and politics) in a way that can feed into decision-making, because FAIR data are fundamental to truly producing big data.

Lastly, another hot topic was the need for good storytelling to truly connect with society. In recent years citizens have been more frequently involved in decision-making by governments to improve policies at European level, and their engagement in climate action, sustainable development and environmental protection is increasingly evident. But are we, as scientists, communicating effectively? We need to excite the public about the research being done, because science has very important things to say about some of the biggest problems society is facing.
The Digital Twin Ocean Session - Kate Larkin (Seascape Belgium)
What questions could be answered with a Digital Twin Ocean (DTO) that we cannot answer now?
This session brought together representatives from the European Commission (DG CONNECT and DG DEFIS), the Copernicus Marine Service, EMODnet and SeaDataNet. Kate Larkin (EMODnet Secretariat) noted that the DTO offers an opportunity for Europe to collectively build a next-generation capability for delivering integrated knowledge for science, decision-making and the blue economy, and crucially also to engage citizens to the point of changing behaviour and taking action. She emphasised that the DTO would not need to start from scratch and could build on existing capability in Europe, for example the long-term data infrastructures and services EMODnet and CMEMS, and the European H2020 project Blue-Cloud, which is bringing together key data infrastructures and e-infrastructures to optimise data discovery and access and to demonstrate the potential of Virtual Research Environments offering open access to data and analytical tools for marine research through a cyber platform. She continued that the DTO could further democratise data by creating a data ‘lake’ that could be used as a giant workspace to enable Big Data science. By bringing together all available data (in situ and satellite), with EMODnet and CMEMS as a backbone, there would be a huge range of possibilities for interfacing big marine data with models. This could include applications for the Blue Economy, such as operational tools and decision support, as well as opportunities to assess connectivity and cumulative impacts on marine ecosystems at a larger scale to inform ecosystem-based management.
What are the main challenges in developing the Digital Twin Ocean, and how can big data help to overcome them?
Kate Larkin highlighted that achieving truly Findable, Accessible, Interoperable and Reusable (FAIR) data remains a high-priority challenge. A lot of progress has been made towards openly accessible marine data through EMODnet, CMEMS and others, but more could be done to make data truly interoperable and discoverable, both via human-to-machine and machine-to-machine communication. Having upstream data from diverse parameters and sources available to all users would be a cornerstone of the DTO. She noted that although many marine parameters are not yet available as big data sets, there are examples in EMODnet Physics and Human Activities that could be used as proof of concept. She also noted that, as the community scales up towards wider automation and near real-time delivery of data, including potentially big data sets, this needs to be done in a step-wise way to organise and optimise the data value chain.
Take-home message
We should ensure the quality and provenance of data (through metadata descriptions), and not move towards big data until the quality and structure of the data can be guaranteed, so that it will actually be used by the user community. And despite the growing capability for near real-time delivery, there will remain a need for delayed-mode, harmonised, standardised and integrated data sets, for example to validate models. The user interface for a future Digital Twin of the Ocean would also need to consider the varied needs of user communities, ranging from researchers and modellers to policymakers, industry and the general public.