The Blue-Cloud project is federating distributed marine data resources, computing platforms, and analytical services to better understand and manage the many aspects of ocean sustainability. To do so, a “Blue-Cloud technical framework” has been set up, in which leading marine data management infrastructures are federated with horizontal e-infrastructures to maximise the exploitation of data resources available from different sources. The Blue-Cloud framework consists of two major technical components: a Blue-Cloud Data Discovery and Access service, which provides federated discovery of and access to blue data infrastructures, and a Blue-Cloud Virtual Research Environment (VRE), which provides computing platforms and analytical services that facilitate collaboration between researchers.
Test the Blue-Cloud VRE on D4Science
Read the EOSC in Practice story
The Blue-Cloud Virtual Research Environment
The Blue-Cloud Virtual Research Environment (VRE) facilitates collaborative research using a variety of data sets and analytical tools, complemented by generic services such as sub-setting, pre-processing, harmonizing, publishing and visualization. Within the Blue-Cloud VRE, different Virtual Labs (VLabs) enact a family of analytical workflows (or pipelines), each consisting of a series of applications. These include services that facilitate collaboration between users, services that support the execution of analytical tasks embedded in a distributed computing infrastructure, and services that enable the co-creation of entire Virtual Laboratories, aimed at realising open science-friendly working environments. The VLabs take selected datasets as input; these can be retrieved from the blue data infrastructures by means of the Blue-Cloud Data Discovery and Access service, or retrieved and ingested by users from other data portals and their own resources.
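As a rough illustration of the workflow idea described above, the following Python sketch passes a small dataset through a series of generic steps (sub-setting, pre-processing, harmonizing). All names in it (`Record`, `subset`, `preprocess`, `harmonize`, `run_pipeline`) and the sample values are hypothetical and are not part of the gCube or D4Science APIs; it only shows the general pattern of chaining applications into a pipeline.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    """One observation in a hypothetical input dataset."""
    station: str
    depth_m: float
    temperature: Optional[float]  # None marks a missing measurement
    unit: str                     # "C" (Celsius) or "K" (Kelvin)


# A pipeline step takes a list of records and returns a new list.
Step = Callable[[List[Record]], List[Record]]


def subset(max_depth_m: float) -> Step:
    """Sub-setting: keep only records no deeper than max_depth_m."""
    return lambda records: [r for r in records if r.depth_m <= max_depth_m]


def preprocess(records: List[Record]) -> List[Record]:
    """Pre-processing: drop records with a missing temperature."""
    return [r for r in records if r.temperature is not None]


def harmonize(records: List[Record]) -> List[Record]:
    """Harmonizing: convert Kelvin values to degrees Celsius."""
    return [
        Record(r.station, r.depth_m, round(r.temperature - 273.15, 2), "C")
        if r.unit == "K" and r.temperature is not None
        else r
        for r in records
    ]


def run_pipeline(records: Iterable[Record], steps: Iterable[Step]) -> List[Record]:
    """Apply each step in order, feeding its output to the next step."""
    return reduce(lambda data, step: step(data), steps, list(records))


# Hypothetical sample data, for illustration only.
raw = [
    Record("A", 5.0, 18.2, "C"),
    Record("B", 10.0, 291.15, "K"),
    Record("C", 250.0, 8.0, "C"),   # removed by the depth subset
    Record("D", 2.0, None, "C"),    # removed by pre-processing
]

result = run_pipeline(raw, [subset(100.0), preprocess, harmonize])
for record in result:
    print(record)
```

Keeping every step as a function with the same input and output type means steps can be reordered, removed, or replaced without touching the rest of the pipeline.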
The Blue-Cloud VLabs are developed by the Italian National Research Council (CNR), built on the D4Science infrastructure and the gCube open source technology, and offered via the Blue-Cloud Gateway which makes the services and Virtual Laboratories available. Thanks to Blue-Cloud, scientists and practitioners are able not only to easily access different sets of marine data but also to process and experiment with them via the analytical and visual tools made available by each demonstrator.
How to use the Blue-Cloud Virtual Research Environment - video tutorial
Infrastructure technical overview
The D4Science architecture consists of a hardware layer and a service layer. The hardware layer is organized as a dynamic pool of virtual machines supporting computation and storage, while the service layer is organized into e-infrastructure middleware, storage, and end-user services. The hardware layer consists of an OpenStack installation, which supports the deployment of services in the upper layer by providing computational and storage resources. The service layer consists of five service frameworks, which can be summarized as follows:
- Enabling Framework: the enabling framework includes the services required to support the operation of all other services and of the VREs built on them. It includes: a resource registry service, with which all e-infrastructure resources (data sources, services, computational nodes, etc.) can be dynamically (de)registered and through which they can be discovered by users and other services; Authentication and Authorization services, as well as Auditing services, capable of both granting and tracking access and usage actions from users; and a VRE manager, capable of deploying in the Collaborative Framework VREs that include a selected number of “applications”, generally understood as sets of interacting services. This framework has been enhanced in several components and extended to include an orchestrator service to manage complex management workflows;
- Storage Framework: the storage framework includes services for efficient, advanced, and on-demand management of digital data, encoded as files in a distributed file system, collections of metadata records, and time series in spatial databases; these services are used by all other services in the architecture, except those of the Enabling Framework;
- Analytics Framework: the analytics framework includes the services required for running methods provided by scientists. These services transparently exploit the power of the underlying computation cloud (e.g. parallel computation) and offer a wide range of standard statistical methods, provided out of the box, to compute over given input data. This framework has been enhanced in several components and extended to include support for computational notebooks;
- Collaborative Framework: the collaborative framework supports all VREs deployed by the scientists and, for each of them, provides social networking services, user management services, shared workspace services, and WebUI access to the information cloud and to the Analytics Framework, via analytics laboratory services;
- Publishing Framework: the publishing framework includes services for documenting (with rich metadata) and publishing research artifacts (datasets, notebooks, processes, as well as any community-defined artifact) produced by the VRE and its VLabs, in order to promote their FAIRness.
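The resource registry described under the Enabling Framework can be pictured with a minimal in-memory sketch: resources are registered and deregistered at runtime, and other components discover them by type. The class and method names (`ResourceRegistry`, `register`, `deregister`, `discover`) and the example resources are hypothetical, chosen for illustration only; they do not reflect the actual gCube registry interface.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Resource:
    """A hypothetical e-infrastructure resource description."""
    resource_id: str
    kind: str      # e.g. "data-source", "service", "node"
    endpoint: str


class ResourceRegistry:
    """Toy registry supporting dynamic (de)registration and discovery."""

    def __init__(self) -> None:
        self._resources: Dict[str, Resource] = {}

    def register(self, resource: Resource) -> None:
        """Add a resource, replacing any entry with the same id."""
        self._resources[resource.resource_id] = resource

    def deregister(self, resource_id: str) -> bool:
        """Remove a resource; return True if it was registered."""
        return self._resources.pop(resource_id, None) is not None

    def discover(self, kind: str) -> List[Resource]:
        """Return all currently registered resources of the given kind."""
        return [r for r in self._resources.values() if r.kind == kind]


# Placeholder resources and endpoints, for illustration only.
registry = ResourceRegistry()
registry.register(Resource("ds-1", "data-source", "https://example.org/data"))
registry.register(Resource("svc-1", "service", "https://example.org/analytics"))
registry.register(Resource("svc-2", "service", "https://example.org/storage"))

services_before = [r.resource_id for r in registry.discover("service")]
registry.deregister("svc-2")
services_after = [r.resource_id for r in registry.discover("service")]

print(services_before)
print(services_after)
```

Because consumers look resources up at the time of use rather than holding fixed addresses, a resource that is deregistered simply stops appearing in later discovery results.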
Each Blue-Cloud VRE Collaborative Framework features a set of useful services and components, enabling users to collaborate on an activity by sharing materials and communicating in smart and flexible ways.
- The Workspace allows users to organise and share digital materials (datasets, notebooks, code, etc.) via an interface resembling a standard file system, with items organised in folders.
- Via the Social Networking function, users can have discussions and exchange information through common social networking practices such as posts, hashtags, mentions, and likes.
Thanks to the Blue-Cloud VRE, users are able to further exploit the data collected and accessed via the Blue-Cloud Data Discovery and Access Service following a FAIR data (Findable, Accessible, Interoperable, Reusable) approach.
“Thanks to the RStudio component offered by the Blue-Cloud VRE, I was able to reduce the computing time from hours to minutes when running the model that quantifies the relative contributions of the drivers in phytoplankton dynamics, as part of the Zoo and Phytoplankton EOV Demonstrator.” Viviana Otero (Flanders Marine Institute - VLIZ)
Virtual Labs
Fourteen VLabs are currently operational on the Blue-Cloud VRE. The following Virtual Labs have been open for testing since 2020 in order to demonstrate Blue-Cloud’s potential in five real-life demonstrators. Marine researchers and stakeholders are welcome to test them and to interact with the many functionalities and existing datasets made available via the Blue-Cloud framework through a single login:
- The Aquaculture Atlas Generation VLab, developed in the context of the Aquaculture Monitor demonstrator
- The Fisheries Atlas VLab and the GRSF VLab, developed in the context of the "Fish, a matter of scales" demonstrator
- The Marine Environmental Indicators VLab
- The Zoo and Phytoplankton EOV VLab
- The Plankton Genomics VLab
To support the Blue-Cloud Hackathon event held in February 2022, a new Blue-Cloud Hackathon VLab was developed. In addition, in the framework of the Blue-Cloud synergies programme, two VLabs were developed as pilots to support the work of JERICO-CORE, a multi-platform research infrastructure dedicated to a holistic appraisal of coastal marine system changes, and of the JONAS initiative, which addresses the issue of underwater noise in the Atlantic Seas.
Lastly, two Virtual Labs (the Blue-Cloud Lab VLab and the Blue-Cloud Project VLab) were developed to support the Blue-Cloud community with cross-thematic services.
We expect more and more researchers and other end-users to discover the expanding environment of Blue-Cloud, encouraging a collaborative effort for the FAIRisation of marine research data and the continuous improvement of these frameworks.
Learn more: Blue-Cloud Architecture (Release 2)