The Blue-Cloud Virtual Research Environment (VRE) facilitates collaborative research using a variety of data sets and analytical tools, complemented by generic services such as sub-setting, pre-processing, harmonizing, publishing and visualization. Within the Blue-Cloud VRE, different Virtual Labs enact a family of analytical workflows (or pipelines) which consist of a series of applications, including services that facilitate collaboration between users, services supporting the execution of analytical tasks embedded in a distributed computing infrastructure, as well as services enabling the co-creation of entire Virtual Laboratories, aimed at realising open science-friendly working environments. The VLabs use selected datasets as input, which can be retrieved from the blue data infrastructures by means of the Blue-Cloud Data Discovery and Access service, or can be retrieved and ingested by users from other data portals and (own) resources.
The Blue-Cloud VRE is based upon the existing D4Science e-infrastructure as developed and managed by CNR-ISTI. This e-infrastructure already hosts multiple Virtual Labs and offers a variety of services, which can be adopted for the Blue-Cloud. The D4Science e-infrastructure also has proven solutions for connecting to external computing platforms and means for orchestrating distributed services, which will be instrumental for smart connections to the other e-infrastructures in the Blue-Cloud system.
The Blue-Cloud VLabs are developed by the Italian National Research Council (CNR), built on the D4Science infrastructure and the gCube open source technology, and offered via the Blue-Cloud Gateway which makes the services and Virtual Laboratories available. Thanks to Blue-Cloud, scientists and practitioners are able not only to easily access different sets of marine data but also to process and experiment with them via the analytical and visual tools made available by each demonstrator.
Providing computing platforms and analytical services to facilitate the collaboration between researchers
Several research infrastructures and other data services running in Europe cover a multitude of marine-related sciences, providing specific datasets coming from observations collected with different methods. These infrastructures constitute a diverse world, each looking at a piece of the big picture, which sometimes hinders collaboration and data sharing. Blue-Cloud aims to overcome this fragmentation and build a bridge between thematic science clusters - such as marine, climate, food and agriculture sciences - and EOSC, creating a data federation and providing common access to a so-called thematic EOSC for marine data.
How to use the Blue-Cloud Virtual Research Environment - video tutorial
Infrastructure technical overview
The D4Science architecture consists of a hardware layer and a service layer. The hardware layer is an OpenStack installation organized as a dynamic pool of virtual machines; it provides the computational and storage resources on which the services in the upper layer are deployed. The service layer is organized into e-infrastructure middleware, storage, and end-user services, and consists of five service frameworks, which can be summarized as follows:
- Enabling Framework: the enabling framework includes the services required to support the operation of all other services and of the VREs built on them. It includes: a resource registry service, where all e-infrastructure resources (data sources, services, computational nodes, etc.) can be dynamically registered and deregistered, and discovered by users and other services; Authentication and Authorization services, as well as Auditing services, capable of both granting access and tracking access and usage actions by users; and a VRE manager, capable of deploying VREs in the collaborative framework, each including a selected number of “applications”, generally intended as sets of interacting services. This framework has been enhanced in several components and extended with an orchestrator service for complex management workflows;
- Storage Framework: the storage framework includes services for efficient, advanced, and on-demand management of digital data, encoded as files in a distributed file system, collections of metadata records, and time series in spatial databases; these services are used by all other services in the architecture, except those of the enabling framework;
- Analytics Framework: the analytics framework includes the services required for running methods provided by scientists, who transparently benefit from the power of the underlying computation cloud (e.g. parallel computation) and from a wide range of standard statistical methods, provided out of the box to compute over given input data. This framework has been enhanced in several components and extended to include support for computational notebooks;
- Collaborative Framework: the collaborative framework supports all VREs deployed by the scientists and, for each of them, provides social networking services, user management services, shared workspace services, and WebUI access to the information cloud and to the analytics framework, via analytics laboratory services;
- Publishing Framework: the publishing framework includes services for documenting (with rich metadata) and publishing research artifacts (datasets, notebooks, processes, as well as any community-defined artifact) produced by the VRE and its VLabs, thereby promoting their FAIRness.
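To make the resource registry idea from the enabling framework more concrete, the following is a minimal in-memory sketch in Python. It is not the gCube or D4Science API: the `Resource` and `ResourceRegistry` names, their fields, and the `example.org` endpoints are all hypothetical and chosen only to illustrate dynamic registration, deregistration, and discovery by resource type.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical registry entry; the field names are illustrative only."""
    resource_id: str
    kind: str      # e.g. "data-source", "service", "computational-node"
    endpoint: str


class ResourceRegistry:
    """In-memory stand-in for a resource registry service (assumption)."""

    def __init__(self) -> None:
        self._resources: dict[str, Resource] = {}

    def register(self, resource: Resource) -> None:
        # Registering the same id again simply replaces the earlier entry.
        self._resources[resource.resource_id] = resource

    def deregister(self, resource_id: str) -> bool:
        # Returns True if the resource was known and has been removed.
        return self._resources.pop(resource_id, None) is not None

    def discover(self, kind: str) -> list[Resource]:
        # Returns all resources of the requested kind, ordered by id.
        matches = (r for r in self._resources.values() if r.kind == kind)
        return sorted(matches, key=lambda r: r.resource_id)


registry = ResourceRegistry()
registry.register(Resource("node-1", "computational-node", "https://example.org/node-1"))
registry.register(Resource("svc-1", "service", "https://example.org/svc-1"))
registry.register(Resource("node-2", "computational-node", "https://example.org/node-2"))

nodes_before = [r.resource_id for r in registry.discover("computational-node")]
removed = registry.deregister("node-1")
nodes_after = [r.resource_id for r in registry.discover("computational-node")]
```

In this sketch, a service that needs compute capacity would call `discover("computational-node")` rather than hard-coding endpoints, so nodes can join or leave the pool without the callers being reconfigured. A production registry would additionally need persistence, access control, and health checks, none of which are modelled here.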
Each Blue-Cloud VRE Collaborative Framework features a set of useful services and components, enabling users to collaborate on an activity by sharing materials and communicating in smart and flexible ways.
- The Workspace allows users to organise and share digital materials (datasets, notebooks, code, etc.) via an interface resembling a standard file system with items organised in folders.
- Via the Social Networking function, users can have discussions and exchange information through common social networking approaches and practices such as posts, hashtags, mentions, likes.
Thanks to the Blue-Cloud VRE, users are able to further exploit the data collected and accessed via the Blue-Cloud Data Discovery and Access Service following a FAIR data (Findable, Accessible, Interoperable, Reusable) approach.
“Thanks to the RStudio component offered by the Blue-Cloud VRE, I was able to reduce the computing time from hours to minutes when running the model that quantifies the relative contributions of the drivers in phytoplankton dynamics, as part of the Zoo and Phytoplankton EOV Demonstrator.” Viviana Otero (Flanders Marine Institute - VLIZ)
Fourteen VLabs are currently operational on the Blue-Cloud VRE. The following Virtual Labs have been open for testing since 2020 to demonstrate Blue-Cloud’s potential. Marine researchers and stakeholders are welcome to test them and to interact, through single sign-on, with the many functionalities and existing datasets made available via the Blue-Cloud framework:
- The Aquaculture Atlas Generation VLab, developed in the context of the Aquaculture monitor
- The Fisheries Atlas VLab and the GRSF VLab, developed in the context of the "Fish, a matter of scales" demonstrator
- The Marine Environmental Indicators VLab
- The Zoo and Phytoplankton EOV VLab
- The Plankton Genomics VLab
To support the Blue-Cloud Hackathon event held in February 2022, a new Blue-Cloud Hackathon VLab was developed. In addition, in the framework of the Blue-Cloud synergies programme, two further VLabs were developed as pilots to support the work of JERICO-CORE, a multi-platform research infrastructure dedicated to a holistic appraisal of coastal marine system changes, and of the JONAS initiative, which addresses the issue of underwater noise in the Atlantic Seas.
Lastly, two Virtual Labs (the Blue-Cloud Lab VLab and the Blue-Cloud Project VLab) were developed to support the Blue-Cloud community with cross-thematic services.
We expect more and more researchers and other end-users to discover the expanding environment of Blue-Cloud, encouraging a collaborative effort for the FAIRisation of marine research data and the continuous improvement of these frameworks.