
D4Science

About

The D4Science e-infrastructure is at the core of the Blue-Cloud VREs. D4Science core services are used for building and running Virtual Labs (VLabs), which are VREs dedicated to specific research objectives. The D4Science e-infrastructure implements proven solutions for connecting to external services and orchestrating distributed services, which will be instrumental for smart connections to other e-infrastructures in Blue-Cloud, including EUDAT and DIAS (WEkEO). Each VLab makes services and data available to its authorized users.
 

Type & number of data sets

D4Science serves different domains in 50+ countries worldwide. It hosts more than 150 Virtual Research Environments (VREs) that serve the biological, ecological, environmental, social mining, cultural heritage, and statistical communities.

Core Services

D4Science Security

D4Science provides access to a set of services hosted by different organisations in the EU. The connection between the sites is secured through Transport Layer Security (TLS), which provides communication security over a computer network.
In this way, D4Science ensures privacy and data integrity between two communicating computer applications. In particular, any connection between a client (e.g., a web browser) and a D4Science server has the following properties:

  • Private (or secure) connection through symmetric cryptography, which encrypts the transmitted data. The keys for this symmetric encryption are generated uniquely for each connection and are based on a shared secret negotiated at the start of the session. Before any data are transmitted, the server and client negotiate which encryption algorithm and cryptographic keys to use. The negotiation of the shared secret is both secure, as the secret is unavailable to eavesdroppers and even to attackers who place themselves in the middle of the connection, and reliable, as no attacker can modify the communications during the negotiation without being detected.
  • Authentication of the communicating parties through public-key cryptography. This authentication can be made optional on the client’s side; however, the server is always authenticated.
  • Integrity of the connection, as each transmitted message carries a message authentication code that prevents undetected loss or alteration of the data during transmission.
  • Forward secrecy, which ensures that a future disclosure of encryption keys cannot be used to decrypt TLS communications recorded in the past.
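
The properties listed above can be observed from the client side. The following is a minimal sketch using Python's standard ssl module, not a description of how D4Science itself is configured. The minimum protocol version is an assumption made for illustration, and the helper names (make_client_context, describe_connection, mac_tag, is_intact) are hypothetical. The HMAC functions only illustrate the idea of a message authentication code; inside a real TLS session, integrity protection is handled by the protocol itself.

```python
import hashlib
import hmac
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that always authenticates the server."""
    # Loads the system CA certificates and enables certificate and hostname checks.
    ctx = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def describe_connection(host: str, port: int = 443) -> dict:
    """Open a TLS connection and report what the handshake negotiated."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            cipher_name, _protocol, secret_bits = tls.cipher()
            return {
                "protocol": tls.version(),
                "cipher": cipher_name,
                "secret_bits": secret_bits,
            }


def mac_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over a message (illustration only)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_intact(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Return True only if the message matches the tag it was sent with."""
    return hmac.compare_digest(mac_tag(key, message), received_tag)
```

Calling describe_connection with the host name of any HTTPS server returns the negotiated protocol version and cipher suite; the call needs network access, so no output is shown here.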

Function in Blue-Cloud

VRE Management

Service to enable authorized users (i.e. VRE Managers) to dynamically create VLabs and manage their users. VRE Managers can (i) authorize users to access VLabs, (ii) assign roles to users or withdraw them, (iii) remove users, and (iv) send communications to the current users.
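
The four management operations can be pictured with a small in-memory model. This is only a sketch of the concept: the VLab class, its method names, and the role name used in the test data are hypothetical and do not reflect the actual D4Science API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class VLab:
    """Hypothetical model of a VLab: a name plus a mapping of user to roles."""
    name: str
    members: Dict[str, Set[str]] = field(default_factory=dict)

    def authorize(self, user: str) -> None:
        """(i) Grant a user access to the VLab, initially with no roles."""
        self.members.setdefault(user, set())

    def assign_role(self, user: str, role: str) -> None:
        """(ii) Assign a role; only authorized users can hold roles."""
        if user not in self.members:
            raise KeyError(f"{user} is not authorized for {self.name}")
        self.members[user].add(role)

    def withdraw_role(self, user: str, role: str) -> None:
        """(ii) Withdraw a role; a no-op if the user does not hold it."""
        self.members.get(user, set()).discard(role)

    def remove_user(self, user: str) -> None:
        """(iii) Remove a user and all of their roles."""
        self.members.pop(user, None)

    def notify_all(self, text: str) -> List[Tuple[str, str]]:
        """(iv) Produce one (recipient, message) pair per current user."""
        return [(user, text) for user in sorted(self.members)]
```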

Collaboration Framework

Set of social tools to share data, updates, and messages with others and to keep abreast of new data, prospects, services, and users.

Workspace

Service to enable every user to store and organise the information objects they are interested in working with. In addition, users can collaborate with other users by sharing objects and messages. A storage volume of 100 GB is guaranteed to every user.

Secure File Sharing and Storage

Service to enable secure and controlled file sharing.

Accounting Framework

Service that enables accounting of resource usage and usage control via quota mechanisms.
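
A quota mechanism of this kind combines two steps: recording usage and refusing requests that would exceed a limit. The sketch below shows the idea under stated assumptions. The QuotaAccount class is hypothetical, and the 100 GB limit in the usage example is an illustrative value, not a documented quota of the Accounting Framework.

```python
from dataclasses import dataclass

GB = 10**9  # decimal gigabyte, chosen for this sketch


@dataclass
class QuotaAccount:
    """Hypothetical per-user account: tracks usage against a fixed limit."""
    limit_bytes: int
    used_bytes: int = 0

    def record(self, n_bytes: int) -> bool:
        """Account for new usage; return False and record nothing if it would exceed the quota."""
        if n_bytes < 0:
            raise ValueError("usage must be non-negative")
        if self.used_bytes + n_bytes > self.limit_bytes:
            return False
        self.used_bytes += n_bytes
        return True

    @property
    def remaining_bytes(self) -> int:
        """Bytes still available before the quota is reached."""
        return self.limit_bytes - self.used_bytes
```

For example, an account created with QuotaAccount(100 * GB) accepts a 60 GB request, rejects a following 50 GB request, and still reports 40 GB remaining.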

Data Analytics Framework

A set of data analytics services including: (i) the DataMiner engine (a rich array of ready-to-use analytical methods ranging from data clustering to geospatial data analytics, occurrence data management, and generation of species distribution maps); (ii) the Software and Algorithms Importer (SAI) (an interface allowing any user to easily and quickly import scripts into DataMiner); (iii) the RStudio suite, which allows its users to perform online statistical analyses with R; (iv) JupyterHub, a web-based interactive development environment for Jupyter notebooks, code, and data. It allows users to configure and arrange the user interface to support a wide range of workflows in data science, scientific computing, and machine learning.

SDI

The SDI (Spatial Data Infrastructure) provides users with the capability to store, discover, access, and manage vector and raster georeferenced datasets. The SDI exploits the following technologies: GeoServer equipped with PostgreSQL and PostGIS, GeoNetwork, and THREDDS. All these technologies are fully integrated and deployed to ensure fault tolerance, load balancing, controlled and secure access, monitoring, and accounting.
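
GeoServer publishes vector data through the OGC Web Feature Service (WFS) standard, so access to a vector dataset can be sketched as a standard WFS 2.0.0 GetFeature request. The endpoint URL and the layer name below are placeholders, not actual D4Science resources, and a real deployment may additionally require an access token.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url: str, type_name: str, count: int = 10) -> str:
    """Build a WFS 2.0.0 GetFeature request that returns features as GeoJSON."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,              # layer, as workspace:name
        "count": count,                      # maximum number of features returned
        "outputFormat": "application/json",  # GeoJSON output, supported by GeoServer
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and layer, used only for illustration.
url = wfs_getfeature_url(
    "https://geoserver.example.org/geoserver/wfs", "demo:sampling_points", count=5
)
```

The resulting URL can be opened with any HTTP client; the response is a GeoJSON FeatureCollection that most GIS libraries can read directly.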

Publishing Framework

The Publishing Framework includes a set of services and components that enable users to document any generated product and make it “public” (available online). Its primary component is the VRE Data Catalogue, a catalogue service built on open-source technology for data catalogues (CKAN, ckan.org) and extended to (a) be integrated with D4Science services, (b) support a rich, community-defined, and extensible set of catalogue item typologies, and (c) manage publications for each VLab and their automatic integration at the level of the Blue-Cloud VRE.
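
Because the catalogue builds on CKAN, a search can be sketched with the standard CKAN Action API (package_search). This is an illustration under assumptions: the base URL is a placeholder, the helper names are hypothetical, and the extended D4Science catalogue may require authentication or expose additional fields that are not shown here.

```python
import json
from typing import List
from urllib.parse import urlencode


def package_search_url(base_url: str, query: str, rows: int = 5) -> str:
    """Build a CKAN Action API URL that searches catalogue items."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def titles_from_response(payload: str) -> List[str]:
    """Extract item titles from a package_search JSON response."""
    data = json.loads(payload)
    if not data.get("success"):
        raise RuntimeError("CKAN reported a failed request")
    # Fall back to the machine-readable name when an item has no title.
    return [item.get("title") or item["name"] for item in data["result"]["results"]]


# Placeholder catalogue address, used only for illustration.
search_url = package_search_url("https://catalogue.example.org/", "plankton", rows=3)
```

The response body returned for search_url can be passed as text to titles_from_response to list the matching catalogue items.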

Website: D4Science.org


Blue-Cloud partners involved