
Conference papers

Ocean Data Quality Assessment through Outlier Detection-enhanced Active Learning

Ocean and climate research benefits from global ocean observation initiatives such as Argo, GLOSS, and EMSO. The Argo network, dedicated to ocean profiling, generates a vast volume of observational data. However, data quality issues caused by sensor malfunctions and transmission errors necessitate stringent quality assessment. Existing methods, including machine-learning approaches, fall short due to the scarcity of labeled data and the imbalance of the datasets.
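The paper's exact method is not reproduced in this teaser, but the core idea of combining outlier detection with active learning can be sketched: an outlier detector proposes the most suspicious profiles for labeling first, and an uncertainty-sampling loop then refines a classifier. This is a minimal sketch using scikit-learn; the synthetic data, model choices, and labeling budget are illustrative assumptions, not the authors' setup.

```python
# Sketch: outlier-detection-enhanced active learning for quality flagging.
# Synthetic data and model choices are illustrative assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))                        # unlabeled sensor profiles
y_true = (np.abs(X).max(axis=1) > 2.5).astype(int)    # hidden "bad data" flags

# 1) An outlier detector ranks profiles; labeling both extremes first
#    cheaply seeds examples of both classes for the learner.
scores = IsolationForest(random_state=0).fit(X).score_samples(X)
order = np.argsort(scores)                            # most anomalous first
labeled = set(map(int, np.concatenate([order[:10], order[-10:]])))

# 2) Active-learning rounds: query the point the model is least sure about.
clf = LogisticRegression(max_iter=1000)
for _ in range(5):
    idx = sorted(labeled)
    clf.fit(X[idx], y_true[idx])                      # y_true stands in for expert labels
    uncertainty = -np.abs(clf.predict_proba(X)[:, 1] - 0.5)
    for i in map(int, np.argsort(uncertainty)[::-1]): # closest to the boundary first
        if i not in labeled:
            labeled.add(i)
            break                                     # one query per round

print(f"labeled {len(labeled)} of {len(X)} profiles")
```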

Sharing digital objects across data infrastructures using Named Data Networking (NDN)

Data infrastructures manage the life cycle of digital assets and allow users to efficiently discover them. To improve the Findability, Accessibility, Interoperability and Reusability (FAIRness) of digital assets, a data infrastructure needs to provide them with not only rich metadata and semantic context information but also globally resolvable identifiers. Persistent Identifiers (PIDs), such as the Digital Object Identifier (DOI), are often used by data publishers and infrastructures.
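As background to PID-based identification, a DOI can be resolved programmatically through the Handle System HTTP API that doi.org exposes publicly. The sketch below assumes that endpoint's documented JSON response shape ("values" entries carrying a "URL" type); treat the endpoint details as an assumption rather than a specification of the paper's system.

```python
# Sketch: resolving a DOI to its registered landing URL via the public
# Handle System HTTP API at doi.org; endpoint details are an assumption.
import json
import urllib.request
from urllib.error import HTTPError

def resolve_doi(doi: str) -> str | None:
    """Return the URL registered for a DOI, or None if it cannot be resolved."""
    try:
        with urllib.request.urlopen(f"https://doi.org/api/handles/{doi}") as resp:
            record = json.load(resp)
    except HTTPError:
        return None
    for value in record.get("values", []):
        if value.get("type") == "URL":            # the landing-page record
            return value["data"]["value"]
    return None

print(resolve_doi("10.1000/182"))   # DOI of the DOI Handbook itself
```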

Decentralized workflow management on software defined infrastructure

Data-intensive workflow applications are characterized by their continuously growing volumes of data being processed, the complexity of tasks in the pipeline, and the infrastructure capacity required for computation and storage. The infrastructure technologies for computing, storage, and networking have made tremendous progress during the past years. We review emerging trends in data-intensive workflow applications, in particular the potential challenges and opportunities enabled by the decentralized application paradigm.

A Trustworthy Blockchain-based Decentralised Resource Management System in the Cloud

Quality Critical Decentralised Applications (QCDApps) have high requirements for system performance and service quality, involve heterogeneous infrastructures (Clouds, Fogs, Edges and IoT), and rely on trustworthy collaboration among participants such as data sources and infrastructure providers to deliver their business value. The development of QCDApps has to tackle the low performance of current blockchain technologies, which stems from the low collaboration efficiency among distributed peers during consensus.
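The paper's consensus mechanism is not described in this teaser; as a reminder of the ledger primitive such a system builds on, here is a minimal hash-chained log of resource-allocation records. It is purely illustrative: a single process, no peers, and no consensus protocol.

```python
# Sketch: a hash-chained log of resource-allocation records, illustrating
# the tamper-evidence property a blockchain ledger builds on. Illustrative
# only: no distributed peers and no consensus.
import hashlib
import json

def block_hash(block: dict) -> str:
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

chain = [{"index": 0, "prev": "0" * 64, "record": "genesis"}]

def append_record(record: dict) -> None:
    chain.append({
        "index": len(chain),
        "prev": block_hash(chain[-1]),   # link to the previous block
        "record": record,
    })

def verify() -> bool:
    return all(chain[i]["prev"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

append_record({"provider": "cloud-a", "vcpus": 8, "tenant": "app-1"})
append_record({"provider": "edge-b", "vcpus": 2, "tenant": "app-1"})
print(verify())                      # True
chain[1]["record"]["vcpus"] = 64     # tamper with an allocation
print(verify())                      # False: the hash chain no longer matches
```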

Towards A Robust Meta-Reinforcement Learning-Based Scheduling Framework for Time Critical Tasks in Cloud Environments

Container clusters play an increasingly important role in cloud computing for processing dynamic computing tasks. The resource manager (i.e., orchestrator) of the cluster automates the scheduling of dynamic requests and effectively manages resource utilization across distributed infrastructure resources. For many applications, requests to the cluster often come with strict deadlines. Scheduling container clusters is often tricky, especially when the cluster's size is large and the …
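The meta-reinforcement-learning policy itself is beyond this abstract, but a simple deadline-aware baseline illustrates the scheduling problem such a policy addresses. The sketch below is an earliest-deadline-first heuristic with greedy placement; the node names, capacities, and deadlines are made up for illustration and are not from the paper.

```python
# Sketch: an earliest-deadline-first (EDF) baseline for placing container
# tasks on cluster nodes. A learned policy would replace the greedy
# placement heuristic; all names and numbers are illustrative.
import heapq

nodes = {"node-a": 4.0, "node-b": 2.0}         # free CPU cores per node
tasks = [                                      # (deadline_s, name, cores)
    (5.0, "ingest", 2.0),
    (2.0, "alert", 1.0),
    (9.0, "report", 3.0),
]

heapq.heapify(tasks)                           # order by earliest deadline
while tasks:
    deadline, name, cores = heapq.heappop(tasks)
    fits = [n for n, free in nodes.items() if free >= cores]
    if not fits:
        print(f"{name}: no capacity, deadline {deadline}s at risk")
        continue
    chosen = max(fits, key=nodes.get)          # node with the most free capacity
    nodes[chosen] -= cores
    print(f"{name} -> {chosen} (deadline {deadline}s)")
```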

CBProf: Customisable Blockchain-as-a-Service Performance Profiler in Cloud Environments

Blockchain technologies, e.g., Hyperledger Fabric and Sawtooth, have evolved rapidly in recent years and enable potential decentralised innovations in a substantial number of business applications, e.g., crowd journalism, car-sharing and energy trading. The development of decentralised business applications faces challenges in selecting suitable blockchain technologies, customising network protocols among distributed peers, and optimising system performance to meet application requirements. Moreover, manually testing and comparing those different technologies is time-consuming.
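A profiler like CBProf automates measurements of roughly the following shape: submit a batch of transactions and report throughput and latency percentiles. In this sketch, `submit_transaction` is a hypothetical stand-in for a real blockchain client call (e.g., a Fabric or Sawtooth SDK), and the timings are simulated rather than taken from any actual network.

```python
# Sketch: a generic throughput/latency probe of the kind a blockchain
# profiler automates. `submit_transaction` is a hypothetical stand-in
# for a real client SDK call; timings below are simulated.
import random
import statistics
import time

def submit_transaction(payload: dict) -> None:
    time.sleep(random.uniform(0.001, 0.005))   # simulate network + consensus

latencies = []
start = time.perf_counter()
for i in range(200):
    t0 = time.perf_counter()
    submit_transaction({"tx": i})
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

print(f"throughput: {200 / elapsed:.1f} tx/s")
print(f"latency p50: {statistics.median(latencies) * 1000:.2f} ms, "
      f"p95: {statistics.quantiles(latencies, n=20)[18] * 1000:.2f} ms")
```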

A Decentralized Service Control framework for Decentralized Applications in Cloud Environments

Effectively managing decentralized applications in cloud environments using a decentralized control paradigm is essential, as current cloud providers usually only offer a control interface for monitoring cloud infrastructures. This study proposes a decentralized service control framework for implementing control across various organizations and coordinating collaboration among operators in a decentralized application. The proposed framework allows a consortium of organizations to …
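The framework's actual protocol is not given in this teaser; the sketch below only illustrates the general pattern of decentralized control, where operators from different organizations coordinate through a shared event bus instead of a central controller. It runs in a single process and all topic and organization names are illustrative.

```python
# Sketch: operators from different organizations coordinating through a
# shared event bus rather than a central controller. In-process and
# illustrative only; a real deployment would use a distributed bus.
from collections import defaultdict
from typing import Callable

subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

def subscribe(topic: str, handler: Callable[[dict], None]) -> None:
    subscribers[topic].append(handler)

def publish(topic: str, event: dict) -> None:
    for handler in subscribers[topic]:
        handler(event)

# Each organization registers its own operator for service events.
subscribe("service/scale",
          lambda e: print(f"org-a scales {e['service']} to {e['replicas']}"))
subscribe("service/scale",
          lambda e: print(f"org-b audits scale request for {e['service']}"))

publish("service/scale", {"service": "ingest-api", "replicas": 5})
```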

An Adaptable Indexing Pipeline for Enriching Meta Information of Datasets from Heterogeneous Repositories

Dataset repositories continuously publish a significant number of datasets within a variety of domains, such as biodiversity and oceanography. To conduct multidisciplinary research, scientists and practitioners must discover datasets from disciplines unfamiliar to them. Well-known search engines, such as Google Dataset Search and Mendeley Data, try to support researchers with cross-domain dataset discovery based on dataset contents. However, as datasets typically contain …
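One core step of such an indexing pipeline, normalizing heterogeneous repository metadata into a common schema before indexing, can be sketched as follows. The repository names, field mappings, and records are illustrative assumptions, not the paper's actual schema.

```python
# Sketch: normalizing dataset metadata from heterogeneous repositories
# into one schema before indexing. Field names per source are illustrative.
RAW_RECORDS = [
    {"source": "repo-a", "dcTitle": "Argo float profiles", "dcSubject": "oceanography"},
    {"source": "repo-b", "name": "Reef species counts", "keywords": ["biodiversity"]},
]

FIELD_MAPS = {   # per-repository mapping into the common schema
    "repo-a": {"title": "dcTitle", "topics": "dcSubject"},
    "repo-b": {"title": "name", "topics": "keywords"},
}

def normalize(record: dict) -> dict:
    mapping = FIELD_MAPS[record["source"]]
    topics = record[mapping["topics"]]
    return {
        "title": record[mapping["title"]],
        "topics": topics if isinstance(topics, list) else [topics],
    }

index = [normalize(r) for r in RAW_RECORDS]
print([d["title"] for d in index if "biodiversity" in d["topics"]])
```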

An Adaptable Framework for Entity Matching Model Selection in Business Enterprises

Entity matching is the process of identifying data in different data sources that refer to the same real-world entity. A significant number of entity matching approaches have been introduced in the literature, which complicates the selection process. In this study, we propose a framework to support researchers in finding the best-fitting entity matching model(s) based on the characteristics of their datasets. The proposed framework can be extended by adding more models, features, and use cases.
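A minimal sketch of the selection idea: score several candidate matching models on a labeled sample and keep the best by F1. The two baseline "models" (token Jaccard and an edit-distance ratio) and the toy record pairs are illustrative; the paper's framework presumably covers richer models and dataset features.

```python
# Sketch: selecting an entity-matching model by F1 on a labeled sample.
# The baseline models, threshold, and toy pairs are illustrative.
from difflib import SequenceMatcher

pairs = [  # (record_a, record_b, is_match)
    ("IBM Corp.", "IBM Corporation", 1),
    ("Apple Inc", "Apple Incorporated", 1),
    ("Oracle", "Orange S.A.", 0),
    ("SAP SE", "SAP", 1),
    ("Intel", "Nvidia", 0),
]

def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def edit_ratio(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def f1(model, threshold=0.5) -> float:
    preds = [model(a, b) >= threshold for a, b, _ in pairs]
    tp = sum(p and y for p, (_, _, y) in zip(preds, pairs))
    fp = sum(p and not y for p, (_, _, y) in zip(preds, pairs))
    fn = sum((not p) and y for p, (_, _, y) in zip(preds, pairs))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

models = {"jaccard": jaccard, "edit_ratio": edit_ratio}
best = max(models, key=lambda name: f1(models[name]))
print({name: round(f1(m), 2) for name, m in models.items()}, "->", best)
```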
