EcoTaxa is a web application dedicated to the visual exploration and taxonomic annotation of images that illustrate the beauty of planktonic biodiversity. EcoTaxa grew out of the experience developed at Laboratoire d'Océanographie de Villefranche (LOV) in quantitative, high-throughput imaging of plankton, and out of the Oceanomics project, which covered the exploitation of data collected during the Tara Oceans cruise, including quantitative imaging. It is now developed mainly through the WWWPIC project, funded by the Belmont Forum, and as part of the Blue-Cloud project.

The aim of EcoTaxa is to centralize images of plankton, to allow their collaborative sorting according to a universal taxonomy, and to accelerate that sorting through machine learning. It produces ecological data in the form of the concentration and biovolume of organisms in a given taxon at a given station (lat, lon, time). Visitors have free access to the specimens that have already been identified by expert taxonomists. They can explore the database by navigating the UniEuk taxonomic tree, which aims to unify taxonomic names and the tree itself according to reliable, curated molecular phylogenies. It encompasses all eukaryotic and prokaryotic lineages that have been molecularly described (viruses coming soon). Images can then be filtered according to several sample criteria. Tools are provided to support the annotation of large image datasets through supervised machine learning prediction.
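
To make the notion of "concentration and biovolume of organisms in a given taxon, at a given station" concrete, here is a minimal Python sketch. It is not EcoTaxa's code or export format: the `AnnotatedObject` fields, the `summarise` function and the `sampled_volume_m3` mapping are all hypothetical names, and the sketch simply assumes that concentration is the number of validated objects divided by the volume of water sampled at the station, with biovolume normalised the same way.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedObject:
    """One validated image. All field names here are hypothetical."""
    taxon: str
    station: tuple          # (lat, lon, time)
    biovolume_mm3: float    # estimated volume of the imaged organism


def summarise(objects, sampled_volume_m3):
    """Return {(station, taxon): (concentration, biovolume)} per cubic metre.

    sampled_volume_m3 maps each station to the volume of water imaged there
    (an assumed input; it is not derived from the images themselves).
    """
    counts = defaultdict(int)
    volumes = defaultdict(float)
    for obj in objects:
        key = (obj.station, obj.taxon)
        counts[key] += 1
        volumes[key] += obj.biovolume_mm3
    return {
        key: (counts[key] / sampled_volume_m3[key[0]],
              volumes[key] / sampled_volume_m3[key[0]])
        for key in counts
    }
```

Calling `summarise` with a list of annotated objects and the sampled volume per station yields one (concentration, biovolume) pair for each station and taxon combination, in individuals per cubic metre and cubic millimetres per cubic metre respectively.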

An interview with Jean-Olivier Irisson (EcoTaxa)

Type & number of data sets

Currently, EcoTaxa contains circa 95 million images, of which circa 40 million have been annotated in about 1300 datasets. Of these, circa 20 million images are of living organisms. The growth rate is circa 1 million images per month. Not all of these datasets will be accessible through Blue-Cloud: access depends on the data policy of each data provider, and EcoTaxa does not require datasets to be public. It also depends on whether the Blue-Cloud project can develop a connector to EuroBioImaging and/or EMODnet Biology. Priority will be given to the datasets needed by the demonstrators.

Figure 1: EcoTaxa dashboard for annotation

Core Services

Currently, data discovery is manual. The WWWPIC and Blue-Cloud projects aim to build an API that allows programmatic browsing of the datasets and download of the data. Validated data from all public datasets can be browsed, and the datasets themselves are listed, in the EcoTaxa web application. Data can be downloaded once access to the dataset is granted by the dataset owner.

Figure 2: EcoTaxa interface for exploring images

See EcoTaxa Training Video below

