One way or another, the European Open Science Cloud (EOSC) will become a reality. The EOSC is the project through which the European Union hopes to make open research data the new standard. Preliminary plans suggest a model with a single portal where researchers or research fields can order and access different services.
Overambitious and vague
The progress of the EOSC was discussed during the international eScience conference, where the national eScience centers in Europe, united in PLAN-E, convened. Many of those involved in the project are disappointed in the plans for the EOSC thus far. Three years after its initial launch, it remains unclear what the function of this cloud should be, and what the definitive design should be.
The EOSC will be launched on November 23rd at a big gathering in Vienna. What exactly is going to be launched on that day remains a question to the speakers at the eScience conference who are connected to the EOSC. They wonder out loud if the plan isn’t overly ambitious, and if existing infrastructures aren’t too diverse to integrate.
“The EOSC is supposed to be one large infrastructure,” says director of the Netherlands eScience Center Wilco Hazeleger, “but we need to be careful that it does not turn into a giant monolithic structure.” He emphasizes that there is an abundance of initiatives already in existence. Researchers all over the world already share their data within their own specific fields, so what’s the gain for them in joining a large project that lacks an end goal?
“Regardless of the goal and final form, this kind of initiative should do its very best to stay as close to researchers and research institutes as possible.” According to Hazeleger and many of the other attendees, the distance between researchers and the plans the European Commission has drawn up for the EOSC is, in fact, enormous. “There has to be a middle man who can deliver the EOSC’s possibilities to the researcher.”
Arjen van Rijn (Nikhef) emphasizes that this is by no means an easy task. Van Rijn is one of the founders of e-IRG, a European strategic body made up of researchers that have been working on supporting an international e-infrastructure since 2003. “As researchers, we often spend many years working on creating functional data sharing infrastructures within our own respective fields.”
He suggests “horizontal” integration as a more desirable alternative to large infrastructures that span a variety of fields. On a local and national level, interesting and functional partnerships often already exist. Connecting these can be a fruitful enterprise within the EOSC. At the same time, Van Rijn admits that this would be a very difficult task.
The desired level of collaboration for the EOSC requires transparency about existing data-infrastructures from participating countries. To take stock of these, the e-IRG sent out a survey to the different responsible parties. Its results reveal primarily the degree of diversity.
“To be honest, I’m still not sure what the EOSC is either. When I heard the term for the first time, I thought: well, I’m really curious what this national open science cloud is going to look like.” As long as research institutes, let alone member states, don’t have their open data-infrastructure in order, Van Rijn doesn’t see how the European structure is going to be realized on time.
Uniformity shouldn’t be the goal
Whether the current funds to get the EOSC up and running are sufficient remains unclear. The so-called EOSC-hub, a collaboration aimed at launching the project, has considerable financial means, but whether that is enough is also uncertain. “About 33 million euros was made available. That seems like a lot of money,” says director of the Finnish eScience center Damien Lecarpentier.
As manager of the EUDAT initiative, Lecarpentier is involved in the setting up of a pan-European data-infrastructure, and frequently runs into problems. “If we compare the scale of the project to the available funds, they are pretty disappointing.” At the very least, he notes, striving for uniformity in the storing and sharing of data would be undesirable.
“Integrating all existing services and data is simply impossible.” According to him, this should not be the goal. Peter Doorn (DANS) adds to this from both a Dutch and an international point of view. “There is so much diversity that it is impossible. It’s not just data, but also the fact that these are all different kinds of organizational models. You cannot put all of these together in one bag. I’m getting the impression more and more that the only reason we are still continuing is because the EOSC is simply ‘too big to fail’.”
The ‘long tail’ of science
In the end, all in attendance agree that the measure of the EOSC’s value should be whether individual researchers or research fields consider the project valuable. In that regard, the biggest challenge is going to be taking into account the so-called ‘long tail’ of science. This expression refers to both the limited number of research fields that consist of a large and homogeneous community, and the countless small research communities.
A comparable description can be used for datasets and the use of computing power within large infrastructures. About 80% of the capacity of French supercomputers goes towards one party: CERN. The other 20% is spread out over other areas, and the same goes for DANS. There are a few large datasets, and countless smaller ones.
The stinging reality is that several important domains of considerable size simply haven’t joined the EOSC yet. Medical and biomedical fields, for example, haven’t joined yet because of privacy concerns, and climatologists are perfectly content with the existing infrastructure as is. “If we have to choose, we should go for the horizontal approach,” Van Rijn says, referring to collaboration within disciplines. “I think that this is the only useful approach because it promises, at the very least, to be more efficient and sustainable.”