How does the digital paradigm, paired with the emerging concept of Open Science, influence the sharing of academic research results?
And what does it mean for reproducibility and reusability in scientific research? A quick overview, with a focus on the current situation in Data Science.
A highly digitalized world at the origin of Open Science
Within the past few years, the notion of Open Science has become the subject of numerous debates and articles. The aim is to share hypotheses, protocols and results and to submit them to critical analysis so as to achieve continuous improvement. Open Science promotes a scientific approach built on practices that use the internet, collaborative tools and the “social” web.
The image of the isolated scientist pursuing his work in his workshop or laboratory, then publishing a paper whose conclusions are often difficult to verify, now belongs to the past.
Society’s digitalization has significantly changed the way research results are shared, whether in an academic or an industrial setting.
Science has moved from publishing detailed conclusions (first on paper, then electronically) to publishing the associated data, then the code and, in the near future, an interconnected “internet of things”.
For many researchers, Open Science implies that everyone should be able to validate a published experiment, and therefore that scientific publications should and will evolve towards a complex, interconnected ecosystem. This step requires the development of digital solutions and innovative computing tools.
Data Science does not escape this movement and even tends to speed it up. For instance, the sensitivity of certain data from Medicine or Finance calls for access control, data protection and increased traceability, which is only possible when data storage, analysis and interpretation all take place in the same environment. Such an environment would also allow research results to be validated by substituting one data set for another. Likewise, we could envisage replacing part of the research code so as to compare results and ultimately validate, invalidate or improve the conclusions of other scientists. This open and collaborative approach thus implies the ability to reproduce the experiments conducted by others within a system that ensures optimal traceability, a feat still difficult to achieve today.
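The idea of swapping a data set or a piece of code while keeping track of what was run can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names run_experiment, data_fingerprint and code_fingerprint are hypothetical, the data are toy lists, and hashing a function's bytecode merely stands in for what a real platform would record (for example a version-control commit identifier).

```python
import hashlib
import json
from statistics import mean, median


def fingerprint(payload: bytes) -> str:
    """Return a short SHA-256 digest used as an identifier."""
    return hashlib.sha256(payload).hexdigest()[:12]


def data_fingerprint(dataset) -> str:
    """Identify a JSON-serialisable data set by its content."""
    return fingerprint(json.dumps(dataset, sort_keys=True).encode("utf-8"))


def code_fingerprint(func) -> str:
    """Identify an analysis function by its compiled form.

    Illustrative stand-in only: bytecode differs between Python versions,
    so a real system would record something stable such as a commit hash.
    """
    code = func.__code__
    return fingerprint(
        code.co_code
        + repr(code.co_names).encode("utf-8")
        + repr(code.co_consts).encode("utf-8")
    )


def run_experiment(analysis, dataset) -> dict:
    """Run the analysis and attach a provenance record to the result."""
    return {
        "result": analysis(dataset),
        "data_id": data_fingerprint(dataset),
        "code_id": code_fingerprint(analysis),
    }


def mean_analysis(values):
    return mean(values)


def median_analysis(values):
    return median(values)


# Original experiment, then the same code on a substitute data set,
# then different code on the original data set.
original = run_experiment(mean_analysis, [1.0, 2.0, 3.0])
other_data = run_experiment(mean_analysis, [2.0, 3.0, 4.0])
other_code = run_experiment(median_analysis, [1.0, 2.0, 3.0])

for record in (original, other_data, other_code):
    print(record)
```

Comparing the data_id and code_id fields of two records shows at a glance whether a differing result comes from the data, from the code, or from both, which is the kind of traceability the paragraph above describes.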
Future developments will keep data scientists busy for a few years!
Olivier Verscheure, Executive Director, Swiss Data Science Center