Thursday, 9 April 2015

Open data and the future of science

Speaker: Geoffrey Boulton
By: Neeshe Khan

“Science should be open and not closed behind lab doors” was the concluding sentence of Geoffrey Boulton’s talk, Open data and the future of science, which received resounding applause.

The talk began with the correspondence of Henry Oldenburg, the first Secretary of the Royal Society, who exchanged letters with scientists discussing the quality of manuscripts prior to publication: the very beginning of peer review. But perhaps more importantly, one of the requirements for publication was that the concept proposed by the scientist be published together with the data. This open communication of data between scientists and the public not only revolutionized science at the time but has formed a basis for scientific progress ever since.

Currently, sharing the large data files that accompany articles can be problematic, which in turn can undermine the replicability and credibility of the concept being proposed. However, for science to progress, even if by way of disproof, a published concept must be supported by, and published alongside, its metadata. In the words of Charles Darwin, referring to the disproving of a concept, “…one path towards error is closed and the road to truth is often at the same time opened.”

The sharing of metadata is also crucial if we are to link data in an intelligent way that supports in-depth understanding. With advances in technology and access to data, we are able to solve progressively more complex problems and produce solutions. We are also able to make increasingly accurate predictions; weather forecasts, for example, are re-evaluated against what actually happened in order to improve the accuracy of future predictions. Thanks to these technological advances, the data being gathered is ever more complex, sophisticated, and factual. Sharing this data will allow science to move from “simplicity” to complexity, and from uncoupled systems to highly coupled systems with iterative integration.

This leads to a bigger question than the ethos of sharing the data itself: how to extract meaningful knowledge and information from the “Big Data” that is collated, so as to seize the opportunities above. Bearing in mind that deductions from Big Data would make a lot of conventional analytics invalid, exploiting Big Data effectively requires moving beyond current notions of openness and starting with “intelligent openness”. This means that the data, metadata and software must be discoverable, accessible, understandable, assessable and reusable, and tailored to their respective audience (scientists, citizen scientists or the public), whilst respecting certain boundaries such as privacy, safety, security, dual use and legitimate commercial interests. And to make sense of this intricate, complicated data, imagine a black box that churns out visualizations of, for instance, a string of mathematical equations. This “black box” is then a source of numerous difficult questions, such as “Who owns the black box?”, “What is the human role?”, “Who has access to this box?”, “Can we analyse and scrutinize what is in the black box?” and “What does it mean to be a researcher in a data-intensive age?”


But how do we adopt this infrastructure of highly coupled systems, supported by iterative integration and intelligent openness? Historically this was the remit of the library, but more recently adaptation has been driven by changing technology (and thus by evolving job roles). There are now also many organizations and institutes that work alongside the library and assist its efforts to collect, organize and preserve knowledge, making it accessible and disseminating it to the wider community (for example, the efforts of The Royal Society). However, the responsibility for facilitating this infrastructure lies with a range of groups: scientists, universities, funders of research, publishers, learned societies, and the government and the EU. Currently, although science is universal, it is carried out within a jurisdiction. For this infrastructure to work, science needs to transcend borders, supported by a shift in scientists’ thinking towards sharing data and information in order to achieve and share scientific progress. In essence, science has been and will be the driver for societies to develop and progress, and thus science now, more than ever before, needs to be open and not closed behind lab doors.
