By: Neeshe Khan
“Science should be
open and not closed behind lab doors” was the concluding sentence of
Geoffrey Boulton’s talk “Open data and the
future of science”, which received resounding applause.
The talk began with the correspondence of Henry Oldenburg, the
first Secretary of the Royal Society, who exchanged letters with scientists
discussing the quality of manuscripts prior to publishing, the very beginning of
peer review. But perhaps more importantly, one of the requirements for
publication was that the concept being proposed by the scientist was to be
published with the data. This open communication of data between scientists and
the public not only revolutionized science at the time but formed a basis of
scientific progress ever since.
Currently, sharing large data files accompanying articles
can be problematic, which in turn can result in a lack of replicability and
credibility of the concept being proposed. However, it is fundamental that a
published concept must be supported by and printed alongside its metadata for
science to progress, even if it is by way of disproving. In the words of
Charles Darwin when referring to disproving a concept, “…one path towards error
is closed and the road to truth is often at the same time opened.”
The sharing of metadata is also crucial for us to be able
to link data in an intelligent way that supports an in-depth understanding.
With advances in technology and access to data, we are able to solve progressively
more complex problems and produce solutions. We are also able to make increasingly
accurate predictions (for example, weather forecasts, which are then re-evaluated
against reality to increase the accuracy of future predictions). Thanks to
these technological advances, the data that is gathered is ever more complex,
sophisticated, and factual. Sharing this data will allow science to move from
“simplicity” to complexity and from uncoupled systems to highly coupled systems
with iterative integration.
This leads to the bigger question of how to extract
meaningful knowledge and information from the “Big Data” that is collated to
seize the opportunities above, as opposed to the ethos of sharing the data
itself. Bearing in mind that deductions from Big Data would make a lot of conventional
analytics invalid, for Big Data to be effectively exploited it is imperative to
move beyond the current notions of openness and start with “intelligent
openness”. This means that the data, metadata and software must be
discoverable, accessible, understandable, assessable and reusable, and catered
to its respective audience (scientists, citizen scientists or the public)
whilst maintaining certain boundaries such as privacy, safety, security, dual
use and legitimate commercial interests. And in order to make sense of this
intricate, complicated data, imagine a black box that churns out visualizations
of a string of mathematical equations, for instance. This “black box” is then a
source of numerous difficult questions, such as “Who owns the black box?”, “What is
the human role?”, “Who has access to this box?”, “Can we analyse and scrutinize
what is in the black box?” and “What does it mean to be a researcher in a
data-intensive age?”
But how do we adopt this infrastructure of highly coupled
systems which are supported by iterative integration and intelligent openness? Historically
this used to be under the remit of the library but recently adaptability has
been driven by the changing technology (and thus the evolving job roles). There
are now also many organizations and institutes that assist
with the library’s efforts to collect, organize, and preserve knowledge, making
it accessible and disseminating it to the wider community (for example the efforts of
The Royal Society). However, the responsibility for facilitating this
infrastructure lies with a range of groups, from scientists, universities,
funders of research, publishers and learned societies to the government and the EU.
Currently, although science is universal, it is carried out within a
jurisdiction. For this infrastructure, science needs to transcend borders,
supported by a shift in scientists’ thinking towards sharing data and
information to achieve and share scientific progress. In essence, science has
been and will be the driver for societies to develop and progress and thus
science now, more than ever before, needs to be open and not closed behind lab
doors.