Friday, 30 March 2012

The future of the eTextbook

Breakout Session: Sara Killingworth, Maverick Outsource Services

Sara’s session discussed the market situation, development and possible future for eTextbooks. The data focuses on faculties and user behaviour.

Market transition from print to e:

eTextbooks are the last of the ebook category to be opened up. Sara pointed out that while they have been evolving through various mediums, they are still essentially in their original formatting.
  • The market value of eTextbooks in 2008 was $1.5bn and is expected to rise to $4.1bn by 2013.
  • In 2010 there were 19.5 million ereaders sold and 18 million tablets (15 million were iPads). This is expected to rise to 150 million ereaders and 100 million tablets in 2013.
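
To put the market figures above in perspective, the implied yearly growth rate can be worked out with the standard compound annual growth rate formula. This is only a rough sketch: the function name is an illustrative choice, and it assumes the two dollar figures are directly comparable and that growth is spread evenly over the five years. On those assumptions the rise from $1.5bn to $4.1bn works out at roughly 22% a year.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Return the constant yearly growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the session: $1.5bn in 2008, expected to reach $4.1bn by 2013.
etextbook_cagr = compound_annual_growth_rate(1.5, 4.1, 2013 - 2008)
print(f"Implied eTextbook market growth: {etextbook_cagr:.1%} per year")
```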

Even so, print textbook sales are still growing and students still prefer printed textbooks for their look, feel, permanence and resale value. Second-hand print books are still cheaper, which better suits a student budget. There are also book rental options where a license can be bought on a chapter-by-chapter basis.

Subject area will affect the need for a permanent reference copy. For example, medical students will want a reference copy of a textbook they can consult all through their studies and then as they progress on to a career, whereas engineering students will find electronic textbooks better for gathering the latest data.

Faculties tend to choose relevant content over format and there is still a lack of titles in e-form. It was commented at this stage that there has been reluctance from faculties to allow students to use tablets rather than print books as they cannot tell in lectures whether the student is working or on Facebook. Despite this, data suggests the market is set to explode.

JISC usage study findings:
  • 65% of users use ebooks to support work/study
  • 50%+ access them through the library
  • Use of eTextbooks is linked to teaching/assessment
  • Flexibility and convenience of ebooks is valued
  • Use is hindered by platform limitations such as printing/downloading and access speeds

Basic Requirements of eTextbooks:
  • Access across all platforms and operating systems
  • Ability to personalise with notations and highlighting
  • Inclusion of self assessment tools
  • Inclusion of support materials from lectures
  • Links to real time data
  • Online tutorials
  • Video/audio to liven text

Development of eTextbooks
The JISC Observatory project showed ebooks are mostly used for quick fact finding, whereas printed books are preferable for extended reading. This type of usage suggests an expectation of a lower price point for ebooks. It was found that there wasn’t a considerable impact on print textbooks throughout the trial.

Benefits of eTextbooks:
  • Ability for them to ease bottlenecks in libraries when print items are on loan, particularly as there is increased usage of mobile devices amongst students.
  • The interactive tools can increase student engagement and improve learning outcomes. eTextbooks can also be broken up into chapters and added into course packs along with videos, articles and audio appropriate to the subject.
  • The online environment also offers the ability to collect usage statistics and faculties can see whether students are using non-recommended texts.
  • They could address students’ use of Wikipedia/Google if developed in line with user behaviours and expectations, but with the added benefit of the information coming from companies of professionals.
  • Tablets are also beginning to emerge as alternative access devices to laptops as their prices are driven down and they better suit the ubiquitous lifestyles of students.

Apple iBooks
Sara mentioned iBooks, which are eTextbooks designed specifically for iPads and other Apple devices and which feature materials from large publishers such as Pearson and McGraw-Hill. There is also an option to create PDF versions for other devices. Apple are looking at selling preloaded iPads to schools in the US. There was a general feeling that this was a marketing opportunity to sell iPads, and it was thought others would release competing products.

Pearson Foundation Study:
The study showed tablet ownership had trebled for college students in the last 3 years with 70% of students reading digital text and 75% using tablets for daily learning. It is believed that eTextbooks will replace print within 5 years.

The Future?
Sara finished by saying it was an evolutionary process and the speed of adoption was likely to depend on the subject area. Ease of access and use would also feature heavily.
There are different business models and it is still uncertain which one will be most popular. These include individual purchase by students, fees that include course materials, patron-driven acquisition (PDA), or the whole library budget being absorbed by digital materials.
Sara stated we are most likely going to live in a hybrid world for the foreseeable future.

Some comments from the audience at the end of the session:
  • Librarians are keen to buy eTextbooks for their students but institutional packages set forward by the publishers are felt to be unrealistic, particularly as they are then restricted by DRM issues.
  • DRM is a big problem, particularly as students will use an ebook to scan chapters/TOC to see if they want to read the whole item and then want to print the bits they are interested in.
  • Students are still reluctant to use purely e over print and not everyone has a tablet yet. Ebooks on smart phones are not ideal.
  • There is a demand for eTextbooks but they are not being delivered.
  • Whilst the individual prices of ebooks may have gone down, the institutional prices are still very high.
  • Librarians will look at smaller publishers who are willing to offer more competitive prices over the larger companies.
  • There is a want for perpetual access to books.

Thursday, 29 March 2012

“I wouldn’t start from here” Overcoming barriers to accessing online content in libraries

Breakout Session 1: Dave Pattern, University of Huddersfield

This breakout session discussed the issues users have when trying to access electronic resources and why we should be making it as easy as possible to access information.

Dave had used Twitter to ask for feedback on what the one thing would be that people would improve about e-resources if they had a magic wand. Responses were:

  • Authentication
  • Ease of Access
  • Discoverability
  • Affordability
  • No DRM
  • Licensing

Conspiracy Theories:

Before going into these points a little further he discussed some conspiracy theories about libraries:

  • MARC 21: Why is there still the punctuation? Is it so cataloguers can print off perfect cataloguing cards? What are they really up to?

  • Why are librarians trying to turn users into mini-librarians and bombarding them with library terminology? We should be aware that users will use the path of least resistance, the easiest way from point A to point B, for example Wikipedia and Google.
    As an example he discussed helping students and troubleshooting issues they had getting into resources. This showed a user following an almost never-ending chain of links and password logins (some not so obvious) before finally being turned away from the article they wanted to use. The user then tried Google, searching for the article title, and found the first result to be an open access PDF. Why would users want to go through all those complicated steps when the information they want could have been found so much more easily? This led on to the last conspiracy theory:

  • We don’t want our users to be able to access our e-resources!?  There appear to be multiple barriers to gaining access to resources and this all works against Ranganathan’s 4th law of “Save the time of the reader”. Seamless access to resources is possible when everything works as it should so we need to simplify the process as much as possible for the user.

Discoverability Tools:

He then discussed discoverability tools and proxy server authentication and the impact they had had on e-resource usage. At the University of Huddersfield students are being directed to Summon as a first port of call and stats showed that full text download numbers increased suddenly with the use of a discovery tool.
Data they had gathered also showed that full text COUNTER statistics shot up after a publisher became indexed on Summon and that there was a decline in usage for those that were not indexed. There was also a decline in the use of platforms with OpenURL issues.

These statistics can of course have a significant impact once it comes to renewals so could be used as ammunition to get publishers to work together with discovery services (in this case Summon).

He then discussed serendipity in the library using recommendations like “people who borrowed this item also borrowed…” Adding these messages showed a wider use of library stock.

Library Impact Data Project:

This project, run in 2011, aimed to prove the value that libraries give to students and to prove a correlation between library usage and academic success or failure.
Usage data was taken from eight UK universities and a strong correlation was found between good grades and the number of Athens logins, the total number of downloads and average total number of resources.
However, coming to a library PC was not necessarily as productive.
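
As a rough illustration of what "strong correlation" means here, the sketch below computes a Pearson correlation coefficient between download counts and final grades. The student figures are invented for the example and are not taken from the project, and the session did not say which statistical measure the project actually used; Pearson's r is simply one common choice.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Return Pearson's r for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of deviations from each mean.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented data for six hypothetical students (not from the project).
downloads = [5, 20, 35, 50, 80, 120]
final_grade = [48, 52, 58, 61, 66, 72]

r = pearson_correlation(downloads, final_grade)
print(f"Correlation between downloads and grades: {r:.2f}")
```

A value close to 1 means the two measures rise together. It does not show that one causes the other.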

A study by Manchester Metropolitan University shows there is a possibility that students who use the VLE late at night are more likely to be struggling and to drop out. It also appears that students who use the library between 9am and 11am are most likely to be the highest achievers.

In Summary:

  • Save the time of the user
  • Make accessing e-resources as easy as searching Google
  • Information literacy is important but goes against the path of least resistance
  • E-resource usage is linked to attainment
  • Publishers need to make content available to discovery services
  • Build e-resources with serendipity

"I'd like to thank" linklist

I wanted to post a big public thank you to our blogging team, who have managed to capture so much of the conference, so quickly, despite the challenges of spotty wifi, lack of sleep and the temptation to get offline and into the sunshine. We've had over 500 visitors on this blog in the last few days, and no doubt many more reading via RSS. It's great to know so many people are benefiting from the hard work put in by the bloggers; do look out for a few final posts appearing in the next few days.

It was also exciting to see our Twitter stream so densely populated by faces old and new. You can view an archive of tweets here (big thanks to @mhawksey and @chriskeene for this genius bit of Tweet gathering and analysis - there's also an alternative archive with some interesting content analysis here). Thanks to everyone who participated in this way - the backchannel discussions were a fascinating mix of additional perspectives and good-natured banter.

  • Conference photos will appear in due course here, thanks to the peerless @SimonPhotos - @daveyp has also put a few on Flickr while @arendjk has some lovely timelapses on YouTube
  • There are already lots of presentations online here
  • Videos of the plenary sessions will arrive here
  • @archelina has written a great summary of the whole conference here and @AnnMichael has managed to capture the Wednesday morning debate here
Thanks again to all involved - bloggers, tweeters, speakers, sponsors, delegates, exhibitors, volunteers, committee members, staff and especially to the SECC wifi team .. oh, wait. Maybe not ;-) Hope to see many of you at our one-day conference, "Rethinking Collections: approaches, business models, experiences" - put it in your diaries now! (15th November, London). Meanwhile, don't forget to enter our photo competition now that you've got your hands on our new logo!

Use and abuse of analytics in the search for value

Grace Baynes, Nature Publishing Group

Grace took the breakout group through the complex issue of value: what does it mean and how do we measure it?

If we define value as relative worth, utility or importance, we have a dizzying number of things that we can measure, from absolute to relative, quantitative to qualitative. With the advances in technology, we can measure more than ever before, but being able to measure every single thing is not necessarily a good thing.  

Using the Gartner Hype Cycle’s different stages of enthusiasm and disappointment triggered by new technologies, we can see we are aiming toward the Plateau of Productivity with a lot of these new metrics, but we might have some way to go with some.

How do we pick through all the available information meaningfully?  Her advice is straightforward: think about what you really need to know, the questions that you need to ask, and then think about how exactly you want to go about discovering the answers.

The underlying areas of value measurement are still the journal, the article, grant, research output, the researcher, as well as the Eigenfactor and Article Influence score, but how reliable are all these metrics?

Usage continues to help us understand the uptake of holdings and inform decisions, and initiatives like COUNTER help benchmark and break down the information so that librarians can compare use across multiple publishers for their institutions.

The Journal Usage Factor allows the evaluation of journals and fields not covered by ISI as well as permitting the measurement of journals that have high undergraduate or practitioner use, although Grace noted that if you compare JUF with IF, you see very little correlation.

Cost Per Download should help us understand what is good and what is not so useful, but is there an absolute value to a download?  Is the download you cited more valuable than the one that you just read?  Recent research carried out by Nature Publishing Group shows that Cost per Local Citation might move us closer to evaluating the real impact in research, as might Cost per Local Authorship.
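
The three cost measures mentioned here all follow the same pattern: annual cost divided by a count. The sketch below shows that arithmetic for a single hypothetical journal. The figures, the dictionary keys and the function name are assumptions made for illustration, and the exact definitions used in the Nature Publishing Group research were not given in the session.

```python
def cost_per_unit(annual_cost, count):
    """Return annual_cost divided by count, or None when there is nothing to divide by."""
    if count == 0:
        return None
    return annual_cost / count


# Hypothetical subscription: all numbers are invented for this example.
subscription = {
    "annual_cost": 2400.0,
    "downloads": 1200,          # full text downloads at the institution
    "local_citations": 16,      # citations to the journal by the institution's authors
    "local_authorships": 4,     # articles in the journal written by the institution's authors
}

cost_per_download = cost_per_unit(subscription["annual_cost"], subscription["downloads"])
cost_per_local_citation = cost_per_unit(subscription["annual_cost"], subscription["local_citations"])
cost_per_local_authorship = cost_per_unit(subscription["annual_cost"], subscription["local_authorships"])

print(cost_per_download, cost_per_local_citation, cost_per_local_authorship)
```

The same journal can look cheap on one measure and expensive on another, which is one reason a single figure quoted in isolation can mislead.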

And what about the Eigenfactor, Peer Evaluation or altmetrics including tweets, likes and shares?

It is a bewildering task to try to measure all this data, and while products from Symplectic or Elsevier’s SciVal can help gather and measure critical information, we have to think about what are the most important factors for decision making.

Grace then opened the floor to consider which metrics are important for the participants:
Common themes 
  • Information needed depends on who you are talking to and what is most meaningful, or will help academics keep the resource they need.
  • CPD is still important to feed back to the finance department, and some institutions use CPD versus Document Supply or ILL to get an idea of value for money.
  • Some institutions don’t look into great detail, gathering data on the bundle or platform, rather than individual titles.  This is usually done to convince funders that the content being bought is useful. 
  • Others have to go into detail to identify the best titles for the institution. This is due to funding restrictions.
  • CPD isn’t always accurate, as cheap journals aren’t necessarily good value for money, even if on the surface they look good.
  • Usage stats are helpful at a local level when deciding to buy or cancel, but download levels and cost per download vary from discipline to discipline.
  • Local citations and actual use may be more helpful to understanding value, but this is very time consuming.
  • There’s a big call for being able to access denial data, to understand patron demand, but up until recently one had to ask publishers - difficult if you don’t have a relationship with the publisher. The next release of COUNTER will include denials.
Grace ended this highly interactive session with a caveat: we can’t quote metrics in isolation, we need to contextualize. We must present metrics responsibly.

Wednesday, 28 March 2012

Fail fast and frequently

The role of games, play and not being afraid to take risks was the order of the hour in this breakout session led by Ruth Wells.

Innovation is one of those things that can be surrounded by management speak and a feeling of something I should be doing, but I don't know where to start. Or maybe that's just me?

To start the session Ruth led us on a journey through what innovation is and how it comes about. We discussed the role of games and play: innovation is chiefly the meeting point between insight and invention, and one of the ways to reach this meeting point is to be free to play.

How much time do you get in your working day to play?

We discussed this in small groups and it was clear that the "doorway discussion" was quite important in publisher working environments, but for others who worked more remotely, there was quiet contemplation time, but less option for collaboration. In other organisations there was little or no time for this sort of play, unless it was taken out of personal time such as lunch breaks.

We then watched a video from Steven Johnson about the creation of good ideas. This introduced the concept of the slow hunch. The very best innovations are cumulative ideas that evolve over long periods of time, and during this time ideas are thrown out, reworked, refined and incubated until the innovation is born.

Hunches cannot progress in a vacuum; they are usually part formed and need collisions in order to fuse into ideas. The great driver of scientific progress has been collaboration, and the internet, mobile devices and the increasing sociability of the world around us offer many new ways to connect with people who have that missing hunch we are looking for. Chance favours the connected mind.

The group then talked about how chance can be enabled within our organisations, including creating the right spaces and dedicated time for people to come together. Much like the doorway collaboration, a coffee area can provide inter-team discussion and spark new innovations by providing a fresh perspective on problems.

Then we discussed company culture of allowing play and discussion and the drivers to this sort of experience:
  • a concise company mission
  • an understanding of organisational values
  • the strategic goals agreed and aligned
  • clear business objectives articulated
  • an understanding of the need for project planning and resources
  • buy-in from organisational leaders
It is not enough to say "Go Innovate!"; the culture must come from the top, and be accepted by everyone from the employee to the CEO.

We then talked about workshops as a means of achieving the culture and collaboration. One group suggested that a sort of speed dating for innovation, or as I thought a musical chairs scenario, could work very well to mix up ideas between different employees from different departments.

It was explained that capturing the results of workshops and closing each idea that was opened, no matter how off topic, was as important as the process of idea generation itself. The ideas that were left after this closing process need to be followed up and acted upon.

As a summary of how to enable this kind of culture, Ruth gave us the key points for leadership on innovation:
  1. Encouragement
  2. Leading by example
  3. Create space for discussion
  4. Actively feedback on ideas
  5. Direct but do not control
  6. Accept the potential to succeed AND fail
  7. Provide resources and mechanisms to deliver ideas
I've highlighted point 6 as this was the major take-home message from the session for me. There is no point trying to create a culture of innovation if you cannot allow those innovations to fail. Pursuing ideas involves risk, and evaluating that risk is important in projects, but idea generation itself must be free of this risk assessment, lest it be curtailed by it. Ideas can be closed before the project stage if the risk is deemed to be too great.

In order to highlight the importance of failing we watched a snippet of this presentation from Tina Seelig from the Stanford Technology Ventures Program, entitled Fail Fast and Frequently, where she explains that if you are not failing sometimes then you are not taking enough risks. As long as you learn from failure then what you are doing is worthwhile.

After a short departure from talk about gameplay into an actual game, where we passed around bits of paper with ideas on about the function of a publisher rather than the form, the discussion moved onto ideas as a response to a problem without a solution.

Radical ideas can be like gambling and it makes sense for many organisations to not want to or not be able to gamble, therefore in closing out ideas it is important to have a common set of evaluation criteria.

These will help with the creation of a roadmap to move your ideas and innovations into projects. Put your ideas into a four-stage funnel:
  1. filter
  2. research more detail, consider the implications, lifetime costs
  3. develop
  4. provide ongoing support or abandon
Note that in step 4 there is still the possibility of an idea being closed. If at any point during the delivery process costs are expanding beyond the worth of the idea then it should be abandoned.

Finally, Ruth outlined some top tips for innovation in organisations:
  • define process and strategy first
  • define what innovation means to your organisation
  • do no harm, but don't be anti risk
  • prototyping can avoid technical ambiguity
  • look at innovation as a function of your whole business

Finding out what to cut, how far to go and getting users to champion the library in a healthcare setting

This plenary session given by Anne Murphy discussed the systematic approach taken by the library in Adelaide and Meath Hospital in Ireland, when facing cuts of 25% in 2011 and 15% the following year to their library budget.

It addressed two key questions: when every journal is seen as essential, what markers of true value can you assign? And how can you get your users to accept cuts within their department, or more precisely, how can you keep cuts fair and not lose the engagement of library patrons?

The first point discussed was how openness surrounding planned cuts was important to retain the library champions within the hospital. Working against this were communication channels that were not always as obvious as they should be. For example, use of the hospital email system is poor, so users had to be contacted by mail to ensure good coverage. It is no good trying to be open if you cannot reach the people you need to tell.

The second point was that the project was not just about balancing the books (or journals). The library saw the cuts as an opportunity to promote themselves with a use it or lose it message and to build credibility within the hospital's senior management.

How did they go about deciding what to cut? A three-pronged approach combined to give a rounded picture of a resource's value:
1. Cost per use
2. Responses from departments about value
3. The librarian's knowledge (e.g. is it a no-brainer keep or a small niche journal)

The library also tried to adhere to a few ground rules that attempted to retain the balance of the collection, such as one journal cut in each department. All of this information and the ground rules were then used to assign each journal a category through a 3-stage process:
1. The no-brainers
2. The very expensive or low download journals
3. Department or specialty cut

After stage three the budget was totalled and they were still not at a 25% reduction, so a fourth stage was introduced:
4. Larger departments who have more than 1 journal

After one last "sanity check" evaluation, 73 journals were finalised for cancellation, one quarter of their total collection.
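
The sketch below shows one way the three inputs (cost per use, department feedback and the librarian's judgement) could be combined into a cancellation list. It is a deliberately simplified ranking, not the hospital's actual staged procedure: the journal records, the 1 to 3 department rating scale, the `librarian_keep` flag and the function name are all assumptions made for the example.

```python
def choose_cancellations(journals, target_saving):
    """Pick journals to cancel, worst value first, until target_saving is reached."""

    def cost_per_use(journal):
        # A journal with no recorded use is treated as infinitely expensive per use.
        return journal["cost"] / journal["uses"] if journal["uses"] else float("inf")

    # The librarian's "no-brainer keep" titles are never candidates.
    candidates = [j for j in journals if not j["librarian_keep"]]
    # Lowest department rating first, then highest cost per use.
    candidates.sort(key=lambda j: (j["dept_rating"], -cost_per_use(j)))

    cancelled, saved = [], 0.0
    for journal in candidates:
        if saved >= target_saving:
            break
        cancelled.append(journal["title"])
        saved += journal["cost"]
    return cancelled, saved


# Invented collection: dept_rating runs from 1 (low value) to 3 (essential).
journals = [
    {"title": "Journal A", "cost": 1000.0, "uses": 500, "dept_rating": 3, "librarian_keep": True},
    {"title": "Journal B", "cost": 800.0, "uses": 20, "dept_rating": 1, "librarian_keep": False},
    {"title": "Journal C", "cost": 1200.0, "uses": 0, "dept_rating": 1, "librarian_keep": False},
    {"title": "Journal D", "cost": 1000.0, "uses": 250, "dept_rating": 2, "librarian_keep": False},
]

total_budget = sum(j["cost"] for j in journals)
cancelled, saved = choose_cancellations(journals, 0.25 * total_budget)
print(cancelled, saved)
```

A fuller version would also need the ground rules described above, such as spreading cuts across departments, which this ranking ignores.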

The library published a report on the process and marked all cut journals in red, retained ones in green, to make the whole process transparent and again reinforce the idea that usage is important in retention decisions.

The comments from users on feedback forms demonstrated that expectations, throughout the project, were successfully managed and the library did not suffer any disengagement despite the large percentage cuts.

In 2012, a cut of 15% in budget was proposed. The library underwent the same process with usage analysis and mailing out questionnaires, except this time they asked users to nominate one title for cancellation.

The comments were overwhelmingly negative and demonstrated that users felt that the cuts had gone too far. Overall the process was much more difficult. The library expects a full report to be published to the hospital staff soon, and hopes this will help galvanise users to support the library against further cuts.

The next step is a survey about content discovery and literature use, and there is a possibility of documents on demand in the future, depending on the outcome of this surveying process.

Debate: The future for scholarly journals: slow evolution, rapid transformation – or redundancy?

Plenary Session 5
The first session of the last day at UKSG took the form of a debate between Cameron Neylon and Michael Mabe on the future for scholarly journals. There was an impressive turnout despite the 9am start and the Ceilidh the night before!

The transformation is already here - it's just unevenly distributed
Cameron Neylon

First presentation in the debate, arguing that the transformation is already here.

"Large institutions seek to preserve the problem to which they are the solution" - Clay Shirky

What do we mean by a journal?
Traditionally we have thought of a journal as having the following characteristics:
  • Journals contain articles
  • There is a process to select articles for inclusion
  • There is a publisher who manages the process
  • A journal will only belong to one publisher and a single article will only belong to one journal
  • There is a single version of record
How is technology changing the look of the 'journal'?
Neylon argued that new tools are changing how content is made available and that this should challenge our view of what a journal is.

He gave 2 examples of this:

WordPress - WordPress, the free blogging software, now supports many journal-type publications. The service is free and will only continue to get better. The software gives anyone the ability to put together a 'journal' in 10 minutes, and lots of free plugins are available to add functionality such as commenting or PubMed IDs for citations. Some examples of journals on the WordPress platform include PLoS Currents: Disasters and the Journal of Conservation and Museum Studies.

Figshare - A site for sharing figures and metadata. It doesn't fit the traditional idea of a journal, but provides very useful information to researchers. For Neylon this raises questions: what is the smallest useful piece of research? Why are we still tied to the idea of a journal article?

Do we need journals any more?
Neylon argues that you can get answers from Google that direct you into trusted databases. When looking for an answer to a specific question to progress an experiment he did a comparison between Googling for the answer or looking at scholarly articles. After spending 6 hours collating the information from the articles he had what he needed, yet he had the answer in minutes from Google. Neylon said he would choose a database or Wikipedia over a journal article when looking for answers to specific questions as it is just so much quicker.

"The research literature just has a poor user interface" Greg Gordon - SSRN

Neylon gave the examples of stackoverflow and mathoverflow as great forums for finding answers.

Why do researchers still write articles?
Researchers are attached to journals, but why? Neylon argued this was more about prestige and wanting to feel like their research is important; collecting notches on their bed posts! When reading, they hate people who write articles, yet they keep writing articles because they need them for advancement. This is just not sustainable.

What will the future look like?
Neylon argues that someone somewhere is going to figure out the best way to make the user interface work; if publishers don't look to do this, another player will. Once we stop presenting and consuming articles, people will stop writing them.

He suggested that the future would involve publication in smaller pieces (think Lego heads), which might then be built into larger things (think Meccano cars), with different pieces being put together to create exactly what the user wants, delivered differently to different audiences.

So why hasn't the journal changed more as a result of the internet?
Michael Mabe

Second presentation in the debate, arguing that the fundamental appearance of journal articles has remained, and will remain, remarkably unchanged.

Why hasn't the journal changed more?
Mabe argued that he wasn't defending a non-technical status quo, but that even though cross-referencing and linking between content are now the norm, the fundamentals seem unchanged.

Digital Incunabula argument
Mabe argued that the real revolution was not in the introduction of printing, but in the idea of the book with the introduction of the Codex; splitting up a long scroll into pages. The structure of a book is deeply embedded in human culture and 2 millennia of habit and utility are going to take some undoing.

Darwinian Angle
Mabe argued that researcher behaviour is key to understanding why the journal still exists, with a researcher having 2 very different modes: author mode and reader mode.

Author mode:
  • To be seen to report an idea first
  • To feel secure in communicating that idea
  • To have their claim accepted by peers
  • To report their idea to the right audience
  • To get recognition for their idea
  • To have a permanent public record

Reader mode:
  • To identify relevant content
  • To select based on trust and authority
  • To locate and consume it
  • To cite it
  • To be sure it is final and permanent
Functions of the journal a la Oldenburg
Henry Oldenburg outlined the key roles for academic journal publishing as registration, certification, archiving, dissemination, navigation. These roles are still seen today.

Generational Change?
Mabe argued that we are confusing the mass market with the scholarly market. How people act in professional life is different to how they act in private life. Researchers are still required to publish their work and young researchers are actually more conservative than their older peers as they need to make their name. There are NEW tools but they serve OLD purposes - technology just enables greater efficiency. The system has evolved to satisfy needs, the human needs of researchers; until these change the scholarly article will remain the same. If an asteroid hit tomorrow and we rebuilt from scratch, we would be likely to create something very similar.

The debate

After both Neylon and Mabe had presented their arguments the session was opened up to debate, a summary of which is included below.

Neylon: The disagreement is on what the important pressures will be. The asteroid that could make us re-think things will be the public looking at what we are doing and saying it is not up to scratch. Researchers are very conservative, but there will be pressure to change.

Mabe: It is a case of publish or perish, but it is not just about career progression. Authors from industry, rather than in academic institutions, are not being promoted for publishing, but are publishing because they want to be recognised as being the first person to think of an idea.

Gedye: We are getting little views through little windows, but what is going on in the room behind the window? Where are the PDFs in the examples shown? There are a lot of contradictions in what has been said.

Neylon: Loathe looking at PDFs, want to read on screen. Sense a shift, with people no longer printing out PDFs.

Mabe: Download figures still show a predominance of PDF usage. The form of the article is more about establishing trust and authority, not consumption. We only have so much time, so we want to read something we trust. There are 2 types of behaviour: information seeking and literature consumption.

Neylon: Most information is in text form in journal articles, people using tools that sit on top of those.

Audience: There seems to be a disconnect between what early adopters think is important and what the mass market wants.

Twitter: Isn't this discipline specific?

Neylon: Within Physical and Biological sciences there are smaller fragments that are useful. A lot more work to be done to understand the differences; boundary between smallest useful fragment and how this needs to be aggregated to be useful to different audiences. Likely to end up with different forms in different disciplines.

Mabe: There is a tendency to paper over differences in disciplines. It is the idea that really matters. The sciences are concerned about speed of publication; this is not such a concern for other disciplines. Who does registration, and do you trust them to register it? This is where trusted 3rd parties come in.

Twitter: Micro-publication will change behaviour and needs.

Mabe: Argument about reducing publication down to lower level such as paragraph, could become more of a network of links.

Neylon: Will see a change in what researchers do. Lower burden of publication and authoring; very expensive process, lots of things never get authored as too much work. But what can we do with the content?

Audience: Driving force is finding more satisfying way to meet needs - Can make better ideas when work together rather than alone.

Neylon: Stack Exchange model - Asking and answering questions, can up vote responses to build reputation, then get more control to down vote, remove comments etc. Managed by community, reputation is the key. Great place to find people with specific expertise. Registration and certification still very important. Works in specific domains.

Mabe: Moderation is very important. Community need to have confidence in something.

A very thought provoking and lively session, a great way to start the final day!

Tuesday, 27 March 2012

Mobilising your e-content for maximum impact

Breakout session 5 led by Ruth Jenkins (Loughborough University) and Alison McNab (De Montfort University).

The session kicked off with a brief overview of some of the mobile services currently offered. Most are based around issues and articles and browsing content and are publisher specific. This creates a number of issues.

  • The user has to know who publishes the journals they want to read (this also assumes they know what they want to read) and go and download the right App!
  • Users are publisher agnostic, they just want the stuff and don’t really care about who the publisher is
  • Apps are often designed for browsing - Issue to table of contents to article, whereas users want to search
  • No link with resource discovery systems such as Primo or Summon
  • No integration with reference management software
  • May not be available on all platforms - Device specific apps
  • Off campus access is often limited – not truly a mobile service

Positioning of the library

Mobile gives publishers opportunities to interact directly with end users; previously libraries were the gatekeepers and directed users to the content.

Publishers overestimate how much end users know their brand, certainly undergraduates and early years researchers don't know the publishers or in some cases the titles they should be focusing on. Libraries try and present everything they have access to, not publisher by publisher.

What are the challenges in mobilising your e-content?

At this point the post-it notes came out and the audience was asked to think about the challenges they face in mobilising e-content, both from the library and the publisher perspective.

Common issues for libraries:

  • No single place listing which publishers have mobile offering
  • How to make users aware of the mobile sites/apps available
  • How to integrate mobile optimised links in the library catalogue
  • Support for large number of interfaces - lack of standardisation. How do you test access problems on multiple devices? Budgets don't extend to purchasing all types of devices let alone ensure these are up to date
  • Connectivity issues. Not everyone has or can afford 3G and wireless can be unreliable
  • Sites try to replicate all of the desktop functionality, but is this what the users want?
  • Multiple authentication processes, hard to explain to users
  • Off campus authentication - in some institutions e.g. the Open University there is no campus or the student never comes onto campus
  • No way to search across apps
  • High student expectations
  • Licensing restrictions
Common issues for publishers:

  • Cost of development
  • Pace of technology change
  • Whether to create device specific apps
  • Providing user friendly tools to allow libraries and users to get the most out of mobile
  • What features to include

Kevin Ashley on the Curation of Digital Data

Curation is often thought of as a passive activity.  Once deposited, content is simply “preserved” into perpetuity.  This couldn’t be further from the truth, and Kevin Ashley, Director of the Digital Curation Centre, made the point deftly in his plenary talk at the end of day one of UKSG.  If there was a person who could keep a group in rapt attention after a long day of sessions, it certainly would be Kevin.

Much like the curation of publications, active curation of research data is critical to its good stewardship. Curation implies active management and dealing with change, particularly technological change related to electronic information.  Ashley made the point that while curation and preservation are linked, they are not synonymous activities.  Curation is both slightly more and slightly less than preservation.  Curation implies an active process of cutting, weeding and managing the content, and regularly deciding when things should be retired from the collection. Ashley also made the point that there are benefits to good preservation management.  It can generate increased impact, it can add a layer of accountability, and it can address some legal requirements.
The DCC Curation Life Cycle model during Ashley's presentation

Interestingly, within the UK, while most of the Research Councils place expectations on data management policies on the researcher, the Engineering and Physical Sciences Research Council (EPSRC) has begun putting the expectations onto the institutions, not on the PIs.  (NOTE CORRECTION: this applies only to EPSRC, not all RCs as originally noted.) In part the UK’s system of educational funding allows for this type of central control on institutions.  Each approach has its benefits, but from a curatorial perspective the institutional mandate focus will likely ensure a longer-term and more sustainable environment for preservation.  The current mandate in the UK is that data be securely preserved for a minimum of 10 years from the last use.  Realistically, this is a useful approach for determining what is most valuable.  If data are being re-used regularly, then curating them for the next 100 years or more is a good thing.  Any content creator would hope for that type of success in the long-term continued use of their data.

The other aspect of data curation is actually to support the data’s eventual use.  “Hidden data are wasted data,” Ashley proclaimed.  Again, it is important to reflect on why we are preserving this information; for use and reuse.  Which reinforces the need to actively encourage and manage digital data curation.

Particularly from a data sharing perspective, data are more than an add-on to the publication process, but they also pose some other challenges.  One example Ashley described is that “Data are often living”, by which he means that data can frequently be updated or added to, so the thing an institution is preserving is constantly changing.  This poses technical problems as well as issues with metadata creation and preservation.

There are several projects ongoing related to scientific data curation, use and reuse.  Those interested in more information should certainly look to some of the reports that the DCC has published on What is digital Curation?,  Persistent Identifiers, and Data Citation and Linking.  There is also a great deal of work being undertaken by DataCite and the Dryad project.  NISO and NFAIS are working on a project on how best to tie these supplemental materials to the articles to which they are related.  One question this project is addressing is who in the scholarly communications community should be responsible for curation of these digital objects.

One might well reflect on one of the quotes that Ashley began his presentation with:
 “The future belongs to companies and people that turn data into products”
-- Mike Loukides, O’Reilly. 
If this is really to be the case, ensuring those data are available for the long term will be a crucial element of that future.

Marshall Breeding on the future of web-scale library systems

Every information management business seems to be moving “to the cloud”.  Over the past few years, a variety of library software providers have been applying this model to a rapidly growing segment of the library community.  The technology that libraries use to manage their operations is undergoing significant change and transformation.  Marshall Breeding, Director for Innovative Technology and Research at Vanderbilt University Library, presented during the second plenary session on The Evolving Library.  Breeding’s talk “The web-scale library – a global approach” focused on the opportunities that this move could include.
Breeding's Slide of current LMS/ERMs

Breeding began with the observation that current library management systems are overly print-focused, siloed and suffer from a lack of interoperability. In addition, the online catalog – as a module of most ILS – is a bad interface for most of the resources that patrons are most interested in.  For example, an OPAC’s scope doesn’t include articles, book chapters, or digital objects.  Libraries don’t have the appropriate automation infrastructure, and while this creates significant challenges, it also presents an opportunity for libraries to rethink their entire technology stack related to resource management.

Moving library information from in-house servers to a cloud solution provides a variety of benefits and cost savings.  There are the obvious benefits, such as savings on hardware purchases, regular maintenance, power, and system updates and patches.  However, this is really not the core benefit of a cloud solution.  Breeding described simply having data hosted on the network as providing only the simplest and least interesting benefits.  He focused more on the potential benefits and efficiencies of having a single application instance and a cooperatively collected and curated data set.

Breeding's vision of how new Library Management Systems will be integrated
In Breeding’s view, the future consideration of which systems to select will not be based upon features or services.  All systems providers will end up concentrating on a similar set of services.  What will distinguish and differentiate products will be how open the systems are.  Of course, as many of these systems move increasingly to an integrated service suite, fewer libraries will want or need to patch some other service onto the system.  He also made an interesting note that since the launch of these new systems, the pace of implementations has skyrocketed.

Breeding covered a tremendous range in his talk, so one can’t be critical of what wasn’t included.  That said, here are some questions this move will elicit eventually: Who can claim ownership of data that is collectively gathered and curated?  What is specifically one institution’s versus another?  Once an institution moves into a web-scale system that is based on a collective knowledgebase, how might an institution transfer to a new provider and what data would be taken with them to a new provider?  A great deal of these issues will be the focus of many conversations and best practice developments over the coming years as libraries work to deal with these new systems.

Marshall tweets @mbreeding and blogs regularly on library technology issues at

Exploring the impact of article archiving

Julia Wallace, plenary 4 (repository reality)

The PEER project is a collaboration between all stakeholder groups in scientific education, investigating the impact of large-scale systematic article archiving. Its participating publishers include the large publishers, university presses, society publishers - between them contributing 241 journals, including top, middle and lower tier journals across four broad subject areas. Participating repositories are similarly broad. Articles were either deposited by publishers or self-archived by authors; publishers provided metadata for all their articles, whether or not they deposited the full text. Authors were invited to self-deposit via a project-specific interface. Over 53,000 manuscripts were submitted by publishers. 11,000 authors were invited to self-archive; 170 did so.

Challenges on the publishing side:
  • Publisher workflows - extracting manuscripts at an unusual point in the workflow required changes
  • File formats and metadata schemas varied and required normalisation
  • Journals contained many different types of content
  • Some metadata didn't exist at early stages of the workflow eg DOI (some publishers updated metadata after initial deposit)
  • Some repositories wanted additional metadata beyond core elements
Challenges on the repository side:
  • Varying metadata requirements and ingestion processes
  • Struggled with embargo management
  • Author authentication
  • Log file provision
Ongoing research:
Independent from the executive members, to avoid bias - managed by a Research Oversight Group. Author questionnaires, usage analysis, interviews. Behavioural research looked at the behaviour of authors and users, exploring perceptions of green open access and expectations / concerns around repositories:
  • Only a minority of researchers associated repositories with self-archiving
  • Preference for final version
  • Authors don't see self-archiving as their responsibility
The research has also explored processes and costs - eg salary cost of peer review is $250 per article plus overheads. No economies of scale. Production costs of up to $470 per article. Platform set-up and maintenance costs range from $170k to $400k [interesting data!]. Challenges of competing with established publisher platforms.

Main outputs from research: preliminary indicators show 5% migration from publisher platforms to repositories. Continuing to explore accuracy of that across the board, and trends. Registration is free and open for a review meeting in Brussels at the end of May.

Defining value: putting dumb numbers to work

Grace Baynes, Nature Publishing Group leading a group discussion on use (and abuse) of analytics, with a good mixed group of librarians, publishers and one precious researcher.

What do we mean by value?
We're focussing today on the *relative* worth, utility or importance of something - numbers by themselves aren't that valuable; we need a context. We can measure downloads, time spent, social media influence, but just because we can measure something doesn't mean it is helpful or valuable to do so. We need to become more refined in how we are applying metrics.

What do we want to know the value of?
We talk primarily about journal value, but article-level metrics are increasingly important, as is the value of an individual researcher.

What indicators can we use to measure value?
Impact factor, cost, return, usage, meeting demand (qualitatively assessed). Usage breaks down in a number of ways and combines with other data eg to calculate cost per use - but what represents good value? Does it vary from field to field? How do you incorporate value judgements about the nature of the usage? Nature doing some preliminary research with Thomson Reuters here, looking at local cost per citation (ie comparing usage within institution to citations of those articles by authors within those institutions), in comparison to competing journals (Science, Cell, PLOS Biology). Picked some of the leading institutions in the US, and also looked at numbers of authors, number of citations across key journals. Grace throwing out to librarians - is this interesting? Would this data be useful in evaluating your collections?
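For anyone who wants to try the two ratios mentioned here on their own collection, a minimal sketch follows. It assumes "cost per use" means annual subscription cost divided by full-text downloads, and "local cost per citation" means the same cost divided by citations made by the institution's own authors. The function names and all of the figures are made up for illustration; they are not from the Nature / Thomson Reuters research.

```python
def cost_per_use(annual_cost, downloads):
    """Annual subscription cost divided by full-text downloads in that year."""
    if downloads <= 0:
        return None  # no recorded use, so the ratio is undefined
    return annual_cost / downloads


def local_cost_per_citation(annual_cost, local_citations):
    """Annual cost divided by citations from the institution's own authors."""
    if local_citations <= 0:
        return None
    return annual_cost / local_citations


# Hypothetical figures for illustration only (not data for any real journal).
journals = {
    "Journal A": {"cost": 5000.0, "downloads": 20000, "local_citations": 250},
    "Journal B": {"cost": 1200.0, "downloads": 800, "local_citations": 4},
}

report = {
    name: {
        "cost_per_use": cost_per_use(j["cost"], j["downloads"]),
        "cost_per_citation": local_cost_per_citation(j["cost"], j["local_citations"]),
    }
    for name, j in journals.items()
}

for name, row in report.items():
    print(name, row)
```

In this invented example the more expensive journal comes out cheaper on both ratios, which is the kind of context the bare price cannot give. Neither number says anything about the nature of the usage, which is the open question raised above.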

Moved on to discuss the Eigenfactor - a Google PageRank for journals? Combining impact factor / citations and usage data in a complex algorithm.
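To give a feel for the "PageRank for journals" idea, here is a rough PageRank-style sketch over a tiny citation table. It is not the actual Eigenfactor algorithm: it uses citations only, the damping value and iteration count are arbitrary choices, and the journals and counts are invented. Self-citations are ignored in the sketch.

```python
def rank_journals(citations, damping=0.85, iterations=100):
    """PageRank-style scores; citations[a][b] = citations from journal a to b."""
    journals = list(citations)
    n = len(journals)
    score = {j: 1.0 / n for j in journals}
    for _ in range(iterations):
        # Every journal gets a small baseline share each round.
        new = {j: (1.0 - damping) / n for j in journals}
        for src in journals:
            # Ignore self-citations when passing weight on.
            out = {dst: c for dst, c in citations[src].items() if dst != src}
            total = sum(out.values())
            if total == 0:
                # A journal that cites nobody spreads its weight evenly.
                for dst in journals:
                    new[dst] += damping * score[src] / n
            else:
                for dst, c in out.items():
                    new[dst] += damping * score[src] * c / total
        score = new
    return score


# Hypothetical citation counts between three made-up journals.
citations = {
    "A": {"B": 10, "C": 1},
    "B": {"A": 2, "C": 1},
    "C": {"B": 5},
}

scores = rank_journals(citations)
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 3))
```

The point of this kind of ranking is that a citation from a highly ranked journal counts for more than one from a poorly ranked journal, rather than all citations being equal.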

Peer review - F1000's expert evaluations being turned into a ranking system.
What about truly social metrics, eg Altmetrics (explore free demo from Digital Science, looking at tweets, blogs, reddits etc)? Also ref Symplectic and SciVal as examples of visualising Analytics data.

Questions: as we move to OA, cost per use less important - what metrics will become more important? E.g. Speed to publication? BioMedCentral publish this for each article. How would it be translated into an easy to measure value?

Are all downloads equally valuable? OUP did some research into this; good articles would get double downloads (initially viewed in HTML, then downloaded as PDF if useful), so that conversion is one good indicator of actual value. Likewise, if people *could* download full text but didn't bother, having read the abstract, that's a potential indicator of non-value.
But, this approximator of value becomes less reliable as the PDF becomes less popular as a format. Assuming that a download is more valuable if it leads to a citation - flawed - what about teaching value? Point of care use? Local citation gives a flawed picture.
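As a rough illustration of the HTML-to-PDF conversion idea, the sketch below computes two ratios per article: the share of HTML views that went on to a PDF download, and the share of abstract views (by readers entitled to full text) that went no further. The function names and counts are hypothetical; this is not how OUP ran its research.

```python
def pdf_conversion_rate(html_views, pdf_downloads):
    """Share of HTML full-text views followed by a PDF download."""
    if html_views == 0:
        return None  # nothing viewed, so no rate to report
    return pdf_downloads / html_views


def abstract_drop_off_rate(abstract_views, fulltext_requests):
    """Share of entitled abstract views that did not lead to full text."""
    if abstract_views == 0:
        return None
    return 1.0 - fulltext_requests / abstract_views


# Hypothetical per-article counts, for illustration only.
articles = {
    "article-1": {"abstract": 400, "html": 200, "pdf": 100},
    "article-2": {"abstract": 400, "html": 40, "pdf": 2},
}

for name, a in articles.items():
    print(
        name,
        "conversion:", pdf_conversion_rate(a["html"], a["pdf"]),
        "drop-off:", abstract_drop_off_rate(a["abstract"], a["html"]),
    )
```

Both ratios inherit the weakness noted above: as fewer readers bother with the PDF at all, a low conversion rate stops meaning much.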

The library experience
Big institutions have to use crude usage metrics to inform collection development, because more detailed work (as reported by this morning's plenary speaker Anne Murphy) is not viable at scale. But librarians know that an undergraduate download is less valuable than a postgrad download, in the sense of how important that precise article is to the reader. Citation too doesn't equal value - did they really read, understand, develop as a result of reading that article?

Ask users for reason that they're requiring ILL: what are you using this for? Fascinating insight into different ways that content is valued. Example from healthcare: practitioners delaying treatment until they can consult specific article. "We're sitting on a gold mine" - value of access to information services - showing impact. [Perhaps useful for publishers to try and capture this type of insight too - exit overlay surveys on journal websites, perhaps?]

Role of reading lists - can we "downgrade" usage where we know it has been heavily influenced by a reading list? Need to integrate reading lists and usage better.

As well as looking at usage and citations, there's a middle layer - social bookmarking sites and commentary can also indicate use / value and are much more immediate to the action of using the article than the citation.

What impact will changing authentication systems have? Will Shibboleth help us break down usage by user type? This is what Raptor project does - sifting Shib / EZProxy logs to identify users, but reliant on the service provider having maintained the identifiers and passing them back to the institution with usage data. Current bid in to JISC to combine JUSP with Raptor - agreement from the audience that this would be *HUGELY USEFUL* (hint, hint, please, JISC!)

Do people have the time to use the metrics available? One delegate recommends Tableau instead of Excel to analyse data - better dashboards.

Abuse of metrics: ref again to problem with impact factor using mean rather than median, and examples of when that has caused problems (a sudden leap to an impact factor in the thousands thanks to one popular article). Impact factor also cannot cope when bad articles are cited a lot because they're bad - not all citations are equal.
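The mean-versus-median problem is easy to see with a toy example. The citation counts below are invented: eight ordinary articles and one runaway hit.

```python
from statistics import mean, median

# Hypothetical citation counts for nine articles in one journal.
citations_per_article = [2, 0, 1, 3, 1, 0, 2, 1, 4000]

mean_citations = mean(citations_per_article)
median_citations = median(citations_per_article)

print("mean:  ", round(mean_citations, 1))
print("median:", median_citations)
```

The mean is dragged into the hundreds by a single article, while the median stays at the level of a typical article in the journal.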

"Numbers are dumb unless you use them intelligently," says Grace. "We need to spend less time collecting the data, and more time assessing what it means."

Alternative metrics for journal quality: the usage factor

Jayne Marks is starting things off for us on another glorious day here in Glasgow. Usage factor vs impact factor - why will the extra metric be useful? What research has been done so far? What next?

Usage factor vs impact factor
With pretty much all journals online, and COUNTER well established and respected, we have a good source of reliable data to explore individual journals and their usage, as an alternative to citations (which underpin the impact factor). The impact factor, while widely respected and endorsed, is not as widely applicable across different disciplines (it's optimised for the hard sciences; in other areas eg nursing or political science, content can be well used and valued but not cited) and is US-centric. The usage factor will provide a new perspective, available and applicable to all disciplines, to any publisher prepared to provide the data. It will serve better those disciplines where usage of the content is more relevant than citations.

How did the usage factor evolve?
The usage factor project - sponsored by UKSG among others - sought to consider critical questions, including:
  • Will it be statistically meaningful?
  • Will it gain traction?
  • Will it be credible / robust?
Authors, editors, publishers were surveyed to assess whether such a metric would be of interest. Example data was analysed - 150,000 articles - to model different ways of calculating a usage factor (detailed report available from CIBER).

How is the usage factor calculated?
To avoid gaming, the usage factor is calculated using the median rather than arithmetic mean. (Comments welcome to elucidate that - my maths is a bit rusty!) A range of usage factors would be published for each journal (afraid I missed the detail on this while pondering the arithmetic - I think it would mean across a range of years). The initial calculation for a title would be based on 12 months of data within a maximum usage window of 24 months. A key question is when the clock starts ticking - when the article is submitted? When it's published online? When it goes into an issue? Does it matter if publishers decide this differently?
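For the arithmetically rusty (me included), here is one possible reading of the calculation as a sketch. It assumes each article has a list of monthly usage counts starting from whenever "the clock starts", that only the first 24 months count, and that the usage factor is the median of the per-article totals. The function name, the data layout and the numbers are my own assumptions, not the project's published method.

```python
from statistics import median


def usage_factor(monthly_usage_by_article, window_months=24):
    """Median of per-article usage totals within the usage window."""
    totals = [
        sum(counts[:window_months])  # ignore usage after the window closes
        for counts in monthly_usage_by_article.values()
    ]
    if not totals:
        return None  # no articles, no usage factor
    return median(totals)


# Hypothetical monthly usage counts; "a4" looks like it has been gamed.
usage = {
    "a1": [10, 5, 3],
    "a2": [100, 80, 60],
    "a3": [1, 0, 0],
    "a4": [5000, 4000, 10],
    "a5": [20, 20, 2],
}

print("usage factor:", usage_factor(usage))
```

In this sketch the one heavily inflated article barely moves the result, which is the argument for the median. Where the list of monthly counts begins for each article is exactly the "when does the clock start" question above.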

What might the future hold?
The team behind the usage factor suggests that ranked lists of journals by usage could be compiled eg by COUNTER to enable comparison. There are concerns about gaming but the robustness of the COUNTER stats and the use of the median should help to repel most attempts at gaming (CIBER's view is that the threat is primarily from machine rather than human "attack"); the project's leaders continue to consider gaming scenarios and welcome input from "bright academics" who can help to posit potential gaming scenarios ("we're not devious enough").

Work is still required on the infrastructure, for example, to understand how we can extract data from publishers and vendors. The project's ongoing work is being led by Jayne Marks (now at Wolters Kluwer) along with Hazel Woodward and a board of publishers, librarians and vendors. Thanks were noted to Peter Shepherd and Richard Gedye.

Question: shouldn't we be moving to article-based metrics? Marks: it could break down to that in due course.