NBN Crowdsourcing Data Capture Summit

Written by: Rachel Stroud and John Sawyer, NBN Secretariat

E: r.stroud@nbn.org.uk

 

On the 25th September, 58 members of the National Biodiversity Network (NBN) met to kick-start a strategy for using crowdsourcing to mobilise our extensive historic collections. Held at Manchester Museum, the event brought together representatives from Local Environmental Record Centres, NGOs, Government Agencies, libraries, herbaria and museums, Wildlife Trusts, National Schemes and Societies, academia and crowdsourcing platforms, as well as international representatives from GBIF (based in Denmark) and the Naturalis Biodiversity Center (in the Netherlands). This is one programme of work that the NBN Secretariat is coordinating in conjunction with Network members to implement the new NBN Strategy 2015-2020.

 

NBN members are world experts in collecting biological records, with more than 30,000 records collected each day. Despite this, large quantities of undigitised historical data collections are threatened with loss or destruction and are inaccessible for onward reuse. Over the past 8 months the NBN Secretariat has collated metadata for over 21.8 million biological records held in 21 different ‘undigitised’ formats by Network members across the UK. This is likely to be just the tip of the iceberg in terms of undigitised data.

 

Some of these datasets will require professional data handling to mobilise; for many others, however, we can turn to ‘Crowdsourcing’ – the empowering of citizens to help with a task, in this case, mobilising data. The focus of the NBN Crowdsourcing Data Capture Summit was to raise awareness of the wide spectrum of approaches to data mobilisation and to use this information to build a strategy for the NBN to start mobilising its historic data holdings.

 

Some people may question why historic data are important and why we are looking at this as part of the NBN Strategy 2015-2020. The presentations during the day reaffirmed the need to collaborate to protect these data. This is not only because museums and other data centres are changing rapidly and there is an ever-increasing lack of awareness of the collections they hold, but also because of the increasing need to use these data in describing our changing environment. We know a lot about biological change over the last 3-4 decades; however, most transformation of the British landscape happened before WWII. As Nick Isaac (CEH) said during his presentation, crowdsourcing the extraction of data from notebooks and diaries is an opportunity to re-shift the baseline of our knowledge of biodiversity in the UK. The NBN must also engage more effectively with UK museums to help merge historic data collections with the data collected daily by NBN Data Partners. Delegates also discussed the point that if we cannot justify efforts to preserve data collected years ago, then why should anyone support future data collection? We undermine our entire industry by not caring for historic data.

Delegates involved in one of the workshops at the summit

 

Throughout the course of the day many common themes emerged including:

  • Crowdsourcing is not free and we should not assume that participation is a given. Participants need a narrative and rapid feedback, and platforms require community management for best results.
  • Understanding the motivation of participants is often very hard, but it is important for keeping volunteers engaged.
  • Crowds are easily distracted and they will not always do what you ask them! However, this is not necessarily a bad thing, as datasets might contain rich information that we do not know about until a crowd starts looking at them.
  • People want to feel they are making a difference, achieving a higher goal, and it is therefore imperative to link data digitisation to the end product and data users.  
  • One size will not fit all data. We have an amazing diversity of systems available and a diversity of approaches and tools will help us to be as efficient and effective as possible.
  • A clear workflow is needed to mobilise historic specimens, not only to ensure there is enough material for participants to work with continually but also to ensure data quality once the data are mobilised.

A full Summit Proceedings report will be published in the next fortnight.  In the interim the presentations from this event can be found online at the links below. They included talks from:

Zooniverse
Natural History Museum
Centre for Ecology and Hydrology
GBIF
Herbaria@Home – presentation to be added
The Atlas of Living Australia
Manchester Museum
Northamptonshire Biological Records Centre
NATSCA
Naturalis Biodiversity Center
NBN Trust.

 

Our gathered group was never going to solve, in one day, the enormous problem of determining how much data we have, where it is held and how to mobilise it, but we did want to start a conversation across our Network. That happened during three workshops in the afternoon on quality, engagement and efficiency.

 

The NBN Secretariat will develop this programme into an NBN working group that will progress the actions identified throughout the course of the day. Please email us at support@nbn.org.uk if you would like to be involved further in implementing this work.
