NBN Record Cleaner
NBN Record Cleaner is a new, free software tool to help people improve the quality of their wildlife records and databases.
Whether you are an individual recorder or work in an organisation such as a Local Record Centre or a Recording Scheme, the NBN Record Cleaner is designed to help you spot common problems in your data. The goal is to aid the process of data cleaning and ensure the quality of any datasets you pass on to others.
It is designed to access biological records stored in a wide variety of formats such as text files (CSV, tab delimited, etc.), Excel spreadsheets and databases, including those in biological recording packages such as Recorder and MapMate. It also allows you to check that your dataset is in the NBN Exchange Format prior to submission to the NBN Gateway.
“validation is the process of checking if something satisfies a certain criterion”
The tool first “validates” your data – checking the format against a set of built-in rules. This includes spotting bad dates (e.g. 31st February) or spatial references (e.g. TL123) and checking the spelling of items like species and vice county names. You can correct any problems on screen or change the original source and reload before proceeding.
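As an illustration of the kind of format check described above, here is a minimal sketch in Python. It is not the tool's own code: the function names are hypothetical, and the grid reference pattern assumes the plain British National Grid form only (two letters followed by an even number of digits). It does not handle tetrads, Irish grid references, or whether the letter pair is a real grid square.

```python
import re
from datetime import date

# Assumption: two grid letters (the letter I is not used) followed by
# an even number of digits, from 0 to 10.
GRID_REF = re.compile(r"^[A-HJ-Z]{2}(\d{2}){0,5}$")


def is_valid_date(year, month, day):
    """Return True if the three parts form a real calendar date."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


def is_valid_grid_ref(ref):
    """Return True if ref looks like a plain two-letter grid reference."""
    cleaned = ref.replace(" ", "").upper()
    return GRID_REF.match(cleaned) is not None


if __name__ == "__main__":
    print(is_valid_date(2013, 2, 31))   # 31st February is not a real date
    print(is_valid_grid_ref("TL123"))   # odd number of digits
    print(is_valid_grid_ref("TL1234"))  # even number of digits
```

Under these assumptions, TL123 fails because its three digits cannot be split evenly into an easting and a northing.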
“verification: confirmation or additional proof that something that was believed (some fact or hypothesis or theory) is correct”
You then choose the “verification rules” you want to apply. These check whether the data is credible and warn you about records that are unusual in some way and need further investigation. For example, the checks can flag a record of a terrestrial animal that is in the sea, outside its currently known range, or at a time of year when it is not expected.
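To make the idea of a verification rule concrete, here is a minimal sketch in Python of a seasonal check. It is an illustration only: the rule table, the species name and the function name are hypothetical and do not reflect the format of the real rule files. A failed check produces a warning to investigate, not a rejection.

```python
from datetime import date

# Hypothetical rule table: species name -> (first month, last month),
# inclusive, in which records are expected.
SEASON_RULES = {
    "Example species": (5, 9),
}


def check_season(species, record_date, rules=SEASON_RULES):
    """Return a warning string for an out-of-season record, else None."""
    rule = rules.get(species)
    if rule is None:
        return None  # no rule for this species, so nothing to check
    start, end = rule
    month = record_date.month
    if start <= end:
        in_season = start <= month <= end
    else:
        # The expected period wraps around the end of the year.
        in_season = month >= start or month <= end
    if in_season:
        return None
    return (f"{species}: record dated {record_date.isoformat()} falls "
            f"outside expected months {start}-{end}")


if __name__ == "__main__":
    print(check_season("Example species", date(2012, 1, 15)))
    print(check_season("Example species", date(2012, 6, 1)))
```

Returning a message rather than raising an error mirrors the point made above: an unusual record may still be correct, so it is flagged for a person to review.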
Verification requires additional information that is supplied in the form of “rule files”, which are maintained by relevant experts. You choose which rule sets you wish to use and they are downloaded and installed from the internet. The application automatically notifies you about which rule sets are available or have been updated.
As well as presenting the records with the potential problems highlighted, the tool also allows you to map your records. This helps you to spot misplaced records.
The tool does not change your original data. It produces reports of the items that were queried, but you must apply any required changes using whatever tools you normally use to manage your data.
An updated version of the NBN Record Cleaner (V22.214.171.124) was released in February 2013. This is a minor update that fixes two bugs: one relating to exporting passed records and one where running the seasonal rules generated an error. The update can be downloaded from within the NBN Record Cleaner by clicking on the software updates link, or as a fresh download and install by clicking on the link above. The previous upgrade, NBN Record Cleaner (V126.96.36.199), was released in October 2012.
Note that when installing the NBN Record Cleaner on Windows Vista or Windows 7, you should ensure you are running as an administrator.
The species dictionary used by the NBN Record Cleaner has been updated to use the latest version of the UK Species Inventory (previously called the NBN Species Dictionary). This can be downloaded using the software updates available link on the Data Load form. The next time you use the NBN Record Cleaner, the species dictionary should update automatically, showing an "updating species dictionary" message before the dataset is validated. If this does not happen, you can force the update by downloading one of the ID difficulty rules available from the rule categories link on the Data Load form.
New or fresh installs of the NBN Record Cleaner also require the updated UK Species Inventory to be downloaded using the software updates.
NBN Record Cleaner demonstration
View the PowerPoint presentation in video format, which demonstrates how Record Cleaner works.
Improving data quality - creating verification rules
The reports below are all concerned with improving data quality and have all been produced under the NBN Trust's contract with Defra. They outline how verification rules were created and what verification processes should be used - essentially, what to do with records flagged up by Record Cleaner.
Under the NBN Trust and Defra contract 2011 - 2014
British Arachnological Society & Spider Recording Scheme - Improving the quality of spider records available via the NBN Gateway
British Dragonfly Recording Network - Verification rule sets for NBN Record Cleaner and recommendations on species whose records should be treated as sensitive
Enhancing data quality of bryophyte records for the National Biodiversity Network
Mammal records verification rule sets for NBN Record Cleaner and recommendations on species whose records should be treated as sensitive
Under the NBN Trust and Defra contract 2008 - 2011
Improving data flow was a considerable part of the Greater Access to Data theme under the NBN Trust's contract with Defra from 2008-2011 and the following reports have been produced:
Marine Biological Association - Marine Biodiversity Data Flow
Butterfly Conservation - Improving the quality of lepidoptera records available via the Gateway
British Trust for Ornithology - Enhancing data quality of bird records
Botanical Society of the British Isles - Improving the quality of botanical records available via the NBN Gateway