
GUIDE TO THE NBN DATA EXCHANGE FORMAT

INTRODUCTION

The NBN Exchange Format is the principal way of supplying datasets for upload to the Gateway. The format is also suitable for exchanging data more generally within the NBN. The guide is split into four sections and an annexe:

  • Section 1 provides information on providing metadata.
  • Section 2 describes the NBN Exchange Format, explaining what it is and which columns of data you may include.
  • Section 3 provides information on tools currently available to help format your dataset into the NBN Exchange Format.
  • Section 4 provides examples of datasets in the NBN Exchange Format.
  • The Annexe provides an optional XML metadata header.

If you require further help with formatting your dataset, please contact data@nbn.org.uk

Download the Exchange Format

Download the Data provider pack

Download the Data Exchange Agreement

Demonstration of the NBN Exchange Format

The video below shows a PowerPoint presentation explaining the NBN Exchange Format. The video shows version 2.4 of the format; please download the latest version (version 2.6) above to see the changes made since then.

METADATA SECTION

Datasets exchanged within the NBN should always be accompanied by metadata. Metadata allows potential data users to assess whether a dataset is suitable for a particular application. Metadata is usually supplied by filling out the Metadata Form for Species Data, downloadable within the Data Provider Pack.

An alternative approach is to provide the metadata using an XML header within the exchange format file. This approach is best suited to automated export tools, particularly if you anticipate exchanging datasets regularly within the NBN. The XML header is optional provided a completed metadata form is supplied.
If you wish to supply metadata as an XML header within the exchange format file, please refer to the list of metadata elements given in the Annexe, which illustrates the structure of the XML header.

WHAT IS THE NBN EXCHANGE FORMAT?

The NBN Exchange Format is text-based and has been designed to be straightforward to produce from a variety of applications. In its simplest form, it encapsulates the basic components of a species occurrence record (what was recorded, where it was recorded, when it was recorded, and who recorded it). However, it is extensible and can include any additional data associated with each record.
The text file comprises one record per row, with values separated by tabs. It must follow these rules:

  • The first row of the data section must contain column names, selected from the list of reserved names listed below, plus any additional columns you want to include.
  • Each record within the exchange format file must occupy one line only. Tab and end-of-line characters must not appear anywhere else in the file.

There is no need to include optional columns that don’t contain any data in your dataset and the columns can be in any order.
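
The two rules above can be illustrated with a minimal Python sketch. It is not an official NBN tool: the function names `clean_value` and `to_exchange_text` are hypothetical, and replacing stray tabs and line breaks with spaces is an assumption about how you might want to tidy values before export.

```python
def clean_value(value):
    """Collapse tabs, line breaks and runs of spaces into single spaces."""
    return " ".join(str(value).split())


def to_exchange_text(records, columns):
    """Build tab-separated text: a header row, then one line per record.

    `records` is a list of dicts keyed by column name; missing values are
    written as empty fields.
    """
    lines = ["\t".join(columns)]
    for record in records:
        lines.append("\t".join(clean_value(record.get(c, "")) for c in columns))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = [{"RecordKey": "1", "SiteName": "Low\tWood\nNorth"}]
    print(to_exchange_text(example, ["RecordKey", "SiteName"]), end="")
```

Because every value is cleaned before it is joined, tabs and end-of-line characters only ever appear as separators.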

Examples of datasets in the NBN Exchange Format are provided in section 4.

Information on columns within the NBN Exchange Format

Below is a full list of the reserved column names and a brief description of them. For clarity the columns have been divided into groups each headed by a brief summary relevant to that group of columns.

Reserved column names are given in bold.

  • [Required] indicates that that particular column is compulsory. Other columns are optional and may be left out.
  • The maximum number of characters allowed within each column is given underneath the column name.
  • Restricted format or values within a particular column are given underneath the column name.
  • Additional information relevant to each column is given to the right of that particular column.

Identifying records

It is essential that each record within your dataset can be uniquely identified on the NBN Gateway for validation and updating purposes. Use the RecordKey to uniquely define each record.

Datasets on the Gateway may be further divided into logical subsets, for example representing separate surveys, records from different sources, field trips, museum collections, or recorders. Use the SurveyKey to divide your dataset as you wish.

Samples are defined as records collected at the same site on the same day, but you may use the SampleKey to group records in any way you like.

RecordKey

[REQUIRED]

·      max 30 characters

·      each key unique

This should ideally be the primary key associated with the occurrence or biotope record in your database.

If your records do not have a primary key, an alternative is to use sequential numbers instead (e.g. 1, 2, 3, 4, ...).
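
Before submitting a dataset you may want to check the RecordKey column yourself. The sketch below is only an illustration of the two constraints listed above (at most 30 characters, each key unique); the function name `record_key_problems` is hypothetical and the wording of the messages is an assumption.

```python
from collections import Counter

MAX_KEY_LENGTH = 30  # limit stated for RecordKey in this guide


def record_key_problems(keys):
    """Return a list of human-readable problems found in RecordKey values."""
    problems = []
    for key, count in Counter(keys).items():
        if key == "":
            problems.append("empty RecordKey")
        elif len(key) > MAX_KEY_LENGTH:
            problems.append(f"{key!r} is longer than {MAX_KEY_LENGTH} characters")
        if count > 1:
            problems.append(f"{key!r} appears {count} times")
    return problems


if __name__ == "__main__":
    print(record_key_problems(["1", "2", "2"]))
```

An empty list means that no problems of these two kinds were found.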

SurveyKey

·      max 30 characters

 

In addition, you may optionally supply the names of the surveys, along with their surveykey, in a separate text file or spreadsheet.

These survey names may then be incorporated into the survey section of the metadata on the NBN Gateway during loading of your dataset.

SampleKey

·      max 30 characters


It is not necessary to supply the name of the sample.

 

WHAT: Species/Habitat
Each species or habitat record must be associated with its NBN Gateway code. Records should contain either the TaxonVersionKey or the BiotopeKey, but not both.
A list of TaxonVersionKeys or BiotopeKeys can be obtained from the Gateway team (data@nbn.org.uk).

TaxonVersionKey [REQUIRED]

·      max 16 characters

 

This is compulsory for species records.

In addition, you may optionally supply the taxon name as an extra field. This will allow us to check that each TaxonVersionKey is associated with its correct taxon.

ZeroAbundance

·      T or F

(True or False)

This column indicates whether or not the record represents a confirmed absence.

The NBN Gateway can now support absence records. If this field is not supplied in the dataset it will be presumed that all records refer to presence data.

Sensitive

·      T or F

(True or False)

 

Whether the record is sensitive or not (default is F).

A sensitive record may have additional access restrictions applied to it on the NBN Gateway.
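
The T/F columns above (ZeroAbundance and Sensitive) both fall back to F when no value is given. A minimal sketch of how a script might read them is shown here; `parse_flag` is a hypothetical helper, and treating an empty cell in the same way as a missing column is an assumption.

```python
def parse_flag(value, default=False):
    """Interpret a T/F cell; empty or missing values fall back to `default`."""
    text = "" if value is None else str(value).strip().upper()
    if text == "":
        return default
    if text == "T":
        return True
    if text == "F":
        return False
    raise ValueError(f"expected T or F, got {value!r}")


if __name__ == "__main__":
    row = {"RecordKey": "1", "Sensitive": "T"}
    print(parse_flag(row.get("Sensitive")), parse_flag(row.get("ZeroAbundance")))
```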

 

 

WHEN: Dates
Dates on the Gateway are stored in the ‘vague date’ format. Vague dates are created by specifying the start and end dates of a date range together with a one or two character code (DateType), which identifies the type of vague date.  Examples of vague dates are:
 
StartDate    EndDate      DateType   Description
16/06/2000   16/06/2000   D          Date specified to the nearest day.
16/06/2000   18/06/2000   DD         Date specified to a number of days.
01/06/2000   30/06/2000   O          Date specified to the nearest month (first day of the month to the last day of the month).
01/06/2000   31/07/2000   OO         Date specified to a range of months (first day of the start month to the last day of the end month).
01/01/2000   31/12/2000   Y          Date specified to the nearest year (first day of the year to the last day of the year).
01/01/2000   31/12/2001   YY         Date specified to a range of years.
(blank)      31/12/2000   -Y         Only the end date to the nearest year known.
(blank)      (blank)      ND or U    ‘No date’ or ‘unknown’.

Alternatively, you can supply a single Date column, which translates into a vague date with the same StartDate and EndDate and with date type “D” (a single day). A dataset can contain vague dates in some rows and a single Date in others, but a single row must never contain both.
One or two digits can be supplied for the day and month, but the year must be supplied in full. Dates with two-digit years are interpreted literally: for example, 21/09/97 is interpreted as 21 September AD 97.
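
As an illustration of how the date rules fit together, the following Python sketch parses the two permitted date formats and works out the DateType for a start and end date, following the examples in the table above. The function names are hypothetical. Rejecting years that do not have four digits (rather than interpreting them literally) is a choice made for this sketch, and the types -Y, ND and U are not covered because they cannot be derived from a pair of dates.

```python
import calendar
import re
from datetime import date


def parse_date(text):
    """Parse DD/MM/YYYY or YYYY-MM-DD; the year must have four digits."""
    m = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", text)
    if m:
        day, month, year = (int(g) for g in m.groups())
        return date(year, month, day)
    m = re.fullmatch(r"(\d{4})-(\d{1,2})-(\d{1,2})", text)
    if m:
        year, month, day = (int(g) for g in m.groups())
        return date(year, month, day)
    raise ValueError(f"unsupported date (is the year four digits?): {text!r}")


def infer_date_type(start, end):
    """Return D, DD, O, OO, Y or YY for a start and end date."""
    if start > end:
        raise ValueError("StartDate is after EndDate")
    if start == end:
        return "D"
    last_day = calendar.monthrange(end.year, end.month)[1]
    if start.day == 1 and end.day == last_day:  # covers whole months
        if start.month == 1 and end.month == 12:  # covers whole years
            return "Y" if start.year == end.year else "YY"
        same_month = (start.year, start.month) == (end.year, end.month)
        return "O" if same_month else "OO"
    return "DD"


if __name__ == "__main__":
    print(infer_date_type(parse_date("01/06/2000"), parse_date("2000-07-31")))
```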
StartDate
[REQUIRED]
(except for DateType -Y, ND or U)
Date Formats allowed
·      DD/MM/YYYY
·      YYYY-MM-DD
 
EndDate
[REQUIRED]
Date Formats allowed
·      DD/MM/YYYY
·      YYYY-MM-DD
 
DateType
[REQUIRED]
·      D,DD,O,OO,Y,YY, -Y, ND or U
 
 

 

WHERE: Location information

Each row of the dataset must contain location information as a valid spatial reference. This can either be GridReference, two separate values for East and North or a FeatureKey to an associated spatial feature such as a site boundary defined as a polygon. A single row can contain either GridReference, East and North, or FeatureKey values but not more than one of these three spatial types.

GridReference georeferences and East and North georeferences must each be accompanied by their projection and precision.
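
The rule that a row carries exactly one of the three spatial types can be checked with a short script. The sketch below is an illustration only: `spatial_type` is a hypothetical function, rows are assumed to be dictionaries keyed by column name, and empty cells are treated as "not supplied".

```python
def spatial_type(row):
    """Return 'GridReference', 'East/North' or 'FeatureKey' for a row dict."""
    def filled(name):
        value = row.get(name)
        return value is not None and str(value).strip() != ""

    has_east, has_north = filled("East"), filled("North")
    if has_east != has_north:
        raise ValueError("East and North must be supplied together")

    present = []
    if filled("GridReference"):
        present.append("GridReference")
    if has_east:
        present.append("East/North")
    if filled("FeatureKey"):
        present.append("FeatureKey")
    if len(present) != 1:
        raise ValueError(f"expected exactly one spatial type, found {present}")

    # Projection and Precision are required unless a FeatureKey is used.
    if present[0] != "FeatureKey":
        for name in ("Projection", "Precision"):
            if not filled(name):
                raise ValueError(f"{name} is required with {present[0]}")
    return present[0]


if __name__ == "__main__":
    row = {"GridReference": "NY532471", "Projection": "OSGB", "Precision": "100"}
    print(spatial_type(row))
```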

SiteKey

·      max 30 characters

A key to identify unique sites within the dataset.

SiteName

·      max 100 characters

The name of the site or location where the species or habitat was recorded.

GridReference

[REQUIRED]

(if no East and North or FeatureKey)

e.g. NY532471

35/532471

Grid references should be in the typical Ordnance Survey ‘Landranger’ format.

Grid references should not contain any spaces or hyphens (“-”).

For 5km grid references the lower left 1km grid square can be used along with a precision of 5000m. 5km grid references will be converted to the corresponding 10km grid reference on the NBN Gateway.

The DINTY system for 2000m precision should be used. Alternatively the lower left 1km grid square can be used along with a precision of 2000m.

8 or 10 figure grid references can be supplied with the corresponding 10m or 1m precision. These grid references will be converted to the corresponding 100m grid reference on the NBN Gateway.
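
The relationship between the length of a grid reference and its precision can be sketched as follows. This is an illustrative Python function, not the NBN's own code: the name `precision_from_grid_reference` is hypothetical, returning the string 'Unidentified' for unrecognised input is a choice made here, and the sketch does not check that the grid letters name a real square. It handles the lettered form (e.g. NY532471) only, not the numeric form (e.g. 35/532471), and it cannot detect the 5km or 2km cases that are expressed as a 1km square with a coarser Precision value.

```python
import re

# Precision in metres for each count of digits after the grid letters.
PRECISION_BY_DIGITS = {2: 10000, 4: 1000, 6: 100, 8: 10, 10: 1}


def precision_from_grid_reference(grid_ref):
    """Return the precision in metres, or 'Unidentified' for a bad format."""
    # One or two grid letters, the digits, then an optional DINTY tetrad
    # letter (A to Z, omitting O).
    m = re.fullmatch(r"[A-Z]{1,2}(\d*)([A-NP-Z]?)", grid_ref.upper())
    if not m:
        return "Unidentified"
    digits, tetrad = m.groups()
    if tetrad:
        # A tetrad letter is only meaningful after a 10km square.
        return 2000 if len(digits) == 2 else "Unidentified"
    return PRECISION_BY_DIGITS.get(len(digits), "Unidentified")


if __name__ == "__main__":
    for ref in ("NY532471", "NY54K", "NY 532471"):
        print(ref, precision_from_grid_reference(ref))
```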

East

[REQUIRED]

(if no GridReference or FeatureKey)

 

Position of site in an east/west direction.

Can either be an easting (in metres, either on British or Irish grids) or a longitude (in decimal degrees, according to a particular datum).

Positive longitude values indicate a position east of the Greenwich meridian; negative values indicate positions to the west.

North

[REQUIRED]

(if no GridReference or FeatureKey)

 

 

Position of site in a north/south direction.

Can either be a northing (in metres, either on British or Irish grids) or latitude (in decimal degrees, according to a particular datum).

Positive values of latitude indicate a position north of the equator.

Projection

[REQUIRED]

(for GridReference or East and North)

Grid allowed

·      OSGB (British Isles)

·      OSNI or OSI (Ireland)

·      CI (Channel Islands)

Datum allowed

·      WGS84 (e.g. GPS device)

·      OSGB36

·      ED50

Projection system for the grid reference:

Can be the British or Irish National Grid (“OSGB”, “OSNI”, “OSI”), the Military Grid used over the Channel Islands (“CI”) or the datum for long/lat coordinates (“WGS84”, “OSGB36”, “ED50”).
For the Channel Islands the datum will be assumed to be ED50. Grid references or eastings and northings using other datums will need to be transformed to ED50 before submission.
The “CI” projection should not be used for lat/long coordinates recorded using the WGS84 datum (e.g. from a GPS device). These Channel Islands records should use WGS84 in the Projection field.

Precision

[REQUIRED]

(for GridReference or East and North)

·      Min = 1

·      Max = 10000

The spatial precision of the georeference, in metres. This is used to determine the size of the square used to display the data on the NBN Gateway.

This might have to be estimated in some cases (especially with long/lat georeferences).

The precisions displayed on the NBN Gateway are 10000 (10km), 2000 (2km), 1000 (1km) and 100 (100m). Other precisions can be supplied, but they will be rounded up to one of these values, and no record is displayed at a precision finer than 100m.
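
To see how a supplied precision might map onto the values used on the Gateway, here is a small Python sketch. It assumes that "rounded up" means moving to the next coarser Gateway value, which agrees with the 5km and 8 or 10 figure cases described under GridReference but is otherwise an assumption. `gateway_precision` is a hypothetical name.

```python
GATEWAY_PRECISIONS = (100, 1000, 2000, 10000)  # metres, from this guide


def gateway_precision(precision):
    """Round a supplied precision up to the next Gateway precision."""
    if not 1 <= precision <= 10000:
        raise ValueError("Precision must be between 1 and 10000 metres")
    for allowed in GATEWAY_PRECISIONS:
        if precision <= allowed:
            return allowed


if __name__ == "__main__":
    for supplied in (1, 10, 100, 2000, 5000):
        print(supplied, "->", gateway_precision(supplied))
```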

FeatureKey

[REQUIRED]

(if no GridReference or East and North)

·       Datasetkey+ProviderKey

·       max 40 characters

 

This key refers to the unique key of the feature to which the record is associated.

On the NBN Gateway this key is a combination of the datasetkey of the site layer and the providerkey of the individual site boundary. A list of these keys can be provided by the NBN Gateway Team (data@nbn.org.uk).

 
 

 

WHO: Recorder’s Name
 

Recorder

·      max 140 characters

 

The name or list of names for one or more recorders for the species or habitat record.

Determiner

·      max 140 characters

 

The name or list of names for one or more determiners for the species or habitat record.

 

Attributes

Additional data associated with each occurrence record can be added to the file as extra columns, known as occurrence attributes.

Occurrence attributes are not directly comparable between datasets from different sources (e.g. abundance can be measured in many different ways) so they are treated as unique to the dataset.

Some examples of typical occurrence attributes are Count, Males, Females, PercentageCover, Area, Comment etc.

The title of an attribute column may be at most 40 characters long.

Attribute Name

·         max 255 characters

One or more attribute columns can be included in your dataset.
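
Any column whose name is not reserved is treated as an attribute, so a header row can be split into reserved and attribute columns automatically. The sketch below shows one way to do this in Python. The set of reserved names is collected from the column descriptions in this guide, the 40-character limit on attribute column titles is the one given above, and `attribute_columns` is a hypothetical name.

```python
RESERVED_COLUMNS = {
    "RecordKey", "SurveyKey", "SampleKey",
    "TaxonVersionKey", "BiotopeKey", "ZeroAbundance", "Sensitive",
    "StartDate", "EndDate", "DateType", "Date",
    "SiteKey", "SiteName", "GridReference", "East", "North",
    "Projection", "Precision", "FeatureKey",
    "Recorder", "Determiner",
}
MAX_ATTRIBUTE_TITLE = 40  # characters


def attribute_columns(header):
    """Return the non-reserved column names, checking their length."""
    attributes = [name for name in header if name not in RESERVED_COLUMNS]
    too_long = [name for name in attributes if len(name) > MAX_ATTRIBUTE_TITLE]
    if too_long:
        raise ValueError(f"attribute column titles too long: {too_long}")
    return attributes


if __name__ == "__main__":
    header = ["RecordKey", "TaxonVersionKey", "Date", "Abundance", "Comment"]
    print(attribute_columns(header))
```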

Tools available for formatting your dataset

The Gateway team (data@nbn.org.uk) can provide you with a range of functions, add-ins and advice to help you format your dataset into the NBN Exchange Format.

The NBN Record Cleaner tool allows you to check that your dataset is in the correct format and is ready to be loaded onto the NBN Gateway. This tool, and an accompanying document on its use with the NBN Exchange Format, is available to download from the NBN website.

TaxonVersionKeys and FeatureKeys

The NBN Gateway team can supply a list of TaxonVersionKeys for the taxa present in the NHM species dictionary, or FeatureKeys for sites present on the NBN Gateway. Please contact data@nbn.org.uk to request these keys.

Microsoft Excel and Access

The following Visual Basic for Applications functions are available to help you format your dataset in either Microsoft Excel or Access.

  • Vague Dates: Converts single date in DD MMM YYYY, MMM YYYY, YYYY, Season YYYY (eg 12 MAR 2008, Summer 2008) to start date or end date format (DD/MM/YYYY)
  • Date Type: Gets vague date type from start date and end dates in date format (DD/MM/YYYY)
  • Precision: Gets precision from grid reference (in Ordnance Survey ‘Landranger’ format eg. NY532471). Precisions for grid references with incorrect format are returned as ‘Unidentified’.
  • Projection: Gets projection (OSGB or OSNI) for grid reference (in Ordnance Survey ‘Landranger’ format eg. NY532471). Projections for grid references with incorrect format are returned as ‘Unidentified’.
  • SiteName: Checks name of site is not longer than 80 characters.
  • Attribute: Checks attribute is not longer than 255 characters.

When using any of the above functions please check that they are producing the correct output. Send any comments and requests for further functions to data@nbn.org.uk.

Recorder

Recorder add-ins for both Recorder 2002 and Recorder 6 are available to export records in the NBN Exchange Format.

Recorder 2002 add-in (obtained from data@nbn.org.uk)
This add-in exports the columns: RecordKey, SurveyKey, SampleKey, 3 Vague date columns, TaxonVersionKey, ZeroAbundance, 2 Site columns, 5 georeference columns and 2 Recorder names columns.
It does not export confidential records (Sensitive column) or any additional attribute columns. If you wish for these columns to be included in the NBN Exchange Format then contact the Gateway team. By providing a copy of the Recorder 2002 database (nbndata.mdb), the Gateway team can help extract the records into the NBN Exchange format.

Recorder 6 add-in (obtained from Recorder website)
This add-in exports the columns as described in the Recorder 2002 add-in. In addition it may also export confidential records (Sensitive column) and a number of attribute columns (Abundance, Substrate, SampleMethod and Comment).
If you wish to export records from Recorder 3 or other recording software packages then contact the Gateway team (data@nbn.org.uk). We will be able to provide advice or assistance in formatting your dataset into the NBN Exchange Format.

MapMate

SQL to extract records from MapMate in the NBN Exchange Format is available for use as a user-defined query. Please contact the NBN Gateway team (data@nbn.org.uk), who will be able to supply this SQL and provide advice and assistance in its use.


NBN Record Cleaner

This desktop application will check that your file is in the NBN standard exchange format and is ready to be uploaded onto the NBN Gateway without any errors. It also allows you to verify and map your records against verification rules. The validator can handle datasets of up to 500,000 records.

Instructions on using this tool are available with the software download from the NBN website.

 

Example datasets in the NBN Exchange Format

Included below are three example datasets formatted into the NBN Exchange Format.

  • Example 1 shows a simple file of 20 records, all with grid references and vague dates. The dataset has not been further divided into surveys or samples. It includes additional optional columns of SiteName and SiteKey, Recorder and one attribute column, Abundance.
  • Example 2 consists of 20 records, recorded using different georeferencing (grid references, and latitude and longitude). The dates are supplied as a mixture of vague dates and single Date values. The dataset has been further subdivided into surveys.
  • Example 3 shows 13 records recorded against a site. The FeatureKey is a combination of the datasetkey for the spatial layer on the NBN Gateway “GA000413” and the providerkey for the site boundary “3435”.

Example 1: A simple file with few optional fields and one occurrence attribute (Abundance)


 

Example 2: A file with geographical and date information supplied in different ways in different rows

 

Example 3: A file showing records recorded against a site using the FeatureKey field


 

© National Biodiversity Network 2011. Registered in England and Wales 3963387. Registered charity 1082163.