Below is a brief whiz through the project's findings so far, looking at each work package in turn. Take a deep breath…
WP1: Geography variables review
Review geography variables in national survey datasets (1960-2010). Verify to which readily available boundary and point geography resources spatial units can be mapped.
During the review the following information has been compiled in a metadatabase:
UKDA study number; Dataset title; Depositor; Temporal coverage; Geographic coverage; Geographic variables / spatial units (in cat.); Smallest geography variable; Sample size; Access conditions dataset; URL to catalogue record; Availability of variable; Variable name(s); Variable label(s); Unit definition(s); Values and labels from data file; Coding (if not arbitrary); Boundary file dataset could be linked with; Notes
Information on appropriate UK Borders boundary files has also been added to the metadatabase.
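As a rough illustration of what one metadatabase entry might look like in code, the sketch below models a subset of the fields listed above. The class name, field names and all example values are assumptions made for illustration; they are not the project's actual schema or data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GeographyRecord:
    """One metadatabase entry (illustrative subset of the fields listed above)."""
    study_number: int
    dataset_title: str
    depositor: str
    temporal_coverage: str
    geographic_coverage: str
    smallest_geography: str
    access_conditions: str
    variable_names: List[str] = field(default_factory=list)
    boundary_file: Optional[str] = None  # boundary file the dataset could be linked with
    notes: str = ""

    def is_mappable(self) -> bool:
        """True when a boundary file has been identified for linking."""
        return self.boundary_file is not None


# Placeholder values only; this is not a real catalogue record.
example = GeographyRecord(
    study_number=0,
    dataset_title="Example Survey",
    depositor="Example Depositor",
    temporal_coverage="2001-2002",
    geographic_coverage="Great Britain",
    smallest_geography="Government Office Region",
    access_conditions="Standard Licence",
    variable_names=["gor"],
    boundary_file="Example GOR boundary file",
)

print(example.smallest_geography, example.is_mappable())
```

Keeping the boundary file as an optional field makes it easy to list the datasets that still have no linkable boundary resource.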
Secure Data Service: 12 surveys reviewed
Smallest geography variables: 2 surveys have postcode grid reference (BHPS and Understanding Society); 8 surveys have ONS anonymised postcodes; 1 survey has parliamentary constituency
Special Licence access: 17 surveys or time series reviewed
Smallest geography: Local Authority district (6), Output Area (2), Lower Super Output Area (2), county (1), parliamentary constituency (1), postcode (1), GOR (5)
Standard Licence: 172 datasets selected for review; 52 datasets fully reviewed (various parts of time series); others partially
For at least 54 datasets the smallest geography is Government Office Region. In other datasets, smallest geographies present include: parliamentary constituency; NHS Trust; Police Force Area; County; Constituency code; Local authority; Standard region; Ward; District
General comments based on the review:
There is a continuing upward trend in the quantity, resolution and quality of geospatial variables in data deposits. We need appropriate metadata to make these variables accessible and interpretable by potential users.
We need to provide solutions and resources for making such low level geographies available.
The variables and variable labels used across time series may not always be consistent, a fact which should be reflected in documentation (currently it is not).
Depositor supplied documentation of spatial units can be poor. Harnessing survey data in GIS can be a painstaking process; the information we have compiled makes it easy for users by providing defined and time-stamped units, information on temporal coverage, and links to pertinent boundary files.
Output Area (and the related Super Output Area) is becoming an increasingly common unit for depositors to provide.
WP1: User consultation
Assess user needs of existing users of Archive data who use data in geospatial applications.
The processes and problems discussed are common to many of the users we spoke with, and indeed to working with social science data full stop. The availability of low level geographies (such as postcode/grid reference) via Special Licence and SDS is extremely important to researchers who work with spatial data.
Some of the key points that users would like addressed are:
Time referenced unit definitions (the single most emphasised issue!)
Preferably postcode or grid reference – lookups such as the NSPD can then be used to derive other units
Choice of units (if postcode or grid reference not made available) to suit different users – for example, electoral research may require Parliamentary Constituency while a teacher may just want GOR.
Complete coding of variables to conform with above definitions using some kind of standardised system e.g. the GSS Coding and Naming provided by ONS
The relaxation of licensing rules on data. Users mentioned the obvious disclosure prevention measures (removing detailed geography), but also copyright issues attached to boundary data.
Small sample size is often an impediment to spatial analysis, and guidance could be provided by the Archive on this
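The request for postcode or grid reference rests on the idea that coarser units can be derived from a lookup such as the NSPD. A minimal sketch of that idea follows; the lookup rows, postcodes and unit codes are invented placeholders, and the real NSPD has its own file layout and column names that are not reproduced here.

```python
from typing import Dict, Optional

# Hypothetical lookup rows: normalised postcode -> higher-level units.
LOOKUP: Dict[str, Dict[str, str]] = {
    "ZZ11AA": {"ward": "WARD_A", "local_authority": "LA_1", "gor": "GOR_X"},
    "ZZ12BB": {"ward": "WARD_B", "local_authority": "LA_1", "gor": "GOR_X"},
    "ZZ21CC": {"ward": "WARD_C", "local_authority": "LA_2", "gor": "GOR_Y"},
}


def normalise_postcode(postcode: str) -> str:
    """Remove all whitespace and upper-case, so 'zz1 1aa' matches 'ZZ11AA'."""
    return "".join(postcode.split()).upper()


def derive_unit(postcode: str, unit: str) -> Optional[str]:
    """Return the requested unit for a postcode, or None if either is unknown."""
    row = LOOKUP.get(normalise_postcode(postcode))
    return None if row is None else row.get(unit)


# An electoral researcher and a teacher can ask for different units
# from the same respondent postcode.
print(derive_unit("zz1 1aa", "ward"))
print(derive_unit("zz1 1aa", "gor"))
```

Returning None for an unmatched postcode, rather than raising, lets the caller count and report unmatched records, which matters when sample sizes are already small.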
WP2: Spatial unit definitions
Develop standardised and time-referenced spatial unit definitions for use as metadata in the Archive catalogue
An extensive database of spatial units is under construction, which includes administrative, electoral, health, historical and statistical geographies. It also contains information on the frequency of changes and the authority that manages/maintains them.
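To show what "time-referenced" could mean in practice, the sketch below stores each unit definition with a validity period and looks up the definition in force on a given date. The class, field names, codes, dates and authority are all hypothetical and do not describe the database under construction.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class UnitDefinition:
    """A spatial unit definition that is valid for a stated period."""
    unit_type: str             # e.g. administrative, electoral, health
    code: str
    name: str
    valid_from: date
    valid_to: Optional[date]   # None means the definition is still current
    authority: str             # body that manages/maintains the unit


def definition_on(defs: List[UnitDefinition], code: str,
                  when: date) -> Optional[UnitDefinition]:
    """Return the definition of `code` in force on `when`, if any."""
    for d in defs:
        if d.code != code:
            continue
        if d.valid_from <= when and (d.valid_to is None or when <= d.valid_to):
            return d
    return None


# Invented example: one code whose boundary changed once.
DEFINITIONS = [
    UnitDefinition("administrative", "X001", "Example District (pre-change boundary)",
                   date(1974, 4, 1), date(1996, 3, 31), "Hypothetical authority"),
    UnitDefinition("administrative", "X001", "Example District (post-change boundary)",
                   date(1996, 4, 1), None, "Hypothetical authority"),
]

survey_date = date(1991, 4, 21)
match = definition_on(DEFINITIONS, "X001", survey_date)
print(match.name if match else "no definition in force")
```

Matching a survey's fieldwork date against the validity period is what allows a dataset to be linked to the boundary file that was actually in force when the data were collected.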
WP4: Metadata mapping to INSPIRE
Map UK Data Archive catalogue metadata to INSPIRE / Gemini 2.1 metadata requirements and develop a roadmap of measures and developments required for the UK Data Archive catalogue metadata to become fully INSPIRE / Gemini 2.1 compliant and to be cross-searchable by Go-Geo!
The metadata mapping from UKDA metadata to INSPIRE, and from GEMINI to DDI2.1, is complete and is now being reviewed and assessed by EDINA. No problems are anticipated in reaching INSPIRE compliant metadata alongside the impending DDI3 update to the Archive metadata schema.
There is ongoing discussion with EDINA on a DDI/INSPIRE workshop for social science data at UKDA.