Curating Spatial Data

The Why of Preserving, Sharing, and Harmonizing the Where

Kevin R. Dyke
Spatial Data Analyst/Curator
University Libraries, Univ. of Minnesota

What is spatial data curation?

The process of mediating the shape and flow of data (for example, land use change data, public health surveys) from producers to consumers (planners, public health officials, researchers)

"a continuum of activities, supporting the requirements for both current and future use" (Rusbridge et al. 2005, p.1)


Why should anyone care?

The growing maturity and ever-greater volume of born-digital data products

"Dark data" and the "long tail of science"

Dark Data

Data that are not easily discoverable


The Long Tail of Science

"While great care is frequently devoted to the collection, preservation, and reuse of data on very large projects, relatively little attention is given to the data that is being generated by the majority of scientists" (Heidorn 2008)

"data from projects in the tail generally do not make it into repositories and fall into disuse and darkness"

Heidorn 2008, p. 288

"As scholarly research and scientific study becomes increasingly driven by the analysis of data, long term access to these data is crucial in enabling the verification of scientific discovery and to providing a data platform for future research."

Rusbridge et al. 2005, pp. 1-2

What is spatial data?

Data that contain some locational attribute

Put in such broad terms, what is and what is not spatial data comes down mainly to formatting
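
To make the formatting point concrete, here is a minimal sketch in Python (standard library only). The sample table, the column names `latitude` and `longitude`, and the helper `rows_to_geojson` are all hypothetical, invented for illustration. The same rows are first an ordinary table and then, after a change of format, a GeoJSON point layer.

```python
import csv
import io
import json

# Hypothetical sample: a plain table whose only "spatial" content
# is a pair of coordinate columns.
SAMPLE_CSV = """site,latitude,longitude,land_use
A,44.9740,-93.2277,institutional
B,44.9537,-93.0900,residential
"""


def rows_to_geojson(csv_text, lat_field="latitude", lon_field="longitude"):
    """Convert CSV rows with coordinate columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Pull the coordinates out of the row; what remains becomes attributes.
        lat = float(row.pop(lat_field))
        lon = float(row.pop(lon_field))
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": row,
        })
    return {"type": "FeatureCollection", "features": features}


collection = rows_to_geojson(SAMPLE_CSV)
print(json.dumps(collection, indent=2))
```

Nothing about the underlying observations changes between the two forms; only the encoding of the locational attribute does.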

Who benefits from spatial data curation?

Producers

Consumers

Students

The Public!

Producers

Recent survey of CLA faculty (Hofelich Mohr 2013)

  • Nearly two-thirds of respondents reported that they do not currently share their data in any manner
    • 13% said their results should not be available
    • A large majority are willing to share in some way, but cited a lack of time, funding, or a standard in their field
    • In other words, they could use some help.

Data management and sharing requirements from funding agencies and journals

Also, data-centric academic journals

Easing the task of data management while simultaneously improving it

Data Consumers

  • Encourage meta-analyses and systematic reviews
  • Systematically transform current datasets into historical ones via coordinated life-cycle management
  • Develop more thorough lineage and quality information (and metadata more generally)
  • Harmonize and centralize data formatting and storage

Students

Spatial data sandbox

The Public!

Access to raw (but curated!) research data

Questions of intellectual property, types of access, security of different data

Summing Up

Help researchers secure funding

Improve data management and long-term storage

Demonstrate breadth and depth of research underway at the University

References

Heidorn, P. Bryan. 2008. "Shedding Light on the Dark Data in the Long Tail of Science." Library Trends 57 (2): 280-299.

Hofelich Mohr, Alicia. 2013. "Results of CLA Faculty Data Management Survey." https://docs.google.com/a/umn.edu/file/d/0B1tV96Ef2Ic_a0RwaEF1M1c0UmM/edit

Rusbridge, Chris, Peter Burnhill, Seamus Ross, Peter Buneman, David Giaretta, Liz Lyon, and Malcolm Atkinson. 2005. "The Digital Curation Centre: A Vision for Digital Curation." In Local to Global Data Interoperability-Challenges and Technologies, 2005, 31–41. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1612461.