The Gulf of Mexico Coastal Ocean Observing System (GCOOS) is one of many data centers that gather this information in central data portals and stream it out to industry, researchers, resource managers and the public, with the goal of providing timely, reliable and accurate information about coastal and open ocean waters.
But how do the people putting the data to work judge the accuracy and reliability of the information they’re using? A new National Science Foundation (NSF)-funded project will develop the tools and the social and technical infrastructure to gather this “metadata” — the data about the sensors — so end users know where the information came from and how it was collected. The project will make this metadata easily discoverable, searchable and available to be incorporated into automated archival systems so users have a better understanding of the data’s quality and can use it appropriately.
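As a rough sketch of the idea (the field names below are hypothetical and not drawn from the project's actual schema), a sensor metadata record might capture who built an instrument, what it measures, when it was last calibrated and where it was deployed, so that the record can be indexed, searched and archived alongside the observations themselves:

```python
# Hypothetical sketch: the field names here are illustrative, not the project's actual schema.
from dataclasses import dataclass, asdict
import json


@dataclass
class SensorMetadata:
    """A minimal provenance record describing how an observation was collected."""
    sensor_id: str          # unique identifier for the deployed instrument
    manufacturer: str       # who built the sensor
    model: str              # sensor model name
    parameter: str          # what the sensor measures, e.g. "sea_water_temperature"
    units: str              # reporting units
    last_calibrated: str    # ISO 8601 date of the most recent calibration
    deployment_site: str    # where the sensor is installed


record = SensorMetadata(
    sensor_id="buoy-42-ctd-1",
    manufacturer="ExampleSensorCo",
    model="CTD-100",
    parameter="sea_water_temperature",
    units="degree_Celsius",
    last_calibrated="2015-06-01",
    deployment_site="Gulf of Mexico, 27.5N 90.1W",
)

# Serializing the record makes it easy to index, search and archive alongside the data.
print(json.dumps(asdict(record), indent=2))
```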
The two-year pilot project is being led by scientists at Woods Hole Oceanographic Institution and research partners from GCOOS/Texas A&M University-Corpus Christi, the University of California, Santa Barbara, the Monterey Bay Aquarium Research Institute and Botts Innovative Research, Inc., and builds upon a previously developed model (called Q2O).
The project, called EarthCube IA: Collaborative Proposal: Cross-Domain Observational Metadata Environmental Sensing Network, or X-DOMES, is part of EarthCube, a wider effort of the NSF Directorate for Geosciences and the Division of Advanced Cyberinfrastructure. EarthCube is a community-led cyberinfrastructure initiative for the geosciences that supports teams who create, assess and align frameworks for sharing data and knowledge in an open and inclusive manner to enable an integrated understanding of the Earth system. It began in 2011 and is expected to run through 2022.
“This pilot project, if successful, could lead to products that will allow scientists to better understand data emanating from these sensors so they can explore issues like data discrepancies or how current observations can be used in conjunction with historical records to identify a statistical trend,” said Co-Investigator Felimon Gayanilo, GCOOS Systems Architect and a researcher at Texas A&M University-Corpus Christi’s Harte Research Institute. “It should also help scientists, among others, figure out what could be causing differences in reports coming from neighboring sensors.”
The project will leverage existing relationships with NSF-funded data management programs, EarthCube, the ESIP Federation and environmental sensor manufacturers to establish a community with a unified approach to sensor description and allow for the automated recording and extraction of sensor-related metadata, he said. “Enhancing existing metadata tools and developing new software products will help researchers and scientists answer the commonly asked questions: How was this data recorded? What sensor or method was used to generate the data? Or even, how did we arrive at this data?”
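As a simple illustration of the kind of question such tooling could answer automatically (the catalog and field names below are hypothetical), linking each measurement to its sensor's metadata record lets a user trace a reported value back to the instrument and method that produced it:

```python
# Hypothetical sketch: a tiny in-memory "metadata catalog" keyed by sensor ID.
# A real system would query a standards-based catalog rather than a dict.
catalog = {
    "buoy-42-ctd-1": {
        "manufacturer": "ExampleSensorCo",
        "model": "CTD-100",
        "method": "conductivity-temperature-depth profile",
        "last_calibrated": "2015-06-01",
    },
}

measurement = {"sensor_id": "buoy-42-ctd-1", "value": 24.8, "units": "degree_Celsius"}


def describe_provenance(measurement: dict, catalog: dict) -> str:
    """Answer 'what sensor or method was used to generate this value?'"""
    meta = catalog.get(measurement["sensor_id"])
    if meta is None:
        return "Provenance unknown: no metadata record for this sensor."
    return (f"{measurement['value']} {measurement['units']} recorded by "
            f"{meta['manufacturer']} {meta['model']} ({meta['method']}), "
            f"last calibrated {meta['last_calibrated']}.")


print(describe_provenance(measurement, catalog))
```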