Publishing the Grasslands Data Integration Database for access to extensive and integrated cross-site ANPP data

Nicole Kaplan
Kristin Vanderbilt
Christine Laney

Annual Aboveground Net Primary Production (ANPP) datasets represent a core area of research in the Long Term Ecological Research (LTER) Network and many other programs. The Grasslands Data Integration (GDI) database is innovative in that it contains ANPP data from seven sites that are integrated at the level of the species and sampling unit, which facilitates fine temporal and spatial scale analysis of patterns of ANPP and species diversity. We propose an ASM working group to prepare the GDI dataset for submission to Ecological Archives and to discuss opportunities for including the GDI within the cyberinfrastructure of the LTER network and beyond.

Background: In 2003, LTER information managers and computer scientists from The Evergreen State College who were interested in data integration and semantics began work on the GDI project. Its aim was to address the challenges of integrating long-term ANPP data from different LTER sites, whose data were stored in incompatible syntactic formats and reflected different experimental methods and semantics. In 2007, we began working with ecologists interested in ANPP and expanded the GDI database to over 100,000 observational records from four LTER sites (KNZ, JRN, SEV and SGS) and one ILTER site (Kruger National Park in South Africa). Ecologists, information managers, and computer scientists participated in the development of database tools that enabled both data producers and consumers to upload and explore the datasets in the multi-site GDI database. This work helped to rectify data quality issues, standardize species codes, and identify statistically valid units of comparison across sites. In early 2009, ecologists from Kiskun LTER, Hungary, and two grassland sites in South Africa integrated their ANPP data into the GDI database. The GDI database is now reliable and robust enough to facilitate synthetic research, enable reporting of long-term trends in ANPP, and support cross-site multivariate analysis. The data integration work was published in the proceedings of Ecological Informatics, 2008, and was selected for inclusion in a special issue of Ecological Informatics.

ASM Workshop: At the 2009 All Scientists Meeting, we will prepare the dataset for publication in Ecological Archives and explore next steps for the GDI, which might include integrating more data and contributing to EcoTrends and the LTER Network Information System. We invite scientists and IMs who have been involved in the development of the GDI to join this working group to conduct a final quality assurance check of the data before they are published. We will also seek final permission to publish all years of data that should be publicly accessible under the terms of the LTER Data Access Policy. The published dataset will include an MS Access database, minimal metadata for each site, and some typical queries that scientists have used to extract data from the database.

Integrating Ecological Data: Notes from the Grasslands ANPP Data Integration Project. Judith B. Cushing, Nicole E. Kaplan, Christine Laney, Juli Mallett, Ken Ramsey, Kristin Vanderbilt, Lee Zeman, Jincheng Gao, Judith Kruger, Carri Leroy, Daniel Milchunas, Esteban Muldavin.

Working Group Session 5

Wed, 09/16/2009 - 10:00am - 12:00pm
Longs Peak Chasm Lake
Judy Cushing