Conference Plan 2011

From Evolutionary Interoperability and Outreach

iEvoBio

Further Steps to a MIAPA Protocol

lead: Jim

presenter: TBD (Arlin if necessary)

topic: status of the MIAPA effort, including demo projects and survey.

Authors: Jim Leebens-Mack

Abstract

Publishing Re-usable Phylogenetic Trees, in Theory and Practice

lead: Arlin

presenter: TBD

topic: practices (not necessarily the best) for publishing a phylogenetic tree, based on the TDWG report and subsequent analyses.

Authors: Brian O'Meara, Jamie Whitacre, Ross Mounce, Dan Rosauer, Arlin Stoltzfus, Rutger Vos

Abstract

Sharing and re-use of data are essential to the progressive and self-correcting nature of science. Recognizing this big-picture goal, journals and funding agencies have promulgated policies to encourage sharing of data in the sense of information, including observational data as well as computed inferences. Here we summarize an ongoing analysis of 1) current practices for publishing phylogenetic trees and associated data; 2) current barriers to effective sharing and re-use of such data; and 3) prospects for reducing these barriers to promote more widespread sharing and re-use. In current practice, the technical infrastructure to support rudimentary archiving is in place, even if limited. Yet most published trees are not archived, and there is no community standard governing the format or content of an archived phylogenetic record in the interest of making it re-usable. Without a shift in emphasis toward re-usability, along with technology and standards to support such a shift, the archival value of many trees will be limited. Interviews with actual or potential secondary consumers of phylogenetic results suggest that there is a considerable market for re-use, but that most attempts end in disappointment. Phylogenetic results available via author requests, journal web sites, archival repositories and project web sites rarely include the critical information that secondary consumers seek, such as unique identifiers for biological sources (including species sources and accession numbers), indicators of quality, and documentation of the analytical methods used to obtain the results.
Catalyzing greater re-use may depend on identifying, and targeting with appropriate technology and standards, the most promising circumstances for re-use. These may be the extraction of sub-trees from large trees (for use in reconciliation, classification, and comparative analysis); the re-use of seed alignments, sub-alignments and homologized characters; the re-use of workflow conditions; and the construction of supertrees and supermatrices. Enabling effective re-use of phylogenetic results for these and other circumstances will require the research community to commit to at least three changes from current practice: 1) using globally unique identifiers (GUIDs) for informational and material entities referenced by phylogenetic results; 2) developing standards for documenting and exchanging the metadata necessary to support re-use; and 3) participating in efforts to develop a Minimal Information for a Phylogenetic Analysis (MIAPA) standard that governs what these metadata should at least comprise, and the quality with which they need to be provided.