Phylotastic2

From Evolutionary Interoperability and Outreach

Note: hackathon 2 ended on Feb 1. Please bear with us while we work to refactor and update the material documenting the hackathon. See the PhylotasticGallery for pics of the event.


Initial Pitches

  • Tree Annotation (Arlin)
    • What is the provenance of the tree returned by Phylotastic? What metadata accompanies it? What methods of analysis were used? Etc.
      • start with trees, move to freetext annotations from manuscript, then formalize them according to an ontology and feed into TreeStore
    • This pitch was merged with the MIAPA ontology pitch (below)
  • First draft of MIAPA ontology (Hilmar)
    • follow from Arlin's idea
    • MIAPA = Minimum Information About a Phylogenetic Analysis, a minimum reporting standard for phylogenetic analyses
    • initial idea from Leebens-Mack
    • work has been on and off; a VoCamp in October 2011 produced a first checklist
    • there has been a community survey that is now being processed
    • This pitch was merged with the Tree Annotation pitch (above)
  • TNRS - Name Validation (Gaurav)
    • easy to use interface
    • matching with Google Refine
    • not only match and recognize match, but suggest higher taxa
    • easily process long lists of names (e.g., 1000)
  • Treestore (Ben)
    • implement the queries by Piel et al on the Treestore
  • Architecture and specifications (Rutger)
    • should provide a high level open specification of the interfaces
    • each group (TNRS, etc.) should be able to contribute to the interfaces
    • main goal is to derive the pruner interface in full detail as a demo
    • specification of interface and behavior
    • tie with MIAPA
  • Front End Versatility (Karen)
    • is this PhyloWS?
    • Ensure that Phylotastic can work with many different trees and treestores.
  • Use Cases (Brian S)
    • functional and exciting
    • alternative ways to create a list of input taxa
    • best phylogeny for a node in the tree of life
    • phylogeny for a complete museum
    • geographic search
    • ecological search
    • ...
    • incrementally growing list
    • maybe just get names automatically out of a PDF file
  • Kind of Users (Andrea)
    • what users and what kind of documentation is needed
    • e.g., why it shouldn't be scary to use command-line args
    • focus on documentation development
  • Tests for Phylotastic (Greg)
    • metrics for testing and sets of benchmarks
    • automated tests for the components and for the infrastructure
    • link to the interface specification?
  • Connecting trees to tip data (Julie)
    • attach trees to other repositories (e.g., locations, etc.)
    • e.g., get physiological data from Dryad...
  • Common Names for TNRS (Naim)
    • for non scientists using the system
  • Authentication (Scott)
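
The Architecture pitch names the pruner as the component to specify in full detail. As a rough illustration of the behavior such an interface would need to pin down, here is a minimal sketch of pruning a tree to a list of taxa. Everything in it is an assumption made for illustration: the `Node` class, the `prune` and `to_newick` functions, and the example taxa are hypothetical and are not part of any Phylotastic specification. Branch lengths and metadata are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """Hypothetical tree node: tips carry a taxon name, internal nodes carry children."""
    name: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, keep: Set[str]) -> Optional[Node]:
    """Return a copy of the subtree restricted to tips in `keep`, or None if no tip survives."""
    if not node.children:
        return Node(node.name) if node.name in keep else None
    kept = [p for p in (prune(c, keep) for c in node.children) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        # Collapse an internal node that is left with a single child.
        return kept[0]
    return Node(node.name, kept)


def to_newick(node: Node) -> str:
    """Serialize the topology as a Newick string without the trailing semicolon."""
    if not node.children:
        return node.name or ""
    inner = ",".join(to_newick(c) for c in node.children)
    return f"({inner}){node.name or ''}"


# Illustrative input tree and taxon list (assumed, not from the hackathon).
tree = Node(children=[
    Node(children=[Node("Homo_sapiens"), Node("Pan_troglodytes")]),
    Node(children=[Node("Mus_musculus"), Node("Rattus_norvegicus")]),
])
pruned = prune(tree, {"Homo_sapiens", "Pan_troglodytes", "Mus_musculus"})
print(to_newick(pruned) + ";")
```

A real specification would also have to settle questions this sketch sidesteps, such as how unmatched names are reported back and how provenance metadata (the MIAPA tie-in) travels with the pruned tree.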

Hangout 1 Ideas

  • Support for common names in TNRS: NCBI provides English names, EOL covers several other languages, and Wikipedia/DBpedia (http://wiki.dbpedia.org/) is another possible source
  • Tree store implementation (support for DateLife)
  • Architecture:

From Brian O'Meara's email:

One thing that came up during today's hangout was architecture. At the last hackathon, there was a group dedicated to architecture (how all the components fit together). Some of their work is at http://www.evoio.org/wiki/Phylotastic/Architecture . Some of us on the call thought that this hackathon might work best if the architecture is mostly specified before we meet, with perhaps further improvements once we arrive. It's hard to code things to work together if you are still designing how that will happen. This wouldn't be set in stone, but would help guide development: how will metadata be passed, for example. Much of this work was apparently done by the group at the last hackathon, but we should discuss 1) whether trying to firm this up before the hackathon is a good idea (based on discussion today, I assume most people think yes, but there are probably counterarguments), and 2) how much of the current specification we should use, and what needs to be added/changed/deleted.

Hangout 3 Summary

  • In terms of overall goals for our week together, the people on the hangout were enthused about trying to roll out a full alpha or production version of Phylotastic, specifically a version of the website that uses all the pieces of the workflow in a coherent whole. In other words, people expressed primary interest in refining and connecting the components that we already have, as opposed to creating new components.
  • Several people were interested in ironing out details of architecture and specifications for how components exchange information, and it was suggested that we start an email thread on that topic before meeting in Tucson.
  • People were divided on the mobile app idea. Everyone likes it in concept, but while some folks are enthusiastically behind developing it now, others suggested that development is premature until we have a fully functional basic service in place. We might want to coordinate with others who are already developing similar apps, such as the folks at iNaturalist (http://www.inaturalist.org/).
  • We talked at some length about coordination with Opentree, and generally agreed that this would be a great source of big trees, though likely not the only source. We mentioned Arbor but didn't discuss it extensively.
  • Several people recommended devoting substantial person-hours to documentation, perhaps even having a documentation subgroup. The r-phylo wiki from the first NESCent hackathon might be a good model for how to start doing that. http://www.r-phylo.org/wiki/Main_Page
  • People are very excited about getting together in Tucson, which is great!

Boot camps

  • Overview of Phylotastic 1 (high level – Arlin if present)
  • Architecture and Interfaces – this could also mention some of the data standards in use, like NeXML (Rutger) File:Slides-rutger.pdf
  • TreeStore (Ben and Hilmar)
  • TNRS (User:Gaurav): slides
  • OpenTree and related efforts (Karen)
  • git and GitHub