Phylotastic

This is the public page for the Phylotastic hackathon (as distinct from the Leadership Team's planning page).

to do

best tag line:
- Phylotastic: a web-services infrastructure to make megatrees accessible for research use
- Phylotastic: the Tree of Life meets Phyloinformatics
- Phylotastic: trees when you need them
- Phylotastic: MegaTrees for everyone. Automagically.
We need to get the main flowchart in here.
Content below can be re-factored.
Create a layout that can be used for more detailed planning when the hackathon starts, e.g., the accepted participants and their areas of expertise.

Draft Plan

Please note that this page is a draft plan and a place to develop ideas. The overall target of the hackathon is fixed (build phylotastic), but no single aspect of the plan has been fixed. Participants will have the opportunity to re-think things on day 1 of the hackathon.

Overview

Statement of goals.
1. Build phylotastic, a collection of interoperable web services that together provide the means to extract a subtree (specified by its tips) from any of several large species trees, and to supply branch lengths and provenance annotation.
2. For demonstration purposes, leverage these services within a graphical interface that also integrates the resulting species tree with the user's choice of several high-value types of data. Optionally, this may involve adapting an existing environment (e.g., Galaxy, Taverna) to manage a phylotastic workflow.

Problem. In its most typical use, phylomatic does this: starting with a huge topology for plant genera and a user-supplied list of species, it grafts each species onto the tree wherever it can match the genus name, and it prunes away the rest of the tree. The output is only a topology, so users often find ways to add branch lengths to the resulting tree. The result is that the user, so long as she is only interested in plants, can get a phylogeny for an arbitrary list of named species.
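The graft-and-prune step described above can be sketched in a few lines of Python. This is a minimal illustration, not phylomatic's actual code: the `Node` class, the function names, and the four-genus toy tree are all assumptions made for the example, and the genus is simply taken to be the first word of each species name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A tree node; a node with no children is a tip (here, a genus)."""
    name: str = ""
    children: List["Node"] = field(default_factory=list)


def group_by_genus(species: List[str]) -> Dict[str, List[str]]:
    """Group species names by their first word, assumed to be the genus."""
    groups: Dict[str, List[str]] = {}
    for binomial in species:
        if binomial.strip():
            groups.setdefault(binomial.split()[0], []).append(binomial)
    return groups


def graft_and_prune(node: Node, by_genus: Dict[str, List[str]]) -> Optional[Node]:
    """Return a pruned copy of the tree with species grafted onto genus tips."""
    if not node.children:
        matched = sorted(by_genus.get(node.name, []))
        if not matched:
            return None                      # genus not requested: prune it
        if len(matched) == 1:
            return Node(matched[0])          # one species replaces the genus tip
        # several species: graft them as a polytomy under the genus
        return Node(node.name, [Node(s) for s in matched])
    kept = [k for k in (graft_and_prune(c, by_genus) for c in node.children)
            if k is not None]
    if not kept:
        return None                          # nothing left below this node
    if len(kept) == 1:
        return kept[0]                       # collapse nodes left with one child
    return Node(node.name, kept)


def to_newick(node: Node) -> str:
    """Serialize a tree as a Newick string without the trailing semicolon."""
    label = node.name.replace(" ", "_")
    if not node.children:
        return label
    return "(" + ",".join(to_newick(c) for c in node.children) + ")" + label


# Toy genus-level tree (an assumption for this example, not a real megatree).
tree = Node("", [
    Node("Fagaceae", [Node("Quercus"), Node("Fagus")]),
    Node("Rosaceae", [Node("Rosa"), Node("Malus")]),
])
wanted = ["Quercus rubra", "Quercus alba", "Rosa canina", "Homo sapiens"]
pruned = graft_and_prune(tree, group_by_genus(wanted))
result = to_newick(pruned) + ";" if pruned else None
print(result)  # ((Quercus_alba,Quercus_rubra)Quercus,Rosa_canina);
```

Species whose genus is not in the tree (here "Homo sapiens") are silently dropped in this sketch; a real service would presumably report them back to the user.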

Phylomatic rocks: its frequent use shows that big species trees are highly useful for applications in ecology, biodiversity, and trait analysis when both the interfaces that serve user needs and the megatree that provides vast coverage are available. But phylomatic would rock harder if:

the back-end data store were populated with large phylogenies available for fungi, fish, mammals and prokaryotes (not just plants)
the core functionality (name-matching, grafting & pruning) were modularized in an open-source bioinfo library
methods for adding branch lengths were easier and more generalized
all of the above were wrapped up as web services that could be invoked from computing environments

If this were a web service, we could plug it into Mesquite: users could load their species-based character matrix, then get a tree for it. In fact, let's go back a step and consider users who have only a list of species and no data to compare. Consider an even more open-ended discovery environment, which we could implement in Galaxy or Taverna (given that this is all based on web services). The user starts with a list of species (or a higher taxon) and a request for some useful types of data that could be obtained by querying various available sources, e.g., whether a species has a cytochrome oxidase sequence in GenBank, whether it is found in California, or where the nearest specimen is.
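The species-wise integration described above could look roughly like the following sketch. Everything here is hypothetical: the `integrate` function, the source labels, and the placeholder species "Aus bus" and "Cus dus" are invented for illustration, and the two lookups are local stand-ins for what would really be calls to external web services.

```python
from typing import Callable, Dict, List

# A source is any callable that answers one question about one species.
Source = Callable[[str], object]


def integrate(species: List[str], sources: Dict[str, Source]) -> Dict[str, Dict[str, object]]:
    """Build one record per species by asking every source about it."""
    return {
        sp: {label: lookup(sp) for label, lookup in sources.items()}
        for sp in species
    }


# Placeholder data standing in for remote queries (not real records).
SEQUENCED = {"Aus bus"}
REGION_RECORDS = {"Cus dus"}

sources: Dict[str, Source] = {
    "has_sequence": lambda sp: sp in SEQUENCED,    # stand-in for a sequence database query
    "in_region": lambda sp: sp in REGION_RECORDS,  # stand-in for an occurrence query
}

table = integrate(["Aus bus", "Cus dus"], sources)
for sp, record in table.items():
    print(sp, record)
```

Because each record is keyed by species name, the same names can label the tips of the pruned tree, which is what makes the tree and the data straightforward to combine.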

approach

Hackathon agenda and guiding principles

create a demo implementation of a system based on open standards
allow alternative implementations, at least for some steps
allow flexibility for multiple use-cases

Architecture


scoping statements

In Scope

Populating data store of existing trees
Evolution of PhyloWS to support the needs of Phylomatic
Taxonomic name resolution (embedding existing TNRS capacities)
Pruning trees and grafting species on them
Branch length (existing methods for incorporating branch lengths)
Integration of data and trees (e.g., mashups) - species-wise integration
Display of resulting trees (using existing technologies)
Wrap all these existing tools as web services
NeXML syntax extensions if needed
If needed, determine methods for compressing NeXML representations
Simple user interface (web form)

Not In Scope

Constructing new input trees
New Data Generation
Arguing or evaluating the correctness of trees
Design of new TNRS systems
Debates about which naming system is best
Developing new techniques to derive branch lengths

Uncertain, depends on participant skills and perspectives

Phylo-referencing
MIAPA annotations of the steps; provenance annotations

approach

In addition to the basic functionality needed for power users, it would be helpful to have a graphical display to show off the results.

demos and other links

Rutger's proof-of-concept: https://github.com/HIP-WG/tolomatic/blob/master/README.pod
taxize (R package): http://ropensci.org/tutorials/r-taxize-tutorial/
