Bioinformatics Open Source Conference (BOSC) 2010: Day 2 afternoon

BOSC 2010 sadly wrapped up on Saturday afternoon after a great two days of talks, discussion and planning. Here are my notes from the afternoon sessions.

Simon Mercer — Microsoft Biology Foundation

Simon presented the Microsoft Biology Foundation and its new 1.0 release. Microsoft External Research brokers relationships between academic communities and Microsoft researchers. These collaborations involve developing reusable software, which is often made available. Examples include the Ontology Add-in for Word, NodeXL for visualizing networks, a 3D molecular viewer for PDB, and the Trident scientific workflow workbench, which provides interactive and command-line environments for developing workflows.

The goal was to bring these and other collaborative tools within Microsoft together into one framework: the Microsoft Biology Foundation. It is a set of reusable tools designed for the .NET platform. Looks like lots of useful stuff: standard representations, file parsing and IO, algorithms and web services.

Clickframes — Clickframes: rapid, validated development for clinical informatics

William is from Children’s Hospital in Boston and Beacon 16 software. Clickframes provides a robust software modeling schema for MVC display, database access and user authentication: all of the nasty bits. It is written in Java. The idea is to avoid large product requirement documents by modeling the data and generating code for some of the nasty details. It uses an XML based language that folks can write their actual specifications in. Specs turn into interactive web based previews, and the XML also generates a flow diagram of the application. Tests are automatically generated in Selenium. It really saves a lot of the must-do development chores and helps you focus on the interesting parts.

Morris Swertz — molgenis: database at the push of a button

Molgenis provides models of the biology and tries to autogenerate the background bits. Models are specified in a domain specific language that produces code and magic. It’s Java based and has an XML language to specify what you want and what you are doing. Plugins can be used to add Java code that handles specific tasks. It generates Java classes, tests, SQL and everything else needed for web development on Tomcat. It has a nice interface to R, built on a REST interface, which lets you retrieve data directly from the web form. It also provides an RDF SPARQL query interface, and reuses models and tools from Galaxy under the covers for sharing.
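To make the model-to-code idea concrete, here is a toy sketch of generating SQL from a declarative model. This is not the Molgenis DSL or its generator: the function name `create_table_sql`, the tuple-based model and the `Sample` entity are all made up for illustration.

```python
def create_table_sql(entity, fields):
    """Render a CREATE TABLE statement from an entity name and (name, type) pairs.

    Hypothetical helper: a real generator would also validate names,
    handle foreign keys and emit the matching Java classes and tests.
    """
    columns = ",\n".join(f"  {name} {sql_type}" for name, sql_type in fields)
    return f"CREATE TABLE {entity} (\n{columns}\n);"


# Invented example model: one entity with two fields.
sample_sql = create_table_sql(
    "Sample",
    [("id", "INTEGER PRIMARY KEY"), ("name", "VARCHAR(255)")],
)
print(sample_sql)
```

The point is only that a single model description can drive several generated artifacts, so the schema and the code stay in sync.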

Alexandros Kanterakis — MOLGENIS and MAGE-TAB for microarrays

The idea is to use MOLGENIS to build a database for microarray and GWAS analysis: they want to combine genotypic and phenotypic information for eQTL analysis. Data is stored in MAGE-TAB, which provides a tab oriented form of microarray information. MAGE was translated into the MOLGENIS XML data model, and MOLGENIS was then used to produce a web based system for managing the database. Lots of endorsements for using MAGE-ML to model complicated experiment metadata.
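Because MAGE-TAB is tab-delimited, the sample annotation sheet can be read with standard tools. Here is a minimal sketch using Python's `csv` module. The rows are invented, and the helper name `read_tab_table` is my own; real files carry many more columns.

```python
import csv
import io

# Invented sample rows in the style of a MAGE-TAB sample sheet.
SDRF_TEXT = (
    "Source Name\tCharacteristics[genotype]\tArray Data File\n"
    "sample_1\twild type\tsample_1.CEL\n"
    "sample_2\tmutant\tsample_2.CEL\n"
)


def read_tab_table(text):
    """Parse tab-delimited text into a list of dicts keyed by column header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


rows = read_tab_table(SDRF_TEXT)

# Map each sample to its raw data file, the kind of link a database loader needs.
files_by_sample = {row["Source Name"]: row["Array Data File"] for row in rows}
print(files_by_sample)
```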

Sebastian Schultheiss — Persistence of bioinformatics web services

He looked at 927 web services to see how many are still available. 17% of the original published services are no longer active. This is problematic since scripts that depend on them are no longer reproducible and comparable. Over time the publishing policies have become stricter and things do seem to be improving. On average 45% of original services are available and still seem to work with test data. 58% of the services were developed by students who are graduating and moving on, and 24% of the folks admitted that they are not planning to maintain the service.
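A survey like this boils down to checking each published URL and tallying the results. Here is a rough sketch of that idea, not the method used in the talk. The function names and the rule that any 2xx or 3xx response counts as reachable are my own assumptions, and a reachable page says nothing about whether the service still works on test data.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx or 3xx status (assumed criterion)."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers DNS failures, refused connections, HTTP errors and malformed URLs.
        return False


def availability(urls, check=is_reachable):
    """Percentage of URLs for which check(url) is true; 0.0 for an empty list."""
    if not urls:
        return 0.0
    reachable = sum(1 for url in urls if check(url))
    return 100.0 * reachable / len(urls)
```

Passing the checker in as an argument keeps the tally logic testable without touching the network.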

Lincoln Stein — Gbrowse2

GMOD provides the infrastructure and tools for model organism databases. Contains standard ontologies, schema, file formats, browsers and editors.

Gbrowse is the web-based genome browser part of GMOD. Image glyphs are configurable in the display, which allows users to provide organism specific things like pictures of worms, haplotype displays and time course RNA data.

Version 2.0 contains a lot of AJAX and JavaScript: dragging, zooming, and support for SAM/BAM, BED, GFF, WIG and BigWig. Subtracks allow items to be organized into groups of tracks related to interesting top level items.

Behind the scenes, you can render tracks independently. JBrowse is the next generation Gbrowse.

Gary Bader — Cytoscape web

Web based component which provides a scaled down version of Cytoscape. It is made up of Flash + JavaScript and is client-side only. Full customization is possible. Generally it looks like an awesome version of Cytoscape functionality on the web. It is more suitable for medium sized networks (less than 2000 elements).

It is being used by several different clients: GeneMania, iRefWeb and Pathguide. The website features online demos. It uses jQuery for interaction.

Nobuaki Kono — Pathway projector

A genome browser for pathway data in the style of Google Maps. Lots of Google features: browsing, marking points, drawing graphs. Manual annotation is done with the Quikmaps JavaScript library. Info windows pop up while browsing with links to external resources.

James Morris — Evoker: a visualization tool for genotype intensity data

Genome wide association studies associate SNPs or other data with specific phenotypes, building up p-values based on allele differences and hopefully identifying signals that are significantly different. You need good quality control in GWAS to avoid false positives from poor quality DNA, population structure or hidden confounding artifacts.
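As a worked example of a p-value built from allele differences, here is a sketch of a basic allelic chi-square test on a 2x2 table of allele counts in cases versus controls. This is a textbook test I am supplying for illustration, not necessarily what any tool in the talk uses. The counts are invented and the function name is my own.

```python
import math


def allelic_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square (1 degree of freedom) on a 2x2 allele count table.

    Returns (chi2, p_value). No continuity correction is applied.
    """
    n = case_a + case_b + control_a + control_b
    row_case = case_a + case_b
    row_control = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    if 0 in (row_case, row_control, col_a, col_b):
        raise ValueError("every row and column total must be non-zero")
    chi2 = (
        n * (case_a * control_b - case_b * control_a) ** 2
        / (row_case * row_control * col_a * col_b)
    )
    # For 1 degree of freedom the upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: allele A is enriched in cases.
chi2, p_value = allelic_chi_square(60, 40, 40, 60)
print(chi2, p_value)
```

A badly clustered SNP can produce exactly this kind of small p-value from miscalled genotypes, which is why inspecting the underlying intensity data matters.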

Evoker provides the visualization components to assess these issues, integrating with large data stores. It’s written in Java with Perl helper scripts. Fully interactive for zooming in and out and what not. Provides statistical plots to confirm good genotype calls and identify false positives.

Pavel Tomancak — Fiji is just ImageJ

Fiji provides visualization of biological images and is a distribution of ImageJ. There are two reasons for the project: first, it’s needed in the community and has had big uptake; second, it’s built around biological projects and provides community aspects. Fiji is targeted at biologists, bioinformaticians, software developers and vision researchers. It’s batteries included to make it usable for biologists, and includes documentation and tutorials. It includes an API accessible from any JVM language. Code is developed under Git, with an emphasis on communication between developers and users. They developed an image library that allows researchers to write algorithms in a DSL and autogenerate them into Fiji code. An auto push updater was developed last summer during GSoC.

Iddo Friedberg — IPRStats for visualization of InterProScan results

Use case for IPRScan: deal with the diversity of microorganisms and their health effects. Microbes live in complex communities, which is what metagenomics studies. DNA is isolated directly from environmental samples, and annotating those samples is a problem. One approach is to use InterProScan, and then IPRStats provides visualization of InterProScan results.

