Posts Tagged ‘visualization’

General SPARQL app for Cytoscape

Saturday, March 21st, 2015

We can now easily solve the problem of bioinformatics data integration. But how do we put that data in the hands of scientists?

At General Bioinformatics we put data in triple stores and use SPARQL to query it. Triple stores are great for data integration, but integrating data is only half of the problem: we also have to present it. The problem isn’t that SPARQL is hard to use per se (it’s really rather plain and sensible). The problem is that SPARQL is supposed to be only a piece of plumbing at the bottom of a software stack. We shouldn’t expect scientists to write SPARQL queries any more than we expect them to carry adjustable pliers to a restroom visit.

The General SPARQL app is one of the new ways to present triple data.

How do you use it?

The app lets you build a network step by step. Nodes and edges can be added to a network in a piecemeal fashion. Nodes can represent various biological entities, such as: a pathway, a protein, a reaction, or a compound. Edges can represent any type of relation between those entities.

For example, you can start by searching for a protein of interest. The app places a single node in your network. You can then right-click on this node to pull in related entities: all the pathways that are related to your protein, all its Gene Ontology annotations, all the reactions that it is part of, or the gene that encodes it. And you can continue this process, jumping from one entity to the next.

Watch this screencast and it will start to make sense:

How does it work?

In the background, the General SPARQL app maintains a list of SPARQL queries. Each item in the search menu, and each item in the context (right-click) menu, is backed by one SPARQL query. When you click on them, a query is sent off in the background, and the result is mapped to your network according to certain rules.
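The idea of one query per menu item can be sketched roughly as follows. This is a minimal Python illustration, not the app's actual code: the menu label, the query text, the example.org identifiers, the `${ID}` placeholder, and the `src`/`dest` column names are all assumptions made for the example, and the result rows are hard-coded rather than fetched from an endpoint.

```python
from string import Template

# Hypothetical registry: each menu item is backed by one SPARQL query.
# The query text and the example.org vocabulary are made up for illustration.
MENU_QUERIES = {
    "Pathways for this protein": Template(
        "SELECT ?src ?dest WHERE { "
        "?dest <http://example.org/hasParticipant> ?src . "
        "FILTER (?src = <${ID}>) }"
    ),
}


def build_query(menu_item, node_id):
    """Fill the selected node's identifier into the query behind a menu item."""
    return MENU_QUERIES[menu_item].substitute(ID=node_id)


def map_rows_to_network(rows, nodes, edges, interaction):
    """Assumed mapping rule: 'src' and 'dest' become nodes, joined by one edge."""
    for row in rows:
        nodes.add(row["src"])
        nodes.add(row["dest"])
        edges.add((row["src"], interaction, row["dest"]))
    return nodes, edges


# Start with a single protein node, as after a search.
protein = "http://example.org/protein/P1"
nodes = {protein}
edges = set()

query = build_query("Pathways for this protein", protein)

# Stand-in for the rows a SPARQL endpoint might return for that query.
rows = [
    {"src": protein, "dest": "http://example.org/pathway/A"},
    {"src": protein, "dest": "http://example.org/pathway/B"},
]
map_rows_to_network(rows, nodes, edges, "part_of")
print(len(nodes), "nodes,", len(edges), "edges")
```

Keeping the queries in a plain lookup table is what would make it possible to swap in a different set of queries without touching the mapping code.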

When you first install the app, it comes pre-configured with a basic set of SPARQL queries, although it’s possible to provide your own set. The initial set is designed to work with public bioinformatics SPARQL endpoints provided by the EBI and Bio2RDF. But as great as these resources are, public triple stores can sometimes be overloaded. The app works with privately managed triple stores just as well.

Where can I find it?

The easiest way to get the app is from the Cytoscape App Manager. Just install Cytoscape 3.0, start it, go to Apps->App Manager in the menu, and search for “General SPARQL”. Or download it from the app store website. What’s even better is that the source code is available on GitHub.

Also, if you have a chance, come see my poster at Vizbi 2015 in Boston.

Notes from Vizbi: automation in Cytoscape

Monday, March 5th, 2012

Cytoscape is a popular network visualisation and analysis tool. It’s great because it’s so easy to create plug-ins. Today I was fortunate enough to be attending the Cytoscape developer workshop at Vizbi 2012, where I learned a few new things.

Firstly, one of my goals was to find out about the current state of Cytoscape development. Cytoscape is a great tool as long as you don’t look too closely at what’s going on inside. The upcoming third version promises to fix all the minor and major problems that exist under the hood. But Cytoscape 3 has been in the making for a long time. As a plug-in developer, you have to choose between something that works right now, but will go away eventually, or something that is clearly the future, but might take a long time to materialise.

The feeling I got from the workshop is that there is light at the end of the Cytoscape 3 tunnel. For a plug-in developer with a deadline, it’s probably best to stick with the current version for now. But if you’re not under pressure to release, it’s definitely possible to write for Cytoscape 3 and make use of a more pleasant working environment.

Besides that news, I learned some cool new tricks. Using Cytoscape Commands you can write simple macros for repetitive tasks. For example, to generate the network below, first you have to import a SIF (Simple Interaction Format) file, then import a file with node attributes, then apply a layout, and then apply a visual style. If you have to do this a couple of times it gets quite tedious. But here is how all that can be automated:

Take the following SIF data and save it as network.sif using a text editor:

Martijn is_involved_with LibSBGN
Chaouiya is_involved_with SBML-qual
Martijn is_involved_with SBML-qual
Martijn is_involved_with BioPreDyn
Emanuel is_involved_with LibSBGN
Emanuel is_funded_by Erasmus
Martijn is_funded_by FP7

Here are the node attributes; save them as node_types.txt:

type
LibSBGN=Project
BioPreDyn=Project
Chaouiya=Collaborator
SBML-qual=Project
Martijn=Member
Emanuel=Member
FP7=Funding
Erasmus=Funding
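Before importing, it can be worth checking that every node in the network has a type. The following Python sketch does that for the two data sets above. It is only an illustration: the parsers are simplified (one source, one interaction, and one target per SIF line; a first line naming the attribute, followed by name=value lines) and the function names are made up here, not part of Cytoscape.

```python
SIF_TEXT = """\
Martijn is_involved_with LibSBGN
Chaouiya is_involved_with SBML-qual
Martijn is_involved_with SBML-qual
Martijn is_involved_with BioPreDyn
Emanuel is_involved_with LibSBGN
Emanuel is_funded_by Erasmus
Martijn is_funded_by FP7
"""

ATTR_TEXT = """\
type
LibSBGN=Project
BioPreDyn=Project
Chaouiya=Collaborator
SBML-qual=Project
Martijn=Member
Emanuel=Member
FP7=Funding
Erasmus=Funding
"""


def parse_sif(text):
    """Return (source, interaction, target) tuples; other line shapes are skipped."""
    edges = []
    for line in text.splitlines():
        parts = line.split()  # splits on spaces or tabs
        if len(parts) == 3:
            edges.append(tuple(parts))
    return edges


def parse_node_attributes(text):
    """Return the attribute name (first line) and a node -> value mapping."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    values = {}
    for line in lines[1:]:
        node, _, value = line.partition("=")
        values[node.strip()] = value.strip()
    return lines[0], values


edges = parse_sif(SIF_TEXT)
attr_name, node_types = parse_node_attributes(ATTR_TEXT)

# Every node mentioned in the network, as source or target.
network_nodes = {name for source, _, target in edges for name in (source, target)}
missing = sorted(network_nodes - set(node_types))
print(f"{len(edges)} edges, {len(network_nodes)} nodes, missing '{attr_name}': {missing}")
```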

For the visual style, I created one in Cytoscape and saved it as style.props, using Export->Vizmap property file. And here is the magic bit: If you save the above three files in your work directory, then you can generate that picture with the script below.

network import file=network.sif
layout force-directed
node import attributes file=node_types.txt
vizmap import file=style.props

Run it from within Cytoscape with Plugins->Command Tool->Run script…, or from the command line with

./cytoscape.sh -S scriptfile
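If you need the same picture for several data sets, the script file itself can be generated. Here is a small Python sketch that writes the four commands for a given set of file names. The helper name `write_cytoscape_script` and the temporary directory are assumptions for illustration; the commands and the -S option are the ones shown above, and the sketch only prepares the launch command without running Cytoscape.

```python
import tempfile
from pathlib import Path


def write_cytoscape_script(path, network_file, attributes_file, style_file,
                           layout="force-directed"):
    """Write the four commands from the script above, one per line."""
    commands = [
        f"network import file={network_file}",
        f"layout {layout}",
        f"node import attributes file={attributes_file}",
        f"vizmap import file={style_file}",
    ]
    Path(path).write_text("\n".join(commands) + "\n")
    return commands


# A temporary directory stands in for your work directory.
workdir = Path(tempfile.mkdtemp())
script_path = workdir / "scriptfile"
commands = write_cytoscape_script(
    script_path, "network.sif", "node_types.txt", "style.props"
)

# The command line you would then run from the Cytoscape directory.
launch = ["./cytoscape.sh", "-S", str(script_path)]
print(script_path.read_text())
print(" ".join(launch))
```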

Pathway Visualization to the next level

Friday, February 25th, 2011

The laboratory of bioinformatics of Wageningen University has put together some really cool hardware. In the picture below you see their tiled display, consisting of 12 high-resolution monitors, powered by a single workstation.

PathVisio on a tiled display

This setup gives you a lot of resolution to play with. We managed to display all major metabolic pathways from WikiPathways simultaneously, at full resolution, and map microarray data as well. When you’re standing right next to the screens, it feels like the data is all around you. That really encourages you to explore, and make connections across the pathways. That’s just much harder to do on a single screen.