Posts Tagged ‘scripting’

Installing the R kernel for Jupyter on linux

Monday, March 7th, 2016

To paraphrase a saying:

Give a scientist a script and they will analyse data for a day. Teach a scientist to script and they won’t have any more time to do analysis for the rest of their lifetime.

Scripts are a great way to make reproducible workflows, but they are too technical for many situations where you have to report to scientists. Jupyter notebooks are a great way to do an analysis, and report the results at the same time. A Jupyter notebook can contain the analysis, the results, and the documentation that explains the results together in a single file, making it at once understandable and reproducible.

Jupyter started its life as IPython, or “interactive python”. But since they added support for other languages besides python, they had to rename. In principle you can install support for a wide range of scripting languages, but in practice it may be a little difficult to set up. Jupyter consists of multiple ‘kernels’, to get support for a different language you have to install that language, and then install the Jupyter kernel for it. It took me a while to get that working for the R scripting language. What follows are some notes I took during that process, in the hope that they are useful for anybody else trying to do the same thing.

So here is the ‘howto’

First you need to have R and Jupyter installed, but I’m assuming you already got that far. Anyway, that is the easy bit.

The R kernel for Jupyter is available here, and the installation instructions are on the same page. I’ve copied them here for convenience:

install.packages(c('rzmq','repr','IRkernel','IRdisplay'),
repos = c('http://irkernel.github.io/', getOption('repos')))
IRkernel::installspec()

When R is installing one of the dependencies, rzmq, on linux, you immediately run into Problem #1. It will complain:

missing header zmq.h

This happens quite often when installing R packages. It happens when the R package is a wrapper for a C library, and needs the development version of that C library to compile the wrapper. In general, the pattern to solve such a problem is simple. When you encounter:

Missing header xxx.h

the solution is to install the development version of the library with

apt-get install libxxx-dev

But then you get to Problem #2

sudo apt-get install libzmq-dev
... try installing the rzmq package again
interface.cpp:123:49: error: call to zmq::context_t::context_t(int, int) uses the default argument for parameter 2, which is not yet defined
zmq::context_t* context = new zmq::context_t(1);

Annoying! It turns out that there are multiple versions of zmq, and you need to install the right one:

sudo apt-get install libzmq3-dev

We’re not there yet. When trying to install rzmq again, we run into Problem #3:

g++ -I/usr/share/R/include -DNDEBUG -I../inst/cppzmq -fpic -O3 -pipe -g -c interface.cpp -o interface.o
interface.cpp:23:14: error: expected constructor, destructor, or type conversion before '(' token
static_assert(ZMQ_VERSION_MAJOR >= 3,"The minimum required version of libzmq is 3.0.0.");

Apparently there is a check for the version of libzmq, but it isn’t working. The installation doesn’t fail because we have the wrong version of libzmq. The installation fails because the R package doesn’t properly detect the version of libzmq. And the problem isn’t that libzmq is somehow misreporting its own version. The problem looks more like a syntax error, which is a weird internal error that shouldn’t occur in a published library. The syntax error is caused by the fact that rzmq uses C++11, (the 2011 version of C++), which is not the default version. We’ll have to fix rzmq. First get the source code:

git clone https://github.com/armstrtw/rzmq.git
cd rzmq

We have to edit one line of src/Makevars, as you can see from the following ‘patch’:

diff --git a/src/Makevars b/src/Makevars
index 7d6771c..c6fdce6 100644
--- a/src/Makevars
+++ b/src/Makevars
@@ -1,5 +1,5 @@
## -*- mode: makefile; -*-

CXX_STD = CXX11
-PKG_CPPFLAGS = -I../inst
+PKG_CPPFLAGS = -I../inst -std=c++11
PKG_LIBS = -lzmq

Now let’s install our own hacked package:

R CMD build .
R CMD INSTALL rzmq_0.8.0.tar.gz

(btw, why is build lowercase and INSTALL uppercase? this is exactly the type of thing why R is not my favourite scripting language)

We’re still not there.

Problem #4 – we installed the very latest rzmq fresh from source code, which now requires R version 3.1.0. I’m using a Long-Term-Support (LTS) version of Linux Mint and I don’t want to switch R versions as that could create a hassle elsewhere. Does rzmq really require 3.1.0, or does it merely say it does because it’s pretending to be cutting edge? Let’s hack it up some more.

git diff
diff --git a/DESCRIPTION b/DESCRIPTION
index ba50840..228e3e0 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -5,7 +5,7 @@ Maintainer: Whit Armstrong <armstrong.whit@gmail.com>
Author: Whit Armstrong <armstrong.whit@gmail.com>
Description: Interface to the ZeroMQ lightweight messaging kernel (see <http://www.zeromq.org/> for more information).
License: GPL-3
-Depends: R (>= 3.1.0)
+Depends: R (>= 3.0.0)
SystemRequirements: ZeroMQ >= 3.0.0 libraries and headers (see <http://www.zeromq.org/>; Debian packages libzmq3, libzmq3-dev, Fedora packages zeromq3, zeromq3-devel)
URL: http://github.com/armstrtw/rzmq
BugReports: http://github.com/armstrtw/rzmq/issues

And now the R kernel installs without further problems for me. I haven’t noticed any incompatibilities with earlier R versions yet, R 3.0.0 seems to be just fine.

Notes from Vizbi: automation in Cytoscape

Monday, March 5th, 2012

Cytoscape is a popular network visualisation and analysis tool. It’s great because it’s so easy to create plug-ins. Today I was fortunate enough to be attending the Cytoscape developer workshop at Vizbi 2012, where I learned a few new things.

Firstly, one of my goals was to find out about the current state of Cytoscape development. Cytoscape is a great tool as long as you don’t look too closely at what’s going on inside. The upcoming third version promises to fix all the minor and major problems that exist under the hood. But Cytoscape 3 has been in the making for a long time. As a plug-in developer, you have to choose between something that works right now, but will go away eventually, or something that is clearly the future, but might take a long time to materialise.

The feeling I got from the workshop is that there is light at the end of the Cytoscape 3 tunnel. For a plug-in developer with a deadline, it’s probably best to stick with the current version for now. But if you’re not under pressure to release, it’s definitely possible to write for Cytoscape 3 and make use of a nicer and more pleasant working environment.

Besides that news, I learned some cool new tricks. Using Cytoscape Commands you can write simple macros for repetitive tasks. For example, to generate the network below, first you have to import a SIF (Simple Interaction Format) file, then import a file with node attributes, then apply a layout, and then apply a visual style. If you have to do this a couple of times it gets quite tedious. But here is how all that can be automated:

Take the following SIF data, and save it using a text editor as network.sif

Martijn is_involved_with    LibSBGN
Chaouiya    is_involved_with    SBML-qual
Martijn is_involved_with    SBML-qual
Martijn is_involved_with    BioPreDyn
Emanuel is_involved_with    LibSBGN
Emanuel is_funded_by    Erasmus
Martijn is_funded_by    FP7

Here are the Node attributes, saved it as node_types.txt

type
LibSBGN=Project
BioPreDyn=Project
Chaouiya=Collaborator
SBML-qual=Project
Martijn=Member
Emanuel=Member
FP7=Funding
Erasmus=Funding

For the visual style, I created one in Cytoscape and saved it as style.props, using Export->Vizmap property file. And here is the magic bit: If you save the above three files in your work directory, then you can generate that picture with the script below.

network import file=network.sif
layout force-directed
node import attributes file=node_types.txt
vizmap import file=style.props

Run it from within Cytoscape with Plugins->Command Tool->Run script…, or from the command line with

./cytoscape.sh -S scriptfile