Archive for January, 2010

BridgeDb paper published

Tuesday, January 12th, 2010

I’m very happy that our paper on BridgeDb was accepted by BMC Bioinformatics. It’s open access, so download it to your heart’s content. BridgeDb is all about identifier mapping, which I blogged about before (here, here and here).

BridgeDb lets you find cross-references for identifiers, but BridgeDb is not simply a cross-reference database. BridgeDb provides a standard method to access other cross-reference databases. And because of that level of standardization, you can easily decide to switch to a different source of cross-references.

Deepak Singh uses the term “middleware”, which is a good way to explain it, if that sort of word means anything to you.

But let me try to explain in a different way. BridgeDb is really a travel adapter. Suppose you’re in Japan and you’ve brought some gear like a laptop, cell phone and a Nintendo DS (just in case you get stuck in a blizzard while transferring at CDG). Much to your dismay you discover, after checking into your hotel, that none of your plugs fit in Japanese electrical sockets. So what do you do? Do you go down to Akihabara and spend a grand on a new laptop, phone and portable video game unit? Or do you buy a travel adapter for $1.95?

Just like there are many different power plugs around the world, there are many databases that do identifier mapping. And just like travel adapters let you plug in your laptop anywhere, no matter what country, BridgeDb lets you use your favorite bioinformatics tool, no matter what the source of identifier mappings is (provided that the tool uses BridgeDb).

Power plugs around the world

It’s important to realize that BridgeDb is simply a conduit of information. It does not calculate cross-references from scratch, nor does it give any guarantees about the validity of those cross-references. You shouldn’t ask if BridgeDb provides better identifier mappings. That is like asking if a travel adapter provides better electricity. You still depend on the power company to give you a stable source of electricity. The travel plug just gives you flexibility to adapt to different circumstances.
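To make the travel-adapter idea concrete, here is a minimal sketch in Python. It is not BridgeDb's actual API: the names `IdMapper`, `TableMapper`, `LineFileMapper` and `annotate`, as well as the identifiers in the usage note, are made up for illustration. It only shows the general pattern: the tool talks to one standard interface, and the source of the mappings behind it can be swapped freely.

```python
from typing import Dict, Iterable, List, Protocol, Set, Tuple

# An identifier is modeled as a (data source, id) pair. Hypothetical representation.
Xref = Tuple[str, str]


class IdMapper(Protocol):
    """The 'socket' that every mapping source must fit (hypothetical interface)."""

    def map_id(self, ref: Xref) -> Set[Xref]:
        ...


class TableMapper:
    """Mapping source backed by an in-memory table (stand-in for a local database)."""

    def __init__(self, table: Dict[Xref, Set[Xref]]) -> None:
        self._table = table

    def map_id(self, ref: Xref) -> Set[Xref]:
        # Return a copy so callers cannot modify the underlying table.
        return set(self._table.get(ref, set()))


class LineFileMapper:
    """Mapping source that parses 'source:id<TAB>source:id' lines (stand-in for a flat file)."""

    def __init__(self, lines: Iterable[str]) -> None:
        self._pairs: List[Tuple[Xref, Xref]] = []
        for line in lines:
            left, right = line.strip().split("\t")
            a_src, a_id = left.split(":", 1)
            b_src, b_id = right.split(":", 1)
            self._pairs.append(((a_src, a_id), (b_src, b_id)))

    def map_id(self, ref: Xref) -> Set[Xref]:
        # Each line is treated as a symmetric cross-reference.
        forward = {b for a, b in self._pairs if a == ref}
        backward = {a for a, b in self._pairs if b == ref}
        return forward | backward


def annotate(ref: Xref, mapper: IdMapper) -> str:
    """The 'tool': it only knows the IdMapper interface, not what is behind it."""
    refs = sorted(mapper.map_id(ref))
    if not refs:
        return "(no cross-references)"
    return ", ".join(f"{src}:{ident}" for src, ident in refs)
```

Calling `annotate(("SourceA", "A1"), TableMapper(...))` and `annotate(("SourceA", "A1"), LineFileMapper(...))` runs exactly the same tool code against two different sources. Note that `annotate` simply passes along whatever the source returns. The quality of the answer is still up to the "power company".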

The relation between Garbage bags and Databases

Friday, January 1st, 2010

In my local supermarket, you can find two brands of garbage bags: There are the “A-brand” garbage bags, and there are the “house-brand” garbage bags. Both come in rolls of 20, each bag holds 60 liters, and both come with the same convenient closing strips. There is only one difference: each roll of A-brand garbage bags costs 40 eurocents more.

Garbage bag (*)

How could this situation exist? Why on earth would you pay anything more than the cheapest possible? They’re just garbage bags, for crying out loud! They’re probably even made in the same factory.

But for some reason, be it superior marketing, brand recognition, or some persistent belief that the more expensive bags really do hold garbage in a superior fashion, there are enough yuppies who dump the expensive brand in their trolleys without thinking twice.

But at the same time, my local supermarket would be foolish to abandon the cheap brand. There are plenty of cheapskate customers who do pay attention to price, and who are really not embarrassed to be seen with a no-brand garbage bag, every Tuesday when they put the trash out, in front of the whole onlooking neighborhood.

In marketing terms this is called segmentation: by catering to each market segment (cheapskates and yuppies), the supermarket can make more total profit than if they had carried either only the cheap or only the expensive brand. As always, Joel explains it best.

I’m sure this is also something the marketing geniuses of Oracle know. Oracle, well-known producer of the enterprise-level database product of the same name, is a marketing force to be reckoned with. You can’t just go to MediaMarkt and buy a box of Oracle. No, if you want Oracle, you have to call them. They’ll send a sales rep, who will drive to your office and show slick spreadsheets during an expensive lunch, while back at headquarters they calculate exactly how much you’re worth, and how much you can be squeezed for site-wide Oracle licenses.

In complete contrast, Sun is not a “marketing” company. Sun is a technology company. They’re the geeks behind the scenes, who have produced a long list of innovative server technology that you never heard about, but that nonetheless powers an important fraction of the internet infrastructure. The list goes on: Java, OpenOffice, OpenSolaris, ZFS, VirtualBox and, interestingly, also a database product named MySQL.

In the open source community, Sun is widely recognized as a company that really “gets” it. Indeed, all of the products just mentioned are open source to different degrees (and some, like Java, to the highest degree possible: full GPL v2).

It’s the suits versus the beards all over again. And they’re up in arms, because Oracle recently acquired Sun. What a shock: in one corner, Oracle, the most closed, most expensive, commercial database system imaginable, used by all the Fortune 500, and in the other corner MySQL, the cost-free, open-source upstart that powers small shops, blogs (including the one you’re reading) and WikiPathways, and now they’re both in the hands of a single company? Somebody check the temperature in hell!

A friend asked if I would sign a petition to stop Oracle’s acquisition of Sun, and thus also MySQL. I’m not in the habit of signing e-petitions, and I won’t sign this one either. First of all because I don’t think it will make a bit of difference, but also because I think it’s premature. This acquisition does not have to be the disaster that some make it out to be.

Just as there will remain plenty of brand-susceptible bioinformatics professors who will keep claiming that Oracle is way better than MySQL, no matter what the application or how much we have to spend, there will also remain plenty of low-budget shops that won’t be able to pay for Oracle licenses, but will happily settle for MySQL’s smaller feature set.

It’s market segmentation all over again. I don’t see why Oracle won’t be able to keep MySQL open and still have a nice profitable business model. All Oracle has to do is to make the upgrade path from MySQL to Oracle a little bit easier. MySQL could be branded as entry-level Oracle, a gateway drug for newcomers in the enterprise database world.

Of course they could also easily fuck it up, but it’s not like there aren’t any competitors: there is PostgreSQL, mSQL, and of course there is always the possibility of forking MySQL itself (which many groups are doing right now). Because once the source is open, it stays that way forever.

No, I’m not worried at all. Happy New Year!

* Image licensed cc-by-sa-2.5 by M. Minderhoud.