What does it take to be a computational biologist?

I would like to talk about what it takes to be a computational biologist, specifically in comparison to being an experimental biologist. If you’re wondering whether instead of becoming a computational biologist you should become a race-car driver, fighter pilot, or ballet dancer, this post probably won’t help you. But if you’re wondering whether a computational biology lab is a better choice for you than an experimental biology lab, this post should provide you with some useful guidelines. To cut right to the chase, here is the take-home message: To become a computational biologist, you need to want to become a computational biologist.

When I interview students who are potentially interested in joining my lab, the interviews inevitably take one of two routes, depending on the students’ background. The students with extensive computational experience usually make a point of emphasizing all the techniques, systems, and languages they have learned, to showcase their technical expertise. By contrast, the students with little computational experience are usually rather timid. They say that they might be interested in computational biology but don’t really know much about it, and also that they don’t know if they would be any good at it [1].

The truth is, I don’t really worry much about pre-existing expertise [2]. Sure, I won’t be disappointed if a student knows a lot already, but it plays a rather minimal role in my decision of whether or not to take a particular student [3]. I think that computation can be learned relatively easily, if you really want to learn it, so what matters much more than your current knowledge is your intention [4]. For this reason, when students express any concern about their ability to be computational biologists, I usually ask them a simple question: Would you rather spend your day in front of a bench pipetting, or would you rather spend your day in front of a computer screen staring at symbols and numbers? Anybody who would rather stare at a computer screen is going to be fine in my lab, and anybody who would rather pipette is not going to have a good time.

In fact, too much pre-existing computational knowledge can be a disadvantage when students think they know things better than they actually do. At least the inexperienced students are a blank slate. They are willing to listen and they accept the conventions of the lab. The more experienced students may have idiosyncratic views on how things should be done, views that may make sense from their perspective but not from the perspective of the person running the lab (i.e., me). For example, a student who insists on using Java for a project when the rest of the lab uses Python is going to cause problems [5], even if that student has lots of experience with Java and none with Python. With some experienced students, I spend as much time re-training them as I would have spent training less experienced students from scratch.

So, if you don’t have a lot of computational experience but you would like to do computational work, ask yourself whether you have the patience to hack away for hours in front of a computer screen until you have solved a problem. If the answer is yes, you’ll be fine. And if you do already have a lot of computational experience, keep an open mind, realize that there is probably still a lot you can learn, and accept that some things are just conventions. Even if you don’t like a particular convention (such as “we use Python”), you should accept it if you want to be successful in your lab [6].

Notes

[1] Actually, more recently there is also a third type: students who think they have computational experience but really don’t. Using MS Word, Excel, or Facebook does not count as computational experience. If you have never written an actual program and never used a command line, you don’t have any meaningful experience.

[2] This statement applies to undergraduate researchers or prospective graduate students. I wouldn’t hire a postdoc without any computational expertise. Somebody who has done a purely experimental PhD is unlikely to be a good match for my lab at the postdoc stage, since we don’t do any experiments whatsoever.

[3] I tend to evaluate students primarily on whether they appear to be motivated, whether they appear to be smart, whether I can connect with them, and whether I think they would fit into the lab.

[4] This point is also made here. As long as people give you a chance to try your hand at programming and you make an effort, you should be fine.

[5] As just one example, it will be more difficult for other students to take advantage of that student’s work and vice versa. Further, once the student leaves, the project may be abandoned or somebody else may have to re-engineer it using the lab’s preferred language.

[6] I acknowledge that there may be situations where the convention a lab has chosen is genuinely poor, and where the student truly knows better than the faculty member how to do things properly. However, I think these situations are rare, in particular for labs run by experienced computational biologists. Moreover, if you as a student really have so much computing experience that you can see all the poor decisions your PI is making, why did you join the lab in the first place? You should have seen these issues ahead of time. For example, if I were a prospective graduate student now, and I were told a lab did everything in Perl and Fortran, I’d run. By contrast, if they used languages I approve of, if they deposited their code on GitHub, and if the code they deposited looked more or less decent, then I would expect to be fine in that lab.