Field of Science

Speaking to computers

I am a Lab Rat. I work in laboratories, I try to figure out gels and results. I pipette tiny amounts of liquid into various locations. I try to see patterns and shapes and fit things together. This is what I do. I also occasionally venture into the world of Literature and try to find patterns there (but only as a hobby unfortunately).

What I don't do is computers. I can think of over twenty ways to re-phrase the instruction 'look for the comma' (probably over thirty ways if I'm allowed to use the 'synonyms' feature in Word) but not one of those ways works for a computer.

Unfortunately there are some things that even Lab Rats need computers for. For instance, searching for a particular domain of DNA, pulling all the results out of a standard BLAST search, and then taking only the relevant information from that. To slightly clarify, BLAST is a bioinformatics programme with a huge database of information about every known and officially sequenced protein and DNA sequence. You type in your sequence (or your name, if you feel bored) and it shows you which proteins in the database it matches. Unfortunately it provides quite a large amount of information about each one, so this is where programming comes in: you tell the computer which bits of the database you want.

Now I did do computer science for IGCSE. I know what pseudocode is, and how to use it, and could probably take a stab at setting commands up in the right sequence as well. Where I fall apart is the bit after that, translating the pseudocode into computer speak. There are some phrases I quite literally cannot do.

IF comma is present
THEN stop
concatenate new information to old (concatenate=attach or add to)

OK. Fine. That works. But how do you say 'comma is present' in computer? The comma is not equal to anything, so that's out. Nor is the comma in relation to anything; it's just a comma in the middle of the string of writing. I have no idea at all how to tell the computer to find me a comma. Perl (which I am using to programme) seems to have no random squiggle that means 'find' or 'look for'.

Another thing that flummoxed me was the concatenation. All the online tutorials showed you exactly how to concatenate, but only if you already knew what the phrases were:

Tutorial: To concatenate x and y type x.=y

Lab Rat: I DON'T KNOW WHAT X AND Y ARE!!! In fact I'm looking for them because I don't know what they are. I want to find that out!;

Tutorial: *is no help at all*

Lab Rat: *Kicks computer, then holds up a large picture of a comma in front of the screen* Just find this, OK? See this picture, find something that looks like this and then give all the writing in front of it to me;
(the semicolon is computer for 'end of line'. I do not know why computers do this when almost every living person uses a full stop).

Computer: *Is not impressed*

I did actually get there in the end: to the surprise and delight of both myself and my supervisor (and probably the computer as well), I managed to get it to do sort of what I wanted. Unfortunately when we looked back over the raw data from our BLAST results we realised that there was a lot more information we wanted, so a lot more code has to be written. And here we hit another problem. The BLAST databases are truly amazing but just not particularly well organised. Some of them have the important information stored under /notes, while others have a separate field called /function. One item we saw even had the full protein function listed under /name. This means that to get all the information we need, we'll have to pull out each of these fields for every single protein, which will provide us with a lot of useless notes that we don't really need.

Lab Rat: Just give me the useful stuff, OK?;

Computer: Variable 'useful' not defined. Random computer squiggles, out of cheese error.

Lab Rat: *gives up on computers*;

Computer: *gives up on Lab Rat*


John Farrell said...

Very informative, LR. You don't have an email listed on the site, so I thought I'd post here to see if you wouldn't mind a fellow science writer sending you some questions about life in the lab?

I'd appreciate it.

Lab Rat said...

Not at all, feel free to ask any questions you like. :) Obviously there may be some that I am unable to answer (for reasons either of legality or lack of knowledge) but I'll attempt to answer anything you'd like to know.

Posting questions in the comments would probably be the easiest way, then I can respond quickly.

Toby said...

I was trying to avoid mentioning anything, but I guess you'd help if I was ill at home with some phage destroying all my bacteria. Maybe.

You probably either want split or regexes (regular expressions) in Perl. For regexes this tutorial seems to start from the beginning.
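
For what it's worth, here is a rough sketch of what both options might look like for the "find the comma and give me everything in front of it" problem. The subroutine names and the sample line are made up for illustration and are not real BLAST output, so treat it as a starting point rather than the finished script.

```perl
use strict;
use warnings;

# "Is a comma present?" index() returns -1 when the substring is not found.
sub has_comma {
    my ($line) = @_;
    return index($line, ',') >= 0;
}

# Option 1: split. Cut the line at the first comma and keep the first piece.
# The limit of 2 stops split from cutting at any later commas.
sub before_comma_split {
    my ($line) = @_;
    my ($first) = split /,/, $line, 2;
    return $first // '';    # an empty line gives back an empty string
}

# Option 2: a regex. Capture everything that is not a comma, up to the first comma.
# Returns undef when the line has no comma at all.
sub before_comma_regex {
    my ($line) = @_;
    return $line =~ /^([^,]*),/ ? $1 : undef;
}

# Hypothetical sample line, invented for this example.
my $line = 'hypothetical protein, partial [Escherichia coli]';

print "comma is present\n" if has_comma($line);
print before_comma_split($line), "\n";
print before_comma_regex($line), "\n";
```

Both of the last two lines should print the text in front of the first comma. The main difference is what happens when there is no comma: the split version hands back the whole line, while the regex version returns undef so you can tell nothing was found.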