👓 Engineering bioinformatics in seconds, not hours | Ryan Barrett

Read Engineering bioinformatics in seconds, not hours by Ryan Barrett (snarfed.org)

It was winter 2014. Pharrell had just dropped Happy, the Rosetta probe had landed on a comet, and President Obama was opening diplomatic relations with Cuba…

…and here at Color, the bioinformatics team had a problem. Our pipeline — the data processing system that crunches raw DNA data from our lab into the variants we report to patients — was slow. 12 to 24 hours slow.

This wasn’t a problem in and of itself — bioinformatics pipelines routinely run for hours or even days — but it was a royal pain for development. We’d write new pipeline code, start it running, go home, and return the next morning to find it had crashed halfway through because we’d missed a semicolon. Argh. Or worse, since we hadn’t launched yet, our live pipeline would hit similar bugs in production R&D samples, which would delay them until we could debug, test, and deploy the fix. No good.