Monday, October 6, 2008

Paper: Scaling Genome Sequencing - Complete Genomics Technology Overview

Scaling human genome sequencing is not exactly about building bigger, faster, and more reliable websites, but it is an interesting scalability problem in its own right. The paper describes a new technology from the startup Complete Genomics for sequencing the full human genome at a fraction of the cost of earlier approaches.

Complete Genomics is building the world’s largest commercial human genome sequencing center to provide turnkey, outsourced complete human genome sequencing to customers worldwide.
By 2010, their data center will contain approximately 60,000 processors with 30 petabytes of storage running their sequencing software on Linux clusters.

Do you find this interesting and relevant to HighScalability.com?

Reader Comments (6)

Interesting read, but not particularly easy to follow without a background in genomics. There's also not much said about the computing architecture. In fact, in a 14-page paper, the information about the datacentre takes up one heading and about 18 lines. There's no mention of specific software packages, Linux distributions, methodologies, technologies or anything, really.

To me, the entire article reads like advertising.

December 31, 1999 | Unregistered Commenter tom_twinhelix

I understand your viewpoint. I studied genomics and bioinformatics at university quite recently. I just get the impression, from what I've read on this site, that most people concentrate on the computing architecture and infrastructure of their systems.

Without a doubt, the capacity of bioinformatics and similar systems biology fields has greatly expanded, and much of this expansion wouldn't have been possible without faster computers and grid computing.
This paper makes an interesting read if you're into systems biology and high-performance computing: http://www.biomedcentral.com/1752-0509/1/S1/P53

December 31, 1999 | Unregistered Commenter tom_twinhelix

Tom, the paper "Less is more: the battle of Moore's law against Bremermann's limit on the field of systems biology" was interesting, thank you!

December 31, 1999 | Unregistered Commenter geekr

Thanks for the comments.

December 31, 1999 | Unregistered Commenter geekr

I had high hopes that this paper would have some serious technical information. And even though it did not, I still think it is worthwhile for us to start learning and thinking about how genomics data can be processed and stored. Ten years from now, data processing of genomics may be the focal point of technology.

December 31, 1999 | Unregistered Commenter Hugh

Interesting news related to personal genomics:

The Personal Genome Project has released the data sets and descriptions of traits, ethnic background and other information of the first ten volunteers (http://www.personalgenomes.org/pgp10.html), who include the project director and nine other people with backgrounds in genetics, medicine, and biotechnology. While the human genome was first sequenced at the beginning of this decade, what's special about this project is that these 10 participants are having their names, genomes, and other personal data gleaned from questionnaires shared openly on the Web (http://www.personalgenomes.org/public/), where interested researchers can freely access them. One of the ultimate aims of the project is to create a public database of 100,000 volunteers that researchers and other parties can use to determine what traits, diseases or other characteristics are associated with specific genetic markers.

December 31, 1999 | Unregistered Commenter geekr
