The Data-Scope Project - 6PB storage, 500GBytes/sec sequential IO, 20M IOPS, 130TFlops


Data is everywhere, never be at a single location. Not scalable, not maintainable. –Alex Szalay

While Galileo played life-and-death doctrinal games over the mysteries revealed by the telescope, another revolution went unnoticed: the microscope gave up mystery after mystery, and nobody yet understood how subversive its revelations would be. For the first time these new tools of perceptual augmentation allowed humans to peek behind the veil of appearance, a new eye driving human invention and discovery for hundreds of years.

Data is another material that hides, revealing itself only when we look at different scales and investigate its underlying patterns. If the universe is truly made of information, then we are looking into truly primal stuff. A new eye is needed for data, and an ambitious project called the Data-Scope aims to be the lens.

A detailed paper on the Data-Scope tells more about what it is:

The Data-Scope is a new scientific instrument, capable of ‘observing’ immense volumes of data from various scientific domains such as astronomy, fluid mechanics, and bioinformatics. The system will have over 6PB of storage, about 500GBytes per sec aggregate sequential IO, about 20M IOPS, and about 130TFlops. The Data-Scope is not a traditional multi-user computing cluster, but a new kind of instrument that enables people to do science with datasets ranging between 100TB and 1000TB. There is a vacuum today in data-intensive scientific computations, similar to the one that led to the development of the Beowulf cluster: an inexpensive yet efficient template for data-intensive computing in academic environments based on commodity components. The proposed Data-Scope aims to fill this gap.
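
To get a feel for why the aggregate sequential bandwidth is the headline number, here is a rough back-of-the-envelope sketch (mine, not the paper's; only the quoted 500GB/s figure is taken from it) of how long a single sequential pass over a 100TB-1000TB dataset would take at that rate:

    # Rough scan-time estimate at the Data-Scope's quoted aggregate
    # sequential bandwidth. Illustrative only: assumes the full 500 GB/s
    # is actually sustained and uses decimal units (1 TB = 1000 GB).
    AGGREGATE_BW_GB_PER_S = 500.0

    def full_scan_seconds(dataset_tb, bw_gb_per_s=AGGREGATE_BW_GB_PER_S):
        """Time in seconds to read dataset_tb terabytes once, sequentially."""
        return dataset_tb * 1000.0 / bw_gb_per_s

    for size_tb in (100, 500, 1000):
        s = full_scan_seconds(size_tb)
        print(f"{size_tb:5d} TB -> {s:6.0f} s (~{s / 60:.0f} min)")

By this estimate a full pass takes roughly three minutes for 100TB and about half an hour for a full petabyte, assuming the aggregate rate can actually be sustained, which is what makes 'observing' datasets at this scale practical.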

A very accessible interview by Nicole Hemsoth with Dr. Alexander Szalay, Data-Scope team lead, is available at The New Era of Computing: An Interview with "Dr. Data". Roberto Zicari also has a good interview with Dr. Szalay in Objects in Space vs. Friends in Facebook.

The paper is filled with very specific recommendations on the team's hardware choices and architecture, so please read the paper for the deeper details. Many BigData operations have the same IO/scale/storage/processing issues the Data-Scope is solving, so it's well worth a look. Here are some of the highlights:

