Computational Observational Astronomy

Datawolves for the SDSS

The SDSS is constructing a 2.5 Terabyte dataset. We must explore machines that make possible efficient analysis of this data.

If Beowulfs are clusters optimized for efficient message-passing parallel processing, Datawolves are clusters optimized both for high I/O rate scans through large datasets and for bringing high compute power to bear on those datasets.
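To make the I/O side of that trade-off concrete, here is a back-of-the-envelope sketch of how long one full scan of the dataset takes when it is striped evenly across the nodes of a cluster. The function name, the node count, and the per-node disk rate are illustrative assumptions, not measurements of any SDSS machine; only the 2.5 Terabyte figure comes from the text above, taken here as 2.5e12 bytes.

```python
def scan_time_seconds(dataset_bytes, nodes, disk_rate_bytes_per_s):
    """Time for one full pass over a dataset striped evenly across nodes.

    Assumes every node reads its share from local disk in parallel at the
    same sustained rate, and that nothing else (network, CPU) is the limit.
    """
    if nodes <= 0 or disk_rate_bytes_per_s <= 0:
        raise ValueError("nodes and disk rate must be positive")
    aggregate_rate = nodes * disk_rate_bytes_per_s
    return dataset_bytes / aggregate_rate


# Hypothetical example: 2.5 TB over 10 nodes, each sustaining 25 MB/s.
SDSS_BYTES = 2.5e12          # dataset size from the text, decimal Terabytes
ASSUMED_NODES = 10           # assumption for illustration only
ASSUMED_RATE = 25e6          # assumption: bytes per second per node

seconds = scan_time_seconds(SDSS_BYTES, ASSUMED_NODES, ASSUMED_RATE)
hours = seconds / 3600.0
```

Under these assumed numbers the aggregate rate is 250 MB/s, so a full scan takes a few hours; doubling the node count halves the time, which is the sense in which such a cluster is optimized for scans.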


GriPhyN and the SDSS

GriPhyN is a collaboration studying how to handle the science of large datasets: they have chosen to focus on the idea of virtual data.

We have written a draft document for GriPhyN. It outlines the SDSS project, the SDSS pipelines, how we approach turning the pipelines into a factory, and a suggested SDSS problem that the GriPhyN collaboration could consider when designing their tools: data management for the Southern Survey.


TAM

The Experimental Astrophysics Group is building a Terabyte Analysis Machine for its science analyses and as a research prototype for database testing and advanced filesystems.

The Terabyte Analysis Machine:

A research cluster aimed at exploring large distributed astronomical databases: a 7 dual node Linux cluster with 500 GB of local disk and 1 Terabyte of global disk, together with SX, a distributed, spatially partitioned database designed for fast queries on complicated Terabyte-scale data. One aim: repartition and re-index the whole database onto local disk for specialized queries, e.g., kth nearest neighbors.
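As an illustration of why spatial partitioning helps a kth nearest neighbor query, here is a minimal sketch that bins points into a uniform grid and searches outward ring by ring from the query point's cell. It is not SX and does not reflect how SX partitions or indexes data; the function names, the flat 2D coordinates, and the uniform cell size are all assumptions made for the example.

```python
import math
from collections import defaultdict


def build_grid(points, cell_size):
    """Partition 2D points into square cells; maps cell -> list of indices."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        grid[cell].append(idx)
    return grid


def kth_nearest_distance(points, grid, cell_size, query_idx, k):
    """Distance from points[query_idx] to its kth nearest neighbor."""
    if not 1 <= k <= len(points) - 1:
        raise ValueError("k must be between 1 and len(points) - 1")
    qx, qy = points[query_idx]
    cx, cy = math.floor(qx / cell_size), math.floor(qy / cell_size)
    dists = []
    ring = 0
    while True:
        # Visit only the cells on the square ring at this cell distance.
        for i in range(cx - ring, cx + ring + 1):
            for j in range(cy - ring, cy + ring + 1):
                if max(abs(i - cx), abs(j - cy)) != ring:
                    continue
                for idx in grid.get((i, j), ()):
                    if idx != query_idx:
                        dists.append(math.dist(points[idx], (qx, qy)))
        # Every point not yet visited lies at least ring * cell_size away,
        # so the kth distance is final once it fits inside that radius.
        if len(dists) >= k:
            dists.sort()
            if dists[k - 1] <= ring * cell_size:
                return dists[k - 1]
        ring += 1


# Hypothetical usage with made-up coordinates.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (3.0, 0.0)]
cells = build_grid(pts, cell_size=1.0)
d2 = kth_nearest_distance(pts, cells, 1.0, query_idx=0, k=2)
```

The point of the partitioning is that a query touches only the cells near it rather than the whole catalog, which is what makes re-indexing onto local disk attractive for this kind of query.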

We are designing this system around four archetypal analysis tasks:

TAM As Built:


SX Bricks

Dell PowerEdge 4400s make fine SX database bricks:

The first of ours is sdssdp5.


IDE Disk Farms

We need to hold the reduced frames and atlas image database.

EIDE disks are so cheap that putting Terabytes onto a single node is attractive, if read/write performance is not an issue. See "Working with Arrays of Inexpensive EIDE Disk Drives" by Sanders, Riley, Cremaldi, Summers, and Petravick.

The first of ours is sdssdp6.


What is the image at bottom? Caustics in a pool of water. Remarkably similar to the large scale distribution of galaxies in the Universe that the SDSS is designed to study. Clusters of galaxies would be the bright knots; I wish to find them.

James Annis
June 1, 2000
Last Updated: Monday, 05-Jun-2006 07:50:02 CDT