As much as we like all of our vendors, we would still rather be safe than sorry, so we put their machines through some very careful testing. There are several phases of this testing, both before and after we make a purchase.
We don't send our bids out for farm workers to just any PC manufacturer. We have to evaluate vendors first to see whether their hardware meets our standards, and whether they will give us what we asked for. We run this evaluation about once every year and a half or so.
So we start with a very broad and open list of vendors. We give them our criteria for a farm worker and have them send us an evaluation machine. We review the machine they submit to be sure it meets the specification that we provided. We then take this machine and run it through our test suite for seven days.
Each vendor submits a quote with their evaluation unit. This quote tells us what it would cost us to buy units configured the same way as the evaluation unit. We measure performance in what we call Fermi Units, where 1 Fermi Unit is roughly equivalent to the performance of one 1 GHz Pentium III processor. It all comes down to cost per Fermi Unit.
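As a rough illustration of the cost-per-Fermi-Unit comparison, here is a minimal Python sketch. The function name, the quoted price, and the benchmark score are all made up for illustration, and the way a node's Fermi Unit rating is measured by the test suite is not shown here.

```python
def cost_per_fermi_unit(unit_price, fermi_units):
    """Return the price paid per Fermi Unit for one quoted node.

    fermi_units is the node's measured performance, where 1.0 is roughly
    one 1 GHz Pentium III (a hypothetical input, measured elsewhere).
    """
    if fermi_units <= 0:
        raise ValueError("fermi_units must be positive")
    return unit_price / fermi_units


# Hypothetical quote: a 2000-dollar node that benchmarks at 4.0 Fermi Units.
example = cost_per_fermi_unit(2000.0, 4.0)
print(example)  # 500.0
```

A cheaper node is not automatically the better buy: what matters in this comparison is the ratio, so a more expensive node that delivers proportionally more Fermi Units comes out even.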
We take the nodes that meet our requirements and rank them based on price and performance. The best 5 vendors become our qualified vendor list. We also keep the next 5 vendors listed in order of price/performance. Should there be a need to disqualify one of the top five approved vendors, the #6 ranked vendor from the evaluation is put onto the qualified vendor list. From here on, all bids are sent to the five vendors on this list. For the most part the lowest bidder wins the bid. The exception is that quotes often don't match what we asked for.
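The ranking and promotion rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vendor names and numbers are invented, the function names are hypothetical, and the ranking is reduced to a single cost-per-Fermi-Unit figure, which may be simpler than the real evaluation.

```python
def build_vendor_lists(results, qualified_size=5, alternate_size=5):
    """Split passing vendors into a qualified list and ranked alternates.

    results is a list of (vendor_name, passed_tests, cost_per_fermi_unit).
    Vendors that failed the test suite are dropped; the rest are ranked
    with the lowest cost per Fermi Unit first.
    """
    passing = sorted((r for r in results if r[1]), key=lambda r: r[2])
    names = [r[0] for r in passing]
    qualified = names[:qualified_size]
    alternates = names[qualified_size:qualified_size + alternate_size]
    return qualified, alternates


def disqualify(qualified, alternates, vendor):
    """Remove a qualified vendor and promote the best-ranked alternate."""
    if vendor not in qualified:
        raise ValueError(f"{vendor} is not on the qualified list")
    qualified = [v for v in qualified if v != vendor]
    alternates = list(alternates)
    if alternates:
        qualified.append(alternates.pop(0))
    return qualified, alternates


# Invented evaluation results: (name, passed the test suite, cost per Fermi Unit)
results = [
    ("A", True, 500.0), ("B", True, 450.0), ("C", False, 300.0),
    ("D", True, 600.0), ("E", True, 520.0), ("F", True, 480.0),
    ("G", True, 700.0),
]
qualified, alternates = build_vendor_lists(results)
print(qualified, alternates)
```

Note that vendor C has the lowest cost in this made-up data but never reaches either list, because it did not pass the tests.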
One thing that surprises folks is that we don't have any of the very large name vendors on our list. This is primarily because they tend to tell us what we want instead of the other way around.
When an RFP goes out, the vendor is asked to provide a bid that includes not only the requested number of nodes, but also the racks, patch panels and other infrastructure needed to support them. Our specifications detail the exact requirements. A recent RFP sought 440 nodes mounted in 11 racks with Cyclades power management and console servers. The network infrastructure is already in place in the compute facility and is connected to these racks via trunk cables from our switches to each rack's patch panel.
When these machines are delivered and installed, we do a 30-day burn-in, where all of our testing programs run for the whole 30 days. There are all sorts of rules that determine what qualifies and what doesn't. It might seem extreme that we expect 98% uptime (and any downtime during a day counts as a full day down), but we've found that good vendors can meet this without too much worry.
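To make the uptime rule concrete, here is a small Python sketch. It rests on an assumption the text does not confirm: that the 98% figure is computed over node-days across the whole delivery rather than per node. The function names, node names, and outage data are hypothetical.

```python
def burn_in_uptime(outages, node_count, burn_in_days=30):
    """Return the fraction of node-days that had no downtime at all.

    outages is an iterable of (node_name, day_number) pairs, one per
    outage event. Several outages by the same node on the same day
    collapse into a single down day, since any downtime during a day
    counts as a full day down.
    """
    down_node_days = {(node, day) for node, day in outages}
    total_node_days = node_count * burn_in_days
    return (total_node_days - len(down_node_days)) / total_node_days


def passes_burn_in(outages, node_count, burn_in_days=30, required=0.98):
    """True if the delivery meets the required uptime fraction."""
    return burn_in_uptime(outages, node_count, burn_in_days) >= required


# Invented log for one 40-node rack: node01 went down twice on day 3
# and once on day 4; node17 went down once on day 10.
outages = [("node01", 3), ("node01", 3), ("node01", 4), ("node17", 10)]
print(passes_burn_in(outages, node_count=40))  # True
```

Under this assumption a 40-node rack has 1200 node-days in a 30-day burn-in, and the three down days in the example leave it well above the threshold.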
Components of this test exercise and load test all aspects of the node. For example, we use the SETI@Home software to put a constant load on the processors. This application has very little I/O involved, so it does not suffer from waits for I/O which translate to CPU idle time. One of the biggest advantages of the SETI@Home software is that it is restartable. If a node needs to be shut down or rebooted, the SETI software picks up right where it left off when the node is available again. Personally, I also appreciate that the work done on these CPUs is productive research. It is not like a disk test where one simply makes hundreds of copies of the Linux kernel in order to fill a disk drive, and then you delete all those copies and repeat the test.
We have packaged our suite of tests into a user-friendly, easy-to-install burn-in test. We put all of the packages into rpm format so that they can be quickly installed and uninstalled. We also made a few utilities to make monitoring easier.
For those who want to see whether their hardware can run our tests, here are the packages.
Many people come to this site because of our high standing in the SetiAtHome status. It's actually pretty amazing how high up in the standings we are when we only run these tests every few months. But what is even more amazing is how fast we drop in the rankings when we aren't running the tests.
With this last burst of strength we finally beat the average and move by default into the top spot. That doesn't make sense (and I'll probably rewrite that sentence) but take a look at this on our seti page.
** The SetiAtHome rpm was pulled off this site as the License for SetiAtHome clearly states that Distribution is prohibited. If you are one of our test vendors and need this software, please contact us.
The URL for this page is
http://home.fnal.gov/~kschu/setiathome/farmtest.html
This page is currently maintained by: Ken Schumacher. Please send your constructive comments or suggestions. © Copyright 2004 by Ken Schumacher and Troy Dawson. The original version of this page was written by Troy Dawson