| This is the SDSS. |
| This is the SDSS during production. |
| Any questions? |
Sometimes you just have to strap yourself in and fly the thing.
| Survey Quality phone cons:  | Every other Tuesday at 2:30 Central |
| phone number: 866-215-5503 | |
| code: 8328108 |
Spectroscopic Systems Quality During Operations
The Plan: Data Quality at the Mountain and at the Factory
The Requirements: Spectroscopic Requirements During Survey Operations
The Guidelines: The Wolff Report
A start: Spectro Quality
Data Quality at the Mountain and at the Factory
Jim Annis SDSS/Fermilab
Scot Kleinman SDSS/Apache Point Observatory
February 20, 2003
I (JTA) was asked by John Peoples to look into
spectro quality assurance, and we (SJK and JTA) were asked by
Bill Boroski to work together to develop a plan to ensure the
quality of the spectroscopic system as a whole.
Our proposal is to organize the spectroscopic systems quality effort,
from data taking at the mountain top to plate acceptance at
the factory, around a spectroscopic systems requirements document
laying out exactly what survey quality data is.
=== Contents:
The Wolff Report:
Quality Assurance
Available Resources
Data Quality at the Mountain and the Factory
Unexpected Problems
Summary of the Plan
=== The Wolff Report:
The motivation for this proposal is to assure ourselves we
are obtaining data of the high quality we have come to
expect of the SDSS.
We take guidance from the Wolff report, a review of the SDSS
led by Sidney Wolff in 2000.
Their recommendations:
* Assess the then current performance of the various
telescope and instrument systems, reduction pipelines, etc.
* Determine what science can be done (and cannot be done) with
the already achieved level of performance.
* Decide what limited set of further improvements have the
highest priority and identify the resources to complete
those improvements on schedule.
* Freeze the systems to the maximum extent possible.
* Proceed to production.
They went on to say: "It is clear that absolutely
outstanding science can be achieved with today's performance
and must not be lost because of a continued search for improvements.
This is a case where, as engineers often remind us,
`better is the enemy of good enough.'"
That was in 2000: subsequent commissioning and testing
proved our data quality was high. Our goal should be that
the data obtained over the lifetime of the survey be as good
as those taken in 2002.
=== Quality Assurance
The basis for quality assurance -must be- a concrete set
of requirements that encapsulate the project goals.
=== Survey Requirements
The requirements in the SDSS Science Requirements document
of Michael Strauss are clearly the starting position.
We have examined existing spectroscopic requirements,
including Andy Connolly's "Requirements and Status of the SDSS
Spectroscopic Systems" and the later annotated version of
Michael Strauss dated 12/04/2000.
As we read these documents, we kept in mind the project management
wisdom of Jim Crocker, who argues that useful science
requirements have certain characteristics.
At heart, the existing documents are commissioning requirements,
and mostly lack the Crockerian structure we need. We found
it necessary to make another pass over the requirements
to emphasize the needs of production. A copy is attached,
and we have reproduced Crocker's list in it. (We note that
many of the numbers in it are working numbers, to be refined
as the tests are put into place.)
Our plan is to organize the spectroscopic systems quality effort
around the requirements, from mountain top data taking,
to plate acceptance at the factory, to working the problem
report system.
=== Available Resources
What resources do we have available?
1) Jim Annis's service time, as managed by Steve Kent
2) a few minutes per plate by the observing team, as guided by
Scot Kleinman
3) 1/2 FTE of Mark SubbaRao's time, as managed by Josh Frieman
4) 1/2 FTE of Nickolai Kuropatkine, the Fermilab computing
professional working with EAG devoted to factory
driving and instrumentation, as managed by Chris Stoughton.
5) the service time of Hubert Lampeitl, a new Fermilab postdoc,
as managed by Annis and Kent.
6) 1 day/dark-run of Jon Brinkmann's time, as managed by French Ledger
7) some fraction of Eric Neilsen's time, as managed by Bill Boroski
Our proposal is to coordinate these individuals and
their tasks, to form them into a coherent structure with
appropriate levels of backup and support. Our proposal is backed
by agreement with the group members.
=== Data Quality at the Mountain and the Factory
Our proposed system is the following:
1) The monthly checkout of the spectrographs starts the process.
During the bright run, the observers take a defined set of data
(see http://sdsshost.apo.nmsu.edu/sdssProcedures/sopMonthly.html).
Jon Brinkmann, as instrument scientist, is tasked to analyze
the data for bias levels, gain, and read noise in order to
a) make sure the spectrographs are in working order,
b) evaluate the measurements against the baseline established
by previous measurements, and c) set the baseline for the dark run.
Here is a natural checkpoint for taking action to fix minor problems
with the spectrographs. The instrument scientist would place
the results on a web page which the observers and factory can check.
A report from the instrument scientist would appear in the
shakedown observing report.
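A baseline comparison of this kind can be sketched as follows. This is
only an illustration, assuming the baseline is the mean and scatter of
previous monthly measurements; the function name, the 3-sigma tolerance,
and the read noise numbers are hypothetical and are not taken from the
requirements document.

```python
from statistics import mean, stdev

def check_against_baseline(history, current, n_sigma=3.0):
    """Compare one monthly measurement with the baseline.

    `history` holds previous measurements of the same quantity
    (for example read noise); `n_sigma` is an assumed tolerance.
    Returns True when `current` lies within n_sigma standard
    deviations of the historical mean.
    """
    if len(history) < 2:
        raise ValueError("need at least two previous measurements")
    mu = mean(history)
    sigma = stdev(history)
    return abs(current - mu) <= n_sigma * sigma

# Made-up read noise values, for illustration only.
previous_read_noise = [4.9, 5.1, 5.0, 5.2, 4.8]
print(check_against_baseline(previous_read_noise, 5.1))  # within baseline
print(check_against_baseline(previous_read_noise, 7.0))  # needs attention
```

A measurement that passes would then be appended to the history, which
is one simple way to set the baseline for the coming dark run.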
2) During the dark run, the observers make a series of
quality checks on a per night basis, the heart of which
is a visual inspection of ~10% of the data.
The new steps, performed nightly:
1) first check of the correctness of the spectro report,
using the Declaring Spectro Data Bad guidelines.
2) a visual check of the bias frames
3) a visual check of one set of flats
4) a visual check of one set of arcs
5) a visual check of one exposure from each plate
6) examination of 5% of the spectra themselves
The astronomers at the factory provide backup support to
the observers, making the visual checks of the data if
the observers find it necessary to skip the step, and double
checking the spectroscopic report. Our plan is to add
a checkbox to the plate database noting that this ~10% data set
has been examined on the given MJD. We will build a "Spectro QA"
database, containing both good and bad images and good and
bad spectra, at APO to provide a baseline against which to
check suspect data. Kleinman has offered to ensure these inspections
are carried out.
Kuropatkine will perform the factory backup support, with the
provision that training is provided by Annis.
These checks are in the class of sanity checks; one should
not underestimate the ability of experienced observers to
locate problems just by looking at the data. We are developing
a set of procedural guidelines on the inspection. Drafts exist;
the initial set of guidelines should be in place for the
February/March 2003 dark run and they should quickly coalesce
into their final form.
3) Automated monitoring of the data is performed by the
quick reduction software, SOS. Monitoring of the SOS results
is performed by the observers. The requirements document
provides the basis for evaluating the SOS outputs.
The spirit of the monitoring is captured in the way
the observers watch the SOS for wsigma and xsigma measures:
if they see wsigma or xsigma out of bounds they pursue a collimation
of the spectrographs. The first step is to ensure monitoring
of the requirements via SOS outputs. This requires some work
to ensure correspondence between SOS outputs and the needed
requirement tests. The second step is evaluating all
of the SOS outputs and warning messages. We expect the latter to
happen, but the former is the high priority item. Lampeitl has
offered to work with Annis and Kleinman on working the SOS outputs.
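The out-of-bounds check on wsigma and xsigma described above can be
sketched as a simple table of limits. This is a sketch under stated
assumptions: the limit values and the dictionary layout are hypothetical
(the real bounds belong in the requirements document), and it does not
reflect the actual format of the SOS outputs.

```python
# Hypothetical limits, for illustration only; the working numbers
# are to come from the requirements document.
LIMITS = {
    "wsigma": (0.0, 1.2),
    "xsigma": (0.0, 1.2),
}

def out_of_bounds(measurements, limits=LIMITS):
    """Return the sorted names of monitored quantities that are
    missing or fall outside their (low, high) limits."""
    flagged = []
    for name, (low, high) in limits.items():
        value = measurements.get(name)
        if value is None or not (low <= value <= high):
            flagged.append(name)
    return sorted(flagged)

# Made-up measurements for one exposure.
flagged = out_of_bounds({"wsigma": 1.0, "xsigma": 1.5})
if flagged:
    print("check collimation:", ", ".join(flagged))
```

Treating a missing value as a failure is a design choice: a requirement
with no corresponding SOS output is itself something to follow up.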
4) At the factory, a different set of data quality checks is performed.
Here the emphasis is on quantitative data acceptance checks---
whether or not the plate meets survey requirements. We believe
all of the required testing may be performed on SOS/spectro2d/spectro1d
outputs. These may be performed automatically. Unfortunately,
the limitations of SOP are such that reading the observing logs
is required to ensure bad data is not mixed with good in the
reductions. The observers provide the first check, the factory
the second check, but in any case a human is required to sign
off that the reports have been read before a plate may be
declared done. This will happen on lunar cycles: the end of the
dark run marks the beginning of the final pass of the observing
logs, and thus of plate acceptance. Annis is tasked with declaring
plates done. In order to perform the necessary tests, we will
rsync the outputs of SOS back to Fermilab, and in order to compare
these over all plates we will find it necessary to run SOS
at Fermilab on the early plate data. We further believe it
will be necessary to learn how to explode the logs and postscript
output of spectro2d into manageable pieces for use in the tests.
Annis is tasked to put these tests into place; Lampeitl has offered
to take up the task of modifying SOS. The activities are reported
in the dark run report.
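The acceptance rule in this step (automated tests plus a human sign-off
on the observing logs) can be sketched as below. The record layout, the
field names, and the plate number are hypothetical; this is not the
schema of the plate database.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class PlateReview:
    """Hypothetical record of one plate's acceptance state."""
    plate_id: int
    # Name of each quantitative test mapped to pass (True) or fail (False).
    test_results: Dict[str, bool] = field(default_factory=dict)
    # Who signed off that the observing logs were read; None if nobody has.
    logs_signed_off_by: Optional[str] = None

def can_declare_done(review: PlateReview) -> bool:
    """A plate may be declared done only if at least one test was run,
    every test passed, and a human has signed off on the logs."""
    tests_passed = bool(review.test_results) and all(review.test_results.values())
    signed_off = bool(review.logs_signed_off_by)
    return tests_passed and signed_off

# Made-up example: tests pass, but nobody has read the logs yet.
review = PlateReview(plate_id=1000, test_results={"signal_to_noise": True})
print(can_declare_done(review))   # False until the sign-off is recorded
review.logs_signed_off_by = "JTA"
print(can_declare_done(review))   # True
```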
5) An important function of the factory is to provide redundancy
for the inspection work at the mountain, and to be a backstop
for issues flagged by the observers. If the inspection is not
performed at the mountain, the personnel of the factory can
perform it. A double check of the spectro report will always
be made. For these jobs, which are natural extensions of
production, Kuropatkine has offered to perform the work.
More extraordinary issues raised by the observers, such as serious
SOS warnings, must be followed up. Annis will be tasked with
ensuring that either the collaboration has picked up the issue,
usually via the mailing lists, or that a responsible person
is contacted. This line of contact will most often run through
Eric Neilsen as the developer-observer.
6) At the end of the dark run, a different set of data quality
checks kicks in. There is a set of survey requirements that
have to do with repeatability, and clearly we care about
spectrographic capabilities changing with time. We propose
to define a pair of tiles to serve as repeatability tiles,
most likely one centered at ra,dec = 269.5, 66.5 (the NEP),
at the edge of stripe 45, and one at ra,dec = 55, 0 at the end
of stripe 82. During shake time, one of these plates will be
observed and reduced. The observations of these plates will
provide a time series of reproducibility against the same objects
in a variety of conditions. The quality assurance fibers of
the plates in the dark run will also be used; in number they
are about equal to a repeatability plate contribution.
The tests necessary are given in the requirements document.
The tests will be taken from existing tests used to verify
spectro1d performance on new reruns and modified to run
monthly. SubbaRao has offered to pick up the tests, modify
them as necessary, and run them. The results will be
put on a web page, and reported in the dark run report.
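One statistic such a repeatability time series might track is the rms
velocity difference between two observations of the same objects. The
sketch below is an assumption on our part, not one of the tests defined
in the requirements document; the object identifiers and redshifts are
made up.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def rms_velocity_difference(epoch_a, epoch_b):
    """RMS of c * (z_b - z_a) / (1 + z_a) over objects present in
    both epochs. Each epoch maps an object identifier to a redshift."""
    common = sorted(set(epoch_a) & set(epoch_b))
    if not common:
        raise ValueError("no objects in common between the two epochs")
    diffs = [
        C_KM_S * (epoch_b[obj] - epoch_a[obj]) / (1.0 + epoch_a[obj])
        for obj in common
    ]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Made-up redshifts for two observations of the same repeatability plate.
first = {"obj1": 0.1000, "obj2": 0.2500}
second = {"obj1": 0.1001, "obj2": 0.2499}
print(rms_velocity_difference(first, second))
```

Computed once per month, this gives a single number per pair of epochs
that can be plotted against time on the web page.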
7) The final stage of astronomer inspection of the spectra
happens at Chicago. Between 5 and 10% of all spectra are
manually inspected by Mark SubbaRao in the normal
course of pipeline operations, based on a series of warning
flags output by spectro1d. While the primary purpose
of this inspection is to correct redshifts and classifications
when necessary, a clear side effect is an additional data
quality check on a per plate basis. SubbaRao will extend
the normal spectro1d procedures to report problem plates to
the factory team to aid in a) assessing the acceptability
of a plate, and b) identifying hardware problems that
need to be addressed.
8) The closing of the quality loop happens when the production
team declares a plate done or rejected, and this information
is propagated to the plate database on the mountain.
This is the verification or overriding of the declaration
of done by the observers based on mountain top indications.
The propagation will happen through Steve Kent, who
monitors the plate supply at the mountain, in transit,
and in storage.
9) There is a set of requirements having to do with the
analysis software, of the form "quasars will not be
misidentified more than 5% of the time". These requirements
have little to do with the acquisition of the data at the
mountain, but everything to do with the spectroscopic
system as a whole. The spectro1d team long ago set up tests
to check the spectra against the Connolly requirements on
classification and redshift using 45 testbed+validate
plates manually inspected several years ago. We will update
the program to reflect the new requirements and integrate this
quality program into our effort, reporting on them each time we
pursue a new rerun of the spectrographic data. SubbaRao has offered
to pick up the tests, modify them as necessary, and run them.
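A requirement of the quoted form can be checked against the manually
inspected plates along the following lines. This is a minimal sketch,
not the spectro1d test suite: the class labels, the object identifiers,
and the reading of "misidentified" as "manually classified as a quasar
but labeled otherwise by the pipeline" are our assumptions.

```python
def misidentification_rate(manual, pipeline, target="QSO"):
    """Fraction of objects manually classified as `target` that the
    pipeline classifies as something else (or not at all)."""
    targets = [obj for obj, cls in manual.items() if cls == target]
    if not targets:
        raise ValueError("no manually inspected objects of class " + target)
    missed = sum(1 for obj in targets if pipeline.get(obj) != target)
    return missed / len(targets)

# Made-up sample: 20 manually confirmed quasars, one of which the
# pipeline calls a star.
manual = {"q%02d" % i: "QSO" for i in range(20)}
pipeline = dict(manual)
pipeline["q07"] = "STAR"

rate = misidentification_rate(manual, pipeline)
MAX_RATE = 0.05  # from the quoted form "no more than 5% of the time"
print(rate, "meets requirement" if rate <= MAX_RATE else "fails requirement")
```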
10) Issues raised in the observing reports that relate to
spectro quality must be followed up on short time scales,
and this includes critical high problem reports. Our system
will mirror that in place at the mountain top for the day crew.
We will designate one person who will, M-F and best effort
weekends, read the observing reports and ensure problems
are followed up on, either by the collaboration in the mailing
lists or by the designee following up by contacting the known
experts in the area, a well defined list. This followup should
be initiated by 1:00 PM Central on the day following
observations. Annis is the primary designee, with Neilsen
the alternate.
11) Lastly, many or most of the issues raised by this
quality control system will be captured as problem reports, PRs.
We will monitor the Gnats database for PRs relevant to
spectroscopic systems, evaluate them in terms of the
requirements, consult with the observers and survey operations,
and devote resources to fixing the problems. We expect
Eric Neilsen to take the lead in this, in the forum of
a biweekly survey quality phone con where the spectro quality team
will meet and discuss the issues. We expect Bill Boroski to
attend initial meetings in order to coordinate Survey resources
in the context of the survey as a whole.
=== Unexpected Problems
One of the important tasks of the spectroscopic quality system
is to pick up unexpected problems soon after they arise.
While we cannot predict the problems in advance, we can examine
the previous problems to ask if the system would have caught them.
For obvious problems like the Red Monster and the internal
LEDs being on, the answer is that visual inspection would have
caught them. For a class of problems related to collimation,
the ``bad collimation data'' and the slit latches being
unlatched, the answer is that the SOS monitoring would catch
the symptom, though the actual cause of the latter problem would
have taken sleuthing. A problem like the wrong plugMapM file
being used would be caught if it was the wrong plugging, perhaps
missed if it was merely an older mapping of the current plugging.
(For this particular problem, a test being implemented at the
factory will now catch it.)
We conclude that most of the unexpected problems would have
been picked up. Our ability to do so in the future is related
to how complete the science requirements document is; we
believe it solid.
=== Summary of the Quality Plan
In place at the mountain are a set of procedures for accepting
and rejecting plates that are thought to produce survey quality
data. The requirements we have laid out form the basis
for a systematic verification of this belief. The plan we
propose makes the verification a part of the routine.
- Monthly Checkout:
who: observers and the instrument scientist
goal: instrument readiness
report: Survey Operations phone con and observing report
- Data Acquisition:
who: observers with Fermi backup
goal: real time quality checks
report: observing report
- Data Production:
who: data production team
goal: fine grained quality checks
report: dark run report, Survey Quality phone con
- Monthly:
who: observers, Fermi and Spectro1d
goal: repeatability tests
report: dark run report, Survey Quality phone con
- Analysis:
who: Spectro1d team
goal: verification of science analysis code
report: rerun report
James Annis
February 2003, Chicago