Fermilab SDSS Data Distribution

Retrieving SDSS data from FNAL

As you might guess, this document describes several ways for you to retrieve SDSS data from Fermilab. First, let's look at what systems we'll be using.

Computer Systems

There are three computer systems or clusters dealt with here: the FNAL/SDSS cluster, the FNALU cluster (fsgi02 and fsgi03 for example), and finally your local machine or cluster.

How to Get the Data

There are four methods of retrieving SDSS data from Fermilab, listed here in approximate order of ease:

(1) Transmit
(2) From disk
(3) Tape robot
(4) Remote tape read

Each method is described below:


(1) Transmit:

After the processing of each run is completed, Fermilab will transmit a subset of this data to anyone who wants it. Fermilab will announce what is available and when the transfer will take place about a day ahead of time. You choose which files you want. Fermilab will then send the predetermined subset of the data to a predetermined directory in a predetermined account on a predetermined machine at your institution. This should be the easiest and fastest way for your institution to get fresh data.

The current method is via scp with encryption disabled and authentication by RSA. Although the channel is clear text, nothing passed over the network should allow anyone listening in to access the account. No passwords or keys are passed, just an encrypted challenge.

Setup:

First, you need to designate a machine, an FNAL data account, and a directory where Fermilab can scp the data, plus a special port for ssh if you wish, and let the Fermilab "data guru guru" know. You will also need to decide which files you want; the standard menu can be found at http://sdss.fnal.gov:8000/~bclee/guru/menu.txt. Please send all of that info to Brian Lee (bclee@fnal.gov, 630-840-6646).

machine: Full machine address.
account name: FNAL data account name. "sdssdp" is the default, but it can be anything.
directory: Directory for the data (many GB) to be placed in.
port: Port for connection. Optional but recommended.

Second, on your machine, you will need a copy of sshd v1 running which allows "none" as an encryption option. (We need to scp -c none.)

There are two ways to do this: the more secure way and the less secure way. We strongly recommend the more secure way, described in full detail below. However, several sites have chosen the less secure but much easier method, so I will briefly describe it. If you choose the easy way out, please read through the more secure method first so that you know the risks.

Step 2 the easy way: recompile (or compile if you never had it) a copy of ssh that allows no encryption. In the directory with the ssh source code:

   > ./configure --with-none (plus any other options you want)
   > make
   > make install

That's it. No playing with config files, no multiple versions, etc. Please read the more secure method for more details (like where to get ssh) and so that you know what you're getting yourself into. We do not recommend the easy method.

Step 2 the more secure way: there are a lot of details in this method, but it all essentially comes down to this:

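CHECKLIST:

1) Compile a second copy of ssh version 1.x that allows "none" as an encryption option, installed in a separate directory from your default version.
2) Give this copy of sshd a separate (non-default) config file so that only the FNAL data account can use it.
3) Start this copy of sshd with that config file.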

Now the details. If your machine will be used by any other users, you should compile two versions of ssh. The first will be the default version used by all the normal users on that machine and compiled however you see fit. The second copy will be for the FNAL data transmissions. This is not necessary, but it will improve the security of your system and prevent other users from abusing the no encryption option.

The second copy, used for the FNAL data transmissions, needs to be version 1.x (version 1.2.27 is the newest at this time) and needs to be compiled to allow "none" as an encryption option. The later revisions of v1 don't allow this by default, so you will probably have to reconfigure and recompile a new copy. Be sure to install this in a separate directory from your default version so that it doesn't overwrite it! Briefly:

   You can get the source code from: ftp.cs.hut.fi:/pub/ssh
   > tar -xvzf ssh-1.2.27.tar.gz (above the directory you want it in)
   > cd ssh-1.2.27
   > ./configure --with-none (plus any other local options you want,
                              like where to install it so that it 
                              doesn't overwrite your default version!)
   > make
   > make install

We can now configure this copy of sshd with a separate (non-default) /etc/ssh_config file so that only the FNAL data account can use it. (You won't want to use this for your default config file; it will severely limit the functionality of ssh for other users!) This separate config file will prevent other users on your system from doing anything stupid like sending passwords in clear text using this second copy of ssh. With the default config file, a user would still have to maliciously use the "-c none" option to do this, so it couldn't happen by accident. The configuration presented here will stop even those pesky and determined yet foolish users from compromising your system.

A sample config file can be found at http://sdss.fnal.gov:8000/~bclee/guru/ssh_config_fnal. The suggested non-default choices are listed below, along with what they do, in particular with regard to users on your system other than the FNAL data account.

The simplest thing for you to do would be to copy ssh_config_fnal to /etc/ssh_config_fnal on your system. You will need to give sshd the name of this config file when you start it. For instance if you do name it /etc/ssh_config_fnal:

   > sshd -f /etc/ssh_config_fnal

Third and finally, you will need to copy the Fermilab SDSS public key: http://sdss.fnal.gov:8000/~bclee/guru/sdssdp_identity.pub

Copy the key into the ~(Fermilab_scp_account)/.ssh/authorized_keys file in the FNAL data account on your machine. (Create the file if it doesn't already exist.) Note that the from= option is used in this key so that it can only be used by a user at fsgi03.fnal.gov.

Some tips on adding keys: the key must be all one line. The home directory, .ssh sub-dir, and the authorized_keys file must be writable only by the user (unless specified otherwise in your configuration of sshd).
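
The tips above could be automated roughly as follows. This is only a sketch: the function name install_fnal_key and its two arguments are made up for illustration and are not part of ssh or of any Fermilab script. It refuses a key file that is not exactly one line, appends the key, and removes group/other write permission from the three paths mentioned.

```shell
# Hypothetical helper, not part of ssh or any Fermilab tool.
# $1 = home directory of the FNAL data account on your machine
# $2 = file holding the downloaded public key
install_fnal_key() {
    home=$1
    keyfile=$2
    # The key must be all on one line; refuse anything else.
    nlines=$(grep -c '' "$keyfile") || return 1
    if [ "$nlines" -ne 1 ]; then
        echo "key file must contain exactly one line" >&2
        return 1
    fi
    mkdir -p "$home/.ssh" || return 1
    # Append, so that any keys already in the file are kept.
    cat "$keyfile" >> "$home/.ssh/authorized_keys" || return 1
    # Writable only by the owner: drop group and other write bits.
    chmod go-w "$home" "$home/.ssh" "$home/.ssh/authorized_keys"
}
```

It would be run as the FNAL data account (or as root), for example: install_fnal_key ~sdssdp sdssdp_identity.pub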

That should be it! Just sit back and wait for the data to flow in (and fill all your disk space!).

FYI, these are the commands the script currently issues:

ssh -p $port -l $usrID $address mkdir -p $remoteDir
scp $scpOptions $localDir $remote
    where scpOptions includes things like  -P $port -c none, etc.
Note that the no encryption option is only used for the scp commands. I may also occasionally ssh into your system to check on the data if things are not going well.

If you want something you didn't get from the scp, you have three other options, methods (2) through (4) below:


(2) From disk (if it's there already):

Log into fsgi03.fnal.gov and check the /usr/sdss/data01/imaging directories. If what you want is there already, it's yours to copy.

The machine fsgi03 is on the FNALU cluster. See the SDSS use of FNALU cluster document at http://sdss.fnal.gov:8000/~bclee/guru/sdss_guide.txt for instructions on getting an account and using these machines.


(3) Tape robot:

PLEASE NOTE: This section is somewhat out of date, and in particular roboRead may not perform as described here. You can find an update to this at http://sdss.fnal.gov:8000/~rspete/roboRead.html. Further, the tape robot described here is now full (as of the first quarter of 2000) and will no longer be used for backing up runs. We will be moving to a new tape robot; details will be provided when it's up and working.


If it's not on disk, it might be on the tape robot. To access the tape robot, you will need fmss and roboRead set up on your system. (Or you can spool the data off to a scratch disk on the FNALU cluster and move it to your local system from there.)

First, before you can access the tape robot, you need to be sure you (and your machine) are in the .access file on sdss. Here's a quick way to see if you are:

   > setup fmss
   > fmss ls mss:/sdss/sdssdp/imaging

If you are in the .access file, this should give you a list of all the run directories on the tape robot. If not, email Jen Adelman (jen_a@fnal.gov) and ask to be added.

The data directories on the tape robot look like this:

   /sdss/sdssdp/imaging/$RUN/$RERUN

Where $RUN is the run number (745, 94, etc.) and $RERUN (0, 1, 2, etc.) indicates the rerun number as the data is run through newer and newer versions of the pipelines.

The fmss web page describes the full functionality of fmss: http://fnhppc.fnal.gov/mss/fmss32/FMSSMAN.htm

Mostly what you will want, though, is ls (shown above) and cp:

   > fmss cp mss:/sdss/sdssdp/imaging/$RUN/$RERUN/filename disk:/your_disk_here

For instance, to copy a file to my current directory:

   > fmss cp mss:/sdss/sdssdp/imaging/94/0/astrom-000094.tar disk:.
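
If you need the same file for several runs, it may help to generate the fmss commands rather than type them. The sketch below only prints the command, it does not run fmss. The function name fmss_astrom_cmd is made up, and the file name pattern astrom-NNNNNN.tar (run number zero-padded to six digits) is an assumption based on the single example above; check it with fmss ls before relying on it.

```shell
# Hypothetical helper: print (do not run) the fmss command for one astrom tar file.
# $1 = run number, $2 = rerun number, $3 = destination directory (default ".")
# ASSUMPTION: the file is named astrom-<run padded to 6 digits>.tar.
fmss_astrom_cmd() {
    run=$1
    rerun=$2
    dest=${3:-.}
    printf 'fmss cp mss:/sdss/sdssdp/imaging/%d/%d/astrom-%06d.tar disk:%s\n' \
        "$run" "$rerun" "$run" "$dest"
}

# Print the commands for two runs; inspect them before running anything.
for r in 94 745; do
    fmss_astrom_cmd "$r" 0
done
```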

We have written some wrappers for fmss to transfer multiple files, called roboRead. To get roboRead:

   > setup sdsscvs
   > cvs co roboRead

To use roboRead:

   > setup fmss
   > setup astrotools
   > astrotools
   > cd roboRead/etc
   > source restore.tcl

The README file roboRead/etc/README describes the use of two scripts: getFromFermi for retrieving tar files from the tape robot and restoreFromArchive for retrieving the specified files from the tar files. You will need to create a transfer.par file for getFromFermi, and a restore.par for restoreFromArchive, as described below. Be sure to look at the actual README file that comes with your distribution, as the specifics are subject to change.

---------- From roboRead/etc/README (edited) ----------

Instructions for using these scripts to transfer data from the Fermi
Tape Robot and for unpacking the tar files on a local machine.

The first thing that you need to do is install the FMSS product, then:

setup astrotools
setup fmss
source restore.tcl

(N.B. the script is done so that you don't actually need the
ASTROTOOLS or the Fermi Unix Environment.  Simple TCL and the FMSS
binary should do the trick).

A sample transfer and unpacking of the data would go like this:
1) edit a "transfer" param file (e.g. transfer_94_1.par for run 94, rerun 1)
   this transfers the data from the Fermi tape robot
2) edit a "restore" param file 
   this unpacks specified files from the archived Fermi TAR files
3) get the data from Fermi (e.g. getFromFermi transfer_94_1.par)
4) unpack specified files (e.g. restoreFromArchive restore_94_1.par)

As you can see above, there are 2 primary commands:

"getFromFermi" and "restoreFromArchive"

They both take 2 arguments, the first being a parameter file, the
second being the verbosity level (0,1,2)

In addition to the "restore.tcl" file, which contains all of the
necessary TCL scripts, one also needs the "transferDefault.par" and
"restoreDefault.par" files.  These "parameter" files look like this:

transferDefault.par
-------------------
run 0
rerun 0
version a
fmssDir /sdss/sdssdp/imaging/[format %06d $run]/$version
writeDir /data/aragorn/sdss/${run}
frameBegin 11
frameEnd 20
deltaFrame 10
fields  ALL
astrom  1
corr    1
cat     1

Run and rerun are self-explanatory, version will go away soon.  For
the directories, these can either be hardcoded or they can contain
formatting commands that allow for the run, rerun, and version values
to be used in the directory names.
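
To see what the formatting commands in the default file expand to, here is a shell equivalent of the two directory lines. The Tcl expression [format %06d $run] zero-pads the run number to six digits, which printf does the same way. The function names are made up for illustration; roboRead itself does this expansion in Tcl.

```shell
# Shell equivalent of: fmssDir /sdss/sdssdp/imaging/[format %06d $run]/$version
# (hypothetical function name; roboRead does this internally in Tcl)
expand_fmss_dir() {
    run=$1
    version=$2
    printf '/sdss/sdssdp/imaging/%06d/%s\n' "$run" "$version"
}

# Shell equivalent of: writeDir /data/aragorn/sdss/${run}
expand_write_dir() {
    printf '/data/aragorn/sdss/%s\n' "$1"
}

expand_fmss_dir 94 a    # prints /sdss/sdssdp/imaging/000094/a
expand_write_dir 94     # prints /data/aragorn/sdss/94
```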

frameBegin and frameEnd come from the "RUNS" file in /data/dp3.b/data



deltaFrame is how many fields have been put into the same tar file,
this also comes from "RUNS"

fields can be specified in a number of ways including "11,12,13",
"11-15,20", "11 12 13", or "ALL" (N.B. do NOT include the quotes).
ALL will use all the fields between (and including) frameBegin and
frameEnd.
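
As an illustration of how these field specifications are meant to be read, the sketch below expands each form into an explicit list of field numbers. It is not the parser from restore.tcl, just a shell rendering of the rules stated above; the function name expand_fields is made up.

```shell
# Hypothetical illustration of the "fields" syntax (not the real restore.tcl code).
# $1 = field spec, $2 = frameBegin, $3 = frameEnd
expand_fields() {
    spec=$1
    begin=$2
    end=$3
    # ALL means every field from frameBegin to frameEnd, inclusive.
    if [ "$spec" = "ALL" ]; then
        spec="$begin-$end"
    fi
    out=""
    # Treat commas like spaces, then walk the tokens one at a time.
    for tok in $(printf '%s' "$spec" | tr ',' ' '); do
        case $tok in
            *-*) lo=${tok%-*}; hi=${tok#*-} ;;   # a range such as 11-15
            *)   lo=$tok; hi=$tok ;;             # a single field
        esac
        i=$lo
        while [ "$i" -le "$hi" ]; do
            out="$out $i"
            i=$((i + 1))
        done
    done
    printf '%s\n' "${out# }"
}

expand_fields "11-15,20" 11 20    # prints 11 12 13 14 15 20
```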

astrom, corr, and cat are either 1 (yes, transfer these files), or 0
(no, don't transfer these files).

astrom contains everything that would have been in /data/dp3.b/data/$run/astrom

corr refers to all of the "fpC" files (i.e. the corrected frames)

cat is everything else in the /data/dp3.b/data/$run/objc/ files except
for the "fpC" files.

N.B. currently the tsObj files are NOT archived, though they should
(and will) be.

Also, it is currently not possible to specify only certain camera
columns (scan lines).  


restoreDefault.par
-------------------
run 0
rerun 0
readDir /data/aragorn/sdss/${run}
scrDir /scratch
writeDir /data/aragorn/sdss/${run}
makecopy 0
frameBegin 11
frameEnd 20
deltaFrame 10
fields ALL
columns ALL
filters ALL
astrom  1
objc    1
atlas   1
corr    1
BIN     0
Masks   0
fangs   0
wings   0

Run and rerun are self-explanatory.  For the directories, these can
either be hardcoded or they can contain formatting commands that allow
for the run and rerun values to be used in the directory names.

The scrDir is for those who cannot (or don't want to) unpack the TAR
files in place.  If the "scrDir" is specified and "makecopy" is set to
1, then copies of the TAR files will be made in "scrDir" before
the individual files are unpacked.  After the files are unpacked the
TAR files in "scrDir" are then deleted.  If you want to unpack the TAR
files in place, simply ignore the "scrDir" parameter and leave
"makecopy" as 0.

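The scrDir / makecopy behaviour described above can be sketched in shell as follows. This is only an illustration of the logic (copy to scratch, unpack, delete the scratch copy), not the code from restore.tcl. The function name and argument order are made up, and it unpacks the whole tar file rather than selected files.

```shell
# Hypothetical illustration of the scrDir/makecopy logic (not the real restore.tcl code).
# $1 = tar file, $2 = writeDir, $3 = makecopy (0 or 1), $4 = scrDir (may be empty)
unpack_tar() {
    tarfile=$1
    write_dir=$2
    makecopy=$3
    scr_dir=$4
    src=$tarfile
    # Work from a scratch copy only when both makecopy and scrDir are set.
    if [ "$makecopy" = 1 ] && [ -n "$scr_dir" ]; then
        cp "$tarfile" "$scr_dir/" || return 1
        src=$scr_dir/$(basename "$tarfile")
    fi
    mkdir -p "$write_dir" && tar -xf "$src" -C "$write_dir" || return 1
    # Delete the scratch copy; the original tar file is left alone.
    if [ "$src" != "$tarfile" ]; then
        rm -f "$src"
    fi
}
```

For example, unpack_tar /data/aragorn/sdss/94/some.tar /data/aragorn/sdss/94 1 /scratch would unpack via /scratch, where some.tar stands for whichever tar file getFromFermi retrieved.
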
frameBegin and frameEnd come from the "RUNS" file in /data/dp3.b/data

deltaFrame is how many fields have been put into the same tar file,
this also comes from "RUNS"

fields can be specified in a number of ways including "11,12,13",
"11-15,20", "11 12 13", or "ALL" (N.B. do NOT include the quotes).
ALL will use all the fields between (and including) frameBegin and
frameEnd.

columns can be specified as either "1 2 3" or "ALL", where "ALL" is
equal to "1 2 3 4 5 6".  (Again, don't include the quotes, but DO
include the spaces).

filters can be specified as either "g r i" or "ALL", where "ALL" is
equal to "u g r i z".  (Again, don't include the quotes, but DO
include the spaces).

The rest are binary toggle switches that determine whether or not
certain files are unpacked and the values are either 1 (yes, unpack
these files), or 0 (no, don't unpack these files).

astrom contains everything that would have been in /data/dp3.b/data/$run/astrom

objc = fpObj files

atlas = fpAtlas files

corr = fpC files

BIN = fpBIN files

Masks = fpM files

fangs = fang files

wings = wing files

N.B. currently the tsObj files are NOT archived, though they should
(and will) be.


getFromFermi 
------------

Purpose: retrieve TAR files from tape robot.

In order to retrieve data from Fermilab, one edits a parameter file to
look like "transferDefault.par", and gives the name of this new file
as the first argument.

If no parameter file is given, the user will be queried to either 1)
enter the name of the proper parameter file, 2) create a new parameter
file by using the defaults in transferDefault.par, but giving the run,
rerun, and version values when prompted, 3) create a new parameter
file by answering a series of prompts.

The verbosity can be specified in the second argument.

e.g. getFromFermi transfer_94_0.par 2

to transfer the data from run 94, rerun 0 with maximum
verbosity, where you have created the file transfer_94_0.par
yourself.
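
If you would rather create the parameter file yourself than answer the prompts, something like the following can stamp a copy out of the defaults. It is a sketch under the assumption that only the run, rerun, and version lines need to change for your transfer; the function name make_transfer_par is made up and is not part of roboRead.

```shell
# Hypothetical helper: copy a default parameter file, overriding run, rerun and version.
# $1 = defaults file (e.g. transferDefault.par), $2 = run, $3 = rerun, $4 = version
# All other lines are passed through unchanged.
make_transfer_par() {
    defaults=$1
    run=$2
    rerun=$3
    version=$4
    awk -v run="$run" -v rerun="$rerun" -v version="$version" '
        $1 == "run"     { print "run", run; next }
        $1 == "rerun"   { print "rerun", rerun; next }
        $1 == "version" { print "version", version; next }
        { print }
    ' "$defaults"
}
```

For example: make_transfer_par transferDefault.par 94 0 a > transfer_94_0.par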

restoreFromArchive
------------------

Purpose: extract specified data files from the TAR files retrieved by
getFromFermi.  (Each tar file contains several data files.)

In use this is very similar to the command above.  Again it takes a
parameter file.  It will then go through the newly copied TAR files
from Fermilab and unpack all of the files that you want (e.g. all the
fpAtlas files, but not the fpC files, etc.)

e.g. restoreFromArchive restore_94_0.par 2

Gordon Richards
10 June 1999

Edited by Brian Lee 
1999 July 26


(4) Remote tape read:

If all else fails, you can read the individual tapes.

This is a three-step process: mount the tape on the FNAL/SDSS cluster, read the tape from wherever you want, and unmount the tape back on the FNAL/SDSS machine.

NOTE: BEFORE USING A TAPE DRIVE, YOU MUST FIRST GET JEN'S PERMISSION! You can contact Jen Adelman at jen_a@fnal.gov or 630-840-2929. We need people to refrain from using the tape drives when Fermilab is processing data -- when we get new data, we need all the drives. If you are using tape drives we need, it will hold up the entire processing pipeline. Thus you must call ahead and make sure the drives will not be needed by Fermilab before proceeding.

Setup:

Pick your machines, an FNAL/SDSS machine with tape drives (sdssdp3.fnal.gov for example) and your local machine (for this example, I'll use fsgi02.fnal.gov on the FNALU cluster).

Note that sdssdp2 and sdssdp3 are FNAL/SDSS cluster machines, and in general we don't want people using these machines. However they are the only machines with tape drives at the moment. If you need an account on these machines (which are separate from the FNALU cluster) contact Jen Adelman at jen_a@fnal.gov or 630-840-2929.

Place the local machine and user name in your .rhosts file on the FNAL/SDSS machine:

local_machine_address youruseridhere

For example:

fsgi02.fnal.gov youruseridhere

Hint: you may need to also add just the short machine name, for instance: fsgi02 youruseridhere
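
A small sketch that writes out both forms of the entry at once, following the hint above. The function name rhosts_entries is made up; the output would be appended to the .rhosts file in your account on the FNAL/SDSS machine.

```shell
# Hypothetical helper: print .rhosts lines for a host, in both long and short form.
# $1 = full machine address, $2 = user name on that machine
rhosts_entries() {
    host=$1
    user=$2
    printf '%s %s\n' "$host" "$user"
    # Short name = everything before the first dot.
    short=${host%%.*}
    # Only add the short form when it differs from the full name.
    if [ "$short" != "$host" ]; then
        printf '%s %s\n' "$short" "$user"
    fi
}
```

For example: rhosts_entries fsgi02.fnal.gov youruseridhere >> ~/.rhosts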

Pick a tape:

First, pick the tape you need. See: http://www-sdss.fnal.gov:8000/~sdssdp/tapedb/tapedb.html

Mount the tape on the FNAL/SDSS cluster:

Log onto an FNAL/SDSS machine with tape drives (sdssdp2, sdssdp3). Then type:

   <sdssdp3> setup ocs
   <sdssdp3> ocs_tape

This will tell you if any of the tape drives are free (unallocated). If they are, request one. The following command will give you one of the drives.

   <sdssdp3> ocs_allocate
   sdssdp3 sdss35

This tells me that I've been given drive sdss35 on machine sdssdp3. Next, ask that your chosen tape be mounted on that drive:

   > ocs_request -t tape_drive -v tape_label

For example, here I've been given tape drive sdss35 and I want tape JG0198:

   <sdssdp3> ocs_request -t sdss35 -v JG0198

Now, through the wonders of modern technology, your request will instantly be conveyed to the computer screen of a computer operator, who will read it, retrieve your tape from the vault, and put it in the right drive. Many requests are filled within 10 minutes, although during busy times it may take up to an hour. Be careful not to annoy the kind computer operators with repeated mistakes! You will then see:

   ocs_request: success

NOTE: Tape mounting is done by humans, and humans occasionally make mistakes. IF THE TAPE IS INCORRECTLY MOUNTED AS WRITE-ENABLED, YOU COULD EASILY WRITE OVER THE DATA! This appears to have already occurred at least once! To keep the entire collaboration from showing up on your doorstep in a very unhappy state, you should make sure the tape has been mounted correctly. The following allows you to check both if the correct tape was mounted and if it was mounted as read only:

   ocs_check_label -t tape_drive -v tape_label -r

Example:

   <sdssdp3> ocs_check_label -t sdss35 -v JG0198 -r

If you see one of these, try again:

   ocs_check_label: Tape is not write protected
   ocs_check_label: ftt_verify_vol_label: expected vol 'JG0198', but got 'JG0197'.

All is well if you instead see:

   ocs_check_label: Success

DO NOT PROCEED UNTIL YOU HAVE MADE THIS CHECK!
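
If you script your tape reads, the check can be made into a guard so that nothing runs unless the success message was seen. This is a sketch only: the function name is made up, it matches on the success text quoted above, and capturing both stdout and stderr is an assumption, since this document does not say where ocs_check_label writes its messages.

```shell
# Hypothetical guard: succeed only if the text passed in contains the
# success message quoted above.
label_check_passed() {
    case $1 in
        *"ocs_check_label: Success"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Intended use on the FNAL/SDSS machine (commented out because it needs ocs):
#   out=$(ocs_check_label -t sdss35 -v JG0198 -r 2>&1)
#   label_check_passed "$out" || { echo "do not proceed: $out" >&2; exit 1; }
```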

Read the tape on the local machine:

Log on to your local machine. If for some reason you can't get this to work with a local machine, you can use one of the FNALU machines and copy the data onto a scratch disk there. (And then figure out a good way to get it from the scratch disk to your disk before it is erased about a day later.) Please DO NOT read the tapes off onto an FNAL/SDSS machine, as drive space is tight and this will impede data processing.

From there (assuming the .rhosts file is set up on the FNAL/SDSS machine with the tape drive) you should be able to use tar and mt, knowing only two tricks:

First, for remote tar & mt, just add the user name & remote FNAL machine name like this:

   tar tvf username@fnal_machine.fnal.gov:/dev/tape_name
   mt -f username@fnal_machine.fnal.gov:/dev/tape_name

Otherwise, it's the same as if it were on your machine. (You may be able to omit the username and long machine name depending on the local system.)

Second: sdssXY = /dev/nrmtYh

For example, on sdssdp3, sdss35 = /dev/nrmt5h. On sdssdp2, sdss25 = /dev/nrmt5h. Etc.
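
The naming rule can be written as a small shell function that turns a drive name into the remote device string used by tar and mt. It assumes, as the two examples above suggest, that X is the number in the machine name sdssdpX and that both X and Y are single digits. The function name drive_to_device is made up.

```shell
# Hypothetical helper: map an ocs drive name sdssXY to host:device.
# ASSUMPTION: X is the digit in the machine name sdssdpX, Y the digit in /dev/nrmtYh.
drive_to_device() {
    drive=$1
    digits=${drive#sdss}
    case $digits in
        [0-9][0-9]) ;;
        *) echo "unrecognized drive name: $drive" >&2; return 1 ;;
    esac
    host_num=${digits%?}    # first digit: machine number
    unit=${digits#?}        # second digit: tape unit
    printf 'sdssdp%s.fnal.gov:/dev/nrmt%sh\n' "$host_num" "$unit"
}

drive_to_device sdss35    # prints sdssdp3.fnal.gov:/dev/nrmt5h
```

The result can then be used with a user name in front, for example: tar tvf bclee@$(drive_to_device sdss35)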

And that's it, just access the tape as if it were on a local machine.

Here's an example. I'll use fsgi02.fnal.gov, and I've been assigned tape drive sdss35 = /dev/nrmt5h.


<fsgi02> cd /usr/scratch/sect1/bclee     
   (cd to a local scratch directory)
<fsgi02> mt -f bclee@sdssdp3.fnal.gov:/dev/nrmt5h fsf 1
   (skip the label)
<fsgi02> tar tvf bclee@sdssdp3.fnal.gov:/dev/nrmt5h
   (see what's in the tar file)
<fsgi02> mt -f bclee@sdssdp3.fnal.gov:/dev/nrmt5h rew
   (rewind the tape)
<fsgi02> mt -f bclee@sdssdp3.fnal.gov:/dev/nrmt5h fsf 1
   (skip the label again, since the rewind went back to the start of the tape)
<fsgi02> tar xvf bclee@sdssdp3.fnal.gov:/dev/nrmt5h
   (get everything in the tar file)

Clean up on the FNAL/SDSS machine:

When you're all done, go back to the FNAL/SDSS machine and clean up.

   <sdssdp3> ocs_dismount -t sdss35 -o unload
   <sdssdp3> ocs_deallocate -t sdss35

(Or you can ocs_deallocate -a to deallocate all drives allocated to you.)


For more info:

FNALU:
http://cddocs.fnal.gov/cfdocs/productsDB/DocDetail.CFM?DocNum=GU0008
Also see the SDSS FNALU use document at http://sdss.fnal.gov:8000/~bclee/guru/sdss_guide.txt or on the FNALU machines:

   ~stoughto/doc/sdss_guide

fmss:
http://fnhppc.fnal.gov/mss/fmss32/FMSSMAN.htm

ocs:
http://cddocs.fnal.gov/cfdocs/productsDB/ProdDetail.CFM?ProdNum=SU003

Princeton SDSS Data Guru Mailing List Archive:
http://www.astro.princeton.edu:81/sdss-gurus/INDEX.html


Brian Lee
1999 July 26
Please report mistakes, changes, etc. to Brian at: bclee@fnal.gov
Information will be posted at http://sdss.fnal.gov:8000/~bclee/ unless it finds a better home.

Brian Lee / bclee@fnal.gov / (630) 840-6646
Last modified: Tue Jun 27 15:48:58 GMT 2000