

Vtuple for passII
Vladimir Matveyev,
Nickolai Kuropatkin,
Soon Yung Jun, James Russ
ABSTRACT
This note collects information on the ROOT tools for pass2 vtuple
processing. It is intended to become part of a larger H note that is in
preparation. Because preparation of that note is slow and people need tools
to work with vtuples now, we are putting this document on the Web. We also
provide a link to the distribution file, so that people who have no access
to our CVS repository can take the tools and install them on their platforms.
Introduction
SELEX Pass2 output consists of two output streams and
five vtuple streams. Reconstructed output (raw data + savebk blocks) is
stripped to out1 and out2: out1 for events with at least one charm reconstruction,
out2 for events with no charm reconstructions. The five vtuples contain variable-length
compressed information for charm mesons (vtup1), charm baryons (vtup2),
V0 reconstructions (vtup3), strange particles (vtup4), and downstream reconstructions
(vtup5).
Utilization of Vtuple
The data streams are voluminous, expected to total 12
TB, and are stored in the robotic store at the Pittsburgh Supercomputing
Center, with Exabyte tape backup. These data will serve for detailed (re)analysis.
We expect that most analyses will begin with the vtuple outputs. The purpose
of this note is to describe the structure and format of the vtuples and to discuss
how such analyses might be done.
Even the vtuples are quite voluminous, about 700 GB. They
will be stored on FMSS. The vtuples contain many events that will be rejected
by simple cuts, reducing the size of an analysis sample. Moreover, most
analyses will focus on one or a few types of charm. We therefore envision
an analysis procedure analogous to that in Pass 1. The vtuples will be
read in blocks from FMSS into scratch disk space on fn781f/a and processed
to create a set of different ntuples, akin to the charm and baryon ntuples
from Pass 1. For Pass2 the set of possibilities is larger because of the
VEE and KINK reconstructions. From initial studies based on p2z outputs,
an ntuple for one specific reconstruction, e.g.,
recn 430, will be about 40 MB - a small file, able to be stored anywhere
on disk. Other reconstructions will be of comparable size or smaller. We
might have active analysis on 25 recns at any one time, requiring about
1 GB of disk storage for the total.
Because most users have now switched from X-terminals
to PCs, it is possible to copy the interesting ntuples for any analysis
onto the local user disk of a PC and operate entirely with local resources,
using the versions of ROOT [1] or PAW
available on the Fermi Linux distribution. This seems to us to be the plan
of the future, when most analyses will be done off-site, rather than at
Fermilab. In such a model, reducing the vtuples to selected ntuples - taking
of order 1-2 weeks - would be done once (or maybe a few times) early in
the analysis lifetime. Further analyses that require new development would
be done using, say, one set of vtuples to establish a procedure. One set
is 70 GB or so, still too large for the available space on fn781f/a, but
conceivably small enough to be copied to a suitable PC workstation for
study. The same tools that are used to analyze the vtuples for the standard
modes can be adapted for specialized studies and can be run quite happily
on Linux systems.
This proposal has outlined a procedure to reduce the
voluminous vtuple set to a much smaller set of specific charm reconstructions.
The available tools for doing this selection produce either a PAW-based
ntuple output, using a skeleton developed by V. Matveev for V0 selection,
or a ROOT-based ntuple output, using a package developed by N. Kuropatkin
to sample production quality. We can afford to do both, if there is demand
for both approaches.
Vtuple Structure
We have used an object-oriented approach to build the
"mini dst" type of output in our off-line program. The main object is the
"reconstruction". In our language a reconstruction is a physical
state composed of other objects: tracks, vertexes, kinks, etc. A reconstruction
can also be composed of other reconstructions, creating a complex object. This
structure was implemented in the SOAP "recon" package, and it was natural
to keep the same structure in the FORTRAN output files, the vtuples. Because
the number of component objects varies, the output file must have
a variable record length; each elementary object (such as a track) has a fixed
length and forms one block in the record. The general structure of the
vtuple is shown in Fig. 1.
Figure 1: The general structure of the vtuple.
As one can see, the vtuple consists of a header and a number
of blocks. There is usually one beam track block, one primary vertex block,
and a varying number of other blocks. The presence and number of the other blocks
depend on the type of reconstruction the vtuple is built for. There can
be secondary vertexes, secondary tracks, downstream blocks,
V0 kink blocks, photon calorimeter blocks, and user blocks. The last
kind was introduced mainly to store Monte Carlo generated data for comparison
with the reconstructed data. Each of these blocks represents a physical
object and can be treated as such in the corresponding software.
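The record layout described above can be pictured with a small C++ sketch. This is only an illustration under stated assumptions: the type names, the header fields, and the block lengths are hypothetical and are not taken from the actual vtuple format or from TRecon.h. It shows a header followed by a variable number of fixed-length blocks, so that the record length varies with the block content.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical block kinds, named after the blocks listed in the text.
enum class BlockKind {
    BeamTrack, PrimaryVertex, SecondaryVertex, SecondaryTrack,
    Downstream, V0Kink, PhotonCalorimeter, User
};

// An elementary object stored as one block of the record.
// The word counts used in the example below are made up.
struct Block {
    BlockKind kind;
    std::vector<float> words;
};

// Hypothetical header fields; the real header layout is not shown here.
struct Header {
    int run;
    int event;
    int recID;
};

// One variable-length record: a header followed by any number of blocks.
struct VtupleRecord {
    Header header;
    std::vector<Block> blocks;

    // Number of blocks of a given kind in this record.
    std::size_t count(BlockKind kind) const {
        std::size_t n = 0;
        for (const Block& b : blocks) {
            if (b.kind == kind) ++n;
        }
        return n;
    }

    // Total block payload in words; this varies from record to record.
    std::size_t lengthInWords() const {
        std::size_t n = 0;
        for (const Block& b : blocks) n += b.words.size();
        return n;
    }
};

// Build an example record: one beam track, one primary vertex,
// one secondary vertex and two secondary tracks (all values invented).
VtupleRecord makeExampleRecord() {
    VtupleRecord rec;
    rec.header = Header{1037, 42, 430};
    rec.blocks.push_back({BlockKind::BeamTrack, std::vector<float>(8, 0.0f)});
    rec.blocks.push_back({BlockKind::PrimaryVertex, std::vector<float>(6, 0.0f)});
    rec.blocks.push_back({BlockKind::SecondaryVertex, std::vector<float>(6, 0.0f)});
    rec.blocks.push_back({BlockKind::SecondaryTrack, std::vector<float>(8, 0.0f)});
    rec.blocks.push_back({BlockKind::SecondaryTrack, std::vector<float>(8, 0.0f)});
    return rec;
}
```

Calling makeExampleRecord() and then count(BlockKind::SecondaryTrack) on the result gives the number of secondary track blocks in that record; a record for a different reconstruction type would simply hold a different mix of blocks.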
Vtuple Analysis
Here we will present only ROOT based tools.
ROOT tools
A set of ROOT [1]
tools is proposed for vtuple analysis. We do not regard these tools
as universal, ready-to-use software, but rather as fully functioning
templates that users can easily adapt to their own needs. The tools
are built on ROOT and hence on C++. The object-oriented approach
used in their design matches the object structure of the
vtuples well.
The TRecon object
As a first step in implementing the vtuple structure in
ROOT we need a C++ object that corresponds to a record in
our vtuple file. This object is naturally a reconstruction and is named
TRecon. It has two parts: TRecon.h, the header file, and TRecon.cxx,
the implementation of the object in C++. We encourage anyone
who is going to use these tools to look at the header file, as
it is yet another piece of vtuple documentation. The comments in the scripts
and programs are also meant to serve as documentation for users and future
developers. The TRecon object reproduces the vtuple structure shown in
Fig. 1, with some additions such as
scut and other calculated variables. The header file also contains
a description of the methods one can use to extract information from the
object. ROOT allows us to write objects to a file, creating an analog of
the vtuple file. The ROOT file has many advantages over the vtuple file,
among them a platform-independent format and the possibility of using compressed
input and output. The reconstruction is stored in the ROOT file in the
form of an object tree, making it possible to recreate the whole reconstruction
with all related objects when it is read back. Although we provide a tool
to convert a vtuple file into a ROOT file (vtup2root.cxx), this operation
makes sense only for preselected reconstructions because of the large size of the
output file. A better approach is probably to create the TRecon object in memory
during vtuple processing, extract the interesting information,
and write it out as a ROOT "tree branch". The latter is
analogous to the PAW ntuple. We discuss these tools in the following
sections. To select interesting reconstructions we have
defined a simple Selector object in TRecon. The selection is based on
simple cuts on a set of parameters provided through the
RecCut.dat file. At present the following parameters are used for reconstruction
selection:
- recID - the range of reconstruction IDs one wants to work with.
- mass - the mass range of the reconstructed state.
- LoverS - the range of the ratio of the secondary vertex Z separation
to the error in the separation measurement.
- pvtx - point-back Chi2.
- run - a range of runs to select.
- event - a range of events to select.
- chi2 - Chi2 of the secondary vertex.
- smin - the second-largest impact parameter.
- scut - Chi2 of smin.
The object can easily be extended to perform more sophisticated
selections.
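A range-cut selection of this kind can be sketched in a few lines of standard C++. This is a minimal sketch under stated assumptions, not the Selector defined in TRecon: the class name RangeSelector, the Candidate map, and the cut values used in the usage note are hypothetical, and the real RecCut.dat format is not reproduced here. A candidate passes only if every configured parameter lies inside its range.

```cpp
#include <limits>
#include <map>
#include <string>

// Closed interval [lo, hi]; by default it accepts every value.
struct Range {
    double lo = -std::numeric_limits<double>::infinity();
    double hi = std::numeric_limits<double>::infinity();
    bool contains(double x) const { return x >= lo && x <= hi; }
};

// Hypothetical stand-in for the quantities a reconstruction exposes,
// keyed by the parameter names listed in the text (recID, mass, ...).
using Candidate = std::map<std::string, double>;

// Hypothetical selector: one range cut per named parameter.
class RangeSelector {
public:
    void setCut(const std::string& name, double lo, double hi) {
        Range r;
        r.lo = lo;
        r.hi = hi;
        cuts_[name] = r;
    }

    // A candidate is accepted only if every configured cut is satisfied.
    // A cut on a quantity the candidate does not provide counts as a failure.
    bool accept(const Candidate& c) const {
        for (const auto& cut : cuts_) {
            const auto it = c.find(cut.first);
            if (it == c.end() || !cut.second.contains(it->second)) {
                return false;
            }
        }
        return true;
    }

private:
    std::map<std::string, Range> cuts_;
};
```

As a usage illustration with invented numbers: after setCut("recID", 430, 430) and setCut("LoverS", 8.0, 1.0e9), a candidate with recID 430 and LoverS 12 is accepted, while the same candidate with LoverS 3 is rejected. Keeping the cuts in a name-to-range map makes it straightforward to add a new parameter without changing the acceptance logic.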
The tools and environment
The set of tools described here consists of
two programs and a number of ROOT scripts. There is also one shell script
that organizes data processing directly from FMSS on fsgi03. The two main programs
are "vt2root" and "vt2rtup". The corresponding source files are vtup2root.cxx
and vtup2rtup.cxx. We also provide Makefiles to build the executables
on IRIX 6.5 and LINUX platforms. To build the executables you need to have
ROOT version 2.23 or newer installed on your computer. At Fermilab on fsgi03
you have to run "setup root". On LINUX the $ROOTSYS
and $LD_LIBRARY_PATH environment variables must be defined correctly. To build the executables,
copy Makefile.sgi or Makefile.lin, depending on your platform,
to Makefile, and run gnumake. Both executables use the following files,
which must be in the same directory as the executable:
- recdef.ocs - the table of reconstructions. It is used
to extract the name of the reconstruction from its ID.
- part.ocs - the table of particles. It is used to extract
particle properties.
- RecCut.dat - the list of cuts used for preliminary selection
of reconstructions.
The vtup2root program can be considered an example
of how to process a list of compressed vtuple files that are stored on
some computer reachable by the rcp command. It accepts two file names as input
parameters: vtup_list and the output ROOT file. The vtup_list file should contain
a list of vtuple files, each given with its full path including the host.
For example: fn781f:/spool3/prod/pass2/vtup2/p2x01_charm_run01037_01085.vtp2.gz
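To illustrate the host-qualified entries, the following standard C++ sketch splits such an entry into its host and path parts and reads a list of them. It is not taken from vtup2root.cxx: the names RemoteFile, parseEntry, parseList, and rcpSource are hypothetical, and the assumption that vtup_list holds one entry per line is ours.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One entry of the list: host and path (host is empty for a local file).
struct RemoteFile {
    std::string host;
    std::string path;
};

// Split "host:/path/to/file" at the first colon.
RemoteFile parseEntry(const std::string& entry) {
    RemoteFile f;
    const std::string::size_type colon = entry.find(':');
    if (colon == std::string::npos) {
        f.path = entry;  // no host part given
    } else {
        f.host = entry.substr(0, colon);
        f.path = entry.substr(colon + 1);
    }
    return f;
}

// Read entries from a stream, assuming one entry per line.
// Surrounding whitespace is trimmed and blank lines are skipped.
std::vector<RemoteFile> parseList(std::istream& in) {
    std::vector<RemoteFile> files;
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;
        const std::string::size_type last = line.find_last_not_of(" \t\r");
        files.push_back(parseEntry(line.substr(first, last - first + 1)));
    }
    return files;
}

// Rebuild the "host:path" form that a remote copy command takes as its source.
std::string rcpSource(const RemoteFile& f) {
    return f.host.empty() ? f.path : f.host + ":" + f.path;
}
```

For the example entry above, parseEntry returns the host "fn781f" and the path starting with "/spool3/prod/pass2/vtup2/". Separating the two parts makes it easy to derive a local file name from the path while still passing the full entry to the copy command.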
The output file can be compressed at different compression
levels. By default we use the maximum possible compression. For
details, see the comments in the program source. This
program is functional, but we do not recommend using it for real work.
Still, it gives several useful examples of how to organize data processing
in the ROOT framework. That is why we have included it in the distribution.
The vtup2rtup program is an example of how to process
compressed vtuple files stored on a local disk, creating an output ROOT
file that contains a simple one-branch tree. The ROOT tree is analogous
to a PAW ntuple. The contents of the ntuple we use are close to those
used in the old ftuple analysis, and can easily be adapted to a user's
needs. The output file is compressed and very compact. With only
one reconstruction, ID 400, the output file after processing all vtuples
from the p2z01 sample is only about 2 MB.
As our vtuples are stored on FMSS, we need a tool to
process these files without copying all of them to local disks. This is
done by a special shell script, "SODA". The script has a set of global variables
that should be adapted for each user. Comments in the script document
these variables explicitly. The script accepts the list of files on FMSS
to be processed as an input parameter, "fmss_list". The script
works as follows: it copies an FMSS tar file to a local scratch
disk, unpacks the tar file, creates a list of vtuple files, and calls the vt2rtup
program to process them. After processing, the vtuple files are deleted
to release the disk space. The output file is appended to after each
portion of vtuple files is processed. One should be careful not to reuse the
name of an existing output file, to avoid appending new data to the old file. The
output file name can be given as an input parameter of the "SODA"
script.
There are several ROOT scripts that show how to work
with the created ROOT file.
- ntup.C - the script showing how to work with ROOT ntuples.
- recon_nt.C - the script showing how to process a ROOT
file with TRecon objects.
- tree_fit.C - an example of extracting information from
the ROOT tree, building histograms and fitting.
The scripts can be run, for example, as follows:
root [n] .x ntup.C
In this case the script is interpreted. If the script is organized
in a proper way it can also be run compiled, in which case execution
is much faster. To run the script as a compiled program use the command:
root [n] .x ntup.C++
For details, please see the ROOT manual [2].
The block diagram of the recommended data processing procedure, which can
be run on fsgi03, is shown in Fig. 2.
Figure 2: The data processing diagram.
It is relatively easy to adapt
this scheme to any other computer that has ROOT and FMSS installed.
The package distribution
The package is currently distributed through the E781 CVS
repository. On fsgi03 or the fn781 cluster you need to run "setup cvs" and then check
out "utility".
For example: "cvs co utility".
As a result the utility directory will be created.
All the ROOT tools are in the "rootools" subdirectory. For users who have
no access to our CVS repository there is a tar
file that can be copied to a local platform. The
contents of the file are the same as in the CVS repository.
Bibliography
[1] Rene Brun and Fons Rademakers, "ROOT - An Object Oriented Data Analysis Framework",
Proceedings AIHENP'96 Workshop, Lausanne, Sep. 1996, Nucl.
Inst. & Meth. in Phys. Res. A 389 (1997) 81-86. See also http://root.cern.ch/.
[2] Rene Brun, Fons Rademakers, Suzanne Panacek, Damir Buskulic, Jorn Adamczewski,
Marc Hemberger, "The ROOT User's Guide".
Nikolai Kuropatkin
2001-03-24