Vtuple for passII

Vladimir Matveyev,
Nickolai Kuropatkin,
Soon Yung Jun, James Russ

ABSTRACT

In this note we have collected information on the ROOT tools for pass2 vtuple processing. It is intended to become part of a larger H note that is under preparation. Because preparation of that note is slow and people need tools to work with vtuples now, we have put this document on the Web. We also provide a link to the distribution file, so that people who have no access to our CVS repository can take the tools and install them on their own platforms.




Introduction

SELEX Pass2 output consists of two output streams and five vtuple streams. The reconstructed output (raw data + savebk blocks) is stripped to out1 and out2: out1 for events with at least one charm reconstruction, out2 for events with no charm reconstructions. The five vtuples contain variable-length compressed information for charm mesons (vtup1), charm baryons (vtup2), V0 reconstructions (vtup3), strange particles (vtup4), and downstream reconstructions (vtup5).


Utilization of Vtuple

The data streams are voluminous, expected to total 12 TB, and are stored in the robotic store at the Pittsburgh Supercomputer Center, with exabyte tape backup. These data will serve for detailed (re)analysis. We expect that most analyses will begin with the vtuple outputs. The purpose of this note is to describe the structure and format of the vtuples and to discuss how such analyses might be done.

Even the vtuples are quite voluminous, about 700 GB. They will be stored on FMSS. The vtuples contain many events that will be rejected by simple cuts, which reduces the size of an analysis sample. Moreover, most analyses will focus on one or a few types of charm. We therefore envision an analysis procedure analogous to that in Pass 1. The vtuples will be read in blocks from FMSS into scratch disk space on fn781f/a and processed to create a set of different ntuples, akin to the charm and baryon ntuples from Pass 1. For Pass2 the set of possibilities is larger because of the VEE and KINK reconstructions. From initial studies based on p2z outputs, an ntuple for one specific reconstruction, e.g., $\Lambda_c$ recn 430, will be about 40 MB, a small file that can be stored anywhere on disk. Other reconstructions will be of comparable size or smaller. We might have active analysis on 25 recns at any one time, requiring about 1 GB of disk storage in total (25 x 40 MB).

Because most users have now switched from X-terminals to PCs, it is possible to copy the interesting ntuples for any analysis onto the local user disk of a PC and operate entirely with local resources, using the versions of ROOT [1] or PAW available on the Fermi Linux distribution. This seems to us to be the model of the future, when most analyses will be done off-site rather than at Fermilab. In such a model, reducing the vtuples to selected ntuples, which takes of order 1-2 weeks, would be done once (or maybe a few times) early in the analysis lifetime. Further analyses that require new development would be done using, say, one set of vtuples to establish a procedure. One set is 70 GB or so, still too large for the available space on fn781f/a, but conceivably small enough to be copied to a suitable PC workstation for study. The same tools that are used to analyze the vtuples for the standard modes can be adapted for specialized studies and run without difficulty on Linux systems.

This proposal has outlined a procedure to reduce the voluminous vtuple set to a much smaller set of specific charm reconstructions. The available tools for doing this selection produce either a PAW-based ntuple output, using a skeleton developed by V. Matveev for V0 selection, or a ROOT-based ntuple output, using a package developed by N. Kuropatkin to sample production quality. We can afford to do both, if there is demand for both approaches.

Vtuple Structure

We have used an object-oriented approach to build the "mini-DST" type of output in our off-line program. The main object is the "reconstruction". In our language a reconstruction is a physical state composed of other objects: tracks, vertices, kinks, etc. A reconstruction can itself be composed of other reconstructions, creating a complex object. This structure was implemented in the SOAP "recon" package, and it was natural to keep the same structure in the FORTRAN output files, the vtuples. Because the number of component objects varies, the output file must have a variable record length, but each elementary object (such as a track) has a fixed length and is represented by one block in the record. The general structure of the vtuple is shown in Fig. 1.

Figure 1: The general structure of the vtuple.

As one can see, the vtuple consists of a header and a number of blocks. There is usually one beam track block, one primary vertex block, and a varying number of other blocks. The presence and number of the other blocks depend on the type of reconstruction the vtuple is built for. There can be secondary vertices, secondary tracks, downstream blocks, V0 kink blocks, photon calorimeter blocks, and user blocks. The user block was introduced mainly to store Monte Carlo generated data for comparison with the reconstructed data. Each of these blocks represents a physical object and can be treated as such in the corresponding software.
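
To make the record layout concrete, here is a minimal C++ sketch of a variable-length record built from fixed-length blocks. It is not taken from the SOAP or ROOT tool sources. The names BlockKind, Block, Record, blockLength and decodeRecord, the word counts per block kind, and the flat buffer layout (reconstruction ID, block count, then for each block a kind code followed by that kind's words) are all assumptions made for illustration, and only four of the block types are modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical subset of the block types named in the text.
enum class BlockKind { BeamTrack = 0, PrimaryVertex = 1, SecondaryVertex = 2, SecondaryTrack = 3 };

// Assumed fixed length (in words) of each block kind; the numbers are illustrative.
std::size_t blockLength(BlockKind kind) {
    switch (kind) {
        case BlockKind::BeamTrack:       return 4;
        case BlockKind::PrimaryVertex:   return 3;
        case BlockKind::SecondaryVertex: return 3;
        case BlockKind::SecondaryTrack:  return 4;
    }
    return 0;
}

struct Block {
    BlockKind kind;
    std::vector<float> words;  // always blockLength(kind) entries
};

struct Record {
    int reconId = 0;            // header information (assumed: just the reconstruction ID)
    std::vector<Block> blocks;  // variable number of fixed-length blocks
};

// Decodes one record from the assumed flat layout:
//   [reconId, nBlocks, kind, words..., kind, words..., ...]
// Returns false if the buffer is truncated, has an unknown kind, or has trailing words.
bool decodeRecord(const std::vector<float>& buf, Record& out) {
    if (buf.size() < 2 || buf[1] < 0.0f) return false;
    out.reconId = static_cast<int>(buf[0]);
    const std::size_t nBlocks = static_cast<std::size_t>(buf[1]);
    out.blocks.clear();
    std::size_t pos = 2;
    for (std::size_t i = 0; i < nBlocks; ++i) {
        if (pos >= buf.size()) return false;
        const int code = static_cast<int>(buf[pos++]);
        if (code < 0 || code > 3) return false;
        const BlockKind kind = static_cast<BlockKind>(code);
        const std::size_t len = blockLength(kind);
        if (pos + len > buf.size()) return false;
        const auto first = buf.begin() + static_cast<std::ptrdiff_t>(pos);
        const auto last = first + static_cast<std::ptrdiff_t>(len);
        out.blocks.push_back({kind, std::vector<float>(first, last)});
        pos += len;
    }
    assert(pos <= buf.size());  // guaranteed by the bounds checks above
    return pos == buf.size();
}
```

The decoder accepts a buffer only when the declared blocks fill the record exactly, which is the kind of consistency check a variable-length format allows.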

Vtuple Analysis

Here we present only the ROOT-based tools.

ROOT tools

A set of ROOT [1] tools is proposed for vtuple analysis. We do not regard the tools as universal, ready-to-use software, but rather as fully functioning templates that users can easily adapt to their own needs. The tools are built on ROOT and hence on C++. The object-oriented approach used in their design matches the object structure of the vtuples well.
 


The TRecon object

As a first step in implementing the vtuple structure in ROOT, we need a C++ object that corresponds to a record in our vtuple file. This object is naturally a reconstruction and is named TRecon. It has two parts: TRecon.h, the header file, and TRecon.cxx, the C++ implementation of the object. We encourage anyone who is going to use these tools to look at the header file, as it is yet another piece of vtuple documentation. The comments in the scripts and programs are also meant to serve as documentation for users and future developers. The TRecon object reproduces the vtuple structure shown in Fig. 1, with some additions such as $X_F$, $P_t$, scut, and some other calculated variables. The header file also describes the methods one can use to extract information from the object.

ROOT allows us to write objects to a file, creating an analog of the vtuple file. The ROOT file has many advantages over the vtuple file, among them a platform-independent format and the possibility of using compressed input and output. The reconstruction is stored in the ROOT file as an object tree, which makes it possible to recreate the whole reconstruction with all related objects when it is read back. Although we provide a tool to convert a vtuple file into a ROOT file (vtup2root.cxx), this operation makes sense only for preselected reconstructions because of the large size of the output file. A better approach is probably to create the TRecon object on the fly during vtuple processing, extract the interesting information, and write it out in the form of a ROOT "tree branch", which is the analog of a PAW ntuple. We discuss these tools in the following sections.

To select interesting reconstructions we have defined a simple Selector object in TRecon. The selection is based on simple cuts on a set of parameters provided through the RecCut.dat file.
At present the following parameters are used for reconstruction selection:

The object can easily be extended to perform a more intelligent selection.
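
A cut-based selection of this kind can be sketched in a few lines of C++. The format of RecCut.dat is not described in this note, so the sketch assumes a hypothetical layout with one cut per line in the form "name min max" and '#' for comment lines. The functions readCuts and passes and the parameter names are likewise illustrative only; they are not the actual Selector interface.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// One window cut: a value passes if min <= value <= max.
struct Cut {
    double min = 0.0;
    double max = 0.0;
};

// Reads cuts from a stream in the ASSUMED format "<name> <min> <max>", one per line.
// Empty lines, '#' comment lines, malformed lines and inverted windows are skipped.
std::map<std::string, Cut> readCuts(std::istream& in) {
    std::map<std::string, Cut> cuts;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;
        std::istringstream fields(line);
        std::string name;
        Cut cut;
        if (fields >> name >> cut.min >> cut.max && cut.min <= cut.max) {
            cuts[name] = cut;
        }
    }
    return cuts;
}

// Returns true only if every cut variable is present in `values` and lies inside its window.
bool passes(const std::map<std::string, Cut>& cuts,
            const std::map<std::string, double>& values) {
    for (const auto& entry : cuts) {
        assert(entry.second.min <= entry.second.max);  // enforced by readCuts
        const auto it = values.find(entry.first);
        if (it == values.end()) return false;
        if (it->second < entry.second.min || it->second > entry.second.max) return false;
    }
    return true;
}
```

Keeping the cuts in a name-to-window map means that adding a parameter requires only a new line in the cut file, which is one way a simple selector could later be extended.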

The tools and environment

The set of tools we are going to describe consists of two programs and a number of ROOT scripts. There is also one shell script that organizes data processing directly from FMSS on fsgi03. The two main programs are "vt2root" and "vt2rtup"; the corresponding source files are vtup2root.cxx and vtup2rtup.cxx. We also provide Makefiles to build the executables on IRIX 6.5 and Linux platforms. To build the executables you need ROOT version 2.23 or newer installed on your computer. At Fermilab on fsgi03 you have to setup root. On Linux the $ROOTSYS and $LD_LIBRARY_PATH environment variables must be defined correctly. To build the executables, copy Makefile.sgi or Makefile.lin, depending on your platform, to Makefile and run gnumake. Both executables use the following files, which should be in the same directory as the executable:

The vtup2root program can be considered an example of how to process a list of compressed vtuple files that are stored on some computer reachable by the rcp command. It accepts two files as input parameters: vtup_list and the output ROOT file. The vtup_list file should contain a list of vtuple files, each given with its whole path including the host. For example:
fn781f:/spool3/prod/pass2/vtup2/p2x01_charm_run01037_01085.vtp2.gz

The output file can be written with different compression levels. By default we use the maximum possible compression. For details, see the comments in the program source. This program is functional, but we do not recommend it for real work. Still, it gives several useful examples of how to organize data processing in the ROOT framework, which is why we have included it in the distribution.
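
As an illustration of the vtup_list entry format used by vtup2root, the following C++ sketch splits a "host:/path" entry into its parts and builds the corresponding rcp command line. It is not the code in vtup2root.cxx. The names RemoteFile, parseEntry, baseName and rcpCommand are hypothetical, and the scratch directory passed to rcpCommand is an assumed parameter.

```cpp
#include <cassert>
#include <string>

// A vtuple file on a remote host, as written in one line of vtup_list.
struct RemoteFile {
    std::string host;
    std::string path;
};

// Splits "host:/full/path" at the first ':'; returns false if either part is missing.
bool parseEntry(const std::string& entry, RemoteFile& out) {
    const std::string::size_type colon = entry.find(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 >= entry.size()) return false;
    out.host = entry.substr(0, colon);
    out.path = entry.substr(colon + 1);
    return true;
}

// Returns the file name without its directory part.
std::string baseName(const std::string& path) {
    const std::string::size_type slash = path.rfind('/');
    return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Builds an "rcp host:path destination" command that copies the file into scratchDir.
std::string rcpCommand(const RemoteFile& file, const std::string& scratchDir) {
    assert(!file.host.empty() && !file.path.empty());  // expects a successfully parsed entry
    return "rcp " + file.host + ":" + file.path + " " + scratchDir + "/" + baseName(file.path);
}
```

Entries without a host part are rejected rather than treated as local files, since the list is documented as always including the host.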

The vtup2rtup program is an example of how to process compressed vtuple files stored on a local disk, creating an output ROOT file that contains a simple one-branch tree. The ROOT tree is the analog of a PAW ntuple. The contents of the ntuple we use are close to those used in the old ftuple analysis and can easily be adapted to a user's needs. The output file is compressed and very compact. For a single reconstruction, ID 400, the output file after processing all vtuples from the p2z01 sample is only about 2 MB.

Because our vtuples are stored on FMSS, we need a tool to process these files without copying all of them to local disks. This is done by a special shell script, "SODA". The script has a set of global variables that must be adapted for each user; comments in the script document these variables explicitly. The script accepts the list of FMSS files to be processed as an input parameter, "fmss_list". The script copies an FMSS tar file to a local scratch disk, unpacks the tar file, creates a list of vtuple files, and calls the vt2rtup program to process them. After processing, the vtuple files are deleted to free the disk space. The output file is appended to after each portion of vtuple files is processed. One should therefore be careful not to reuse the name of an existing output file, to avoid appending new data to an old file. The output file name can be given as an input parameter of the "SODA" script.

There are several ROOT scripts that show how to work with the created ROOT file.

The scripts can be run, for example, as follows:
root [n] .x ntup.C
In this case the script is interpreted. If the script is organized in a proper way, it can also be run as a compiled one, in which case execution is much faster. To run the script as a compiled program use the command:
root [n] .x ntup.C++
For details, please see the ROOT manual [2]. The block diagram of the recommended data processing procedure, which can be run on fsgi03, is shown in Fig. 2.
Figure 2: The data processing diagram.
It is relatively easy to adapt this scheme to any other computer that has ROOT and FMSS installed.

The package distribution

The package is currently distributed through the E781 CVS repository. On fsgi03 or the fn781 cluster you need to setup cvs and then check out "utility".
For example: "cvs co utility".
As a result, the utility directory will be created. All the ROOT tools are in the "rootools" subdirectory. For users who have no access to our CVS repository there is a tar file that can be copied to a local platform. The contents of the file are the same as in the CVS repository.
 

Bibliography

[1] Rene Brun and Fons Rademakers, "ROOT - An Object Oriented Data Analysis Framework", Proceedings AIHENP'96 Workshop, Lausanne, Sep. 1996, Nucl. Inst. & Meth. in Phys. Res. A 389 (1997) 81-86. See also http://root.cern.ch/.
[2] Rene Brun, Fons Rademakers, Suzanne Panacek, Damir Buskulic, Jorn Adamczewski, Marc Hemberger, "The ROOT User's Guide".


Nikolai Kuropatkin

2001-03-24