Fork of the trackml-library repository to customise its csv reader utility for the PANDA experiment.

n-idw/panda-csvReader

PANDA CSV-Reader

A Python library originally designed for the TrackML particle tracking challenge. It is currently used to read the hit and truth information of the straw tube tracker (STT), extracted as CSV files from the PandaRoot simulation pipeline. The information in the CSV files is returned as pandas data frames.

Installation

First download the repository with git:

git clone https://github.com/n-idw/panda-csvReader.git

Then, the package can be installed as a user package:

pip install --user panda-csvReader

To make a local checkout of the repository available directly it can also be installed in development mode:

pip install --user --editable panda-csvReader

In both cases, the package can be imported via import trackml without additional configuration. In the latter case, changes made to the code are immediately visible without having to reinstall the package.

Preparation

The simplest way to use this library is to write all relevant information of an event into four CSV files, systematically called

  • eventXXXXXXXXXX-hits.csv
  • eventXXXXXXXXXX-cells.csv
  • eventXXXXXXXXXX-truth.csv
  • eventXXXXXXXXXX-particles.csv

The Xs are placeholders for the zero-padded event number, so the hits CSV file for the event with the event number 123 would be called event0000000123-hits.csv. The columns of each CSV file are listed and described below.
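
As a small illustration of the naming scheme, the following sketch builds the four file names for one event. The helper event_file_name is hypothetical and not part of the library; it only assumes the ten-digit zero padding shown in the template above.

```python
def event_file_name(event_number: int, part: str) -> str:
    """Build a file name such as event0000000123-hits.csv (hypothetical helper)."""
    if event_number < 0:
        raise ValueError("event number must be non-negative")
    # Zero-pad the event number to the ten digits used in the template.
    return f"event{event_number:010d}-{part}.csv"


# Print the names of the four files that belong to event 123.
for part in ("hits", "cells", "truth", "particles"):
    print(event_file_name(123, part))
```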

Hits

  • hit_id : Identification number of a hit in the STT.
  • x,y,z : Coordinates of the hit position in cm.
  • volume_id : Identification number of the detector volume (currently always 9 for the STT).
  • layer_id : Identification number of the straw tube layer.
  • module_id : Identification number of the straw tube.
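
A hits file with these columns could be produced along the following lines. This is a minimal sketch using only the Python standard library: the function write_hits_csv, the column order, and the numerical values are assumptions made for illustration and are not taken from PandaRoot or from this library.

```python
import csv
import tempfile
from pathlib import Path

# Column names as listed above; the order within the file is an assumption.
HITS_COLUMNS = ["hit_id", "x", "y", "z", "volume_id", "layer_id", "module_id"]

# Two made-up hits; the numbers are placeholders, not simulation output.
example_hits = [
    {"hit_id": 1, "x": 16.2, "y": 3.5, "z": 35.0,
     "volume_id": 9, "layer_id": 0, "module_id": 17},
    {"hit_id": 2, "x": 17.1, "y": 3.9, "z": 35.0,
     "volume_id": 9, "layer_id": 1, "module_id": 121},
]


def write_hits_csv(directory, event_number, hits):
    """Write one hits file following the eventXXXXXXXXXX-hits.csv template."""
    path = Path(directory) / f"event{event_number:010d}-hits.csv"
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=HITS_COLUMNS)
        writer.writeheader()
        writer.writerows(hits)
    return path


# Write into a temporary directory and read the header line back.
with tempfile.TemporaryDirectory() as tmp:
    hits_path = write_hits_csv(tmp, 123, example_hits)
    written_name = hits_path.name
    header_line = hits_path.read_text().splitlines()[0]

print(written_name)
print(header_line)
```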

Cells

  • hit_id : Identification number of a hit in the STT.
  • depcharge : Number of electrons deposited by a hit in a straw tube.
  • energyloss : Approximation of the measured energy loss in a straw tube, calculated by dividing depcharge by $10^{6}$.
  • volume_id : Identification number of the detector volume (currently always 9 for the STT).
  • layer_id : Identification number of the straw tube layer.
  • module_id : Identification number of the straw tube.
  • sector_id : Identification number of the STT sector.
  • isochrone : Radius of the isochrone in cm.
  • skewed : Indication of the polarity of a straw tube (0 = straight, 1 = +3°, -1 = -3°)
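
The relation between depcharge and energyloss, and a selection on the skewed column, can be sketched with pandas as follows. The data frame is filled with made-up values purely for illustration; only the column names and the division by $10^{6}$ come from the list above.

```python
import pandas as pd

# Made-up cell entries; the values are placeholders, not simulation output.
cells = pd.DataFrame({
    "hit_id": [1, 2, 3],
    "depcharge": [2.5e6, 4.0e5, 1.2e6],
    "skewed": [0, 1, -1],
})

# energyloss is described above as depcharge divided by 10^6.
cells["energyloss"] = cells["depcharge"] / 1e6

# Keep only hits in skewed straw tubes (+3 or -3 degrees).
skewed_cells = cells[cells["skewed"] != 0]

print(cells)
print(skewed_cells["hit_id"].tolist())
```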

Truth

  • hit_id : Identification number of a hit in the STT.
  • tx,ty,tz,tT : True coordinates of the interaction point in cm and time in ns corresponding to a hit in the STT.
  • tpx,tpy,tpz : True momentum vector in GeV/c of the particle at the hit position.
  • weight : Weighting of the hit.
  • particle_id : Particle identification number of the particle responsible for the hit. This particle ID is used to get information about the particle from the particles CSV file.

Particles

  • particle_id : Same particle identification number as in the truth CSV file.
  • vx,vy,vz : Coordinates of the point of origin / vertex of the particle in cm.
  • px,py,pz : Initial momentum vector of the particle in GeV/c.
  • q : Currently not used.
  • nhits : Number of hits in all PANDA detector systems resulting from this particle.
  • pdgcode : Monte-Carlo particle identification number according to the PDG numbering scheme.
  • start_time : Time of production of the particle in ns.
  • primary : Indication of whether the particle comes from the simulated decay chain (1) or originates from, e.g., a detector interaction (0).

The truth and particles CSV files can only be created from the information provided in a Monte-Carlo simulation sample and consequently do not exist for experimental data.
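
Because particle_id appears in both the truth and the particles files, the two tables can be combined once they are available as pandas data frames. The following is a minimal sketch with made-up values; it uses plain pandas and no function of this library.

```python
import pandas as pd

# Made-up truth and particle entries; the values are placeholders.
truth = pd.DataFrame({
    "hit_id": [1, 2, 3, 4],
    "particle_id": [10, 10, 11, 12],
    "weight": [0.25, 0.25, 0.25, 0.25],
})
particles = pd.DataFrame({
    "particle_id": [10, 11, 12],
    "pdgcode": [211, -211, 11],
    "primary": [1, 1, 0],
})

# Attach the particle information to every hit. Many hits can share one
# particle, but each particle_id must be unique in the particles table.
hits_with_particles = truth.merge(
    particles, on="particle_id", how="left", validate="many_to_one"
)

# Select the hits caused by particles from the simulated decay chain.
primary_hits = hits_with_particles[hits_with_particles["primary"] == 1]

print(primary_hits[["hit_id", "particle_id", "pdgcode"]])
```

The validate argument makes pandas raise an error if a particle_id occurs more than once in the particles table, which guards against silently duplicated hits.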

Additional columns as well as completely new CSV file types, following the template eventXXXXXXXXXX-{myType}.csv, can be easily read by the utilities provided in this library.

Usage

To read the data for one event from an MC sample, simply use:

from trackml.dataset import load_event

hits, cells, particles, truth = load_event('path/to/event0000000123')

For experimental data where only the hit information is available use:

from trackml.dataset import load_event

hits, cells = load_event('path/to/event0000000456', parts=['hits', 'cells'])

To iterate over events in a dataset:

from trackml.dataset import load_dataset

for event_id, hits, cells, particles, truth in load_dataset('path/to/dataset'):
    ...

To read a single event and compute additional columns derived from the stored data:

from trackml.dataset import load_event
from trackml.utils import add_position_quantities, add_momentum_quantities, decode_particle_id

# get the particles data; load_event returns one data frame per requested part,
# so unpack the single-element result
(particles,) = load_event('path/to/event0000000123', parts=['particles'])

# decode particle id into vertex id, generation, etc.
particles = decode_particle_id(particles)

# add vertex rho, phi, r
particles = add_position_quantities(particles, prefix='v')

# add momentum eta, p, pt
particles = add_momentum_quantities(particles)
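
For orientation, the quantities named in the comments above (rho, phi, r and eta, p, pt) are conventionally defined as in the following sketch. The two functions are hypothetical stand-ins written with the standard math module; they are assumed to match the usual definitions and are not copied from the library.

```python
import math


def position_quantities(x, y, z):
    """Return (rho, phi, r) for a point given in Cartesian coordinates."""
    rho = math.hypot(x, y)   # transverse distance from the z axis
    phi = math.atan2(y, x)   # azimuthal angle in radians
    r = math.hypot(rho, z)   # distance from the origin
    return rho, phi, r


def momentum_quantities(px, py, pz):
    """Return (eta, p, pt) for a momentum vector."""
    pt = math.hypot(px, py)  # transverse momentum
    p = math.hypot(pt, pz)   # absolute momentum
    # Pseudorapidity; raises ValueError if the momentum has no
    # transverse component (pt == 0).
    eta = math.atanh(pz / p)
    return eta, p, pt


rho, phi, r = position_quantities(3.0, 4.0, 12.0)
eta, p, pt = momentum_quantities(0.3, 0.4, 0.0)
print(rho, phi, r)
print(eta, p, pt)
```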

The dataset path can be the path to a directory or to a zip file containing the event CSV files. Each event is loaded lazily during the iteration. Options are available to read only a subset of the available events or only selected parts, e.g. only hits or only particles.

To generate a random test submission from truth information and compute the expected score:

from trackml.randomize import shuffle_hits
from trackml.score import score_event

shuffled = shuffle_hits(truth, 0.05) # 5% probability to reassign a hit
score = score_event(truth, shuffled)

All methods either take or return pandas.DataFrame objects. You can have a look at the function docstrings for detailed information.

Authors

The library was originally written by

  • Moritz Kiehn

with contributions from

  • Sabrina Amrouche
  • David Rousseau
  • Ilija Vukotic
  • Nimar Arora
  • Jon Nordby
  • Yerkebulan Berdibekov
  • Victor Estrade

and was forked by

  • Nikolai in der Wiesche
