Eur. Phys. J. C (2020) 80 :58 Page 3 of 15 58
to apply these algorithms in the early stages of LHC real-
time event processing, i.e. the trigger system. For example,
Ref. [24] focuses on converting these models into firmware
for field programmable gate arrays (FPGAs) optimized for
low latency (less than 1 µs). If successful, such a program
could allow for a more resource-efficient and effective event
selection for future LHC runs.
Graph neural networks have also been considered as jet
tagging algorithms [25,26] as a way to circumvent the sparsity
of image-based representations of jets. These approaches
demonstrate remarkable categorization performance. Motivated
by the early results of Ref. [25], graph networks have
also been applied to other high energy physics tasks, such
as event topology classification [27,28], particle tracking in
a collider detector [29], pileup subtraction at the LHC [30],
and particle reconstruction in irregular calorimeters [31].
3 Data set description
This study is based on a data set consisting of simulated jets
with transverse momentum $p_T \approx 1$ TeV, originating from light quarks
$q$, gluons $g$, $W$ and $Z$ bosons, and top quarks produced in
$\sqrt{s} = 13$ TeV proton-proton collisions. The data set was
created using the configuration and parametric description of
an LHC detector described in Refs. [24,32], and is available
on the Zenodo platform [33–36].
Jets are clustered from individual reconstructed particles,
using the anti-$k_T$ algorithm [3,37] with jet-size parameter
R = 0.8. Three different jet representations are considered:
– A list of 16 HLFs, described in Ref. [24], given as input
to a DNN. The 16 distributions are shown in Fig. 2 for
the five jet classes.
– An image representation of the jet, derived by considering
a square with pseudorapidity and azimuthal distances
Δη = Δφ = 2R, centered along the jet axis. The image
is binned into 100 × 100 pixels. Such a pixel size is
comparable to the cell of a typical LHC electromagnetic
calorimeter, but much coarser than the typical angular
resolution of a tracking device for the $p_T$ values relevant
to this task. Each pixel is filled with the scalar sum of
the $p_T$ of the particles in that region. These images are
obtained by considering the 150 highest-$p_T$ constituents
for each jet. This jet representation is used to train a CNN
classifier. The average jet images for the five jet classes
are shown in Fig. 3. For comparison, a randomly chosen
set of images is shown in Fig. 4.
– A constituent list for up to 150 particles, in which each
particle is represented by 16 features, computed from the
particle four-momenta: the three Cartesian coordinates
of the momentum ($p_x$, $p_y$, and $p_z$), the absolute energy
$E$, $p_T$, the pseudorapidity $\eta$, the azimuthal angle $\phi$, the
distance $\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$ from the jet center, the
relative energy $E^{\mathrm{rel}} = E^{\mathrm{particle}}/E^{\mathrm{jet}}$ and relative transverse
momentum $p_T^{\mathrm{rel}} = p_T^{\mathrm{particle}}/p_T^{\mathrm{jet}}$, defined as the ratio of the
particle quantity and the jet quantity, the relative coordinates
$\eta^{\mathrm{rel}} = \eta^{\mathrm{particle}} - \eta^{\mathrm{jet}}$ and $\phi^{\mathrm{rel}} = \phi^{\mathrm{particle}} - \phi^{\mathrm{jet}}$,
defined with respect to the jet axis, $\cos\theta$ and $\cos\theta^{\mathrm{rel}}$,
where $\theta^{\mathrm{rel}} = \theta^{\mathrm{particle}} - \theta^{\mathrm{jet}}$ is defined with respect to the
jet axis, and the relative $\eta$ and $\phi$ coordinates of the particle
after applying a proper Lorentz transformation (rotation)
as described in Ref. [38]. Whenever fewer than 150
particles are reconstructed, the list is filled with zeros.
The distributions of these features considering the 150
highest-$p_T$ particles in the jet are shown in Fig. 5 for the
five jet categories. This jet representation is used for an
RNN with a GRU layer and for JEDI-net.
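The image and constituent-list representations above can be sketched in a few lines of plain Python. This is only an illustration of the binning and zero-padding described in the text, not the code used to produce the data set: the `(pt, eta, phi)` tuple layout, the function names, the choice of η as the first image axis, and the convention of dropping particles that fall outside the window are all assumptions made here.

```python
import math

R = 0.8                 # jet-size parameter quoted in the text
N_PIXELS = 100          # the image is 100 x 100 pixels
MAX_CONSTITUENTS = 150  # only the 150 highest-pT constituents are used


def wrap_phi(dphi):
    """Map an azimuthal difference into the interval [-pi, pi)."""
    return (dphi + math.pi) % (2.0 * math.pi) - math.pi


def jet_image(constituents, jet_eta, jet_phi):
    """Fill each pixel with the scalar pT sum of the particles it contains.

    `constituents` is a list of (pt, eta, phi) tuples (assumed layout).
    The window spans 2R in both eta and phi, centered on the jet axis.
    """
    image = [[0.0] * N_PIXELS for _ in range(N_PIXELS)]
    pixel = 2.0 * R / N_PIXELS
    # Keep the highest-pT constituents only.
    leading = sorted(constituents, key=lambda c: c[0], reverse=True)
    for pt, eta, phi in leading[:MAX_CONSTITUENTS]:
        i = math.floor((eta - jet_eta + R) / pixel)
        j = math.floor((wrap_phi(phi - jet_phi) + R) / pixel)
        # Assumption: particles outside the window are dropped.
        if 0 <= i < N_PIXELS and 0 <= j < N_PIXELS:
            image[i][j] += pt
    return image


def pad_constituents(features, n_features=16):
    """Truncate or zero-pad per-particle feature rows to exactly 150 rows.

    `features` is assumed to be already sorted by decreasing pT.
    """
    rows = [list(f) for f in features[:MAX_CONSTITUENTS]]
    rows += [[0.0] * n_features for _ in range(MAX_CONSTITUENTS - len(rows))]
    return rows
```

With R = 0.8, each pixel covers 0.016 × 0.016 in (η, φ), which is the granularity the text compares to an electromagnetic calorimeter cell.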
4 JEDI-net
In this work, we apply an IN [5] architecture to learn a repre-
sentation of a given input graph (the set of constituents in a
jet) and use it to accomplish a classification task (tagging the
jet). One can see the IN architecture as a processing algorithm
to learn a new representation of the initial input. This is done
by replacing a set of input features, describing each individual
vertex of the graph, with a set of engineered features, specific
to each vertex but whose values depend on the connection
between the vertices in the graph.
The starting point consists of building a graph for each
input jet. The $N_O$ particles in the jet are represented by the
vertices of the graph, fully interconnected through directional
edges, for a total of $N_E = N_O \times (N_O - 1)$ edges. An example
is shown in Fig. 6 for the case of a three-vertex graph.
The vertices and edges are labeled for practical reasons, but
the network architecture ensures that the labeling convention
plays no role in creating the new representation.
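As a quick check of the edge count, the fully connected directed graph can be enumerated explicitly. The sketch below is illustrative only: the function names are hypothetical, and the edge ordering produced by `itertools.permutations` is an arbitrary choice that need not match the labeling of Fig. 6.

```python
from itertools import permutations


def directed_edges(n_vertices):
    """Return every ordered (receiver, sender) pair with receiver != sender."""
    return list(permutations(range(n_vertices), 2))


def n_edges(n_vertices):
    """Closed-form edge count N_E = N_O * (N_O - 1)."""
    return n_vertices * (n_vertices - 1)


# Three-vertex graph: 3 * 2 = 6 directed edges.
assert len(directed_edges(3)) == n_edges(3) == 6
# A jet with the maximum of 150 constituents: 150 * 149 = 22350 edges.
assert len(directed_edges(150)) == n_edges(150) == 22350
```

The quadratic growth of the edge count with the number of constituents is worth keeping in mind when sizing the edge-level computations.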
Once the graph is built, a receiving matrix ($R_R$) and a
sending matrix ($R_S$) are defined. Both matrices have dimensions
$N_O \times N_E$. The element $(R_R)_{ij}$ is set to 1 when the $i$th
vertex receives the $j$th edge and is 0 otherwise. Similarly, the
element $(R_S)_{ij}$ is set to 1 when the $i$th vertex sends the $j$th
edge and is 0 otherwise. In the case of the graph of Fig. 6,
the two matrices take the form:
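$$
R_S = \left(\begin{array}{ccccccc}
    & E_1 & E_2 & E_3 & E_4 & E_5 & E_6 \\
O_1 & 0 & 0 & 0 & 1 & 1 & 0 \\
O_2 & 1 & 0 & 0 & 0 & 0 & 1 \\
O_3 & 0 & 1 & 1 & 0 & 0 & 0
\end{array}\right) \tag{1}
$$

$$
R_R = \left(\begin{array}{ccccccc}
    & E_1 & E_2 & E_3 & E_4 & E_5 & E_6 \\
O_1 & 1 & 1 & 0 & 0 & 0 & 0 \\
O_2 & 0 & 0 & 1 & 1 & 0 & 0 \\
O_3 & 0 & 0 & 0 & 0 & 1 & 1
\end{array}\right). \tag{2}
$$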
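The definition of the two matrices can be checked against Eqs. (1) and (2) with a short sketch. The edge list below is read off the two equations (for each edge, the vertex with a 1 in that column of the receiving matrix and the vertex with a 1 in that column of the sending matrix), using zero-based indices. The names `FIG6_EDGES` and `incidence_matrices` are hypothetical and introduced only for this illustration.

```python
# (receiver, sender) for edges E1..E6, zero-indexed, read off Eqs. (1) and (2).
FIG6_EDGES = [(0, 1), (0, 2), (1, 2), (1, 0), (2, 0), (2, 1)]


def incidence_matrices(n_vertices, edges):
    """Build the receiving and sending matrices as N_O x N_E nested lists.

    Entry [i][j] of the receiving matrix is 1 if vertex i receives edge j;
    entry [i][j] of the sending matrix is 1 if vertex i sends edge j.
    """
    r_r = [[0] * len(edges) for _ in range(n_vertices)]
    r_s = [[0] * len(edges) for _ in range(n_vertices)]
    for j, (receiver, sender) in enumerate(edges):
        r_r[receiver][j] = 1
        r_s[sender][j] = 1
    return r_r, r_s


R_R, R_S = incidence_matrices(3, FIG6_EDGES)

# Rows O1..O3 of Eq. (1) and Eq. (2).
assert R_S == [[0, 0, 0, 1, 1, 0], [1, 0, 0, 0, 0, 1], [0, 1, 1, 0, 0, 0]]
assert R_R == [[1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 0, 0], [0, 0, 0, 0, 1, 1]]
```

Every column of either matrix contains exactly one 1, since each directed edge has a single receiver and a single sender.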