constraining the candidate to originate from its most likely production vertex [41]. In
the kinematic fit of candidates with $q^2$ in the J/ψ mass window, the dimuon pair is also
constrained to the known J/ψ mass. This mass constraint improves the resolution in
$m(K^-\pi^+\mu^+\mu^-)$ for candidates involving an intermediate J/ψ resonance decay by a factor
of two.
Signal candidates are further classified using an artificial neural network [42]. The
neural network is trained using a sample of simulated $B^0 \to K^{*0}\mu^+\mu^-$ decays as a proxy
for the signal decay. Candidates in data with $m(K^-\pi^+\mu^+\mu^-) > 5670$ MeV/$c^2$ are used as a
background sample. This sample consists predominantly of combinatorial background,
where uncorrelated tracks from the event are mistakenly combined. The neural network
uses the following variables related to the topology of the $B^0_{(s)}$ meson decay: the angle
between the reconstructed momentum vector of the $B^0_{(s)}$ meson and the vector connecting
the PV and the decay vertex of the $B^0_{(s)}$ candidate; the IP, $p_T$ and proper decay time of
the $B^0_{(s)}$ candidate; the vertex fit quality of the $B^0_{(s)}$ decay vertex and of the dimuon pair;
the minimum and maximum $p_T$ of the final-state particles; and, for the Run 1 data set, a
measure of the isolation of the final-state particles in the detector. It has been verified
that the distributions of the variables used as input to, and the output distribution from,
the classifier agree between the simulation and the data. The output of the neural network
is transformed such that it is uniform in the range 0–1 on the signal proxy. Candidates
with neural network response below 0.05 are rejected in the subsequent analysis. This
requirement removes a background-dominated part of the data sample. The neural network
response is validated on simulated $B^0 \to K^{*0}\mu^+\mu^-$ and $B^0_s \to K^{*0}\mu^+\mu^-$ decays to ensure
that it does not introduce any bias in $m(K^-\pi^+\mu^+\mu^-)$.
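The flattening of the classifier output described above can be illustrated with a minimal sketch. It assumes the transformation is the empirical cumulative distribution of the raw response on the signal proxy, which is one common way to obtain a uniform output; the text does not state how the transformation is implemented. The function name `make_uniform_transform` and the beta-distributed toy scores are hypothetical and stand in for real classifier outputs.

```python
import bisect
import random


def make_uniform_transform(signal_scores):
    """Build a map from raw score to its empirical CDF value on the signal proxy.

    By construction the transformed response of the signal proxy is
    (approximately) uniform in the range 0-1.
    """
    ordered = sorted(signal_scores)
    n = len(ordered)

    def transform(score):
        # Fraction of signal-proxy candidates with a raw score <= this score.
        return bisect.bisect_right(ordered, score) / n

    return transform


random.seed(1)
# Toy raw responses (assumption): signal peaks high, background peaks low.
signal_raw = [random.betavariate(5, 2) for _ in range(20000)]
background_raw = [random.betavariate(2, 5) for _ in range(20000)]

to_uniform = make_uniform_transform(signal_raw)
threshold = 0.05  # candidates below this transformed response are rejected

signal_kept = sum(to_uniform(s) >= threshold for s in signal_raw) / len(signal_raw)
background_kept = sum(to_uniform(b) >= threshold for b in background_raw) / len(background_raw)
```

With a uniform signal response, a requirement at 0.05 retains about 95% of the signal proxy by construction, while the retained background fraction depends on how well the classifier separates the two samples.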
Finally, a number of vetoes are applied to reject specific sources of background. Signal
candidates are rejected if the pion candidate has a non-negligible probability of being a kaon
and the $K^+K^-$ invariant mass, after assigning the kaon mass to the pion candidate, is
within 10 MeV/$c^2$ of the known φ(1020) meson mass. This veto removes 98% of
$B^0_s \to \phi\mu^+\mu^-$ decays inside the φ(1020) mass window. Candidates are also rejected if the
kaon or pion is identifiable as a muon and the $K^-\mu^+$ or $\pi^+\mu^-$ mass, after assigning the
muon mass hypothesis to the kaon or pion candidate, is consistent with that of a J/ψ or
ψ(2S) meson (within ±60 MeV/$c^2$ of their known masses).
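The vetoes above all rely on recomputing a two-body invariant mass under an alternative mass hypothesis. The following sketch illustrates this for the φ(1020) veto. The helper names, the boolean `pion_is_kaon_like` (standing in for the particle-identification requirement, whose threshold is not given in the text) and the approximate mass values are assumptions made for illustration only.

```python
import math

# Approximate known masses in MeV/c^2 (assumed values for illustration).
KAON_MASS = 493.677
PHI_MASS = 1019.461


def energy(momentum, mass):
    """Energy of a particle with three-momentum `momentum` under a mass hypothesis."""
    return math.sqrt(mass ** 2 + sum(c * c for c in momentum))


def invariant_mass(p1, m1, p2, m2):
    """Invariant mass of two particles given their three-momenta and assumed masses."""
    e = energy(p1, m1) + energy(p2, m2)
    px, py, pz = (a + b for a, b in zip(p1, p2))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def fails_phi_veto(kaon_p, pion_p, pion_is_kaon_like, window=10.0):
    """Return True if the candidate should be rejected by the phi(1020) veto.

    The pion candidate is assigned the kaon mass before the K+K- mass is computed.
    """
    m_kk = invariant_mass(kaon_p, KAON_MASS, pion_p, KAON_MASS)
    return pion_is_kaon_like and abs(m_kk - PHI_MASS) < window


# Toy example: a phi(1020) decaying at rest into two back-to-back kaons.
p_star = math.sqrt((PHI_MASS / 2) ** 2 - KAON_MASS ** 2)
vetoed = fails_phi_veto((0.0, 0.0, p_star), (0.0, 0.0, -p_star), True)
```

The J/ψ and ψ(2S) vetoes follow the same pattern, with the muon mass assigned to the kaon or pion candidate and a ±60 MeV/$c^2$ window around each resonance mass.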
4 Signal yields
In order to maximise sensitivity to a $B^0_s \to K^{*0}\mu^+\mu^-$ signal, candidates are divided
into regions of neural network response. The candidates are also divided based on the
two data-taking periods, Run 1 and Run 2. Four regions of neural network response
are selected for each data-taking period, each containing an equal number of expected
signal decays. The yield of the $B^0_s \to K^{*0}\mu^+\mu^-$ decay is determined by performing a
simultaneous unbinned maximum likelihood fit to the $m(K^-\pi^+\mu^+\mu^-)$ distribution of the
eight resulting subsets of the data.
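The division into eight subsets can be sketched as follows. The region boundaries are not quoted in the text; the sketch assumes that the neural network response is exactly uniform for signal in each data-taking period, in which case equal expected signal corresponds to equal-width regions between the lower requirement of 0.05 and 1. The function names are hypothetical.

```python
def equal_signal_edges(lower_cut, n_regions):
    """Region edges with equal expected signal, assuming a uniform signal response."""
    width = (1.0 - lower_cut) / n_regions
    return [lower_cut + i * width for i in range(n_regions + 1)]


def region_index(response, edges):
    """Index of the region containing `response`, or None if below the lower cut."""
    if response < edges[0]:
        return None
    for i in range(len(edges) - 2):
        if response < edges[i + 1]:
            return i
    # Everything at or above the second-to-last edge falls in the final region.
    return len(edges) - 2


edges = equal_signal_edges(0.05, 4)

# One subset per (data-taking period, response region): 2 x 4 = 8 in total.
subsets = [(run, i) for run in ("Run 1", "Run 2") for i in range(4)]
```

Under this assumption the edges lie at approximately 0.05, 0.2875, 0.525, 0.7625 and 1, and each of the eight subsets contributes its own $m(K^-\pi^+\mu^+\mu^-)$ distribution to the simultaneous fit.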
In the likelihood fit, the signal lineshape of both the $B^0$ and the $B^0_s \to K^{*0}\mu^+\mu^-$ decays
is described by the sum of three functions: a Gaussian function with a power-law tail on the