4 Backgrounds
The dominant background processes are Z + jets and tt + jets. The contributions from
other processes, including diboson+jets and multijet, are small. The samples used for all
background processes are generated at tree level with the MadGraph program, interfaced
with pythia v6.4 [33] for showering and hadronization. The Z + jets background is
normalized to the next-to-next-to-leading-order cross section using fewz v2.1 [34]. The
tt + jets and tW+jets (single top quark) backgrounds are normalized to the
next-to-next-to-leading-logarithm cross sections [35, 36]. The diboson+jets backgrounds are normalized
to the next-to-leading-order (NLO) cross section from the mcfm v5.8 [37] calculation. Full
simulation of the CMS detector is implemented using the Geant4 package [38].
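As an illustration of what normalizing a simulated sample to a higher-order cross section means in practice, the sketch below computes the usual per-event weight, cross section times integrated luminosity divided by the number of generated events. The function name and all numerical inputs are hypothetical placeholders, not values from this analysis.

```python
def normalization_weight(cross_section_pb, luminosity_pb_inv, n_generated):
    """Per-event weight that scales a simulated sample to its expected yield.

    cross_section_pb  : cross section the sample is normalized to, in pb
    luminosity_pb_inv : integrated luminosity of the data, in pb^-1
    n_generated       : number of generated events in the sample
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return cross_section_pb * luminosity_pb_inv / n_generated


# Hypothetical inputs, for illustration only.
weight = normalization_weight(
    cross_section_pb=3000.0,
    luminosity_pb_inv=5000.0,
    n_generated=30_000_000,
)
```

With these placeholder inputs each simulated event counts as half an expected data event; a sample normalized to a different cross section only changes the first argument.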
4.1 Z + jets
More than half of the background events come from Drell-Yan (DY) production in
association with jets. Two points in estimating this background require special
attention. First, care must be taken in estimating the yield of Z + ≥3 jets events, as the
simulation may not correctly model the kinematic properties of multijets. Second, the event
yield and shape of the Z + b-jets process need care, because the kinematic properties of
b jet pairs and the fraction of heavy-flavor jets in inclusive Z + jets might also be
mis-modeled in the simulation. The following paragraphs describe how the data are used to
improve on the estimates from simulation.
To address the first point, the event yield from the simulation is multiplied by a
correction factor to normalize it to data in a control region. This control region is defined
such that the dilepton mass is between 80 and 100 GeV and there are at least three jets,
none of which is b tagged. The correction factor is found to be 0.98 ± 0.12 (0.91 ± 0.12)
for the electron (muon) channel. The correction is applied to all signal rectangle regions.
The 12% uncertainty comes from b tagging, JES, and pileup systematic uncertainties in
the control region, added in quadrature, while the statistical uncertainty is negligible. The
Z + jet mass in the Z + ≥3 jets (one b tag) control region is plotted in figure 4 after the
correction factor has been applied. Agreement between the data and simulated sample is
observed in the Z + ≥3 jets (no b tag) and Z + ≥3 jets (one b tag) control regions.
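The control-region normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say whether non-Z backgrounds are subtracted from the data before taking the ratio, so the subtraction here is an assumption, and the function names, event counts, and individual uncertainty components are hypothetical placeholders rather than the breakdown used in the analysis.

```python
import math


def correction_factor(n_data, n_other_bkg, n_zjets_mc):
    """Scale factor that normalizes simulated Z + jets to data in a control region.

    Assumption: the small non-Z backgrounds are subtracted from the data first.
    """
    if n_zjets_mc <= 0:
        raise ValueError("n_zjets_mc must be positive")
    return (n_data - n_other_bkg) / n_zjets_mc


def quadrature_sum(relative_uncertainties):
    """Combine independent relative uncertainties in quadrature."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


# Hypothetical control-region yields (placeholders, not measured values).
factor = correction_factor(n_data=1000.0, n_other_bkg=20.0, n_zjets_mc=1000.0)

# Hypothetical relative uncertainties for b tagging, JES, and pileup.
total_relative_uncertainty = quadrature_sum([0.08, 0.08, 0.04])
```

The same factor would then multiply the simulated Z + jets yield in every signal region, with the combined uncertainty propagated alongside it.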
The second point requires a different approach since control regions that include two
or more b jets may suffer from signal contamination. We take a two-step approach. First,
the simulated Z+ b-jets events are weighted by the ratio of the k-factor (NLO cross section
divided by LO cross section) for Z + 2 b-jets to that of Z+2 jets, using mcfm [39]. The
ratios vary from 1.09 to 2.56 in the b jet pair mass range of 20 GeV to 1.8 TeV. In the
simulation, about 20% of Z+jets events in the signal region have at least two jets originating
from b quarks. In the second step, the remaining difference between data and simulation is
evaluated as a function of b jet pair mass in the non-signal, off-diagonal regions (sidebands)
in the b jet pair mass and Z + jet mass plane. The uncertainty in the Z+heavy-flavor jets
processes is taken from this difference or from the uncertainty in the CMS cross section
measurement [40], whichever is larger. The uncertainty varies from 20 to 50%.
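The two-step treatment of Z + b-jets can be sketched for a single bin of b jet pair mass. The first function forms the ratio of k-factors used as the event weight, and the second takes the larger of the sideband data-simulation difference and the cross section measurement uncertainty. All names and numbers are hypothetical placeholders; in the analysis the ratio is obtained from mcfm and varies with b jet pair mass.

```python
def kfactor_ratio(nlo_zbb, lo_zbb, nlo_zjj, lo_zjj):
    """Ratio of the Z + 2 b-jets k-factor to the Z + 2 jets k-factor.

    Each k-factor is the NLO cross section divided by the LO cross section.
    """
    if min(lo_zbb, nlo_zjj, lo_zjj) <= 0:
        raise ValueError("cross sections in denominators must be positive")
    k_zbb = nlo_zbb / lo_zbb
    k_zjj = nlo_zjj / lo_zjj
    return k_zbb / k_zjj


def heavy_flavor_uncertainty(sideband_relative_difference, measurement_relative_uncertainty):
    """Relative uncertainty: whichever of the two sources is larger."""
    return max(abs(sideband_relative_difference), measurement_relative_uncertainty)


# Hypothetical cross sections in arbitrary units, for one mass bin.
weight_zbb = kfactor_ratio(nlo_zbb=3.0, lo_zbb=2.0, nlo_zjj=12.0, lo_zjj=10.0)

# Hypothetical sideband difference (-35%) and measurement uncertainty (20%).
uncertainty = heavy_flavor_uncertainty(-0.35, 0.20)
```

Taking the maximum rather than a quadrature sum mirrors the "whichever is larger" prescription in the text.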