eling and analysis, and in particular, the measurement of
the second-order statistics (variance and autocorrelation) in
various time scales using packet traces. That area, which
started with the seminal Bellcore work [9], revealed that
the traffic count process at a network link is asymptotically
self-similar. The reader is referred to [12] for a survey of the
related literature. Our approach and objectives in this work
are significantly different. First, instead of passive traffic
measurements at a single link we are interested in the active
estimation of the avail-bw in an end-to-end path. Second,
instead of focusing on the scaling properties of the avail-bw
process, we focus on the variability of the marginal distri-
bution at a given time scale. Third, our high-level goal is
to develop tools that can be used in practice to measure
important path characteristics, rather than to statistically
characterize or model network traffic.
1.3 Main Contributions and Overview
In this paper, we first present a measurement technique,
referred to as percentile sampling, that can associate a given
probing rate with a percentile of the avail-bw distribution.
We then use percentile sampling to design two estimation
algorithms for the avail-bw variation range.
The first algorithm is iterative in nature. We refer to it as
non-parametric, because it does not assume a specific avail-
bw distribution. The non-parametric algorithm is more ap-
propriate for very short values of the measurement time scale
(typically less than 100msec) or in bottlenecks with limited
flow multiplexing, where the avail-bw distribution may be
non-Gaussian.
The second algorithm is parametric, because it assumes
that the avail-bw follows the Gaussian distribution. This as-
sumption is typically valid when τ >100-200msec and when
the tight link carries a significant amount of aggregated traf-
fic [8]. The parametric algorithm can produce an estimate
faster than the non-parametric algorithm because it is not
iterative.
The two estimation algorithms have been implemented
in a measurement tool called Pathvar. We have validated
Pathvar with simulations and testbed experiments using re-
alistic Internet traffic. Pathvar can track the actual avail-bw
variation range within 10-20%, even under non-stationary
conditions.
Finally, we focus on four factors that can significantly af-
fect the variation range of the avail-bw. These factors are
the traffic load, number of competing flows, rate of com-
peting flows, as well as the measurement time scale. The
results of that study explain why the avail-bw appears as
less or more variable depending on the load conditions and
the degree of statistical multiplexing at the tight link.
The rest of the paper is structured as follows. The per-
centile sampling technique is described in §2. §3 presents the
non-parametric estimation algorithm, while §4 presents the
parametric algorithm. The implementation of Pathvar, and
a few typical validation results, are summarized in §5. Fi-
nally, we examine the four factors that affect the variability
of the avail-bw process in §6. We conclude in §7.
2. PERCENTILE SAMPLING
In this section, we first describe the basic technique of
percentile sampling, which forms the basis of the proposed
estimation algorithms in the next two sections. A number
N of probing streams of duration τ and rate R are sent to
a path. Each stream provides an indication of whether the
avail-bw in the corresponding time interval is higher than
R. The resulting N binary samples are used to estimate the
percentile of the avail-bw distribution that corresponds to
rate R. We also derive the required number of samples N
for a given maximum error, assuming independent sampling.
2.1 Basic idea
Consider a network path. The avail-bw random process
measured in time scale τ is A
τ
(t). As mentioned in the Intro-
duction, we assume that this process is stationary and iden-
tically distributed. Given the previous assumptions, we can
focus on the random variable A
τ
and on its time-invariant
marginal distribution F
τ
.
The sender transmits a probing packet stream of rate R
and duration τ during (t, t + τ) to the receiver. If M is the
packet size, then the interarrival between successive pack-
ets is M/R and the number of probing packets is d
τ R
M
e.
The avail-bw during (t, t + τ) is given by a realization of
the random variable A
τ
. The receiver classifies the stream
as type-G if it infers that the probing rate R is greater (or
equal) than A
τ
. Otherwise, the stream is classified as type-
L (for “lower”). The exact algorithm for the classification
of a stream as type-G or type-L is described in the techni-
cal report [7]. At a high level, the algorithm is based on
the statistical analysis of the one-way delays of the stream’s
probing packet, similar to the approach taken in [5].
We use the indicator variable I(R) to represent whether
a stream is of type-G (I(R) = 1) or type-L (I(R) = 0). If
F
τ
(a) is the Cumulative Distribution Function (CDF) of A
τ
,
we have that
I(R) =
1 with probability F
τ
(R)
0 with probability 1 − F
τ
(R)
So, the expected value of I(R) is E[I(R)] = F
τ
(R).
A single probing stream can only tell us if the probing
rate R is greater than the realization of the avail-bw random
variable in the corresponding time interval. To accumulate
N such samples, the sender transmits N identical probing
streams
1
. The indicator variable for each stream is denoted
by I
i
(R). Because different streams will sample different
realizations of A
τ
, some streams may be classified as type-G
and others as type-L. Let I(R, N) be the number of streams
of type-G, i.e., I(R, N ) =
P
N
i=1
I
i
(R). The expected value
of I(R, N) is
E[I(R, N)] =
N
X
i=1
E[I
i
(R)]
= F
τ
(R)N (4)
The following proposition summarizes the basic idea of
percentile sampling:
Proposition 1. For a stationary avail-bw process, the
fraction I(R, N)/N of type-G probing streams of rate R is
an unbiased estimator of p = F
τ
(R).
Proposition 1 provides us with a mapping from a given
probing rate R to the corresponding cumulative probability
in the avail-bw distribution. Since our goal is to estimate
1
The time period between streams should be sufficiently
long for the streams to not get queued behind each other
while in transit.