3
0 1 2 3 4 5 6 7
0
1
2
3
4
fare distance/unit time (m/s)
cruising distance/unit time (m/s)
density−→
r = 0. 0874
Fig. 2. Density scatter of cruising distance/unit time w.r.t.
profit
Trajectories
Parking Detection
Parking places
Segmentation
Trips
Map-MatchingStatistic learning
Knowledge of
road segments
Offline Mining
Knowledge of
parking places
Taxi Recommender Passenger Recommender
Online Recommendation
Fig. 3. System Overview
outperforms other approaches for low-sampling-rate GPS
trajectories. Later, we utilize the detected parking places
and the mapped trajectories to learn the time-dependent
statistical results based on a probabilistic model (Section 4).
To tackle the data sparseness problem, we devise a road
segment clustering method and perform statistical learning on
each road segment cluster instead of a single road segment
(Section 5). The above processes are performed offline and
will be repeated only when the trajectory data is updated
(e.g., once a month). Based on this model, we perform
recommendations to taxi drivers and passengers, given their
locations and current times (Section 6).
3 PARKING PLACES DETECTION
This section details the process for detecting parked status
from a non-occupied trip and accordingly finding out the
parking places in the urban area of a city based on a collection
of taxi trajectories.
3.1 Candidates Detection
Figure 4 demonstrates the parking candidate detection ap-
proach, given a non-occupied trip p
1
→ p
2
→ · · · → p
7
.
We first keep checking the distance (Great-circle distance)
between the current point and the latter point until the
distance is smaller than a threshold. As depicted in Figure
4 B), since dist(p
1
, p
2
) exceeds the distance threshold δ,
we move next, fixing p
2
as the “pivot” point and find that
dist(p
2
, p
3
) < δ, dist(p
2
, p
4
) < δ while dist(p
2
, p
5
) > δ
p
1
p
7
p
1
p
7
p
1
p
2
p
3
p
4
p
5
p
6
p
7
p
1
p
2
p
3
p
4
p
5
p
6
p
7
(A)
(B)
(C)
(D)
p
3
p
4
p
5
p
6
p
3
p
4
p
5
p
6
p
2
δ
p
2
Fig. 4. Detection of candidate parking places
(Figure 4 C). If the time interval between p
2
.t and p
4
.t is
larger than the time threshold τ , the three points that form
a small cluster represent a possible parking candidate. Next,
we fix p
3
as the pivot point and keep on the procedure to
check latter points. Finally, as shown in Figure 4 D), we
detect (p
2
, p
3
, p
4
, p
5
, p
6
) as a parking candidate because we
cannot expand this group any further, i.e., all the points in this
group have a distance farther than δ to p
7
. The pseudocode
is provided in Algorithm 1.
3.2 Filtering
Essentially, the candidate detection algorithm finds out the lo-
cations where the GPS points of a taxi are densely clustered,
with spatial and temporal constraints. However, a parking
candidate could sometimes be generated by taxis stuck in a
traffic jam, or waiting for signals at a traffic light, instead of
a real parking. To reduce such false selections, we design a
supervised model for picking out the true parked status from
the candidate set, using the following features:
• Spatial-Temporal features including 1) Minimum
Bounding Ratio (MBR). As shown in Figure 5(A),B)),
MBR is the area ratio between the bounding box of
the road segment (MBRr) and the bounding box of
the GPS points (MBRc) in the candidate set. 2) Av-
erageDistance. The average distance d
c
between points
in the candidate set and their nearest road segments, as
Algorithm 1: ParkingCandidateDetection
Input: A road network G, a trajectory J, distance threshold δ, time
threshold τ
Output: A set of parking candidates P = {P }
1 i ← 0, M ← kJk, P ← ∅, P ← ∅;
2 while i < (M − 1) do
3 j ← i + 1; flag ←false;
4 while j < M do
5 dist ←Distance (p
i
, p
j
);
6 if dist < δ then j ← j + 1;flag =true;
7 else break;
8 if p
j−1
.t − p
i
.t > τ and flag =true then
9 foreach point p ∈ J[i, j) and p /∈ P do
10 P .Add(p);/
*
build a candidate
*
/
11 if i = j − 1 then
12 P.Add(MB (P)); P ← ∅;
/
*
add the minimum bounding box of P
into P
*
/
13 i ← i + 1;
14 return P