7306 IEEE SENSORS JOURNAL, VOL. 16, NO. 20, OCTOBER 15, 2016
owns K radio resources (termed access resources (ARs)),
which are assigned to the MTCD-to-MTCG and MTCD-to-
eNB links. Each MTCD-to-MTCG link and MTCD-to-eNB
link can consume u ARs of the connected MTCG and eNB,
where $u \in \{1,\ldots,U\}$ and $U \le K$. Moreover, data packets
transmitted from an MTCD follow a Poisson distribution with
a mean rate of $\lambda_m$, and the MTC packet size follows an
exponential distribution with a mean size of $\rho_m$. Therefore,
the total arrival rate of the MTC data packets to the network
is equal to $M\lambda_m$.
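The total rate $M\lambda_m$ follows from the superposition property of independent Poisson processes. As a minimal sketch, the simulation below checks this numerically; the values of `M`, `lam_m`, and the horizon `T` are illustrative assumptions and do not come from the paper.

```python
import random


def simulate_total_arrivals(num_devices, rate_per_device, horizon, seed=0):
    """Count packets from independent Poisson sources over [0, horizon]."""
    rng = random.Random(seed)
    total = 0
    for _ in range(num_devices):
        # Inter-arrival times of a Poisson process are exponential.
        t = rng.expovariate(rate_per_device)
        while t <= horizon:
            total += 1
            t += rng.expovariate(rate_per_device)
    return total


# Illustrative values only (assumptions, not taken from the paper).
M, lam_m, T = 20, 0.5, 2000.0
empirical_rate = simulate_total_arrivals(M, lam_m, T) / T
print(f"empirical rate: {empirical_rate:.3f}, M * lambda_m: {M * lam_m:.3f}")
```

With a long enough horizon, the empirical rate should be close to $M\lambda_m$.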
When an MTCD is connected to an MTCG, the MTCG
acts as a one-hop relay that forwards data packets to the eNB
via the MTCG-to-eNB link. It is assumed that the MTCG-to-eNB
link employs 3GPP LTE/LTE-A cellular communications,
which supports high-bandwidth transmission
and operates in a different frequency band from NB-IoT.
We suppose that the eNB owns L radio resources (termed
backhaul resources (BRs)) for the MTCG-to-eNB links, and
the eNB has many more BRs than ARs, i.e., K < L.
Each MTCG-to-eNB link occupies w BRs of the eNB, where
w ∈{1,...,W}, W ≤ L. Meanwhile, the eNB is also
responsible for supporting H2H communications. H HTC
devices (HTCDs) coexist with MTCDs in the serving areas of
the network, and interact with the eNB through the HTCD-to-
eNB links. The HTCD-to-eNB links and MTCG-to-eNB links
share the common BRs of the eNB, which allocates v BRs
for each HTCD-to-eNB link, where v ∈{1,...,V }, V ≤ L.
Assume that the data packets sent by each HTCD also follow
an independent Poisson process with a mean rate of $\lambda_h$, and
the size of the HTC packets also follows an exponential
distribution with a mean size of $\rho_h$. Each MTCG-to-eNB
link and HTCD-to-eNB link can also support $l_h$ transmission
modes with different transmission rates $C_h(\cdot)$. Let
$\{\theta_{h,i}\}_{i=1}^{l_h+1}$
be the set of SNR boundary points, where $\theta_{h,l_h+1} = \infty$.
When the SNR $\xi \in [\theta_{h,i}, \theta_{h,i+1})$, the transmission rate $C_h(\xi)$ of
transmission mode $i$ is equal to $\mu_{h,i}$, where $i \in \{1,\ldots,l_h\}$.
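The piecewise-constant rate function $C_h(\xi)$ can be sketched as a threshold lookup. The function name `transmission_rate`, the example thresholds and rates, and the choice to return zero below the lowest boundary point are all assumptions made for illustration; the text does not specify what happens when $\xi < \theta_{h,1}$.

```python
from bisect import bisect_right
import math


def transmission_rate(snr, thresholds, rates):
    """Return mu_{h,i} for the mode i with theta_{h,i} <= snr < theta_{h,i+1}.

    thresholds: ascending boundary points theta_{h,1..l_h+1}, last one infinite.
    rates: mu_{h,1..l_h}, one rate per transmission mode.
    """
    if len(thresholds) != len(rates) + 1:
        raise ValueError("need exactly one more threshold than rates")
    if snr < thresholds[0]:
        # Assumption: no transmission below the lowest boundary point.
        return 0.0
    # Number of boundary points <= snr equals the 1-based mode index i.
    i = min(bisect_right(thresholds, snr), len(rates))
    return rates[i - 1]


# Hypothetical boundary points and rates, for illustration only.
thresholds = [5.0, 10.0, 15.0, math.inf]
rates = [1.0, 2.0, 4.0]
print(transmission_rate(12.0, thresholds, rates))
```

A binary search keeps the lookup logarithmic in the number of modes, although a linear scan would serve equally well for the small $l_h$ typical of such models.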
To maximize network revenue, the network must decide the
access strategy for each MTCD (i.e., through the eNB or an
MTCG), and optimize radio resource allocation for each
wireless link. Thus, we formulate the radio resource allocation
model in the software-defined IoT network as an SMDP
problem. In SDN, all the decision-making procedures are
carried out in the SDN controller. When a new MTC or HTC
data packet arrives, the SDN controller first evaluates the
expected system gain and system expense based on the current
status information of the eNB and MTCGs. Then, the SDN
controller decides whether to accept or reject the data packets,
which MTCGs or eNB should be selected for MTC traffic
transmission, and how to allocate radio resources to each
wireless link. In the SMDP framework, the decisions adopted
by the SDN controller are called actions, and the moments
when decisions are made are termed decision epochs. The
action chosen is based on the current system state of the network,
which includes the current traffic load on each MTCG
and the eNB. To make the optimal decision for the reward
model, the SDN controller needs to obtain the system reward
for each action before making any decision. The long-term
expected average reward per unit time of the network is
taken as the optimality criterion for the SMDP.
B. Problem Formulation
This subsection formulates the optimization problem under
consideration as an SMDP. The corresponding system states, the
actions available in each state, and the reward model are described as
follows.
1) System States: The system state $s$ of the software-defined
IoT network can be represented by the number of
current wireless links in the network that occupy each possible
number of radio resources, together with an event occurring in
the system, which could be either the arrival or the departure of
a data packet. The system state space $S$ can be denoted as
follows:
$$S = \{\, s \mid s = (g_1, g_2, \ldots, g_N, s_1, s_2, \ldots, s_N, s_m, s_h, e) \,\}, \tag{1}$$
where $g_i$, $s_i$, $s_m$, $s_h$, $i \in \{1,\ldots,N\}$, are defined as
$$\begin{aligned}
g_i &= (g_{i,1}, g_{i,2}, \ldots, g_{i,U}),\\
s_i &= (s_{i,1}, s_{i,2}, \ldots, s_{i,W}),\\
s_m &= (s_{m,1}, s_{m,2}, \ldots, s_{m,U}),\\
s_h &= (s_{h,1}, s_{h,2}, \ldots, s_{h,V}).
\end{aligned} \tag{2}$$
The above symbols are explained in detail below:
• $g_i$: a vector of $g_{i,u}$, where $u \in \{1,\ldots,U\}$. $g_{i,u}$ represents
the number of wireless links between the MTCDs and
the $i$th MTCG that occupy $u$ ARs. The total number
of allocated ARs of the $i$th MTCG should satisfy
$\sum_{u=1}^{U} (u\, g_{i,u}) \le K$;
• $s_m$: a vector of $s_{m,u}$, where $u \in \{1,\ldots,U\}$. Similarly,
$s_{m,u}$ represents the number of MTCD-to-eNB links that
occupy $u$ ARs. The total number of allocated ARs of the
eNB should also satisfy
$\sum_{u=1}^{U} (u\, s_{m,u}) \le K$;
• $s_i$: a vector of $s_{i,w}$, where $w \in \{1,\ldots,W\}$. $s_{i,w}$ is the
number of wireless links between the $i$th MTCG and
the eNB that occupy $w$ BRs;
• $s_h$: a vector of $s_{h,v}$, where $v \in \{1,\ldots,V\}$. $s_{h,v}$ indicates
the number of HTCD-to-eNB links that occupy
$v$ BRs. Thus, the total number of allocated BRs
of the eNB should satisfy
$\sum_{i=1}^{N} \sum_{w=1}^{W} (w\, s_{i,w}) + \sum_{v=1}^{V} (v\, s_{h,v}) \le L$;
• $e$: an event in the event set $E$, i.e., $e \in E$.
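The three capacity constraints listed above can be collected into a single feasibility check on the link-count part of a state. This is a minimal sketch: the helper names `used_resources` and `is_feasible`, the tuple-based state layout, and the numerical values are assumptions introduced here for illustration.

```python
def used_resources(counts):
    """Return sum over u of u * counts[u - 1], i.e. resources in use."""
    return sum(u * n for u, n in enumerate(counts, start=1))


def is_feasible(g, s, s_m, s_h, K, L):
    """Check the AR and BR capacity constraints for one state.

    g[i]: link counts g_{i,1..U} at the i-th MTCG (access resources).
    s[i]: link counts s_{i,1..W} from the i-th MTCG to the eNB (backhaul).
    s_m:  MTCD-to-eNB link counts s_{m,1..U} (access resources of the eNB).
    s_h:  HTCD-to-eNB link counts s_{h,1..V} (backhaul resources).
    """
    if any(used_resources(g_i) > K for g_i in g):
        return False  # some MTCG exceeds its K access resources
    if used_resources(s_m) > K:
        return False  # the eNB exceeds its K access resources
    backhaul = sum(used_resources(s_i) for s_i in s) + used_resources(s_h)
    return backhaul <= L  # shared backhaul budget of the eNB


# Hypothetical example with N = 2, U = W = V = 2, K = 4, L = 8.
g = [(2, 1), (0, 2)]
s = [(1, 1), (2, 0)]
print(is_feasible(g, s, s_m=(1, 1), s_h=(1, 1), K=4, L=8))
```

Such a check could be used to filter candidate states when enumerating the state space for small parameter values.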
The event set $E$ is denoted by
$$E = \{A_m, A_h\} \cup D_1 \cup D_2 \cup \cdots \cup D_N \cup D_m \cup D_h, \tag{3}$$
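where $D_i$, $D_m$, $D_h$, $i \in \{1,\ldots,N\}$, are defined as
$$\begin{aligned}
D_i &= \{D_{i,u,w} \mid u \in \{1,\ldots,U\},\ w \in \{1,\ldots,W\}\},\\
D_m &= \{D_{m,u} \mid u \in \{1,\ldots,U\}\},\\
D_h &= \{D_{h,v} \mid v \in \{1,\ldots,V\}\}.
\end{aligned} \tag{4}$$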
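To make the structure of the event set in (3) and (4) concrete, the sketch below enumerates it for small parameter values. The tuple encoding and the function name `build_event_set` are assumptions made for illustration, and the arrival events are treated as plain labels here, ignoring any further attributes they may carry.

```python
from itertools import product


def build_event_set(N, U, W, V):
    """Enumerate E = {A_m, A_h} ∪ D_1 ∪ ... ∪ D_N ∪ D_m ∪ D_h as tuples."""
    events = [("A_m",), ("A_h",)]
    for i in range(1, N + 1):
        # D_i contains one event D_{i,u,w} per (u, w) pair.
        events += [("D", i, u, w)
                   for u, w in product(range(1, U + 1), range(1, W + 1))]
    events += [("D_m", u) for u in range(1, U + 1)]
    events += [("D_h", v) for v in range(1, V + 1)]
    return events


# Hypothetical sizes, for illustration only.
events = build_event_set(N=2, U=2, W=3, V=2)
print(len(events))  # 2 + N*U*W + U + V
```

Under this encoding the event set has $2 + NUW + U + V$ elements, which grows linearly in the number of MTCGs.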
Each event is detailed as follows:
• $A_m$: the network receives a data packet from an
MTCD. $A_m$ can be further denoted as $A_m = (\mu_1, \mu_2, \ldots, \mu_N, \mu_{N+1})$,
where $\{\mu_i\}_{i=1}^{N}$ represent the
transmission rate of the wireless link between the MTCD