Deep Reinforcement Learning for Intelligent
Transportation Systems: A Survey
Ammar Haydari, Student Member, IEEE, Yasin Yilmaz, Member, IEEE
Abstract—The latest technological improvements have increased the quality of transportation. New data-driven approaches open up a new research direction for all control-based systems, e.g., in transportation, robotics, IoT, and power systems. Combining data-driven applications with transportation systems plays a key role in recent transportation applications. In this paper, the latest deep reinforcement learning (RL) based traffic control applications are surveyed. Specifically, traffic signal control (TSC) applications based on (deep) RL, which have been studied extensively in the literature, are discussed in detail. Different problem formulations, RL parameters, and simulation environments for TSC are discussed comprehensively. In the literature, there are also several autonomous driving applications studied with deep RL models. Our survey extensively summarizes existing works in this field by categorizing them with respect to application types, control models, and studied algorithms. In the end, we discuss the challenges and open questions regarding deep RL-based transportation applications.
Index Terms—Deep reinforcement learning, Intelligent transportation systems, Traffic signal control, Autonomous driving, Multi-agent systems.
I. INTRODUCTION
With increasing urbanization and the latest advances in autonomous technologies, transportation studies have evolved toward more intelligent systems, called intelligent transportation systems (ITS). Artificial intelligence (AI) aims to control systems with minimal human intervention. The combination of ITS and AI provides effective solutions for 21st-century transportation needs. The main goal of ITS is to provide safe, effective, and reliable transportation systems to its participants. For this purpose, optimal traffic signal control (TSC), autonomous vehicle control, and traffic flow control are some of the key research areas.
Future transportation systems are expected to include full autonomy, such as autonomous traffic management and autonomous driving. Even now, semi-autonomous vehicles occupy the roads, and the level of autonomy is likely to increase in the near future. There are several reasons why authorities want autonomy in ITS, such as time savings for drivers, energy savings for the environment, and safety for all participants. Travel time savings can be provided by coordinated and connected traffic systems that can be controlled more efficiently using autonomous systems. When vehicles spend more time in traffic, fuel consumption increases, which has environmental and economic impacts. Another reason to minimize human intervention is the unpredictable nature of human behavior. It is expected that autonomous driving will decrease traffic accidents and increase the quality of transportation. For all the reasons stated above, there is high demand for various aspects of autonomous control in ITS. One popular approach is to use experience-based learning models, similar to human learning.
The growing population in urban areas causes a high volume of traffic; for instance, the annual congestion cost for a driver in the US was 97 hours and $1,348 in 2018 [1]. Hence, controlling traffic lights with adaptive modules is a recent research focus in ITS. Designing an adaptive traffic management system through traffic signals is an effective solution for reducing traffic congestion. The best way to optimize traffic lights is still an open question for researchers, but one promising approach for optimal TSC is to use learning-based AI techniques.
There are three main machine learning paradigms. Supervised learning makes decisions based on output labels provided in training. Unsupervised learning works by discovering patterns without prior knowledge of output labels. The third machine learning paradigm is reinforcement learning (RL), which takes sequential actions, rooted in the Markov Decision Process (MDP), with a rewarding or penalizing criterion. RL combined with deep learning, named deep RL, is currently accepted as the state-of-the-art learning framework in control systems. While RL can solve complex control problems, deep learning helps to approximate highly nonlinear functions from complex datasets.
Recently, many deep RL based solution methods have been presented for different ITS applications. There is an increasing interest in RL-based control mechanisms for ITS applications such as traffic management systems and autonomous driving. Gathering all the data-driven ITS studies related to deep RL and discussing such applications together in one paper is needed for informing ITS researchers about deep RL, as well as deep RL researchers about ITS.
In this paper, we review the deep RL applications proposed for ITS problems, predominantly for TSC. Different RL approaches from the literature are discussed. TSC solutions based on standard RL techniques had already been studied before the invention of deep RL. Hence, we believe standard RL techniques are also highly relevant when reviewing the deep RL solutions for ITS, in particular TSC. Since traffic intersection models are mainly connected and distributed, multi-agent dynamic control techniques, which are also extensively covered in this survey, play a key role in RL-based ITS applications.
A. Contributions
This paper presents a comprehensive survey on deep RL
applications for ITS by discussing a theoretical overview of
deep RL, different problem formulations for TSC, various
deep RL applications for TSC and other ITS topics, and
finally challenges with future research directions. The target audience is ITS researchers who want a jump start in learning deep RL techniques, as well as deep RL researchers who are interested in ITS applications. We also
believe that this survey will serve as “a compact handbook of
deep RL in ITS” for more experienced researchers to review
the existing methods and open challenges. Our contributions
can be summarized as follows.
• The first comprehensive survey of RL and deep RL based
applications in ITS is presented.
• The theoretical background of RL and deep RL models, especially those used in the ITS literature, is explained from a broad perspective.
• Existing works in TSC that use RL and deep RL are dis-
cussed and clearly summarized in tables for appropriate
comparisons.
• Similarly, different deep RL applications in other ITS
areas, such as autonomous driving, are presented and
summarized in a table for comparison.
B. Organization
The paper is organized as follows.
• Section II: Related Work
• Section III: Deep RL: An Overview
– Section III-A: Reinforcement Learning
– Section III-B: Deep Reinforcement Learning
– Section III-C: Summary of Deep RL
• Section IV: Deep RL Settings for TSC
– Section IV-A: State
– Section IV-B: Action
– Section IV-C: Reward
– Section IV-D: Neural Network Structure
– Section IV-E: Simulation Environments
• Section V: Deep RL Applications for TSC
– Section V-A: Standard RL Applications
– Section V-B: Deep RL Applications
• Section VI: Deep RL for Other ITS Applications
– Section VI-A: Autonomous Driving
– Section VI-B: Energy Management
– Section VI-C: Road Control
– Section VI-D: Various ITS Applications
• Section VII: Challenges and Open Research Questions
II. RELATED WORK
The earliest work summarizing AI models for TSC, including RL and other approaches, dates back to 2007 [2]. At that time, fuzzy logic, artificial neural networks, and RL were the three most popular AI methods researchers applied to TSC. Due to the connectedness of ITS components, such as intersections, multi-agent models provide a more complete and realistic solution than single-agent models. Hence, formulating the TSC problem as a multi-agent system has high research potential. The opportunities and research directions of multi-agent RL for TSC are studied in [3]. [4] discusses the popular RL methods in the literature from an experimental perspective. Another comprehensive TSC survey for RL methods is presented in [5]. A recent survey presented in [6] studies AI methods in ITS from a broad perspective. It considers applications of supervised learning, unsupervised learning, and RL for vehicle-to-everything communications.
Abduljabbar et al. summarize the literature on AI-based transportation applications in [7] under three main topics: transportation management applications, public transportation, and autonomous vehicles. In [8], the authors discuss TSC methods in general, including classical control methods, actuated control, green-wave, max-band systems, and RL-based control methods. Veres et al. highlight the trends and challenges of deep learning applications in ITS [9]. Deep learning models play a significant role in deep RL. Nonlinear neural networks overcome traditional challenges, such as scalability, in data-driven ITS applications. Lately, a survey of deep RL applications for autonomous vehicles was presented in [10], where the authors discuss recent works along with the challenges of real-world deployment of such RL-based autonomous driving methods. In addition to autonomous driving, in this survey we discuss a broad class of ITS applications in which deep RL is gaining popularity, together with a comprehensive overview of the deep RL concept.
There is no survey in the literature dedicated to deep RL applications for ITS, which we believe is a very timely topic in ITS research. Thus, this paper fills an important gap for ITS researchers and for deep RL researchers interested in ITS.
III. DEEP RL: AN OVERVIEW
Deep RL is one of the most successful AI models and the machine learning paradigm closest to human learning. It combines deep neural networks and RL for more efficient and stable function approximation, especially for high-dimensional and infinite-state problems. This section describes the theoretical background of traditional RL and the major deep RL algorithms implemented in ITS applications.
A. Reinforcement Learning
RL is a general learning tool in which an agent interacts with the environment to learn how to behave, without any prior knowledge, by maximizing a numerically defined reward (or minimizing a penalty). At each time step t, after taking an action, the RL agent receives feedback from the environment about the performance of its action. Using this feedback (reward or penalty), it iteratively updates its action policy to reach an optimal control policy. RL learns from experience with the environment, exhibiting a trial-and-error kind of learning similar to human learning [11]. The fundamental trade-off between exploration and exploitation in RL strikes a balance between new actions and learned actions. From a computational perspective, RL is a data-driven approach that iteratively computes an approximate solution to the optimal control policy; hence, it is also known as approximate dynamic programming [11], a type of sequential optimization problem in dynamic programming (DP).
In a general RL model, an agent controlled by an algorithm observes the system state s_t at each time step t and receives a reward r_t from its environment/system after taking the action a_t. After taking an action based on the current policy π, the system transitions to the next state s_{t+1}. After every interaction, the RL agent updates its knowledge about the environment. Fig. 1 depicts the schematic of the RL process.
Fig. 1: Reinforcement learning control loop.
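To make the control loop in Fig. 1 concrete, the following minimal Python sketch runs one episode of agent-environment interaction. The env and agent objects and their methods (reset, step, act, update) are hypothetical placeholders standing in for any concrete environment and RL algorithm; they are not part of the surveyed works.

# Minimal sketch of the RL control loop in Fig. 1 (hypothetical interfaces).
def run_episode(env, agent, max_steps=1000):
    state = env.reset()                                   # initial state s_1
    for t in range(max_steps):
        action = agent.act(state)                         # a_t from the current policy
        next_state, reward, done = env.step(action)       # environment feedback
        agent.update(state, action, reward, next_state)   # learn from (s_t, a_t, r_t, s_{t+1})
        state = next_state
        if done:                                          # terminal state s_T ends the episode
            break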
1) Markov Decision Process: RL methodology formally comes from a Markov Decision Process (MDP), a general mathematical framework for sequential decision making. An MDP consists of 5 elements in a tuple:
• A set of states S,
• A set of actions A,
• A transition function T(s_{t+1} | s_t, a_t), which maps a state-action pair at each time t to a distribution over the next state s_{t+1},
• A reward function R(s_t, a_t, s_{t+1}), which gives the reward for taking action a_t in state s_t when transitioning to the next state s_{t+1},
• A discount factor γ between 0 and 1 for future rewards.
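As an illustration of this tuple, a toy two-state MDP can be written out explicitly in Python; the state names, transition probabilities, and rewards below are invented for illustration only.

# A toy MDP written as plain data structures (illustrative values only).
states = ["low_traffic", "high_traffic"]
actions = ["keep_phase", "switch_phase"]
gamma = 0.9  # discount factor

# T[s][a] maps each next state to its transition probability.
T = {
    "low_traffic":  {"keep_phase":   {"low_traffic": 0.8, "high_traffic": 0.2},
                     "switch_phase": {"low_traffic": 0.6, "high_traffic": 0.4}},
    "high_traffic": {"keep_phase":   {"low_traffic": 0.1, "high_traffic": 0.9},
                     "switch_phase": {"low_traffic": 0.5, "high_traffic": 0.5}},
}

# R[s][a] is the expected immediate reward for taking action a in state s.
R = {
    "low_traffic":  {"keep_phase": 1.0,  "switch_phase": 0.5},
    "high_traffic": {"keep_phase": -1.0, "switch_phase": 0.2},
}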
The essential Markov property is that, given the current state s_t, the next state s_{t+1} of the system is independent of the previous states (s_0, s_1, ..., s_{t-1}). In control systems, including transportation systems, MDP models are mostly episodic, in which the system has a terminal point for each episode based on the end time T or the end state s_T. The goal of an MDP agent is to find the best policy π* that maximizes the expected cumulative reward E[R_t | s, π] for each state s, where the cumulative discounted reward (i.e., return) is

R_t = \sum_{i=0}^{T-1} \gamma^i r_{t+i},   (1)

with the discount parameter γ reflecting the importance of future rewards. Choosing a larger γ value between 0 and 1 means that the agent's actions depend more strongly on future rewards, whereas a smaller γ value results in actions that mostly care about the instantaneous reward r_t.
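For concreteness, the return in (1) can be computed from a recorded list of rewards as in the short Python sketch below; the reward values are arbitrary examples.

# Discounted return R_t = sum_i gamma^i * r_{t+i} over the remainder of an episode.
def discounted_return(rewards, gamma):
    R = 0.0
    for i, r in enumerate(rewards):
        R += (gamma ** i) * r
    return R

print(discounted_return([1.0, 0.0, -1.0, 2.0], gamma=0.9))  # rewards r_t, r_{t+1}, ...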
In general, an RL agent can act in two ways: (i) by knowing/learning the transition probability T from state s_t to s_{t+1}, which is called model-based RL, and (ii) by exploring the environment without learning a transition model, which is called model-free RL. Model-free RL algorithms are further divided into two main groups, value-based and policy-based methods. In value-based RL, the agent at each iteration updates a value function that maps each state-action pair to a value, whereas in policy-based methods the policy itself is updated at each iteration using the policy gradient [11]. We next explain the value-based and policy-based RL methods in detail.
2) Value-based RL: The value function determines how good a state is for the agent by estimating the value (i.e., expected return) of being in a given state s under a policy π as

V^\pi(s) = \mathbb{E}[R_t | s, \pi].   (2)

The optimal value function V^*(s) is the state value function maximized over all policies:

V^*(s) = \max_\pi V^\pi(s), \quad \forall s \in S.   (3)
Adding the effect of the action, the state-action value function, named the quality function (Q-function), is commonly used to reflect the expected return of a state-action pair:

Q^\pi(s, a) = \mathbb{E}[R_t | s, a, \pi].   (4)

The optimal action value function (Q-function) is obtained similarly to the optimal state value function by maximizing the expected return over all policies. The relation between the optimal state and action value functions is given by

V^*(s) = \max_a Q^*(s, a), \quad \forall s \in S.   (5)

The optimal Q-function Q^*(s, a) provides the optimal policy π* by selecting, in each state s, the action a that maximizes the Q-value:

\pi^*(s) = \arg\max_a Q^*(s, a), \quad \forall s \in S.   (6)
Based on the definitions above, there are two main value-based RL algorithms: Q-learning [12] and SARSA [13], which are classified as an off-policy and an on-policy RL algorithm, respectively. In both algorithms, the values of state-action pairs (Q-values) are stored in a Q-table and are learned via the recursive nature of the Bellman equation, utilizing the Markov property:

Q^\pi(s_t, a_t) = \mathbb{E}_\pi[r_t + \gamma Q^\pi(s_{t+1}, \pi(s_{t+1}))].   (7)
In practice, the Q^\pi estimates are updated with a learning rate α to improve the estimation as follows:

Q^\pi(s_t, a_t) \leftarrow Q^\pi(s_t, a_t) + \alpha (y_t - Q^\pi(s_t, a_t)),   (8)

where y_t is the temporal difference (TD) target for Q^\pi(s_t, a_t). The TD step size is a user-defined parameter that determines how many experience steps (i.e., actions) to consider in computing y_t, the new instantaneous estimate for Q^\pi(s_t, a_t).
The rewards R_t^{(n)} = \sum_{i=0}^{n-1} \gamma^i r_{t+i} accumulated over the predefined number of n TD steps, together with the Q-value Q^\pi(s_{t+n}, a_{t+n}) after n steps, give y_t. The difference between Q-learning and SARSA becomes clear at this stage. Q-learning is an off-policy model, in which the agent's Q-values are updated by maximizing the Q-value over the action, whereas SARSA is an on-policy model, in which the Q-values are updated according to the policy π derived from the Q-function:

y_t^{Q\text{-learning}} = R_t^{(n)} + \gamma^n \max_{a_{t+n}} Q^\pi(s_{t+n}, a_{t+n}),   (9)

y_t^{SARSA} = R_t^{(n)} + \gamma^n Q^\pi(s_{t+n}, a_{t+n}).   (10)
While Q-learning follows a greedy approach to update its Q-value estimates, SARSA follows the same policy both for updating Q-values and for taking actions. To encourage exploring new states, an ε-greedy policy is usually used for taking actions in both Q-learning and SARSA. In the ε-greedy policy, a random action is taken with probability ε, and the best action with respect to the current policy defined by Q(s, a) is taken with probability 1 − ε.
In both Q-learning and SARSA, the case with the maximum number of TD steps, typically denoted with n = ∞ to express the end of the episode, corresponds to a fully experience-based technique called Monte-Carlo RL, in which the Q-values are updated only once at the end of each episode. This means the same policy is used without any updates to take actions throughout an episode. The TD(λ) technique generalizes TD learning by averaging all TD targets with steps from 1 to ∞ with exponentially decaying weights, where λ is the decay rate [11].
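The following short Python sketch contrasts the 1-step Q-learning and SARSA updates in (9)-(10) with ε-greedy action selection. The Q-table is a plain dictionary and the states and actions are assumed to be discrete; this is an illustrative simplification rather than the formulation of any specific surveyed paper.

import random
from collections import defaultdict

Q = defaultdict(float)          # Q-table: Q[(state, action)] -> value
alpha, gamma, epsilon = 0.1, 0.95, 0.1
actions = [0, 1, 2]             # hypothetical discrete action set

def epsilon_greedy(state):
    if random.random() < epsilon:                        # explore
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])     # exploit

def q_learning_update(s, a, r, s_next):
    # Off-policy: the TD target maximizes over actions in s_next (Eq. 9 with n = 1).
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def sarsa_update(s, a, r, s_next, a_next):
    # On-policy: the TD target uses the action actually taken in s_next (Eq. 10 with n = 1).
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])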
3) Policy-based RL: Policy-based RL algorithms treat the policy π_θ(a|s) as a parameterized probability distribution over actions given the state, with parameters θ. The policy parameters θ are updated in order to maximize an objective function J(θ), such as the expected return \mathbb{E}_{\pi_\theta}[R_t | \theta] = \mathbb{E}_{\pi_\theta}[Q^{\pi_\theta}(s_t, a_t) | \theta]. The performance of policy-based methods is typically better than that of value-based methods on continuous control problems with an infinite-dimensional action space, or on other high-dimensional problems, since the policy does not require exploring all states in a large and continuous space and storing them in a table. Although there are some effective gradient-free approaches in the literature for optimizing policies in non-RL methods [14], gradient-based methods are known to be more useful for policy optimization in all types of RL algorithms.
Here, we briefly discuss policy gradient-based RL algorithms, which update the policy using the gradient of the objective function J(θ) with respect to θ, called the policy gradient. In the well-known policy gradient algorithm REINFORCE [15], the objective function is the expected return, and using the log-derivative trick \nabla_\theta \log \pi_\theta = \frac{\nabla_\theta \pi_\theta}{\pi_\theta}, the policy gradient is written as

\nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}[Q^{\pi_\theta}(s, a) \nabla_\theta \log \pi_\theta].   (11)
Since computing the entire gradient is not efficient, REINFORCE uses the popular stochastic gradient technique to approximate the gradient when updating the parameters θ. Using the return R_t at time t as an estimator of Q^{\pi_\theta}(s_t, a_t), in each Monte-Carlo iteration it performs the update

\theta \leftarrow \theta + \alpha \nabla_\theta \log \pi_\theta \, R_t,   (12)

where α is the learning rate. Specifically, θ is updated in the \nabla_\theta \log \pi_\theta direction with weight R_t. That is, if the approximate policy gradient corresponds to a high reward R_t, this gradient direction is reinforced by the algorithm while updating the parameters.
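A compact sketch of the REINFORCE update in (12) is given below for a softmax policy over a small discrete action set. The numpy-based softmax policy, the feature dimensions, and the hyperparameter values are illustrative assumptions, not the formulation used in any surveyed work.

import numpy as np

n_actions, n_features = 3, 4
theta = np.zeros((n_actions, n_features))    # policy parameters
alpha = 0.01                                  # learning rate

def policy(state_features):
    # Softmax policy pi_theta(a | s) over discrete actions.
    logits = theta @ state_features
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def reinforce_update(episode, gamma=0.99):
    # episode: list of (state_features, action, reward) tuples from one rollout.
    global theta
    returns = 0.0
    grads = []
    for s_feat, a, r in reversed(episode):
        returns = r + gamma * returns                 # return R_t from time t onward
        probs = policy(s_feat)
        grad_log = -np.outer(probs, s_feat)           # d log pi / d theta for all actions
        grad_log[a] += s_feat                         # indicator term for the taken action
        grads.append(grad_log * returns)              # weight by the return R_t (Eq. 12)
    theta += alpha * sum(grads)                       # stochastic gradient ascent step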
One problem with the Monte-Carlo policy gradient is its high variance. To reduce the variance of the policy gradient estimates, Actor-Critic algorithms use the state value function V^{\pi_\theta}(s) as a baseline. Instead of Q^{\pi_\theta}(s, a), the advantage function [16], A^{\pi_\theta}(s, a) = Q^{\pi_\theta}(s, a) - V^{\pi_\theta}(s), is used in the policy gradient:

\nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}[A^{\pi_\theta}(s, a) \nabla_\theta \log \pi_\theta].   (13)

The advantage function, being positive or negative, determines the update direction: go in the same/opposite direction of actions yielding higher/lower reward than average. The Actor-Critic method is further discussed in Section III-B3 within the deep RL discussion.
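To show how the baseline in (13) changes the REINFORCE update above, the sketch below simply replaces the raw return with an advantage estimate; it continues the previous sketch (reusing numpy, theta, alpha, policy, and n_features from it), and the linear value-function "critic" with weights w is an illustrative assumption.

# (Continues the REINFORCE sketch above: numpy, theta, alpha, policy, n_features.)
# Variance-reduced policy gradient: weight the log-gradient by an advantage
# estimate A_t = R_t - V_w(s_t) instead of the raw return (Eq. 13).
w = np.zeros(n_features)        # parameters of a simple linear critic V_w(s)
beta = 0.05                     # critic learning rate

def actor_critic_update(episode, gamma=0.99):
    global theta, w
    returns = 0.0
    for s_feat, a, r in reversed(episode):
        returns = r + gamma * returns
        value = w @ s_feat                       # critic's baseline V_w(s_t)
        advantage = returns - value              # A_t, positive or negative
        probs = policy(s_feat)
        grad_log = -np.outer(probs, s_feat)
        grad_log[a] += s_feat
        theta += alpha * advantage * grad_log    # actor step (Eq. 13)
        w += beta * (returns - value) * s_feat   # critic regression toward the return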
4) Multi-Agent RL: Many real-world problems require multiple interacting agents to maximize the learning performance. Learning with multiple agents is a challenging task, since each agent should consider the other agents' actions in order to reach a globally optimal solution. Increasing the number of agents also increases the state-action dimensions, so decomposing tasks between agents is a scalable approach for large control systems. There are two main issues with high-dimensional state and action spaces in multi-agent RL: stability and the adaptation of agents to the environment [17]. When each agent optimizes its actions without considering nearby agents, the learning problem for the overall system becomes non-stationary. There are several approaches to address this problem in multi-agent RL systems, such as distributed learning, cooperative learning, and competitive learning [17].
B. Deep Reinforcement Learning
In high-dimensional state spaces, standard RL algorithms cannot efficiently compute the value functions or policy functions for all states. Although some linear function approximation methods have been proposed for solving the large state space problem in RL, their capabilities remain limited. In high-dimensional and complex systems, standard RL approaches cannot learn informative features of the environment for effective function approximation. However, this problem can be handled by deep learning based function approximators, in which deep neural networks are trained to learn the optimal policy or value functions. Different neural network structures, such as convolutional neural networks (CNN) and recurrent neural networks (RNN), are used for training RL algorithms in large state spaces [18].
The main concept of deep learning is to extract useful patterns from data. Deep learning models are roughly inspired by the multi-layered structure of the human nervous system. Today, deep learning has applications in a wide spectrum of areas, including computer vision, speech recognition, natural language processing, and deep RL applications.
1) Deep Q-Network: Since value-based RL algorithms learn the Q-function by populating a Q-table, it is not feasible to visit all states and actions in large state spaces or continuous-action problems. The leading approach to this problem, called Deep Q-Network (DQN) [19], is to approximate the Q-function with deep neural networks. The original DQN receives raw input images as the state and estimates Q-values from them using CNNs. Denoting the neural network parameters with θ, the Q-function approximation is written as Q(s, a; θ). The output of the neural network is a discrete set of approximate action values, from which the best action is selected according to (6).
The major contribution of Mnih et al. [19] was two novel techniques that stabilize learning with deep neural networks: the target network and experience replay. The original DQN algorithm was shown to significantly outperform expert human performance on several classic Atari video games. The complete DQN algorithm with experience replay and a target network is given in Algorithm 1.
Target Network: One of the main parts of DQN that stabilizes learning is the target network. DQN has two separate networks: the main network, which approximates the Q-function, and the target network, which provides the TD target for updating the main network. In the training phase, while the main network parameters θ are updated after every action, the target network parameters θ^- are updated only after a certain period of time. The reason the target network is not updated after every iteration is that it moderates the main network updates and keeps the value estimates under control. If both networks were updated at the same time, the change in the main network would be exaggerated due to the feedback loop through the target network, resulting in an unstable network. Similar to (9), the 1-step TD target y_t is written as

y_t^{DQN} = r_t + \gamma \max_{a_{t+1}} Q^\pi(s_{t+1}, a_{t+1}; \theta^-_t),   (14)

where Q^\pi(s_{t+1}, a_{t+1}; \theta^-_t) denotes the target network.
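A minimal sketch of the target-network mechanism in (14) is shown below. The q_values helper, the parameter containers, and the sync period N are hypothetical stand-ins for whatever network architecture is actually used; the "network" here is just a linear map for brevity.

import numpy as np

def q_values(params, state):
    # Placeholder Q-network: a linear map from state features to action values.
    return params @ state

theta = np.random.randn(4, 8) * 0.01    # main network parameters (4 actions, 8 features)
theta_target = theta.copy()             # target network parameters, a frozen copy
gamma, N = 0.99, 1000                   # discount factor and target-sync period

def dqn_td_target(r, next_state, done):
    # Eq. (14): bootstrap from the *target* network, not the main one.
    if done:
        return r
    return r + gamma * np.max(q_values(theta_target, next_state))

def maybe_sync_target(step):
    # Copy main-network parameters into the target network every N steps.
    global theta_target
    if step % N == 0:
        theta_target = theta.copy()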
Experience Replay: DQN introduces another distinct feature called experience replay, which stores recent experiences (s_t, a_t, r_t, s_{t+1}) in a replay memory and samples batches uniformly from the replay memory for training the neural network. There are two main reasons why experience replay is used in DQN. Firstly, since RL agents are prone to temporal correlations in consecutive samples, random sampling prevents the agent from getting stuck in its recent trajectories. Furthermore, instead of learning over full observations, the DQN agent learns over mini-batches, which increases the efficiency of training. In a fixed-size replay memory, only the most recent M samples are stored: the oldest experience is removed to allocate space for the latest sample. The same technique is applied in other deep RL algorithms [20], [21].
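The replay memory itself is commonly just a bounded FIFO buffer with uniform sampling, as in the sketch below; the capacity and batch size are arbitrary example values.

import random
from collections import deque

class ReplayMemory:
    def __init__(self, capacity=100_000):
        # A deque with maxlen drops the oldest experience once capacity M is reached.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform random mini-batch, which breaks temporal correlations.
        return random.sample(self.buffer, batch_size)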
Prioritized Experience Replay: The experience replay technique samples experiences uniformly from the memory; however, some experiences have more impact on learning than others. A new approach that prioritizes significant experiences over others is proposed in [22] by changing the sampling distribution of the DQN algorithm. The overall idea of prioritized experience replay is that samples with higher TD error, y_t^{DQN} - Q^\pi(s_t, a_t; \theta^-_t), receive a higher sampling probability than the other samples through stochastic sampling with proportional prioritization or rank-based prioritization. The experiences are then sampled based on the assigned probabilities.
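A sketch of proportional prioritization is given below: each stored experience carries a priority derived from its absolute TD error, and sampling probabilities follow p_i^α / Σ_k p_k^α. The α exponent and the small constant added to each priority follow the standard choices in [22], but the container layout and parameter values are illustrative assumptions. A complete implementation would also correct the induced bias with importance-sampling weights, which is omitted here for brevity.

import numpy as np

class PrioritizedReplay:
    def __init__(self, capacity=100_000, alpha=0.6, eps=1e-5):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.buffer, self.priorities = [], []

    def store(self, experience, td_error):
        if len(self.buffer) >= self.capacity:     # drop the oldest experience
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(experience)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, batch_size=32):
        # Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha.
        p = np.array(self.priorities) ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.buffer), size=batch_size, p=probs)
        return [self.buffer[i] for i in idx]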
Algorithm 1 DQN algorithm
1: Input: replay memory size M, batch size d, number of episodes E, and number of time steps T
2: Initialize main network weights θ
3: Initialize target network weights θ^-
4: Initialize replay memory
5: for e = 1, . . . , E do
6:   Initialize state s_1 and action a_1
7:   for t = 1, . . . , T do
8:     Take action a_t = argmax_a Q^π(s_t, a; θ) with probability 1 − ε, or a random action with probability ε
9:     Get reward r_t and observe next state s_{t+1}
10:    if the replay memory capacity M is full then
11:      Delete the oldest tuple in memory
12:    end if
13:    Store the tuple (s_t, a_t, r_t, s_{t+1}) in the replay memory
14:    Sample d random tuples from the replay memory
15:    y_t = r_t if t = T; otherwise y_t = r_t + γ max_a Q^π(s_{t+1}, a; θ^-_t)
16:    Perform a gradient descent step on (y_t − Q^π(s_t, a_t; θ))^2 to update θ
17:    Update the target network every N steps: θ^- = θ
18:  end for
19: end for
2) Double Dueling DQN: DQN is an improved version of the standard Q-learning algorithm with a single estimator. Both DQN and Q-learning overestimate some actions due to using a single Q-function estimate. The authors in [23] propose doubling the estimators, performing action selection with the main network and action evaluation with the target network separately in the loss minimization, similar to the tabular double Q-learning technique [24]. Instead of selecting the Q-value that maximizes the future reward using the target network (see Eq. (14)), the double DQN network selects the action using the main network and evaluates it using the target network. Action selection is decoupled from the target network for better Q-value estimation:

y_t^{DDQN} = r_t + \gamma Q^\pi(s_{t+1}, \arg\max_{a_{t+1}} Q^\pi(s_{t+1}, a_{t+1}; \theta); \theta^-_t).   (15)
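Reusing the q_values helper and the theta, theta_target, and gamma variables from the target-network sketch above, the double DQN target in (15) differs from (14) only in which network picks the action, as in the following sketch.

def double_dqn_td_target(r, next_state, done):
    # Eq. (15): the *main* network selects the action, the *target* network evaluates it.
    if done:
        return r
    best_action = int(np.argmax(q_values(theta, next_state)))           # selection: theta
    return r + gamma * q_values(theta_target, next_state)[best_action]  # evaluation: theta^-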
Another improved version of DQN is the dueling network architecture, which estimates the state value function V^\pi(s) and the advantage function A^\pi(s, a) in two separate streams [25]. The outputs of these two streams are combined through an aggregation layer to produce a Q-value for a discrete set of actions. In this way, the dueling DQN learns the important state values independently of their corresponding effects on the actions, since the state value function V^\pi(s) is an action-free estimate.
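The aggregation layer is commonly implemented as Q(s, a) = V(s) + A(s, a) − mean_a A(s, a), which keeps the value/advantage decomposition identifiable. The sketch below assumes the two stream outputs are already available as a numpy scalar and vector, which is an illustrative simplification.

import numpy as np

def dueling_aggregation(state_value, advantages):
    # Combine the V-stream (scalar) and A-stream (one entry per action) into Q-values.
    # Subtracting the mean advantage keeps the V/A decomposition identifiable.
    return state_value + (advantages - np.mean(advantages))

# Example with made-up stream outputs for a 3-action problem:
# q = dueling_aggregation(1.2, np.array([0.3, -0.1, 0.5]))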
These two improvements to the DQN algorithm, doubling and dueling, combined with prioritized experience replay, are accepted as the state-of-