
# Latest Survey Paper: "Deep Reinforcement Learning for Intelligent Transportation Systems"


The latest technological advances have improved the quality of transportation. New data-driven approaches open up new research directions for all control-based systems, such as traffic, robotics, IoT, and power systems.


Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey

Ammar Haydari, Student Member, IEEE, Yasin Yilmaz, Member, IEEE

arXiv:2005.00935v1 [cs.LG] 2 May 2020

Abstract—Latest technological improvements increased the quality of transportation. New data-driven approaches bring out a new research direction for all control-based systems, e.g., in transportation, robotics, IoT, and power systems. Combining data-driven applications with transportation systems plays a key role in recent transportation applications. In this paper, the latest deep reinforcement learning (RL) based traffic control applications are surveyed. Specifically, traffic signal control (TSC) applications based on (deep) RL, which have been studied extensively in the literature, are discussed in detail. Different problem formulations, RL parameters, and simulation environments for TSC are discussed comprehensively. In the literature, there are also several autonomous driving applications studied with deep RL models. Our survey extensively summarizes existing works in this field by categorizing them with respect to application types, control models, and studied algorithms. In the end, we discuss the challenges and open questions regarding deep RL-based transportation applications.

Index Terms—Deep reinforcement learning, Intelligent transportation systems, Traffic signal control, Autonomous driving, Multi-agent systems.

I. INTRODUCTION

With increasing urbanization and the latest advances in autonomous technologies, transportation studies have evolved into more intelligent systems, called intelligent transportation systems (ITS). Artificial intelligence (AI) tries to control systems with minimal human intervention. The combination of ITS and AI provides effective solutions for 21st-century transportation studies. The main goal of ITS is to provide safe, effective, and reliable transportation systems to participants. For this purpose, optimal traffic signal control (TSC), autonomous vehicle control, and traffic flow control are some of the key research areas.

Future transportation systems are expected to include full autonomy, such as autonomous traffic management and autonomous driving. Even now, semi-autonomous vehicles occupy the roads, and the level of autonomy is likely to increase in the near future. There are several reasons why authorities want autonomy in ITS, such as time savings for drivers, energy savings for the environment, and safety for all participants. Travel time savings can be provided by coordinated and connected traffic systems that can be controlled more efficiently using autonomous systems. When vehicles spend more time in traffic, fuel consumption increases, which has environmental and economic impacts. Another reason human intervention should be minimized is the unpredictable nature of human behavior. It is expected that autonomous driving will decrease traffic accidents and increase the quality of transportation. For all the reasons stated above, there is high demand for various aspects of autonomous control in ITS. One popular approach is to use experience-based learning models, similar to human learning.

The growing population in urban areas causes a high volume of traffic; for example, the annual congestion cost for a driver in the US was 97 hours and $1,348 in 2018 [1]. Hence, controlling traffic lights with adaptive modules is a recent research focus in ITS. Designing an adaptive traffic management system through traffic signals is an effective solution for reducing traffic congestion. The best approach for optimizing traffic lights is still an open question for researchers, but one promising approach for optimal TSC is to use learning-based AI techniques.

There are three main machine learning paradigms. Supervised learning makes decisions based on output labels provided in training. Unsupervised learning works based on pattern discovery, without pre-knowledge of output labels. The third machine learning paradigm is reinforcement learning (RL), which takes sequential actions rooted in the Markov Decision Process (MDP) framework with a rewarding or penalizing criterion. RL combined with deep learning, named deep RL, is currently accepted as the state-of-the-art learning framework in control systems. While RL can solve complex control problems, deep learning helps to approximate highly nonlinear functions from complex datasets.

Recently, many deep RL based solution methods have been presented for different ITS applications. There is an increasing interest in RL based control mechanisms in ITS applications such as traffic management systems and autonomous driving. Gathering all the data-driven ITS studies related to deep RL and discussing such applications together in one paper is needed for informing ITS researchers about deep RL, as well as deep RL researchers about ITS.

In this paper, we review the deep RL applications proposed for ITS problems, predominantly for TSC. Different RL approaches from the literature are discussed. TSC solutions based on standard RL techniques had already been studied before the invention of deep RL. Hence, we believe standard RL techniques also have high importance in reviewing the deep RL solutions for ITS, in particular TSC. Since traffic intersection models are mainly connected and distributed, multi-agent dynamic control techniques, which are also extensively covered in this survey, play a key role in RL-based ITS applications.

A. Contributions

This paper presents a comprehensive survey on deep RL applications for ITS by discussing a theoretical overview of deep RL, different problem formulations for TSC, various deep RL applications for TSC and other ITS topics, and finally the challenges with future research directions. The targeted audience are ITS researchers who want a jump start in learning deep RL techniques, as well as deep RL researchers who are interested in ITS applications. We also believe that this survey will serve as “a compact handbook of deep RL in ITS” for more experienced researchers to review the existing methods and open challenges. Our contributions can be summarized as follows.

• The first comprehensive survey of RL and deep RL based applications in ITS is presented.
• From a broad perspective, the theoretical background of RL and deep RL models, especially those used in the ITS literature, is explained.
• Existing works in TSC that use RL and deep RL are discussed and clearly summarized in tables for appropriate comparisons.
• Similarly, different deep RL applications in other ITS areas, such as autonomous driving, are presented and summarized in a table for comparison.

B. Organization

The paper is organized as follows.

• Section II: Related Work

• Section III: Deep RL: An Overview

– Section III-A: Reinforcement Learning

– Section III-B: Deep Reinforcement Learning

– Section III-C: Summary of Deep RL

• Section IV: Deep RL Settings for TSC

– Section IV-A: State

– Section IV-B: Action

– Section IV-C: Reward

– Section IV-D: Neural Network Structure

– Section IV-E: Simulation Environments

• Section V: Deep RL Applications for TSC

– Section V-A: Standard RL Applications

– Section V-B: Deep RL Applications

• Section VI: Deep RL for Other ITS Applications

– Section VI-A: Autonomous Driving

– Section VI-B: Energy Management

– Section VI-C: Road Control

– Section VI-D: Various ITS Applications

• Section VII: Challenges and Open Research Questions

II. RELATED WORK

The earliest work summarizing AI models for TSC, including RL and other approaches, dates back to 2007 [2]. At that time, fuzzy logic, artificial neural networks, and RL were the three main popular AI methods researchers applied to TSC. Due to the connectedness of ITS components, such as intersections, multi-agent models provide a more complete and realistic solution than single-agent models. Hence, formulating the TSC problem as a multi-agent system has high research potential. The opportunities and research directions of multi-agent RL for TSC are studied in [3]. [4] discusses the popular RL methods in the literature from an experimental perspective. Another comprehensive TSC survey for RL methods is presented in [5]. A recent survey presented in [6] studies AI methods in ITS from a broad perspective. It considers applications of supervised learning, unsupervised learning, and RL for vehicle-to-everything communications.

Abduljabbar et al. summarize the literature of AI based transportation applications in [7] under three main topics: transportation management applications, public transportation, and autonomous vehicles. In [8], the authors discuss TSC methods in general, including classical control methods, actuated control, green-wave, max-band systems, and RL based control methods. Veres et al. highlight the trends and challenges of deep learning applications in ITS [9]. Deep learning models play a significant role in deep RL; nonlinear neural networks overcome traditional challenges such as scalability in data-driven ITS applications. Lately, a survey of deep RL applications for autonomous vehicles was presented in [10], where the authors discuss recent works together with the challenges of real-world deployment of such RL-based autonomous driving methods. In addition to autonomous driving, in this survey we discuss a broad class of ITS applications where deep RL is gaining popularity, together with a comprehensive overview of the deep RL concept.

There is no survey in the literature dedicated to deep RL applications for ITS, which we believe is a very timely topic in ITS research. Thus, this paper fills an important gap for ITS researchers and deep RL researchers interested in ITS.

III. DEEP RL: AN OVERVIEW

Deep RL is one of the most successful AI models and the closest machine learning paradigm to human learning. It combines deep neural networks and RL for more efficient and stable function approximation, especially for high-dimensional and infinite-state problems. This section describes the theoretical background of traditional RL and the major deep RL algorithms implemented in ITS applications.

A. Reinforcement Learning

RL is a general learning tool in which an agent interacts with an environment to learn how to behave, without any prior knowledge, by learning to maximize a numerically defined reward (or to minimize a penalty). After taking an action, the RL agent receives feedback from the environment at each time step t about the performance of its action. Using this feedback (reward or penalty), it iteratively updates its action policy to reach an optimal control policy. RL learns from experiences with the environment, exhibiting a trial-and-error kind of learning, similar to human learning [11]. The fundamental trade-off between exploration and exploitation in RL strikes a balance between new actions and learned actions. From a computational perspective, RL is a data-driven approach which iteratively computes an approximate solution to the optimal control policy. Hence, it is also known as approximate dynamic programming [11], one type of sequential optimization problem for dynamic programming (DP).

In a general RL model, an agent controlled by an algorithm observes the system state $s_t$ at each time step $t$ and receives a reward $r_t$ from its environment/system after taking the action $a_t$. After taking an action based on the current policy $\pi$, the system transitions to the next state $s_{t+1}$. After every interaction, the RL agent updates its knowledge about the environment. Fig. 1 depicts the schematic of the RL process.

Fig. 1: Reinforcement learning control loop: the agent takes action $a_t$, and the environment returns the next state $s_t$ and the reward $r_t$.
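To make the loop in Fig. 1 concrete, the short Python sketch below runs one episode of agent-environment interaction; the environment object and its Gym-style reset/step interface, as well as the agent's act/update methods, are assumptions for illustration rather than any specific API from the paper.

```python
def run_episode(env, agent, max_steps=1000):
    """One episode of the agent-environment loop sketched in Fig. 1."""
    state = env.reset()
    for t in range(max_steps):
        action = agent.act(state)                        # a_t from the current policy
        next_state, reward, done, _ = env.step(action)   # environment returns s_{t+1} and r_t
        agent.update(state, action, reward, next_state)  # agent refines its policy from feedback
        state = next_state
        if done:
            break
```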

1) Markov Decision Process: RL methodology formally comes from a Markov Decision Process (MDP), which is a general mathematical framework for sequential decision-making algorithms. An MDP consists of five elements in a tuple:

• A set of states $S$,
• A set of actions $A$,
• A transition function $T(s_{t+1} \mid s_t, a_t)$, which maps a state-action pair at each time $t$ to a distribution over the next state $s_{t+1}$,
• A reward function $R(s_t, a_t, s_{t+1})$, which gives the reward for taking action $a_t$ from state $s_t$ when transitioning to the next state $s_{t+1}$,
• A discount factor $\gamma$ between 0 and 1 for future rewards.

The essential Markov property is that, given the current state $s_t$, the next state $s_{t+1}$ of the system is independent of the previous states $(s_0, s_1, \ldots, s_{t-1})$. In control systems, including transportation systems, MDP models are mostly episodic, in which the system has a terminal point for each episode based on the end time $T$ or the end state $s_T$. The goal of an MDP agent is to find the best policy $\pi^{*}$ that maximizes the expected cumulative reward $E[R_t \mid s, \pi]$ for each state $s$, where the cumulative discounted reward (i.e., return) is

$$R_t = \sum_{i=0}^{T-1} \gamma^{i} r_{t+i}, \qquad (1)$$

with the discount parameter $\gamma$ reflecting the importance of future rewards. Choosing a larger $\gamma$ value between 0 and 1 means that the agent's actions have a higher dependency on future rewards, whereas a smaller $\gamma$ value results in actions that mostly care about the instantaneous reward $r_t$.
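To make Eq. (1) concrete, the following Python sketch computes the discounted return of an episode from its reward sequence; the reward values and $\gamma$ are arbitrary illustrative choices, not values from the paper.

```python
def discounted_return(rewards, gamma=0.9):
    """Compute R_t = sum_i gamma^i * r_{t+i} for t = 0, as in Eq. (1)."""
    R = 0.0
    # iterate backwards so each step folds in the already-discounted tail
    for r in reversed(rewards):
        R = r + gamma * R
    return R

# example: a three-step episode with arbitrary rewards
print(discounted_return([1.0, 0.0, 2.0], gamma=0.9))  # 1.0 + 0.9*0.0 + 0.81*2.0 = 2.62
```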

In general, an RL agent can act in two ways: (i) by knowing/learning the transition probability $T$ from state $s_t$ to $s_{t+1}$, which is called model-based RL, and (ii) by exploring the environment without learning a transition model, which is called model-free RL. Model-free RL algorithms are further divided into two main groups: value-based and policy-based methods. While in value-based RL the agent at each iteration updates a value function that maps each state-action pair to a value, in policy-based methods the policy is updated at each iteration using the policy gradient [11]. We next explain the value-based and policy-based RL methods in detail.

2) Value-based RL: The value function determines how good a state is for the agent by estimating the value (i.e., expected return) of being in a given state $s$ under a policy $\pi$ as

$$V^{\pi}(s) = E[R_t \mid s, \pi]. \qquad (2)$$

The optimum value function $V^{*}(s)$ describes the maximized state value function over the policy for all states:

$$V^{*}(s) = \max_{\pi} V^{\pi}(s), \quad \forall s \in S. \qquad (3)$$

Adding the effect of the action, the state-action value function, named the quality function (Q-function), is commonly used to reflect the expected return of a state-action pair:

$$Q^{\pi}(s, a) = E[R_t \mid s, a, \pi]. \qquad (4)$$

The optimum action value function (Q-function) is calculated similarly to the optimum state value function by maximizing its expected return over all states. The relation between the optimum state and action value functions is given by

$$V^{*}(s) = \max_{a} Q^{*}(s, a), \quad \forall s \in S. \qquad (5)$$

The Q-function $Q^{*}(s, a)$ provides the optimum policy $\pi^{*}$ by selecting the action $a$ that maximizes the Q-value for the state $s$:

$$\pi^{*}(s) = \operatorname*{argmax}_{a} Q^{*}(s, a), \quad \forall s \in S. \qquad (6)$$

Based on the definitions above, there are two main value-based RL algorithms: Q-learning [12] and SARSA [13], which are classified as an off-policy RL algorithm and an on-policy RL algorithm, respectively. In both algorithms, the values of state-action pairs (Q-values) are stored in a Q-table, and are learned via the recursive nature of the Bellman equations utilizing the Markov property:

$$Q^{\pi}(s_t, a_t) = E_{\pi}\big[r_t + \gamma Q^{\pi}\big(s_{t+1}, \pi(s_{t+1})\big)\big]. \qquad (7)$$

In practice, $Q^{\pi}$ estimates are updated with a learning rate $\alpha$ to improve the estimation as follows:

$$Q^{\pi}(s_t, a_t) \leftarrow Q^{\pi}(s_t, a_t) + \alpha\big(y_t - Q^{\pi}(s_t, a_t)\big), \qquad (8)$$

where $y_t$ is the temporal difference (TD) target for $Q^{\pi}(s_t, a_t)$. The TD step size is a user-defined parameter and determines how many experience steps (i.e., actions) to consider in computing $y_t$, the new instantaneous estimate for $Q^{\pi}(s_t, a_t)$.

The rewards $R_t^{(n)} = \sum_{i=0}^{n-1} \gamma^{i} r_{t+i}$ over the predefined number of $n$ TD steps, together with the Q-value $Q^{\pi}(s_{t+n}, a_{t+n})$ after $n$ steps, give $y_t$. The difference between Q-learning and SARSA becomes clear at this stage. Q-learning is an off-policy model, in which the actions of the agent are updated by maximizing the Q-value over the action, whereas SARSA is an on-policy model, in which the actions of the agent are updated according to the policy $\pi$ derived from the Q-function:

$$y_t^{\mathrm{Q\text{-}learning}} = R_t^{(n)} + \gamma^{n} \max_{a_{t+n}} Q^{\pi}(s_{t+n}, a_{t+n}), \qquad (9)$$

$$y_t^{\mathrm{SARSA}} = R_t^{(n)} + \gamma^{n} Q^{\pi}(s_{t+n}, a_{t+n}). \qquad (10)$$

While Q-learning follows a greedy approach to update its Q-value estimates, SARSA follows the same policy for both updating Q-values and taking actions. To encourage exploring new states, usually an ε-greedy policy is used for taking actions in both Q-learning and SARSA. In the ε-greedy policy, a random action is taken with probability ε, and the best action with respect to the current policy defined by $Q(s, a)$ is taken with probability $1 - \epsilon$.

In both Q-learning and SARSA, the case with maximum TD steps, typically denoted with $n = \infty$ to express the end of an episode, corresponds to a fully experience-based technique called Monte-Carlo RL, in which the Q-values are updated only once at the end of each episode. This means the same policy is used without any updates to take actions throughout an episode. The TD(λ) technique generalizes TD learning by averaging all TD targets with steps from 1 to ∞ with exponentially decaying weights, where λ is the decay rate [11].
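As an illustration of the one-step ($n = 1$) targets in Eqs. (9)-(10) and the ε-greedy policy described above, the sketch below implements tabular Q-learning and SARSA updates in Python; the dictionary-based Q-table and all hyperparameter values are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

def epsilon_greedy(Q, state, actions, eps=0.1):
    """Pick a random action with probability eps, otherwise the greedy action."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    """Off-policy one-step target (Eq. (9)): bootstrap with the max over next actions."""
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.95):
    """On-policy one-step target (Eq. (10)): bootstrap with the action actually taken."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])

# Q-table with default value 0.0 for unseen state-action pairs
Q = defaultdict(float)
```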

3) Policy-based RL: Policy-based RL algorithms treat the policy $\pi_{\theta}$ as a probability distribution over state-action pairs, parameterized by $\theta$. The policy parameters $\theta$ are updated in order to maximize an objective function $J(\theta)$, such as the expected return $E_{\pi_{\theta}}[R_t \mid \theta] = E_{\pi_{\theta}}[Q^{\pi_{\theta}}(s_t, a_t) \mid \theta]$. The performance of policy-based methods is typically better than that of value-based methods on continuous control problems with infinite-dimensional action spaces or on high-dimensional problems, since the policy does not need to explore all the states in a large and continuous space and store them in a table. Although there are some effective gradient-free approaches in the literature for optimizing policies in non-RL methods [14], gradient-based methods are known to be more useful for policy optimization in all types of RL algorithms.

Here, we briefly discuss the policy gradient-based RL algorithms, which select actions using the gradient of the objective function $J(\theta)$ with respect to $\theta$, called the policy gradient. In the well-known policy gradient algorithm REINFORCE [15], the objective function is the expected return, and using the log-derivative trick $\nabla \log \pi_{\theta} = \frac{\nabla \pi_{\theta}}{\pi_{\theta}}$ the policy gradient is written as

$$\nabla_{\theta} J(\theta) = E_{\pi_{\theta}}\big[Q^{\pi_{\theta}}(s, a)\, \nabla_{\theta} \log \pi_{\theta}\big]. \qquad (11)$$

Since computing the entire gradient is not efficient, REINFORCE uses the popular stochastic gradient descent technique to approximate the gradient when updating the parameters $\theta$. Using the return $R_t$ at time $t$ as an estimator of $Q^{\pi_{\theta}}(s_t, a_t)$, in each Monte-Carlo iteration it performs the update

$$\theta \leftarrow \theta + \alpha \nabla_{\theta} \log \pi_{\theta}\, R_t, \qquad (12)$$

where $\alpha$ is the learning rate. Specifically, $\theta$ is updated in the $\nabla_{\theta} \log \pi_{\theta}$ direction with weight $R_t$. That is, if the approximate policy gradient corresponds to a high reward $R_t$, this gradient direction is reinforced by the algorithm while updating the parameters.
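A minimal REINFORCE sketch following Eq. (12), written in PyTorch under the assumption of a small softmax policy network and an episodic environment with a Gym-style reset/step interface; the network sizes, learning rate, and $\gamma$ are illustrative choices, not the paper's settings.

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))  # assumed: state dim 4, 2 actions
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
gamma = 0.99

def reinforce_episode(env):
    """Run one episode, then apply theta <- theta + alpha * grad log pi * R_t (Eq. (12))."""
    log_probs, rewards = [], []
    state, done = env.reset(), False
    while not done:
        logits = policy(torch.as_tensor(state, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        state, reward, done, _ = env.step(action.item())
        rewards.append(reward)
    # Monte-Carlo return R_t for every time step of the episode
    returns, R = [], 0.0
    for r in reversed(rewards):
        R = r + gamma * R
        returns.insert(0, R)
    returns = torch.as_tensor(returns)
    # gradient ascent on sum_t log pi(a_t|s_t) * R_t, i.e. descent on its negative
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```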

One problem with the Monte-Carlo policy gradient is its high variance. To reduce the variance in policy gradient estimates, Actor-Critic algorithms use the state value function $V^{\pi_{\theta}}(s)$ as a baseline. Instead of $Q^{\pi_{\theta}}(s, a)$, the advantage function [16] $A^{\pi_{\theta}}(s, a) = Q^{\pi_{\theta}}(s, a) - V^{\pi_{\theta}}(s)$ is used in the policy gradient:

$$\nabla_{\theta} J(\theta) = E_{\pi_{\theta}}\big[A^{\pi_{\theta}}(s, a)\, \nabla_{\theta} \log \pi_{\theta}\big]. \qquad (13)$$

The advantage function, being positive or negative, determines the update direction: go in the same/opposite direction of actions yielding higher/lower reward than average. The Actor-Critic method is further discussed in Section III-B3 within the deep RL discussion.
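A sketch of a single actor-critic update for Eq. (13): the critic $V(s)$ serves as the baseline and the resulting advantage weights the policy gradient. It reuses the PyTorch style of the previous sketch; the one-step advantage estimate and the layer sizes are illustrative assumptions, not the paper's formulation.

```python
import torch
import torch.nn as nn

actor = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))   # pi_theta(a|s)
critic = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))  # V(s) baseline
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-3)
gamma = 0.99

def actor_critic_step(s, a, r, s_next, done):
    s = torch.as_tensor(s, dtype=torch.float32)
    s_next = torch.as_tensor(s_next, dtype=torch.float32)
    v, v_next = critic(s), critic(s_next).detach()
    # one-step advantage estimate: A(s, a) ~ r + gamma * V(s') - V(s)
    advantage = r + gamma * v_next * (1.0 - float(done)) - v
    log_prob = torch.distributions.Categorical(logits=actor(s)).log_prob(torch.as_tensor(a))
    actor_loss = -log_prob * advantage.detach()   # policy gradient weighted by the advantage (Eq. (13))
    critic_loss = advantage.pow(2)                # fit V(s) toward the one-step TD target
    (actor_loss + critic_loss).backward()
    opt.step()
    opt.zero_grad()
```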

4) Multi-Agent RL: Many real-world problems require multiple interacting agents to maximize the learning performance. Learning with multiple agents is a challenging task, since each agent should consider the other agents' actions to reach a globally optimal solution. Increasing the number of agents also increases the state-action dimensions, so decomposing the tasks between agents is a scalable approach for large control systems. In terms of states and actions, there are two main issues with high-dimensional systems in multi-agent RL: stability and adaptation of agents to the environment [17]. When each agent optimizes its actions without considering nearby agents, the optimal learning for the overall system becomes non-stationary. There are several approaches to address this problem in multi-agent RL systems, such as distributed learning, cooperative learning, and competitive learning [17].

B. Deep Reinforcement Learning

In high-dimensional state spaces, standard RL algorithms cannot efficiently compute the value functions or policy functions for all states. Although some linear function approximation methods have been proposed for solving the large state space problem in RL, their capabilities are still limited. In high-dimensional and complex systems, standard RL approaches cannot learn informative features of the environment for effective function approximation. However, this problem can be handled by deep learning based function approximators, in which deep neural networks are trained to learn the optimal policy or value functions. Different neural network structures, such as convolutional neural networks (CNN) and recurrent neural networks (RNN), are used for training RL algorithms in large state spaces [18].
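For intuition on what approximating the Q-function with a neural network means, the following minimal PyTorch sketch defines $Q(s, \cdot; \theta)$ as a feedforward network mapping a state vector to one Q-value per discrete action; the input/output sizes are arbitrary placeholders, and a CNN front-end would replace the first layers for image states as in the original DQN.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Q(s, .; theta): maps a state vector to one Q-value per discrete action."""
    def __init__(self, state_dim=8, num_actions=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, state):
        return self.net(state)

q = QNetwork()
state = torch.randn(1, 8)               # a dummy state vector
greedy_action = q(state).argmax(dim=1)  # greedy action selection as in Eq. (6)
```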

The main concept of deep learning is to extract useful patterns from data. Deep learning models are roughly inspired by the multi-layered structure of the human neural system. Today, deep learning has applications in a wide spectrum of areas, including computer vision, speech recognition, natural language processing, and deep RL applications.

1) Deep Q-Network: Since value-based RL algorithms learn the Q-function by populating a Q-table, it is not feasible to visit all the states and actions in large state space and continuous action problems. The leading approach to this problem, called Deep Q-Network (DQN) [19], is to approximate the Q-function with deep neural networks. The original DQN receives raw input images as the state and estimates Q-values from them using CNNs. Denoting the neural network parameters with $\theta$, the Q-function approximation is written as $Q(s, a; \theta)$. The output of the neural network is the best action selected according to (6) using a discrete set of approximate action values.

The major contribution of Mnih et al. [19] was two novel techniques to stabilize learning with deep neural networks: the target network and experience replay. The original DQN algorithm was shown to significantly outperform expert human performance on several classic Atari video games. The complete DQN algorithm with experience replay and target network is given in Algorithm 1.

Target Network: One of the main parts of DQN that stabilizes learning is the target network. DQN has two separate networks: the main network, which approximates the Q-function, and the target network, which gives the TD target for updating the main network. In the training phase, while the main network parameters $\theta$ are updated after every action, the target network parameters $\theta^{-}$ are updated only after a certain period of time. The reason why the target network is not updated after every iteration is that it moderates the main network updates and keeps the value estimations under control. If both networks were updated at the same time, the change in the main network would be exaggerated due to the feedback loop through the target network, resulting in an unstable network. Similar to (9), the 1-step TD target $y_t$ is written as

$$y_t^{\mathrm{DQN}} = r_t + \gamma \max_{a_{t+1}} Q^{\pi}(s_{t+1}, a_{t+1}; \theta_t^{-}), \qquad (14)$$

where $Q^{\pi}(s_{t+1}, a_{t+1}; \theta_t^{-})$ denotes the target network.
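The following sketch shows how the TD target of Eq. (14) and the corresponding squared TD error could be computed from a frozen target network, continuing the QNetwork sketch above; the batch tensors are assumed to come from a replay memory, and the function signature is an illustrative assumption rather than a reference implementation.

```python
import torch
import torch.nn.functional as F

def dqn_loss(main_net, target_net, batch, gamma=0.99):
    """Squared TD error against the target of Eq. (14), using the frozen target network."""
    states, actions, rewards, next_states, dones = batch        # tensors sampled from replay memory
    with torch.no_grad():                                        # theta^- receives no gradient here
        next_q = target_net(next_states).max(dim=1).values      # max_a' Q(s_{t+1}, a'; theta^-)
        y = rewards + gamma * (1.0 - dones) * next_q            # TD target, zeroed at terminal states
    q_sa = main_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)  # Q(s_t, a_t; theta)
    return F.mse_loss(q_sa, y)
```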

Experience Replay: DQN introduces another distinct feature called experience replay, which stores recent experiences $(s_t, a_t, r_t, s_{t+1})$ in a replay memory and samples batches uniformly from the replay memory for training the neural network. There are two main reasons why experience replay is used in DQN. First, it prevents the agent from getting stuck in recent trajectories through random sampling, since RL agents are prone to temporal correlations in consecutive samples. Furthermore, instead of learning over full observations, the DQN agent learns over mini-batches, which increases the efficiency of training. In a fixed-size memory defined for experience replay, the memory stores only the most recent M samples, removing the oldest experience to allocate space for the latest sample. The same technique is applied in other deep RL algorithms [20], [21].
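A minimal replay memory matching the description above: a fixed-size buffer that drops the oldest experience when full and returns uniform random mini-batches. The capacity and batch size are arbitrary example values.

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-size buffer of (s, a, r, s_next, done) tuples with uniform sampling."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)   # deque drops the oldest item once capacity is reached

    def push(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sample(self, batch_size=32):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```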

Prioritized Experience Replay: The experience replay technique samples experiences uniformly from the memory; however, some experiences have more impact on learning than others. A new approach prioritizing significant experiences over others is proposed in [22] by changing the sampling distribution of the DQN algorithm. The overall idea of prioritized experience replay is that samples with a higher TD error, $y_t^{\mathrm{DQN}} - Q^{\pi}(s_t, a_t; \theta_t^{-})$, receive a higher ranking in terms of sampling probability than the other samples, by applying stochastic sampling with proportional prioritization or rank-based prioritization. The experiences are then sampled based on the assigned probabilities.
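To illustrate proportional prioritization, the sketch below draws replay indices with probability proportional to a power of the absolute TD error; real implementations typically use a sum-tree for efficiency and add importance-sampling weights, both omitted here, and the exponent value is an illustrative assumption.

```python
import numpy as np

def sample_prioritized(td_errors, batch_size=32, alpha=0.6, eps=1e-5):
    """Sample replay indices with probability proportional to |TD error|^alpha."""
    priorities = (np.abs(td_errors) + eps) ** alpha   # eps keeps zero-error samples reachable
    probs = priorities / priorities.sum()
    return np.random.choice(len(td_errors), size=batch_size, p=probs, replace=True)

# example: transitions with larger TD error are drawn more often
idx = sample_prioritized(np.array([0.1, 2.0, 0.05, 1.5]), batch_size=4)
```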

Algorithm 1 DQN algorithm

1: Input: replay memory size M, batch size d, number of episodes E, and number of time steps T
2: Initialize main network weights θ
3: Initialize target network weights $\theta^{-}$
4: Initialize replay memory
5: for e = 1, . . . , E do
6:   Initialize state $s_1$ and action $a_1$
7:   for t = 1, . . . , T do
8:     Take action $a_t = \operatorname{argmax}_a Q^{\pi}(s_t, a; \theta)$ with probability $1 - \epsilon$, or a random action with probability ε
9:     Get reward $r_t$ and observe the next state $s_{t+1}$
10:    if replay memory capacity M is full then
11:      Delete the oldest tuple in memory
12:    end if
13:    Store the tuple $(s_t, a_t, r_t, s_{t+1})$ in the replay memory
14:    Sample d random tuples from the replay memory
15:    Set $y_t = r_t$ if $t = T$; otherwise $y_t = r_t + \gamma \max_{a} Q^{\pi}(s_{t+1}, a; \theta_t^{-})$
16:    Perform a gradient descent step on $\big(y_t - Q^{\pi}(s_t, a_t; \theta)\big)^2$ to update θ
17:    Update the target network every N steps: $\theta^{-} \leftarrow \theta$
18:  end for
19: end for
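Putting the pieces together, a compact training loop that mirrors Algorithm 1, reusing the QNetwork, ReplayMemory, and dqn_loss sketches above; the Gym-style environment interface, exploration schedule, and hyperparameters are assumptions chosen for illustration, not the paper's settings.

```python
import copy
import random
import torch

def train_dqn(env, episodes=200, T=500, batch_size=32, eps=0.1, sync_every=100):
    main_net = QNetwork()                    # Q(s, a; theta)
    target_net = copy.deepcopy(main_net)     # Q(s, a; theta^-)
    memory = ReplayMemory()
    opt = torch.optim.Adam(main_net.parameters(), lr=1e-3)
    step = 0
    for episode in range(episodes):
        s = env.reset()
        for t in range(T):
            # epsilon-greedy action selection (Algorithm 1, step 8)
            if random.random() < eps:
                a = env.action_space.sample()
            else:
                a = main_net(torch.as_tensor(s, dtype=torch.float32)).argmax().item()
            s_next, r, done, _ = env.step(a)
            memory.push(s, a, r, s_next, float(done))   # steps 10-13
            s = s_next
            if len(memory) >= batch_size:               # steps 14-16
                batch = memory.sample(batch_size)
                tensors = [torch.as_tensor(x, dtype=torch.float32) for x in zip(*batch)]
                states, actions, rewards, next_states, dones = tensors
                loss = dqn_loss(main_net, target_net,
                                (states, actions.long(), rewards, next_states, dones))
                opt.zero_grad()
                loss.backward()
                opt.step()
            step += 1
            if step % sync_every == 0:                  # periodic target update (step 17)
                target_net.load_state_dict(main_net.state_dict())
            if done:
                break
```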

2) Double Dueling DQN: DQN is an improved version of the standard Q-learning algorithm with a single estimator. Both DQN and Q-learning overestimate some actions due to having a single Q-function estimator. The authors in [23] propose doubling the estimators, with action selection by the main network and action evaluation by the target network handled separately in the loss minimization, similar to the tabular double Q-learning technique [24]. Instead of selecting the Q-value that maximizes the future reward using the target network (see Eq. (14)), the double DQN network selects the action using the main network and evaluates it using the target network. Action selection is decoupled from the target network for better Q-value estimation:

$$y_t^{\mathrm{DDQN}} = r_t + \gamma\, Q^{\pi}\Big(s_{t+1}, \operatorname*{argmax}_{a_{t+1}} Q^{\pi}(s_{t+1}, a_{t+1}; \theta);\ \theta_t^{-}\Big). \qquad (15)$$
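The only change from the standard DQN target is which network picks the action, as a short sketch of Eq. (15) shows; it follows the same tensor conventions as the dqn_loss sketch above and is an illustrative assumption rather than a reference implementation.

```python
import torch

def double_dqn_target(main_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Eq. (15): the main network selects the action, the target network evaluates it."""
    with torch.no_grad():
        best_actions = main_net(next_states).argmax(dim=1, keepdim=True)     # argmax_a' Q(s', a'; theta)
        next_q = target_net(next_states).gather(1, best_actions).squeeze(1)  # Q(s', a*; theta^-)
        return rewards + gamma * (1.0 - dones) * next_q
```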

Another improved version of DQN is the dueling network architecture, which estimates the state value function $V^{\pi}(s)$ and the advantage function $A^{\pi}(s, a)$ separately for each action [25]. The output of the combination of these two networks is a Q-value for a discrete set of actions, obtained through an aggregation layer. In this way, dueling DQN learns the important state values without their corresponding effects on the actions, since the state value function $V^{\pi}(s)$ is an action-free estimation.
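A sketch of the dueling architecture's aggregation layer: shared features feed two heads, $V(s)$ and $A(s, a)$, which are recombined into Q-values; subtracting the mean advantage is one common aggregation choice that keeps the decomposition identifiable. The layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Dueling DQN head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    def __init__(self, state_dim=8, num_actions=4):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU())
        self.value_head = nn.Linear(64, 1)                 # V(s): action-free estimate
        self.advantage_head = nn.Linear(64, num_actions)   # A(s, a)

    def forward(self, state):
        h = self.features(state)
        v = self.value_head(h)
        a = self.advantage_head(h)
        return v + a - a.mean(dim=-1, keepdim=True)        # aggregation layer producing Q-values
```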

These two doubling and dueling models on the DQN algorithm, together with prioritized experience replay, are accepted as the state-of-the-art [text continues on the following pages].

(The remaining 21 pages of the paper are not included in this preview.)
