Deep Learning: The Core Driving Force Behind Modern Technology and Its Application Breakthroughs


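Deep learning is a major branch of modern artificial intelligence. It allows computational models built from multiple processing layers to learn representations of data at multiple levels of abstraction. Since their introduction, deep-learning methods have brought marked progress in speech recognition, visual object recognition, object detection, and many other fields such as drug discovery and genomics. Much of this progress comes from deep convolutional neural networks (DCNNs), which produced breakthroughs in processing images, video, speech and audio, and from recurrent neural networks (RNNs), which are applied to sequential data such as text and speech.

At the core of deep learning is the backpropagation algorithm, which is used to adjust the machine's internal parameters; these parameters compute the representation in each layer from the representation in the previous layer. In this way a model automatically learns the complex structure present in large data sets, without explicit programming. Research teams at Facebook AI Research and New York University, along with Google and the Department of Computer Science at the University of Toronto, have played key roles in deep-learning research, with work that ranges from theory to practical applications.

Conventional machine-learning techniques were limited in their ability to process natural data, such as images, speech and text, in raw form. Deep learning, and in particular the local connectivity and weight sharing introduced in convolutional neural networks, lets models capture spatial and temporal features in the data effectively, which has led to clear gains on tasks such as image classification, object detection and natural language understanding. Deep learning is also widely used in everyday settings: content filtering on social media, product recommendations on e-commerce websites, and ranking of search-engine results. Smart features in consumer electronics such as smartphones and cameras increasingly rely on it as well, automating tasks from image recognition and speech transcription to matching content with users' interests.

Deep learning still faces challenges. Training requires large amounts of labelled data and computing resources, and models are hard to interpret, so their internal decision process is often difficult to understand intuitively. Researchers are working on more efficient learning algorithms and better model architectures, and are exploring ways to improve interpretability and generalization so that deep learning stays effective as its range of applications grows. Deep learning is expected to keep leading the development of AI and to keep driving technological and social change.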

Facebook AI Research, 770 Broadway, New York, New York 10003 USA.

New York University, 715 Broadway, New York, New York 10003, USA.

Department of Computer Science and Operations Research, Université de Montréal, Pavillon André-Aisenstadt, PO Box 6128 Centre-Ville STN Montréal, Quebec H3C 3J7, Canada.

Google, 1600 Amphitheatre Parkway, Mountain View, California 94043, USA.

Department of Computer Science, University of Toronto, 6 King’s College Road, Toronto, Ontario M5S 3G4, Canada.

Machine-learning technology powers many aspects of modern society: from web searches to content filtering on social networks to recommendations on e-commerce websites, and it is increasingly present in consumer products such as cameras and smartphones. Machine-learning systems are used to identify objects in images, transcribe speech into text, match news items, posts or products with users’ interests, and select relevant results of search. Increasingly, these applications make use of a class of techniques called deep learning.

Conventional machine-learning techniques were limited in their ability to process natural data in their raw form. For decades, constructing a pattern-recognition or machine-learning system required careful engineering and considerable domain expertise to design a feature extractor that transformed the raw data (such as the pixel values of an image) into a suitable internal representation or feature vector from which the learning subsystem, often a classifier, could detect or classify patterns in the input.

Representation learning is a set of methods that allows a machine to be fed with raw data and to automatically discover the representations needed for detection or classification. Deep-learning methods are representation-learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level. With the composition of enough such transformations, very complex functions can be learned. For classification tasks, higher layers of representation amplify aspects of the input that are important for discrimination and suppress irrelevant variations. An image, for example, comes in the form of an array of pixel values, and the learned features in the first layer of representation typically represent the presence or absence of edges at particular orientations and locations in the image. The second layer typically detects motifs by spotting particular arrangements of edges, regardless of small variations in the edge positions. The third layer may assemble motifs into larger combinations that correspond to parts of familiar objects, and subsequent layers would detect objects as combinations of these parts. The key aspect of deep learning is that these layers of features are not designed by human engineers: they are learned from data using a general-purpose learning procedure.
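The idea of composing simple but non-linear modules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `dense` and `compose`, the choice of `tanh` as the non-linearity, and the hand-picked weights are all hypothetical and do not come from the review, where the weights would be learned from data.

```python
import math

def dense(weights, biases):
    """Build a module that computes tanh(W x + b): simple, but non-linear."""
    def module(x):
        return [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)
        ]
    return module

def compose(*modules):
    """Chain modules so each one transforms the representation from the previous one."""
    def network(x):
        for module in modules:
            x = module(x)
        return x
    return network

# Hypothetical hand-picked weights; a real system would learn them from data.
layer1 = dense([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])  # 2 inputs -> 2 features
layer2 = dense([[1.0, -1.0]], [0.0])                    # 2 features -> 1 output

network = compose(layer1, layer2)
raw_input = [1.0, 0.0]
hidden = layer1(raw_input)    # first-level representation of the raw input
output = network(raw_input)   # representation after both levels
```

Stacking more `dense` modules inside `compose` gives deeper networks; the non-linearity matters because a chain of purely linear modules would collapse into a single linear transformation.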

Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years. It has turned out to be very good at discovering intricate structures in high-dimensional data and is therefore applicable to many domains of science, business and government. In addition to beating records in image recognition [1–4] and speech recognition [5–7], it has beaten other machine-learning techniques at predicting the activity of potential drug molecules, analysing particle accelerator data [9,10], reconstructing brain circuits, and predicting the effects of mutations in non-coding DNA on gene expression and disease [12,13]. Perhaps more surprisingly, deep learning has produced extremely promising results for various tasks in natural language understanding, particularly topic classification, sentiment analysis, question answering and language translation [16,17].

We think that deep learning will have many more successes in the near future because it requires very little engineering by hand, so it can easily take advantage of increases in the amount of available computation and data. New learning algorithms and architectures that are currently being developed for deep neural networks will only accelerate this progress.

Supervised learning

The most common form of machine learning, deep or not, is supervised learning. Imagine that we want to build a system that can classify images as containing, say, a house, a car, a person or a pet. We first collect a large data set of images of houses, cars, people and pets, each labelled with its category. During training, the machine is shown an image and produces an output in the form of a vector of scores, one for each category. We want the desired category to have the highest score of all categories, but this is unlikely to happen before training. We compute an objective function that measures the error (or distance) between the output scores and the desired pattern of scores. The machine then modifies its internal adjustable parameters to reduce this error. These adjustable parameters, often called weights, are real numbers that can be seen as ‘knobs’ that define the input–output function of the machine. In a typical deep-learning system, there may be hundreds of millions of these adjustable weights, and hundreds of millions of labelled examples with which to train the machine.
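The score vector and objective function described above can be sketched as follows. The sketch assumes a single linear layer, a one-hot "desired pattern" and a squared-distance objective; these choices, the three made-up input features and the untrained weights are illustrative assumptions, not something the review specifies.

```python
CATEGORIES = ["house", "car", "person", "pet"]

def scores(weights, features):
    """One score per category: a weighted sum of the input features."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def desired_pattern(label):
    """Target scores: 1 for the labelled category, 0 for all others (an assumption)."""
    return [1.0 if category == label else 0.0 for category in CATEGORIES]

def objective(output, target):
    """Squared distance between the output scores and the desired scores."""
    return sum((o - t) ** 2 for o, t in zip(output, target))

def predict(output):
    """The predicted category is the one with the highest score."""
    best = max(range(len(output)), key=output.__getitem__)
    return CATEGORIES[best]

# Hypothetical untrained weights (4 categories x 3 features) and one labelled example.
weights = [[0.1, 0.0, 0.2],
           [0.3, 0.1, 0.0],
           [0.0, 0.2, 0.1],
           [0.2, 0.2, 0.2]]
features, label = [1.0, 0.5, 0.0], "house"

output = scores(weights, features)
error = objective(output, desired_pattern(label))
prediction = predict(output)
```

With these untrained weights the highest score does not fall on the labelled category, which is the situation the text says is expected before training; training then means changing `weights` so that `error` shrinks.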

To properly adjust the weight vector, the learning algorithm computes a gradient vector that, for each weight, indicates by what amount the error would increase or decrease if the weight were increased by a tiny amount. The weight vector is then adjusted in the opposite direction to the gradient vector.
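A minimal sketch of this update rule follows. To mirror the wording above literally, the gradient is estimated by nudging each weight by a tiny amount and measuring the change in error; this finite-difference estimate is swapped in for illustration only, since deep-learning systems obtain the gradient analytically through backpropagation. The two-weight linear model, the toy data, the learning rate and the step count are all assumptions.

```python
def error(weights, examples):
    """Mean squared error of the linear model y = w0 * x + w1 over the examples."""
    return sum((weights[0] * x + weights[1] - y) ** 2 for x, y in examples) / len(examples)

def gradient(weights, examples, tiny=1e-6):
    """For each weight, estimate the change in error per unit increase of that weight."""
    base = error(weights, examples)
    grad = []
    for i in range(len(weights)):
        nudged = list(weights)
        nudged[i] += tiny  # increase one weight by a tiny amount
        grad.append((error(nudged, examples) - base) / tiny)
    return grad

def step(weights, grad, learning_rate=0.1):
    """Move the weight vector in the opposite direction to the gradient vector."""
    return [w - learning_rate * g for w, g in zip(weights, grad)]

# Toy data generated from y = 2x + 1; the starting weights are arbitrary.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
weights = [0.0, 0.0]
initial_error = error(weights, examples)

for _ in range(500):
    weights = step(weights, gradient(weights, examples))

final_error = error(weights, examples)
```

Each call to `gradient` costs one extra error evaluation per weight, which is workable for two weights but not for hundreds of millions; that cost is one reason an analytic method is used in practice.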

The objective function, averaged over all the training examples, can

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

Deep learning

Yann LeCun, Yoshua Bengio & Geoffrey Hinton

436 | NATURE | VOL 521 | 28 MAY 2015

REVIEW

doi:10.1038/nature14539
