We can see that (2) degenerates to (1) when $K = 1$ and $\mathcal{Q}_1$ is the identity function. The difference from NNM is that MCMT seeks the low-rank solution under linear transformations rather than for the matrix itself. This implies that (2) can be used to complete matrices that have a high-rank structure.
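For concreteness, the following minimal sketch illustrates this degenerate case, assuming (purely for illustration) that the objective of (2) is the sum of the nuclear norms of the transformed matrices $\mathcal{Q}_i(\mathbf{X})$; all function names are ours, not the paper's.

```python
import numpy as np

def nuclear_norm(M):
    """Sum of the singular values of M."""
    return np.linalg.svd(M, compute_uv=False).sum()

def mcmt_objective(X, transforms):
    """Assumed MCMT objective: sum of nuclear norms of Q_i(X) over all K transforms."""
    return sum(nuclear_norm(Q(X)) for Q in transforms)

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 10))

# K = 1 with the identity transform: the MCMT objective coincides with
# the NNM objective ||X||_*.
identity = lambda M: M
assert np.isclose(mcmt_objective(X, [identity]), nuclear_norm(X))
```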
Comparison with matrix sensing. Matrix sensing aims to recover the original matrix from Gaussian measurements. The model is formulated as
\[
\min_{\mathbf{X} \in \mathbb{R}^{m_1 \times m_2}} \|\mathbf{X}\|_{*} \quad \text{s.t.}\quad \|\mathcal{Q}(\mathbf{X}) - \mathcal{Q}(\mathbf{Y})\|_F < \delta, \tag{3}
\]
where the entries of $\mathcal{Q}$ follow the i.i.d. Gaussian distribution. Compared with (2), (3) only considers the linear transformation $\mathcal{Q}$ in the constraint term. Furthermore, matrix sensing exploits the low-rank structure of the original matrix itself, like NNM, whereas MCMT additionally takes into account the low-rank structures under linear transformations.
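As an illustration of the measurement model in (3), the sketch below builds a linear operator $\mathcal{Q}$ with i.i.d. Gaussian entries (represented as a $p \times m_1 m_2$ matrix acting on the vectorized input) and evaluates the constraint residual; the number of measurements $p$ and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
m1, m2, p = 20, 15, 120              # p Gaussian measurements of an m1 x m2 matrix

# Q: R^{m1 x m2} -> R^p with i.i.d. Gaussian entries, applied to the vectorized matrix.
A = rng.standard_normal((p, m1 * m2)) / np.sqrt(p)
Q = lambda M: A @ M.ravel()

Y = rng.standard_normal((m1, m2))                 # observed matrix
X = Y + 0.01 * rng.standard_normal((m1, m2))      # a candidate solution

# Constraint term of (3): ||Q(X) - Q(Y)||_F (a Euclidean norm in R^p).
residual = np.linalg.norm(Q(X) - Q(Y))
print(residual)
```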
Comparison with CTD. As mentioned in the related work, CTD seeks an approximation of a tensor with multi-linear low-rank structures. For a $K$-th order tensor $\mathcal{X}$ and its perturbed variant $\mathcal{Y}$, CTD is given by [?]
\[
\min_{\mathcal{X} \in \mathbb{R}^{m_1 \times \cdots \times m_K}} \sum_{i \in [K]} \big\| [\mathcal{X}]_{(i)} \big\|_{*} \quad \text{s.t.}\quad \|\mathcal{P}_{\Omega}(\mathcal{X}) - \mathcal{P}_{\Omega}(\mathcal{Y})\|_F < \delta, \tag{4}
\]
where $[\mathcal{X}]_{(i)}$ denotes the unfolding of the tensor $\mathcal{X}$ along the $i$-th mode [?]. Since the unfolding operations are linear functions, (4) is a special case of MCMT with $\mathcal{Q}_i(\cdot) = [\cdot]_{(i)}$. It is worth mentioning that tensor unfolding only rearranges the tensor into different shapes, whereas MCMT can use more general linear functions such as re-sampling, rotation and stretching in the linear space to exploit more structures of the matrix.
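To make this connection concrete, the sketch below implements a mode-$i$ unfolding $[\cdot]_{(i)}$ for a 3rd-order tensor and checks numerically that it is linear, so it can serve as one choice of $\mathcal{Q}_i$; the exact unfolding convention (ordering of the remaining modes) is an assumption on our part.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding: move the chosen axis to the front and flatten
    the remaining axes into columns (one common unfolding convention)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

rng = np.random.default_rng(2)
X = rng.standard_normal((4, 5, 6))
Z = rng.standard_normal((4, 5, 6))
a, b = 2.0, -3.0

# Unfolding is linear: unfold(aX + bZ) == a*unfold(X) + b*unfold(Z).
for mode in range(3):
    assert np.allclose(unfold(a * X + b * Z, mode),
                       a * unfold(X, mode) + b * unfold(Z, mode))

# Each unfolding can therefore play the role of Q_i, as in the CTD objective (4).
objective = sum(np.linalg.svd(unfold(X, i), compute_uv=False).sum() for i in range(3))
print(objective)
```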
2.4. Examples of $\mathcal{Q}_i$ in MCMT
In MCMT, the linear transformations $\mathcal{Q}_i,\ \forall i$ can be used to formulate specific operations in various CV applications. Here we show some examples.
Example 1 (non-local image restoration). To exploit the non-local similarity of images, such methods usually split the whole matrix into many "non-local groups", each of which is a concatenation of similar patches of the image. Such a grouping operation is mathematically a down-sampling (and thus linear) function from the image to the non-local group. Therefore, the $\mathcal{Q}_i(\mathbf{X}),\ i \in [K]$ in (2) correspond to $K$ non-local groups, and solving (2) amounts to finding the optimal low-rank approximation for each non-local group and then merging the approximations back into the global image, as sketched below.
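The following sketch illustrates such a grouping operator: it gathers a fixed set of similar patch locations into one matrix whose columns are vectorized patches, which is a pure selection (hence linear) map. Patch size, locations, and the similarity criterion are illustrative assumptions.

```python
import numpy as np

def group_patches(img, locations, p=8):
    """Non-local grouping operator Q_i: stack the p x p patches at the given
    top-left `locations` as columns of one matrix. Pure selection => linear."""
    cols = [img[r:r + p, c:c + p].ravel() for (r, c) in locations]
    return np.stack(cols, axis=1)            # shape: (p*p, number of patches)

rng = np.random.default_rng(3)
img = rng.standard_normal((64, 64))

# In practice the locations come from block matching (finding similar patches);
# here they are fixed purely for illustration.
locations = [(0, 0), (0, 8), (16, 4), (40, 40)]
G = group_patches(img, locations)

# Linearity check: grouping commutes with scaling.
assert np.allclose(group_patches(2.0 * img, locations), 2.0 * G)
```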
Example 2 (occlusion removal). In the occlusion removal problem, the original image is generally covered by some other objects, and the aim is to recover the hidden part of the image. To solve this problem, the previous study [?] assumes that both the original image and the covered part have low-rank structures. Using MCMT, we can specify $K = 2$, set $\mathcal{Q}_1$ to be the identity function to capture the low-rank structure of the whole image, and set $\mathcal{Q}_2$ to extract the covered sub-image with its low-rank structure, as sketched below.
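A minimal sketch of this $K = 2$ configuration follows, where $\mathcal{Q}_1$ is the identity and $\mathcal{Q}_2$ crops an assumed-known occluded region; the region coordinates and the sum-of-nuclear-norms objective are illustrative assumptions.

```python
import numpy as np

def nuclear_norm(M):
    return np.linalg.svd(M, compute_uv=False).sum()

rng = np.random.default_rng(4)
img = rng.standard_normal((48, 48))

# Assumed-known bounding box of the occluded region (illustrative values).
r0, r1, c0, c1 = 10, 30, 15, 35

Q1 = lambda X: X                    # identity: low-rank structure of the whole image
Q2 = lambda X: X[r0:r1, c0:c1]      # crop: low-rank structure of the covered part

# Assumed MCMT objective of (2) for K = 2: sum of nuclear norms of Q_i(X).
objective = nuclear_norm(Q1(img)) + nuclear_norm(Q2(img))
print(objective)
```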
Besides these examples, we can also specify $\mathcal{Q}_i$ as 2-D wavelet filters to capture the short-term fluctuations of the image at multiple resolutions, or even as random shuffling [?].
3. Identifiability
One of the advantages of LRMC is that the completion performance is theoretically guaranteed. In this section, we theoretically analyze the reconstruction error of MCMT and reveal which conditions $\mathcal{Q}_i,\ \forall i$ should satisfy for exact recovery.
In the rest of this section, we first establish an upper bound for MCMT under a single linear transformation, i.e., $K = 1$. After that, we extend the results to the case of multiple transformations.
3.1. Single linear transformation
Assume that $\mathbf{M}_0 \in \mathbb{R}^{m_1 \times m_2}$ denotes the "true" matrix that we want to recover, and that its rank equals $R$. The noisy variant of $\mathbf{M}_0$ is generated by $\mathbf{Y} = \mathbf{M}_0 + \mathbf{H}$, where the entries of $\mathbf{H}$ obey the i.i.d. Gaussian distribution, i.e., $\mathbf{H}(i,j) \sim \mathcal{N}(0, \sigma^2)$ for all $i \in [m_1],\ j \in [m_2]$. With a single linear transformation, we simplify (2) as
\[
\min_{\mathbf{X} \in \mathbb{R}^{m_1 \times m_2}} \|\mathcal{Q}(\mathbf{X})\|_{*} \quad \text{s.t.}\quad \|\mathcal{P}_{\Omega}(\mathbf{X}) - \mathcal{P}_{\Omega}(\mathbf{Y})\|_F \le \delta, \tag{5}
\]
where the subscript of $\mathcal{Q} \in \mathbb{R}^{m_1 \times m_2 \times n_1 \times n_2}$ is removed for brevity. Let $\mathcal{Q}(\mathbf{M}_0) = \mathbf{U}\mathbf{D}\mathbf{V}^{\top}$ be the truncated singular value decomposition (SVD), in which only the singular vectors associated with non-zero singular values are kept. Furthermore, we define the linear subspace $T = \left\{ \mathbf{U}\mathbf{X}^{\top} + \mathbf{Y}\mathbf{V}^{\top} \,\middle|\, \mathbf{X} \in \mathbb{R}^{n_2 \times R},\ \mathbf{Y} \in \mathbb{R}^{n_1 \times R} \right\}$, which reflects the properties of the neighborhood around $\mathbf{M}_0$. Let $T^{\perp}$ denote the orthogonal complement of $T$. Based on duality theory, we define the dual certificate for the unique solution of (5) as follows:
Definition 2 (Dual certificate). A matrix $\mathbf{\Lambda} \in \mathbb{R}^{m_1 \times m_2}$ is defined as a dual certificate of (5) if $\mathcal{P}_{\Omega}(\mathbf{\Lambda}) = \mathbf{\Lambda}$ and $\mathbf{\Lambda}$ can be decomposed as
\[
\mathbf{\Lambda} = \mathcal{Q}^{*}\left( \mathbf{U}\mathbf{V}^{\top} + \mathbf{R}_{\Lambda} \right), \tag{6}
\]
where $\mathbf{R}_{\Lambda} = \mathcal{P}_{T^{\perp}}(\mathbf{\Lambda})$, $\mathcal{P}_{T^{\perp}}$ denotes the projection onto $T^{\perp}$, and $\|\mathbf{R}_{\Lambda}\|_2 \le 1$.
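To make the quantities in this subsection concrete, the sketch below computes the truncated SVD of $\mathcal{Q}(\mathbf{M}_0)$, forms the standard projections onto $T$ and $T^{\perp}$, and numerically checks the conditions of Definition 2 for a candidate $\mathbf{\Lambda}$. Taking $\mathcal{Q}$ (and hence $\mathcal{Q}^{*}$) to be the identity, and the particular candidate $\mathbf{\Lambda}$, are assumptions made purely to keep the example short; this is not the paper's certificate construction.

```python
import numpy as np

rng = np.random.default_rng(5)
m1, m2, R = 12, 10, 3

# Rank-R "true" matrix M0 and its truncated SVD (Q is taken as the identity here).
M0 = rng.standard_normal((m1, R)) @ rng.standard_normal((R, m2))
U, D, Vt = np.linalg.svd(M0)
U, V = U[:, :R], Vt[:R, :].T          # keep singular vectors of non-zero singular values

def P_T(Z):
    """Projection onto T = {U X^T + Y V^T}."""
    return U @ U.T @ Z + Z @ V @ V.T - U @ U.T @ Z @ V @ V.T

def P_Tperp(Z):
    """Projection onto the orthogonal complement of T."""
    return Z - P_T(Z)

# Observation mask Omega and its sampling operator P_Omega.
Omega = rng.random((m1, m2)) < 0.5
P_Omega = lambda Z: np.where(Omega, Z, 0.0)

# Candidate certificate: U V^T plus a T_perp component with spectral norm 0.5.
G = P_Tperp(rng.standard_normal((m1, m2)))
R_Lam = 0.5 * G / np.linalg.norm(G, 2)
Lam = U @ V.T + R_Lam

# Conditions of Definition 2 (with Q = Q^* = identity):
supported_on_Omega = np.allclose(P_Omega(Lam), Lam)     # generally fails for a random Lam;
                                                         # a true certificate lives on Omega
decomposition_ok = np.allclose(P_Tperp(Lam), R_Lam)     # R_Lam is indeed P_Tperp(Lam)
spectral_bound_ok = np.linalg.norm(R_Lam, 2) <= 1.0     # ||R_Lam||_2 <= 1
print(supported_on_Omega, decomposition_ok, spectral_bound_ok)
```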