Relational Database Index Design and Optimization


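"Relational Database Index Design and the Optimizers"

In a relational database, index design is a key factor in query performance, and the optimizer is the core component that determines how efficiently a query executes. The book "Relational Database Index Design and the Optimizers" examines index design and query optimization strategies in mainstream database systems such as DB2, Oracle, and SQL Server.

Index design:

1. **Choice of type**: Indexes come in several types, such as B-tree, hash, bitmap, and full-text indexes. B-tree indexes suit range queries and sorting, hash indexes suit equality lookups, bitmap indexes suit low-cardinality columns (few distinct values), and full-text indexes serve text search.
2. **Composite indexes**: When a query filters on several columns, a composite index can speed it up. The column order should follow how often each column appears in query predicates and how selective it is.
3. **Uniqueness**: An index can be unique, guaranteeing that no two index entries are equal. A lookup then returns at most one row, which helps query speed, but every insert has to pay for the uniqueness check.
4. **Covering indexes**: An index that contains every column a query needs is called a covering index. Using one avoids the extra lookup into the table and can improve query speed markedly.

Optimizer:

1. **Cost model**: The optimizer chooses the best access path by estimating the cost of alternative execution plans. Cost is usually based on factors such as the number of scans, join operations, sorts, and index lookups.
2. **Statistics**: The optimizer relies on accurate statistics to estimate row counts and selectivity, and from those the query cost. Refreshing statistics regularly helps it make better decisions.
3. **Query rewriting**: The optimizer may rewrite a query, for example by merging join operations or converting a subquery into a join, to reduce execution time and resource consumption.
4. **Parallel query**: The optimizer can also decide whether to use the parallel execution capacity of multi-core processors to speed up a query.
5. **Memory management**: The optimizer takes the allocation of memory into account in order to balance I/O and CPU use.
6. **Index selection**: The optimizer decides which indexes to use, and in some cases decides to use none, typically because a table scan is estimated to be cheaper (for example, when the predicate matches a large fraction of the rows).

The book likely explains in detail how to design and tune indexes for the characteristics of a particular database system, and how to analyze and improve the optimizer's behavior. It may also cover how to read and use query execution plans, and how to improve performance further by monitoring and adjusting database parameters. For Java developers, understanding these concepts and techniques helps in writing more efficient SQL queries and in designing a database structure that fits the application, which improves the performance of the whole system.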
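The composite-index and covering-index points above can be observed directly by asking an optimizer for its plan. The following is a minimal sketch, assuming Python's built-in `sqlite3` module as a convenient stand-in (SQLite is not one of the systems the book discusses, and plan wording differs between products). The `orders` table, its columns, the index names, and the `query_plan` helper are all hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status, total) VALUES (?, ?, ?)",
    [(i % 100, "open" if i % 10 == 0 else "closed", i * 1.5) for i in range(1000)],
)

sql = "SELECT total FROM orders WHERE customer_id = ? AND status = ?"
args = (40, "open")

# 1. No usable index: the whole table has to be scanned.
plan_no_index = query_plan(conn, sql, args)

# 2. Composite index on both predicate columns: an index search, followed by
#    a visit to the table for each match to fetch `total`.
conn.execute("CREATE INDEX idx_cust_status ON orders (customer_id, status)")
plan_composite = query_plan(conn, sql, args)

# 3. Covering index: `total` is stored in the index too, so the table is
#    never touched. The narrower index is dropped first to keep the choice
#    unambiguous.
conn.execute("DROP INDEX idx_cust_status")
conn.execute(
    "CREATE INDEX idx_cust_status_total ON orders (customer_id, status, total)"
)
plan_covering = query_plan(conn, sql, args)

for label, plan in [
    ("no index", plan_no_index),
    ("composite", plan_composite),
    ("covering", plan_covering),
]:
    print(label, "->", plan)
```

In SQLite the three plans read as a table scan, a search using the composite index, and a search using a covering index. Other products report the same distinction in their own vocabulary.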

Preface

Relational databases have been around now for more than 20 years. In their early days, performance problems were widespread due to limited hardware resources and immature optimizers, and so performance was a priority consideration. The situation is very different nowadays; hardware and software have advanced beyond all recognition. It’s hardly surprising that performance is now assumed to be able to take care of itself! But the reality is that despite the huge growth in resources, even greater growth has been seen in the amount of information that is now available and what needs to be done with this information. Additionally, one crucial aspect of the hardware has not kept pace with the times: Disks have certainly become larger and incredibly cheap, but they are still relatively slow with regard to their ability to directly access data. Consequently many of the old problems haven’t actually gone away; they have just changed their appearance. Some of these problems can have enormous implications: stories abound of “simple” queries that might have been expected to take a fraction of a second but appear to be quite happy to take several minutes or even longer; this despite all the books that tell us how to code queries properly and how to organize the tables and what rules to follow to put the right columns into the indexes. So it is abundantly clear that there is a need for a book that goes beyond the usual boundaries and really starts to think about why so many people are still having so many problems today.

To address this need, we believe we must focus on two issues. First, the part of the relational system (called the SQL optimizer) that has to decide how to find the required information in the most efficient way, and secondly how the indexes and tables are then scanned. We want to try to put ourselves in the optimizer’s place; perhaps if we understood why it might have problems, we might be able to do things differently. Fortunately it is quite surprising how little we really need to understand about the optimizers, but what there is to understand is remarkably important. Likewise, a very important way in which this book differs from other books in its field is that we will not be providing a massive list of rules and syntax to use for coding SQL and designing tables or even indexes. This is not a reference book to show exactly which SQL WHERE clause should be used, or what syntax should be employed, for every conceivable situation. If we tried to follow a long list of complicated, ambiguous, and possibly incomplete instructions, we would be following all the others who have already trod the same path. If on the other hand we appreciate the impact of what we are asking the relational system to undertake and how we can influence that impact, we will be able to understand, control, minimize, or avoid the problems being encountered.


The second objective of this book is to show how we can use this knowledge to quantify the work being performed in terms of CPU and elapsed time. Only in this way can we truly judge the success of our index and table design; we need to use actual figures to show what the optimizer would think, how long the scans would take, and what modifications would be required to provide satisfactory performance. But most importantly, we have to be able to do this quickly and easily; this in turn means that it is vital to focus on the few really major issues, not on the relatively unimportant detail under which many people drown. This is key: to focus on a very few, crucially important areas, and to be able to say how long it would take or how much it would cost.
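As a rough illustration of what "being able to say how long it would take" can look like, here is a minimal back-of-the-envelope sketch. It assumes elapsed time is dominated by two kinds of row access: random touches, which pay for a disk seek, and sequential touches, which read the next row in order. The per-touch costs (10 ms and 0.01 ms) and the names `TouchCosts` and `estimate_elapsed_ms` are illustrative assumptions, not figures or terms taken from this excerpt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TouchCosts:
    """Assumed per-row access costs in milliseconds (illustrative only)."""
    random_ms: float = 10.0       # one random read, dominated by disk seek time
    sequential_ms: float = 0.01   # one row read in sequence after its neighbour


def estimate_elapsed_ms(random_touches, sequential_touches, costs=TouchCosts()):
    """Estimate elapsed time as a weighted sum of random and sequential touches."""
    return (random_touches * costs.random_ms
            + sequential_touches * costs.sequential_ms)


# Scenario: a query matches 1,000 consecutive index rows.
# (a) The index lacks a needed column, so each match also costs one random
#     touch on the table, on top of the initial random touch on the index.
with_table_access = estimate_elapsed_ms(random_touches=1 + 1000,
                                        sequential_touches=1000)

# (b) The index contains every needed column, so only the index is read.
index_only = estimate_elapsed_ms(random_touches=1, sequential_touches=1000)

print(f"with table access: {with_table_access / 1000:.2f} s")
print(f"index only:        {index_only:.0f} ms")
```

Under these assumed costs the first plan takes about ten seconds and the second about twenty milliseconds. A difference of that size can be seen from two multiplications, without detailed knowledge of any particular product.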

We also have one further advantage to offer, which again arises as a result of focusing on what really matters. For those who may be working with more than one relational product (even from the same vendor), instead of reading and digesting multiple sets of widely varying rules and recommendations, we are using a single common approach which is applicable to all relational products. All “genuine” relational systems have an optimizer that has the same job to do; they all have to make decisions and then scan indexes and tables. They all do these things in a startlingly similar way (although they have their own way of describing them). There are, of course, some differences between them, but we can handle this with little difficulty.

The audience for which this book is intended is, quite literally, anyone who feels it is to his or her benefit to know something about SQL performance or about how to design tables and indexes effectively, as well as those having a direct responsibility for designing indexes, anyone coding SQL statements as queries or as part of application programs, and those who are responsible for maintaining the relational data and the relational environment. All will benefit to a varying degree if they feel some responsibility for the performance effects of what they are doing.

Finally, a word regarding the background that would be appropriate to the readers of this book. A knowledge of SQL, the relational language, is assumed. A general understanding of computer systems will probably already be in place if one is even considering a book such as this. Other than that, perhaps the most important quality that would help the reader would be a natural curiosity and interest in how things work, and a desire to do things better. At the other extreme, there are also two categories of the large number of people with many years of experience in relational systems who might feel they would benefit: first, those who have managed pretty well over the years with the detailed rule books and would like to relax a little more by understanding why these rules apply; second, those who have already been using the techniques described in this book for many years but who have not appreciated the implications that have been brought into play by the introduction of the new world hardware.

Most of the ideas and techniques used in this book are original and consequently few external references will be found to other publications and authors. On the other hand, as is always the case in the production of a book such as this,

Chapter 1

Introduction

- To understand how SQL optimizers decide what table and index scans should be performed to process SQL statements as efficiently as possible
- To be able to quantify the work being done during these scans to enable satisfactory index design
- Type and background of audience for whom the book is written
- Initial thoughts on the major reasons for inadequate indexing
- Systematic index design

ANOTHER BOOK ABOUT SQL PERFORMANCE!

Relational databases have been around now for over 20 years, and that’s precisely how long performance problems have been around too, and yet here is another book on the subject. It’s true that this book focuses on the index design aspects of performance; however, some of the other books consider this area to a greater or lesser extent. But then a lot of these books have been around for over 20 years, and the problems still keep on coming. So perhaps there is a need for a book that goes beyond the usual boundaries and starts to think about why so many people are still having so many problems.

It’s certainly true that the world of relational database systems is a very complex one; it has to be if one reflects on what really has to be done to satisfy SQL statements. The irony is that the SQL is so beautifully simple to write; the concept of tables and rows and columns is so easy to understand. Yet we could be searching for huge amounts of information from vast sources of data held all over the world, and we don’t even need to know where it is or how it can be found. Neither do we have to worry about how long it’s going to take or how much it’s going to cost. It all seems like magic. Maybe that’s part of the problem: it’s too easy; but then of course, it should be so easy.

We still recognize that problems will arise, and huge problems at that. Stories abound of “simple” queries that might have been expected to take a fraction of a second but appear to be quite happy to take several minutes or even longer. But then, we have all these books, and they tell us how to code the query

Relational Database Index Design and the Optimizers, by Tapio Lahdenmäki and Michael Leach. © 2005 John Wiley & Sons, Inc.

