Python Algorithm Analysis and Practice Guide
"Python算法教程是一本以Python语言为载体,深入浅出地介绍算法分析与设计的书籍。全书包含11章,涵盖了从基础的算法概念到复杂的算法问题解决策略,旨在帮助读者理解并掌握经典算法,提升解决算法问题的能力。书中涉及的主题包括树、图、计数问题、归纳递归、遍历、分解合并、贪心算法、复杂依赖、Dijkstra算法、匹配切割问题以及困难问题及其稀释等。此外,书后还附有加速Python性能、问题与算法列表、图术语词汇表和练习提示等实用资料。"
In this Python algorithms tutorial, readers will learn:
1. **Chapter 1: Introduction** - Leads the reader into the world of algorithms, explains why studying algorithms matters, and outlines the general steps of problem solving, including clarifying the problem and thinking it through.
2. **Chapter 2: The Basics** - A solid foundation is essential; this chapter likely covers basic Python syntax, data structures (such as lists, dictionaries, stacks, and queues), and the fundamental concepts of algorithm analysis.
3. **Chapter 3: Counting Problems** - Counting is a common task in algorithmics; this chapter explains how to compute and handle various counting problems effectively, likely including counting combinations and permutations.
4. **Chapter 4: Induction, Recursion, and Reduction** - Recursion is an important tool in algorithm design; this chapter shows how to solve complex problems using induction and recursion, and how to lower complexity by reducing one problem to another.
5. **Chapter 5: Traversal** - Traversal is at the core of many algorithms, including depth-first search (DFS) and breadth-first search (BFS); this chapter covers these traversal techniques in detail, along with their applications to trees and graphs.
6. **Chapter 6: Divide, Combine, and Conquer** - Divide and conquer is an efficient algorithm design strategy; this chapter discusses how to split a large problem into smaller ones and then combine their solutions, as in quicksort and merge sort.
7. **Chapter 7: Greedy Algorithms** - Greedy algorithms make locally optimal choices in the hope of reaching a global optimum; this chapter explores the principles behind greedy strategies and their applications, such as Huffman coding.
8. **Chapter 8: Complex Dependencies and Memoization** - When solving a problem involves a large amount of repeated computation, memoization improves efficiency; this chapter explains how to use dynamic programming and memoization to handle problems with complex dependencies.
9. **Chapter 9: From A to B with Edsger and Friends** - This chapter is most likely about Dijkstra's algorithm, the famous single-source shortest-path algorithm proposed by Edsger Dijkstra for finding paths in graphs.
10. **Chapter 10: Matching, Cuts, and Flows** - Matching and flow problems have wide applications in graph theory; this chapter likely covers maximum matching, network flow, and related problems.
11. **Chapter 11: Hard Problems and Limited Sloppiness** - A discussion of NP-hard problems, likely introducing approximate solutions and randomized algorithms for handling intractable problems.
Beyond these chapters, the book also includes appendices: Appendix A on techniques for speeding up Python, Appendix B with a reference list of problems and algorithms, Appendix C defining graph-theory terminology, and Appendix D with hints for the exercises, all of which help readers better understand and apply the algorithms they learn. The tutorial is suitable for Python beginners as well as developers with some experience who want to strengthen their algorithmic skills.
CHAPTER 2 ■ THE BASICS
This is a fairly straightforward and understandable definition, although it may seem a bit foreign at first. Basically, O(g) is the set of functions that do not grow faster than g. For example, the function n² is in the set O(n²), or, in set notation, n² ∈ O(n²). We often simply say that n² is O(n²).
The fact that n² does not grow faster than itself is not particularly interesting. More useful, perhaps, is the fact that neither 2.4n² + 7 nor the linear function n does. That is, we have both

2.4n² + 7 ∈ O(n²)

and

n ∈ O(n²).
The first example shows us that we are now able to represent a function without all its bells and whistles; we can drop the 2.4 and the 7 and simply express the function as O(n²), which gives us just the information we need. The second shows us that O can be used to express loose limits as well: Any function that is better (that is, doesn't grow faster) than g can be found in O(g).
How does this relate to our original example? Well, the thing is, even though we can't be sure of the details (after all, they depend on both the Python version and the hardware you're using), we can describe the operations asymptotically: The running time of appending n numbers to a Python list is O(n), while inserting n numbers at its beginning is O(n²).
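To see this asymptotic difference in practice, here is a minimal timing sketch (added for illustration; it is not from the book, and the absolute numbers depend on your Python version and hardware). Doubling n should roughly double the time taken by the append version and roughly quadruple the time taken by the insert-at-front version:

from timeit import timeit

def append_n(n):
    # Append n numbers to an empty list: amortized constant time per append.
    nums = []
    for i in range(n):
        nums.append(i)

def insert_front_n(n):
    # Insert n numbers at the front: every insert shifts the whole list.
    nums = []
    for i in range(n):
        nums.insert(0, i)

for n in (1000, 2000, 4000):
    print(n,
          round(timeit(lambda: append_n(n), number=10), 4),
          round(timeit(lambda: insert_front_n(n), number=10), 4))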
The other two, Ω and Θ, are just variations of O. Ω is its complete opposite: A function f is in Ω(g) if it satisfies the following condition: There exists a natural number n₀ and a positive constant c such that

f(n) ≥ cg(n)

for all n ≥ n₀. So, where O forms a so-called asymptotic upper bound, Ω forms an asymptotic lower bound.
Note ■ Our first two asymptotic operators, O and Ω, are each other's inverses: If f is O(g), then g is Ω(f). Exercise 2-3 asks you to show this.
The sets formed by Θ are simply intersections of the other two, that is, Θ(g) = O(g) ∩ Ω(g). In other words, a function f is in Θ(g) if it satisfies the following condition: There exists a natural number n₀ and two positive constants c₁ and c₂ such that

c₁g(n) ≤ f(n) ≤ c₂g(n)

for all n ≥ n₀. This means that f and g have the same asymptotic growth. For example, 3n² + 2 is Θ(n²), but we could just as well write that n² is Θ(3n² + 2). By supplying an upper bound and a lower bound at the same time, the Θ operator is the most informative of the three, and I will use it when possible.
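As a quick check of this definition (a worked example added here, not part of the original text), one concrete choice of constants certifying that 3n² + 2 is Θ(n²) is c₁ = 3, c₂ = 4, and n₀ = 2, since for all n ≥ 2,

3n² ≤ 3n² + 2 ≤ 3n² + n² = 4n²,

that is, c₁n² ≤ 3n² + 2 ≤ c₂n².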
Figure 2-1. For values of n greater than n₀, T(n) is less than cn², so T(n) is O(n²)
Rules of the Road
While the definitions of the asymptotic operators can be a bit tough to use directly, they actually lead to some of the
simplest math ever. You can drop all multiplicative and additive constants, as well as all other “small parts” of your
function, which simplifies things a lot.
As a first step in juggling these asymptotic expressions, let’s take a look at some typical asymptotic classes, or
orders. Table 2-1 lists some of these, along with their names and some typical algorithms with these asymptotic
running times, also sometimes called running-time complexities. (If your math is a little rusty, you could take a look at
the sidebar “A Quick Math Refresher” later in the chapter.) An important feature of this table is that the complexities
have been ordered so that each row dominates the previous one: If f is found higher in the table than g, then f is O(g).[5]
Table 2-1. Common Examples of Asymptotic Running Times

Complexity   Name          Examples, Comments
Θ(1)         Constant      Hash table lookup and modification (see "Black Box" sidebar on dict).
Θ(lg n)      Logarithmic   Binary search (see Chapter 6). Logarithm base unimportant.[7]
Θ(n)         Linear        Iterating over a list.
Θ(n lg n)    Loglinear     Optimal sorting of arbitrary values (see Chapter 6). Same as Θ(lg n!).
Θ(n²)        Quadratic     Comparing n objects to each other (see Chapter 3).
Θ(n³)        Cubic         Floyd and Warshall's algorithms (see Chapters 8 and 9).
O(nᵏ)        Polynomial    k nested for loops over n (if k is a positive integer). For any constant k > 0.
Ω(kⁿ)        Exponential   Producing every subset of n items (k = 2; see Chapter 3). Any k > 1.
Θ(n!)        Factorial     Producing every ordering of n values.
Note ■ Actually, the relationship is even stricter: f is o(g), where the "Little Oh" is a stricter version of "Big Oh." Intuitively, instead of "doesn't grow faster than," it means "grows slower than." Formally, it states that f(n)/g(n) converges to zero as n grows to infinity. You don't really need to worry about this, though.
Any polynomial (that is, with any power k > 0, even a fractional one) dominates any logarithm (that is, with any base), and any exponential (with any base k > 1) dominates any polynomial (see Exercises 2-5 and 2-6). Actually, all logarithms are asymptotically equivalent; they differ only by constant factors (see Exercise 2-4). Polynomials and exponentials, however, have different asymptotic growth depending on their exponents or bases, respectively. So, n⁵ grows faster than n⁴, and 5ⁿ grows faster than 4ⁿ.
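A one-line justification of the claim about logarithms (added here for convenience, not in the original text): by the change-of-base identity, log_b n = log_a n / log_a b, so switching bases only multiplies the logarithm by the constant factor 1/log_a b.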
The table primarily uses Θ notation, but the terms polynomial and exponential are a bit special, because of the role they play in separating tractable ("solvable") problems from intractable ("unsolvable") ones, as discussed in Chapter 11. Basically, an algorithm with a polynomial running time is considered feasible, while an exponential one is generally useless. Although this isn't entirely true in practice (Θ(n¹⁰⁰) is no more practically useful than Θ(2ⁿ)), it is, in many cases, a useful distinction.[6] Because of this division, any running time in O(nᵏ), for any k > 0, is called polynomial, even though the limit may not be tight. For example, even though binary search (explained in the "Black Box" sidebar on bisect in Chapter 6) has a running time of Θ(lg n), it is still said to be a polynomial-time (or just polynomial) algorithm. Conversely, any running time in Ω(kⁿ), even one that is, say, Θ(n!), is said to be exponential.
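To get a feel for why Θ(n¹⁰⁰) is not practically useful either (a side illustration added here, not from the book), Python's arbitrary-precision integers make it easy to find the point at which 2ⁿ overtakes n¹⁰⁰ for good:

n = 2                      # skip n = 1, where 2**1 > 1**100 holds trivially
while 2**n <= n**100:
    n += 1
print(n)                   # the crossover point, a little below 1000

Beyond that point the exponential never falls behind again, yet both running times are already hopeless long before the crossover.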
[5] For the "Cubic" and "Polynomial" rows, this holds only when k ≥ 3.
[6] Interestingly, once a problem is shown to have a polynomial solution, an efficient polynomial solution can quite often be found as well.
[7] I'm using lg rather than log here, but either one is fine.
Now that we have an overview of some important orders of growth, we can formulate two simple rules:
• In a sum, only the dominating summand matters. For example, Θ(n² + n³ + 42) = Θ(n³).
• In a product, constant factors don't matter. For example, Θ(4.2n lg n) = Θ(n lg n).
In general, we try to keep the asymptotic expressions as simple as possible, eliminating as many unnecessary parts as we can. For O and Ω, there is a third principle we usually follow:

• Keep your upper or lower limits tight. In other words, we try to make the upper limits low and the lower limits high. For example, although n² might technically be O(n³), we usually prefer the tighter limit, O(n²). In most cases, though, the best thing is to simply use Θ.
A practice that can make asymptotic expressions even more useful is that of using them instead of actual values, in arithmetic expressions. Although this is technically incorrect (each asymptotic expression yields a set of functions, after all), it is quite common. For example, Θ(n²) + Θ(n³) simply means f + g, for some (unknown) functions f and g, where f is Θ(n²) and g is Θ(n³). Even though we cannot find the exact sum f + g, because we don't know the exact functions, we can find the asymptotic expression to cover it, as illustrated by the following two "bonus rules":

• Θ(f) + Θ(g) = Θ(f + g)
• Θ(f) · Θ(g) = Θ(f · g)
Exercise 2-8 asks you to show that these are correct.
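For instance (an added illustration, not from the book), combining these bonus rules with the simplification rules above gives Θ(n²) + Θ(n³) = Θ(n² + n³) = Θ(n³), and Θ(n) · Θ(lg n) = Θ(n lg n).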
Taking the Asymptotics for a Spin
Let’s take a look at some simple programs and see whether we can determine their asymptotic running times. To
begin with, let’s consider programs where the (asymptotic) running time varies only with the problem size, not the
specifics of the instance in question. (The next section deals with what happens if the actual contents of the instances
matter to the running time.) This means, for example, that if statements are rather irrelevant for now. What’s
important is loops, in addition to straightforward code blocks. Function calls don’t really complicate things;
just calculate the complexity for the call and insert it at the right place.
Note ■ There is one situation where function calls can trip us up: when the function is recursive. This case is dealt with
in Chapters 3 and 4.
The loop-free case is simple: we are executing one statement before another, so their complexities are added. Let's say, for example, that we know that for a list of size n, a call to append is Θ(1), while a call to insert at position 0 is Θ(n). Consider the following little two-line program fragment, where nums is a list of size n:
nums.append(1)
nums.insert(0,2)
We know that the first line takes constant time. By the time we get to the second line, the list size has changed and is now n + 1. This means that the complexity of the second line is Θ(n + 1), which is the same as Θ(n). Thus, the total running time is the sum of the two complexities, Θ(1) + Θ(n) = Θ(n).
Now, let's consider some simple loops. Here's a plain for loop over a sequence with n elements (numbers, say; for example, seq = range(n)):[8]
s = 0
for x in seq:
s += x
This is a straightforward implementation of what the sum function does: It iterates over seq and adds the elements
to the starting value in s. This performs a single constant-time operation (s += x) for each of the n elements of seq,
which means that its running time is linear, or Θ(n). Note that the constant-time initialization (s = 0) is dominated by
the loop here.
The same logic applies to the “camouflaged” loops we find in list (or set or dict) comprehensions and generator
expressions, for example. The following list comprehension also has a linear running-time complexity:
squares = [x**2 for x in seq]
Several built-in functions and methods also have “hidden” loops in them. This generally applies to any function
or method that deals with every element of a container, such as sum or map, for example.
Things get a little bit (but not a lot) trickier when we start nesting loops. Let’s say we want to sum up all possible
products of the elements in seq; here’s an example:
s = 0
for x in seq:
for y in seq:
s += x*y
One thing worth noting about this implementation is that each product will be added twice. If 42 and 333 are
both in seq, for example, we’ll add both 42*333 and 333*42. That doesn’t really affect the running time; it’s just a
constant factor.
What's the running time now? The basic rule is easy: The complexities of code blocks executed one after the other are just added. The complexities of nested loops are multiplied. The reasoning is simple: For each round of the outer loop, the inner one is executed in full. In this case, that means "linear times linear," which is quadratic. In other words, the running time is Θ(n·n) = Θ(n²). Actually, this multiplication rule means that for further levels of nesting, we will just increment the power (that is, the exponent). Three nested linear loops give us Θ(n³), four give us Θ(n⁴), and so forth.
The sequential and nested cases can be mixed, of course. Consider the following slight extension:
s = 0
for x in seq:
for y in seq:
s += x*y
for z in seq:
for w in seq:
s += x-w
[8] If the elements are ints, the running time of each += is constant. However, Python also supports big integers, or longs, which automatically appear when your integers get big enough. This means you can break the constant-time assumption by using really huge numbers. If you're using floats, that won't happen (but see the discussion of float problems near the end of the chapter).
It may not be entirely clear what we're computing here (I certainly have no idea), but we should still be able to find the running time, using our rules. The z-loop is run for a linear number of iterations, and it contains a linear loop, so the total complexity there is quadratic, or Θ(n²). The y-loop is clearly Θ(n). This means that the code block inside the x-loop is Θ(n + n²). This entire block is executed for each round of the x-loop, which is run n times. We use our multiplication rule and get Θ(n(n + n²)) = Θ(n² + n³) = Θ(n³), that is, cubic. We could arrive at this conclusion even more easily by noting that the y-loop is dominated by the z-loop and can be ignored, giving the inner block a quadratic running time. "Quadratic times linear" gives us cubic.
The loops need not all be repeated Θ(n) times, of course. Let's say we have two sequences, seq1 and seq2, where seq1 contains n elements and seq2 contains m elements. The following code will then have a running time of Θ(nm).
s = 0
for x in seq1:
for y in seq2:
s += x*y
In fact, the inner loop need not even be executed the same number of times for each iteration of the outer loop.
This is where things can get a bit fiddly. Instead of just multiplying two iteration counts, such as n and m in the
previous example, we now have to sum the iteration counts of the inner loop. What that means should be clear in the
following example:
seq1 = [[0, 1], [2], [3, 4, 5]]
s = 0
for seq2 in seq1:
for x in seq2:
s += x
The statement s += x is now performed 2 + 1 + 3 = 6 times. The length of seq2 gives us the running time of the
inner loop, but because it varies, we cannot simply multiply it by the iteration count of the outer loop. A more realistic
example is the following, which revisits our original example—multiplying every combination of elements from a
sequence:
s = 0
n = len(seq)
for i in range(n-1):
for j in range(i+1, n):
s += seq[i] * seq[j]
To avoid multiplying objects with themselves or adding the same product twice, the outer loop now avoids the last item, and the inner loop iterates over the items only after the one currently considered by the outer one. This is actually a lot less confusing than it might seem, but finding the complexity here requires a little bit more care. This is one of the important cases of counting that is covered in the next chapter.[9]

[9] Spoiler: The complexity of this example is still Θ(n²).
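As a preview of that counting argument (added here, not part of the original text), the statement s += seq[i] * seq[j] is executed (n − 1) + (n − 2) + ... + 1 = n(n − 1)/2 times, and n(n − 1)/2 = n²/2 − n/2 is Θ(n²), which confirms the spoiler in the footnote.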