Advanced Python Programming: Improving Skills and Innovative Practice

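Pro Python, 2nd Edition is a guide to Python's advanced features and practices. It aims to help readers who already know the basics improve their skills and explore complex concepts that are usually left to experimentation. The book focuses not only on writing code but also on improving programming technique and approach, so that readers become better Python developers. The author explains Python's core principles and philosophy in detail and guides readers toward understanding and applying advanced fundamentals such as function design, object-oriented programming (including designing and using classes), and common design patterns. The chapters also cover string handling, documentation, testing, and modularization and distribution, all of which matter for code quality and maintainability. The chapter "Sheet: A CSV Framework" presents a practical framework for working with CSV data that readers can apply in real projects. The appendices include the Python style guide, voting guidelines, the Zen of Python, docstring conventions, and material on backward compatibility and the evolution of the language, which helps readers keep up with Python's development. The author stresses that although this is a second edition, it does not repeat beginner material; it builds on the first edition with new content for readers who want more technical depth. Readers learn to write more efficient and inventive code and gain a better understanding of the Python community. The book suits programmers who have some Python experience and want to advance, and it offers case studies and practical techniques for building professional Python skills.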

CHAPTER 1 ■ PRINCIPLES AND PHILOSOPHY

In the Face of Ambiguity, Refuse the Temptation to Guess

Sometimes, when using or implementing interfaces between pieces of code written by different people, certain

aspects may not always be clear. For example, one common practice is to pass around byte strings without any

information about what encoding they rely on. This means that if any code needs to convert those strings to Unicode

or ensure that they use a specific encoding, there’s not enough information available to do so.

It’s tempting to play the odds in this situation, blindly picking what seems to be the most common encoding.

Surely it would handle most cases, and that should be enough for any real-world application. Alas, no. Encoding

problems raise exceptions in Python, so those could either take down the application or they could be caught and

ignored, which could inadvertently cause other parts of the application to think strings were properly converted when

they actually weren’t.

Worse yet, your application now relies on a guess. It’s an educated guess, of course, perhaps with the odds on

your side, but real life has a nasty habit of flying in the face of probability. You might well find that what you assumed

to be most common is in fact less likely when given real data from real people. Not only could incorrect encodings

cause problems with your application, those problems could occur far more frequently than you realize.

A better approach would be to only accept Unicode strings, which can then be written to byte strings using

whatever encoding your application chooses. That removes all ambiguity, so your code doesn’t have to guess anymore.

Of course, if your application doesn’t need to deal with Unicode and can simply pass byte strings through unconverted,

it should accept byte strings only, rather than you having to guess an encoding to use to produce byte strings.
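
The approach above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `to_wire` and the UTF-8 default are hypothetical and not taken from the book. The idea is that the function accepts Unicode text only and encodes it at a single, explicit boundary.

```python
def to_wire(text, encoding="utf-8"):
    """Accept only Unicode text (str) and encode it at one explicit boundary.

    The name to_wire and the UTF-8 default are illustrative assumptions.
    """
    if not isinstance(text, str):
        # Refuse to guess which encoding incoming bytes might use.
        raise TypeError("expected str, got " + type(text).__name__)
    return text.encode(encoding)


encoded = to_wire("caf\u00e9")  # the caller's text, encoded by our choice
```

Passing a byte string such as `b"caf\xe9"` raises `TypeError` immediately, so the ambiguity is reported where it enters the code instead of surfacing later as a decoding problem.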

There Should Be One—and Preferably Only One—Obvious Way to Do It

Although similar to the previous principle, this one is generally applied only to development of libraries and

frameworks. When designing a module, class, or function, it may be tempting to implement a number of entry points,

each accounting for a slightly different scenario. In the byte string example from the previous section, for example,

you might consider having one function to handle byte strings and another to handle Unicode strings.

The problem with that approach is that every interface adds a burden on developers who have to use it. Not only

are there more things to remember, but it may not always be clear which function to use even when all the options are

known. Choosing the right option often comes down to little more than naming, which can sometimes be a guess.

In the previous example, the simple solution is to accept only Unicode strings, which neatly avoids other

problems, but for this principle, the recommendation is broader. Stick to simpler, more common interfaces, such as

the protocols illustrated in Chapter 5, where you can, adding on only when you have a truly different task to perform.

You might have noticed that Python seems to violate this rule sometimes, most notably in its dictionary

implementation. The preferred way to access a value is to use the bracket syntax, my_dict['key'], but dictionaries

also have a get() method, which seems to do the exact same thing. Conflicts like this come up fairly frequently when

dealing with such an extensive set of principles, but there are often good reasons if you’re willing to consider them.

In the dictionary case, it comes back to the notion of raising an exception when a rule is violated. When thinking

about violations of a rule, we have to examine the rules implied by these two available access methods. The bracket

syntax follows a very basic rule: return the value referenced by the key provided. It’s really that simple. Anything that

gets in the way of that, such as an invalid key, a missing value, or some additional behavior provided by an overridden

protocol, results in an exception being raised.

The get() method, by contrast, follows a more complicated set of rules. It checks to see whether the provided key

is present in the dictionary; if it is, the associated value is returned. If the key isn’t in the dictionary, an alternate value

is returned instead. By default, the alternate value is None, but that can be overridden by providing a second argument.

By laying out the rules each technique follows, it becomes clearer why there are two different options. Bracket

syntax is the common use case, failing loudly in all but the most optimistic situations, while get() offers more

flexibility for those situations that need it. One refuses to allow errors to pass silently, while the other explicitly silences

them. Essentially, providing two options allows dictionaries to satisfy both principles.

More to the point, though, is that the philosophy states there should only be one obvious way to do it. Even in the

dictionary example, which has two ways to get values, only one—the bracket syntax—is obvious. The get() method

is available, but it isn’t very well known, and it certainly isn’t promoted as the primary interface for working with

dictionaries. It’s okay to provide multiple ways to do something as long as they’re for sufficiently different use cases,

and the most common use case is presented as the obvious choice.
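
The two access rules described in this section can be compared side by side. The dictionary contents and variable names below are made up for illustration.

```python
settings = {"debug": True}

# Bracket syntax: return the value for the key, or fail loudly.
try:
    settings["verbose"]
except KeyError as exc:
    missing_key = exc.args[0]

# get(): the caller explicitly asks for a fallback instead of an error.
fallback_none = settings.get("verbose")          # default alternate value
fallback_false = settings.get("verbose", False)  # alternate value supplied
```

In both `get()` calls the missing key is silenced on purpose, which is exactly the distinction the principle allows for.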

Although That Way May Not Be Obvious at First Unless You’re Dutch

This is a nod to the homeland of Python’s creator and Benevolent Dictator for Life, Guido van Rossum. More

importantly, however, it’s an acknowledgment that not everyone sees things the same way. What seems obvious to

one person might seem completely foreign to somebody else, and though there are any number of reasons for those

types of differences, none of them are wrong. Different people are different, and that’s all there is to it.

The easiest way to overcome these differences is to properly document your work, so that even if the code isn’t

obvious, your documentation can point the way. You might still need to answer questions beyond the documentation,

so it’s often useful to have a more direct line of communication with users, such as a mailing list. The ultimate goal is

to give users an easy way to know how you intend them to use your code.

Now Is Better Than Never

We’ve all heard the saying, “Don’t put off ’til tomorrow what you can do today.” That’s a valid lesson for all of us, but it

happens to be especially true in programming. By the time we get around to something we’ve set aside, we might have

long since forgotten the information we need to do it right. The best time to do it is when it’s on our mind.

Okay, so that part was obvious, but as Python programmers, this antiprocrastination clause has special meaning

for us. Python as a language is designed in large part to help you spend your time solving real problems rather than

fighting with the language just to get the program to work.

This focus lends itself well to iterative development, allowing you to quickly rough out a basic implementation

and then refine it over time. In essence, it’s another application of this principle because it allows you to get working

quickly rather than trying to plan everything out in advance, possibly never actually writing any code.

Although Never Is Often Better Than Right Now

Even iterative development takes time. It’s valuable to get started quickly, but it can be very dangerous to try to

finish immediately. Taking the time to refine and clarify an idea is essential to get it right, and failing to do so usually

produces code that could be described as—at best—mediocre. Users and other developers will generally be better off

not having your work at all than having something substandard.

We have no way of knowing how many otherwise useful projects never see the light of day because of this notion.

Whether in that case or in the case of a poorly made release, the result is essentially the same: people looking for a

solution to the same problem you tried to tackle won’t have a viable option to use. The only way to really help anyone

is to take the time required to get it right.

If the Implementation Is Hard to Explain, It’s a Bad Idea

This is something of a combination of two other rules already mentioned: simple is better than complex, and complex

is better than complicated. The interesting thing about the combination here is that it provides a way to identify when

you’ve crossed the line from simple to complex or from complex to complicated. When in doubt, run it by someone

else and see how much effort it takes to get them on board with your implementation.

This also reinforces the importance of communication to good development. In open source development, like

that of Python, communication is an obvious part of the process, but it’s not limited to publicly contributed projects.

Any development team can provide greater value if its members talk to each other, bounce ideas around, and help

refine implementations. One-person development teams can sometimes prosper, but they’re missing out on crucial

editing that can only be provided by others.

If the Implementation Is Easy to Explain, It May Be a Good Idea

At a glance, this seems to be just an obvious extension of the previous principle, simply swapping “hard” and “bad”

for “easy” and “good.” Closer examination reveals that adjectives aren’t the only things that changed. A verb changes

its form as well: “is” became “may be.” That may seem like a subtle, inconsequential change, but it’s actually quite

important.

Although Python highly values simplicity, many very bad ideas are easy to explain. Being able to communicate your

ideas to your peers is valuable but only as a first step that leads to real discussion. The best thing about peer review is the

ability for different points of view to clarify and refine ideas, turning something good into something great.

Of course, that’s not to discount the abilities of individual programmers. One person can do amazing things all

alone, there’s no doubt about it. But most useful projects involve other people at some point or another, even if only

your users. Once those other people are in the know, even if they don’t have access to your code, be prepared to

accept their feedback and criticism. Even though you may think your ideas are great, other perspectives often bring

new insight into old problems, which only serves to make it a better product overall.

Namespaces Are One Honking Great Idea—Let’s Do More of Those!

In Python, namespaces are used in a variety of ways—from package and module hierarchies to object attributes—to

allow programmers to choose the names of functions and variables without fear of conflicting with the choices of

others. Namespaces avoid collisions without requiring every name to include some kind of unique prefix, which

would otherwise be necessary.

For the most part, you can take advantage of Python’s namespace handling without really doing anything special.

If you add attributes or methods to an object, Python will take care of the namespace for that. If you add functions or

classes to a module, or a module to a package, Python takes care of it. But there are a few decisions you can make to

explicitly take advantage of better namespaces.

One common example is wrapping module-level functions into classes. This creates a bit of a hierarchy, allowing

similarly named functions to coexist peacefully. It also has the benefit of allowing those classes to be customized

using arguments, which can then affect the behavior of the individual methods. Otherwise, your code might have to

rely on module-level settings that are modified by module-level functions, restricting how flexible it can be.

Not all sets of functions need to be wrapped up into classes, however. Remember that flat is better than nested, so as

long as there are no conflicts or confusion, it’s usually best to leave those at the module level. Similarly, if you don’t have a

number of modules with similar functionality and overlapping names, there’s little point in splitting them up into a package.
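
As a rough sketch of wrapping module-level functions into a class, consider the following. The `Formatter` class and its methods are hypothetical names chosen for this example. The constructor argument replaces what would otherwise be a module-level setting.

```python
class Formatter:
    """Groups related functions and carries the setting they share."""

    def __init__(self, separator=", "):
        # Per-instance configuration instead of a module-level global.
        self.separator = separator

    def join(self, items):
        return self.separator.join(str(item) for item in items)

    def pair(self, key, value):
        return self.join([key, value])


# Two differently configured instances can coexist in one program.
comma_style = Formatter(separator=",")
tab_style = Formatter(separator="\t")
```

With a module-level `separator` variable, only one style could be active at a time; here each instance keeps its own.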

Don’t Repeat Yourself

Designing frameworks can be a very complicated process; programmers are often expected to specify a variety of

different types of information. Sometimes, however, the same information might need to be supplied to multiple

different parts of the framework. How often this happens depends on the nature of the framework involved, but

having to provide the same information multiple times is always a burden and should be avoided wherever possible.

Essentially, the goal is to ask your users to provide configurations and other information just once and then use

Python’s introspection tools, described in detail in later chapters, to extract that information and reuse it in the other

areas that need it. Once that information has been provided, the programmer’s intentions are explicitly clear, so

there’s still no guesswork involved at all.

It’s also important to note that this isn’t limited to your own application. If your code relies on the Django web

framework, for instance, you have access to all the configuration information required to work with Django, which is

often quite extensive. You might only need to ask your users to point out which part of their code to use and access its

structure to get anything else you need.
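
One possible way to ask for information only once is to let a class definition serve as the single source and read it back through introspection. The sketch below uses the standard `dataclasses` module; the `Person` class and the `header` and `to_row` helpers are assumptions made for this example, not code from the book.

```python
from dataclasses import astuple, dataclass, fields


@dataclass
class Person:
    # The field names are declared here, once.
    name: str
    age: int


def header(row_type):
    """Derive column names from the class instead of asking for them again."""
    return [field.name for field in fields(row_type)]


def to_row(record):
    """Convert a record to a list of strings in declaration order."""
    return [str(value) for value in astuple(record)]
```

Adding a field to `Person` updates both the header and the rows without any second declaration.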

In addition to configuration details, code can be copied from one function to another if they share some common

behaviors. In accordance with this principle, it’s often better to move that common code out into a separate utility

function. Then, each function that needs that code can defer to the utility function, paving the way for future functions

that need that same behavior.

This type of code factoring showcases some of the more pragmatic reasons to avoid repetition. The obvious

advantage to reusable code is that it reduces the number of places where bugs can occur. Better yet, when you find a

bug, you can fix it in one place, rather than worry about finding all the places that same bug might crop up. Perhaps

best of all, having the code isolated in a separate function makes it much easier to test programmatically, to help

reduce the likelihood of bugs occurring in the first place. Testing is covered in detail in Chapter 9.

Don’t Repeat Yourself (DRY) is also one of the most commonly abbreviated principles, given that its initials spell

a word so clearly. Interestingly, though, it can actually be used in a few different ways, depending on context.

• An adjective: “Wow, this feels very DRY!”
• A noun: “This code violates DRY.”
• A verb: “Let’s DRY this up a bit, shall we?”

Loose Coupling

Larger libraries and frameworks often have to split their code into separate subsystems with different responsibilities.

This is typically advantageous from a maintenance perspective, with each section containing a substantially different

aspect of the code. The concern here is about how much each section has to know about the others because it can

negatively affect the maintainability of the code.

It’s not about having each subsystem completely ignorant of the others, nor is it to avoid them ever interacting at

all. Any application written to be that separated wouldn’t be able to actually do anything of interest. Code that doesn’t

talk to other code just can’t be useful. Instead, it’s more about how much each subsystem relies on how the other

subsystems work.

In a way, you can look at each subsystem as its own complete system, with its own interface to implement. Each

subsystem can then call into the other ones, supplying only the information pertinent to the function being called and

getting the result, all without relying on what the other subsystem does inside that function.

There are a few good reasons for this behavior, the most obvious being that it helps make the code easier to

maintain. If each subsystem only needs to know how its own functions work, changes to those functions should be

localized enough to not cause problems with other subsystems that access them. You’re able to maintain a finite

collection of publicly reliable interfaces while allowing everything else to change as necessary over time.
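
The idea can be illustrated with two toy subsystems. The class and method names are invented for this sketch. `Checkout` depends only on the public `in_stock()` method, so the way `Inventory` stores its data can change freely.

```python
class Inventory:
    """One subsystem; in_stock() is its only public interface."""

    def __init__(self, counts):
        self._counts = dict(counts)  # internal detail, free to change

    def in_stock(self, item):
        return self._counts.get(item, 0) > 0


class Checkout:
    """Another subsystem; it relies only on Inventory.in_stock()."""

    def __init__(self, inventory):
        self._inventory = inventory

    def can_buy(self, item):
        # Supplies only the pertinent information and uses only the result.
        return self._inventory.in_stock(item)


checkout = Checkout(Inventory({"apple": 3, "pear": 0}))
```

Swapping the dictionary inside `Inventory` for a database lookup would not require any change to `Checkout`.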

Another potential advantage of loose coupling is how much easier it is to split off a subsystem into its own full

application, which can then be included in other applications later on. Better yet, applications created like this can

often be released to the development community at large, allowing others to utilize your work or even expand on it if

you choose to accept patches from outside sources.

The Samurai Principle

As I stated in the opening to this chapter, the samurai warriors of ancient Japan were known for following the code of

Bushido, which governed most of their actions in wartime. One particularly well-known aspect of Bushido was that

warriors should return from battle victorious or not at all. The parallel in programming, as may be indicated by the

keyword return, is the behavior of functions in the event that any exceptions are encountered along the way.

It’s not a unique concept among those listed in this chapter but, rather, an extension of the notion that errors

should never pass silently and should avoid ambiguity. If something goes wrong while executing a function that

ordinarily returns a value, any return value could be misconstrued as a successful call, rather than identifying that an

error occurred. The exact nature of what occurred is very ambiguous and may produce errors down the road, in code

that’s unrelated to what really went wrong.

Of course, functions that don’t return anything interesting don’t have a problem with ambiguity because nothing

is relying on the return value. Far from being free to return without raising exceptions, those functions are

actually the ones that are most in need of exceptions. After all, if there’s no code that can validate the return value,

there’s no way of knowing that anything went wrong.
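
A minimal sketch of the principle, using a hypothetical `parse_port` function: it either returns a usable value or raises, and never returns a placeholder such as `None` or `-1` that a caller could mistake for success.

```python
def parse_port(text):
    """Return a valid TCP port number or raise ValueError."""
    port = int(text)  # int() itself raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError("port out of range: " + str(port))
    return port
```

Any caller that receives a value from `parse_port` can use it without further checks, because every failure takes the exception path.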

The Pareto Principle

In 1906, Italian economist Vilfredo Pareto noted that 80 percent of the wealth in Italy was held by just 20 percent of its

citizens. Since then, this idea has been put to the test in a number of fields beyond economics, and similar patterns

have been found. The exact percentages may vary, but the general observation has emerged over time: the vast

majority of effects in many systems are a result of just a small number of the causes.

In programming, this principle can manifest itself in a number of different ways. One of the more common is with

regard to early optimization. Donald Knuth, the noted computer scientist, once said that premature optimization is

the root of all evil, and many people take that to mean that optimization should be avoided until all other aspects of

the code have been finished.

Knuth was referring to a focus solely on performance too early in the process. It’s useless to try to tweak every

ounce of speed out of a program until you’ve verified that it even does what it’s supposed to. The Pareto Principle

teaches us that a little bit of work at the outset can have a large impact on performance.

Striking that balance can be difficult, but there are a few easy things that can be done while designing a program,

which can handle the bulk of the performance problems with little effort. Some such techniques are listed throughout

the remainder of this book, under sidebars labeled Optimization.

Another application of the Pareto Principle involves prioritization of features in a complex application or

framework. Rather than trying to build everything all at once, it’s often better to start with the minority of features that

will provide the most benefit to your users. Doing so allows you to get started on the core focus of the application and

get it out to the people who need to use it, while you can refine additional features based on feedback.

The Robustness Principle

During early development of the Internet, it was evident that many of the protocols being designed would have to

be implemented by countless different programs and that they’d all have to work together in order to be productive.

Getting the specifications right was important, but getting people to implement them interoperably was even more

important.

In 1980, the Transmission Control Protocol (TCP) was updated with RFC 761, which included what has become

one of the most significant guidelines in protocol design: be conservative in what you do; be liberal in what you accept

from others. It was called “a general principle of robustness,” but it’s also been referred to as Postel’s Law, after its

author, Jon Postel.

It’s easy to see how this principle would be useful when guiding the implementations of protocols designed for

the Internet. Essentially, programs that follow this principle will be able to work much more reliably with programs

that don’t. By sticking to the rules when generating output, that output is more likely to be understood by software

that doesn’t necessarily follow the specification completely. Likewise, if you allow for some variations in the incoming

data, incorrect implementations can still send you data you can understand.
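
Applied to a simple "Name: value" header line, the principle might look like the sketch below. Both function names and the chosen canonical form are assumptions for illustration, not part of any specification. The parser tolerates stray whitespace and any letter case, while the formatter always emits one strict form.

```python
def parse_header(line):
    """Liberal input: tolerate extra whitespace and any case in the name."""
    name, separator, value = line.partition(":")
    if not separator:
        # Some input cannot be understood at all; report it rather than guess.
        raise ValueError("not a header line: " + repr(line))
    return name.strip().lower(), value.strip()


def format_header(name, value):
    """Conservative output: always produce one canonical form."""
    canonical = "-".join(part.capitalize() for part in name.split("-"))
    return canonical + ": " + value
```

Round-tripping a sloppy line through `parse_header` and then `format_header` yields a clean one, which is the behavior the principle asks of each participant.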

http://propython.com/rfc-761
