Python Programming Guide: The Scientist's Tool of Choice

1.1 Scientific software

algebraic and analytic processes, and to integrate all of them with their numerical and

graphical properties. A disadvantage of all of these packages is the quirky syntax and

limited expressive ability of their command languages. Unlike the compiled languages,

it is often extremely difficult to programme a process which was not envisaged by the

package authors.

The best of the proprietary packages are very easy to use with extensive on-line help

and coherent documentation, which has not yet been matched by all of the open-source

alternatives. However, a major downside of the commercial packages is the extremely

high prices charged for their licences. Most of them offer a cut down “student version”

at reduced price (but usable only while the student is in full-time education) so as to

encourage familiarity with the package. This largesse is paid for by other users.

Let us summarize the position. On the one hand, we have the traditional compiled

languages for numerics which are very general, very fast, very difficult to learn and do

not interact readily with graphical or algebraic processes. On the other, we have standard

scientific packages which are good at integrating numerics, algebra and graphics, but are

slow and limited in scope.

What properties should an ideal scientific package have? A short list might contain:

1 a programming language which is both easy to understand and which has extensive

expressive ability,

2 integration of algebraic, numerical and graphical functions,

3 the ability to generate numerical algorithms running with speeds within an order of

magnitude of the fastest of those generated by compiled languages,

4 a user interface with adequate on-line help, and decent documentation,

5 an extensive range of textbooks from which the curious reader can develop greater

understanding of the concepts,

6 open-source software, freely available,

7 implementation on all standard platforms, e.g., Linux, Mac OS X, Unix, Windows.

The bad news is that no single package satisfies all of these criteria.

The major obstruction here is the requirement of algebraic capability. There are two

open-source packages, wx-Maxima and Reduce, with significant algebraic capabilities

worthy of consideration, but Reduce fails requirement 4 and both fail criteria 3 and 5.

They are however extremely powerful tools in the hands of experienced users. It seems

sensible therefore to drop the algebra requirement.

[Footnote: The author had initially intended to write a book covering both numerical and
algebraic applications for scientists. However, it turns out that, apart from simple
problems, the requirements and approaches are radically different, and so it seems more
appropriate to treat them differently.]

In 1991, Guido van Rossum created Python as an open-source, platform-independent,
general-purpose programming language. It is basically a very simple language surrounded
by an enormous library of add-on modules, including complete access to the underlying
operating system. This means that it can manage and manipulate programmes built from
other complete (even compiled) packages, i.e., it can act as a scripting language. This
versatility has ensured both its adoption by power users such as Google, and a real army
of developers. It means also that it can be a very powerful tool for the scientist. Of
course, there are other scripting languages, e.g., Java and Perl, but none has the
versatility or user-base to meet criteria 3–5 above.

Five years ago it would not have been possible to recommend Python for scientific work.
The size of the army of developers meant that there were several mutually incompatible
add-on packages for numerical and scientific applications. Fortunately, reason has
prevailed and there is now a single numerical add-on package, numpy, and a single
scientific one, scipy, around which the developers have united.

1.2 The plan of this book

The purpose of this intentionally short book is to show how easy it is for the working

scientist to implement and test non-trivial mathematical algorithms using Python. We

have quite deliberately preferred brevity and simplicity to encyclopaedic coverage in

order to get the inquisitive reader up and running as soon as possible. We aim to leave

the reader with a well-founded framework to handle many basic, and not so basic, tasks.

Obviously, most readers will need to dig further into techniques for their particular

research needs. But after reading this book, they should have a sound basis for this.

This chapter and Appendix A discuss how to set up a scientific Python environment.

While the original Python interpreter was pretty basic, its replacement IPython is so

easy to use, powerful and versatile that Chapter 2 is devoted to it.

We now describe the subsequent chapters. As each new feature is described, we try

to illustrate it first by essentially trivial examples and, where appropriate, by more

extended problems. This author cannot know the mathematical sophistication of potential

readers, but in later chapters we shall presume some familiarity with basic calculus,

e.g., the Taylor series in one dimension. However, for these extended problems we shall

sketch the background needed to understand them, and suitable references for further

reading will be given.

Chapter 3 gives a brief but reasonably comprehensive survey of those aspects of the

core Python language likely to be of most interest to scientists. Python is an object-

oriented language, which lends itself naturally to object-oriented programming (OOP),

which may well be unfamiliar to most scientists. We shall adopt an extremely light touch

to this topic, but need to point out that the container objects introduced in Section 3.5 do

not all have precise analogues in say C or Fortran. Again the brief introduction to Python

classes in Section 3.9 may be unfamiliar to users of those two families of languages.

The chapter concludes with two implementations of the sieve of Eratosthenes, which is a
classical problem: enumerate all of the prime numbers less than a given integer n. A
straightforward implementation takes 17 lines of code, but takes inordinately long
execution times once n > 10^[…]. However, a few minutes of thought and the use of already
described Python features suggest a shorter 13-line programme which runs 3000 times
faster and runs out of memory (on my laptop) once n > 10^[…]. The point of this exercise
is that choosing the right approach (and Python often offers so many) is the key to
success in Python numerics.

[Footnote: The restriction to integer arithmetic in this chapter is because our
exposition of Python has yet to deal with serious calculations involving real or complex
numbers efficiently.]
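To make the contrast between approaches concrete, here is a minimal sketch of two strategies for listing the primes below n: trial division against the primes found so far, and a sieve that strikes out multiples by slice assignment. These are illustrative only and are not the 17-line and 13-line programmes of Chapter 3. The function names `primes_trial_division` and `primes_sieve` are invented for this sketch, and any speed-up you observe will depend on n and on your machine.

```python
def primes_trial_division(n):
    """Return all primes below n by testing each candidate against earlier primes."""
    primes = []
    for candidate in range(2, n):
        # candidate is prime if no previously found prime divides it
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
    return primes


def primes_sieve(n):
    """Return all primes below n using the sieve of Eratosthenes."""
    if n < 3:
        return []
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # strike out i*i, i*i + i, ... with a single slice assignment
            is_prime[i * i::i] = [False] * len(range(i * i, n, i))
    return [i for i, flag in enumerate(is_prime) if flag]


if __name__ == "__main__":
    print(primes_sieve(30))
```

Both functions return the same list; the sieve simply avoids the repeated divisibility tests, which is where the naive version spends its time as n grows.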

Chapter 4 extends the core Python language via the add-on module numpy, to give a very
efficient treatment of real and complex numbers. In the background lurk C/C++ routines
which execute repetitive tasks with near-compiled-language speeds. The emphasis is on
using structures via vectorized code rather than the traditional for-loops or do-loops.
Vectorized code sounds formidable but, as we shall show, it is much easier to write than
the old-fashioned loop-based approach. Here too we discuss the input and output of data.
First, we look at how numpy can read and write text files, human-readable data and binary
data. Secondly, we look briefly at data analysis. We also summarize miscellaneous
functions and give a brief introduction to Python’s linear algebra capabilities. Finally,
we review even more briefly a further add-on module, scipy, which greatly extends the
scope of numpy.
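As a hedged illustration of what vectorized code means in practice, the sketch below computes the same sum of squares twice, once with an explicit loop and once with whole-array numpy operations. The array size and the helper names are arbitrary choices made for this example and are not taken from Chapter 4.

```python
import numpy as np


def sum_squares_loop(values):
    """Accumulate x*x element by element, in the traditional loop style."""
    total = 0.0
    for x in values:
        total += x * x
    return total


def sum_squares_vectorized(values):
    """Same computation expressed as whole-array operations."""
    return float(np.sum(values * values))


x = np.linspace(0.0, 1.0, 1001)   # 1001 equally spaced points on [0, 1]
print(sum_squares_loop(x), sum_squares_vectorized(x))
```

The vectorized version contains no explicit loop: the multiplication and the summation are each a single call that numpy carries out in compiled code.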

Chapter 5 gives an introduction to the add-on module matplotlib. This was inspired by the
striking graphics performance of the Matlab package and aspires to emulate or improve on
it for two-dimensional x, y-plots. Indeed, almost all of the figures in Chapters 5–9 were
produced using matplotlib. The original figures were produced in colour using the
relevant code snippets. The exigencies of book publishing have required conversion to
black, white and many shades of grey. After giving a range of examples to illustrate its
capabilities, we conclude the chapter with a slightly more extended example, a fully
functional 49-line code to compute and produce high-definition plots of Mandelbrot sets.
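The core of such a computation is short. What follows is a rough sketch of the escape-time iteration on a numpy grid, not the 49-line programme of Chapter 5. The function name `mandelbrot_counts`, the grid limits and the iteration cap are all assumptions chosen for illustration. Passing the returned array to matplotlib's `imshow` would turn it into a picture.

```python
import numpy as np


def mandelbrot_counts(x_min=-2.0, x_max=0.5, y_min=-1.25, y_max=1.25,
                      nx=200, ny=200, max_iter=50):
    """Return an (ny, nx) integer array of iteration counts for z -> z**2 + c."""
    x = np.linspace(x_min, x_max, nx)
    y = np.linspace(y_min, y_max, ny)
    c = x[np.newaxis, :] + 1j * y[:, np.newaxis]   # grid of complex parameters
    z = np.zeros_like(c)
    counts = np.zeros(c.shape, dtype=int)
    for _ in range(max_iter):
        active = np.abs(z) <= 2.0                  # points that have not escaped yet
        z[active] = z[active] ** 2 + c[active]
        counts[active] += 1
    return counts


counts = mandelbrot_counts()
print(counts.shape, counts.min(), counts.max())
```

Points whose count equals `max_iter` never left the disc of radius 2 and are treated as members of the set; smaller counts record how quickly a point escaped.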

The difficulties of extending the discussion to three-dimensional graphics, e.g.,
representations of the surface z = z(x, y), are discussed in Chapter 6. Some aspects of
this can be handled by the matplotlib module, but for more generality we need to invoke
the mayavi add-on module, which is given a brief introduction together with some example
codes. If the use of such graphics is a major interest for you, then you will need to
investigate these modules further.

If you already have some Python experience, you can of course omit parts of Chap-

ters 3 and 4. You are however encouraged strongly to try out the relevant code snippets.

Once you have understood them, you can deepen your understanding by modifying

them. These “hacking” experiments replace the exercises traditionally included in text-

books. The same applies to Chapters 5 and 6, which cover Python graphics and contain

more substantial snippets. If you already have an idea of a particular picture you would

like to create, then perusal of the examples given here and also those in the matplotlib

gallery (see Section 5.1) should produce a recipe for a close approximation which can

be “hacked” to provide a closer realization of the desired picture.

These first chapters cover the basic tools that Python provides to enhance the

scientist’s computer experience. How should we proceed further?

A notable omission is that apart from a brief discussion in Section 4.5, the vast subject

of data analysis will not be covered. There are three main reasons for this.

1 Recently an add-on module pandas has appeared. This uses numpy and matplotlib

to tackle precisely this issue. It comes with comprehensive documentation, which

is described in Section 4.5.

2 One of the authors of pandas has written a book, McKinney (2012), which reviews

IPython, numpy and matplotlib and goes on to treat pandas applications in great

detail.

3 I do not work in this area, and so would simply have to paraphrase the sources

above.

Instead, I have chosen to concentrate on the modelling activities of scientists. One
approach would be to target problems in bioinformatics or cosmology or crystallography or
engineering or epidemiology or financial mathematics or ... etc. Indeed, a whole series
of books with a common first half could be produced called “Python for Bioinformatics”
etc. A less profligate and potentially more useful approach would be to write a second
half applicable to all of these fields, and many more. I am relying here on the unity of
mathematics. Problems in one field when reduced to a core dimensionless form often look
like a similarly reduced problem from another field.

This property can be illustrated by the following example. In population dynamics we
might study a single species whose population N(T) depends on time T. Given a plentiful
food supply we might expect exponential growth, dN/dT = kN(T), where the growth constant
k has dimension 1/time. However, there are usually constraints limiting such growth. A
simple model to include these is the “logistic equation”

$$\frac{dN}{dT}(T) = k N(T)\,\bigl(N_0 - N(T)\bigr), \qquad (1.1)$$

which allows for a stable constant population N(T) = N_0. The biological background to
this equation is discussed in many textbooks, e.g., Murray (2002).

In (homogeneous spherically symmetric) cosmology, the density parameter Ω depends on the
scale factor a via

$$\frac{d\Omega}{da} = \frac{(1 + 3w)}{a}\,\Omega(1 - \Omega), \qquad (1.2)$$

where w is usually taken to be a constant.

Now mathematical biology and cosmology do not have a great deal in common, but it is easy
to see that (1.1) and (1.2) represent the same equation. Suppose we scale the independent
variable T in (1.1) by t = kN_0 T, where N_0 is the stable population in (1.1), which
renders the new time coordinate t dimensionless. Similarly, we introduce the
dimensionless variable x = N/N_0 so that (1.1) becomes the logistic equation

$$\frac{dx}{dt} = x(1 - x). \qquad (1.3)$$

In a general relativistic theory, there is no reason to prefer any one time coordinate to
any other. Thus we may choose a new time coordinate t via a = e^{t/(1+3w)}, and then
setting x = Ω, we see that (1.2) also reduces to (1.3). Thus the same equations can arise
in a number of different fields.
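The reduction can also be checked numerically. The sketch below integrates the dimensional population equation dN/dT = kN(N_0 − N) and the dimensionless equation dx/dt = x(1 − x) with a hand-written classical Runge–Kutta step, and confirms that N(T)/N_0 agrees with x(t) at t = kN_0 T. The parameter values (k, N_0 and the initial population) and the helper name `rk4` are assumptions picked for this example only. The closed form used for comparison, x(t) = x_0/[x_0 + (1 − x_0)e^{−t}], is the standard solution of dx/dt = x(1 − x) with x(0) = x_0.

```python
import math


def rk4(f, y0, t_end, steps):
    """Integrate dy/dt = f(y) from t = 0 to t_end with the classical RK4 scheme."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


# Illustrative (assumed) parameters: growth constant, stable population, initial population.
k, N0, N_init = 0.002, 500.0, 50.0
T_end = 3.0                      # dimensional time
t_end = k * N0 * T_end           # corresponding dimensionless time, t = k*N0*T

# Dimensional form: dN/dT = k N (N0 - N)
N_final = rk4(lambda N: k * N * (N0 - N), N_init, T_end, 1000)

# Dimensionless form: dx/dt = x (1 - x), with x = N/N0
x0 = N_init / N0
x_final = rk4(lambda x: x * (1.0 - x), x0, t_end, 1000)

# Closed-form solution of the dimensionless problem
x_exact = x0 / (x0 + (1.0 - x0) * math.exp(-t_end))
print(N_final / N0, x_final, x_exact)
```

The three printed numbers should agree to many digits, which is the practical content of the scaling argument: one code snippet for the minimal equation serves every field that reduces to it.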

In Chapters 7–9, we have, for brevity and simplicity, used minimal equations such as
(1.3). If the minimal form for your problem looks something like the one being treated in
a code snippet, you can of course hack the snippet to handle the original long form for
your problem.

[Footnote: This example was chosen as a pedagogic example. If the initial value
x(0) = x_0 is specified, then the exact solution is x(t) = x_0/[x_0 + (1 − x_0)e^{−t}].
In the current context, x_0 ≥ 0. If x_0 > 0 and x_0 ≠ 1, then all solutions tend
monotonically towards the constant solution x = 1 as t increases. See also Section 7.5.3.]

Chapter 7 looks at four types of problems involving ordinary differential equations. We
start with a very brief introduction to techniques for solving initial value problems and
then look at a number of examples, including two classic non-linear problems, the van der
Pol oscillator and the Lorenz equations. Next we survey two-point boundary value problems
and examine both a linear Sturm–Liouville eigenvalue problem, and an exercise in
continuation for the non-linear Bratu problem. Problems involving delay differential
equations arise frequently in control theory and in mathematical biology, e.g., the
logistic and Mackey–Glass equations, and a discussion of their numerical solution is
given in the next section. Finally in this chapter we look briefly at stochastic calculus
and stochastic ordinary differential equations. In particular, we consider a simple
example closely linked to the Black–Scholes equation of financial mathematics.

There are two other major Python topics relevant to scientists that I would like to
introduce here. The first is the incorporation of code written in other languages. There
are two aspects of this: (a) the reuse of pre-existing legacy code, usually written in
Fortran, (b) if one’s code is being slowed down seriously by a few Python functions, as
revealed by the profiler, see Section 2.6, how do we recode the offending functions in
Fortran or C? The second topic is how can a scientific user make worthwhile use of the
object-oriented programming (OOP) features of Python?

Chapter 8 addresses the first topic via an extended example. We look first at how
pseudospectral methods can be used to attack a large number of evolution problems
governed by partial differential equations, either initial value or initial-boundary
value problems. For the sake of brevity, we look only at problems with one time and one
spatial dimension. Here, as we explain, problems with periodic spatial dependence can be
handled very efficiently using Fourier methods, but for problems which are more general,
the use of Chebyshev transforms is desirable. However, in this case there is no
satisfactory Python black box available. It turns out that the necessary tools have
already been written in legacy Fortran77 code. These are listed in Appendix B, and we
show how, with an absolutely minimal knowledge of Fortran77, we can construct extremely
fast Python functions to accomplish the required tasks. Our approach relies on the numpy
f2py tool which is included in all of the recommended Python distributions. If you are
interested in possibly reusing pre-existing legacy code, it is worthwhile studying this
chapter even if the specific example treated there is not the task that you have in mind.
See also Section 1.3 for other uses for f2py.

One of the most useful features of object-oriented programming (OOP) from the point of
view of the scientist is the concept of classes. Classes exist in C++ (but not C) and
Fortran90 and later (but not Fortran77). However, both implementations are complicated
and so are usually shunned by novice programmers. In contrast, Python’s implementation is
much simpler and more user-friendly, at the cost of omitting some of the more arcane
features of other language implementations. We give a very brief introduction to
