Preface
Chapter 4, Text Processing Using the Standard Library: The standard Python distribution
includes a powerful set of built-in libraries designed to manage textual content. We look
at configuration file reading and manipulation, CSV files, and JSON data. We take a bit of a
detour at the end of this chapter to learn how to create your own redistributable Python egg
files.
Chapter 5, Regular Expressions: Looks at Python's regular expression implementation and
teaches you how to use it. We look at standardized concepts as well as Python's
extensions. We'll break down a few expressions graphically so that the component parts are
easy to piece together. You'll also learn how to safely use regular expressions with
international alphabets.
Chapter 6, Structured Markup: Introduces you to XML and HTML processing. We create an
adventure game using both SAX and DOM approaches. We also look briefly at lxml and
ElementTree. HTML parsing is also covered.
Chapter 7, Creating Templates: Using the Mako template language, we'll generate e-mail
and HTML text templates much like the ones that you'll encounter within common web
frameworks. We visit template creation, inheritance, filters, and custom tag creation.
Chapter 8, Understanding Encodings and i18n: We provide a look into character encoding
schemes and how they work. For reference, we'll examine ASCII as well as KOI8-R. We also
look into Unicode and its various encoding mechanisms. Finally, we finish up with a quick
look at application internationalization.
Chapter 9, Advanced Output Formats: Provides information on how to generate PDF, Excel,
and OpenDocument data. We'll build these document types from scratch using direct Python
API calls that rely on third-party libraries.
Chapter 10, Advanced Parsing and Grammars: A look at more advanced text manipulation
techniques such as those used by programming language designers. We'll use the PyParsing
library to handle some configuration file management and look into the Python Natural
Language Toolkit.
Chapter 11, Searching and Indexing: A practical look at full-text searching and the benefit an
index can provide. We'll use the Nucular system to index a collection of small text files and
make them quickly searchable.
Appendix A, Looking for Additional Resources: Introduces you to places of interest on the
Internet and some community resources. In this appendix, you will learn to create your own
documentation and to use Java Lucene-based engines. You will also learn about the differences
between Python 2 and Python 3 and how to port code to Python 3.