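ISO 32000-1:2008: the international PDF file standard and its terms of use
ISO 32000-1:2008, formally titled "Document management -- Portable document format -- Part 1: PDF 1.7", is one of the standard specifications for PDF (Portable Document Format). It is based on the PDF 1.7 specification developed by Adobe Systems Incorporated, and its first edition was published on 1 July 2008. ISO 32000-1 aims to define an internationally shared format for PDF files, ensuring compatibility across platforms, devices, and operating systems so that electronic documents can be created, read, and printed accurately in any environment.
As the official PDF standard, it specifies the structure of PDF files, content representation, metadata handling, security, and interactive features. The published copy of the standard also carries a notice about its own embedded fonts: the file may be viewed or printed, but it may not be edited unless the embedded typefaces are licensed to and installed on the editing computer, in keeping with Adobe's licensing policy. The same notice states that the parameters used to create that PDF copy were optimized for printing.
On copyright: this document is derived from the ISO 32000-1 standard document available from the ISO website (<http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=51502>) and is also provided on Adobe's website. The publisher stresses that, by downloading the ISO 32000-1:2008 files, users accept responsibility for not infringing Adobe's licensing policy.
ISO 32000-1:2008 is both a reference for creating and validating PDF files and an important technical specification for organizations and individuals managing digital documents. It plays a key role in keeping electronic documents consistent and interoperable, and in making work more efficient and data exchange more reliable.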
PDF 32000-1:2008
8 © Adobe Systems Incorporated 2008 – All rights reserved
4.23
font
identified collection of graphics that may be
glyphs or other graphic elements [ISO 15930-4]
4.24
function
a special type of object that represents parameterized classes, including
mathematical formulas and sampled
representations with arbitrary resolution
4.25
glyph
recognizable abstract graphic symbol that is independent of any specific design [ISO/IEC 9541-1]
4.26
graphic state
the top of a push down stack of the graphics control parameters that define the current global framework within
which the graphics operators execute
4.27
ICC profile
colour profile conforming to the ICC specification [ISO 15076-1:2005]
4.28
indirect object
an object that is labeled with a positive integer object number followed by a non-negative integer generation
number followed by obj and having endobj after it
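The definition of an indirect object can be illustrated with a small, deliberately naive sketch. The function name `parse_indirect_object` and the regular expression are assumptions made for illustration; a real parser must also cope with strings and streams that happen to contain the word endobj.

```python
import re

# Hypothetical helper: "<object number> <generation number> obj ... endobj".
# The pattern works on raw bytes because a PDF file is a sequence of bytes.
INDIRECT_OBJECT = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj\b", re.DOTALL)


def parse_indirect_object(data: bytes):
    """Return (object number, generation number, body) or None."""
    match = INDIRECT_OBJECT.search(data)
    if match is None:
        return None
    object_number = int(match.group(1))
    generation_number = int(match.group(2))
    # Per the definition, the object number is a positive integer.
    if object_number <= 0:
        return None
    return object_number, generation_number, match.group(3).strip()
```

For instance, `parse_indirect_object(b"12 0 obj\n(Brillig)\nendobj")` yields the object number 12, the generation number 0, and the body `b"(Brillig)"`.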
4.29
integer object
mathematical integers with an implementation specified interval centered at 0 and written as one or more
decimal digits optionally preceded by a sign
4.30
name object
an atomic symbol uniquely defined by a sequence of characters introduced by a SOLIDUS (/), (2Fh) but the
SOLIDUS is not considered to be part of the name
4.31
name tree
similar to a dictionary that associates keys and values but the keys in a name tree are strings and are ordered
4.32
null object
a single object of type null, denoted by the keyword null, and having a type and value that are unequal to those
of any other object
4.33
number tree
similar to a dictionary that associates keys and values but the keys in a number tree are integers and are
ordered
4.34
numeric object
either an integer object or a real object
4.35
object
a basic data structure from which PDF files are constructed and includes these types: array, Boolean,
dictionary, integer, name, null, real, stream and string
4.36
object reference
an object value used to allow one object to refer to another; that has the form “<n> <m> R” where <n> is an
indirect object number, <m> is its version number and R is the uppercase letter R
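As a rough illustration of the “<n> <m> R” form, the following sketch parses a single reference token. The names `ObjectReference` and `parse_reference` are hypothetical, and the code assumes that the "version number" in this definition is the generation number from the indirect object definition (4.28).

```python
import re
from typing import NamedTuple, Optional


class ObjectReference(NamedTuple):
    object_number: int
    # Assumption: the "version number" of 4.36 is the generation number of 4.28.
    generation_number: int


# Two unsigned integers followed by the uppercase letter R.
REFERENCE = re.compile(r"([0-9]+)\s+([0-9]+)\s+R")


def parse_reference(token: str) -> Optional[ObjectReference]:
    """Parse a token such as "12 0 R"; return None if it has another form."""
    match = REFERENCE.fullmatch(token.strip())
    if match is None:
        return None
    return ObjectReference(int(match.group(1)), int(match.group(2)))
```

A caller would typically use the returned pair as a lookup key into whatever table of indirect objects it maintains.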
4.37
object stream
a stream that contains a sequence of PDF objects
4.38
PDF
Portable Document Format file format defined by this specification [ISO 32000-1]
4.39
real object
approximate mathematical real numbers, but with limited range and precision and written as one or more
decimal digits with an optional sign and a leading, trailing, or embedded PERIOD (2Eh) (decimal point)
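The written forms given for integer objects (4.29) and real objects (4.39) can be sketched as two regular expressions. This is a minimal illustration that follows only the wording of those two definitions; the function name `classify_number` is an assumption, and range or precision limits are not modelled.

```python
import re

# Integer: one or more decimal digits, optionally preceded by a sign.
INTEGER = re.compile(r"[+-]?[0-9]+")
# Real: decimal digits with an optional sign and a leading, trailing,
# or embedded PERIOD (2Eh).
REAL = re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+)")


def classify_number(token: str) -> str:
    """Classify a token as "integer", "real", or "not a number"."""
    if INTEGER.fullmatch(token):
        return "integer"
    if REAL.fullmatch(token):
        return "real"
    return "not a number"
```

Under these patterns, tokens such as `123` and `+17` are integers, `34.5`, `4.` and `-.002` are reals, and a lone `.` or an exponent form such as `6.02E23` matches neither definition.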
4.40
rectangle
a specific array object used to describe locations on a page and bounding boxes for a variety of objects and
written as an array of four numbers giving the coordinates of a pair of diagonally opposite corners, typically in
the form [llx lly urx ury] specifying the lower-left x, lower-left y, upper-right x, and upper-right y coordinates of
the rectangle, in that order
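Because the definition allows any pair of diagonally opposite corners, a consumer may want to normalize the four numbers into the typical lower-left, upper-right order. The sketch below shows one way to do that; the `Rectangle` type and `normalize_rectangle` function are hypothetical names chosen for illustration.

```python
from typing import NamedTuple, Sequence


class Rectangle(NamedTuple):
    llx: float  # lower-left x
    lly: float  # lower-left y
    urx: float  # upper-right x
    ury: float  # upper-right y

    @property
    def width(self) -> float:
        return self.urx - self.llx

    @property
    def height(self) -> float:
        return self.ury - self.lly


def normalize_rectangle(values: Sequence[float]) -> Rectangle:
    """Reorder two diagonally opposite corners into [llx lly urx ury]."""
    if len(values) != 4:
        raise ValueError("a rectangle is an array of exactly four numbers")
    x1, y1, x2, y2 = values
    return Rectangle(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))
```

With this helper, the arrays `[0, 0, 612, 792]` and `[612, 792, 0, 0]` describe the same rectangle after normalization.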
4.41
resource dictionary
associates resource names, used in content streams, with the resource objects themselves and organized into
various categories (e.g., Font, ColorSpace, Pattern)
4.42
space character
text string character used to represent orthographic white space in text strings
NOTE 2 space characters include HORIZONTAL TAB (U+0009), LINE FEED (U+000A), VERTICAL TAB (U+000B),
FORM FEED (U+000C), CARRIAGE RETURN (U+000D), SPACE (U+0020), NOBREAK SPACE (U+00A0),
EN SPACE (U+2002), EM SPACE (U+2003), FIGURE SPACE (U+2007), PUNCTUATION SPACE (U+2008),
THIN SPACE (U+2009), HAIR SPACE (U+200A), ZERO WIDTH SPACE (U+200B), and IDEOGRAPHIC
SPACE (U+3000)
4.43
stream object
consists of a dictionary followed by zero or more bytes bracketed between the keywords stream and endstream
4.44
string object
consists of a series of bytes (unsigned integer values in the range 0 to 255) and the bytes are not integer
objects, but are stored in a more compact form
4.45
web capture
refers to the process of creating PDF content by importing and possibly converting internet-based or
locally-resident files. The files being imported may be any arbitrary format, such as HTML, GIF, JPEG, text, and PDF
4.46
white-space character
characters that separate PDF syntactic constructs such as names and numbers from each other; white space
characters are HORIZONTAL TAB (09h), LINE FEED (0Ah), FORM FEED (0Ch), CARRIAGE RETURN (0Dh),
SPACE (20h); (see Table 1 in 7.2.2, “Character Set”)
4.47
XFDF file
file conforming to the XML Forms Data Format 2.0 specification, which is an XML transliteration of Forms Data
Format (FDF)
4.48
XMP packet
structured wrapper for serialized XML metadata that can be embedded in a wide variety of file formats
5 Notation
PDF operators, PDF keywords, the names of keys in PDF dictionaries, and other predefined names are written
in bold sans serif font; words that denote operands of PDF operators or values of dictionary keys are written in
italic sans serif font.
Token characters used to delimit objects and describe the structure of PDF files, as defined in 7.2, "Lexical
Conventions", may be identified by their ANSI X3.4-1986 (ASCII 7-bit USA codes) character name written in
upper case in bold sans serif font followed by a parenthetic two digit hexadecimal character value with the suffix
“h”.
Characters in text streams, as defined by 7.9.2, "String Object Types", may be identified by their
ANSI X3.4-1986 (ASCII 7-bit USA codes) character name written in uppercase in sans serif font followed by a parenthetic
four digit hexadecimal character code value with the prefix “U+” as shown in EXAMPLE 1 in this clause.
EXAMPLE 1 EN SPACE (U+2002).
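The two notations can be reproduced mechanically, as in the sketch below. It is only an approximation: it borrows Unicode character names from Python's `unicodedata` module as a stand-in for the ANSI X3.4-1986 names, which is an assumption. The names do not always agree (Unicode calls 2Eh FULL STOP, while this document calls it PERIOD), and control characters have no Unicode name, so `unicodedata.name` raises `ValueError` for them. The function names are hypothetical.

```python
import unicodedata


def byte_notation(ch: str) -> str:
    """Name followed by a two digit hexadecimal value with the suffix "h"."""
    return f"{unicodedata.name(ch)} ({ord(ch):02X}h)"


def text_notation(ch: str) -> str:
    """Name followed by a four digit hexadecimal code with the prefix "U+"."""
    return f"{unicodedata.name(ch)} (U+{ord(ch):04X})"


print(byte_notation("/"))       # SOLIDUS (2Fh)
print(text_notation("\u2002"))  # EN SPACE (U+2002)
```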
6 Version Designations
For the convenience of the reader, the PDF versions in which various features were introduced are provided
informatively within this document. The first version of PDF was designated PDF 1.0 and was specified by
Adobe Systems Incorporated in the PDF Reference 1.0 document published by Adobe and Addison Wesley.
Since then, PDF has gone through seven revisions designated as: PDF 1.1, PDF 1.2, PDF 1.3, PDF 1.4, PDF
1.5, PDF 1.6 and PDF 1.7. All non-deprecated features defined in a previous PDF version were also included in
the subsequent PDF version. Since ISO 32000-1 is a PDF version matching PDF 1.7, it is also suitable for
interpretation of files made to conform with any of the PDF specifications 1.0 through 1.7. Throughout this
specification in order to indicate at which point in the sequence of versions a feature was introduced, a notation
with a PDF version number in parenthesis (e.g., (PDF 1.3)) is used. Thus if a feature is labelled with (PDF 1.3)
it means that PDF 1.0, PDF 1.1 and PDF 1.2 were not specified to support this feature whereas all versions of
PDF 1.3 and greater were defined to support it.
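The version notation amounts to a simple ordering rule, sketched below. The function names `parse_version` and `is_specified` are assumptions made for illustration, and the sketch ignores deprecated features, which the text above excludes from automatic carry-over into later versions.

```python
from typing import Tuple


def parse_version(label: str) -> Tuple[int, int]:
    """Turn a label such as "PDF 1.3" or "(PDF 1.3)" into (1, 3)."""
    digits = label.strip("() ").replace("PDF", "").strip()
    major, minor = digits.split(".")
    return int(major), int(minor)


def is_specified(feature_label: str, file_version: str) -> bool:
    """True if a feature labelled with feature_label is defined for file_version."""
    return parse_version(file_version) >= parse_version(feature_label)


# A feature labelled (PDF 1.3) is not specified for PDF 1.2 but is for PDF 1.7.
print(is_specified("(PDF 1.3)", "PDF 1.2"))  # False
print(is_specified("(PDF 1.3)", "PDF 1.7"))  # True
```

Comparing versions as integer pairs rather than as floating-point numbers avoids rounding surprises and keeps the major and minor parts separate.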
7 Syntax
7.1 General
This clause covers everything about the syntax of PDF at the object, file, and document level. It sets the stage
for subsequent clauses, which describe how the contents of a PDF file are interpreted as page descriptions,
interactive navigational aids, and application-level logical structure.
PDF syntax is best understood by considering it as four parts, as shown in Figure 1:
• Objects. A PDF document is a data structure composed from a small set of basic types of data objects.
Sub-clause 7.2, "Lexical Conventions," describes the character set used to write objects and other
syntactic elements. Sub-clause 7.3, "Objects," describes the syntax and essential properties of the objects.
Sub-clause 7.3.8, "Stream Objects," provides complete details of the most complex data type, the stream
object.
• File structure. The PDF file structure determines how objects are stored in a PDF file, how they are
accessed, and how they are updated. This structure is independent of the semantics of the objects.
Sub-clause 7.5, "File Structure," describes the file structure. Sub-clause 7.6, "Encryption," describes a file-level
mechanism for protecting a document’s contents from unauthorized access.
• Document structure. The PDF document structure specifies how the basic object types are used to
represent components of a PDF document: pages, fonts, annotations, and so forth. Sub-clause 7.7,
"Document Structure," describes the overall document structure; later clauses address the detailed
semantics of the components.
• Content streams. A PDF content stream contains a sequence of instructions describing the appearance of
a page or other graphical entity. These instructions, while also represented as objects, are conceptually
distinct from the objects that represent the document structure and are described separately. Sub-clause
7.8, "Content Streams and Resources," discusses PDF content streams and their associated resources.
Figure 1 – PDF Components (Objects, File structure, Document structure, Content stream)
In addition, this clause describes some data structures, built from basic objects, that are so widely used that
they can almost be considered basic object types in their own right. These objects are covered in: 7.9,
"Common Data Structures"; 7.10, "Functions"; and 7.11, "File Specifications."
NOTE Variants of PDF’s object and file syntax are also used as the basis for other file formats. These include the
Forms Data Format (FDF), described in 12.7.7, "Forms Data Format", and the Portable Job Ticket Format
(PJTF), described in Adobe Technical Note #5620, Portable Job Ticket Format.
7.2 Lexical Conventions
7.2.1 General
At the most fundamental level, a PDF file is a sequence of bytes. These bytes can be grouped into tokens
according to the syntax rules described in this sub-clause. One or more tokens are assembled to form higher-level
syntactic entities, principally objects, which are the basic data values from which a PDF document is
constructed.
A non-encrypted PDF can be entirely represented using byte values corresponding to the visible printable
subset of the character set defined in ANSI X3.4-1986, plus white space characters. However, a PDF file is not
restricted to the ASCII character set; it may contain arbitrary bytes, subject to the following considerations:
• The tokens that delimit objects and that describe the structure of a PDF file shall use the ASCII character
set. In addition all the reserved words and the names used as keys in PDF standard dictionaries and
certain types of arrays shall be defined using the ASCII character set.
• The data values of strings and streams objects may be written either entirely using the ASCII character set
or entirely in binary data. In actual practice, data that is naturally binary, such as sampled images, is
usually represented in binary for compactness and efficiency.
• A PDF file containing binary data shall be transported as a binary file rather than as a text file to ensure that
all bytes of the file are faithfully preserved.
NOTE 1 A binary file is not portable to environments that impose reserved character codes, maximum line lengths,
end-of-line conventions, or other restrictions.
NOTE 2 In this clause, the usage of the term character is entirely independent of any logical meaning that the value
may have when it is treated as data in specific contexts, such as representing human-readable text or
selecting a glyph from a font.
7.2.2 Character Set
The PDF character set is divided into three classes, called regular, delimiter, and white-space characters. This
classification determines the grouping of characters into tokens. The rules defined in this sub-clause apply to
all characters in the file except within strings, streams, and comments.
The white-space characters shown in Table 1 separate syntactic constructs such as names and numbers from
each other. All white-space characters are equivalent, except in comments, strings, and streams. In all other
contexts, PDF treats any sequence of consecutive white-space characters as one character.
The CARRIAGE RETURN (0Dh) and LINE FEED (0Ah) characters, also called newline characters, shall be
treated as end-of-line (EOL) markers. The combination of a CARRIAGE RETURN followed immediately by a
LINE FEED shall be treated as one EOL marker. EOL markers may be treated the same as any other
white-space characters. However, sometimes an EOL marker is required or recommended, that is, preceding a
token that must appear at the beginning of a line.
NOTE The examples in this standard use a convention that arranges tokens into lines. However, the examples’ use of
white space for indentation is purely for clarity of exposition and need not be included in practical use.
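The end-of-line rule above can be sketched as a single regular expression that treats a CARRIAGE RETURN followed immediately by a LINE FEED as one marker. The function name `split_lines` is an assumption, and the sketch does not skip over strings, streams, or comments, where the lexical rules do not apply.

```python
import re
from typing import List

# CR LF is listed first so that the pair counts as one EOL marker
# instead of being split into a lone CR and a lone LF.
EOL = re.compile(rb"\r\n|\r|\n")


def split_lines(data: bytes) -> List[bytes]:
    """Split raw bytes at every EOL marker (CR LF, CR, or LF)."""
    return EOL.split(data)


print(split_lines(b"a\r\nb\rc\nd"))  # [b'a', b'b', b'c', b'd']
```

The data is read and split as bytes, not text, in line with the requirement that a PDF file containing binary data be handled as a binary file.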
Table 1 – White-space characters
Decimal  Hexadecimal  Octal  Name
0        00           000    Null (NUL)
9        09           011    HORIZONTAL TAB (HT)
10       0A           012    LINE FEED (LF)
12       0C           014    FORM FEED (FF)
13       0D           015    CARRIAGE RETURN (CR)
32       20           040    SPACE (SP)
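Table 1 and the rule that any run of consecutive white-space characters counts as one character can be sketched as follows. The names `WHITE_SPACE`, `is_white_space`, and `collapse_white_space` are assumptions made for illustration, and the sketch deliberately ignores the exception for comments, strings, and streams.

```python
import re

# The six byte values listed in Table 1: NUL, HT, LF, FF, CR, SP.
WHITE_SPACE = frozenset({0x00, 0x09, 0x0A, 0x0C, 0x0D, 0x20})

# One or more consecutive white-space bytes from Table 1.
_WHITE_SPACE_RUN = re.compile(rb"[\x00\t\n\x0c\r ]+")


def is_white_space(byte: int) -> bool:
    """True if the byte value is one of the Table 1 white-space characters."""
    return byte in WHITE_SPACE


def collapse_white_space(data: bytes) -> bytes:
    """Replace each run of white-space bytes with a single SPACE (20h)."""
    return _WHITE_SPACE_RUN.sub(b" ", data)


print(collapse_white_space(b"1\t\t0 \r\n obj"))  # b'1 0 obj'
```

Note that VERTICAL TAB (0Bh) is not in Table 1, so `is_white_space(0x0B)` is false here, even though it appears in the list of space characters for text strings in 4.42.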