ECMAScript Specification: Analysis and History


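"ECMAScript Language Specification Edition 3, 24-Mar-00, Edition 3 Final"

ECMAScript, often abbreviated ES, is a standard based on several originating technologies, the best known being JavaScript (Netscape) and JScript (Microsoft). The language was invented by Brendan Eich at Netscape and first appeared in that company's Navigator 2.0 browser. It has appeared in all subsequent Netscape browsers and in all Microsoft browsers starting with Internet Explorer 3.0.

Development of the ECMAScript language specification started in November 1996. The first edition of the ECMA standard was adopted by the ECMA General Assembly of June 1997. The standard was then submitted to ISO/IEC JTC 1 under the fast-track procedure and approved as international standard ISO/IEC 16262 in April 1998. The ECMA General Assembly of June 1998 approved the second edition of ECMA-262 to keep it fully aligned with ISO/IEC 16262. Changes between the first and second editions are editorial in nature. The document excerpted here defines the third edition of the standard, which contains several significant enhancements.

The points below cover the main areas of the language and separate what Edition 3 itself added from features that predate it or arrived only later:

1. **Type system**: The specification defines the types Undefined, Null, Boolean, String, Number, and Object; arrays are objects. These types date back to the first edition and were not introduced by Edition 3.
2. **Functions and scope**: The specification describes how functions are created and called, and sets out the variable scope rules, covering function (local) scope and global scope.
3. **Regular expressions**: Edition 3 brought regular expressions into the standard, allowing complex pattern matching and string processing.
4. **Error handling**: Edition 3 introduced the try/catch statement for catching and handling run-time errors.
5. **String and array methods**: Edition 3 added further string and array methods such as push() and pop(). Methods such as split() and join() were already present in earlier editions.
6. **Prototypes and inheritance**: The specification defines a prototype-based inheritance mechanism. Edition 3 has no class syntax; objects are created by constructors and inherit through their prototypes.
7. **JSON support**: JSON is not natively supported in ECMAScript 3. The JSON object for parsing and serializing data was added in a later edition, ECMAScript 5.
8. **Strict mode**: Strict mode, a more restrictive execution mode that helps detect and prevent some common programming errors, is not part of Edition 3. It was introduced in ECMAScript 5.

ECMAScript 3 is an important milestone in the development of JavaScript. Its standardization laid the foundation for later editions (ECMAScript 5, 6, 7, and so on) and influenced many aspects of modern web development. Anyone who wants a deep understanding of JavaScript or ECMAScript benefits from being familiar with this edition of the specification.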

ECMAScript Language Specification Edition 3 24-Mar-00

… expression constructor.prototype, and properties added to an object’s prototype are shared, through inheritance, by all objects sharing the prototype.
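A minimal sketch of the sharing described above. The `Point` constructor and the `describe` method are hypothetical names chosen only for illustration; they do not appear in the specification.

```javascript
// Hypothetical constructor used only for illustration.
function Point(x, y) {
  this.x = x;
  this.y = y;
}

var a = new Point(1, 2);
var b = new Point(3, 4);

// A property added to the prototype after the objects were created is
// still visible through both of them, because property lookups fall
// through to the shared prototype object.
Point.prototype.describe = function () {
  return "(" + this.x + ", " + this.y + ")";
};

var aText = a.describe();                      // "(1, 2)"
var bText = b.describe();                      // "(3, 4)"
var sharedMethod = a.describe === b.describe;  // true: one function, shared
var ownOnA = a.hasOwnProperty("describe");     // false: it lives on the prototype
```

Neither object holds its own copy of `describe`; both reach the single function stored on `Point.prototype`.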

4.3.6 Native Object

A native object is any object supplied by an ECMAScript implementation independent of the host environment. Standard native objects are defined in this specification. Some native objects are built-in; others may be constructed during the course of execution of an ECMAScript program.

4.3.7 Built-in Object

A built-in object is any object supplied by an ECMAScript implementation, independent of the host environment, which is present at the start of the execution of an ECMAScript program. Standard built-in objects are defined in this specification, and an ECMAScript implementation may specify and define others. Every built-in object is a native object.

4.3.8 Host Object

A host object is any object supplied by the host environment to complete the execution environment of ECMAScript. Any object that is not native is a host object.

4.3.9 Undefined Value

The undefined value is a primitive value used when a variable has not been assigned a value.

4.3.10 Undefined Type

The type Undefined has exactly one value, called undefined.

4.3.11 Null Value

The null value is a primitive value that represents the null, empty, or non-existent reference.

4.3.12 Null Type

The type Null has exactly one value, called null.
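The difference between the undefined and null values can be sketched as follows. The variable names are illustrative only.

```javascript
var unassigned;     // declared but never assigned: holds the undefined value
var empty = null;   // explicitly set to "no object"

var isUndefined = unassigned === undefined;  // true
var typeOfUndefined = typeof unassigned;     // "undefined"
var typeOfNull = typeof empty;               // "object" (a long-standing quirk of typeof)
var looselyEqual = (unassigned == empty);    // true: == treats the two values as equal
var strictlyEqual = (unassigned === empty);  // false: they belong to different types
```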

4.3.13 Boolean Value

A boolean value is a member of the type Boolean and is one of two unique values, true and false.

4.3.14 Boolean Type

The type Boolean represents a logical entity and consists of exactly two unique values. One is called true and the other is called false.

4.3.15 Boolean Object

A Boolean object is a member of the type Object and is an instance of the built-in Boolean object. That is, a Boolean object is created by using the Boolean constructor in a new expression, supplying a boolean as an argument. The resulting object has an implicit (unnamed) property that is the boolean. A Boolean object can be coerced to a boolean value.
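A short sketch contrasting a boolean value with a Boolean object. One point worth illustrating is that the wrapped value is recovered with `valueOf()`, whereas using the object itself in a condition always counts as true, because every object converts to true.

```javascript
var primitive = false;
var wrapped = new Boolean(false);

var typeOfPrimitive = typeof primitive;       // "boolean"
var typeOfWrapped = typeof wrapped;           // "object"
var isInstance = wrapped instanceof Boolean;  // true
var unwrapped = wrapped.valueOf();            // false: the boolean held by the object

// The object itself is truthy even though it wraps false.
var branchTaken = wrapped ? "yes" : "no";     // "yes"
```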

4.3.16 String Value

A string value is a member of the type String and is a finite ordered sequence of zero or more 16-bit unsigned integer values.

NOTE Although each value usually represents a single 16-bit unit of UTF-16 text, the language does not place any restrictions or requirements on the values except that they be 16-bit unsigned integers.
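The following sketch shows what the 16-bit definition means in practice. The character U+1F600 is an arbitrary example of a code point that needs two 16-bit units in UTF-16.

```javascript
// U+1F600 lies outside the Basic Multilingual Plane, so it takes two units.
var face = "\uD83D\uDE00";
var faceLength = face.length;         // 2: length counts 16-bit units
var firstUnit = face.charCodeAt(0);   // 55357 (0xD83D)
var secondUnit = face.charCodeAt(1);  // 56832 (0xDE00)

// A lone surrogate is not well-formed UTF-16, yet it is a valid string value.
var lone = "\uD83D";
var loneLength = lone.length;         // 1

// String.fromCharCode builds a string from arbitrary 16-bit values.
var rebuilt = String.fromCharCode(0xD83D, 0xDE00);
var same = rebuilt === face;          // true
```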

4.3.17 String Type

The type String is the set of all string values.

5 Notational Conventions

5.1 Syntactic and Lexical Grammars

This section describes the context-free grammars used in this specification to define the lexical and syntactic structure of an ECMAScript program.

5.1.1 Context-Free Grammars

A context-free grammar consists of a number of productions. Each production has an abstract symbol called a nonterminal as its left-hand side, and a sequence of zero or more nonterminal and terminal symbols as its right-hand side. For each grammar, the terminal symbols are drawn from a specified alphabet.

Starting from a sentence consisting of a single distinguished nonterminal, called the goal symbol, a given context-free grammar specifies a language, namely, the (perhaps infinite) set of possible sequences of terminal symbols that can result from repeatedly replacing any nonterminal in the sequence with a right-hand side of a production for which the nonterminal is the left-hand side.
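The replacement process can be sketched with a toy grammar. The grammar below (a goal symbol `List` with two productions) and the helper `step` are hypothetical and serve only to illustrate a derivation; they are not taken from the specification.

```javascript
// Toy grammar: List -> x | List , x
// Keys are nonterminals; each value lists the alternative right-hand sides.
var grammar = {
  List: [["x"], ["List", ",", "x"]]
};

// Replace the leftmost nonterminal in `sentence` with alternative `choice`.
function step(sentence, choice) {
  for (var i = 0; i < sentence.length; i++) {
    if (grammar.hasOwnProperty(sentence[i])) {
      return sentence.slice(0, i)
        .concat(grammar[sentence[i]][choice])
        .concat(sentence.slice(i + 1));
    }
  }
  return sentence; // only terminal symbols remain
}

var s0 = ["List"];          // the goal symbol
var s1 = step(s0, 1);       // List , x
var s2 = step(s1, 1);       // List , x , x
var s3 = step(s2, 0);       // x , x , x
var result = s3.join(" ");  // "x , x , x"
```

Each call to `step` performs one replacement; the derivation ends once the sentence contains terminal symbols only.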

5.1.2 The Lexical and RegExp Grammars

A lexical grammar for ECMAScript is given in section 7. This grammar has as its terminal symbols the characters of the Unicode character set. It defines a set of productions, starting from the goal symbol InputElementDiv or InputElementRegExp, that describe how sequences of Unicode characters are translated into a sequence of input elements.

Input elements other than white space and comments form the terminal symbols for the syntactic grammar for ECMAScript and are called ECMAScript tokens. These tokens are the reserved words, identifiers, literals, and punctuators of the ECMAScript language. Moreover, line terminators, although not considered to be tokens, also become part of the stream of input elements and guide the process of automatic semicolon insertion (section 7.8.5). Simple white space and single-line comments are discarded and do not appear in the stream of input elements for the syntactic grammar. A MultiLineComment (that is, a comment of the form “/*…*/” regardless of whether it spans more than one line) is likewise simply discarded if it contains no line terminator; but if a MultiLineComment contains one or more line terminators, then it is replaced by a single line terminator, which becomes part of the stream of input elements for the syntactic grammar.
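The effect of a line terminator inside a MultiLineComment can be observed with a small experiment. This sketch uses the `Function` constructor to compile source text given as a string; the function bodies are illustrative.

```javascript
// The comment contains no line terminator, so it is simply discarded.
var sameLine = new Function("return /* note */ 42;");

// The comment contains a line terminator, so it behaves like one.
// A line terminator directly after `return` triggers automatic semicolon
// insertion, and the function returns before reaching 42.
var acrossLines = new Function("return /* note\n */ 42;");

var r1 = sameLine();     // 42
var r2 = acrossLines();  // undefined
```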

A RegExp grammar for ECMAScript is given in section 15.10. This grammar also has as its terminal symbols the characters of the Unicode character set. It defines a set of productions, starting from the goal symbol Pattern, that describe how sequences of Unicode characters are translated into regular expression patterns.

Productions of the lexical and RegExp grammars are distinguished by having two colons “::” as separating punctuation. The lexical and RegExp grammars share some productions.

5.1.3 The Numeric String Grammar

A second grammar is used for translating strings into numeric values. This grammar is similar to the part of the lexical grammar having to do with numeric literals and has as its terminal symbols the characters of the Unicode character set. This grammar appears in section 9.3.1.

Productions of the numeric string grammar are distinguished by having three colons “:::” as punctuation.
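A few string-to-number conversions give a feel for this grammar. The sketch assumes that calling `Number` as a function is an acceptable way to trigger the conversion; the sample strings are arbitrary.

```javascript
var padded = Number("  42  ");    // 42: surrounding white space is allowed
var emptyStr = Number("");        // 0: the empty string converts to 0
var hex = Number("0x1F");         // 31: hexadecimal integers are accepted
var leadingZero = Number("010");  // 10: leading zeros do not mean octal here
var exponent = Number("1e3");     // 1000
var trailing = Number("12px");    // NaN: the whole string must match the grammar
```

The white space, empty string, and leading-zero cases are places where the string grammar differs from the rules for numeric literals in program text.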

5.1.4 The Syntactic Grammar

The syntactic grammar for ECMAScript is given in sections 11, 12, 13 and 14. This grammar has ECMAScript tokens defined by the lexical grammar as its terminal symbols (section 5.1.2). It defines a set of productions, starting from the goal symbol Program, that describe how sequences of tokens can form syntactically correct ECMAScript programs.

When a stream of Unicode characters is to be parsed as an ECMAScript program, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. The program is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal Program, with no tokens left over.
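Whether a piece of source text is syntactically in error can be probed from within the language. The helper `parses` below is hypothetical. It relies on the `Function` constructor, which parses its argument as a function body and not as a full Program, so it is only an approximation of the rule described above.

```javascript
// Returns true when `source` parses as a function body,
// false when the parser reports a SyntaxError.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e; // anything else is unexpected here
  }
}

var ok = parses("var x = 1 + 2;");         // true
var missingOperand = parses("var x = ;");  // false
var leftover = parses("var x = 1; )");     // false: a token is left over
```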

Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.

The syntactic grammar as presented in sections 11, 12, 13 and 14 is actually not a complete account of which token sequences are accepted as correct ECMAScript programs. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places.
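Both halves of this paragraph can be sketched with the `Function` constructor. The source strings are illustrative; the `throw` case is one example of an "awkward" place, chosen here as an assumption about what the paragraph refers to.

```javascript
// Line breaks stand in for the missing semicolons.
var sum = new Function("var a = 1\nvar b = 2\nreturn a + b")(); // 3

// The same tokens on one line are rejected: with no line terminator
// between them, no semicolon is inserted.
var rejected = false;
try {
  new Function("var a = 1 var b = 2");
} catch (e) {
  rejected = e instanceof SyntaxError; // true
}

// A line terminator directly after `throw` is not permitted, even though
// the same tokens on a single line form a valid statement.
var awkward = false;
try {
  new Function("throw\nnew Error('x');");
} catch (e) {
  awkward = e instanceof SyntaxError; // true
}
```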

5.1.5 Grammar Notation

Terminal symbols of the lexical and string grammars, and some of the terminal symbols of the syntactic grammar, are shown in fixed width font, both in the productions of the grammars and throughout this specification whenever the text directly refers to such a terminal symbol. These are to appear in a program exactly as written. All terminal symbol characters specified in this way are to be understood as the appropriate Unicode character from the ASCII range, as opposed to any similar-looking characters from other Unicode ranges.

Nonterminal symbols are shown in italic type. The definition of a nonterminal is introduced by the name of the nonterminal being defined followed by one or more colons. (The number of colons indicates to which grammar the production belongs.) One or more alternative right-hand sides for the nonterminal then follow on succeeding lines.

For example, the syntactic definition:

WithStatement :
    with ( Expression ) Statement

states that the nonterminal WithStatement represents the token with, followed by a left parenthesis token, followed by an Expression, followed by a right parenthesis token, followed by a Statement. The occurrences of Expression and Statement are themselves nonterminals. As another example, the syntactic definition:

ArgumentList :
    AssignmentExpression
    ArgumentList , AssignmentExpression

states that an ArgumentList may represent either a single AssignmentExpression or an ArgumentList, followed by a comma, followed by an AssignmentExpression. This definition of ArgumentList is recursive, that is, it is defined in terms of itself. The result is that an ArgumentList may contain any positive number of arguments, separated by commas, where each argument expression is an AssignmentExpression. Such recursive definitions of nonterminals are common.
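The recursive ArgumentList definition can be sketched as a small recognizer over an array of tokens. This is a simplification under stated assumptions: an AssignmentExpression is reduced to a single identifier or decimal integer token, and the names `isAssignmentExpression` and `parseArgumentList` are hypothetical.

```javascript
// Simplifying assumption: an AssignmentExpression is one identifier or integer.
function isAssignmentExpression(token) {
  return /^[A-Za-z_$][A-Za-z0-9_$]*$|^[0-9]+$/.test(token);
}

// ArgumentList :
//     AssignmentExpression
//     ArgumentList , AssignmentExpression
// The left recursion is unrolled into a loop: one expression first,
// then zero or more ", expression" pairs.
function parseArgumentList(tokens) {
  if (tokens.length === 0 || !isAssignmentExpression(tokens[0])) {
    return null;
  }
  var args = [tokens[0]];
  var i = 1;
  while (i < tokens.length) {
    if (tokens[i] !== "," || i + 1 >= tokens.length ||
        !isAssignmentExpression(tokens[i + 1])) {
      return null;
    }
    args.push(tokens[i + 1]);
    i += 2;
  }
  return args;
}

var one = parseArgumentList(["a"]);                         // ["a"]
var three = parseArgumentList(["a", ",", "b", ",", "42"]);  // ["a", "b", "42"]
var trailingComma = parseArgumentList(["a", ","]);          // null
var none = parseArgumentList([]);                           // null: at least one argument
```

The empty list is rejected because the grammar allows any positive number of arguments, not zero.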

The subscripted suffix “opt”, which may appear after a terminal or nonterminal, indicates an optional symbol. The alternative containing the optional symbol actually specifies two right-hand sides, one that omits the optional element and one that includes it. This means that:

VariableDeclaration :
    Identifier Initialiser_opt

is a convenient abbreviation for:

VariableDeclaration :
    Identifier
    Identifier Initialiser

and that:

IterationStatement :
    for ( ExpressionNoIn_opt ; Expression_opt ; Expression_opt ) Statement

is a convenient abbreviation for:

IterationStatement :
    for ( ; Expression_opt ; Expression_opt ) Statement
    for ( ExpressionNoIn ; Expression_opt ; Expression_opt ) Statement
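The expansion of optional symbols into separate right-hand sides can be sketched mechanically. The `{ name, opt }` representation of a symbol and the function `expandOptional` are hypothetical, introduced only for this illustration.

```javascript
// Expand a right-hand side containing optional symbols into every
// alternative it abbreviates. Each symbol is { name: string, opt: boolean }.
function expandOptional(rhs) {
  var results = [[]];
  for (var i = 0; i < rhs.length; i++) {
    var next = [];
    for (var j = 0; j < results.length; j++) {
      if (rhs[i].opt) {
        next.push(results[j].slice());              // alternative that omits the symbol
      }
      next.push(results[j].concat([rhs[i].name]));  // alternative that includes it
    }
    results = next;
  }
  return results;
}

var declaration = expandOptional([
  { name: "Identifier", opt: false },
  { name: "Initialiser", opt: true }
]);
// declaration: [["Identifier"], ["Identifier", "Initialiser"]]

var forStatement = expandOptional([
  { name: "for", opt: false },
  { name: "(", opt: false },
  { name: "ExpressionNoIn", opt: true },
  { name: ";", opt: false },
  { name: "Expression", opt: true },
  { name: ";", opt: false },
  { name: "Expression", opt: true },
  { name: ")", opt: false },
  { name: "Statement", opt: false }
]);
var forCount = forStatement.length; // 8: three optional symbols give 2 * 2 * 2 alternatives
```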
