![](https://csdnimg.cn/release/download_crawler_static/1159619/bg6.jpg)
ReturnCode SequenceAt( Sequence *sequence, Position fromBegin, Position fromEnd, Se-
quence *returnedSequence );
These variations will not aect the basic character of the data structure used to implement the
sequence or the comparisons b etween them that follow. Therefore I will assume the rst interface
(Empty, Insert, Delete, IntemAt, and Close).
3 Comparing sequence data structures
In order to compare sequence data structures it is necessary to know the relative frequency of the
ve basic op erations in typical editing.
The
NewSequence
op eration is infrequent. Most sequence data structures require the
NewSequence
op eration to scan the entire le and convert it to some internal form. This can b e a problem with
very large les. To lo ok through a very large le, it must b e read in any case but if the sequence
data structure do es not require sequence prepro cessing it can interleave the le reading with text
editor use rather than requiring the user to wait for the entire le to b e read b efore editing can
b egin. This is an advantage and means the user do es not have to worry ab out how many les are
loaded or how large they are, since the cost of reading the large le is amortized over its use.
The
Close
op eration is also infrequent and not normally an imp ortant factor in comparing sequence
data structures.
The
ItemAt
op eration will, of course, b e very frequent, but it will also b e very lo calized as well
b ecause characters in a text editor are almost always accessed sequentially. When the screen is
drawn the characters on the screen are accessed sequentially beginning with the rst character
on the screen. If the user pages forward or backwards the characters are accessed sequentially.
Searches normally begin at the current p osition and pro ceed forward or backward sequentially.
Overall, there is almost no random access in a text editor. This means that, although the
ItemAt
op eration is very frequent, its eciency in the general case is not signicant. Only the case where
a character is requested that is sequentially b efore or after the last character accessed needs to b e
optimized.
To test this hypothesis I instrumented a text editor and lo oked at the
ItemAt
op erations in a variety
of text editing situations. The numb er of
ItemAt
calls that were sequential (one away from the last
ItemAt
) was always ab ove 97% and usually ab ove 99%. The numb er of
ItemAts
that were within
20 items of the previous
ItemAt
was always over 99%. The average numb er of characters from
one
ItemAt
to the next was typically around 2 but sometimes would go up to as high as 10 or 15.
Overall these exp eriments verify that the p ositions of
ItemAt
calls are extremely lo calized.
Thus sequence data structures should b e optimized for edits (
inserts
and
deletes
) and sequential
character access. Since caching can b e used with most data structures to make sequential access
fast, this means that the eciency of inserts and deletes is paramount. With regard to
Inserts
and
Deletes
one would assume that there would b e more
Inserts
than
Deletes
but that there would b e
a mix b etween them in typical text editing.
In all sequence data structures,
ItemAt
is much faster than
Inserts
and
Deletes
. If we consider
typical text editing, the display is changed after each
Insert
and
Delete
and this requires some
6