with machine learning algorithms, achieving success rates ranging
from 4.89% to 66.2%.
Our own early work [14] has broken a number of CAPTCHAs
(including those hosted at Captchaservice.org, a web service
specialised for CAPTCHA generation) with almost 100% success
by simply counting the number of pixels of each segmented
character, although these schemes were all resistant to the best
OCR software on the market. In contrast to other work that relied
on sophisticated computer vision or machine learning algorithms,
this study used only simple pattern recognition algorithms but
exploited fatal design errors that were discovered in each scheme.
This is one of the few works examining the robustness of
CAPTCHAs from the security angle.
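The pixel-counting idea can be illustrated with a minimal sketch.
This is not the implementation used in [14]; the template glyphs,
the function names and the lookup table below are all hypothetical
and chosen only to show why a fixed pixel count per character is a
fatal design error.

```python
# Hypothetical 3x5 binary templates ("1" = foreground pixel).
# Real schemes use larger glyphs; these are invented for illustration.
TEMPLATES = {
    "I": ["010", "010", "010", "010", "010"],  # 5 foreground pixels
    "L": ["100", "100", "100", "100", "111"],  # 7 foreground pixels
    "O": ["111", "101", "101", "101", "111"],  # 12 foreground pixels
}


def count_foreground(glyph):
    """Count the foreground pixels in a glyph given as rows of '0'/'1'."""
    return sum(row.count("1") for row in glyph)


def build_table(templates):
    """Map each pixel count to the characters that produce it."""
    table = {}
    for char, glyph in templates.items():
        table.setdefault(count_foreground(glyph), []).append(char)
    return table


def recognise(glyph, table):
    """Return the unique character with this pixel count, else None."""
    candidates = table.get(count_foreground(glyph), [])
    return candidates[0] if len(candidates) == 1 else None


table = build_table(TEMPLATES)
# A shifted copy of "L" still has 7 foreground pixels.
shifted_L = ["000", "010", "010", "010", "010", "011", "001"]
print(recognise(shifted_L, table))
```

The lookup only works when a distortion moves pixels without
adding or removing them, and when no two characters share a
count; both are assumptions of this sketch.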
PWNtcha [7] is an excellent web page that aims to “demonstrate
the inefficiency of many CAPTCHA implementations”. It briefly
comments on the weaknesses of about a dozen simple
CAPTCHAs, which were claimed to be broken with a success rate
ranging from 49% to 100%. However, no technical detail of the
attacks was publicly available. Many more CAPTCHAs were also
commented on at this site. For example, both the MSN scheme and a
Yahoo CAPTCHA that will be discussed in this paper (i.e. Yahoo
Scheme 1 in Section 6.1) were regarded by this site as “very
good” and difficult to break.
Two interesting algorithms were proposed in [19] to amplify the
skill gap between humans and computers. The algorithms could
improve systems security for text-based CAPTCHAs, but are
orthogonal to this paper. (In this paper, we do not discuss other
types of CAPTCHAs such as image-based ones. For those who
are interested, an overview of image-based CAPTCHAs can be
found in [19].)
Usability and robustness are two fundamental issues with
CAPTCHAs, and they often interconnect with each other. In [21],
we examined usability issues that should be considered and
addressed in the design of CAPTCHAs, and discussed subtle
implications some of the issues can have on robustness.
One last note: a survey of CAPTCHA research (including the
design of most early notable schemes) can be found in [13], and
the limitations of defending against bots with CAPTCHAs
(including protocol-level attacks) were discussed in [15].
3. THE MSN SCHEME
Fig 1 shows some sample challenges generated by the MSN
CAPTCHA scheme. We have no access to the codebase of the
MSN scheme, so we collected from Microsoft’s website 100
random samples that were generated in real time online at [16].
By studying [4, 5] and the samples we collected, we observed that
the MSN scheme (as deployed) has the following characteristics.
Fig 1. The MSN CAPTCHA: 4 sample challenges.
• Eight characters are used in each challenge.
• Only upper case letters and digits are used.
• Foreground (i.e. challenge text) is dark blue and background
light gray.
• Warping (both local and global) is used for character
distortion.
Local warp produces “small ripples, waves and elastic
deformations along the pixels of the character”, and it foils
“feature-based algorithms which may use character thickness
or serif features to detect and recognise characters” [6].
Characters in the first and second rows of Table 1 are largely
distorted by local warping.
Global warp generates character-level, elastic deformations
to foil template matching algorithms for character detection
and recognition. Characters in the third and fourth rows of
Table 1 are largely distorted by global warping.
• The following random arcs of different thicknesses are used
as the main anti-segmentation measure.
o Thick foreground arcs: These arcs are of foreground
color. Their thickness can be the same as the thick
portions of characters. They do not directly intersect
with any characters, so they are also called “non-
intersecting arcs”.
o Thin foreground arcs: These arcs are of foreground
color. Although they are typically not as thick as the
above type of arcs, their thickness can be the same as
the thin portions of characters. They intersect with thick
arcs, characters or both, and are thus also called
“intersecting thin arcs”.
o Thin background arcs: These arcs are thin and of
background color. They cut through characters and
remove some character content (pixels).
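Because the scheme uses only two colors, a challenge can be reduced
to a binary image before any further analysis. The following is a
minimal sketch of that step, not a procedure taken from the scheme
or from our attack. The two reference RGB values are assumptions:
the text above says only "dark blue" and "light gray".

```python
# Assumed reference colors (hypothetical values).
FOREGROUND_RGB = (0, 0, 128)      # "dark blue"
BACKGROUND_RGB = (200, 200, 200)  # "light gray"


def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def binarise(image):
    """Map each RGB pixel to 1 (foreground) or 0 (background).

    A pixel is foreground when it is closer to the assumed dark blue
    than to the assumed light gray.
    """
    return [
        [
            1 if squared_distance(pixel, FOREGROUND_RGB)
            < squared_distance(pixel, BACKGROUND_RGB) else 0
            for pixel in row
        ]
        for row in image
    ]


sample = [
    [(0, 0, 120), (210, 210, 210)],
    [(190, 190, 200), (10, 10, 140)],
]
print(binarise(sample))
```

After this step, characters and both kinds of foreground arcs are
all 1s, while pixels removed by thin background arcs are 0s and can
no longer be told apart from the ordinary background.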
Both local and global warping are commonly used for distortion in
text-based CAPTCHAs. Many schemes use background textures
and meshes in foreground and background colors as clutter to
increase robustness. In the MSN scheme, however, random arcs of
different thicknesses are used as clutter instead. The rationale
is that these arcs are themselves good candidates for false
characters, so the mix of random arcs and characters would
confuse state-of-the-art segmentation methods and provide strong
segmentation resistance [5].
4. A SEGMENTATION ATTACK
We have developed a low-cost attack that can effectively and
efficiently segment challenges generated by the MSN scheme.
Specifically, our attack achieves the following:
• Identify and remove random arcs;
• Identify all character locations in the right order; in other
words, divide each challenge into 8 ordered segments, each
containing a single character.
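To make the second goal concrete, the sketch below shows what
"ordered segments" could look like as data. It is an illustration
of the output format only, not the attack itself: it assumes a
binary image from which the arcs have already been removed and in
which every character is a single connected object, which does not
hold in general (thin background arcs can split a character, and
neighbouring characters can touch). The function names are
hypothetical.

```python
from collections import deque


def connected_components(binary):
    """Return foreground components as lists of (row, col) pixels.

    Uses breadth-first search with 8-connectivity on a grid of 0/1.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def ordered_segments(binary):
    """Return bounding boxes (left, top, right, bottom), left to right."""
    boxes = []
    for pixels in connected_components(binary):
        rows = [p[0] for p in pixels]
        cols = [p[1] for p in pixels]
        boxes.append((min(cols), min(rows), max(cols), max(rows)))
    return sorted(boxes)  # tuples sort by their left edge first


toy = [
    [0, 0, 0, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0, 1],
]
print(ordered_segments(toy))
```

On a real challenge, a successful segmentation would yield exactly
8 such boxes, one per character, in reading order.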
Our attack is built on observing and analysing the 100 random
samples we collected, which we call the “sample set”. The
effectiveness of this attack was tested not only on the sample
set, but also on a large test set of 500 random samples; the
design of the attack used no prior knowledge about any sample in
this test set. This methodology follows common practice in fields
such as