A fully automated greedy square jigsaw puzzle solver
Dolev Pomeranz Michal Shemesh Ohad Ben-Shahar
dolevp@cs.bgu.ac.il shemeshm@cs.bgu.ac.il ben-shahar@cs.bgu.ac.il
Computer Science Department, Ben-Gurion University of The Negev, Beer Sheva, Israel
Abstract
In the square jigsaw puzzle problem one is required to re-
construct the complete image from a set of non-overlapping,
unordered, square puzzle parts. Here we propose a fully au-
tomatic solver for this problem, where unlike some previous
work, it assumes no clues regarding parts’ location and re-
quires no prior knowledge about the original image or its
simplified (e.g., lower resolution) versions. To do so, we
introduce a greedy solver which combines both informed
piece placement and rearrangement of puzzle segments to
find the final solution. Among our other contributions are
new compatibility metrics which better predict the chances
of two given parts to be neighbors, and a novel estima-
tion measure which evaluates the quality of puzzle solutions
without the need for ground-truth information. Incorporat-
ing these contributions, our approach facilitates solutions
that surpass state-of-the-art solvers on puzzles of size larger
than ever attempted before.
1. Introduction
The popular jigsaw puzzle problem exists for many cen-
turies, long before it was first solved by a computer in
the 20th century. Given n jigsaw puzzle parts of an im-
age, it is required to reconstruct the complete image. Al-
though phrased as a game, this problem, which was proved
to be NP-complete [2, 7], serves a platform for many ap-
plications, e.g. in biological [14], molecular [22], text and
speech [12, 25], archaeological contexts [3, 10], and image
editing [4].
The traditional jigsaw puzzle problem assumes pieces of
various shapes, and indeed earlier computational work con-
sidered the problem in this form. The first jigsaw solver
proposed by Freeman and Garder in 1964 [8] was designed
to solve apictorial puzzles and was able to handle nine-piece
problems. By the end of the century, shaped-based solvers
were already able to reconstruct puzzles of more than 200
pieces [9]. The use of appearance and chromatic informa-
tion began only three decades after the work of Freeman
and Garder (e.g., [11, 6, 21, 24, 23, 13, 15]).
A major building block in most appearance-based puzzle
solvers evaluate the similarity between two parts, and the
strategies for doing so typically divide into two groups. One
approach compares the appearance of the abutting bound-
aries while using a formal distance measure between these
two vectors (e.g., [11, 6, 21, 24, 13, 15]). Alternatively, it
was suggested to use the entire part and measure its statis-
tical properties in order to group similar parts together [15]
or as a means to define pairwise similarity measures [11].
Once the similarity between two parts is estimated, differ-
ent classes of algorithms are employed to solve the puz-
zle, and in particular, previous methods have used greedy
algorithms [7, 15], dynamic programming [1], genetic al-
gorithms [21] and approximation algorithms on graphical
models [5].
Recently, Cho et al. [5] presented a probabilistic solver
which achieves approximated puzzle reconstruction via a
graphical model and a probability function that is maxi-
mized via loopy belief propagation. Since they lack a local
evidence term required for their graphical model, they em-
ployed two strategies that exploit prior knowledge - either
a dense-and-noisy evidence which estimates the low reso-
lution image from a bag of parts, or a sparse-and-accurate
evidence which assumes that a few parts, called “anchor
patches”, are given by an oracle (e.g., a human observer)
at their correct location in the puzzle. While only semi-
automatic, their approach was seminal in its ability to han-
dle puzzles with over 400 pieces.
In this paper we challenge the state-of-the-art in jigsaw
puzzle solving in several ways. We suggest a computational
framework that can handle in reasonable time square jigsaw
puzzles of size larger than ever attempted before. How-
ever, we completely exclude the use of clues, oracles, or
human intervention, as well as any use of prior knowledge
about the original image or simplified (e.g., lower resolu-
tion) versions of it. Despite these significant restrictions,
our approach achieves better than state-of-the-art perfor-
mance, and frequently succeeds in providing completely ac-
curate puzzle solutions.
9