Google Landmarks Dataset v2: A Large-Scale Benchmark for Instance-Level Recognition and Retrieval

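"Google Landmarks Dataset v2 is a large-scale benchmark dataset for instance-level recognition and retrieval, released by Google Research to evaluate and advance image retrieval techniques. The dataset contains over 5 million images covering more than 200,000 distinct instance labels, making it the largest dataset of its kind to date. The test set has 118,000 precisely annotated images and is suitable for both retrieval and recognition tasks. Building the dataset took more than 800 hours of human annotation work. GLDv2 places particular emphasis on challenges that arise in real-world applications, such as an extremely skewed class distribution and complex visual variation."

Google Landmarks Dataset v2 (GLDv2) is a large-scale dataset designed specifically as a benchmark for instance-level recognition and image retrieval. As image retrieval and instance recognition techniques advance rapidly, challenging datasets are needed to measure their performance accurately and to pose new, practically relevant problems. GLDv2 was created to meet this need. Its characteristics include:

1. Large scale: GLDv2 has over 5 million images, far more than any previous dataset of its kind, which gives deep learning models ample training and validation data.
2. Fine-grained classes: the dataset contains more than 200,000 unique instance labels, so a class may contain only a few images, which makes the long-tailed distribution especially challenging.
3. Oriented toward real applications: GLDv2 was designed with real-world conditions in mind, such as landmarks photographed in different seasons, weather, and viewpoints, as well as the effect of lighting and occlusion on recognition.
4. High-quality test set: 118,000 precisely annotated test images provide a reliable basis for measuring the accuracy of retrieval and recognition algorithms.
5. Heavy human annotation: building the dataset required a very large amount of human annotation time, which ensures the quality and accuracy of the labels.

GLDv2 matters for progress in image retrieval and instance recognition. It helps researchers develop more efficient and more robust algorithms, and it encourages work in computer vision on imbalanced data distributions and complex visual environments. By training and testing on GLDv2, researchers can design models that are better suited to real-world scenes and that perform better in practice. The dataset also serves as a shared resource for collaboration between academia and industry.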

reduced training set of 1.6M images and […]k landmarks (see Sec. 5.1). While the index and training set do not share images, their label space is highly overlapping, with […] common classes. The query set is randomly split into 1/3 validation and 2/3 testing data. The validation data was used for the “Public” leaderboard in the Kaggle competition, which allowed participants to submit solutions and view their scores in real-time. The test set was used for the “Private” leaderboard, which was used for the final ranking and was only revealed at the end of the competition.
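The random 1/3 validation and 2/3 test split of the query set can be illustrated with a short sketch. This is only an assumption about how such a split might be done, not the procedure the dataset authors used; the function name `split_queries`, the seed, and the query IDs are hypothetical.

```python
import random


def split_queries(query_ids, val_fraction=1 / 3, seed=0):
    """Randomly split query IDs into (validation, test) lists.

    Hypothetical helper: the seed and rounding rule are assumptions,
    chosen only so the split is reproducible.
    """
    ids = sorted(query_ids)           # fixed starting order for reproducibility
    rng = random.Random(seed)         # local RNG, does not touch global state
    rng.shuffle(ids)
    n_val = round(len(ids) * val_fraction)
    return ids[:n_val], ids[n_val:]


if __name__ == "__main__":
    val, test = split_queries([f"q{i}" for i in range(9)])
    print(len(val), len(test))
```

Using a local `random.Random` instance keeps the split deterministic for a given seed without affecting other code that relies on the global random state.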

3.3. Challenges

Besides its scale, the Google Landmarks Dataset v2 presents practically relevant challenges, as motivated above.

Class distribution. The class distribution is extremely long-tailed, as illustrated in Fig. 1. […]% of classes have at most […] images and […]% of classes have at most […] images. The dataset therefore contains a wide variety of landmarks, from world-famous ones to lesser-known, local ones.
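Statistics of the form "a given share of classes have at most k images" can be computed directly from a list of per-image labels. The sketch below is illustrative only: the function name `fraction_of_classes_at_most` and the toy labels are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def fraction_of_classes_at_most(labels, k):
    """Return the fraction of classes that have at most k images.

    `labels` holds one class label per image (hypothetical input format).
    """
    counts = Counter(labels)                      # images per class
    small = sum(1 for c in counts.values() if c <= k)
    return small / len(counts)


if __name__ == "__main__":
    # Toy data: one frequent class and three smaller ones.
    toy_labels = ["a"] * 12 + ["b"] * 3 + ["c"] * 1 + ["d"] * 7
    print(fraction_of_classes_at_most(toy_labels, 5))
    print(fraction_of_classes_at_most(toy_labels, 10))
```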

Intra-class variation. As is typical for an image dataset collected from the web, the Google Landmarks Dataset v2 has large intra-class variability, including views from different vantage points and of different details of the landmarks, as well as both indoor and outdoor views for buildings.

Out-of-domain query images. To simulate a realistic query stream, the query set consists of only 1.1% images of landmarks and 98.9% out-of-domain images, for which no result is expected. This puts a strong emphasis on the importance of robustness in a practical instance recognition system.

3.4. Metrics

The Google Landmarks Dataset v2 uses well-established metrics, which we now introduce. Reference implementations are available on the dataset website.

Recognition is evaluated using micro Average Precision (µAP) […] with one prediction per query. This is also known as Global Average Precision (GAP). It is calculated by sorting all predictions in descending order of their confidence and computing:

$$\mu\mathrm{AP} = \frac{1}{M} \sum_{i=1}^{N} P(i)\,\mathrm{rel}(i), \tag{1}$$

where $N$ is the total number of predictions across all queries; $M$ is the total number of queries with at least one landmark from the training set visible in it (note that most queries do not depict landmarks); $P(i)$ is the precision at rank $i$; and $\mathrm{rel}(i)$ is a binary indicator function denoting the correctness of prediction $i$. Note that this metric penalizes a system that predicts a landmark for an out-of-domain query image; overall, it measures both ranking performance as well as the ability to set a common threshold across different queries.
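A minimal Python sketch of the recognition metric as described above may make the computation concrete. It is not the reference implementation from the dataset website; the function name `micro_average_precision` and the dictionary-based input format are assumptions made for this illustration.

```python
def micro_average_precision(predictions, ground_truth):
    """Micro Average Precision (GAP) with at most one prediction per query.

    predictions:  dict query_id -> (label, confidence); queries for which the
                  system abstains are simply absent (assumed input format).
    ground_truth: dict query_id -> set of correct labels; an empty set marks
                  an out-of-domain query.
    """
    # Number of queries that actually depict a landmark.
    num_landmark_queries = sum(1 for labels in ground_truth.values() if labels)
    if num_landmark_queries == 0:
        return 0.0

    # Sort all predictions by descending confidence, across all queries.
    ranked = sorted(predictions.items(), key=lambda kv: kv[1][1], reverse=True)

    correct = 0
    total = 0.0
    for rank, (query_id, (label, _conf)) in enumerate(ranked, start=1):
        if label in ground_truth.get(query_id, set()):
            correct += 1
            total += correct / rank   # precision at this rank, counted only for hits
    return total / num_landmark_queries


if __name__ == "__main__":
    gt = {"q1": {"A"}, "q2": set(), "q3": {"B"}, "q4": {"C"}}
    preds = {"q1": ("A", 0.9), "q2": ("X", 0.8), "q3": ("B", 0.7)}
    print(micro_average_precision(preds, gt))
```

In the toy example, the confident prediction for the out-of-domain query `q2` lowers the precision at the rank of `q3`, so dropping that prediction raises the score. This is the penalty for out-of-domain predictions that the text describes.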

Figure 3: Histogram of the number of images from the top-20 countries (blue) compared to their populations (red). Axes: number of images (thousands), population (millions).

Retrieval is evaluated using mean Average Precision@100 (mAP@100), which is a variant of the standard mAP metric that only considers the top-100 ranked images. We chose this limitation since exhaustive retrieval of every matching image is not necessary in most applications, like image search. The metric is computed as follows:

$$\mathrm{mAP@100} = \frac{1}{Q} \sum_{q=1}^{Q} \mathrm{AP@100}(q), \tag{2}$$

where

$$\mathrm{AP@100}(q) = \frac{1}{\min(m_q, 100)} \sum_{k=1}^{\min(n_q, 100)} P_q(k)\,\mathrm{rel}_q(k) \tag{3}$$

and $Q$ is the number of query images that depict landmarks from the index set; $m_q$ is the number of index images containing a landmark in common with the query image $q$ (note that this is only for queries which depict landmarks from the index set, so $m_q \neq 0$); $n_q$ is the number of predictions made by the system for query $q$; $P_q(k)$ is the precision at rank $k$ for the $q$-th query; and $\mathrm{rel}_q(k)$ is a binary indicator function denoting the relevance of prediction $k$ for the $q$-th query. Some query images will have no associated index images to retrieve; these queries are ignored in scoring, meaning this metric does not penalize the system if it retrieves landmark images for out-of-domain queries.
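The retrieval metric can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the reference implementation from the dataset website: the function names, the dictionary-based input format, and the assumption that each ranked list contains no duplicate index IDs are all hypothetical.

```python
def average_precision_at_k(ranked_ids, relevant_ids, k=100):
    """AP@k for one query; `relevant_ids` must be non-empty.

    Assumes `ranked_ids` has no duplicates (assumption of this sketch).
    """
    relevant = set(relevant_ids)
    hits = 0
    total = 0.0
    for rank, index_id in enumerate(ranked_ids[:k], start=1):
        if index_id in relevant:
            hits += 1
            total += hits / rank      # precision at this rank, counted only for hits
    return total / min(len(relevant), k)


def mean_average_precision_at_k(retrieved, relevant, k=100):
    """mAP@k over the queries that have at least one relevant index image.

    retrieved: dict query_id -> ranked list of index image IDs.
    relevant:  dict query_id -> set of relevant index image IDs
               (empty for out-of-domain queries, which are ignored).
    """
    scored = [q for q, rel in relevant.items() if rel]
    if not scored:
        return 0.0
    ap_sum = sum(
        average_precision_at_k(retrieved.get(q, []), relevant[q], k) for q in scored
    )
    return ap_sum / len(scored)


if __name__ == "__main__":
    relevant = {"q1": {"a", "b"}, "q2": set(), "q3": {"c"}}
    retrieved = {"q1": ["a", "x", "b"], "q2": ["y"], "q3": ["z", "c"]}
    print(mean_average_precision_at_k(retrieved, relevant))
```

In the toy example, query `q2` has no relevant index images, so the results returned for it do not affect the score, matching the statement that out-of-domain queries are ignored.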

3.5. Data Distribution

The Google Landmarks Dataset v2 is a truly world-spanning dataset, containing landmarks from 246 of the 249 countries in the ISO 3166-1 country code list. Fig. 3 shows the number of images in the top-20 countries and Fig. 4 shows the number of images by continent. We can see that even though the dataset is world-spanning, it is by no means a representative sample of the world, because the number of images per country depends heavily on the activity of the local Wikimedia Commons community.

Fig. 5 shows the distribution of the dataset images by landmark category, as obtained from the Google Knowledge Graph. By far the most frequent category is churches,
