Detecting Watermarks, Timestamps, and Frames (WTFs)
Tobias Weyand, Chih-Yun Tsai and Bastian Leibe
Summary
A common problem in computer vision applications based on Internet photos is false-positive matches caused by Watermarks, Timestamps, and Frames (WTFs) superimposed on the image content. If a WTF is present in two otherwise unrelated images, the pair is often falsely considered a match by local-feature-based image matching, because WTFs cause spatially coherent local feature matches even though the images show different objects. This in turn hurts computer vision applications such as image retrieval or large-scale Structure-from-Motion, which require reliable image matching as a building block.
We propose a simple but very effective approach to detect such WTF matches directly during image matching. Given a matching image pair with an estimated homography, we first determine similar regions in both images to compute similarity maps. Exploiting the fact that WTFs typically appear near the image border, we build a spatial histogram of the similar regions and apply a binary classifier to decide whether the match is due to a WTF. This approach detects many different kinds of watermarks, timestamps, and frames with high accuracy and resolves many of the resulting problems in image retrieval and clustering, as we demonstrate in our paper.
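The pipeline described above (similarity map, spatial histogram, binary decision) can be illustrated with a minimal Python sketch. This is not the released WTF-detector: the pixel-difference similarity measure, the 4x4 grid, the `border_score` rule, and all function names and thresholds are assumptions made for illustration. In particular, the hand-written border rule only stands in for the trained binary classifier, and the second image is assumed to be already warped into the first image's frame using the estimated homography.

```python
import numpy as np


def similarity_map(img_a, img_b_warped, threshold=0.1):
    """Boolean map of pixels whose intensities nearly agree.

    Assumption: both inputs are grayscale arrays of equal shape with values
    in [0, 1], and img_b_warped is already aligned to img_a.
    """
    a = np.asarray(img_a, dtype=np.float64)
    b = np.asarray(img_b_warped, dtype=np.float64)
    return np.abs(a - b) < threshold


def spatial_histogram(sim_map, grid=(4, 4)):
    """Fraction of similar pixels in each cell of a grid laid over the image.

    Assumption: the image is at least as large as the grid in both dimensions.
    """
    rows, cols = grid
    height, width = sim_map.shape
    row_edges = np.linspace(0, height, rows + 1).astype(int)
    col_edges = np.linspace(0, width, cols + 1).astype(int)
    hist = np.zeros(grid)
    for r in range(rows):
        for c in range(cols):
            cell = sim_map[row_edges[r]:row_edges[r + 1],
                           col_edges[c]:col_edges[c + 1]]
            hist[r, c] = cell.mean()
    return hist


def border_score(hist):
    """Share of the histogram mass that lies in the outermost grid cells."""
    total = hist.sum()
    if total == 0:
        return 0.0
    interior = hist[1:-1, 1:-1].sum()
    return float((total - interior) / total)


def is_wtf_match(hist, cutoff=0.9):
    """Hypothetical stand-in for the trained binary classifier.

    On a 4x4 grid, uniformly similar images score 12/16 = 0.75, so the
    cutoff has to sit above that value to let genuine matches pass.
    """
    return border_score(hist) > cutoff


# Toy example: two images with different content but an identical frame.
frame_a = np.zeros((64, 64))
frame_b = np.ones((64, 64))
for img in (frame_a, frame_b):
    img[:8, :] = 0.5
    img[-8:, :] = 0.5
    img[:, :8] = 0.5
    img[:, -8:] = 0.5

hist = spatial_histogram(similarity_map(frame_a, frame_b))
print(is_wtf_match(hist))
```

In this toy case all similar pixels lie in the shared frame, so the histogram mass is concentrated in the border cells. A real detector would feed the histogram to a classifier trained on annotated match pairs rather than apply a fixed cutoff.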
The code of our WTF-detector as well as the dataset we collected to train and evaluate it are available below.
Paper
This approach was published in:
Fixing WTFs: Detecting Image Matches caused by Watermarks, Timestamps, and Frames in Internet Photos (PDF, Slides, Poster)
T. Weyand, C.-Y. Tsai, B. Leibe. IEEE Winter Conference on Applications of Computer Vision (WACV'15), 2015, Waikoloa Beach, Hawaii
Software
If you use this software for research purposes, we ask you to cite our paper:
@InProceedings{WeyandTsai15WACV,
title = {{Fixing WTFs: Detecting Image Matches caused by Watermarks, Timestamps, and Frames in Internet Photos}},
author = {T. Weyand and C.-Y. Tsai and B. Leibe},
booktitle = {{IEEE Winter Conference on Applications of Computer Vision (WACV'15)}},
year = 2015
}
The code is licensed under the GPLv3.
THIS CODE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Use at your own risk.
Dataset
We provide a dataset of 36,240 matching image pairs. 90% of these pairs are valid matches, and 10% are WTF-matches. The dataset comes with a hand-annotated ground truth and the cross-validation splits we used in the paper to allow comparison with our results.
wtf_dataset.tgz (8.2GB)
Contact
Tobias Weyand (weyand@vision.rwth-aachen.de)
Chih-Yun Tsai (chi.tsai@rwth-aachen.de)