Preventing reverse search engines from finding the origin of your image?
I am planning to make a game where I give the user an anime-related image and they have to guess which anime it's from.
However, most of the images, especially those from popular anime, are easy to run through Google reverse image search to pinpoint the anime.
I've tried to see if I could make it so the image would not be recognized, but unfortunately my tricks were not good enough.
Grey-scale with horizontal flip:
Puzzling the image with grey-scale:
Grey-scale with horizontal and vertical flip worked for this image but doesn't always work:
Also, the above-mentioned methods still make it rather easy for people to guess the anime.
I would like to know if there is any good trick I could use on my images so that reverse search engines fail on them, yet that is not so overcomplicated that I cannot implement it in my program.
For example, grey-scaling, cropping, and flipping are rather easy things to achieve in C#.
I think you were on the right track with your watermarking option, but you left too much of the original image intact. Here are two images I tried that Google was unable to find:
The first image returns a lot of "checkered flag" results, and the second returns lots of mosaic/collage images. Size does matter! I initially tried it with a much smaller checkerboard pattern (16px); Google was still able to identify that. These 32px squares seem to be a happy medium.
Based on the information that DanS provided, I think this would be a dependable technique to fool Google (and it is easily automated!). I can only presume that someone who could identify this anime would still be able to do so from these images.
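For what it's worth, here is a minimal sketch of how the checkerboard overlay could be automated. It assumes the Pillow library, and the function name and defaults are my own invention; the 32px square size is the one that worked in my tests above (16px was still matched).

```python
from PIL import Image

def checkerboard_mask(img, square=32, color=(255, 255, 255)):
    """Return a copy of `img` with alternating square-px blocks
    painted over, like the black squares of a chessboard."""
    out = img.convert("RGB").copy()
    block = Image.new("RGB", (square, square), color)
    for y in range(0, out.height, square):
        for x in range(0, out.width, square):
            # Paint only the cells where column + row index is even.
            if ((x // square) + (y // square)) % 2 == 0:
                out.paste(block, (x, y))
    return out
```

Run it over each image before publishing; since roughly half of every 32px neighbourhood is destroyed, the downscaled thumbnail the search engine hashes no longer resembles the original.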
Ironically, the images I have created will eventually be indexed by Google and lead to this post, defeating the purpose!
Google may use a different system, but a large number of such services (TinEye included) use perceptual hashes, where a match only requires the hashes to be close enough rather than identical.
A whitepaper showed up a few years back that detailed the process. I haven't been able to find a link to it, but the basic system relies on a chain of steps to generate the hashes:
- Reduce the image to a small scale, usually 32x32 or 64x64
- Convert the image to greyscale
- Ramp up the contrast to a predefined value, to ensure a high level of difference between the black and white tones
- Calculate the pHash from the pixels in the resulting image
The process is repeated for any uploaded image, and the result is then cross-checked against indexed hashes to find any near matches. In short, the image must be drastically changed across large portions to fool any system like this.
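The steps above can be sketched as the simplified "average hash" variant (function names are hypothetical; real pHash implementations typically add a DCT step after the downscale, but the matching idea is the same). It assumes the Pillow library:

```python
from PIL import Image

def average_hash(img, hash_size=32):
    # Downscale and convert to greyscale, per the steps above.
    small = img.convert("L").resize((hash_size, hash_size), Image.LANCZOS)
    pixels = list(small.getdata())
    avg = sum(pixels) / len(pixels)
    # "Ramp up the contrast": threshold each pixel against the mean,
    # leaving a pure black/white bit pattern as the hash.
    return [1 if p > avg else 0 for p in pixels]

def hamming(h1, h2):
    # Near matches = small Hamming distance between the bit patterns.
    return sum(a != b for a, b in zip(h1, h2))
```

Because matching is by Hamming distance rather than exact equality, local tweaks (a small watermark, mild cropping, colour shifts) flip only a few bits and still match; that is why only large-area changes, like the checkerboard masking, defeat it.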