Average hash
========================

After the decolorizing and scaling step, each pixel block is compared to the average (as the name suggests) of all pixel values of the image. In the example below we generate a 64-bit hash, which means that the image is scaled to 8×8 pixels. If the value in the pixel block is larger than the average, it gets value 1 (white) and otherwise a 0 (black). The final image hash is obtained by flattening the binary array into a vector.

.. code:: python

    # Import libraries
    import matplotlib.pyplot as plt
    from undouble import Undouble

    # Initialize with the average-hash method
    model = Undouble(method='ahash')

    # Import example
    X = model.import_example(data='cat_and_dog')
    imgs = model.import_data(X, return_results=True)

    # Compute hash for a single image
    hashs = model.compute_imghash(imgs['img'][0], to_array=False, hash_size=8)

    # The hash is a binary array or vector.
    print(hashs)

    # Plot the image using the undouble plot_hash functionality
    model.results['img_hash_bin']
    model.plot_hash(idx=0)

    # Plot the image manually
    fig, ax = plt.subplots(1, 2, figsize=(8, 8))
    ax[0].imshow(imgs['img'][0])
    ax[1].imshow(hashs[0])

.. |ahash| image:: ../figs/ahash.png

.. table:: Average hash
   :align: center

   +----------+
   | |ahash|  |
   +----------+


Perceptual hash
========================

After the first step of decolorizing, a Discrete Cosine Transform (DCT) is applied; first per row and afterward per column. The DCT matrix is then cropped to the top-left 8×8 block, which holds the lowest frequencies. Each value in this block is compared to the median of the block: if the value is larger than the median, it gets value 1 and otherwise a 0. The final image hash is obtained by flattening the binary array into a vector.

.. code:: python

    # Import libraries
    import matplotlib.pyplot as plt
    from undouble import Undouble

    # Initialize with the perceptual-hash method
    model = Undouble(method='phash')

    # Import example
    X = model.import_example(data='cat_and_dog')
    imgs = model.import_data(X, return_results=True)

    # Compute hash for a single image
    hashs = model.compute_imghash(imgs['img'][0], to_array=False, hash_size=8)

    # The hash is a binary array or vector.
    print(hashs)

    # Plot the image using the undouble plot_hash functionality
    model.results['img_hash_bin']
    model.plot_hash(idx=0)

    # Plot the image manually
    fig, ax = plt.subplots(1, 2, figsize=(8, 8))
    ax[0].imshow(imgs['img'][0])
    ax[1].imshow(hashs[0])

.. |phash| image:: ../figs/phash.png

.. table:: Perceptual hash
   :align: center

   +----------+
   | |phash|  |
   +----------+


Differential hash
========================

After the first step of decolorizing and scaling, the pixels are serially (from left to right per row) compared to their neighbor to the right. If the pixel value at position x is smaller than the value at position (x+1), the bit gets value 1 and otherwise a 0. The final image hash is obtained by flattening the binary array into a vector. A small stand-alone NumPy sketch of the average- and differential-hash comparisons is shown after the table below.

.. code:: python

    # Import libraries
    import matplotlib.pyplot as plt
    from undouble import Undouble

    # Initialize with the differential-hash method
    model = Undouble(method='dhash')

    # Import example
    X = model.import_example(data='cat_and_dog')
    imgs = model.import_data(X, return_results=True)

    # Compute hash for a single image
    hashs = model.compute_imghash(imgs['img'][0], to_array=False, hash_size=8)

    # The hash is a binary array or vector.
    print(hashs)

    # Plot the image using the undouble plot_hash functionality
    model.results['img_hash_bin']
    model.plot_hash(idx=0)

    # Plot the image manually
    fig, ax = plt.subplots(1, 2, figsize=(8, 8))
    ax[0].imshow(imgs['img'][0])
    ax[1].imshow(hashs[0])

.. |dhash| image:: ../figs/dhash.png

.. table:: Differential hash
   :align: center

   +----------+
   | |dhash|  |
   +----------+
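To make the pixel comparisons behind the average and differential hash concrete, the snippet below recomputes both on small toy arrays with plain NumPy. It is an illustrative sketch only; the arrays and variable names are made up for this example and it does not use the undouble API.

.. code:: python

    import numpy as np

    # Toy grayscale blocks that stand in for the decolorized, rescaled image.
    # An 8x8 block yields a 64-bit average hash; a difference hash typically
    # uses a (hash_size+1) x hash_size block so that each row gives 8 comparisons.
    rng = np.random.default_rng(0)
    block_a = rng.integers(0, 256, size=(8, 8)).astype(float)
    block_d = rng.integers(0, 256, size=(8, 9)).astype(float)

    # Average hash: compare every pixel to the mean of the whole block.
    ahash_bits = (block_a > block_a.mean()).astype(int)

    # Differential hash: compare every pixel to its right-hand neighbor;
    # a 1 means the value at position (x+1) is larger than at position x.
    dhash_bits = (block_d[:, :-1] < block_d[:, 1:]).astype(int)

    # Flattening the binary arrays gives the final 64-bit hash vectors.
    print(ahash_bits.ravel())
    print(dhash_bits.ravel())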
Haar wavelet hash
========================

After the first step of decolorizing and scaling, a two-dimensional Haar wavelet transform is applied to the image. Each value of the transformed block is then compared to the median of that block. If the value is larger than the median, it gets value 1 and otherwise a 0. The final image hash is obtained by flattening the binary array into a vector.

.. code:: python

    # Import libraries
    import matplotlib.pyplot as plt
    from undouble import Undouble

    # Initialize with the Haar wavelet-hash method
    model = Undouble(method='whash-haar')

    # Import example
    X = model.import_example(data='cat_and_dog')
    imgs = model.import_data(X, return_results=True)

    # Compute hash for a single image
    hashs = model.compute_imghash(imgs['img'][0], to_array=False, hash_size=8)

    # The hash is a binary array or vector.
    print(hashs)

    # Plot the image using the undouble plot_hash functionality
    model.results['img_hash_bin']
    model.plot_hash(idx=0)

    # Plot the image manually
    fig, ax = plt.subplots(1, 2, figsize=(8, 8))
    ax[0].imshow(imgs['img'][0])
    ax[1].imshow(hashs[0])

.. |whash| image:: ../figs/whash.png

.. table:: Haar wavelet hash
   :align: center

   +----------+
   | |whash|  |
   +----------+


Crop-resistant hash
========================

The crop-resistant hash is implemented as described in the paper "Efficient Cropping-Resistant Robust Image Hashing" (DOI: 10.1109/ARES.2014.85). The algorithm partitions the image into bright and dark segments using a watershed-like algorithm and then computes an image hash for each segment. This makes the hash far more resistant to cropping than the other algorithms: the paper claims resistance to up to 50% cropping, whereas most other algorithms break down at roughly 5% cropping.

.. code:: python

    # Import library
    from undouble import Undouble

    # Init with default settings
    model = Undouble()

    # Import example data
    targetdir = model.import_example(data='flowers')

    # Import the files from disk, clean and pre-process them
    model.import_data(targetdir)

    # Compute image-hash
    model.compute_hash(method='crop-resistant-hash')

    # Find images with image-hash <= threshold
    results = model.group(threshold=5)

    # Plot the images
    model.plot()

    # Print the output for demonstration
    print(model.results.keys())

    # The detected groups
    model.results['select_pathnames']
    model.results['select_scores']
    model.results['select_idx']

    # Plot the hash for the first group
    model.plot_hash(filenames=model.results['filenames'][model.results['select_idx'][0]])
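The grouping step above compares binary hashes by their Hamming distance: ``results = model.group(threshold=5)`` keeps two images in the same group when their hashes differ in at most five bits. The stand-alone sketch below illustrates this distance computation on two hypothetical 64-bit hash vectors; the vectors are made up for the example and are not undouble output.

.. code:: python

    import numpy as np

    # Two hypothetical 64-bit binary hash vectors, e.g. the flattened hashes
    # of two images; the second differs from the first in three bits.
    hash_a = np.random.default_rng(1).integers(0, 2, size=64)
    hash_b = hash_a.copy()
    hash_b[:3] = 1 - hash_b[:3]

    # Hamming distance: the number of bit positions at which the hashes differ.
    hamming = int(np.sum(hash_a != hash_b))
    print(hamming)       # 3

    # With group(threshold=5) these two images would end up in the same group,
    # because their hash difference does not exceed the threshold.
    print(hamming <= 5)  # True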
Plot image hash
========================

The examples above were created using the code below:

.. code:: python

    # pip install imagesc
    import cv2
    from scipy.spatial import distance
    import numpy as np
    import matplotlib.pyplot as plt
    from imagesc import imagesc
    from undouble import Undouble

    methods = ['ahash', 'dhash', 'whash-haar']

    for method in methods:
        # Initialize with the selected hash method
        model = Undouble(method=method, hash_size=8)

        # Import example data
        targetdir = model.import_example(data='cat_and_dog')

        # Grayscaling and scaling
        model.import_data(targetdir)

        # Compute the image-hash for the first image only
        hashs = model.compute_imghash(model.results['img'][0], to_array=True)

        # Print the hash as a bit string
        print(method + ' Hash:')
        image_hash = ''.join(hashs[0].astype(int).astype(str).ravel())
        print(image_hash)

        # Import image for plotting purposes
        img_g = cv2.imread(model.results['pathnames'][0], cv2.IMREAD_GRAYSCALE)
        img_r = cv2.resize(img_g, (8, 8), interpolation=cv2.INTER_AREA)

        # Make the figure
        fig, ax = plt.subplots(2, 2, figsize=(15, 10))
        ax[0][0].imshow(model.results['img'][0][..., ::-1])
        ax[0][0].axis('off')
        ax[0][0].set_title('Source image')
        ax[0][1].imshow(img_g, cmap='gray')
        ax[0][1].axis('off')
        ax[0][1].set_title('Grayscale image')
        ax[1][0].imshow(img_r, cmap='gray')
        ax[1][0].axis('off')
        ax[1][0].set_title('Grayscale image, size %dx%d' % (8, 8))
        ax[1][1].imshow(hashs[0], cmap='gray')
        ax[1][1].axis('off')
        ax[1][1].set_title(method + ' function')

        # Compute the image-hash for the 10 images.
        hashs = model.compute_imghash(model, to_array=False)

        # Compute the number of bit differences across all image pairs.
        adjmat = np.zeros((hashs.shape[0], hashs.shape[0]))
        for i, h1 in enumerate(hashs):
            for j, h2 in enumerate(hashs):
                adjmat[i, j] = np.sum(h1 != h2)

        # Compute the average image-hash difference.
        diff = np.mean(adjmat[np.triu_indices(adjmat.shape[0], k=1)])
        print('[%s] Average difference: %.2f' % (method, diff))

        # Make a heatmap to demonstrate the differences between the image-hashes
        imagesc.plot(hashs, cmap='gray', col_labels='', row_labels=model.results['filenames'], cbar=False, title=method + '\nAverage difference: %.3f' % (diff), annot=True)


.. include:: add_bottom.add