Examples
========

This section provides comprehensive examples of using KNRscore to compare different dimensionality reduction techniques. Each example demonstrates a specific use case and includes a detailed interpretation of the results.

High-Dimensional Embedding Comparison: PCA vs t-SNE
---------------------------------------------------

In this example, we compare a 50-dimensional PCA embedding with a 2-dimensional t-SNE embedding of the MNIST dataset. This comparison shows how well t-SNE preserves the structure of the high-dimensional PCA space.

.. code-block:: python

    # Import required libraries
    from sklearn import (manifold, decomposition)
    import numpy as np
    import KNRscore as knrs

    # Load MNIST example data
    X, y = knrs.import_example()

    # Create PCA embedding (50 dimensions)
    X_pca_50 = decomposition.TruncatedSVD(n_components=50).fit_transform(X)

    # Create t-SNE embedding (2 dimensions)
    X_tsne = manifold.TSNE(n_components=2, init='pca').fit_transform(X)

    # Compare embeddings
    scores = knrs.compare(X_pca_50, X_tsne, n_steps=5)

    # Visualize comparison
    fig, ax = knrs.plot(scores, xlabel='PCA (50d)', ylabel='tSNE (2d)')

.. image:: ../figs/pca50_tsne.png
    :width: 600
    :align: center
    :alt: PCA 50D vs t-SNE 2D Comparison

**Interpretation**:

- The heatmap shows high similarity scores (green/yellow) across different neighborhood sizes
- This indicates that t-SNE successfully preserves both local and global structure from the PCA space
- The consistently high scores suggest that t-SNE maintains the relative positions of samples well

2D Embedding Comparison: PCA vs t-SNE
-------------------------------------

Here we compare two 2-dimensional embeddings to understand how different dimensionality reduction techniques represent the same data in low-dimensional space.

.. code-block:: python

    # Create 2D PCA embedding
    X_pca_2 = decomposition.TruncatedSVD(n_components=2).fit_transform(X)

    # Create 2D t-SNE embedding
    X_tsne = manifold.TSNE(n_components=2, init='pca').fit_transform(X)

    # Compare embeddings
    scores = knrs.compare(X_pca_2, X_tsne, n_steps=5)

    # Visualize comparison
    fig, ax = knrs.plot(scores, xlabel='PCA (2d)', ylabel='tSNE (2d)')

.. image:: ../figs/pca2_tsne.png
    :width: 600
    :align: center
    :alt: PCA 2D vs t-SNE 2D Comparison

**Interpretation**:

- Lower similarity scores (blue) indicate substantial differences in local neighborhood structure
- The increasing similarity at larger scales suggests that global structure is better preserved
- This demonstrates how different techniques prioritize different aspects of the data structure

Random Data Comparison
----------------------

This example demonstrates how KNRscore detects embeddings that share no structure, by comparing t-SNE with a random permutation of its own coordinates.

.. code-block:: python

    # Create random permutation of the t-SNE coordinates
    X_rand = np.c_[np.random.permutation(X_tsne[:, 0]),
                   np.random.permutation(X_tsne[:, 1])]

    # Compare random data with t-SNE
    scores = knrs.compare(X_rand, X_tsne, n_steps=5)

    # Visualize comparison
    fig, ax = knrs.plot(scores, xlabel='Random (2d)', ylabel='tSNE (2d)')

.. image:: ../figs/random_tsne.png
    :width: 600
    :align: center
    :alt: Random Data vs t-SNE Comparison

**Interpretation**:

- Consistently low similarity scores (blue) across all scales
- This confirms that random permutation destroys both local and global structure
- Serves as a useful baseline for comparison with other embeddings (quantified in the sketch below)
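The baseline check can also be made quantitative rather than visual. The following is a minimal sketch, continuing from the examples above; it assumes, as in the parameter-optimization example further down, that ``knrs.compare()`` returns a dictionary whose ``'scores'`` entry holds the raw similarity matrix.

.. code-block:: python

    # Summarize each similarity matrix with its mean score
    # (assumes the 'scores' key, as used in the parameter-optimization example)
    scores_real = knrs.compare(X_pca_50, X_tsne, n_steps=5)
    scores_rand = knrs.compare(X_rand, X_tsne, n_steps=5)

    # A structured embedding should score well above the random baseline
    print(f"PCA (50d) vs tSNE: {np.mean(scores_real['scores']):.3f}")
    print(f"Random vs tSNE   : {np.mean(scores_rand['scores']):.3f}")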
Visualization Examples
----------------------

KNRscore also provides tools for creating scatter plots of the embeddings:

.. code-block:: python

    # Create scatter plot of the PCA embedding
    fig, ax = knrs.scatter(X_pca_2[:, 0], X_pca_2[:, 1], labels=y, title='PCA', density=False)

    # Create scatter plot of the t-SNE embedding
    fig, ax = knrs.scatter(X_tsne[:, 0], X_tsne[:, 1], labels=y, title='tSNE')

    # Create scatter plot of the random data
    fig, ax = knrs.scatter(X_rand[:, 0], X_rand[:, 1], labels=y, title='Random')

.. image:: ../figs/scatter_pca.png
    :width: 600
    :align: center
    :alt: PCA Scatter Plot

.. image:: ../figs/scatter_tsne.png
    :width: 600
    :align: center
    :alt: t-SNE Scatter Plot

.. image:: ../figs/scatter_random.png
    :width: 600
    :align: center
    :alt: Random Data Scatter Plot

**Visualization Features**:

- Color-coded by class labels
- Optional density estimation
- Customizable markers and sizes
- Interactive plotting capabilities

Advanced Usage
--------------

For more advanced usage, consider:

1. **Custom Neighborhood Sizes**:

.. code-block:: python

    # Compare with a larger neighborhood and finer steps
    # (X_pca_50 and X_tsne are the embeddings from the first example)
    scores = knrs.compare(X_pca_50, X_tsne, nn=100, n_steps=10)

2. **Multiple Comparisons**:

.. code-block:: python

    # pip install umap-learn
    import umap

    # Create PCA embedding (50 dimensions)
    X_pca = decomposition.TruncatedSVD(n_components=50).fit_transform(X)

    # Create t-SNE embedding (2 dimensions)
    X_tsne = manifold.TSNE(n_components=2, init='pca').fit_transform(X)

    # Create UMAP embedding (2 dimensions)
    X_umap = umap.UMAP(n_components=2).fit_transform(X)

    # Compare multiple embeddings
    embeddings = {
        'PCA': X_pca,
        'tSNE': X_tsne,
        'UMAP': X_umap,
    }

    for name1, emb1 in embeddings.items():
        for name2, emb2 in embeddings.items():
            # The lexicographic check visits each pair exactly once
            if name1 < name2:
                scores = knrs.compare(emb1, emb2)
                knrs.plot(scores, xlabel=name1, ylabel=name2)

3. **Parameter Optimization**:

.. code-block:: python

    # Find a suitable t-SNE perplexity: a higher mean score is better
    perplexities = [5, 30, 50, 100]
    for p in perplexities:
        X_tsne = manifold.TSNE(perplexity=p).fit_transform(X)
        scores = knrs.compare(X_pca, X_tsne, n_steps=10)
        print(f"Perplexity {p}: {np.mean(scores['scores']):.3f}")

    # 100%|██████████| 50/50 [01:47<00:00, 2.15s/it]
    # Perplexity 5: 0.844
    # 100%|██████████| 50/50 [01:33<00:00, 1.88s/it]
    # Perplexity 30: 0.877
    # 100%|██████████| 50/50 [01:31<00:00, 1.82s/it]
    # Perplexity 50: 0.885
    # 100%|██████████| 50/50 [01:17<00:00, 1.55s/it]
    # Perplexity 100: 0.889

.. include:: add_bottom.add