Skip to content

ShriKaranHanda/pagerank

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PageRank via Power Iteration

This project implements Google's PageRank algorithm using eigenvector computation via the power iteration method.

Overview

PageRank models the web as a directed graph where nodes represent pages and edges represent links. The ranking corresponds to the stationary distribution of a Markov chain, computed as the dominant eigenvector of the stochastic matrix.

Features

  • Graph Construction: Creates a small directed graph (10-20 nodes) representing web pages
  • Link Matrix Construction: Builds the column-stochastic link matrix with damping factor
  • Power Iteration: Implements the power iteration algorithm from scratch
  • Convergence Analysis: Studies convergence behavior vs. tolerance and damping factor α
  • Comparison: Validates results against scipy.linalg.eig and NetworkX PageRank
  • Spectral Analysis: Analyzes eigenvalue spectrum and spectral gap
  • Visualization: Comprehensive plots and graphs throughout

Installation

  1. Clone this repository
  2. Install required packages:
pip install -r requirements.txt

Usage

Open and run the Jupyter notebook:

jupyter notebook pagerank.ipynb

The notebook is organized into sections:

  1. Setup and Imports
  2. Graph Creation and Visualization
  3. PageRank Matrix Construction
  4. Power Iteration Implementation
  5. Convergence Analysis
  6. Comparison with Built-in Solvers
  7. Spectral Analysis
  8. Summary and Analysis

Technical Report

A comprehensive LaTeX technical report is available in the report/ directory. To generate the PDF:

# First, run the notebook to generate figures
jupyter notebook pagerank.ipynb  # Run all cells

# Then compile the report
cd report
./compile_report.sh

See report/README.md for detailed instructions.

Key Results

  • Power iteration converges efficiently for typical damping factors (α ≈ 0.85)
  • Convergence speed is determined by the spectral gap (λ₁ - λ₂)
  • Higher damping factors lead to slower convergence but better page discrimination
  • All methods (power iteration, eigensolver, NetworkX) produce consistent results

References

  1. S. Brin and L. Page, "The Anatomy of a Large-Scale Hypertextual Web Search Engine", Computer Networks, 1998.
  2. D. Gleich, "PageRank Beyond the Web", SIAM Review, 2015.
  3. L. Page, S. Brin, R. Motwani, and T. Winograd, "The PageRank Citation Ranking: Bringing Order to the Web", Technical Report, Stanford University, 1998.

Future Extensions

  • Sparse matrix implementation for larger graphs
  • Personalized PageRank
  • Dynamic PageRank updates
  • Topic-sensitive PageRank
  • Real-world graph testing

About

Exploring PageRank

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published