Reproducibility Study Of Learning Fair Graph Representations Via Automated Data Augmentations

Project Overview

This research undertakes a comprehensive reproducibility analysis of "Learning Fair Graph Representations Via Automated Data Augmentations" by Ling et al. (2022). We assess the validity of the original claims focused on node classification tasks and explore the performance of the Graphair framework in link prediction tasks across various real-world datasets.

Key Features

Replication of original experiments to assess reproducibility of three main claims
Extension of Graphair framework from node classification to link prediction tasks
Cross-dataset evaluation with NBA, Pokec-n, Pokec-z, Citeseer, Cora, and PubMed datasets
Implementation of dyadic-level fairness metrics for link prediction
Comparative analysis with baseline models (FairAdj, FairDrop)
Ablation studies on model components and hyperparameter sensitivity
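As an illustration of what a dyadic-level fairness metric can look like, the sketch below computes a demographic parity gap over groups of links rather than groups of nodes. It is a minimal sketch under stated assumptions, not the study's implementation: the function name `demographic_parity_gap`, the 0.5 threshold, and the toy data are all hypothetical, and the metric is taken to be the largest difference in positive-prediction rate between any two dyadic groups.

```python
import numpy as np


def demographic_parity_gap(scores, groups, threshold=0.5):
    """Largest difference in positive-prediction rate between dyadic groups.

    scores: predicted link probabilities, one per candidate link.
    groups: dyadic group label of each candidate link.
    threshold: assumed cut-off for predicting that a link exists.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    predicted = scores >= threshold
    # Positive-prediction rate within each dyadic group.
    rates = [predicted[groups == g].mean() for g in np.unique(groups)]
    return float(max(rates) - min(rates))


# Toy data (hypothetical): six candidate links,
# group 0 = intra-group links, group 1 = inter-group links.
scores = [0.9, 0.8, 0.2, 0.7, 0.3, 0.1]
groups = [0, 0, 0, 1, 1, 1]
gap = demographic_parity_gap(scores, groups)
print(gap)
```

A gap of zero would mean every dyadic group receives positive link predictions at the same rate; larger values indicate a larger disparity.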

Technical Implementation

We implemented the study using the DIG library's Graphair module, with modifications for link prediction. The framework combines adversarial training for fairness, contrastive learning for informativeness, and reconstruction regularization. To support dyadic fairness metrics, we computed link representations as Hadamard products of node embeddings and classified links into subgroup and mixed dyadic groups. Hyperparameters were tuned through extensive grid search, and experiments ran on NVIDIA A100 GPUs in a high-performance computing environment.
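The link representation and dyadic grouping steps can be sketched as follows. This is a minimal NumPy sketch, not the code from the repository: the helper names (`edge_embeddings`, `mixed_dyadic_groups`, `subgroup_dyadic_groups`) and the toy data are hypothetical. It assumes that mixed dyadic groups separate intra-group from inter-group links, and that subgroup dyadic groups are indexed by the unordered pair of sensitive attribute values at the two endpoints.

```python
import numpy as np


def edge_embeddings(z, edges):
    """Hadamard (element-wise) product of the endpoint embeddings of each link."""
    edges = np.asarray(edges)
    return z[edges[:, 0]] * z[edges[:, 1]]


def mixed_dyadic_groups(s, edges):
    """0 for intra-group links (same sensitive value), 1 for inter-group links."""
    edges = np.asarray(edges)
    return (s[edges[:, 0]] != s[edges[:, 1]]).astype(int)


def subgroup_dyadic_groups(s, edges):
    """Unordered pair of sensitive values per link, e.g. (0, 0), (0, 1), (1, 1)."""
    edges = np.asarray(edges)
    low = np.minimum(s[edges[:, 0]], s[edges[:, 1]])
    high = np.maximum(s[edges[:, 0]], s[edges[:, 1]])
    return [(int(a), int(b)) for a, b in zip(low, high)]


# Toy data (hypothetical): three nodes with 2-d embeddings and a binary attribute.
z = np.array([[1.0, 2.0], [3.0, 0.5], [0.0, 4.0]])
s = np.array([0, 1, 1])
edges = [(0, 1), (1, 2), (0, 2)]

h = edge_embeddings(z, edges)           # one row per link
mixed = mixed_dyadic_groups(s, edges)   # intra- vs inter-group label per link
sub = subgroup_dyadic_groups(s, edges)  # sensitive-value pair per link
```

Because the Hadamard product is symmetric in its two arguments, the resulting link representation does not depend on the order of the endpoints, which suits undirected graphs.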

Key Findings & Impact

We successfully reproduced two of the three original claims and partially reproduced the third; we attribute the discrepancies to differences in experimental setup and training epochs. We also extended Graphair to link prediction, where it showed a superior fairness-accuracy trade-off for subgroup dyadic-level fairness compared to the baseline models. The study validates Graphair's adaptability across different downstream tasks and provides insights into the challenges of reproducing graph representation learning research.

Publication

Co-authored with Thijmen Nijdam, Juell Sprott, and Jurgen de Heus from the University of Amsterdam. Code and data are publicly available at: https://github.com/juellsprott/graphair-reproducibility