Proxi: a Python package for proximity graph construction


Graph-based representation of metagenomic data is a promising direction not only for analyzing microbial interactions but also for a broad range of machine learning tasks including feature selection, classification, clustering, anomaly detection, and dimensionality reduction.

Proxi is an open-source Python package for proximity graph construction. In a proximity graph, each node is connected by an edge (directed or undirected) to its nearest neighbors according to some distance metric d. The current implementation supports three types of proximity graphs: k-nearest neighbor (k-NN) graphs, which link each node to its k closest nodes; radius-nearest neighbor (r-NN) graphs, which link each node to all nodes within distance r; and perturbed k-nearest neighbor (pk-NN) graphs. The pk-NN algorithm constructs improved k-NN graphs from noisy data using bootstrapping and graph aggregation techniques.
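To make the three graph types concrete, here is a minimal pure-Python sketch. It is not Proxi's API: the function names (knn_graph, rnn_graph, pknn_graph), the parameters n_rounds and min_support, and the choice of Euclidean distance are all hypothetical and chosen only for illustration. In particular, the description above says only that pk-NN uses bootstrapping and graph aggregation; resampling feature columns with replacement and keeping edges by frequency threshold is one plausible reading, assumed here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(points, k, dist=euclidean):
    """Directed k-NN graph: edge (i, j) if j is one of the k nodes closest to i."""
    edges = set()
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort candidates by distance; ties are broken by node index.
        others.sort(key=lambda j: (dist(p, points[j]), j))
        edges.update((i, j) for j in others[:k])
    return edges


def rnn_graph(points, radius, dist=euclidean):
    """Undirected r-NN graph: edge (i, j), i < j, if d(i, j) <= radius."""
    n = len(points)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if dist(points[i], points[j]) <= radius
    }


def pknn_graph(points, k, n_rounds=20, min_support=0.5, seed=0, dist=euclidean):
    """Perturbed k-NN sketch (assumed scheme, not Proxi's implementation).

    Each round resamples the feature columns with replacement, builds a
    k-NN graph on the resampled data, and counts every edge it produces.
    Edges seen in at least `min_support` of the rounds are kept.
    """
    rng = random.Random(seed)
    n_features = len(points[0])
    counts = {}
    for _ in range(n_rounds):
        cols = [rng.randrange(n_features) for _ in range(n_features)]
        resampled = [[p[c] for c in cols] for p in points]
        for edge in knn_graph(resampled, k, dist):
            counts[edge] = counts.get(edge, 0) + 1
    return {e for e, c in counts.items() if c / n_rounds >= min_support}


# Two well-separated pairs of nodes, each described by two features.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
knn_edges = knn_graph(points, k=1)
rnn_edges = rnn_graph(points, radius=1.5)
pknn_edges = pknn_graph(points, k=1)
```

The aggregation step is what distinguishes the perturbed variant: an edge that appears only under a few resamplings is treated as noise and dropped, while edges that persist across most rounds are retained. This brute-force sketch computes all pairwise distances and is meant for reading, not for large datasets.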

Availability: Proxi source code is freely available at

Documentation: Tutorials and online documentation are available at

Citing Proxi:
El-Manzalawy, Y. (2018). Proxi: a Python package for proximity network inference from metagenomic data. bioRxiv, 357764.