Pitch

Estimate the geographical distribution of Ethereum validators.

Abstract

For Ethereum to remain credibly neutral, its validators need to be widely distributed across many geographical locations. We know that a large share of Ethereum nodes are geolocated in North America and Western Europe, but node counts give an incomplete picture of the network: a single node can run an arbitrary number of validators, each actively participating in attesting and proposing blocks. This project aims to estimate the geographical location of validators using machine learning (ML) trained on data collected from crawlers deployed across many different regions.

Expected deliverables

Deliverable 1 is mandatory. Deliverables 2 and 3 can be done subsequently or completed by another team.

  1. Build an accurate dataset of the geographical distribution of validators. The first step is to improve the crawlers used for data collection (armiarma). Validators’ locations can then be estimated using different approaches, from simple heuristics (e.g., latency) to triangulation methods and ML (similar to what was used for blockprint), with the approaches compared against one another to cross-validate the accuracy of the results. The code and the models should be open source, and the dataset should be archived, maintainable, and easy to query and access.
  2. Build a public-facing dashboard to display and track the geolocation of active validators.
  3. Write a blog post documenting the method and findings.
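
To illustrate the latency-based triangulation idea mentioned in deliverable 1, here is a minimal Python sketch. It is not the project's method: the function names, the `KM_PER_MS` constant (an assumed effective one-way propagation speed), and the coarse grid search are all hypothetical choices made for illustration. Given round-trip times measured from crawlers at known coordinates, it picks the grid point whose great-circle distances to the crawlers best match the distances implied by those latencies.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: effective one-way propagation speed in km per millisecond.
# Real networks vary widely; this value would need calibration.
KM_PER_MS = 100.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(a, 1.0)))


def estimate_location(measurements, step_deg=1):
    """Estimate a node's (lat, lon) from latency measurements.

    measurements: list of (crawler_lat, crawler_lon, rtt_ms) tuples.
    Returns the grid point minimizing the squared difference between the
    great-circle distance to each crawler and the latency-implied distance.
    """
    best_point, best_err = None, float("inf")
    for lat in range(-90, 91, step_deg):
        for lon in range(-180, 180, step_deg):
            err = 0.0
            for c_lat, c_lon, rtt_ms in measurements:
                implied_km = (rtt_ms / 2) * KM_PER_MS  # one-way distance
                err += (haversine_km(lat, lon, c_lat, c_lon) - implied_km) ** 2
            if err < best_err:
                best_point, best_err = (float(lat), float(lon)), err
    return best_point
```

Calling `estimate_location` with a list of `(crawler_lat, crawler_lon, rtt_ms)` tuples from at least four well-separated crawlers returns a single `(lat, lon)` guess. In practice, latency includes routing detours and queuing delays, so an estimate like this would serve as one signal to be cross-validated against the other approaches rather than as a standalone answer.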

Resources