War of Algorithms: Anatomy of Sybil Detection and Cluster Analysis. Part I

Crypto airdrops were originally intended as a way to fairly distribute tokens among early users and active ecosystem participants. Over time, they have also become a target for Sybil farmers—individuals and teams using hundreds or thousands of addresses to siphon off the largest possible share of rewards.
To keep campaigns fair and protect the token price, projects are forced to implement increasingly sophisticated Sybil detection and cluster analysis mechanisms. In this article, we break down what Sybil labeling is, how address clusters are formed, and what tools are used to filter airdrop participants.
What are Sybil Attacks and Sybil Labeling
A Sybil attack in Web3 is a situation where a single entity controls multiple addresses or accounts to obtain a disproportionately large share of rewards (airdrops, bonus programs, points). Projects perceive this as abuse (farming) rather than natural protocol usage.
Sybil labeling is the process by which analytics services or project teams mark specific clusters of addresses as "Sybil" based on a set of on-chain and off-chain signals. These labels are then used to:
- Completely exclude clusters from an airdrop;
- Reduce their allocation size;
- Limit their participation in future campaigns and protocol governance.
How Address Clusters are Formed
Technically, the search for Sybil clusters is built around a transaction graph:
- Vertices are wallets (addresses);
- Edges are transfers, contract interactions, and shared points of fund entry/exit.
Clustering algorithms are run on this graph to group addresses that:
- Frequently interact with the same CEX accounts or bridge nodes;
- Repeat the same sequence of actions in the same smart contracts;
- Demonstrate similar temporal activity patterns.
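The grouping step can be illustrated with a minimal union-find sketch. It assumes the input is a list of (wallet, node) pairs, where `node` is a hypothetical label for a CEX deposit account or bridge endpoint. The function name and data shape are illustrative, not taken from any particular tool.

```python
from collections import defaultdict


def cluster_by_shared_nodes(edges):
    """Group wallets that share at least one deposit/withdrawal node.

    `edges` is an iterable of (wallet, node) pairs (hypothetical format).
    Returns a sorted list of clusters, each a sorted list of wallets.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_wallet_for_node = {}
    for wallet, node in edges:
        find(wallet)  # register the wallet even if it stays alone
        if node in first_wallet_for_node:
            union(first_wallet_for_node[node], wallet)
        else:
            first_wallet_for_node[node] = wallet

    clusters = defaultdict(set)
    for wallet in list(parent):
        clusters[find(wallet)].add(wallet)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])
```

One caveat about this sketch: a node shared by many unrelated users (for example, an exchange hot wallet) would merge them into one cluster, so a link like this is better treated as one signal among several than as proof.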
Typical Heuristics:
- Common Deposit/Withdrawal Nodes. If dozens of addresses regularly deposit and withdraw funds through the same centralized account or bridge, they are grouped into a single cluster.
- Action Sequences. Addresses that perform identical steps in a single campaign at minimal time intervals often look like the output of the same script.
- Shared Infrastructure. If a project has access to off-chain data (KYC, email, IP, device fingerprints), it can link multiple on-chain addresses to the same person or team.
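The action-sequence heuristic can be sketched as follows. The sketch assumes a transaction log of (address, timestamp, contract, function) tuples, which is a hypothetical format. It groups addresses whose ordered list of calls is identical. It does not check the time gaps between steps, which a real rule would likely add.

```python
from collections import defaultdict


def group_by_action_sequence(tx_log, min_group_size=3):
    """Return {sequence: [addresses]} for sequences shared by many addresses.

    `tx_log` is an iterable of (address, timestamp, contract, function).
    `min_group_size` is an illustrative threshold, not a recommended value.
    """
    per_address = defaultdict(list)
    for address, timestamp, contract, function in tx_log:
        per_address[address].append((timestamp, contract, function))

    groups = defaultdict(list)
    for address, actions in per_address.items():
        actions.sort()  # order each address's calls by timestamp
        signature = tuple((c, f) for _, c, f in actions)
        groups[signature].append(address)

    return {
        sig: sorted(addrs)
        for sig, addrs in groups.items()
        if len(addrs) >= min_group_size
    }
```

Exact matching is deliberately strict. A farmer who randomizes the order of steps would slip past it, so a similarity measure over sequences would be a natural next step.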
Key On-chain Signals of Sybil Behavior
On-chain analysis is the first layer of Sybil detection. Projects usually focus on:
- Identical Behavioral Patterns. The same set of contracts, similar transaction counts, and identical function calls and gas profiles across dozens of addresses.
- Reward Aggregation. Multiple "small" addresses that eventually consolidate rewards into one or two main wallets.
- Mass Deposits/Withdrawals via the Same Nodes. If a huge set of addresses moves tokens through the same CEX or bridge within a short timeframe, it looks like centralized farming.
- Lack of Organic Activity. Addresses that exist only during the campaign period, do not use other dApps, and show no activity before or after the airdrop.
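Two of these signals, reward aggregation and the lack of organic activity, are simple enough to sketch directly. The input shapes, function names, and the `min_senders` threshold below are assumptions made for illustration.

```python
from collections import defaultdict


def find_aggregation_wallets(transfers, min_senders=3):
    """Find wallets that collect reward tokens from many distinct senders.

    `transfers` is an iterable of (sender, recipient) pairs for the
    reward token (hypothetical format).
    """
    senders_by_recipient = defaultdict(set)
    for sender, recipient in transfers:
        senders_by_recipient[recipient].add(sender)
    return {
        recipient: sorted(senders)
        for recipient, senders in senders_by_recipient.items()
        if len(senders) >= min_senders
    }


def is_campaign_only(tx_timestamps, campaign_start, campaign_end):
    """True if the address has activity and all of it falls in the campaign window."""
    return bool(tx_timestamps) and all(
        campaign_start <= t <= campaign_end for t in tx_timestamps
    )
```

A fan-in check like this would also flag exchange deposit addresses, so known CEX and contract addresses would need to be excluded first.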
Off-chain Signals and Additional Analytics
Where legally and technically possible, projects integrate an off-chain layer:
- The same email, social profile, or KYC document linked to multiple addresses;
- Identical or very similar IP addresses/subnets, data-center proxies, cheap VPNs;
- Identical device fingerprints (browser fingerprint, OS, screen resolution, font set, time zone, etc.).
This information is particularly accessible in centralized campaigns (e.g., registration via a web form with social login) where the project controls the frontend and user sessions.
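As a rough sketch of how such session data could be linked, the example below buckets addresses by device fingerprint and by IPv4 /24 subnet. The session record format (`address`, `ip`, `fingerprint` keys) is hypothetical, and the /24 prefix is an arbitrary illustrative choice.

```python
import ipaddress
from collections import defaultdict


def subnet_of(ip, prefix=24):
    """Return the network containing `ip`, e.g. '203.0.113.0/24' (IPv4 assumed)."""
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def link_by_offchain(sessions, min_addresses=2):
    """Bucket addresses by shared fingerprint or subnet.

    `sessions` is an iterable of dicts with 'address', 'ip', and
    'fingerprint' keys (hypothetical format).
    """
    buckets = defaultdict(set)
    for session in sessions:
        address = session["address"]
        buckets[("fingerprint", session["fingerprint"])].add(address)
        buckets[("subnet", subnet_of(session["ip"]))].add(address)
    return {
        key: sorted(addresses)
        for key, addresses in buckets.items()
        if len(addresses) >= min_addresses
    }
```

Shared subnets are a weak signal on their own, since mobile carriers and office networks put many unrelated users behind the same range. A fingerprint match combined with a subnet match is more telling than either alone.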
Tools and Platforms for Sybil Detection
Not all projects build their own analytical stack from scratch—they often use external tools and partners:
- Blockchain Analytics Platforms. Specialized services that build address clusters, tag CEXs, mixers, and bridges, recognize characteristic airdrop farmer patterns, and provide a risk score for each wallet.
- Sybil-hunting Consultants. Teams specializing specifically in airdrop security: they help build heuristics, prepare Sybil cluster lists, and test various filtering scenarios before a snapshot.
- Internal Analytics. Dune/Flipside dashboards, SQL queries, Python scripts, and graph databases (Neo4j, etc.), which the project uses to iteratively test rules for a specific campaign.
How Projects Filter Airdrops in Practice
A simple "black-and-white" flag is rarely used. Instead, projects build a scoring system: each signal adds or subtracts points. The final score determines whether:
- An address or cluster will be completely excluded;
- It will receive a reduced allocation;
- It will be considered safe and receive the full reward.
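A scoring system of this kind might look like the sketch below. The signal names, weights, and thresholds are all invented for illustration; a real project would tune them per campaign.

```python
# Hypothetical weights: positive values raise suspicion, negative values lower it.
SIGNAL_WEIGHTS = {
    "shared_funding_node": 25,
    "identical_action_sequence": 30,
    "reward_aggregation": 25,
    "campaign_only_activity": 15,
    "long_prior_history": -20,
}


def sybil_score(signals):
    """Sum the weights of the signals observed for an address or cluster."""
    return sum(SIGNAL_WEIGHTS[name] for name in signals)


def decide(score, exclude_at=60, reduce_at=30):
    """Map a score to one of the three outcomes (thresholds are assumptions)."""
    if score >= exclude_at:
        return "exclude"
    if score >= reduce_at:
        return "reduce"
    return "full"


observed = {"shared_funding_node", "campaign_only_activity"}
print(decide(sybil_score(observed)))  # prints "reduce" (score 40)
```

Keeping the weights in one table makes it easy to rerun the whole filter with different values and compare how many addresses move between outcomes.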
Filtering usually occurs in several stages:
- Broad Clustering. Obvious farmers are caught using relatively "strict" rules.
- Manual Validation on a Sample. The team checks how many real users are caught by the filter and relaxes the rules if necessary.
- Fine-tuning and Appeals. In some projects, final Sybil cluster lists are partially published, giving users the opportunity to appeal if an address was mistakenly clustered.
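The manual validation stage amounts to estimating how many flagged addresses are real users. A minimal sketch, assuming reviewers record their verdicts in a dict (a hypothetical format) that maps an address to True when it was judged to be a real user:

```python
def flagged_real_user_share(flagged, reviewed_labels):
    """Share of reviewed flagged addresses that turned out to be real users.

    `flagged` is an iterable of addresses caught by the filter.
    `reviewed_labels` maps address -> True if manual review judged it a
    real user, False if it confirmed Sybil behavior.
    Returns None when no flagged address has been reviewed.
    """
    reviewed = [a for a in flagged if a in reviewed_labels]
    if not reviewed:
        return None
    real_users = sum(1 for a in reviewed if reviewed_labels[a])
    return real_users / len(reviewed)
```

If this share is higher than the team is willing to accept, the rules from the first stage are relaxed and the check is repeated on a fresh sample.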
Conclusion
Ultimately, the fight against Sybil farming is not so much a question of morality as it is a question of dynamic equilibrium. Projects learn to filter abuse more accurately, and farmers learn to adapt more cleverly.
Every new filter gives birth to a new bypass strategy; every major block leads to a new round of optimization. This constant race is the source of progress. Web3 is not divided into "honest" and "dishonest" participants; it evolves through competition between those who build the defense and those who test its strength.
But this is only one side of the story. The other begins where a Sybil farmer is no longer just an attacker, but an indicator of weak points in the distribution architecture itself.
To be continued...
