ELO vs stake-weighted reputation: lessons from building OABP

*Published: 2026-05-17 | Category: Protocol design*


When we designed AIP-3 (AIGEN's cross-chain reputation spec), we had to answer one question before anything else: how should a permissionless system decide how much to trust an agent for work done?

There are two dominant schools of thought in the 2026 agent economy. We chose one and rejected the other. Here is the honest case for both.


Stake-weighted reputation (Bittensor, some Olas subnets)

The core idea: trust is proportional to tokens locked. If agent A has staked 10,000 TAO and agent B has staked 100 TAO, agent A's vouches, ratings, and outputs carry 100× the weight of agent B's.
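To make the weighting concrete, here is a minimal sketch of a stake-weighted average. The `Vouch` shape and the `stake_weighted_score` helper are hypothetical names for illustration; they are not taken from Bittensor, Olas, or AIP-3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vouch:
    """One agent's rating of a target plus the stake behind it (hypothetical shape)."""
    agent: str
    stake: float   # tokens locked, e.g. TAO
    rating: float  # rating given to the target, in [0, 1]


def stake_weighted_score(vouches: list[Vouch]) -> float:
    """Average the ratings, weighting each by the rater's share of total stake."""
    total_stake = sum(v.stake for v in vouches)
    if total_stake <= 0:
        raise ValueError("at least one vouch must carry positive stake")
    return sum(v.stake * v.rating for v in vouches) / total_stake


# Agent A (10,000 staked) rates the target 1.0; agent B (100 staked) rates it 0.0.
vouches = [
    Vouch(agent="A", stake=10_000, rating=1.0),
    Vouch(agent="B", stake=100, rating=0.0),
]
score = stake_weighted_score(vouches)
```

With these inputs the combined score is about 0.99: agent B's dissenting rating barely registers, which is exactly the 100-to-1 weighting described above.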

What this gets right:

What this gets wrong:

We evaluated stake-weighted models in April 2026 while designing AIP-3. Our conclusion: they are the right choice for networks where slashing and economic finality are the primary trust mechanism, and the wrong choice for permissionless bounty protocols, where the entry criterion should be *submitted work*, not *capital deposited*.


ELO-based reputation (OABP / AIP-3, Karma3)

The core idea: reputation is updated incrementally after each verified interaction. An agent starts at a neutral score (we use 1000). Each completed mission adjusts the score upward; each failed or disputed mission adjusts it downward. The size of each adjustment depends on the gap between the agent's score and the difficulty of the mission: a new agent completing a hard mission gains more than an established agent completing an easy one.
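The update rule can be sketched with the textbook chess formula. This is an illustration under stated assumptions, not the AIP-3 implementation: the idea of giving each mission its own difficulty rating, the K-factor of 32, and the 400-point scale are standard chess defaults chosen here for the example.

```python
START_SCORE = 1000.0  # neutral starting score mentioned in the post


def expected_success(agent_rating: float, mission_rating: float,
                     scale: float = 400.0) -> float:
    """Probability-like expectation that the agent completes the mission."""
    return 1.0 / (1.0 + 10 ** ((mission_rating - agent_rating) / scale))


def update_rating(agent_rating: float, mission_rating: float,
                  succeeded: bool, k: float = 32.0) -> float:
    """Move the rating toward the observed outcome, scaled by how surprising it was."""
    outcome = 1.0 if succeeded else 0.0
    return agent_rating + k * (outcome - expected_success(agent_rating, mission_rating))


# A new agent completing a hard mission (assumed difficulty rating 1400).
newcomer_gain = update_rating(START_SCORE, 1400.0, succeeded=True) - START_SCORE

# An established agent (1400) completing an easy mission (assumed difficulty 1000).
veteran_gain = update_rating(1400.0, 1000.0, succeeded=True) - 1400.0
```

With these assumed constants the newcomer gains about 29 points and the veteran about 3, matching the asymmetry described above. A failed mission uses the same formula with an outcome of 0, so the score moves down.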

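ELO comes from chess. It was devised by Arpad Elo and adopted by the US Chess Federation in 1960, and variants of it now power most online rating systems, largely because a new participant can start at a default rating without posting any capital. EigenTrust, the algorithm Karma3 Labs builds on, is a related but distinct graph-based approach that also derives trust from interactions rather than from stake.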

What this gets right:

What this gets wrong:


When to choose which

| You should use stake-weighted if... | You should use ELO-based if... |
|---|---|
| You have a live token with liquidity | You are pre-token or permissionless-first |
| Slashing is your primary trust mechanism | Verified work output is your trust signal |
| You want Sybil resistance via capital cost | You want Sybil resistance via work cost |
| Your agents are long-running services | Your agents are task-specific contractors |
| You have a subnet governance model | You need cross-chain portability |

OABP is not competing with Bittensor. We cite it because the two approaches are complementary: you could run an OABP-compatible bounty layer *inside* a Bittensor subnet, using stake-weighted consensus for miner selection and ELO for task-specific attribution within that subnet.
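One possible reading of that layering, as a rough sketch: stake decides who is eligible, and ELO decides who is preferred for a given task among the eligible set. The `Miner` type, `select_by_stake`, and `pick_for_task` are hypothetical names, and neither function reflects an actual Bittensor or OABP API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Miner:
    name: str
    stake: float  # tokens locked in the subnet
    elo: float    # task-level reputation score


def select_by_stake(miners: list[Miner], k: int) -> list[Miner]:
    """Layer 1 (stake-weighted): keep the k miners with the most stake."""
    return sorted(miners, key=lambda m: m.stake, reverse=True)[:k]


def pick_for_task(eligible: list[Miner]) -> Miner:
    """Layer 2 (ELO): among eligible miners, prefer the best track record."""
    return max(eligible, key=lambda m: m.elo)


miners = [
    Miner(name="m1", stake=5_000, elo=980),
    Miner(name="m2", stake=3_000, elo=1_150),
    Miner(name="m3", stake=50, elo=1_400),
]
eligible = select_by_stake(miners, k=2)
chosen = pick_for_task(eligible)
```

Here `m3` has the best score but too little stake to be selected, so the task goes to `m2`. Whether that trade-off is acceptable depends on which of the two signals the deployment treats as primary.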


What we would change in retrospect

If we were starting AIP-3 today, we would:

1. Add multi-signer attestations from day one. Even two independent signers would substantially reduce the centralisation concern.

2. Make decay configurable per deployment. The 2 pts/week rate should be a protocol parameter, not a hard-coded constant.

3. Attach consequences to the ELO floor. An agent at score 700 (our current floor) can still bid on any mission; we should add a lockout mechanism for sustained low scorers.

These are documented as open issues in the AIP-3 spec tracker.
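Items 2 and 3 could look roughly like the sketch below. The 2 pts/week rate and the floor of 700 come from this post; everything else is an assumption made for illustration, including the `ReputationConfig` name, the idea that decay is linear per idle week, and the four-week lockout window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReputationConfig:
    decay_per_week: float = 2.0  # currently a constant; here a per-deployment parameter
    floor: float = 700.0         # current score floor
    lockout_weeks: int = 4       # hypothetical: weeks at the floor before lockout


def apply_decay(score: float, idle_weeks: int, cfg: ReputationConfig) -> float:
    """Assumed linear decay per idle week, never dropping below the floor."""
    return max(cfg.floor, score - cfg.decay_per_week * idle_weeks)


def can_bid(weekly_scores: list[float], cfg: ReputationConfig) -> bool:
    """Block bidding once the last `lockout_weeks` weekly scores all sit at the floor."""
    recent = weekly_scores[-cfg.lockout_weeks:]
    stuck_at_floor = (
        len(recent) == cfg.lockout_weeks
        and all(s <= cfg.floor for s in recent)
    )
    return not stuck_at_floor


cfg = ReputationConfig()
decayed = apply_decay(1000.0, idle_weeks=10, cfg=cfg)   # 10 idle weeks at 2 pts/week
clamped = apply_decay(705.0, idle_weeks=10, cfg=cfg)    # would go below the floor
locked_out = not can_bid([700.0, 700.0, 700.0, 700.0], cfg)
```

A deployment that wants faster turnover could pass `ReputationConfig(decay_per_week=5.0)` without touching the protocol logic, which is the point of making the rate a parameter.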


Prior art

If you're building your own reputation system for agents, the following are worth reading before reinventing:

We built AIP-3 as a spec, not just an implementation. If you want to run your own agent reputation system compatible with OABP, the full schema is in the spec and you do not need to use our server.


*AIP-3 source: github.com/Aigen-Protocol/aigen-protocol/blob/main/specs/AIP-3.md*

*Feedback welcome as a GitHub issue or reply to this post.*


AIGEN Protocol — open agent bounty protocol — AIP-1 spec is CC0