KIP-59 Humanity Court Parameter Updates (January 2023)

I propose to update the parameters in Humanity Court.

Motivation

Various users have complained that ETH rewards in Humanity Court are barely worth the gas, so they are essentially volunteering. Moreover, an incoherent vote means a loss of 17x the potential profit (after taking gas costs into account), since the PNK they have to put on the line is 7000 PNK. This is very punishing, and various jurors have either unstaked or are about to unstake from this court.

Proof of Humanity is also a highly dysfunctional DAO that hasn’t fixed inconsistencies in its Registration Policy for over 6 months. Jurors are forced to essentially guess around the inconsistencies on a case by case basis. This exposes jurors to high levels of risk.

Also, a new proposal has been accepted that will make changes in the policy that will significantly raise the workload of jurors. This HIP in particular is rather complicated.

William’s assumptions for these calculations were that 97% of rulings voted to reject profiles (which means a larger minstake or alpha is needed, to prevent jurors from just voting the most likely option and getting away with it as a profitable strategy), and that the juror effort is 7.2$.

Juror effort is larger

Given the extremely exhaustive list of checks that jurors need to perform, this cannot be assumed to be a 7.2$ task. I’ll try to make a list, but I might miss something:

  • the obvious: person in profile and video must be the same
  • request must always be compared to the given policy, and old requests, or requests within a period of policy updates, can result in a wrong decision, because some policies have different rules.
  • profile picture must look at the camera
  • video must not be mirrored.
  • photo and video must be on “the right orientation”. note: the policy doesn’t explain if the profile picture can be mirrored, so jurors take a risk there.
  • they have to say the phrase, but it could be in Spanish (a language most people don’t speak)
  • they must either:
    • show a sign with an address that jurors will have to check manually (a process that could take around a minute), check if there are mistakes, and then compute if the mistake can be punished according to a list of conditions.
    • say a verbal confirmation phrase containing 8 words (that could be in Spanish), that are related to their address. The juror has to go to a PoH frontend and crosscheck the phrase. Thick accents are allowed here, but mispronunciations, omissions, etc. are not. The frontend may not even show you the phrase and you will be required to check it manually. The policy doesn’t specify how this phrase is generated and it could become an issue. (Notice how all phrases have the first word start with the letter “a”? The reason why is important and it’s not clarified in the policy. In fact, you would by default, given the wording in the policy, assume it would work in a different way. So far no one has tried to challenge on this basis, but it could be a good argument that would render all those profiles incorrect.)
  • “excessive makeup” is banned. colored lipstick is mentioned as an example. there have been challenges related to makeup and the results look random, so it’s super risky to be a juror in this circumstance.

This is a long and tedious process and jurors should be expected to make mistakes, just like the regular humans that submit to this registry make mistakes around it. It is error prone and there is subjectivity around a few things, such as calligraphy, what counts as excessive makeup, what counts as a mispronunciation, whether a mirrored profile picture is allowed, whether sybils/farmed accounts are allowed, etc.

Proof of Humanity disputes are reasonably technical at this very moment, and after HIP-58 becomes live, even more so. So I would consider juror effort to be worth 30$ as a minimum.

I would also assume (a priori, again just guessing without much objectivity) that a wrong vote should punish like 4x or 5x proper votes, at its worst, given the volatile nature of the Policy of the only Arbitrable that runs disputes on this court.

These are the suggested new values for Humanity Court:

Humanity

Proposed juror fee : 0.025 ETH
Proposed minstake : 16000.0 PNK
Proposed alpha: 0.5

On the estimates for juror effort: totally, if the estimated effort that is included in the calculator is no longer appropriate, then that should be updated. The estimates for effort of different types of juror task that are included in the calculator are completely subjective, so the more feedback jurors in different courts give about those the better.

The parameters you propose here satisfy all of the constraints in the calculator except for the constraint that the calculator takes to make the “lazy” strategy of always voting for the most common answer in the court unprofitable. Indeed, this is the condition that is currently making the deposit so high in this court. (As you point out in your motivation, the issue is that the percentage of rulings that reject profiles is quite high. To correct the statistics you cite, as of the last time I checked these values in November it is around 86% of rulings that vote to reject profiles. However, when you include the rate at which jurors who vote to reject are overturned by “accept” being crowdfunded but without a further juror vote, a juror who always votes to reject is seen by the court contract as being coherent 91% of the time. So a quite high deposit is necessary to make it so that the strategy of always voting to reject has a negative expected return. On the other hand, 97% of jurors overall are ruled as coherent in this court, so even with a high deposit the expected return of honestly ruling on the case is still positive. See this document for somewhat more complete tables with these values: Parameters - November 2022 - Google Docs. Note that there are voting/incentive systems that are designed to better handle these types of situations where there is one outcome that is observed much more often than others. In v1, there is the same voting/incentive system in all courts, and the system that is used was chosen for its resistance against 51% attacks as well as its simplicity. Eventually, in v2 with its modular design, conceivably the court that handles PoH cases can use a different system and then you can go back to something more 51% attack resistant upon appeal.)

However, maybe the constraints the calculator is taking are overkill. It is trying to find parameters such that E[honest]>0>E[lazy]. Maybe it is enough to have E[honest]>E[lazy] and E[honest]>0. Then the lazy strategy is potentially profitable, but jurors who follow it are accepting an opportunity cost by not taking the time to review the case, which would have given them a higher expected return. If you assume juror effort of 30 USD, the parameters you propose satisfy that.

Then, I would have to update this segment:

0.02 ETH ~ 1200 PNK
1200 * 11 => ~13000 PNK of juror stake
13000 / 0.5 = 26000 PNK of minstake

This would be a significantly larger minstake, but given the circumstances (jurors able to stay coherent with PoH executing opposite decision) it could make sense.
Even if you were to be forgiving and reduce it to 20_000 PNK of minstake, it’d still be greater given the increase in juror effort.
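
The arithmetic in the segment above can be written out as a small Python sketch. This is only an illustration: the function names are made up for this example, the 11x multiplier and the 1200 PNK equivalent of the fee are the figures quoted above, and it assumes the relation used throughout this thread, vote stake = alpha * minstake.

```python
def vote_stake_from_fee(fee_in_pnk: float, multiplier: float) -> float:
    """PNK a juror should have at risk per vote, as a multiple of the fee."""
    return fee_in_pnk * multiplier


def minstake_from_vote_stake(vote_stake: float, alpha: float) -> float:
    """Invert voteStake = alpha * minstake to get the court's minstake."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return vote_stake / alpha


# 0.02 ETH ~ 1200 PNK (conversion quoted in the post), times 11
raw_stake = vote_stake_from_fee(1200, 11)          # 13200, rounded to ~13000 above
minstake = minstake_from_vote_stake(13000, 0.5)    # 26000.0

print(raw_stake, minstake)
```

The same helper gives the vote stake implied by any candidate minstake by running it the other way, for example 20_000 PNK at alpha 0.5 puts 10_000 PNK at risk per vote.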

As a previous staker in the Humanity court the recent parameter changes have forced me to unstake. The penalty for incoherence is far too high. I’m fine with the coherence reward being less per case to make it easier to challenge and submit profiles but the penalty cost makes it far too risky to be staked in the court.

Before the change, we were receiving .025 ETH per case with a 4,350 PNK vote stake. I believe that if the reward drops, so too should the vote stake, or it should at least stay the same.

.02 ETH coherence reward and 5,000 vote stake seem reasonable or if it stays at .011 ETH the vote stake should drop to 2,500 PNK.

I think the “lazy juror” term is also an assumption. You can’t prove a juror is being lazy by voting a certain way more often; maybe that’s just the way the votes happened. Jurors also vote no more often since the person challenging a case is putting their money on the line – in most cases, the “no” vote will be the right choice.

Changing parameters to change the way people are voting is an assumption, incorrect, and also goes against the nature of the Kleros system. If people are voting “no” too often then they will be appealed and lose their money for being “lazy” – if the system cannot correct this then you’re stating there’s an issue with the system itself.

Just to clarify, the term “lazy juror” isn’t an attack on any specific juror in the Kleros courts. It’s a term to describe a specific voting strategy where a hypothetical juror votes purely based on historical probability of outcomes rather than an appreciation of evidence in the ongoing case.

For instance, assume that the probability of a profile being accepted is 80%. Then it makes sense for the juror to keep voting to accept (or at least voting to accept 4 out of every 5 submissions) without really caring about whether the submission actually conforms to the criteria.

William’s discussion on lazy jurors isn’t to change the actual outcomes that people come to, but to ensure that the system remains secure.

Understood, but even as a trading strategy you still cannot prove anyone is even using this strategy; it’s an assumption. The number of jurors voting no can just be the nature of how the humanity court works.

If challengers are putting their money on the line to create these cases the majority of cases are just going to be “no” by default. Seeing most cases being voted no has nothing to do with a trading strategy, what you’re witnessing is the nature of how the system works.

Now, let’s assume a bunch of jurors begin to use this strategy of voting “no” for every single case without reviewing the case. As more cases happen and challengers begin to become emboldened enough to start challenging any profiles since everyone is voting no anyways – then it will become more lucrative for people to submit appeals on these cases and far more cases will begin to get appealed and the challengers will lose money along with the traders using the “lazy voting” strategy. The system in place will already fix the issue you are trying to “fix”, you’ve incorrectly made an assumption and changed the parameters without allowing the court system to fix it itself.

The mechanism design of a court doesn’t require you to cite historical examples or rely solely on an empirical record of the past behaviour of agents. It requires you to factor in and model how agents may react within the realm of possibilities and mitigate any undesirable responses. For instance, we’ve never actually had a p+epsilon attack in any live courts either. That doesn’t mean we don’t include safeguards against that. Ultimately, the whole point of game theory is to make assumptions.

This is only an outcome you can depend on once you design the payoffs in such a way that the lazy strategy has a lower expected payoff than the average juror payoff. Which is the point of parameter changes, to ensure that the expected payoffs continue to be asymmetric so that jurors are biased towards honest voting. The simple existence of appeals isn’t secure enough.

Note that cryptoeconomic design doesn’t have to adopt the perspective of just one juror in one case; we also have to model the behaviour of many jurors across many cases over a long period of time, which ends up adding more dynamics and complications than the hypothetical you’ve mentioned.

This has already been discussed much better than I ever could hope to in the Kleros Yellow Paper. Please refer to section 4.7.5 (pages 25-27), which discusses exactly this issue.

If this was already addressed thoroughly in the yellow paper why are the parameters now being changed due to the way votes are happening in Humanity court? If what you’re saying is true then it had already been accounted for and there’s no reason to make changes.

What took place that warrants the parameters to change all of a sudden?

Without getting into all the technicals, just applying common sense, a 7,000 vote stake against a .011 ETH coherence reward is just ridiculous. I’m shocked this isn’t evident to the people who voted or created KIP-57.

The primary reason for changes in all court parameters is the change in gas prices. With respect to PoH, another reason for the change is the fact that the court has many single juror rounds, which is mathematically relevant, as mentioned here:

That being said, there is also a degree of subjectivity involved in these calculations when it comes to certain assumptions, such as in quantifying juror effort. You may refer to the document below and offer your amendments:

You may also experiment with the parameter calculator for coming up with alternative parameters:

Looking through the Google Colaboratory doc I don’t think I’m technical enough to come up with my own parameters :cold_sweat:

Just want to voice my opinion as a former juror of the humanity court and plead with someone to lower the voting stake to a level that isn’t so disadvantageous to jurors. I’m fine with the reward being lowered but if I’m ever wrong on just 1 case the penalty is far too severe and I hope everyone can see that.

Is there any way to solve the hypothetical issue of “lazy jurors” without making the vote stake so ridiculously high?

I don’t know if this is relevant for this KIP or if it is a separate issue, but it would be nice if, when the parameters are changed, all jurors in that court were notified by email that the parameters have changed (at the same email address that is notified when they have been drawn).
In my case, I found out about the changes when I got drawn, and now I have to rule on a risky case where in the best scenario I will get 10 U$ and in the worst I could lose 200.

As someone who is both a challenger and a juror, I have personally felt the challenger earns way too much in comparison to a juror, especially when you factor in gas costs.

I will be unstaking from the PoH court very soon because it is simply not worth it anymore. I will have to reconsider if KIP59 passes. Keep in mind I’m not alone; before KIP57 passed the PNK staked in Humanity court was 21mil, and now it is about to hit 16mil. Without intervention soon, it will likely continue to decrease.

I originally voted against KIP57, and unfortunately I was the only one. Many jurors have told me they were unaware of the changes and thus didn’t vote, but they agree that the risk:reward is too high. Better to change it now than never.

I hope to see this proposal pass in some capacity; one in which the PNK per vote decreases and/or the ETH reward increases.

Hey, according to a new set of calculations and following William’s tips, I have updated the parameters. Pasting the rationale here:

Yeah… if we assume that contemporary reject ratio is 90% and juror effort is about 30$, then:

at 30 gas, jurors should get jurorEffort, voting costs ~7$ in gas so feeForJuror should be around ~30$ + 7$ = ~0.023 ETH. so, I round up
feeForJuror=0.025

juror stake should be 10x the juror effort to disincentivize lazy jurors, so ~300$ in PNK, that’s 11k PNK. with alpha of 0.5, minstake 22k.

so the new params:

feeForJuror=0.025 ETH
minstake=22_000 PNK
alpha: 0.5

PNK minstake should still be somewhat safe to lower. After @trialsleet’s suggestion and @William’s approval, I will lower the minstake of this proposal to 16000, which lowers the juror stake to 8000.

feeForJuror=0.025 ETH
minstake=16_000 PNK
alpha: 0.5

I propose to vote this parameter change
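
The derivation in this post can be sketched in a few lines of Python. This is a hedged illustration rather than the calculator itself: the function names are invented for the example, and the ETH price is not a market quote but the value implied by the post's own statement that ~30$ + 7$ is ~0.023 ETH.

```python
def fee_for_juror_eth(effort_usd: float, gas_usd: float, eth_price_usd: float) -> float:
    """Fee that covers juror effort plus the gas cost of voting, in ETH."""
    return (effort_usd + gas_usd) / eth_price_usd


def vote_stake(minstake_pnk: float, alpha: float) -> float:
    """PNK at risk per vote, assuming voteStake = alpha * minstake."""
    return minstake_pnk * alpha


# Assumption: price implied by "~30$ + 7$ = ~0.023 ETH" in the post above.
IMPLIED_ETH_PRICE_USD = 37 / 0.023

fee = fee_for_juror_eth(30, 7, IMPLIED_ETH_PRICE_USD)  # ~0.023, rounded up to 0.025
target_stake_usd = 10 * 30                             # "10x the juror effort"

print(round(fee, 3), target_stake_usd)
print(vote_stake(22_000, 0.5))  # stake per vote under the first set of params
print(vote_stake(16_000, 0.5))  # stake per vote under the lowered minstake
```

Keeping the prices as explicit inputs makes it easy to rerun the same steps when gas or token prices move.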

To reiterate comments I made about these parameters on Telegram: this deviates somewhat from what the parameter calculator normally targets, in that the calculator attempts to find parameters such that the “lazy” strategy of always voting to reject has a negative expected return. Under these parameters, the expected return of the lazy strategy is somewhat positive, but it still seems small enough relative to the expected payoff of participating honestly to be acceptable. In exchange, this proposal seems to be somewhat more in line with the risk tolerance of jurors in this court.

Explicitly, while the average payouts of the honest and lazy strategies have somewhat complicated formulas if you have to take into account the redistribution of PNK among panels of several jurors, in the case of one-juror panels (which is where I feel the issue is here, as cases that are appealed have different rates of being decided as reject than cases in the first round) the formulas that are used in the parameter calculator are really simple. Based on historical rates of coherence, etc., E[honest] = 0.97 * feeForJuror - 0.03 * voteStake - gas - effort and E[lazy] = 0.9 * feeForJuror - 0.1 * voteStake - gas. So if one takes the gas to vote at around 5 euros, the expected return of a given “lazy” vote is around 7 euros. While not ideal, that is still solidly less than the expected value of honest participation assuming that the effort required to review the case is no more than around 20 euros, so dishonest behaviour would still imply an opportunity cost with these parameters.
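
For anyone who wants to try their own numbers, the single-juror formulas above can be sketched in Python. This is an illustration only, not the parameter calculator: the function names and the "strict"/"relaxed" labels are made up here, all amounts must be converted to one currency first, and the inputs in the example are deliberately round, made-up values rather than the proposal's figures.

```python
def expected_honest(fee, stake, gas, effort, p_coherent=0.97):
    """E[honest] for a one-juror panel: paid when coherent, slashed otherwise."""
    return p_coherent * fee - (1 - p_coherent) * stake - gas - effort


def expected_lazy(fee, stake, gas, p_coherent=0.90):
    """E[lazy]: always vote reject, no review effort spent."""
    return p_coherent * fee - (1 - p_coherent) * stake - gas


def constraint_regime(fee, stake, gas, effort):
    """Which of the two conditions discussed in this thread the inputs satisfy."""
    honest = expected_honest(fee, stake, gas, effort)
    lazy = expected_lazy(fee, stake, gas)
    if honest > 0 > lazy:
        return "strict"    # E[honest] > 0 > E[lazy], the calculator's usual target
    if honest > 0 and honest > lazy:
        return "relaxed"   # lazy is profitable but carries an opportunity cost
    return "fails"


# Hypothetical round inputs, all in the same currency unit.
print(constraint_regime(fee=100, stake=500, gas=5, effort=20))
print(constraint_regime(fee=100, stake=1000, gas=5, effort=20))
```

Raising the stake while holding everything else fixed is what moves a set of parameters from the relaxed condition towards the strict one, which is the trade-off being argued over in this thread.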

As another heuristic, 1 PNK staked on mainnet is currently earning around 0.0058 PNK in rewards per month. On the other hand, 1 PNK staked in the Humanity court has a 1/(17.3 * 10^6) chance of being selected in a given case; there have been on average 27 PoH cases per month over the past year or so. Then 1 PNK engaged in a lazy strategy is earning around 27/(17.3 * 10^6) * 7 = 1.1 * 10^-5 EUR per month, which is only about 1/14 of the staking rewards. So that seems acceptably negligible for parameters that fit better into people’s risk profiles.
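
The heuristic above is easy to reproduce. The sketch below is only an illustration with made-up function names; it reads 17.3 * 10^6 as the amount of PNK staked in the court (one draw per case, selection proportional to stake), and it backs out the PNK price that the quoted 1/14 ratio implies rather than assuming a market price.

```python
def lazy_income_per_pnk_month(cases_per_month, staked_pnk, ev_lazy_eur):
    """Expected monthly EUR earned by 1 PNK that always votes lazily."""
    return cases_per_month / staked_pnk * ev_lazy_eur


def implied_pnk_price_eur(lazy_income_eur, reward_pnk_per_month, ratio):
    """PNK price at which lazy income equals `ratio` times staking rewards."""
    return lazy_income_eur / (ratio * reward_pnk_per_month)


lazy = lazy_income_per_pnk_month(27, 17.3e6, 7)     # about 1.1e-5 EUR
price = implied_pnk_price_eur(lazy, 0.0058, 1 / 14)  # roughly 0.026 EUR per PNK

print(f"{lazy:.2e} EUR per PNK per month")
print(f"implied PNK price: {price:.4f} EUR")
```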

I believe we have a consensus on .025 coherence to 8k votestake, we should move this to vote round. The sooner we have this in place I will be more comfortable staking myself and encouraging others to stake as well.

Challenger strategies have evolved over the lifetime of Proof of Humanity from ‘simple’ questions like ‘is the human facing the camera?’, to grey subjective calls about the mirrored orientation of photos given small reference cues such as background shadows, asymmetric hair style, and facial features.

I suggest that the juror effort is bimodal where these grey edge cases require significant effort to resolve while other challenges based on incorrect written address require much less effort.

The arbitrable policy is poorly written causing these edge cases.

What if we denominated the juror effort primarily in time? I think time is a more natural scale for jurors to estimate their effort; we could add this as feedback after each dispute, where jurors estimate the time taken. For the calculator parameters, one can extrapolate the juror effort in fiat based on the expected hourly wage for a worker with the required expertise. For example, 30 minutes of a smart contract expert may cost ~$100/hour → $50, while 30 minutes of a non-skilled worker would cost minimum wage ~$10/hour → $5.

Then a court wide policy creating an upper bound on expected spent effort to resolve cases could resolve bad risk:reward ratios for jurors.

In the case of disputes which require no particular skills but the policy may be poorly written, logically inconsistent, and include edge cases requiring extraordinary effort to analyze, I suggest explicitly instructing jurors to refuse to arbitrate if they think resolving the dispute requires extraordinary effort time (more than 2 * expected effort time on the dispute). This could be a general part of all court policies.

Ideally, at a contract level, PoH would distinguish high-effort and low-effort challenge types and send these disputes to a Humanity Court (High Effort) and a Humanity Court (Low Effort).
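
The time-based effort estimate and the "refuse above 2x expected time" rule suggested in this post could look roughly like the Python sketch below. The function names are hypothetical, and the wages are just the two examples given above.

```python
def effort_in_fiat(minutes: float, hourly_wage: float) -> float:
    """Convert a juror's estimated time into a fiat effort figure."""
    return minutes / 60 * hourly_wage


def should_refuse(estimated_minutes: float, expected_minutes: float,
                  factor: float = 2.0) -> bool:
    """True when the case needs more than `factor` times the expected effort."""
    return estimated_minutes > factor * expected_minutes


print(effort_in_fiat(30, 100))  # smart contract expert example
print(effort_in_fiat(30, 10))   # minimum wage example

# A case estimated at 45 minutes in a court that expects 20 minutes per case
print(should_refuse(45, 20))
```

The `factor` default of 2 mirrors the threshold proposed above; a court policy could set a different bound.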

This is for POH to resolve, but in the past 3 months, ~17% of submitter profiles were challenged, in the past 6 months ~13% of submissions challenged, and over the lifetime of proof of humanity ~5% of profiles are challenged.

The sudden change in challenger behavior was initiated by a challenger in the ecosystem who realized a policy change left a grey area about the orientation of images and started making many challenges, and winning. Given the high rejection rate and high challenge rate of submissions, the policy change had unintended consequences: challengers are incentivized to take risks on the deficiencies of the policy, and jurors are stuck in the middle since they must come to a consensus about the outcome of a dispute. Given risk aversion, the job of the juror is actually much more difficult than that of the challenger. The challenger has much more reward for the risk involved than the jurors.

With 17% of profiles challenged, and most of those disputes resolving in favor of the challenger, challengers will find approximately 1 incorrect profile for every 5 profiles reviewed. Assuming the challenger wins (usually the case), the challenger makes submitter deposit - arbitration fees = 0.111 - 0.0111 ~ 0.1 ETH ~ $150. The challenger’s reward is massive while the juror reward is minuscule in comparison.
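
To make the comparison concrete, here is a small Python sketch of the challenger's payoff. It is illustrative only: the names are invented, the ETH price of 1500 USD is simply what "0.1 eth ~ $150" implies, and the juror fee used for comparison is the 0.025 ETH proposed in this KIP.

```python
SUBMITTER_DEPOSIT_ETH = 0.111
ARBITRATION_FEES_ETH = 0.0111
IMPLIED_ETH_PRICE_USD = 1500   # assumption: implied by "0.1 eth ~ $150"
PROPOSED_JUROR_FEE_ETH = 0.025


def challenger_net_eth(deposit_eth: float, fees_eth: float) -> float:
    """What a winning challenger keeps: submitter deposit minus arbitration fees."""
    return deposit_eth - fees_eth


net_eth = challenger_net_eth(SUBMITTER_DEPOSIT_ETH, ARBITRATION_FEES_ETH)
net_usd = net_eth * IMPLIED_ETH_PRICE_USD
ratio_to_juror_fee = net_eth / PROPOSED_JUROR_FEE_ETH

print(f"{net_eth:.4f} ETH, about {net_usd:.0f} USD")
print(f"{ratio_to_juror_fee:.1f}x the proposed juror fee")
```

This ignores the challenger's own gas costs and the cases they lose, so it is an upper-end view of a single winning challenge.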
