KIP-59 Humanity Court Parameter Updates (January 2023)

I propose to update the parameters in Humanity Court.

Motivation

Various users have complained that ETH rewards in Humanity Court are barely worth the gas, so they are essentially volunteering. Moreover, an incoherent vote means a loss of 17x the potential profit (after taking gas costs into account), since the PNK they have to put on the line is 7000 PNK. This is very punishing, and various jurors have either unstaked or are about to unstake from this court.

Proof of Humanity is also a highly dysfunctional DAO that hasn’t updated inconsistencies in their Registration Policy for over 6 months. Jurors are forced to essentially guess around the inconsistencies on a case by case basis. This exposes jurors to high levels of risk.

Also, a new proposal has been accepted that will change the policy in ways that significantly raise the workload of jurors. This HIP in particular is rather complicated.

William’s assumptions for these calculations were that 97% of rulings voted to reject profiles (meaning a larger minstake or alpha is needed, to prevent jurors from profiting by simply voting for the most likely option), and that the juror effort is 7.2$.

Juror effort is larger

Given the extremely exhaustive list of checks that jurors need to perform, this cannot be assumed to be a 7.2$ task. I’ll try to make a list, but I might miss something:

  • the obvious: person in profile and video must be the same
  • request must always be compared to the policy in force at the time; old requests, or requests made during a period of policy updates, can result in a wrong decision, because different policy versions have different rules.
  • profile picture must look at the camera
  • video must not be mirrored.
  • photo and video must be on “the right orientation”. note: the policy doesn’t explain if the profile picture can be mirrored, so jurors take a risk there.
  • they have to say the phrase, but it could be in Spanish (a language most people don’t speak)
  • they must either:
    • show a sign with an address that jurors will have to check manually (a process that could take around a minute), check if there are mistakes, and then compute if the mistake can be punished according to a list of conditions.
    • say a verbal confirmation phrase containing 8 words (that could be in Spanish) that are related to their address. The juror has to go to a PoH frontend and crosscheck the phrase. Thick accents are allowed here, but mispronunciations, omissions, etc. are not. The frontend may not even show you the phrase, in which case you have to check it manually. The policy doesn’t specify how this phrase is generated, and that could become an issue. (Notice how all phrases have the first word start with the letter “a”? The reason why is important and it’s not clarified in the policy. In fact, given the wording in the policy, you would by default assume it works in a different way. So far no one has tried to challenge on this basis, but it could be a good argument that would render all those profiles incorrect.)
  • “excessive makeup” is banned. colored lipstick is mentioned as an example. there have been challenges related to makeup and the results look random, so it’s super risky to be a juror in this circumstance.

This is a long and tedious process, and jurors should be expected to make mistakes, just like the regular humans who submit to this registry make mistakes. It is error-prone and there is subjectivity around a few things, such as calligraphy, what counts as excessive makeup, what counts as a mispronunciation, whether a mirrored profile picture is allowed, whether sybils/farmed accounts are allowed, etc.

Proof of Humanity disputes are reasonably technical at this very moment, and after HIP-58 becomes live, even more so. So I would consider juror effort to be worth 30$ as a minimum.

I would also assume (a priori, again just guessing without much objectivity) that a wrong vote should punish like 4x or 5x proper votes, at its worst, given the volatile nature of the Policy of the only Arbitrable that runs disputes on this court.

I’ve had issues running the calculator, so these proposed juror fee and minstake are guesses. Not as objective. Here they are, these are intended to create a conversation around them and should be polished.

Humanity

Proposed juror fee : 0.03 ETH
Proposed minstake : 10000.0 PNK
Proposed alpha: 0.5


Some assumptions I’ve made to reach these numbers:

Relatively low gas: 15 gwei
Relatively average gas at which jurors should get the effort: 30 gwei
High gas situation: 80 gwei

Juror effort: 30$ (although this might be a low amount given my exposition? feedback please)

Why 0.03 ETH juror fee: juror effort ~0.02 ETH and at 30 gwei, tx gas fee would be ~0.01 ETH

Jurors should be able to mess up 1 every 4 rulings and stay even, so:

0.02 ETH ~ 1200 PNK
1200 * 4 => ~5000 PNK of juror stake
5000 / 0.5 = 10000 PNK of minstake
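The arithmetic above can be sketched as a small function, to make it easier to try other inputs. This is only an illustration of the three steps in this post, not the official parameter calculator; the function and argument names are made up for the example, and it assumes (as the last step above does) that the PNK at stake per vote equals alpha times minstake.

```python
def min_stake_pnk(effort_pnk: float, rulings_per_mistake: int, alpha: float) -> float:
    """Back-of-the-envelope minstake, following the three steps in the post.

    effort_pnk          -- juror effort expressed in PNK (post: 0.02 ETH ~ 1200 PNK)
    rulings_per_mistake -- rulings after which one wrong vote should break even (post: 4)
    alpha               -- fraction of minstake put at risk per vote (post: 0.5)
    """
    # PNK a juror should lose on one incoherent vote so that the given
    # number of coherent rulings is needed to break even.
    juror_stake = effort_pnk * rulings_per_mistake
    # Assumption: PNK at stake per vote = alpha * minstake.
    return juror_stake / alpha


# Unrounded result for the numbers in the post.
print(min_stake_pnk(1200, 4, 0.5))  # 9600.0
```

The post rounds the intermediate 4800 PNK up to ~5000 PNK, which is how it arrives at 10000 PNK instead of 9600 PNK.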

On the estimates for juror effort: if the estimated effort included in the calculator is no longer appropriate, then it should absolutely be updated. The effort estimates for the different types of juror task in the calculator are completely subjective, so the more feedback jurors in different courts give about them, the better.

The parameters you propose here satisfy all of the constraints in the calculator except for the constraint that makes the “lazy” strategy of always voting for the most common answer in the court unprofitable. Indeed, this is the condition that is currently making the deposit so high in this court. (As you point out in your motivation, the issue is that the percentage of rulings that reject profiles is quite high. To correct the statistics you cite, as of the last time I checked these values in November, around 86% of rulings vote to reject profiles. However, when you include the rate at which jurors who vote to reject are overturned by “accept” being crowdfunded but without a further juror vote, a juror who always votes to reject is seen by the court contract as being coherent 91% of the time. So quite a high deposit is necessary to make the strategy of always voting to reject have a negative expected return. On the other hand, 97% of jurors overall are ruled as coherent in this court, so even with a high deposit the expected return of honestly ruling on the case is still positive. See this document for somewhat more complete tables with these values: Parameters - November 2022 - Google Docs. Note that there are voting/incentive systems that are designed to better handle these types of situations where one outcome is observed much more often than others. In v1, there is the same voting/incentive system in all courts, and the system that is used was chosen for its resistance against 51% attacks as well as its simplicity. Eventually, in v2 with its modular design, the court that handles PoH cases could conceivably use a different system, and then you can go back to something more 51% attack resistant upon appeal.)

However, maybe the constraints the calculator is using are overkill. It is trying to find parameters such that E[honest]>0>E[lazy]. Maybe it is enough to have E[honest]>E[lazy] and E[honest]>0. Then the lazy strategy is potentially profitable, but jurors who follow it are accepting an opportunity cost by not taking the time to review the case, which would have given them a higher expected return. If you assume juror effort of 30 USD, the parameters you propose satisfy that.
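To make the difference between the two sets of constraints concrete, here is a toy sketch. It is not the calculator: the expected-return formula is a deliberately simplified assumption (win a fixed reward when coherent, lose a fixed penalty when incoherent, pay an effort cost), and the reward, penalty and effort figures are hypothetical units chosen only for illustration. The only values taken from the discussion are the 97% and 91% coherence rates.

```python
def expected_return(p_coherent: float, reward: float, penalty: float, effort: float) -> float:
    """Simplified expected return per case (an assumption, not the calculator's model)."""
    return p_coherent * reward - (1.0 - p_coherent) * penalty - effort


def strict_ok(e_honest: float, e_lazy: float) -> bool:
    """Current calculator condition: E[honest] > 0 > E[lazy]."""
    return e_honest > 0 > e_lazy


def relaxed_ok(e_honest: float, e_lazy: float) -> bool:
    """Relaxed condition: E[honest] > E[lazy] and E[honest] > 0."""
    return e_honest > e_lazy and e_honest > 0


# Hypothetical units, NOT real court parameters.
REWARD, PENALTY, EFFORT = 10.0, 50.0, 3.0

# Honest juror: coherent 97% of the time, pays the effort cost.
e_honest = expected_return(0.97, REWARD, PENALTY, EFFORT)
# Lazy juror: always votes "reject", coherent 91% of the time, no effort.
e_lazy = expected_return(0.91, REWARD, PENALTY, 0.0)

print("strict: ", strict_ok(e_honest, e_lazy))
print("relaxed:", relaxed_ok(e_honest, e_lazy))
```

With these made-up numbers the relaxed condition holds while the strict one does not: the lazy strategy is still profitable, just less so than reviewing the case. In this toy model, pushing E[lazy] below zero requires a penalty larger than 0.91 / 0.09 times the reward, which gives a feel for why a high lazy-coherence rate drives the deposit up.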

Then, I would have to update this segment:

0.02 ETH ~ 1200 PNK
1200 * 11 => ~13000 PNK of juror stake
13000 / 0.5 = 26000 PNK of minstake

This would be a significantly larger minstake, but given the circumstances (jurors able to stay coherent with PoH executing opposite decision) it could make sense.
Even if you were to be forgiving and reduce it to 20_000 PNK of minstake, it’d still be greater given the increase in juror effort.

As a previous staker in the Humanity court the recent parameter changes have forced me to unstake. The penalty for incoherence is far too high. I’m fine with the coherence reward being less per case to make it easier to challenge and submit profiles but the penalty cost makes it far too risky to be staked in the court.

Before the change, we were receiving .025 ETH per case with a 4,350 PNK vote stake. I believe that if the reward drops, the vote stake should drop too, or at least stay the same.

.02 ETH coherence reward and 5,000 vote stake seem reasonable or if it stays at .011 ETH the vote stake should drop to 2,500 PNK.

I think the “lazy juror” term is also an assumption. You can’t prove a juror is being lazy by voting a certain way more often; maybe that’s just the way the votes happened. Jurors also vote no more often because the person challenging a case is putting their money on the line: in most cases, the “no” vote will be the right choice.

Changing parameters to change the way people vote rests on an assumption, is incorrect, and also goes against the nature of the Kleros system. If people are voting “no” too often then they will be appealed and lose their money for being “lazy”. If the system cannot correct this, then you’re stating there’s an issue with the system itself.

Just to clarify, the term “lazy juror” isn’t an attack on any specific juror in the Kleros courts. It’s a term to describe a specific voting strategy where a hypothetical juror votes purely based on historical probability of outcomes rather than an appreciation of evidence in the ongoing case.

For instance, assume that the probability of a profile being accepted is 80%. Then it makes sense for the juror to keep voting to accept (or at least to vote to accept on 4 out of every 5 submissions) without really caring about whether the submission actually conforms to the criteria.

William’s discussion on lazy jurors isn’t to change the actual outcomes that people come to, but to ensure that the system remains secure.

Understood, but even as a trading strategy, you still cannot prove anyone is actually using it; it’s an assumption. The number of jurors voting no can just be the nature of how the Humanity court works.

If challengers are putting their money on the line to create these cases the majority of cases are just going to be “no” by default. Seeing most cases being voted no has nothing to do with a trading strategy, what you’re witnessing is the nature of how the system works.

Now, let’s assume a bunch of jurors begin to use this strategy of voting “no” for every single case without reviewing the case. As more cases happen and challengers begin to become emboldened enough to start challenging any profiles since everyone is voting no anyways – then it will become more lucrative for people to submit appeals on these cases and far more cases will begin to get appealed and the challengers will lose money along with the traders using the “lazy voting” strategy. The system in place will already fix the issue you are trying to “fix”, you’ve incorrectly made an assumption and changed the parameters without allowing the court system to fix it itself.

The mechanism design of a court doesn’t require you to cite historical examples or rely solely on an empirical record of the past behaviour of agents. It requires you to factor in and model how agents may react within the realm of possibilities and mitigate any undesirable responses. For instance, we’ve never actually had a p+epsilon attack in any live courts either. That doesn’t mean we don’t include safeguards against that. Ultimately, the whole point of game theory is to make assumptions.

This is only an outcome you can depend on once you design the payoffs in such a way that the lazy strategy has a lower expected payoff than the average juror payoff. Which is the point of parameter changes, to ensure that the expected payoffs continue to be asymmetric so that jurors are biased towards honest voting. The simple existence of appeals isn’t secure enough.

Note that cryptoeconomic design doesn’t have to adopt the perspective of just one juror in one case; we also have to model the behaviour of many jurors across many cases over a long period of time, which ends up adding more dynamics and complications than the hypothetical you’ve mentioned.

This has already been discussed much better than I ever could hope to in the Kleros Yellow Paper. Please refer to section 4.7.5 (pages 25-27), which discusses exactly this issue.

If this was already addressed thoroughly in the yellow paper why are the parameters now being changed due to the way votes are happening in Humanity court? If what you’re saying is true then it had already been accounted for and there’s no reason to make changes.

What took place that warrants the parameters to change all of a sudden?

Without getting into all the technicals, common sense alone says that a 7,000 PNK vote stake against a .011 ETH coherence reward is just ridiculous. I’m shocked this isn’t evident to the people who voted for or created KIP-57.

The primary reason for changes in all court parameters are the changes in gas prices. With respect to PoH, another reason for the change is the fact that the court has many single juror rounds, which is mathematically relevant, as mentioned here:

That being said, there is also a degree of subjectivity involved in these calculations when it comes to certain assumptions, such as in quantifying juror effort. You may refer to the document below and offer your amendments:

You may also experiment with the parameter calculator for coming up with alternative parameters:

Looking through the Google Colaboratory doc I don’t think I’m technical enough to come up with my own parameters :cold_sweat:

Just want to voice my opinion as a former juror of the humanity court and plead with someone to lower the voting stake to a level that isn’t so disadvantageous to jurors. I’m fine with the reward being lowered but if I’m ever wrong on just 1 case the penalty is far too severe and I hope everyone can see that.

Is there any way to solve the hypothetical issue of “lazy jurors” without making the vote stake so ridiculously high?

I don't know if this is relevant for this KIP or if it is a separate issue, but it would be nice if, when the parameters are changed, all jurors in that court were notified by email (at the same address that is notified when they are drawn).
In my case, I found out about the changes when I got drawn, and now I have to rule on a risky case where in the best scenario I will get 10 US$ and in the worst I could lose 200.

As someone who is both a challenger and a juror, I have personally felt the challenger earns way too much in comparison to a juror, especially when you factor in gas costs.

I will be unstaking from the PoH court very soon because it is simply not worth it anymore. I will have to reconsider if KIP59 passes. Keep in mind I’m not alone; before KIP57 passed the PNK staked in Humanity court was 21mil, and now it is about to hit 16mil. Without intervention soon, it will likely continue to decrease.

I originally voted against KIP57, and unfortunately I was the only one. Many jurors have told me they were unaware of the changes and thus didn’t vote, but they agree that the risk:reward is too high. Better to change it now than never.

I hope to see this proposal pass in some capacity; one in which the PNK per vote decreases and/or the ETH reward increases.