Ask for transparency in the Hagah Contests

EDIT: This post contains incorrect methodology. Will update later after reviewing the video below

EDIT 2: The script to compare contest results MIGHT be buggy. For anyone familiar with coding, please review my analysis. TTV, I suggest that you open-source the code so that we can review it, because I suspect there are flaws in it.

Hello!

Can we review the Hagah contest results? I was bored last night and decided to do some number crunching, and the Bomonga contest results seem to be a little too close for comfort. According to my calculations, the first- and second-place entries were virtually tied and were probably decided by a single vote (maybe two or three). Going by the results Mesonak posted, however, that single vote should have been in favor of Entry 1 (from Kodiak), unless rounding errors swung it slightly in favor of Entry 2, at least according to my models.

Ordinarily in races that close there’d be a recount. Now I trust the TTV team, and I also understand that there is a script run to calculate results, but can we display the vote count totals for transparency?

Also, just a disclaimer, but my models assume that the weights for first, second, and third choices are equal. So if the categories are weighted differently, then the calculations could definitely change. But as it stands now, the Bomonga Results on my end are calculated as follows:

1. Entry 1 (318.92)
2. Entry 2 (317.09)
3. Entry 3 (258.28)
4. Entry 4 (242.71)

My proof of work can be found here:

I have included two possible scenarios (of many) that would make Entry 2 the winner, so the results we have now are definitely possible. But there’s no way to know for sure without being able to see the votes themselves. And to clarify, I think there is only potential cause for concern in the Bomonga results. The other contests seem to have been decided by comfortable margins.


It’s likely due to either the limited number of fraudulent votes or the 1-2-3 preferences being weighted differently.

We actually know who voted for who. The results posted by Meso show every user vote.
https://board.ttvchannel.com/t/bomonga-poll-eight/61158/76?u=huemus

So if you really want to check the results, you have to take the voters who picked Entry 3 or 4 as their first choice and see what each person voted for as their second choice. Then do the same for the third choice.

The preferences should not be added together like this. I’m not quite sure how to explain it through text, but the major error in your proof is (I believe) where you add all the earned votes together to get the total votes. The system should be as follows, using Bomonga for an example:
Entry 1 has 122.1 votes, Entry 2 has 122.1 votes, Entry 3 has 97.68, Entry 4 has 65.12.
Entry 4 has the lowest number of votes, so those 65.12 votes should be divided amongst the other three entries, as those voters indicated with their second preferences.
If one entry now has a majority, it wins; otherwise, you repeat the process with the next lowest entry, reallocating to either second or third preferences as necessary.
Now you only have two entries, one of which has more votes. That one wins.


This is incorrect.

When you first add up the results, the second and third votes don’t matter; you only count the first votes. Once you have those totals, you eliminate the entry with the fewest number of votes.

Then, you take all the people whose first votes went to the eliminated entry, and you count their second votes, adding them to the previous totals for the other three entries.

Then you repeat this process until one entry has over 50% of the votes.
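For anyone who finds code easier to follow than prose, here is a minimal Python sketch of the counting process described in this post. It is only an illustration and is not TTV's script: the function name `instant_runoff` and the toy ballots are made up, each ballot is assumed to be a tuple of entries in preference order, and ties for last place are broken by name purely to keep the example deterministic.

```python
from collections import Counter

def instant_runoff(ballots):
    """Return the winner under the elimination process described above.

    Each ballot is a tuple of entries in preference order (hypothetical format).
    """
    remaining = {choice for ballot in ballots for choice in ballot}
    while remaining:
        # Each ballot counts once: for its highest-ranked entry still in the running.
        tally = Counter()
        for ballot in ballots:
            for choice in ballot:
                if choice in remaining:
                    tally[choice] += 1
                    break
        if not tally:
            return None  # every ballot is exhausted
        counted = sum(tally.values())
        leader, leader_votes = tally.most_common(1)[0]
        if leader_votes * 2 > counted:  # strictly more than 50%
            return leader
        # Eliminate the entry with the fewest votes.
        # Assumption: ties are broken by name here, only for determinism.
        loser = min(remaining, key=lambda c: (tally[c], c))
        remaining.discard(loser)
    return None

ballots = (
    [("A", "B", "C")] * 4
    + [("B", "A", "C")] * 3
    + [("C", "B", "A")] * 2
)
print(instant_runoff(ballots))
```

In the toy data nobody has a majority in the first round, so the entry with the fewest first votes is eliminated and those ballots move to their second choices.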

TTV always refers to this video, which explains the process:


Lol I am a scrub. I’ll review the video. Thanks all for clarifying!

Also, fwiw, the vote totals the incorrect way add up to:

  1. 320
  2. 317

TL;DR at the bottom:

OK, so I reviewed the clip. I think TTV’s script to check the ranked choice votes might be wrong. I think it is buggy because it may be terminating prematurely at an if-statement inside a loop, but I’d need to review the code to be sure. I think this is the case based on my logic below; someone please correct me if I’m wrong:

For the sake of simplicity, let’s reduce the complexity of the problem. Assume all voters of Entry 4 voted for Entry 2.

The votes go from:

  1. 30%
  2. 30%
  3. 24%
  4. 16%

to:

  1. 30%
  2. 46%
  3. 24%

For now, observe that none of them has reached a majority yet (the 50% threshold). Entry 2 only needs more than 4 percentage points from the reallocated votes to clinch the win.

What happened here? In eliminating 4, we moved all of its votes to the next best entry. The total number of votes in this pool has increased. This also means that if we need another round, then we can also discount all ranked choices in second and third preferences for Entry 4. This means we’d expect all remaining votes for second and third preferences to be split between 1,2,3.

However, observe that none of the votes in this simplified version have reached majority. An entirely new elimination round is needed.

Ok, now pause for a moment. We know that we need one more set of eliminations to determine the winning moc. So we repeat the above. In eliminating 3, then that means we should expect all remaining votes to be divided between 1 and 2. We can disregard all entry 3 votes in the second and third choice selections.

Observe again that this effectively reduces the problem to granting the same weight to each category of choices between entries 1 and 2 across all three preferences. Ergo, in summing the selections for preferences 1,2, and 3, only entries 1 and 2 matter.

Now back up to our non-simplified, real situation. Recall that the totals in this scenario were as follows:

Entry 1: 320
Entry 2: 317

Considering that only the totals for Entries 1 and 2 matter across all three preferences in the final round of eliminations, we should have expected to see this result. But we didn’t, and I think I know why.

If the code re-counts a vote from an eliminated entry and checks the percentage within the same loop, incrementing the total number of votes at the same time as it credits either Entry 1 or Entry 2, then we would expect the following running proportions:

EITHER:
(Entry_1 + 1)/(total_vote + 1)
or
(Entry_2 + 1)/(total_vote + 1)

The pseudocode for this scenario would look something like:
for vote in reallocated_votes:
    reallocate(vote)  # updates the running totals one vote at a time
    for entry in entries:
        # checks for a majority before the round has finished
        if entry.get("proportion") > 0.5:
            return entry.get("entry_number")

BUT THIS IS WRONG!

The proper method would be to assess all votes for the remaining 24% in their entirety. Why? Because in close races, the order in which the reallocated votes are counted can falsely determine the outcome if the check is based on a running proportion. Just think about how wildly percentage points change in actual elections, and how in close races every vote must be counted (often triggering recounts).

To illustrate, say that we are stuck in the following scenario:
Owl: 4 votes (40%)
Lion: 4 votes (40%)
TO REALLOCATE >>> Turtle: 2 (20%)

Let’s assume that both voters for Turtle were evenly split between Owl and Lion. The possible sets of votes look like [O, L] or [L, O]

If we go by the pseudocode above, then the first set would yield a win for Owl. OWL → [4+1]/[8+1] vs LION → [4]/[8+1]

The second would yield a win for Lion. Same deal but reversed: OWL → [4]/[8+1] vs LION → [4+1]/[8+1]. But really the results should be indeterminate.
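The Owl/Lion example can be reproduced with a short Python sketch. It only models the hypothesized bug and is not taken from the actual script; `running_check` and `full_round_check` are made-up names, and the running total is assumed to grow by one with each reallocated vote, as in the fractions above.

```python
def running_check(counts, reallocated):
    """Hypothesized buggy behavior: test for a majority after every single vote."""
    counts = dict(counts)
    for choice in reallocated:
        counts[choice] += 1
        total = sum(counts.values())  # grows as votes are reallocated
        for name, votes in counts.items():
            if votes / total > 0.5:
                return name  # stops before the round is complete
    return None

def full_round_check(counts, reallocated):
    """Finish the whole reallocation round, then test for a majority once."""
    counts = dict(counts)
    for choice in reallocated:
        counts[choice] += 1
    total = sum(counts.values())
    for name, votes in counts.items():
        if votes / total > 0.5:
            return name
    return None  # nobody is above 50%

start = {"Owl": 4, "Lion": 4}
print(running_check(start, ["Owl", "Lion"]))     # Owl
print(running_check(start, ["Lion", "Owl"]))     # Lion
print(full_round_check(start, ["Owl", "Lion"]))  # None
print(full_round_check(start, ["Lion", "Owl"]))  # None
```

The running check returns whichever animal happens to be counted first (5/9 is above 50%), while the full-round check ends at 5 votes each out of 10 and reports no majority regardless of order.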

Why is this problematic? If the code works like the pseudocode I describe above and does not complete the reallocation round first, then the outcome can depend on an arbitrary ordering of the votes (like who voted first, or whose username comes first alphanumerically) and can come out wrong.

So again, the proper method to do this would be to assess all votes for the remaining reallocation round in entirety, and I don’t think that’s what happened.

TL;DR:
The script needs to be audited. If all votes were legitimate and there were no duplicates, then it is probably incorrectly comparing proportioned totals on a running basis (i.e., before the reallocation round is complete).

I’ll admit that I don’t quite understand the technical stuff at the bottom of your post, but this line seems incorrect; it’s true that you’re adding the second votes of the people who originally voted for the eliminated entry, but you also removed their first votes; the total number of votes remains the same, with the exception of those voters who only entered their first choice (which we were specifically warned not to do, on account of it messing with the code).

This also seems wrong, though it’s entirely possible I’m misunderstanding something. If someone’s first choice was for Entry 2, and their second or third vote was for Entry 1, that latter vote is still meaningless because their first choice is still in the running. You still can’t just add up all of the votes for the two entries, since each voter can only have one vote counting at a time.

Great catches, I think you are correct on both points actually. Let me sit on that for a bit, but those would be major oversights on my end. Given how close the results seem to be, it’s still hard to say without looking at the code and throwing some test cases at it. This system is difficult to audit, and it’s compounded based on how close the two entries were!


If you really want to break into the system, your only choice is to check every vote:

In this image the reddish lines are the ones that end up voting for Entry 1 and the bluish one for Entry 2. Since we know the fight is between Entry 1 and Entry 2, we don’t need to know who voted for Entry 3 or 4 for their third choice (that’s why I covered them). So you have to track each one of the 407 votes.

Let’s take Meso for example. He voted for Entry 2 first, so we don’t need to know what he voted after. The same happens with the ones that voted for Entry 1 first.

For those who voted for Entry 3 or 4 first, you must see what each one voted for next. For example, the light blue line voted for Entry 3 first and Entry 2 second, so his final vote goes to Entry 2. If you take me as an example (pink line), I went for Entry 3 first and Entry 1 second, so my final vote goes to Entry 1, no matter what I chose as third.

There are also people who voted for Entry 3 or 4 first and Entry 4 or 3 second, so you must see what they voted for as their third option. As an example, the dark green line voted for Entry 3, then Entry 4, and then Entry 2, so this vote went to Entry 2.

You can’t just add the total votes of the second preference to Entry 1 or 2, since there are people who voted for Entry 1 as first and then for Entry 2 (dark red mark), or for Entry 2 as first and then for Entry 1 (dark blue mark).

Add votes until one of them reaches 50% + 1. Since there are fewer votes in the second preference than in the first, and fewer in the third than in the second, it could happen that someone voted only for Entry 3 or 4 and did not choose either of the two finalists. In that case, the total number of voters used to calculate 50% + 1 would be smaller (406 voters instead of the original 407, if only one person did this).
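The manual tracking described above could be sketched in Python roughly like this. The ballots are made-up toy data rather than the real Bomonga votes, `final_vote` and `head_to_head` are hypothetical helper names, and each ballot is assumed to be a tuple of entry numbers in preference order.

```python
def final_vote(ballot, finalists):
    """Return the first finalist on the ballot, or None if it names neither."""
    for choice in ballot:
        if choice in finalists:
            return choice
    return None

def head_to_head(ballots, finalists=(1, 2)):
    """Count each ballot once for a finalist and compute the 50% + 1 threshold."""
    tally = {entry: 0 for entry in finalists}
    for ballot in ballots:
        pick = final_vote(ballot, finalists)
        if pick is not None:  # ballots naming no finalist drop out of the total
            tally[pick] += 1
    counted = sum(tally.values())
    needed = counted // 2 + 1
    return tally, needed

toy_ballots = [
    (2, 1, 3),  # Entry 2 first: later choices never matter
    (3, 2, 1),  # Entry 3 first, Entry 2 second: counts for Entry 2
    (3, 1, 4),  # Entry 3 first, Entry 1 second: counts for Entry 1
    (3, 4, 2),  # third choice is the first finalist: counts for Entry 2
    (4, 3),     # names neither finalist: not counted
    (1, 2, 3),  # Entry 1 first
]
tally, needed = head_to_head(toy_ballots)
print(tally, needed)
```

Each voter contributes at most one vote, which is why simply adding up the second-preference column for Entry 1 or 2 overcounts.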

If you accept the mission, I would be interested in seeing the final result. I know that looks like a lot of work, but there aren’t that many votes.


Coding is well beyond my comprehension, but @Infrared, the designer of the program TTV uses, provided the code (or at least elements of it) on github, found in this post. I don’t know if/how it was modified past the date it was posted, and if any changes would show on github, but if there are any problems with it, that’s probably a good place to start, and Infrared is the person to ask.


Here is the code. Also, in case the code changes in the future, here is a permalink to the code as it was when I wrote this post: commit id f1b378f.

The README goes over implementation details that aren’t covered in the video (such as how surplus votes for a winning entry get reallocated). The first script grabs the poll data from Discourse and puts it in a spreadsheet. The second script uses that spreadsheet and simulates all the rounds of voting to determine the outcome. For transparency’s sake, after each round the second script outputs an updated spreadsheet and a chart.

The TTV staff run the scripts without my involvement. Based on the timestamps of our messages, I believe the version they are using is from the fourth-to-last commit (commit id c3e0710). The changes since then have only been to documentation or code style. You can verify this yourself by looking at the diffs here. So, if you run the latest code (commit id f1b378f) yourself, you should get the same outcome that TTV got.

Now how can you verify the poll results yourself? Unfortunately, Discourse does not allow TTV to hide poll results during voting and then reveal them afterward; poll results are either always public or always hidden. They decided to hide the results (I believe to make vote brigading harder). Anyway, as a result of that decision, you can’t run the first script to grab the poll results directly from Discourse without a staff API key. However, if you get a hold of the spreadsheet generated by the first script, you can run the second script to determine the poll’s outcome. So the easiest way to verify the results would be to request that first spreadsheet from TTV (specifically @TenebraeInvictus iirc).

The code only checks for winners after all votes have been reallocated, so it avoids the issue you mentioned. That said, if there’s a tie, the code selects a winner randomly (rather than based on which entry is first alphabetically, which entry’s votes were tabulated first, etc.). That’s because the two entries don’t have enough votes to win, so they’re technically tied for last place, meaning one gets randomly eliminated. The relevant line is here.

The random decision is by design rather than an oversight. My thinking was that the script should look for a fair result where it can, and it should be up to the discretion of whoever runs the script to accept or reject the results. Since a random tiebreaker is the fairest result possible without running more polls, that’s what the script does. That said, since the script outputs charts and spreadsheets as it simulates the polls, it would be obvious to TTV that the winner was chosen randomly by tiebreaker. It’d then be up to TTV to decide whether to accept that result or to run a new poll.
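To illustrate the tiebreaker idea in isolation, here is a minimal Python sketch. It is not copied from the linked script; the function name `pick_entry_to_eliminate`, the `rng` parameter, and the tally format are assumptions made for this example.

```python
import random

def pick_entry_to_eliminate(tally, rng=random):
    """Pick the last-place entry, choosing at random among entries tied for last."""
    lowest = min(tally.values())
    tied = sorted(name for name, votes in tally.items() if votes == lowest)
    return rng.choice(tied)

# Two entries tied with no majority: both are in last place, so one is
# eliminated at random and the other wins the following round.
print(pick_entry_to_eliminate({"Entry 1": 5, "Entry 2": 5}))
```

Passing a seeded `random.Random` as `rng` would make a run repeatable, which could help if someone wanted to reproduce a specific tiebreak.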


just realized i’m the first vote on a few of them
that’s quite cool, isn’t it?