In an ideal world, we would be able to compute the rating as a probability curve, as OK Cupid does:
http://www.okcupid.com/static?p=faq#2.6
The more reviews a translation has, the more certain its score would be. A translation with only one review would be quite uncertain, so its curve would spread across a wide range of the rating scale. As more reviews come in, the curve narrows toward a point on the scale if the reviewers agree, and spreads out if they disagree. In this way, a result can be computed immediately, without having to wait for three reviewers.
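To make the idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own, not how OK Cupid or jboselkei actually computes anything: the 1-to-5 scale, the function name rating_curve, and the trick of seeding the curve with two imaginary reviews at the ends of the scale are all hypothetical. The curve is approximated as a bell curve described by a center and a spread (the standard error of the mean).

```python
from math import sqrt
from statistics import mean, pvariance


def rating_curve(reviews, low=1.0, high=5.0):
    """Return (center, spread) of an approximate bell curve for a rating.

    Assumption: the scale runs from `low` to `high` (hypothetical 1 to 5).
    Assumption: two imaginary reviews, one at each end of the scale, stand
    in for "we know nothing yet", so the curve starts out wide.
    """
    samples = [low, high] + list(reviews)
    center = mean(samples)
    # Standard error of the mean: shrinks as reviews accumulate,
    # grows when the reviews disagree with each other.
    spread = sqrt(pvariance(samples) / len(samples))
    return center, spread


if __name__ == "__main__":
    for reviews in ([], [4], [4, 4, 4], [1, 5, 3]):
        center, spread = rating_curve(reviews)
        print(f"{reviews!s:<12} center={center:.2f} spread={spread:.2f}")
```

With this sketch, a single review already yields a (wide) curve, three agreeing reviews give a narrower one, and three disagreeing reviews give a wider curve than three agreeing ones.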
The fact is, we are dealing with some subjective judgements, and not all reviewers are of equal skill. jboselkei is certain to have a margin of error, so a probability curve is more appropriate than a single number.
However, I understand that this would be very difficult software to write. The system we already have is great, and I'm not complaining.
-epkat