
Do you know of a generalization that works with real-valued ratings instead of just positive and negative ones? E.g. a rating could be anything between 0 and 1.

Great job btw.



Most of what I know about ranking comes from Evan Miller's blog. Here's a post that deals with star ratings: https://www.evanmiller.org/ranking-items-with-star-ratings.h...


Have you tried the simple Bayesian approach with a Beta prior? [1] I'd be interested to learn how it does.

  # Prior: pretend every item starts out with 4 upvotes and 4 downvotes.
  pretend_upvotes = 4
  pretend_downvotes = 4

  def score(item_upvotes, item_downvotes):
    # Posterior mean of the upvote probability under a Beta(4, 4) prior.
    upvotes = item_upvotes + pretend_upvotes
    downvotes = item_downvotes + pretend_downvotes
    return upvotes / float(upvotes + downvotes)

[1] http://julesjacobs.github.io/2015/08/17/bayesian-scoring-of-...


Interesting, I like the simplicity of that. Do you have any info on how to determine good initial values for the prior, i.e. good values for pretend_upvotes and pretend_downvotes in this example? Would it make sense to use average_upvotes and average_downvotes, or values that have that ratio?


Values that have that ratio might be good, but I'm not sure about the magnitude: the average number of votes may be so high that the prior overwhelms the data. The scores get pulled towards that ratio as you increase the magnitude. If the ratio is close to 0, the prior downranks videos with few votes, and if the ratio is close to 1, it upranks videos with few votes. The effect might be too strong if you use the average magnitude. It might also be good to set the ratio a bit lower than the average ratio if you want to rank conservatively.

Parametrising it like you suggest might make it easier to experiment:

  ratio = 0.5   # fraction of pretend votes that are upvotes (prior mean)
  number = 100  # total number of pretend votes (prior strength)
  pretend_upvotes = ratio*number
  pretend_downvotes = (1-ratio)*number

You could even set ratio to 0, but I actually think it makes sense to rank 1 up / 2 down above 101 up / 200 down, because the latter is definitely bad whereas the former might be good.
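
To make the 1 up / 2 down versus 101 up / 200 down comparison concrete, here is a rough sketch. The make_score helper and the names neutral and pessimistic are made up for illustration; the arithmetic is the same as in the score function above. It assumes Python 3 division.

```python
def make_score(ratio, number):
    """Build a scoring function whose prior is `number` pretend votes,
    a fraction `ratio` of which are upvotes. (Hypothetical helper.)"""
    pretend_upvotes = ratio * number
    pretend_downvotes = (1 - ratio) * number

    def score(item_upvotes, item_downvotes):
        upvotes = item_upvotes + pretend_upvotes
        downvotes = item_downvotes + pretend_downvotes
        return upvotes / (upvotes + downvotes)

    return score


neutral = make_score(ratio=0.5, number=100)
pessimistic = make_score(ratio=0.0, number=100)

few_votes = (1, 2)       # might still turn out to be good
many_votes = (101, 200)  # almost certainly bad

# With ratio = 0.5 the item with few votes ranks higher.
print(neutral(*few_votes), neutral(*many_votes))
# With ratio = 0 the order flips: few votes get pulled towards 0.
print(pessimistic(*few_votes), pessimistic(*many_votes))
```

So with ratio = 0.5 the barely-rated item stays above the clearly bad one, and with ratio = 0 it drops below it.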


You can either estimate the prior as part of a hierarchical model, or use empirical Bayesian estimation. I spoke last year about an example of EBE applied to music trends:

https://mobile.twitter.com/achompas/status/88732699382138880...
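
For the empirical Bayes route, one common shortcut is to fit the Beta prior by the method of moments on the observed upvote ratios. This is only a sketch of that idea, not necessarily what the talk above does; fit_beta_prior and the min_votes cutoff are names and choices I'm assuming for illustration.

```python
from statistics import mean, pvariance


def fit_beta_prior(vote_counts, min_votes=20):
    """Estimate (pretend_upvotes, pretend_downvotes) from existing items.

    vote_counts: iterable of (upvotes, downvotes) pairs.
    min_votes:   assumed cutoff; items with fewer votes are skipped because
                 their ratios are mostly sampling noise.
    """
    ratios = [up / (up + down)
              for up, down in vote_counts
              if up + down >= min_votes]
    m = mean(ratios)
    v = pvariance(ratios)
    if v <= 0 or v >= m * (1 - m):
        raise ValueError("ratios cannot be matched by a Beta distribution")
    # Method of moments: a Beta(a, b) has mean a / (a + b) and
    # variance m * (1 - m) / (a + b + 1), so solve for a + b.
    strength = m * (1 - m) / v - 1
    return m * strength, (1 - m) * strength


pretend_upvotes, pretend_downvotes = fit_beta_prior(
    [(90, 10), (50, 50), (70, 30), (30, 70)])
print(pretend_upvotes, pretend_downvotes)
```

Note that the observed ratios also contain binomial noise, so this tends to overestimate the variance and therefore underestimate the prior's strength. A hierarchical model handles that properly.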


You probably want a Beta distribution, which has support on the bounded interval [0,1] and also doubles as the conjugate prior for the binomial.
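
For ratings anywhere between 0 and 1, one simple heuristic is to count a rating r as r of an upvote and 1 - r of a downvote, and keep the same pretend-votes prior. To be clear, this is an assumption on my part and not an exact conjugate update (that only holds for 0/1 votes); score_ratings is a made-up name.

```python
def score_ratings(ratings, pretend_upvotes=4, pretend_downvotes=4):
    """Shrunken average of ratings in [0, 1].

    Each rating r counts as r of an upvote and (1 - r) of a downvote.
    Heuristic generalization of the up/down score, not an exact
    Bayesian posterior for real-valued ratings.
    """
    for r in ratings:
        if not 0 <= r <= 1:
            raise ValueError("ratings must lie in [0, 1]")
    upvotes = pretend_upvotes + sum(ratings)
    downvotes = pretend_downvotes + sum(1 - r for r in ratings)
    return upvotes / (upvotes + downvotes)


print(score_ratings([]))               # no data: just the prior ratio
print(score_ratings([0.9, 0.8, 1.0]))  # a few high ratings
```

With only 0 and 1 ratings this reduces to the upvote/downvote score, so the two can share the same prior.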



