5/26/12

bts data

Here's data about this experiment, which has to do with Bayesian truth serum. The data contains answers to 100 questions from both turkers and blog readers (all anonymized).

The questions were displayed three at a time. Usually one of them was showing its results, leaving two questions you could answer. In such cases:

  • 57.2% of the time, people answered the top question
  • 76.4% of the time, people answered the older question (the one that had been there longer)

I'm not sure these numbers are really meaningful, though. One common case is probably this: there are 3 answerable questions, the user answers the first one (which is not included in this data, since there were 3 questions instead of 2), and then the user answers the second one. On the other hand, that second question would be both on top and older, so the bias would skew each figure by the same amount. That means the difference between the percentages is probably meaningful (i.e., the older-ness of a question is more predictive than which question is on top).
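One way to check the "both on top and older" worry would be to split the two-question cases by whether the top question is also the older one. Here is a minimal sketch of that idea. The record shape (`topIsOlder`, `answeredTop`) and the sample rows are hypothetical and made up for illustration; they are not taken from the actual data file.

```javascript
// Hypothetical records: one per case where exactly two questions were answerable.
// topIsOlder: the question on top is also the one that has been there longer.
// answeredTop: the user answered the top question (otherwise the other one).
const cases = [
  { topIsOlder: true, answeredTop: true },
  { topIsOlder: true, answeredTop: true },
  { topIsOlder: true, answeredTop: false },
  { topIsOlder: false, answeredTop: false },
  { topIsOlder: false, answeredTop: false },
  { topIsOlder: false, answeredTop: true },
];

// Fraction of rows satisfying pred (NaN for an empty list).
function rate(rows, pred) {
  return rows.length ? rows.filter(pred).length / rows.length : NaN;
}

// With two answerable questions, the user picked the older one exactly when
// "top is older" and "answered top" agree.
const answeredOlder = (c) => c.topIsOlder === c.answeredTop;

const topRate = rate(cases, (c) => c.answeredTop);
const olderRate = rate(cases, answeredOlder);

// Cases where top and older are different questions: here the two
// explanations compete, so this rate isolates the older-ness effect.
const splitCases = cases.filter((c) => !c.topIsOlder);
const olderRateWhenSplit = rate(splitCases, answeredOlder);

console.log(topRate, olderRate, olderRateWhenSplit);
```

If the older question still wins clearly in the split cases, that would support reading the gap between the two percentages as an older-ness effect rather than an artifact of the sequence described above.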

Graph 1: actual vs mean-guess



Graph 2: actual vs mean-guess for people who answered 'yes' and 'no'


Here's the Excel file for the above plots.

JavaScript Eval code (after pasting this data into the lower-left textarea):


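// `input` is the pasted data; foreach and print are assumed to be helpers
// provided by the JavaScript Eval page
var db = eval(input)
// start each question with empty tallies
foreach(db.questions, function (q) {
    q.answers = {}
    q.guesses = []
})
// count each answer text per question; keep guesses only from yes/no answers
foreach(db.answers, function (a) {
    var q = db.questions[a.question]
    q.answers[a.text] = (q.answers[a.text] || 0) + 1
    // anchored so that texts merely containing "no" or "yes" don't match
    if (/^(yes|no)$/.test(a.text))
        q.guesses.push(a.guess)
})
// print one CSV row per question: actual fraction of 'yes', then mean guess
foreach(db.questions, function (q) {
    // default to 0 so a question with no 'yes' (or no 'no') doesn't give NaN
    var yes = q.answers.yes || 0
    var no = q.answers.no || 0
    var s = ""
    s += yes / (yes + no)
    s += ','
    var sum = 0
    var total = 0
    foreach(q.guesses, function (g) { sum += g; total += 1 })
    s += sum / total
    print(s)
})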
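For anyone who wants to run the same tabulation outside JavaScript Eval, here is a rough standalone sketch that also splits the mean guess by the respondent's own answer, which is the breakdown Graph 2 describes. It is not the code used to make the plots. The sample `db` is invented, and the shape is an assumption inferred from the snippet above: `db.questions` is treated as an array indexed by `a.question`, and `guess` is treated as a number on the same scale as the actual fraction.

```javascript
// Hypothetical sample data, only to make the sketch runnable.
const db = {
  questions: [{}, {}],
  answers: [
    { question: 0, text: "yes", guess: 0.8 },
    { question: 0, text: "yes", guess: 0.6 },
    { question: 0, text: "no", guess: 0.4 },
    { question: 1, text: "no", guess: 0.2 },
    { question: 1, text: "yes", guess: 0.5 },
    { question: 1, text: "skip", guess: 0.9 }, // ignored: not yes/no
  ],
};

// Arithmetic mean (NaN for an empty list).
function mean(xs) {
  return xs.length ? xs.reduce((a, b) => a + b, 0) / xs.length : NaN;
}

// One summary row per question.
function summarize(db) {
  return db.questions.map((_, i) => {
    const rows = db.answers.filter(
      (a) => a.question === i && (a.text === "yes" || a.text === "no")
    );
    const yes = rows.filter((a) => a.text === "yes");
    const no = rows.filter((a) => a.text === "no");
    return {
      actual: rows.length ? yes.length / rows.length : NaN, // fraction of 'yes'
      meanGuess: mean(rows.map((a) => a.guess)),            // Graph 1
      meanGuessYes: mean(yes.map((a) => a.guess)),          // Graph 2, 'yes' group
      meanGuessNo: mean(no.map((a) => a.guess)),            // Graph 2, 'no' group
    };
  });
}

const summary = summarize(db);
for (const r of summary) {
  console.log([r.actual, r.meanGuess, r.meanGuessYes, r.meanGuessNo].join(","));
}
```

Each printed row is comma-separated, so the output can be pasted into a spreadsheet the same way as the output of the snippet above.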


