I'd love to see this analysis done for ChatGPT, which has a much bigger 'consumer' market share.
I'm also very wary of their analysis method, given classifiers-gonna-classify. We already see it in their example of someone asking why their game is crashing, which gets bucketed into the Computer & Mathematical occupation. I'm guessing the original question came not from a game developer but from a game player, so can you really call this an occupational task? Sure, it's in that domain, I guess, but in a completely different context. If I'm asking a question about how to clean my dishwasher, that's hardly in repairman or industrial occupations.
But it ignores them equally throughout the year… It’s not an exact measure, but that doesn’t mean it’s not a useful metric. So long as the measurement is unbiased and captures enough of the traffic, it can still be useful.
The troughs in that graph are all during prime US school/college vacation times: Summer, Winter, and Spring breaks. And the magnitude of each drop corresponds to how long the break typically is. To me, that makes a lot of sense.
It seems like there was a dip in mid-May and that it didn't go back up until mid/late-August? That corresponds pretty closely to US summer breaks. Also a huge spike in early December followed by a huge drop: final exams and then winter break?
What I see is it going from 58 to 40 (the scale is ???), and it’s only continued to rise over time. So that may be a common use (~30%), but it’s not the only use.
Most of those kids will continue to use it as they graduate, having embedded it in their workflow (unfortunately many will probably fully outsource all thinking to it, having learned a lot less since it did it all for them).
Yeah. Reminds me of the ancient OkCupid data analysis blogs, and not the creepy one by sleep8. The group I'm surprised not to see represented in their analysis is "personal", where people I know use ChatGPT as a therapist/life coach/SMS analyst & editor. And of course they crucially but understandably left off the denominator: 35% of a million requests is different than 35% of a billion. Also, how many of the conversations had 1 message, indicating "just testing", vs 10 or 100 messages?
True! Consistency and representativeness matter, in soup samples as in social samples!
Is the soup smooth or lumpy? Striated or uniform? For that matter a soup could (and often does) involve huge soup bones that give it important parts of its flavor, but never show up directly in a spoonful. And you might need something different from a spoon to convincingly rule out some specific rare lumpy ingredient.
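To put a rough number on the "rare lumpy ingredient" point: a minimal sketch, assuming each spoonful is an independent random draw from a well-mixed pot (exactly the assumption a lumpy soup breaks). The function name `samples_to_rule_out` is made up for illustration. It asks how many draws you need before seeing zero hits makes a given prevalence implausible.

```python
import math

def samples_to_rule_out(prevalence, confidence=0.95):
    """Smallest n such that, if the ingredient really made up `prevalence`
    of the pot, n independent random draws would all miss it with
    probability at most 1 - confidence.

    Derivation: P(all n draws miss) = (1 - prevalence) ** n <= 1 - confidence.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - prevalence))

if __name__ == "__main__":
    # Rarer ingredients need many more draws before "none seen" means much.
    for prevalence in (0.1, 0.01, 0.001):
        print(prevalence, samples_to_rule_out(prevalence))
```

The required number of draws grows roughly like 3 / prevalence, so a single spoon says very little about anything rare.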
The didactic value of sampling the soup pot goes well beyond its basic function: correcting the beginner’s misperception that a sample’s statistical power is directly related to population size :)
35% of a million students in the USA is very different to 35% of a billion students across the USA, Europe and Africa.
Since there aren't a billion students in the USA, 35% of them is an impossibility.
If you scale your population above some recognized boundary you aren't sampling in the same space any more. After all, the local star density within 1 AU tends very strongly to 1. That's not indicative of the actual star density in the Milky Way.
What do you mean by “statistically”? The end results would be like three orders of magnitude apart. Wouldn’t the desired sample size depend on the size of the population itself?
>Wouldn’t the desired sample size depend on the size of the population itself?
No. The most important thing is how the sample is distributed. You have to make sure it isn't obviously biased in some way (e.g., you're only surveying students at a university and extrapolating to the entire population of the country). Beyond that, the desired sample size levels off quickly.
5000 (assuming the same distribution) won't be any more or less accurate for 10M than it is for 1M.
Of course, if you just ask everyone or almost everyone then you no longer need to worry about distribution but yeah
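A quick way to check the "levels off quickly" claim is the textbook margin of error for a proportion with the finite population correction. This is only a sketch, assuming a simple random sample drawn without replacement and the worst case p = 0.5; the helper name `margin_of_error` is made up for illustration.

```python
import math

def margin_of_error(n, population, p=0.5, z=1.96):
    """Approximate 95% margin of error for an estimated proportion.

    Assumes a simple random sample of size n drawn without replacement
    from `population` individuals; p=0.5 is the worst case.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    # Finite population correction: shrinks to 0 as n approaches the population.
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc

if __name__ == "__main__":
    for population in (1_000_000, 10_000_000, 1_000_000_000):
        moe = margin_of_error(5000, population)
        print(f"N={population:>13,}  n=5000  margin = {100 * moe:.3f} points")
```

With n = 5000 the margin stays at about 1.4 percentage points whether the population is 1M or 10M. The correction term only matters once the sample is a sizeable fraction of the population, and it reaches zero when you ask everyone.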
Still, it's cool they're doing this.