
I'd love to see this analysis done for ChatGPT, which has a much bigger 'consumer' marketshare.

I'm also very wary of their analysis method, given classifiers-gonna-classify. We already see it in their example of someone asking why their game is crashing, which gets bucketed into the Computer & Mathematical occupation. I'm guessing the original question came from a game player rather than a game developer, so can you really call this an occupational task? Sure, it's in that domain, I guess, but in a completely different context. If I'm asking how to clean my dishwasher, that's hardly a repairman or industrial occupation task.

Still, it's cool they're doing this.



If you look at ChatGPT search volume, you can see massive dips during the summer when school is out:

https://trends.google.com/trends/explore?date=today%205-y&ge...

Which suggests that the most common use is as a tutor / cheating on homework.


This ignores people who open ChatGPT.com or use the app


But it ignores them equally throughout the year… It’s not an exact measure, but that doesn’t mean it’s not a useful metric. So long as the measurement is unbiased and captures enough of the traffic, it can still be useful.

The troughs in that graph are all during prime US school/college vacation times: summer, winter, and spring breaks. And the magnitude of each drop corresponds to how long the break typically is. To me, that makes a lot of sense.
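A minimal sketch of the "ignores them equally" argument, in Python. The monthly numbers and the 20% capture rate are made up for illustration and are not taken from the Trends graph. The point is only that a proxy which sees a fixed fraction of real usage keeps the same relative shape, dips included:

```python
# Hypothetical monthly usage in arbitrary units, with a summer dip (illustrative only).
true_usage = [100, 100, 100, 100, 90, 60, 55, 70, 100, 105, 105, 95]

# Assumption: search captures a constant share of real usage all year round.
capture_rate = 0.2
proxy = [u * capture_rate for u in true_usage]

def relative_to_peak(series):
    """Scale a series so its maximum is 1.0, similar to how Trends normalizes."""
    peak = max(series)
    return [round(v / peak, 3) for v in series]

# The normalized shapes match, so the dips are equally visible in the proxy.
print(relative_to_peak(true_usage) == relative_to_peak(proxy))
```

If the capture rate itself changed with the seasons, the two shapes would diverge, which is the real way this proxy could mislead.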


Yeah but it's old. There was no dip in 2024, only a steady increase.


It seems like there was a dip in mid-May and that it didn't go back up until mid/late-August? That corresponds pretty closely to US summer breaks. Also a huge spike in early December followed by a huge drop: final exams and then winter break?



What I see is it going from 58 to 40 (the scale is ???), and it’s only continued to rise over time. So that may be a common use (~30%), but it’s not the only use.

Most of those kids will continue to use it as they graduate, having embedded it in their workflow (unfortunately many will probably fully outsource all thinking to it, having learned a lot less since it did it all for them).


That dip didn't exist in 2024. Site visits just increased steadily throughout last year with no summer dips.


Yeah. Reminds me of the ancient OkCupid data analysis blogs, and not the creepy one by sleep8. The group I'm surprised not to see represented in their analysis is "personal", where people I know use ChatGPT as a therapist/life coach/SMS analyzer & editor. And of course they crucially but understandably left off the denominator. 35% of a million requests is different than 35% of a billion. Also, how many of the conversations had 1 message, indicating "just testing", vs 10 or 100 messages?


> not the creepy one by sleep8

What are you referring to?


Oh I guess it was just a tweet, but still.

https://www.404media.co/ceo-reminds-everyone-eightsleep-pod-...


> 35% of a million requests is different than 35% of a billion.

Not statistically.


A mentor I respect memorably explained to young me that “it doesn’t matter how big the pot of soup, you can use the same size spoon to taste it.”


Sorry, but that mentor has a small practical imagination. A pot can be so large that the top 3 feet you can reach with that spoon could be all oil.


True! Consistency and representativeness matter, in soup samples as in social samples!

Is the soup smooth or lumpy? Striated or uniform? For that matter a soup could (and often does) involve huge soup bones that give it important parts of its flavor, but never show up directly in a spoonful. And you might need something different from a spoon to convincingly rule out some specific rare lumpy ingredient.

The didactic value of sampling the soup pot goes well beyond its basic function: correcting the beginner’s misperception that a sample’s statistical power is directly related to population size :)


to push this analogy too far, that's because you didn't stir it well, not because the spoon is too small.
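To make the stirring point concrete, here is a toy simulation in Python. The pot size, the 10% oil share, and the spoon sizes are all arbitrary assumptions. It compares spoonfuls taken only from the top of an unstirred pot with a small spoonful drawn at random, which stands in for a well-stirred pot:

```python
import random

def make_pot(size, oil_fraction=0.1):
    """Unstirred pot: oil floats, so it fills the first (top) slots of the list."""
    n_oil = int(size * oil_fraction)
    return ["oil"] * n_oil + ["broth"] * (size - n_oil)

def oil_share(spoonful):
    """Fraction of a spoonful that is oil."""
    return sum(1 for x in spoonful if x == "oil") / len(spoonful)

rng = random.Random(0)  # fixed seed so the run is repeatable
pot = make_pot(1_000_000)

top_small = oil_share(pot[:100])              # small spoon, top layer only
top_big = oil_share(pot[:10_000])             # 100x bigger spoon, still top only
stirred_small = oil_share(rng.sample(pot, 100))  # small spoon, random draw

print(f"top, small spoon:     {top_small:.2f}")
print(f"top, big spoon:       {top_big:.2f}")
print(f"stirred, small spoon: {stirred_small:.2f}")
```

Both top-only spoonfuls come back as pure oil no matter how big the spoon is, while the small random spoonful lands near the true 10%. The bias comes from where you dip, not from the spoon size.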


Have to sample to see if it’s stirred well enough.


No, you can model whether stirring actions should create a representative sample


Not with immiscible layered stratified flow…

“You're gonna need a bigger spoon!”


35% of a million students in the USA is very different to 35% of a billion students across the USA, Europe and Africa.

Since there aren't a billion students in the USA, 35% of them is an impossibility.

If you scale your population above some recognized boundary you aren't sampling in the same space any more. After all the local star density to 1AU tends very strongly to 1. That's not indicative of the actual star density in the milky way.


Yes statistically. What do you think "statistically" means?


What do you mean by “statistically”? The end results would be like three orders of magnitude apart. Wouldn’t the desired sample size depend on the size of the population itself?


>Wouldn’t the desired sample size depend on the size of the population itself?

No. The most important thing is how the sample is drawn. You have to make sure it isn't obviously biased in some way (e.g., you're only surveying students at one university and extrapolating to the entire population of the country). Beyond that, the desired sample size levels off quickly.

A sample of 5000 (assuming the same distribution) won't be any more or less accurate for 10M than it is for 1M.

Of course, if you just ask everyone or almost everyone then you no longer need to worry about distribution but yeah
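A rough check of the 5000-sample claim in Python, using the textbook margin of error for a proportion with the finite population correction. It assumes a simple random sample, a 95% normal approximation (z = 1.96), and p = 0.35 borrowed from the figure quoted upthread; the function name is just for illustration:

```python
import math

def margin_of_error(n, population, p=0.35, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    Applies the finite population correction, which is the only place
    the population size enters the formula.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc

for population in (1_000_000, 10_000_000, 1_000_000_000):
    moe = margin_of_error(5000, population)
    print(f"N = {population:>13,}  margin of error = +/- {moe * 100:.3f} points")
```

The three results should differ only in the third or fourth decimal place, all around 1.3 percentage points. The correction only starts to matter once the sample is a sizable share of the population, which is the "ask almost everyone" case.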


Very wary. Not weary.


why not both?


lmao



