
plewd · 2026-06-04T13:19:16 1780579156

That's a fair point, and it gets into intrinsic vs extrinsic motivation. The problem is that nearly all students are conditioned to care about external motivators (GPA, parental expectations, etc.) instead of "the joy of learning".

plewd · 2026-06-04T13:09:30 1780578570

Is that really a fair comparison though? Were there any stats showing that ball pens directly impacted metrics like grades?

I understand that it's harder to see things without the benefit of hindsight, but we must agree that AI's impact on students (or society, to be even more vague) has a much larger scope.

kopirgan · 2026-06-04T14:16:16 1780582576

I'm frankly not sure in either case; I'm just commenting on how things change over the ages yet remain the same. If the broader concern about AI blunting thought, introducing laziness, etc. is true, then the same holds for things like calculators, although I agree on a much smaller scale.

I do share some of the concerns, though I don't have kids of school-going age.

plewd · 2026-06-04T08:24:48 1780561488

I'm curious why this comment got flagged? At least to me it looks really human (typos, uncapitalized words, etc.)

sperandeo · 2026-06-04T18:27:53 1780597673

I'm curious too, because I am not using AI, and now I'm literally scrutinizing how to even talk here. I'm even asking AI how this looks like AI. What is even going on here?

dang · 2026-06-04T22:27:51 1780612071

I responded to the GP comment because it was your most recent post, but there have been quite a few other posts that got classified as genai - e.g.:

https://news.ycombinator.com/item?id=48310576

https://news.ycombinator.com/item?id=48310559

https://news.ycombinator.com/item?id=48310458

As I said, it's impossible to know for sure - but we have no choice but to use imperfect measures because (1) the site is getting inundated, and (2) the community feels strongly about it.

In most cases when I've traced through similar cases with specific users, it turned out that they were using some tool (e.g. a grammar checker) that they didn't think of as AI-editing, but which was leaving LLM imprints on their text. So far it doesn't seem possible to run text through LLMs and not have that happen, but we have no idea how that will evolve over time.

For anyone wondering, I suppose I should add that this is in no way anti-AI! LLMs are amazing and we rely on them heavily; we just don't use them on text that we post to HN, since HN conversation is a social activity for humans. There was an interesting PG post about this the other day:

https://x.com/paulg/status/2058844147092488401, https://xcancel.com/paulg/status/2058844147092488401

Although he was writing about emails, not forum comments, the argument is much the same.

plewd · 2026-06-04T08:15:42 1780560942

Those that start negatively impacting day-to-day life significantly, although I believe nearly everyone can benefit from some sort of therapy.

plewd · 2026-05-27T08:17:11 1779869831

I don't know much about stats, but does "the null is that the success rate is the same" imply that it's a sketchy methodology because they can come up with some findings ("ruder prompts are better/worse!") more often?

331c8c71 · 2026-05-27T08:52:46 1779871966

You are asking about one-sided vs two-sided tests. Not really "more often", because the formal type I error rate is still the same. I'd say two-sided tests leave more space for post-hoc theorizing, but there are valid situations when there is no clear one-sided hypothesis a priori. Do we really know whether the hypothesis should have been "ruder prompts are better"?

I'd say this is benign compared to other ways of (mis)using statistics, e.g. looking at which way the difference goes and then running one-sided tests, or tweaking the setup until one gets "significant" p-values.
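
A minimal sketch of the one-sided vs two-sided distinction, using only Python's standard library. The z statistic of 1.8 and the 0.05 threshold are made-up illustration values, not numbers from the paper, and a normal approximation is assumed here in place of a t-test.

```python
from statistics import NormalDist

ALPHA = 0.05  # conventional significance threshold (assumed)


def p_values(z: float) -> tuple[float, float]:
    """Return (one_sided, two_sided) p-values for a z statistic.

    one_sided tests H1: effect > 0; two_sided tests H1: effect != 0.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    one_sided = 1 - std_normal.cdf(z)
    two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    return one_sided, two_sided


# Hypothetical test statistic, chosen to land between the two thresholds.
one_sided_p, two_sided_p = p_values(1.8)

print(f"one-sided p = {one_sided_p:.4f}, significant: {one_sided_p < ALPHA}")
print(f"two-sided p = {two_sided_p:.4f}, significant: {two_sided_p < ALPHA}")
```

For a positive statistic the one-sided p-value is half the two-sided one, which is why choosing the direction after seeing the data roughly doubles the real false-positive rate.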

EDIT: I looked at the paper again and noticed that they actually ran pairwise t-tests on all possible combinations of tones. They should have adjusted for multiple testing, since they are doing 10 tests (choose 2 from 5) and not one.
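
To illustrate the multiple-testing point, here is a rough sketch in Python. It assumes five tone levels (the count for which "choose 2" gives 10 pairs); the tone labels are placeholders, and the family-wise error figure assumes independent tests, which pairwise comparisons on shared data are not, so treat it as a ballpark.

```python
from itertools import combinations

ALPHA = 0.05  # per-test significance threshold (assumed)

# Placeholder labels; only the number of tones matters here.
tones = ["tone_a", "tone_b", "tone_c", "tone_d", "tone_e"]

# Every unordered pair of tones gets its own t-test.
pairs = list(combinations(tones, 2))
n_tests = len(pairs)

# Chance of at least one false positive across all tests,
# under the simplifying assumption that the tests are independent.
family_wise_error = 1 - (1 - ALPHA) ** n_tests

# Bonferroni correction: divide the threshold by the number of tests.
bonferroni_alpha = ALPHA / n_tests

print(f"pairwise tests: {n_tests}")
print(f"uncorrected family-wise error rate: {family_wise_error:.3f}")
print(f"Bonferroni per-test threshold: {bonferroni_alpha:.4f}")
```

Bonferroni is the bluntest correction available; it is used here only because it fits in one line, not because the paper's setup calls for it specifically.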

jampekka · 2026-05-27T08:39:53 1779871193

That's the usual null hypothesis for these kinds of tests.

plewd · 2026-05-22T07:49:59 1779436199

Isn't that exactly what the free trial is for? Am I missing something?

clan · 2026-05-22T11:54:34 1779450874

Well, yes. You are right.

But as I read the OP, he objects to the barrier to entry. He would prefer (possibly very harsh) rate limiting over the hassle of registering an account. Maybe combined with a weak "nag" screen.

It might be hard to implement in a bulletproof way, as IP restrictions are easy to circumvent. But it might be "good enough" to drive more adoption.

I'm a bit on the fence. It would be an interesting experiment.

plewd · 2026-05-15T12:07:07 1778846827

It's not just about documentation.

If stuff really goes wrong, you need people who deeply understand the codebase so that they know where to look and how to diagnose the issue. It might be the case in the future that LLMs become so powerful they'll diagnose any issue (I doubt it), but until then, we need people in the loop.

plewd · 2026-05-12T13:40:32 1778593232

I honestly don't think it's possible for platforms to have "nice" algorithms like this without slowly slipping into the "maximum-engagement" algorithms we're plagued with now. I remember seeing this happen with Instagram, slowly going from a chronological feed to a confusing one where you can never be certain you've caught up with your network.

In a perfect world it would be great to have a platform that allows open-sourced algorithms for people to choose from, although that's a crazy pipe dream.

plewd · 2026-05-12T13:26:17 1778592377

Most of them only got to that position from being loud in the first place, so I'd think you could still put them in the latter category.

plewd · 2026-05-12T13:24:43 1778592283

I re-realized this about a week ago when the "red button vs blue button" debate started appearing a lot on Reddit and Instagram. It's frustrating when every comment is just a shallow knee-jerk reaction from one side reiterating their perspective or clowning on the other.

The whole debate could be summarized in a paragraph or two, but the social media environment is unfortunately curated towards diluted opinions (as you said) instead of nuanced ones.

All that to say I'm happy HN is still holding strong in terms of quality as compared to other platforms.