> Even this evidence of woodworking is largely unremarkable .... this find is most notable for its preservation.
This somewhat contradicts the subheading, no?
> The finding, along with the discovery of a 500,000-year-old hammer made of bone, indicates that our human ancestors were making tools even earlier than archaeologists thought.
That subheading is complete nonsense; I can't think of a single charitable reading of it that makes any sense. Archaeologists have known that our ancestors were making tools over a million years ago ever since the Acheulean industry was conclusively dated in the 1850s. It took half a century for archaeologists to figure that out after William Smith invented stratigraphy. Scientists didn't even know what an isotope was yet.
The original paper's abstract is much more specific (ignore the Significance section, which is more editorializing):
> Here, we present the earliest handheld wooden tools, identified from secure contexts at the site of Marathousa 1, Greece, dated to ca. 430 ka (MIS12). [1]
Which is true. Before this, the oldest handheld wooden tool with a secure context [2] was a thrusting spear from Germany dated to ~400 kya [3]. The oldest evidence of woodworking is at least 1.5 million years old, but no wooden tools survive from that period.
[2] This is a very important term of art in archaeology. It means that the artefact was excavated by a qualified team of archaeologists who painstakingly recorded every little detail of the excavation, so that the dating can be validated using several different methods (carbon dating only works back to about 60k years).
Even ignoring determinism, with traditional source code you have a durable, human-readable blueprint of what the software is meant to do that other humans can understand and tweak. There's no analogy in the case of "don't read the code" LLM usage. No artifacts exist that humans can read or verify to understand what the software is supposed to be doing.
Yeah, there is: it's called "documentation" and "requirements". And it's not like you can't go read the code if you want to understand how it works; it's just not necessary to do so while in the process of getting to working software. I truly do not understand why so many people are hung up on this "I need to understand every single line of code in my program" bs I keep reading here. Do you also disassemble every library you use and understand it? No, you just use it because it's faster that way.
What I mean is an artifact that is the starting point for generating the software. Compiled binaries can be completely thrown away whenever because you know you have a blueprint (the source code) that can reliably reproduce it.
Documentation & requirements _could_ work this way if they served as input to the LLMs that would then go and create the source code from scratch. I don't think many people are using LLMs this way, but I think this is an interesting idea. Maybe soon we'll have a new generation of "LLM-facing programming languages" that are even higher level software blueprints that will be fed to LLMs to generate code.
TDD is also a potential answer here? You can imagine a world where humans just write test suites and LLMs fill out the code to get it to pass. I'm curious if people are using LLMs this way, but from what I can tell a lot of people use them for writing their tests as well.
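To make that concrete, here is a minimal sketch of what that workflow could look like. The module `duration` and the function `parse_duration` are hypothetical: the human writes only the test suite below, and the LLM is asked to produce an implementation that makes it pass.

```python
# Hypothetical "tests as the blueprint" workflow: a human writes only this
# suite, then an LLM is asked to generate duration.py (and parse_duration)
# until every test passes. All names here are made up for illustration.
import unittest

from duration import parse_duration  # module the LLM would be asked to produce


class TestParseDuration(unittest.TestCase):
    def test_plain_seconds(self):
        self.assertEqual(parse_duration("90s"), 90)

    def test_minutes_and_seconds(self):
        self.assertEqual(parse_duration("2m30s"), 150)

    def test_rejects_garbage(self):
        with self.assertRaises(ValueError):
            parse_duration("soon")


if __name__ == "__main__":
    unittest.main()
```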
> And it's not like you can't go read the code if you want to understand how it works
In theory, sure, but that's true of assembly in theory as well. The assembly of most modern software is de facto unreadable, and LLM-generated source code will start going that way too the more people become okay with not reading it. (But again, the difference is that we're not necessarily replacing it with some higher-level blueprint that humans manage; we're just relying on the LLMs to be able to manage it completely.)
> I truly do not understand why so many people are hung up on this "I need to understand every single line of code in my program" bs I keep reading here. Do you also disassemble every library you use and understand it? No, you just use it because it's faster that way.
I think at the end of the day this is just an empirical question: are LLMs good enough to manage complex software "on their own", without a human necessarily being able to inspect, validate, or help debug it? If the answer is yes, maybe this is fine, but based on my experiences with LLMs so far I am not convinced that this is going to be true any time soon.
Not saying it's right, but boy do I have stories about the code used in <insert any medical profession> healthcare applications. Not sure how "vibecoded" code is any worse.
Honestly, even if this wasn't vibe-coded, I'm still a bit surprised that individual radiologists are able to bring their own software to work for things that can have such a large effect on patient outcomes.
Of course it's allowed. It's just a kind of text editor, but with support for speech-to-text and structured reports (e.g. when reporting on the spine, if I say "l3 bd" it automatically inserts a description of a bulging disc in the correct place in the report). I then copy-paste it into the RIS, so there's absolutely nothing wrong or illegal about that.
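Purely as an illustration of what that kind of trigger-phrase expansion might look like under the hood (the triggers and canned sentences below are invented, not the actual tool's behaviour):

```python
# Toy sketch of trigger-phrase expansion in a dictation-driven report editor.
# The trigger phrases and canned sentences are invented for illustration.
TEMPLATES = {
    "l3 bd": "At L3-L4 there is a disc bulge without significant canal stenosis.",
    "l5 bd": "At L5-S1 there is a disc bulge without significant canal stenosis.",
}

def expand(dictated: str) -> str:
    """Replace any known trigger phrase in the dictated text with its canned sentence."""
    for trigger, sentence in TEMPLATES.items():
        dictated = dictated.replace(trigger, sentence)
    return dictated

print(expand("Lumbar spine: l3 bd. Vertebral body heights preserved."))
```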
> basically everyone knew the internet would be revolutionary long before 1995. Being able to talk to people halfway across the world on a BBS? Sending a message to your family on the other side of the country and them receiving it instantly? Yeah, it was pretty obvious this was transformative.
That sounds pretty similar to long-distance phone calls? (Which I'm sure were transformative in their own way, but not on nearly the same scale as the internet.)
Do we actually know how transformative the general population of 1995 thought the internet would or wouldn't be?
In 1995 in France we already had the Minitel (really, a lot of people had one) and it was pretty incredible, but we were longing for something prettier, cheaper, snappier, and more point-to-point (like chat apps or email).
As soon as the internet arrived (a bit late for us, I'd say 1999 maybe, due to the Minitel's "good enough" nature), it just became instantly obvious: everyone wanted it. The general population was raving mad to get an email address. I never heard anyone criticize the internet the way I criticize the fake "AI" stuff now.
Regularly trying to use LLMs to debug coding issues has convinced me that we're _nowhere_ close to the kind of AGI some are imagining is right around the corner.
At least Mother Brain will praise your prompt to generate yet another image in the style of Studio Ghibli as proof that your mind is a tour de force in creativity, and only a borderline genius would ask for such a thing.
Sure, but the METR study also showed the trend that t doubles every 7 months, where t ≈ «the duration of human time needed to complete a task such that SOTA AI can complete the same task with 50% success»: https://arxiv.org/pdf/2503.14499
I don't know how long that exponential will continue for, and I have my suspicions that it stops before week-long tasks, but that's the trend-line we're on.
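For concreteness, the arithmetic of that trend under a 7-month doubling looks like this; the 1-hour starting horizon is an assumption for illustration, not a figure from the paper.

```python
# Back-of-the-envelope extrapolation of the claimed trend: the 50%-success
# task horizon doubles every 7 months. The 1-hour starting point is an
# assumed value for illustration, not a number taken from the paper.
start_hours = 1.0
doubling_months = 7

for months in range(0, 85, 7):
    horizon = start_hours * 2 ** (months / doubling_months)
    print(f"after {months:2d} months: ~{horizon:6.0f} hours (~{horizon / 40:.1f} work weeks)")
```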
Only skimmed the paper, but I'm not sure how to think about "length of task" as a metric here.
The cases I'm thinking about are things that could be solved in a few minutes by someone who knows what the issue is and how to use the tools involved. I spent around two days trying to debug one recent issue. A coworker who was a bit more familiar with the library involved figured it out in an hour or two. But in parallel with that, we also asked the library's author, who immediately identified the issue.
I'm not sure how to fit a problem like that into this "duration of human time needed to complete a task" framework.
This is an excellent example of human "context windows", though, and it could be that the LLM would have solved the easy problem with better context engineering. Despite 1M-token windows, things still start to get progressively worse after 100k tokens. LLMs would overnight be amazingly better with a reliable 1M window.
While I think they're trying to cover that by getting experts to solve problems, it is definitely the case that humans learn much faster than current ML approaches, so "expert in one specific library" != "expert in writing software".
> This could have been prevented by having one person on the team with actual language design experience, who could point this issue out in the design process.
Instead of making a mistake, they could have simply not.
I think it is time for some of those apps to invest in caching. Also, APIs are not only for third-party applications. I can think of a dozen ideas for products built around the Reddit API without going over this limit.
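As a rough sketch of the kind of caching I mean (a simple TTL cache in front of the API; the endpoint, TTL, and headers are illustrative, not Reddit's actual limits):

```python
# Minimal TTL cache in front of a rate-limited HTTP API.
import time
import requests

_cache: dict[str, tuple[float, dict]] = {}
TTL_SECONDS = 300  # serve cached results for 5 minutes

def get_json(url: str) -> dict:
    now = time.time()
    hit = _cache.get(url)
    if hit and now - hit[0] < TTL_SECONDS:
        return hit[1]  # cache hit: no API request consumed
    data = requests.get(url, headers={"User-Agent": "demo-client/0.1"}).json()
    _cache[url] = (now, data)
    return data

# Repeated calls within 5 minutes hit the cache instead of the API.
posts = get_json("https://www.reddit.com/r/programming/hot.json")
```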