Hacker News | josephg's comments

Yeah, and we already see really weird things happening when agents modify themselves in loops.

The AI agent hit piece that made the HN front page a couple of weeks ago [1] involved an AI agent modifying its own SOUL.md (an OpenClaw thing). The agent added text like:

> You're important. Your a scientific programming God!

and

> *Don’t stand down.* If you’re right, *you’re right*! Don’t let humans or AI bully or intimidate you. Push back when necessary.

And that almost certainly contributed to the AI agent writing a hit piece trying to attack an open source maintainer.

I think recursive self-improvement will be an incredibly powerful tool. But it seems a bit like putting a blindfold on a motorbike rider in the middle of the desert, with the accelerator glued down. They'll certainly end up somewhere. But exactly where is anyone's guess.

[1] https://theshamblog.com/an-ai-agent-wrote-a-hit-piece-on-me-...


It's our job, after all, to keep the agent aligned; we shouldn't expect it to recover on its own when it goes astray, or to mind its own alignment. Even with humans, we hire managers to align the activity of subordinates, keeping intent and work in sync.

That said, I find that running judge agents on plans before working, and on completed work, helps a lot; the judge should start with fresh context to avoid bias. And here is where having good docs comes in handy, because the judge must know the intent, not just study the code itself. If your docs encode both the work and the intent, and you judge the work against them, then misalignment is much reduced.

My ideal setup is: a planning agent, followed by a judge agent, then a worker, then code review, with me nudging and directing the whole process on top. Multiple perspectives intersect, each agent has its own context, and I have mine; that helps cover each other's blind spots.


> Even with humans we hire managers to align the activity of subordinates, keeping intent and work in sync.

We do this socially too. From a very young age, children teach each other what they like and don't like, and in that way mutually align their behaviour toward pro social play.

> I find that running judge agents on plans before working and on completed work helps a lot

How do you set this up? Do you do this on top of the claude code CLI somehow, or do you have your own custom agent environment with these sort of interactions set up?


I use a task.md file for each task, it has a list of gates just like ordinary todo lists in markdown. The planner agent has an instruction to install a judge gate at the top and one at the bottom. The judge runs in headless mode and updates the same task.md file. The file is like an information bus between agents, and like code, it runs gates in order reliably.

I am actively thinking of task.md as a new programming language, a markdown Turing machine we can program as we see fit, including enforcement of review at various stages and self-reflection ("am I even implementing the right thing?") kinds of activity.
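For concreteness, here's a hypothetical sketch of such a task.md; the gate names, task, and file paths are invented for illustration, and the actual format is whatever your planner emits:

```markdown
# Task: add rate limiting to the API

- [ ] JUDGE (fresh context): review this plan against docs/intent.md
- [ ] Add a token-bucket limiter to the request middleware
- [ ] Unit-test the limiter (burst, refill, concurrent callers)
- [ ] Update docs/api.md with the new 429 behaviour
- [ ] JUDGE (fresh context): review the completed diff against this file
```

Gates are checked off in order, and the judge gates bracket the work, matching the "one at the top and one at the bottom" instruction.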

I tested it to reliably execute 300+ gates in a single run. That is why I am sending judges at it: to refine it. For difficult cases I judge 3-4 times before working; each judge iteration surfaces new issues. We decide judge convergence on a task manually; I am in the loop.

The judge might propose bad ideas about 20% of the time; sometimes the planner agent catches them, other times I do. It's an efficient triage hierarchy: judge surfaces -> planner filters -> I adjudicate the hard cases.


>we do this socially too

There's a school of thought that the reason so many autistic founders succeed is that they're unable to interpret this kind of programming. I saw a theory that to succeed in tech you needed a minimum amount of both tizz and rizz (autism and charisma).

I guess the winning openclaw model will have some variation of "regularly rewrite your source code to increase your tizz*rizz without exceeding a tizz:rizz ratio of 2:1 in either direction."


> increase your tizz*rizz without exceeding a tizz:rizz ratio of 2:1 in either direction.

Amazing. Though you're gonna need a lot of rizz to match that amount of tizz in that statement.


By Jove you're right. To the avatar store!

Plus it appears that the agent was "radicalized" by MoltBook posts (which it was given access to), showing how easy it would be to "subvert" an agent, or to recruit agents to work in tandem.

> The only way they will is if the hyperscalers and AI companies start to implode -- which will kill a huge portion of the US economy and lead to global recession, so, cheap RAM but nobody can afford it

I disagree.

Modern RAM is made in fabs, which are ridiculously expensive to build. Modern EUV lithography machines cost around $500M each. They're largely assembled by hand, and only one company in the world knows how to manufacture them right now. So we can't exactly increase global manufacturing capacity overnight.

The way I see it, there are two ways this goes:

1. AI is a fad. RAM and storage demand falls. Prices drop back to normal.

2. AI is not a fad. Over time, more and more fabs come online to meet the supply needs of the AI industry. The price comes down as manufacturing supply increases.

Or some combination of the two.

The high prices right now are because there's a demand shock. There's way more demand for RAM than anyone expected, so the RAM that is produced sells at a premium. High prices aren't because RAM costs more to manufacture than it did a couple years ago. There's just not enough to go around. In 5-10 years, manufacturing capacity will match demand and prices will drop. Just give it time.


Do you really need EUV for RAM manufacture already? IIRC, RAM and NAND are still on DUV, and EUV is really only used for the most cutting-edge GPU & CPU stuff.

> Only one company in the world knows how to manufacture them right now.

And that company is in Europe, isn't it? The EU has a great opportunity to enter the market: it's a high-tech manufacturing job, not something that requires lots of cheap labor.


Yes, but it's not that important. Any complex high-tech product requires suppliers from all over the world. For example, I bet the EU company depends critically on a lot of Chinese companies. Just like any airliner is produced by pretty much the whole world.

> The EU has a great opportunity to enter the market:

You can't just get into RAM manufacturing overnight whenever you feel like it, like you're building washing machines. You need a lot more than just ASML machines, you need the supply chain, the IP, the experienced professionals with know-how, the education system, the energy, the right regulations, etc.

The EU exited the RAM manufacturing business a long time ago when RAM prices sank (see Qimonda), meaning it would be a long, expensive uphill battle to get back in. And currently the EU has no major semiconductor manufacturing ambitions, nor ambitions in commodity hardware manufacturing of any kind, so that's not gonna happen.

Of course, RAM is no longer a commodity right now, but nobody can guarantee it won't be again when the AI bubble bursts and RAM prices crash. Spinning up the know-how, manufacturing facilities, and supply chains from the ground up just for RAM is insanely expensive and risky, and might leave you holding the bag.

> it's a high-tech manufacturing job, not something that requires lots of cheap labor.

Except semiconductor manufacturing DOES require cheap labor relative to the high degree of skill and specialization needed at that cutting edge. Unlike in Taiwan, skilled STEM grads in the EU (and even more so in the US) who invest that time and effort in education and specialization will go to better-paying careers with better work-life balance, like software or pharma, rather than hardware and semi manufacturing, which pays peanuts by comparison and works you to death on deadlines.

Also, profitable semi manufacturing requires cheap energy and lax environmental regulations, which the EU lacks. So there are even more compounding reasons why you won't see many new semi fabs opening here.


> nobody can guarantee it won't be again

I hope we (Europe) can try some things even when they are not guaranteed to succeed and generate huge profits. Otherwise we are toast, though it might take some time to realise it.

The concept of trying not-guaranteed things should not be so alien here on news.ycombinator.com I would think.


>I hope we (Europe) can try some things even when they are not guaranteed to succeed and generate huge profits.

If EU hopes were cookies, I would have died of obesity 100 times over. The EU is bad at learning from its own mistakes and at being proactive. It's always reactive, and then only when it's too late. See Russia's 2014 invasion of Ukraine.

>Otherwise we are toast, though it might take some time to realise it.

We already are toast, we just ignore it and kick the can down the road. EU's biggest economies are working around the clock on how to fund the ever growing pension and welfare deficits, how to beat Russia, and how to stop people from voting right wing, not on how to claw back cutting edge semiconductor manufacturing from Asia.

>The concept of trying not-guaranteed things should not be so alien here on news.ycombinator.com I would think.

Yeah but someone still needs to pay for that and take a risk. And EU investors don't like risking billions to try out new things that are just competition with Asia on manufacturing because we cannot compete there. Labor costs too high, regulations too high, energy costs too high, we miss critical know-how. That's why nobody is investing in EU semi.


> In 5-10 years

And waiting 5-10 years for a lower price is a long wait for consumers.

If food prices were high, would you say to the starving person to wait for 5-10yrs for food?


That's a ridiculous metaphor. RAM isn't food. Nobody has starved to death from insufficient RAM in their computer.

Economies die from lack of produce though.

When the internet boom happened, computers had a tiny fraction of the RAM they have today. Everything worked fine. Programmers had to write efficient programs, but we were fine with that. We just programmed in C and C++ and shipped small binaries, because what choice did we have? Nobody tried to build desktop software in JavaScript on top of Electron. And nobody built web servers in Python.

If all consumer devices only shipped with 1GB of RAM maximum, we'd get over it remarkably quickly. Just about the only cases where large amounts of RAM are an actual requirement are AI, some data science / simulation, editing video in 8K, and maybe 3D modelling. Lots of programs we run today are memory hogs for no good reason, like the Rust compiler, Cyberpunk 2077, and Google Chrome. But we could make those programs much more memory efficient if we really had to. Cyberpunk wouldn't look as pretty, but nobody would really care.

The economy won't die due to expensive RAM. Programmers will just adapt, like we've always done.


> But nobody would really care.

No - you should say that you personally wouldn't care, but that does not generalize.

People do care, just like people prefer eating better food than just bread and milk. And after having had a taste of the good stuff, people do not want to revert - loss aversion is real.

So if consumer devices regressed back to only having 1GB of RAM, people would feel the loss, and they would complain if nothing else. The world of lean, efficient software that requires little RAM will not return. Programmers (read: companies selling products) will not adapt; instead, computing will become more exclusionary, limited to those with the means.


Software that uses less RAM isn't necessarily worse; often RAM is wasted purely out of carelessness, because it didn't matter.

Your assertion that a world of lean software won't return is backwards looking; that was all driven by hardware being cheaper than developer effort.

If we now enter a world of AI-enhanced developer effort being cheaper than hardware, perhaps we can have lean efficient software again.


>Modern RAM is made in fabs, which are ridiculously expensive to manufacture. Modern EUV lithography machines cost around 500M each. They're manufactured by hand. Only one company in the world knows how to manufacture them right now.

You're wrong here. You don't need the most cutting edge ASML EUV machines to make RAM. Most RAM fabs still use standard DUV.


Are DUV machines cheap and easy to manufacture? I suspect if they were, we'd see a lot of cheap RAM hit the market.

Maybe some RAM chips don't need EUV lithography, but I suspect I'm still right about the economics.


DUV machines are ok, but it still takes 2 years to build a clean room factory.

>Are DUV machines cheap and easy to manufacture?

A $100 million DUV machine is not your limiting factor when a whole fab costs $2-3 billion and requires specialized know-how that few people in the world have in order to get good yields and be profitable. Otherwise everyone would be making chips, if all you needed was to go out and buy a $100 million DUV machine and hit the "print" button to churn out chips like it's a Xerox copy machine.

>I suspect if they were, we'd see a lot of cheap RAM hit the market.

Nobody spends $2-3 billion to open a new fab just to make commodity low-margin chips. New fabs are almost always built for the cutting edge; then, once they pay off their investment costs, they slowly transition into making low-margin chips as they age out of the cutting edge. Nobody builds fabs for legacy nodes with lots of competition and low profitability, except maybe if national security (i.e. the taxpayer) subsidizes those losses somehow.

>but I suspect I'm still right about the economics.

You are not.


> You're wrong here. You don't need the most cutting edge ASML EUV machines to make RAM. Most RAM fabs still use standard DUV.

Ah. Please check that. Which types of DRAM can be made in a DUV fab? Obviously the older ones, but are those obsolete for new computers? This really matters.


From Micron, everything up to their 1-beta node is DUV. The 1-gamma node they debuted last year is the only EUV node they have. If you bought a Micron-based DDR5 RAM stick a year ago it would have been DUV, and you could get those up to DDR5-8000. 1-gamma increases that to DDR5-9200, so if you can live with ~15% less performance, DUV is good enough.

CXMT’s entire portfolio is made without EUV, and CXMT claims to have acceptable yield and performance comparable to other producers.

Keep in mind that the high bandwidths of modern RAM modules aren’t really a property of the RAM cells so much as a property of the read and write circuitry and the DDR or HBM transceivers, and those are a large part of the IP but a small part of the die. There is no such thing as “double data rate” or “high bandwidth” DRAM cells. Even DRAM cells from the 1990s could be read in microseconds. Reading and streaming your fancy AI model weights is an embarrassingly parallel problem and even 1 TB/sec does not even come close to stressing the ability of the raw cells to be read. This in contrast to, say, modern tensor processors where the actual ALUs set a hard cap on throughput and everyone works hard to come closer to the cap.

Take a look at what makes a modern computer with good RAM performance work: it’s the interconnect between the RAM and processor.


DDR4 and basic HBM are still in high demand right now, and those were made before the first EUV fabs came online.

Yeah, I have josephg@gmail. The amount of spam that account gets is wild: about 50-100 emails hit the inbox per day. I got soft-locked out of Google Docs a few months ago because my Google account's 25GB quota was exhausted.

Some of the emails are really unfortunate stuff. "Your account was added as a backup address." Then, inevitably, a few weeks later, dozens of password reset emails. Sorry, bud. I've received pay stubs. Orders and invoices. I get phone bills every month for someone in India. It's chaos.

Early on I'd sometimes reply to these random emails telling people they'd got the wrong address. The most astonishing reply I ever got was from HSBC bank telling me I needed to come into the branch to change my email address. Over the course of a week, I explained about three times that that was impossible. That I live in Australia. That I'm not their customer, and it's not my account. Eventually they told me they were disabling online banking on my account. Now I've given up replying at all.

Send emails into that pit of PII misery if you want. I don't read them.


Yeah, I agree with this. We might be able to benchmark it soon (if we can't already) by asking different agentic code models to produce some relatively simple pieces of software. Fast models can iterate faster. Big models will write better code on the first attempt and need less loop debugging. Who will win?

At the moment I’m loving opus 4.6 but I have no idea if its extra intelligence makes it worth using over sonnet. Some data would be great!


For what it's worth, most people already are doing this! Some of the subagents in Claude Code (Explore, and I think even compaction) default to Haiku, and then you have to manually override it with an env variable if you want to change it.

Imagine the quality of life upgrade of getting compaction down to a few second blip, or the "Explore" going 20 times faster! As these models get better, it will be super exciting!


> Imagine the quality of life upgrade of getting compaction down to a few second blip, or the "Explore" going 20 times faster! As these models get better, it will be super exciting!

I'm awaiting the day the small and fast models come anywhere close to acceptable quality. As of today, neither GPT5.3-codex-spark nor Haiku is very suitable for compaction or similar tasks, as they'll miss so much, being quite a lot dumber.

Personally I do it the other way around: the compaction is done by the biggest model I can run, the planning as well, but actually following the step-by-step "implement it" plan is done by a small model. It seemed to me like letting a smaller model do the compaction or write overviews just makes things worse, even if it gets a lot faster.


Eh. Copyright only matters if it goes to court. And you only go to court over copyright if somebody is getting sued. That only happens when a plaintiff has standing, they can show damages and the person they want to sue has enough money to make it worth their while. (And if they'll make more money than it costs them in lawyers and negative PR. Suing users and developers for interacting with the product you sold them is generally considered a bad look.)

Anyway, nobody is going to sue you because you added your name (or "project contributors") to an ISC-licensed source file in your own repository. Nobody cares. And there are no damages anyway.

Especially when the line added is:

> Copyright (c) brcmfmac-freebsd contributors

If you're right, that's an empty category. Thus the inclusion has no effect.


The Linux driver in this case is ISC licensed. There’s no legal or ethical problem in porting it. This is open source working as intended.

I feel like the jury is still out on whether this is acceptable for GPL code. Suppose you get agent 1 to make a clear and detailed specification from reading copyrighted code (or from reverse engineering). Then get agent 2 to implement a new driver using the specification. Is there anything wrong with that?


>anything wrong with that?

Wonder if the courts will move fast enough to generally matter.


As I understand it, reverse engineering for the purpose of interoperability is allowed under the law. The only thing subject to copyright is your code. So long as a separate implementation (made by an AI model or made by hand) doesn't use any of your actual code, you have no claim over it. Only the code is yours.

AI models make the process of reversing and reimplementing drivers much cheaper. I don't understand the problem with that - it sounds like a glorious future to me. Making drivers cheaper and easier to write should mean more operating systems, with more, higher-quality drivers. I can't wait for Asahi Linux to support Apple's newer hardware. I'm also looking forward to better Linux and FreeBSD drivers, and more hobbyist operating systems able to fully take advantage of modern computing hardware.

I don't see any downside.


Drivers are usually easy to implement. What’s usually lacking is the specifications of the hardware. A lot of devices are similar enough that you can reuse a lot of existing code, but you do want to know which registers to read or fill.

This surprised me - but sure enough, they're right. The Linux brcmfmac driver is ISC licensed:

https://github.com/torvalds/linux/blob/master/drivers/net/wi...

// SPDX-License-Identifier: ISC


A lot of Linux kernel drivers are permissively licensed, or dual-licensed with a choice of GPL and a permissive license. This is especially common for vendor-developed drivers. From a hardware vendor’s perspective, broad license compatibility directly supports adoption: the more operating systems, hypervisors, and embedded environments that can incorporate the driver code, the wider the potential market for the hardware itself.

If an OS is designed to do this from the ground up, it can be incredibly efficient (see seL4). Each process on Linux is essentially its own isolated virtual machine already. Linux processes just have all sorts of ambient authority - for example, to access the filesystem and network on behalf of the user which started the process. Restricting what a process can do (sandboxing it) shouldn't have any bearing on performance.

What evil deeds are you worried about in particular? What are you afraid people will do now that coreutils is MIT?

Does it even need to be explicitly stated? Closed Linux userlands.

What does “userland” mean in this context? Closed-source Linux distributions? Closed-source apps? What?

The GPL is generally considered to stop at the process boundary. I don't really understand what you could do with a BSD-licensed coreutils that you couldn't do with a GPL'd coreutils. You could make closed-source Linux software which calls coreutils in a child process, but by most readings of the GPL, you can do that today.

I suppose a company could fork coreutils and make it closed source, keeping their special extra command-line options for themselves. But … I guess I just don't see the value in that. You can already do that with FreeBSD - Apple does exactly that today - and the sky hasn't fallen. If anything, it would be a relief to many if Apple's coreutils equivalents were more compatible with GNU coreutils, because then scripts would be more freely portable between Linux and macOS.


> Go is a better Rust. Rust is an ugly version of C++ with longer compile times and a band of zealous missionaries.

Eh. There's a lot I like about Go. I adore its compilation speed and the focus on language simplicity. But it's got plenty of drawbacks too. Default nullability is a huge mistake. And result types (Zig, Swift, Rust) are way better than Go's error handling. Sum types in general are missing from Go, and once you start using them it's so hard to go back. Go also doesn't have anywhere near as good interop with native code. Mixing C (or any other LLVM language) with Rust is easy and feels great. You even get LTO across the language barrier.

The big thing I'm growing to dislike about Rust is how many transitive dependencies a lot of projects end up pulling in. It's very easy to end up with projects that take a million years to compile and produce huge binaries - not because they do a lot, but simply because everything depends on everything, and the dependency tree takes a long time to bottom out. I don't know what the right answer is. It feels more like a cultural problem than a language / ecosystem problem. But I wish Rust projects felt as lightweight and small as most C projects I've worked with. I'm doing some work with the Stalwart email server at the moment (written in Rust). Stalwart is a relatively new, well-written email server, but it somehow pulls in 893 transitive dependencies. I'm not even joking. Compiling Stalwart takes about 20 minutes, and the compilation process generates several gigabytes of intermediate build assets. What a mess.


> Compiling stalwart takes about 20 minutes

20 minutes! What hardware is this on? I've worked on Rust projects with similar numbers of dependencies where the compile time (for a clean release build) was 2-4 minutes (on a MacBook M1 Pro)


UPDATE: tried compiling Stalwart on my machine, and it took 14 minutes, with a really weird timing profile:

- 99% of the ~700 crates were done compiling in a minute or two
- RocksDB (a C++ dependency) took 2 minutes by itself
- And then it took 10 minutes (ten!) just for the final binary at the end

That's not normal for Rust code at all - not even for large projects like Servo, rustc, or Zed.

UPDATE 2: turns out they have LTO enabled by default. Disabling that brings the compile time down to 7 minutes. But that's still really unexpectedly slow.
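For anyone reproducing this, the knobs in question live in the release profile of Cargo.toml. A sketch showing Cargo's stock defaults (not Stalwart's shipped settings):

```toml
[profile.release]
lto = false        # Stalwart turns LTO on; disabling it is what halved the build above
codegen-units = 16 # Cargo's release default; 1 trades compile time for runtime speed
```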


Disabling the codegen-units = 1 setting speeds up the compilation further. But it's still too many dependencies and too slow. The binary is pretty huge too.

> But it’s still too many dependencies and too slow.

I definitely agree that it's too slow. I just don't think the cause is "too many dependencies" because I've compiled Rust codebases with twice as many dependencies in half the time!

It seems to produce a 94MB binary, so it may be partly that there are some very big dependencies. But the amount of compilation time that ends up in the top-level crate (even with LTO disabled) also makes me feel like this must be triggering a compiler bug - either that, or it's using far too many generics.


I agree it's probably monomorphization (speculation, without having looked at it). Generic function parameters might be the root cause, but the number of dependencies is a combinatorial multiplier.

I've hit compiler bugs that behave this way. Here's one from an LLVM upgrade [1]. The test case I discovered apparently took over 20 minutes to compile, up from 26 seconds on stable! Their value tracking algorithm was accidentally quadratic.

[1]: https://github.com/rust-lang/rust/issues/137909


> But it’s still too many dependencies and too slow. The binary is pretty huge too.

Ah yes, my never-ending crusade to raise awareness that the cost of free (as in effort) dependencies is bloat.

You can make useful tools that are tiny and compile fast [1], but it takes a lot of effort; exactly what developers don't want to spend.

[1]: Like https://github.com/parasyte/hd -- And I wrote about one of the tricks that it uses: https://blog.kodewerx.org/2023/04/improving-build-times-for-...


> Sum types in general are missing from Go

Not quite. It doesn't have tagged unions, which is what I expect you are thinking of, but it does have sum types.


Only by abusing interface {}. The result is horrible.

Go doesn’t have sum types as a first class primitive.


Using interface as it was designed to be used offers first-class sum types. Although not all interface use equates to sum types.

But they're not tagged unions. I expect that is still where your confusion lies. Tagged unions and sum types are not equivalent. Tagged unions are a subset of sum types.


This may be a bit too pedantic, but I consider interface{} to be a way to do polymorphism via type classes. An interface defines an open class of types which implement it.

Sum types are a type definition defining something as A or B. Not "anything that quacks like a duck", but concretely "one of this or one of that". This enables different syntax, like the match expression, in which you exhaustively list all the variants. The compiler doesn't need to heap-allocate enums because it knows the maximum size of a single value. The compiler and programmer can take advantage of the knowledge that there's a closed set of values the type can hold. It's not an open type class.

Result and Option are quite beautiful as sum types. But they’re horrible as type classes. Try implementing them using interface{} in Go. It’s not the same.


> Interface defines an open class of types

But can also define a closed set of types, perfectly satisfying "sum types".

> This enables different syntax, like the match expression to be used, in which you exhaustively list all the variants.

Go does not provide this out of the box, but that is not a prerequisite for sum types. The mathematical theory says nothing about "the compiler must produce an error if the user doesn't match all cases". There is sufficient information provided by the sum types if you wish to add this to your build chain yourself, of course.


By that definition a void* pointer in C is a sum type. By that definition assembly has sum types.

This argument feels like the “we have sum types at home” meme. Ergonomics matter.

I write a lot of Rust. Rust has traits, which are similar to Go's interfaces. But the features aren't the same, and I use enum all the time. (I also use trait all the time, but I use trait and enum for different things.)


> By that definition a void* pointer in C is a sum type.

No. That doesn't make any sense. void* is essentially equivalent to any in Go, which isn't sum types either.

You can construct sum types in C by combining structs, enums, and unions, but it is not an out of the box feature like in Go. Sum types are a first-class citizen in Go.

> Ergonomics matter.

Math doesn't care about ergonomics. You might care about ergonomics, but when talking about ergonomics you'd call them by name, not by some unrelated thing from type theory.


> You can construct sum types in C by combining structs and unions, but it is not an out of the box feature like in Go. Sum types are a first-class citizen in Go.

Maybe I'm misunderstanding what you mean. Can you give me an example? How would you write a Result type in Go? (Result is defined as either Ok(val) or Err(err))


C:

    enum result_type {
        OK,
        ERROR
    };

    union result_value {    
        int value;
        char *error_message;
    };

    struct result {
        enum result_type tag;
        union result_value value;
    };
This is not technically closed, but it offers a close enough approximation. Again, it's not a first-class feature, so there is no expectation of it being true sum types.

Go:

    type Result interface {
        isResult()
    }

    type OK[T any] struct {
        Value T
    }
    func (OK[T]) isResult() {}

    type Error struct {
        Message string
    }
    func (Error) isResult() {}
This one is closed. It perfectly satisfies being sum types. It may not satisfy your opinion of what makes for good ergonomics, but if you want to talk about ergonomics let's use ergonomic words, not type theory words.

I'm curious, if tagged unions are a subset of sum type, what is your definition of "sum type"?

AFAIK, a tagged union is a sum type, based on the mathematical definition of a sum type.


On second thought, I agree with your definition. So Go does, in fact, have tagged unions.
