More

tyushk · 2026-02-20T23:48:02 1771631282

Data tagging? 20k tok/s is at the point where I'd consider running an LLM on data from a column of a database, and these <=100 token problems provide the least chance of hallucination or stupidity.

tyushk · 2026-02-16T05:03:40 1771218220

Code blocks unreadable if the user's system reports dark mode and dark mode is toggled on for the web page.

Cool writeup. Have you had to do any other weird shenanigans with getting FFI between Rust and Clojure other than needing to use CStrings?

tyushk · 2026-02-15T01:52:31 1771120351

In Rust, wouldn't implementing BitOr for Fn/FnOnce/FnMut violate the orphan rule?

wavemode · 2026-02-15T02:28:59 1771122539

I'm envisioning that in Rust (and Python), the operator overload would be on a class/struct. It would be the macro/decorator (the same one that adds logging) which would turn the function definition into an object that implements Fn.

graemep · 2026-02-15T15:12:47 1771168367

I have done exactly that as an exercise in what you can do with Python: overload |, and a decorator that you can use to on any function to return an instance of a callable class that calls that function and overloads |.

Whether it is a good idea to use it is another matter (it does not feel Pythonic), but it is easy to implement.

carterschonwald · 2026-02-15T14:55:02 1771167302

somehow this counts like model cot.

tyushk · 2026-02-10T17:35:30 1770744930

I don't think local as it stands with browsers will take off simply from the lead time (of downloading the model), but a new web API for LLMs could change that. Some standard API to communicate with the user's preferred model, abstracting over local inference (like what Chrome does with Gemini Nano (?)) and remote inference (LM Studio or calling out to a provider). This way, every site that wants a language model just has to ask the browser for it, and they'd share weights on-disk across sites.

radarsat1 · 2026-02-13T10:40:24 1770979224

It sounds good, but I'm not sure that in practice sites will want to "let go" of control this way, knowing that some random model can be used. Usually sites with chatbots want a lot of control over the model behaviour, and spend a lot of time working on how it answers, be it through context control, guardrails or fine tuning and base model selection. Unless everyone standardizes on a single awesome model that everyone agrees is the best for everything, which I don't see happening any time soon, I think this idea is DOA.

Now I could imagine such an API allowing to request a model from huggingface for example, and caching it long term that way, yes just like LM Studio does. But doing this based on some external resource requesting it, vs you doing it purposefully, has major security implications, not to mention not really getting around the lead time problem you mention whenever a new model is requested.

tyushk · 2026-01-11T18:58:41 1768157921

> "Open source" to me is sharing the input required [...]

I don't disagree with your sentiment, I am also more interested in human-written projects, but I'm curious about how this works. Would a new sorting network not be open source if found by a closed source searching program, like AlphaDev? Would code written with a closed source LSP (ie. Pylance) not be open source even if openly licenced? Would a program written in a closed source language like Mojo then be closed source, no matter what the author licences it under? The line between input and tool seems arbitrary at best, and I don't see what freedoms are being restricted by only releasing the generated code.

OlaProis · 2026-01-13T10:17:58 1768299478

the line is blurry for shure. Code generated by a closed-source compiler (or LSP) is still 'your' code. Maybe the difference is whether humans can reproduce and learn from the process? With traditional code, you can read commit history and understand the author's thinking. With AI-generated code, that context is lost unless explicitly shared. Food for thought.

tyushk · 2026-01-01T02:45:41 1767235541

I don't think your ultimatum holds. Even assuming LLMs are capable of learning beyond their training data, that just lead back to the purpose of practice in education. Even if you provide a full, unambiguous language spec to a model, and the model were capable of intelligently understanding it, should you expect its performance with your new language to match the petabytes of Python "practice" a model comes with?

lovidico · 2026-01-01T03:43:13 1767238993

Further to this, you can trivially observe two further LLM weaknesses: 1. that LLMs are bad at weird syntax even with a complete description. E.g. writing StandardML and similar languages, or any esolangs. 2. Even with lots of training data, LLMs cannot generalise their output to a shape that doesn’t resemble their training. E.g. ask the LLM to write any nontrivial assembler code like an OS bootstrap.

LLMs aren’t a “superior intelligence” because every abstract concept they “learn” is done so emergently. They understand programming concepts within the scope of languages and tasks that easily map back to those things, and due to finite quantisation they can’t generalise those concepts from first principles. I.e. it can map python to programming concepts, but it can’t map programming concepts to an esoteric language with any amount of reliability. Try doing some prompting and this becomes agonisingly apparent!

tyushk · 2025-12-11T16:12:10 1765469530

Would this be similar to how Rust handles async? The compiler creates a state machine representing every await point and in-scope variables at that point. Resuming the function passes that state machine into another function that matches on the state and continues the async function, returning either another state or a final value.

RossBencina · 2025-12-13T11:29:39 1765625379

It's only related in so far as it involves separate storage for the data. I'm thinking of functions that run to completion, not functions that yield and resume, but maybe it's not hard to do coroutines by storing the continuation pointer in the state struct.

tyushk · 2025-12-07T00:56:25 1765068985

> A BBC journalist ran the image through an AI chatbot which identified key spots that may have been manipulated.

The image is likely AI generated in this case, but this does not seem like the best strategy for finding out if an image is AI generated.

1659447091 · 2025-12-07T01:10:12 1765069812

Under the other photos it says A photo taken by a BBC North West Tonight reporter showed the bridge is undamaged and A BBC North West reporter visited the bridge today and confirmed it was undamaged

They may have first ran the photo through an AI, but they also went out to verify. Or ran it after verification to understand it better, maybe

lazystar · 2025-12-07T01:49:32 1765072172

So.. is this where the AI hype train starts to lose steam? One AI hallucinated and caused the incident, and another AI program just wasted everyone's time after it was unable to verify the issue. Sounds like AI was utterly useless to everyone involved.

bigiain · 2025-12-07T03:43:07 1765078987

> One AI hallucinated and caused the incident

I suspect that AI was prompted to create the image, not that this was an incidental "hallucination".

Cynical-me suspects this may have been a trial run by malicious actors experimenting with disrupting critical infrastructure.

permonst · 2025-12-07T05:05:31 1765083931

There is precedent for state actors putting a lot of effort into a hoax like this: https://en.wikipedia.org/wiki/Columbian_Chemicals_Plant_expl...

1659447091 · 2025-12-07T02:12:10 1765073530

> Sounds like AI was utterly useless to everyone involved

Maybe.

Imo, I think the advances in AI and the hype toward generated everything will actually be the current societies digitally-obsessed course-correction back to having a greater emphases on things like theater, live music, conversing with people in-person or even strangers (the horror, I know) simply to connect/consume more meaningfully. It'll level out integrating both instead of being so digitally loop-sided as humans adapt to enjoy both.*

To me, this shows a need for more local journalism that has been decimated by the digital world. By journalism, I mean it in a more traditional sense, not bloggers and podcast (no shade some follow principled, journalistic integrity -- as some national "traditional" one don't). Local journalism is usually held to account by the community, and even though the worldwide BBC site has this story, it was the local reporters they had that were able to verify. If these AI stories/events accelerate a return to local reporting with a worldwide audience, then all the better.

* I try to be a realist, but when I err, it tends to be on the optimist side

nicoburns · 2025-12-07T03:11:04 1765077064

The tech giants sucking up all the ad revenue is what killed local journalism. Unless you can find a solution to that problem (or an alternstove fundong model), it's not coming back.

DrewADesign · 2025-12-07T04:47:48 1765082868

But just think of all the people that didn’t have to receive a paycheck because of all this efficiency!

It’s really incredible how the supposedly unassailable judgement of mass consumer preference consistently leads our society to produce worse shit so we can have more or it, and rewards the chief enshittifiers with mega yachts.

kiba · 2025-12-07T05:16:06 1765084566

They have powerful untaxed monopolies in excess of the economic value tech companies themselves generate.

At some point, the value of their services come from the people who use their sites.

ceejayoz · 2025-12-07T01:58:19 1765072699

> Sounds like AI was utterly useless to everyone involved.

Not the hoaxer!

skissane · 2025-12-07T03:13:33 1765077213

Someone I know is a high school English teacher (being vague because I don’t want to cause them trouble or embarrassment). They told me they were asking ChatGPT to tell them whether their students’ creative writing assignments were AI-generated or not-I pointed out that LLMs such as ChatGPT have poor reliability at this; classifier models trained specifically for this task perform somewhat better, yet also have their limitations. In any event, if the student has access to whatever model the teacher is using to test for AI-generation (or even comparable models), they can always respond adversarially by tinkering with an AI-generated story until it is no longer classified as AI-generated

frenchtoast8 · 2025-12-07T04:18:11 1765081091

A New York lawyer used ChatGPT to write a filing with references to fake cases. After a human told him they were hallucinated, he asked ChatGPT if that was true (which said they were real cases). He then screenshotted that answer and submitted it to the judge with the explanation "ChatGPT ... assured the reliability of its content." https://www.courtlistener.com/docket/63107798/54/mata-v-avia... (pages 19, 41-43)

henry2023 · 2025-12-07T05:16:52 1765084612

I hope he was disbarred.

FridgeSeal · 2025-12-07T05:30:06 1765085406

He was probably offered a role at some ai obsessed firm because of his “ai-native workflow”.

IAmBroom · 2025-12-08T20:39:35 1765226375

I'm just worried he was tapped for a position in the current administration.

euroderf · 2025-12-07T06:47:20 1765090040

Or sent to court-ordered LLM Awareness classes.

techjamie · 2025-12-07T03:21:58 1765077718

Reminds me of a Reddit story that made the rounds about a professor asking ChatGPT if it wrote papers, to which it frequently responded afirmatively. He sent an angry email about it, and a student responded by showing a response from ChatGPT claiming it wrote his email.

gblargg · 2025-12-07T04:09:41 1765080581

> student responded by showing a response from ChatGPT claiming it wrote his email

Which is actually fine. Students need to do their own homework. A teacher can delegate writing emails.

recursive · 2025-12-07T04:27:15 1765081635

But if he didn't delegate, and it said he did, that would suggest that the methodology doesn't really work.

arcanemachiner · 2025-12-07T04:16:44 1765081004

I believe you just got whooshed.

gblargg · 2025-12-07T06:38:23 1765089503

Yes, I missed the student using the teacher's trust in those tools to make them even more angry and neuter their angry email that they (probably) actually wrote themselves. Well-played.

MengerSponge · 2025-12-07T04:49:50 1765082990

A person arguing in favor of LLM use failed to comprehend the context or argument? Unpossible!

gblargg · 2025-12-07T13:13:44 1765113224

I realize you might have failed to comprehend the level of my argument. It wasn't even about LLMs in particular, rather having someone/something else do your work for you. I read it as the student criticizing the teacher for not writing his own emails, since the teacher criticizes the students for not writing their own classwork. Whether it's an LLM or them hiring someone else to do the writing, this is what my rebuttal applied to. I saw what I thought was flawed reasoning and wanted to correct it. I hope it's clear why a student using an LLM (or another person) to write classwork is far more than a quality issue, whereas someone not being tested/graded using an LLM to prepare written material is "merely" a quality issue (and the personal choice to atrophy their mental fitness).

gblargg · 2025-12-07T06:37:03 1765089423

I don't think I was arguing for LLMs. I wish nobody used them. But the argument against a student using it for assignments is significantly different than that against people in general using them. It's similar to using a calculator or asking someone else for the answer: fine normally but not if the goal is to demonstrate that you learned/know something.

I admit I missed the joke. I read it as the usual "you hypocrite teacher, you don't want us using tools but you use them" argument I see. There's no need to be condescending towards me for that. I see now that the "joke" was about the unreliability of AI checkers and making the teacher really angry by suggesting that their impassioned email wasn't even their writing, bolstered by their insistence that checkers are reliable.

IAmBroom · 2025-12-08T20:41:30 1765226490

Two posts from you addressing a one-line reply? May be time to put down the coffee and take a drag from the mood-altering-substance of your preference.

gblargg · 2025-12-07T06:42:01 1765089721

Apologies to everyone I upset by this comment. It was just an innocent mis-reading of the joke. Lesson learned.

weird-eye-issue · 2025-12-07T04:48:42 1765082922

You missed the entire point lol

gblargg · 2025-12-07T06:39:31 1765089571

Yeah, I'm really sorry. I didn't realize it would upset so many people.

ikr678 · 2025-12-07T05:41:49 1765086109

Students (and some of my coworkers) are now learning new content by reading AI generated text. Of course when tested on this, they are going to respond in the style of AI.

jameslk · 2025-12-07T03:55:14 1765079714

ChatGPT: This looks like AI. I can tell from some of the pixels and from seeing quite a bit of training data in my time.

informal007 · 2025-12-07T02:45:50 1765075550

This is the fast way they can try, but it shouldn't be the most trustworthy way and shouldn't be in report.

maxlin · 2025-12-07T03:29:10 1765078150

Yeah that hardly talks of the "journalist" being good at their job. At worst they asked a biased question like "has this photo been AI generated and if then how" or worse.

People tend to think that AI is like a specific kind of human which knows other AI things better. But we should expect better from people that do writing as their job.

vkou · 2025-12-07T01:14:05 1765070045

It's not, but when you have 30 minutes to ship a story...

Nasrudith · 2025-12-07T20:07:47 1765138067

Yeah, it is frankly just plain bad epistemology to expect an AI chatbot to have answers on a matter such as this. Like trying to get this week's lotto numbers by seeking a reading in bible passages and verses. There is no way that the information was encoded within in there as it would violate causality. At best you'd have coincidental collisions only.

Davidzheng · 2025-12-07T03:29:19 1765078159

If it's nano banana you can give it to Gemini bc it has artifacts

RestartKernel · 2025-12-07T05:33:13 1765085593

All these tool integrations are making it increasingly difficult to explain to non-tech people what these chatbots are capable of. Even more so as multi-modality improves (at some point image generation went from a distinct tool to arguably an inherent part the the models).

Blackthorn · 2025-12-07T01:42:32 1765071752

Yeah, talk about begging the question. Yikes.

hexbin010 · 2025-12-07T07:26:22 1765092382

Do you not think even BBC "journalists" are suffering from immense pressures to use AI for efficiency? It's everywhere

tyushk · 2025-11-15T02:48:38 1763174918

If it really is fully autonomous, that first video is insane. I struggle to put those little tags into the slot in the box sometimes, and I'm pretty sure I'm human, but the bot gets it on the first attempt.

krasin · 2025-11-15T04:12:17 1763179937

Yeah, this company (GeneralistAI) is, in my opinion, the most advanced robotics+AI company in the world. Slightly behind them Google DeepMind Robotics and Physical Intelligence, and then the rest.

tyushk · 2025-10-06T21:18:29 1759785509

I see the idea, but you're competing with Microsoft Word and Overleaf for non-techies, and LaTeX/Typst for techies, and that sounds like a losing battle on both fronts. Non-techies want something familiar that they already know how to use, like Word, just with bib and their university's template. Techies probably don't want a cloud only service for a mostly solved problem. I don't see the value as a techie, and I don't see why I wouldn't just use my University's Word template from a non-techies view.

cheesekunator · 2025-10-07T00:14:10 1759796050

And you'll always have a professor say, "Send me the word document for review", then they will provide inline feedback and return the file back to you. In these cases the technology isn't the constraint, the existing process from the institution is.