
The solution could be great. I really don't like the way culture always goes to the same tropes, calling any potential innovation "out of Star Trek" (with attendant distorted expectations), right down to expecting an interface based on literal hand-waving, as in Minority Report. If copyrighted works ("USS Enterprise") could be removed while the actual essential concepts (space ship, naming things) were retained, it would be a tremendous breakthrough.

I think what NYT &c want is for large companies like Apple to pay them for access to their works. This to me is the wrong path, just leading to more silos and walled gardens, special access for the elite.

An alternative is base models trained on Wikipedia and public domain (science journals, etc). Foundations could support high quality, well rounded current events reporting. Wikimedia provides a good model for this, with referenced summaries that I don't think can be said to reasonably violate copyright. The models would need to be improved to support references, or RAG attribution would have to be widely used when bringing in works that have a current copyright.
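To make the RAG attribution idea a bit more concrete, here is a minimal sketch in Python. It is not how any particular product does it: the `Passage` type, the word-overlap scoring, and the example corpus and source labels are all made up for illustration. The point is only that each retrieved passage carries its source and license through to the prompt, so the answer can cite it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source: str   # hypothetical citation label, not a real reference
    license: str  # e.g. "CC BY-SA" or "copyrighted"
    text: str


def tokenize(text):
    """Lowercase the text and return its set of words, stripped of punctuation."""
    words = {w.strip(".,;:!?()\"'").lower() for w in text.split()}
    return words - {""}


def retrieve(query, corpus, k=2):
    """Return up to k passages that share at least one word with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(p.text)), p) for p in corpus]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(query, passages):
    """Number each passage and list its source, so citations stay traceable."""
    body = [f"[{i}] {p.text}" for i, p in enumerate(passages, 1)]
    refs = [f"[{i}] {p.source} ({p.license})" for i, p in enumerate(passages, 1)]
    return "\n".join(
        ["Answer using only the numbered passages and cite them.", *body,
         f"Question: {query}", "References:", *refs]
    )


# Invented example corpus; sources and licenses are placeholders.
EXAMPLE_CORPUS = [
    Passage("example-encyclopedia/moon", "CC BY-SA",
            "The Moon completes one orbit of Earth in about 27 days."),
    Passage("example-law-blog/terms", "copyrighted",
            "Copyright terms vary by country."),
    Passage("example-journal/tides", "CC BY",
            "Tides are driven mainly by the Moon."),
]

if __name__ == "__main__":
    hits = retrieve("orbit of the moon", EXAMPLE_CORPUS)
    print(build_prompt("orbit of the moon", hits))
```

A real system would use a proper retriever and would have to decide what to do with passages whose license doesn't allow quoting, but the attribution trail itself is cheap to carry along.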



Science journals are mostly under copyright of a few big publishers who are extremely hostile to any kind of ML being performed on the content.


That's not as true as it used to be, and there are still plenty of useful open journals/open science publications, though proper attribution would often be important.

[edit] you could pretty much say that on principle, any significant development should have a publication in the open.


And yet, who pays? This is fine if we are going to go full communist - I have no objection personally - but selective appropriation of people's livelihoods is more full mafia or full feudal.

I don't see that as a step forward.


I am not sure there are many significant science breakthroughs that aren't basically built on a 'full communist' (publicly funded) model.

There is a very significant tension between three things: making free all works that are produced on the backs of giants all the way down (practically speaking, everything, unless there are significant works developed by feral people), keeping individuals fed and happy, and not giving corporations so much power that they only serve themselves.

I think requiring any significant development to at least publish its metadata, including descriptions, has too many benefits from any perspective not to be supported, and I'm not sure it'd be that expensive if it's incorporated into existing systems. It would create its own network effects.


I see where you are coming from. I think there is a coupled issue that we are going to have to sort out as a society very soon, which is that science publishing is busily disintegrating under the weight of publication counts.

I've noticed two things this year. The first is that arXiv has become the target for many teams in a rush to get priority and make an impact while some insight is still part of the zeitgeist and the twitterati haven't moved on. This is because people have got used to the idea of things going "viral" and getting citations and impact because of fame, not significance. None of this is peer reviewed, and some of it is bullsh*t. There is no penalty for the bullsh*t, and folks know it.

The second is that I am getting lots of citations on work I did a long time ago. I think that some of this is genuinely because that work is now more relevant and people are trying to do the things that we did with (effectively) bits of stick and good hopes with their shiny supercomputers, but I am vain. In reality these papers are getting cited because the paper mill machines have figured out that they look more genuine by sticking them in as a slightly obscure but relevant reference.

Both of these things are part of the collapse of trust and communication in publishing. It's not just compsci - there were 28k publications in astronomy last year. My cousin is a cell biologist, and when we had a few drinks she told me that her peers flat out don't trust publications and keep and share a list of authors who are trustworthy - if you don't have the list and aren't on it, you probably don't know it exists. This is the only way they can avoid losing months trying to use techniques that are just lies.

So we do need a way of capturing and reframing this knowledge, and the current system of peer review isn't it. Maybe LLMs can help us, but we have to set them up so that the dominant strategy is honesty and parsimony in sharing - so that the font of knowledge isn't a pool of crap.


Hmm, I wonder if that's also because many more people are involved, which is a good thing. After all, the Web started as a research system; the more it returns to that, the better. Intuitively I'd think LLMs and other algorithms could help, since they can generalize, extract uniqueness, and identify power laws. That will possibly be gamed, but dealing with the gaming also helps create more robust algorithms. And it has a fascinating relevance to all works of society.


> special access for the elite.

I think that this is about property rights. The news industry has been gutted in the last 30 years, and a lot of content creators (journalists) have lost their livings. The ones that are left are going to lose their livings too if the content they generate is rendered valueless because there is no way of protecting that value.

In terms of special access, think about your shoes. They are nice, but only you are allowed to use them. This is not fair. You are the elite...

This goes to difficult places.


I don't think it will be rendered valueless, but it shouldn't be completely exclusive. Basically, as an extension of today's search model (which is a large part of what LLMs are, along with a grab-bag of useful ML algorithms), people should be able to access information universally, but if they want to go to perspectives or very fine details, then a pay model is acceptable, as long as there's a trail from freely available information and evaluable models. Ultimately, imo, there are bigger problems with elite/gatekept information than with finding new ways to produce or support information development, given the power a few corporations are gaining and the opportunity to overcome a stratified society.



