They could sue for damages beyond the means of the defendant. The defendant would then need to hire expensive lawyers. Either you lose the case and are bankrupted by the damages, or you run out of money paying for counsel.
Pardon me if this is different in the States, but in a number of EU countries you're entitled to a state-sponsored lawyer if you're at risk of being unable to defend your rights simply because of your financial means. This generally only applies when you're the one being sued, with caveats depending on the jurisdiction.
According to the European e-Justice Portal [1], the right to legal aid is contained in art 6(3)(c) of the ECHR and art 47 of the CFR. However, the first right only applies in criminal cases and the second right is limited 'in so far as such aid is necessary to ensure effective access to justice.' Apparently, 'the broad objective in some States seems to be to make legal services and access to justice generally available, whereas in others, legal aid can be available only to the very poorest.'
Based on that summary, it seems like the position in most of the EU is probably much the same as the common law world (US/UK/CA/AU/NZ): you'll probably get legal aid if you're destitute and facing potential imprisonment, but you're unlikely to get legal aid to defend a copyright infringement case, especially if your net assets are non-zero and you can easily make the case go away by just removing some web links. That may mean that you are thereby deprived of your rights, if your legal argument is correct, but resource-starved legal aid bodies will prioritise cases that involve a smaller chance of someone being wrongfully convicted.
I don't see how you went from "some states make legal services generally available, while others make it available to only the poorest" to "in most of the EU you'll probably get legal aid only if you're destitute". I appreciate the reference, but I don't agree with your conclusion.
My conclusion was based on the fact that this is how it works in the common law world, and the letter of the law in the EU is pretty much the same. It might be incorrect. I was hoping someone from the EU would provide more information if so.
Indeed, in the US you could often do better without the public defender. Many public defenders push their clients into plea deals despite their innocence, just to save time.
> Now, reformers are using data in a novel attempt to create such a standard. The studies they have produced so far, in four states, say that public defenders have two to almost five times as many cases as they should.
> The bottom line: Mr. Talaska would have needed almost 10,000 hours, or five work-years, to handle the 194 active felony cases he had as of that April day, not to mention the dozens more he would be assigned that year. (The analysis did not include one death-penalty case on his roster, the most time-consuming type of case.)
When it comes to matters of technology, that burden is exceedingly low. Especially considering you are statistically unlikely to get a judge that has a clue about what is going on.
At a guess, you've never taken a case to court. $100K is low, and that's if you win. And don't count on being awarded costs, because then it suddenly turns out the court believes your lawyers worked for about 10% of what they invoiced you.
But can't anybody sue anyone for anything anytime? Does that mean anybody can bankrupt anyone at will?
As far as I know, being bankrupt means you owe someone more money than you have.
At what point does this situation occur?
I would think at first a court needs to accept the complaint and order him to write a rebuttal or to appear in front of the court or something? I wonder what exactly would happen.
They are called class action lawsuits, and lawyers wield them like blazing hammers. Basically, corporation A doesn’t like what you are doing, so they speak to a lawyer. The lawyer puts out advertisements on buses and Craigslist to find people to join the “class”, and once they have enough information they can turn the “class” into millions of people. Now you are facing a lawsuit with a payout in the tens or hundreds of millions, and you have to fight it, which is extremely expensive. Then at the end of the day you have spent a boatload of money and settle, and the lawyer pockets 4%, which is usually several million dollars. Then you wait until the next class action comes around. And that’s what owning a company in the United States is like.
Elsevier would not be filing a class-action lawsuit in this case. Just a normal lawsuit, in which the plaintiff would be Elsevier, who has an interest in protecting its intellectual property or intellectual property it has an exclusive license to.
Nah, what will happen is Elsevier will find all the other corporate entities that are also being linked to. Their lawyers will say these companies are “similarly situated” and seek to form a class. The class will end up being literally anyone who has their content linked to. Then their lawyers will file discovery motions seeking every single piece of information conceivable about how the content was linked: computer make and model, software installed, who did it, when, where, why, and how. It’ll be a flood of paperwork. That’s how all major companies operate.
Example? This seems like a very unusual scenario for a class action lawsuit to me; I can't think of a comparable example involving corporations forming a class like this.
I don't think "That’s how all major companies operate," but if it is, it should be easy for you to find an example of a class action lawsuit like this?
(I agree that harassment lawsuits -- where it is very expensive to defend regardless of your chances of victory -- are how all companies with enough money to do it operate. Just without the class action component.)
My impression is that almost all class actions are filed by individuals on behalf of classes consisting primarily of individuals rather than companies. I'd be curious to see if you have a counterexample of companies abusing the class action process as plaintiffs.
It may be a sometimes-abused legal tactic, but you seem to have described an effective consumer protection system. How else would you defend everyday people from regular corporate civil lawbreaking?
I’ll concede that point: the very threat of a class action is enough to keep most companies in line. But look at the Experian thing. A lifetime of threat for the people who had their data exposed, a payout in the range of cents per person, and then lawyers pocketing millions of dollars. And Experian walks away with essentially no lasting effects. It’s just very frustrating.
I'd love to see a norm develop where the 'authoritative link' to an article is expected to be the most open. So, if there's a closed journal and an Arxiv pre-print, Arxiv gets the link, with the journal's publication status considered 'about the article', but not the thing itself.
I think it moves us towards a clearer understanding of academic journal publication as peer review's 'stamp of approval', rather than the explanatory event per se. And this will make it easier to move towards long-term, sustainable practices for publication and science.
Journals used to have several important roles: curation of articles, maintaining a reputation of quality (peer review, etc), and the actual physical publication and distribution of the papers. Cheap personal computers capable of "desktop publishing" and the internet made publication and distribution really cheap and easy. Those tasks no longer require a lot of expensive specialized skills and expensive typesetting/printing tools. This means journals need to stop treating those tasks as if they were still a scarce resource, and rework their business model around the tasks people still value highly: high quality curation and a reliable and trustworthy reputation.
The actual hosting of the PDFs (and TeX, and hopefully even the raw data) is something that universities, or whoever the researchers work for, could do cheaply and easily. When I was attending UC Davis in the late 90s, the university hosted a huge archive that not only included their own publications, but also mirrored the publications of the other UCs and many important public archives like kernel.org.
Compared to huge archives of Linux distros and pre-GIT source code histories, hosting a bunch of PDF/TeX is effectively free. Reliable curation that saves a lot of people from wasting their own time and effort trying to find useful/interesting papers is extremely valuable.
First we need a "verified" badge on biorxiv/arxiv confirming that the current preprint version is an exact copy of the one published in the journal. Then the DOI could be arranged to redirect to that copy instead.
That is not currently how the DOI infrastructure works at all.
Individual entities register DOIs, and decide where they redirect (and can change the resolution at any time). In these cases, the publisher (such as Elsevier) is the one who has registered the DOI, and they get to decide where it redirects/resolves. They also paid for the DOIs.
There are actually a (small-ish) number of DOI registrars. The largest, and most likely by far to be used for scholarly articles, is CrossRef.
Neither CrossRef nor the DOI foundation have the authority to change where a DOI resolves to, against the wishes of the DOI registrant. (It would be like a DNS registrar or the IANA deciding news.ycombinator.com should resolve somewhere other than Y Combinator wants it to -- indeed DOI works pretty analogously to DNS, probably intentionally by inspiration).
What you propose would require major changes to the social and business setup of DOI. Probably to the business/sustainability model too, because a registrant would probably be less excited to pay for a DOI they don't actually get to control the resolution of. (CrossRef and the International DOI Foundation are both non-profits. They still need to pay for their operations, and the DOI infrastructure. That is currently funded by charging registrants for DOIs). It would also require some kind of "regulatory regime" to determine who has the authority on what basis to determine where a DOI resolves (and those 'regulators' would probably increase expenses, which you need a new plan for funding), compared to the current situation where whatever entity registered a DOI decides where it resolves to (similar to DNS).
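For the curious, the registrant-controlled resolution is easy to see for yourself: the doi.org proxy exposes the Handle System's REST API, which reports the URL a DOI currently points at. A minimal sketch in Python, assuming the `requests` package is installed; the DOI used is just an illustrative example:

```python
# Minimal sketch: ask the doi.org Handle proxy where a DOI currently
# resolves. The URL in the response is set by the registrant, who can
# change it at any time (much like a DNS record).
import requests

doi = "10.1000/182"  # example DOI (the DOI Handbook)
resp = requests.get(f"https://doi.org/api/handles/{doi}", timeout=10)
resp.raise_for_status()

for value in resp.json().get("values", []):
    if value.get("type") == "URL":
        print(doi, "->", value["data"]["value"])
```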
You need neither.
Simply hash both articles and reference them by hash (see the sketch below).
Then you will automatically get the right paper, no matter the source (it could even be from a bittorrent magnet link).
DOIs are a horrible invention; they are prone to man-in-the-middle attacks and dead links. Please don't use them.
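To make the hash-and-reference idea concrete, here's a minimal sketch ("paper.pdf" is a hypothetical local file; note that real magnet links usually hash torrent metadata rather than the raw file, so treat this as illustrative):

```python
# Content-address a paper: any copy whose bytes hash the same is the
# same document, no matter which mirror or channel served it.
import hashlib

def content_id(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files don't need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

digest = content_id("paper.pdf")  # hypothetical file name
print(f"urn:sha256:{digest}")     # a location-independent reference
```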
A slight impediment to that is that arXiv discards PDFs that have not been accessed in a while, and rebuilds them from the TeX source if they're accessed later. The result may not have the same hash - I sometimes even see arXiv PDFs with today's date in them despite being published a long time ago, because the author used the \today macro. So you would need reproducible builds for the hashes to be valid, or for arXiv to no longer have the storage concerns that lead them to this practice. Or you could hash the TeX, I suppose.
Yeah, you should hash the TeX. It's a pity, really, that PDF has become the dominant publication format; it's just so bad and non-machine-readable. It's absurd to me that scientific publications haven't switched over to HTML; I mean, that format was invented for scientific publication...
References to third-party websites can break. And HTML is a living spec, so browsers can decide to break things that work today (as happened with <marquee>, for example).
Even if you disallow JS entirely, and stick with just HTML/CSS, it has enough warts to not look and behave consistently over time.
A link could easily be a URN that identifies the target by its hash; all the protocols for that are already in place, e.g. magnet links.
PDFs as used for papers don't rely on JS or hyperlinking, so I guess all you'd need would be HTML, even ignoring CSS, which could be tailored to the reader, e.g. LaTeX style, troff style, etc.
Vanilla HTML with images embedded as data URIs should be pretty darn portable for the foreseeable future.
Here's the kicker: it's a text-based format, and a dead simple one. Even if we should lose all browsers in the big browser war of 2033, it's super easy to reverse engineer. PDF, not so much.
A subset of HTML + CSS (+ ECMAScript?) could replace PDF for this purpose. However, is there a standard subset, with familiar, understandable tools for working with it? In general, using the 'save as' function in a web browser won't produce a document that looks the same 10 years later. Rewriting the source document using a tool like wget can achieve this, but it doesn't always work (e.g. what if the content was pulled in asynchronously?), and you need a computer expert to create the archive and explain how it relates to the live content. 'Save as PDF', despite its technical inferiority, is easy and widely understood.
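For reference, the wget approach mentioned above is a short incantation, though as noted it won't capture content pulled in asynchronously. A sketch, assuming the wget CLI is installed; the URL is a placeholder:

```python
# Snapshot a page plus the images/CSS it needs, rewriting links so the
# local copy works offline. Won't capture JS-injected content.
import subprocess

subprocess.run([
    "wget",
    "--page-requisites",   # also fetch images, CSS, etc.
    "--convert-links",     # rewrite links to point at the local copies
    "--adjust-extension",  # save files with sensible extensions
    "https://example.com/article.html",  # placeholder URL
], check=True)
```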
HTML/CSS is extremely backwards compatible; modern browsers have no problem displaying old pages the way they were meant to look.
How does PDF solve the link rot problem? PDF is good for print; it's consistent. But it fails at any display size other than a big screen, especially e-ink displays, which don't tend to be your standard A4.
PDFs don't solve link rot. But in HTML, it's conventional to rely on links for stylesheets and sometimes even content (images, asynchronous DOM elements), so link rot is a bigger problem.
Yeah, for publishing you don't want content behind external links, but you can solve that with data URIs, which embed images and other data directly into the page [1].
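A sketch of what that looks like in practice, assuming a hypothetical local figure1.png:

```python
# Inline an image into HTML as a data URI, so the page carries the
# image bytes itself instead of a breakable external link.
import base64
import mimetypes

path = "figure1.png"  # hypothetical file
mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
with open(path, "rb") as f:
    payload = base64.b64encode(f.read()).decode("ascii")

# Emit a self-contained <img> tag; paste it into the HTML document.
print(f'<img alt="Figure 1" src="data:{mime};base64,{payload}">')
```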
IMO, TeX isn't much more machine-readable, depending on what you want to do. Reformatting or lossy conversion to plaintext? Sure. Determining semantics? Good luck.
The journal version and the arxiv version will never hash to the same value because they are not bit-identical. But you want to link to the peer reviewed version, or one which is semantically identical to the peer reviewed version. So somebody needs to check that the arxiv version is semantically identical to the journal version.
You should hash the TeX, not the PDF. Alternatively, you could have both documents PGP-signed by the author, with a hash of the original TeX, if you want to make sure you get the right "semantically the same but different" version (a sketch follows below).
But tbh that seems to be a slippery slope that I wouldn't want to go down: where do you draw the line for your semantic differences? Imagine you quote something which gets edited out; suddenly it looks like you're quoting nonsense, while it's the original reference's fault.
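A sketch of the hash-and-sign half (assumes the gpg CLI with a configured signing key; paper.tex is a hypothetical file). gpg writes a detached paper.tex.asc that anyone can verify against their copy of the source:

```python
# Hash the TeX source and produce a detached, armored PGP signature
# over it, so readers can check both integrity and authorship.
import hashlib
import subprocess

src = "paper.tex"  # hypothetical source file

with open(src, "rb") as f:
    print("sha256:", hashlib.sha256(f.read()).hexdigest())

# Writes paper.tex.asc; anyone can verify later with:
#   gpg --verify paper.tex.asc paper.tex
subprocess.run(["gpg", "--armor", "--detach-sign", src], check=True)
```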
There is no TeX source for the journal version. The point is that you don't want to trust the author to verify that the peer-reviewed+accepted version is the same as the arxiv version, and that it will not be changed. That's why people generally cite the journal version. Because it's immutable.
Journal versions are simply not immutable, because they are referenced by name, not by content. I regularly see a good percentage of dead or wrong DOIs, and I've hunted my fair share of papers that were supposedly released in a journal but only ever existed as preprints.
arXiv already accepts LaTeX and compiles it for you; we should expect the same from journals and ask them to publish the hash of the document they received.
Journal versions are referenced by journal name, volume, year, and page number, indexing a hard copy version you can find in a library. Seems pretty immutable to me.
The journals I published in all accepted LaTeX. But they convert it to use their own layout software. The last correction steps are typically done only in this version, and the author has to backport them into their TeX code.
Why should the journal have any interest in making the arxiv version more attractive?
Even if we ignore reprints, editorial series that rearrange papers (and make a paper citable in more than one way), and proceedings (which often don't properly distinguish between papers, but use author + proceeding).
Science simply doesn't operate on journal-published papers most of the time. The paper mills run so hot that you regularly cite preprints that get exchanged between authors directly.
It happens regularly that the proof is supposedly in the "full paper", except that the "full paper" was never published.
Essentially the same reason we need peer review in the first place: many authors have strong but wrong opinions. But even without malice, some don't care that the arXiv version is slightly different from the paper.
I don't see why anyone would put different content in the two papers, since it's so trivial to be ridiculed for that. I don't think arXiv has the resources to review whether the preprints are the same as the final versions, and it seems an overkill thing to do.
Also, in many cases there is a final round of modifications done by the publisher that you are not free to distribute. For journal papers I was told that sometimes you cannot even publish the corrected version after rebuttal.
It's not the same file - just the same final text proof. It will be different from the final formatting in the journal.
I don't think authors have an incentive to abuse the system. Just upload the final proof of your manuscript to arXiv, click "final version", and this lets people know that this is the same article as in the journal.
DOIs are ubiquitous, and they could serve the purpose of redirecting to the free PDFs rather than the journal site. This can be applied to existing articles retroactively. Plus, many bibliography styles include the DOI, which makes the reference easier to use.
Semantic Scholar [0] tends to do this, but their search functionality leaves something to be desired. I tend to use them to discover DOI addresses and find related media if I already know the paper's title (e.g. following up on a bibliography without links, as is the norm in many publications).
1. How did publishers like Elsevier get started?
2. Was their value proposition true back in the day? How did it change over time?
3. How did the industry evolve?
3b. What are the barriers to entry and how did those evolve?
3c. What was the business model then and what is it now? Who are the clients (universities, I presume)? How do they pay for this (taxpayer money? If so, that's a huge clue). Who is the decision-maker regarding these purchases?
4. How is it possible that they still exist? If they don't provide value anymore, then what is it? Are people habitual creatures? Are they pressuring universities? Did they somehow get a vendor lock-in effect (can people get out)?
I'm simply trying to understand why it became what it became and why it isn't dying in the slightest, despite researchers trying to organize themselves a bit for open access.
Strange as it is to imagine, scientific journals used to be printed on paper. Elsevier was founded in 1880 as a book publisher, then a magazine publisher before expanding into scientific publishing. Subscribing to a scientific journal was not fundamentally different to subscribing to any other kind of magazine; university libraries paid a fee to receive a printed document every month, containing a collection of articles related to that field.
The internet should have changed this model, but the publishers had amassed a secondary form of value - prestige. Scientists don't only want to share their research with their peers, but share that research in a prestigious publication. Getting a paper published in Nature or the BMJ has a meaningful impact on the author's career.
The internet makes it trivial to distribute documents for free, but nobody thus far has found an adequate substitute for the prestige of a highly respected journal. Universities keep paying to subscribe to prestigious journals because they publish some of the most interesting and significant research; academics keep sending their best work to the most prestigious journals because it's valuable to their career prospects.
This problem can probably only be solved through coordinated action by academics.
Sure, but the "Elsevier" that existed from 1880 to sometime in the 1970s or so was very different from today's Elsevier. Basically, some entity bought it and set out to monetize its reputation by raising prices to whatever universities would tolerate.
We've seen the same process in the pharmaceutical and newspaper industries, and likely others.
> The internet makes it trivial to distribute documents for free
While I agree with pretty much everything else you wrote, I have to take some exception to this. Yes, it is trivial to put something online. But library sciences exist for a reason, so if you want a nice collection of things online with some sensible structure, it becomes less trivial. Worse yet, if you want some way of ranking the things in your collection by importance, it requires a lot of manual intervention.
I think your solution is ultimately correct, but the peer review process is harder to solve than the technical one, and there is still a cost of some sort to implement and maintain the technical one.
> nice collection of things online with some sensible structure
Publishers do not currently do this. Their websites are a confusing jumble with difficult navigation, poor search features, no useful discovery features, ...
The “nice collection” with “sensible structure” is provided by external citation indexes like Google Scholar.
>While I agree with pretty much everything else you wrote, I have to take some exception to this. Yes, it is trivial to put something online. But library sciences exist for a reason, so if you want a nice collection of things online with some sensible structure, it becomes less trivial. Worse yet, if you want some way of ranking the things in your collection by importance, it requires a lot of manual intervention.
Specifically, what structure do paid journals have that Arxiv lacks? How much would it cost to provide that structure outside of the current system, bearing in mind that editors and peer reviewers aren't paid a dime by the publishers?
> ...bearing in mind that editors and peer reviewers aren't paid a dime by the publishers?
The peer reviewers may not be paid, but the prestigious academic journals (Nature, Cardiology, etc.) do have paid editorial staff. E.g. the initial rejections of papers and rounds of back-and-forth copy-editing with the author are handled by the paid editors, not the unpaid peer reviewers.
That's not to say there aren't journals that are run by 100% unpaid volunteers, but I'm unaware of any Elsevier journals with that labor structure.
(However, this is not an endorsement of Elsevier's high subscription fees.)
Science and technology progress on correctness, accuracy, and efficacy, not popularity.
It turns out that gatekeeping, cabal-fostering, orthodoxy-preserving, and kingmaking are lucrative practices. But truth cares not for prestige. As the motto of the Royal Society states: Nullius in verba. On the word of no one. Evidence, not authority.
(The situation may be different in some of the arts and letters. Which speaks to another can of worms.)
This is a nice ideal but science is still an intensely human endeavour. The human part of science brings with it the politics and prestige that is associated with scientific work.
To try an analogy: D. Knuth's recent comment on the sensitivity proof has brought the discussion of the topic to a different level, because it has basically been stamped with the approval of a highly prestigious scientist.
So although the sensitivity proof is a mathematical proof, the recent buzz around it has rocketed it to another level. This brings about real rewards in the form of promotions and invitations to exclusive social circles, all of which are desirable to humans.
To be slightly cynical, the majority of research output is not worth your time to read, so mechanisms that presort (e.g. by author list, keywords, and journal name) help to increase the chance that you read a paper worth your time.
"Red herring" in that it's not what the ultimate goal of academic publishing is.
Is prestige very much baked into the practice, profession, career, and culture of academia? Yes. Has it largely always been thus? To the origins of the modern university, if not before -- arguably.
But that's not a necessary role, certainly not for publishing to play, and with some thought given to side effects and consequences, disrupting the publication cycle might solve a few more problems than it creates.
Or not.
The deeper problem is that demonstrations of correctness don't align with publishing, funding, teaching, or even career cycles. Things can be thought wrong for a very long time and proved right, or vice versa. And again, all the prestige in the world won't change the truth value.
Though it will change what truths, or myths, are socialised at any given point in time.
Normally I'd respond to such a comment with an observation on the author's precious naiveté, but since you are well-known on HN with high karma, I'll refrain.
1. Once upon a time there was no internet and not even any computers. Turning academics' scribblings into neatly printed compilations of articles sent out to academic libraries across the world on a regular basis necessarily involved a publisher that could handle that sort of thing.
2. Over time, their journals' proposition evolved to be less "distributor of print" and more "curator of the most respected content". As universities started scrutinising how good their academics were at getting articles published, and the journals they published in started being ranked by "impact factor", the value proposition of curated content got stronger.
3. Computers made it relatively easy for academics to produce print-ready articles themselves. The internet made it straightforward for academics to distribute articles as neatly formatted PDFs and share their early drafts with peers for review without involving any publisher. But academics and students also started expecting to be able to search through decades of articles online rather than hunt for a single paper copy of each one in a library, which opened up a nice little opportunity to sell renewable subscriptions for digital access to the archives.
3b. The barriers to entry used to be not having the facilities to print, distribute, and sell to libraries, as well as the fact that your new journal isn't Nature. Now it's just the latter. And also the fact that you're simply not allowed to republish the decades' worth of content they own, which includes a lot of must-cite texts.
3c. The business model was getting academics to submit and review content for free, which was then resold to universities staffed by people who expected [their students] to be able to read it as part of their research and expected the library to pick up the tab. It still is; it's just that publishing new content is easier for everybody, and since the journal subscription offers digital access to the archives and not just 12 bits of paper for the library, they can ask for even more eye-watering sums of money.
4. Regardless of whether open access journals ultimately inherit the prestige of the existing ones, academic publishers will continue to own the copyright of decades' worth of texts which academics and students are expected to have read to properly understand their field, and so university libraries will continue to be expected to subscribe to them. The lock-in effect is that you can't do serious academic research without reading papers they own. Which is also why they're not very keen on people pirating their content.
Academia runs heavily on prestige. The journals are valuable because they are prestigious and it is more beneficial to an academic's career to publish in a more prestigious journal than a less prestigious one.
Because the academics who publish in those journals get jobs at prestigious universities, every university needs to keep those journals in those libraries, because that's where all the highest profile researchers are publishing their articles. Anyone who wants to join that group also needs to compete to publish in those journals. It's a self-reinforcing cycle.
1. They were old-school publishing houses that had a connection to some prominent academic or an academic society that decided they needed to start a journal.
2. The value proposition was that they were the only way to publish research. Libraries subscribed to them (or bought issues), and reading journals in a library was the major way academics learned of new research in their field, unless they learned of the research in a conference.
3. The Internet.
3b. The barriers to entry used to be needing access to a publishing house capable of duplicating issues, and having your journal be in major libraries (which was usually related to being connected to a major academic or an academic society).
The barrier to entry today is that published papers are a major benchmark by which the productivity of an academic is measured. Older, well-known journals that accept only a few papers are known for having a higher "impact factor", which means that publishing in them is worth more to an academic.
3c. The business model is always selling subscriptions to libraries, mostly university libraries (but also regular ones and research outfits in industry). Once they don't actually print anything, their costs are mostly administrative, so they're basically printing money. The decision makers are usually librarians, but they are set a budget by whoever owns the library.
4. Because they own the major benchmark used to evaluate an academic career. If you are an academic and want to further your career (i.e., get a job, get tenure, get a grant), you have to publish in one of those journals. If you're another academic who wants to read that research, you then have to have your library subscribe to that journal (or look it up on Sci-Hub).
> 3c. The business model is always selling subscriptions to libraries, mostly university libraries (but also regular ones and research outfits in industry)
The main criticism I read about Elsevier is that universities can’t subscribe to just the most important Elsevier publications but have to subscribe to a big package, while the content producers aren’t paid by Elsevier but work at the universities. This only works because Elsevier is a monopoly, which is the reason why Sci-Hub is a thing.
> 2. The value proposition was that they were the only way to publish research.
Was this ever actually the case? Researchers have always published on their own in one way or another. What these journals added was an established peer review mechanism.
Sure, academics have always self-published, but unless you were already a prominent researcher, publishing in a journal was the way to get a wide readership (or at least be available in the library).
They do still provide significant value, although it is weakening.
When I see an article published in an Elsevier journal, it gives me a sign it is trustworthy. When I'm asked to submit something to an Elsevier journal, it gives me confidence the journal is trustworthy.
I, as I guess anyone with an academic email address does, get dozens of invitations to read, submit to, and help edit journals every week. I have no choice other than to send the majority of these directly to the trash.
Slowly, open-source trustworthy journals are appearing, but they need to gain trust, and that takes time. We need something like a web of trust for academic work, but it would be a major undertaking to set up.
Also, who is going to pay for all the work of setting up and running these new journals? And don't assume there are zero costs, just because academics do tend to work for free.
By "sign", I know it isn't perfect. However, 99% of the journals which email me are junk, so we need some method. And I don't just want to read papers by my friends, as that way I end up in a bubble.
That's also questionable. For instance, journals with a higher impact factor tend to contain more articles which can be considered bad by objective measures.
> 3b. What are the barriers to entry and how did those evolve?
It's a chicken-and-egg problem: authors don't publish in new journals because they are not reputable yet, and journals can't become reputable because authors don't publish in them. I don't know the theory of chicken-and-egg problems, but I think they are solved when either a) a brand-new discipline emerges, so the first movers win by default, or b) some extraordinary action is taken.
> 1. How did publishers like Elsevier get started?
I think people are answering this question as if it were about scientific publishers in general. The journals were just respected magazines, often published by scientific membership societies.
Elsevier specifically was a publisher that just gobbled up as many of them as it could until it got to a monopoly or near-monopoly position by specialization (meaning that it could own every important journal in some professions, and most of them in others). Then it jacked up the prices infinitely, and its competitors took the signal and jacked up their prices too. Now, some major universities are defecting, which is actually a signal that the pricing is just about perfect for the market.
It's just the same old boring operations of capitalism. This is exactly how pharma, or the music business works.
> 1. How did publishers like Elsevier get started?
Someone had a lot of money and knew many academics in a few fields. They asked some to review papers and others to send them papers.
> 2. Was their value proposition true back in the day? How did it change over time?
They derived value from peer review, prestige, and collecting all the pieces in one place.
Peer review is important for any journal to supply, and it is a useful step to put your own work through to make sure you are accurately conveying your ideas.
Prestige gets you funding and degrees. Some colleges don't let you get a master's or PhD if you don't have N publications in journals with an impact factor of at least X.
Before the internet, your college library was your source of information. The librarian has many things to do (we're talking pre-internet here), and rather than find every single individual in a field and poll them for papers, they go to a publisher and ask for a copy of all of the publications the journal gets.
> 3. How did the industry evolve?
Depending on the field not much to very much.
> 3b. What are the barriers to entry and how did those evolve?
Now that the internet is a thing, collecting, indexing, and distributing data is "free". The barriers to entry are still prestige and peer review.
> 3c. What was the business model then and what is it now? Who are the clients (universities, I presume)? How do they pay for this (taxpayer money? If so, that's a huge clue). Who is the decision-maker regarding these purchases?
The business model has two prongs. To librarians: you pay me an obscene amount of money each year and I'll organize a bunch of relevant papers for you. To researchers: we only accept "the best", so if you get to publish in my journal, you are "the best".
Purchasing subscriptions to these journals is likely still done by a librarian, but I say "done" for a reason. There is no decision. You must have these journal subscriptions. More info below.
> 4. How is it possible that they still exist? If they don't provide value anymore, then what is it? Are people habitual creatures? Are they pressuring universities? Did they somehow get a vendor lock-in effect (can people get out)?
The lock-in is that, while they suck, researchers will only hire people with a specific number of publications in journals of a specific impact factor. These journals are usually evil, but each is "the one I read", so to get in front of some very important people you need to have published in some of these places. If you do the same quality work and publish it in a smaller, open access journal, that is great, but it won't necessarily move your career forward as much as publishing in an important journal will.
Yes. I am surprised that authors don't see the difference between linking (if link they must) to the authoritative source of the article as opposed to the distribution channel they happened to use to get hold of the full text of the article. To me, it feels like irresponsible bibliographic practice.
All four examples involve citations of the authoritative source that in addition have a link to the actual distribution channel they used. That seems to me to be more responsible bibliographic practice than simply omitting that information. Similarly, if the original article is written in a language you don't speak, you should also cite the translation you used.
To me, a link to an authoritative source would be a link to the web page (or pdf file) on the site of the journal that published the paper.
You wouldn't link to your friend's mailbox if it was your friend who sent you a copy of the article; why would you link to Sci-Hub for doing basically the same thing?
> Personally I'd never include Scihub into my workflow in a way that shows up in a publication.
Clearly it was not meant to. Especially if you look at reference 4 here [1], it looks like the internal format of whatever reference database the authors used was accidentally leaked into the references. Probably missing commas somewhere.
Thanks for learning about citationsy.com! I haven't published with Elsevier since the Cost of Knowledge movement. I try my best not to publish with any publisher who locks up articles or who asks the author for money (gold OA). But hear me well: it is useless to blame only Elsevier. Go after the academic managers who make this possible. Or the gold OA circus. Otherwise, publish with arXiv, or better, try to make all your work available: data, programs, etc.
> Sci-Hub is a copyright-violating site that provides infringing access to scholarly publications that are behind paywalls. Its ethics are problematic but it’s also proving very difficult to stop.
No. The ethics of sci-hub are not problematic. The ethics of Elsevier and their gang certainly are.
I'm a bit baffled why Sci-Hub has generated so much controversy, yet Library Genesis is much less frequently discussed, AFAICS, on the anglophone internet. It seems to me that Sci-Hub is doing something more easily defensible than Libgen. (They're related, right?) The scale of Libgen's copyright-violating operation and vision is staggering; it's like a free Amazon for books.
It might be because 100% of the content on Sci-Hub is owned by monopolistic copyright trolls that serve no purpose other than rent-seeking. Libgen is diluted by content owned by businesses that perform some level of curation, pay authors, sell direct to consumers, compete with each other, etc.
Those businesses have better things to do than piss off their customers and authors as they try to police the internet.
Even more cynically (and predicted by Max Headroom ~35 years ago) academic work has more potential to unlock economic value for the unwashed masses, so it needs to be better policed.
On top of that, a student who really needs a book and is able to buy it will buy it. The loss of sales through Libgen must have been negligible so far for it to remain untouched.
The biggest reason is that academic publishing is much more economically viable to defend in court than novels are. Book piracy has millions of “lost” customers, each representing a low amount of spend. Academic publishing has relatively few customers who pay a lot to access journals.
I use Library Genesis for a lot of academic books I definitely wouldn't buy—I need like one chapter or to take some notes, then have little further need for them, and they're usually $50-150 and still expensive or hard to find on the used market, so somewhat out of my worth-purchasing-for-convenience range—but would otherwise have to track them down on interlibrary loan then wait for them to show up.
Most academic book authors (of books that are collections of articles) don't get paid for their work - or rather, their payment is just 1-2 copies of the book. So you should not feel guilty for pirating them.
Libgen is not just "academic" books. Every single commercial textbook and programming book, for example, is on there. Many of those authors do in fact engage in those multiple-year-long writing projects in the expectation of remuneration. Do you think no O'Reilly author wants royalties? Some of these people are literally professional writers. I don't think they'd appreciate you informing the world that you don't have to "feel guilt for pirating them".
Sci-Hub is completely different from Libgen in terms of ethics. No academic has ever written a manuscript for publication in a peer-reviewed journal because they were going to get paid for it. There is nothing whatsoever ethically problematic in accessing Elsevier's content on Sci-Hub. Those journal articles should be free to the world, not least because the research is often paid for by public funds.
Do not confuse what I'm saying. I am not saying that I think Libgen is a bad thing. I am saying it is a difficult area from a moral/ethical/economic point of view. Scihub and Libgen are the most fantastic libraries the world has ever known. Literally. It seems pretty clear that making them available to someone in rural Angola is a good thing to do. I work in a highly developed country and it's slightly less clear that the anarchy should extend to benefit me, but honestly I'm happy that it does.
>There is nothing whatsoever ethically problematic in accessing Elsevier's content on Sci-Hub. Those journal articles should be free to the world, not least because the research is often paid for by public funds.
Just to add my $0.02 as a soon-to-be PhD dropout: I once downloaded my own paper from Sci-Hub after I lost the stupid PDF and of course the publisher wouldn't let me get it for free. If anyone plans to assemble a torch-and-pitchfork mob, by all means let me know.
OK, but the parent was talking about the kind of books I'm talking about - not O'Reilly books but Springer books that are usually unpaid collections of reviews. They are usually very expensive and bundled with journals in library subscriptions.
Cool, yeah I know the sorts of books you mean. And agreed that the authors do not write those chapters in the expectation of royalties and agreed one shouldn't feel guilt in pirating them. But your post is at risk of being read out of context, as saying that libgen should be used without guilt, and there are people in this thread who seem almost to think that there are no ethical questions surrounding libgen, and who seem to think that libgen is ethically equivalent to scihub.
The people downvoting my question are very confused! It was a question; I'm not expressing an anti-Sci-Hub or anti-Libgen viewpoint. It's absolutely unarguable that the scale of the copyright violation is staggering! Absolutely every commercial technical book is free on Libgen. I didn't say that was a bad thing. Never before in the history of humanity has such an incredible collection of technical books and journals been free to so many humans on the planet, each one just a few clicks, less than a minute, away.
In this case I believe it is more an issue coming from the authors of that particular article. They probably used some bibliography software that recorded Sci-Hub as the address for some of the articles cited. Although published articles are peer reviewed, people rarely scrutinize all aspects of an article finely enough to catch such issues.
"Legally", could Elsevier get into trouble for offering unequivocal evidence that they don't care about their claimed copyrights and they don't actually consider Sci-hub infringing?
I see what you're saying about the journal contributing to the legitimacy of what they are purporting to be illegitimate, it's an interesting point.
I would like to add separately, however, for people unfamiliar with Sci-Hub: I am unaware of any copyright infringement issues with linking to or downloading from Sci-Hub. Sharing or downloading links has never been copyright infringement. Distribution is, which is why Sci-Hub is based in a separate jurisdiction (Russia, I believe) that does not recognize US copyright claims.
There’s some more information in this Twitter thread, I’m trying to keep it updated as this develops: https://twitter.com/citationsy/status/1156626811398307840