Hacker News
How the Internet Archive digitizes 78rpm records (twitter.com/internetarchive)
148 points by quercusa on April 26, 2021 | 41 comments


It's worth noting that the Internet Archive are always on the lookout for more records to digitize, so if you (or someone you know) has a collection and wants to see them go to a good home where they'll be preserved, you can donate them [1] to the Internet Archive (they also take 33/45rpm records, books, CDs, etc...).

(I'm not affiliated with the Internet Archive)

[1] https://help.archive.org/hc/en-us/articles/360017876312-How-...


I don't know why I didn't think to look into this sooner. I inherited my grandfather's record collection of more than 5,000 records, and I just couldn't bear donating it to some random thrift store; I knew that selling them would just mean someone picking through the collection and leaving the dregs. This is the solution I've been looking for. I do listen to some of the records, and I enjoy it, but I don't have time to digitize the whole collection. This is actually a better solution anyway, because now people besides my family will be able to enjoy my grandfather's collection.


Hi, Jason Scott at Internet Archive. Let's talk. Jason@textfiles.com


Update: We're talking


This is the kind of update that I really enjoy seeing. :D Thank you for your work, and thanks to prometheus76 for offering his collection!


We'd love to work with you on a donation! Here's how to get the process started. https://help.archive.org/hc/en-us/articles/360017876312-How-...


Truly, this was the internet I was promised — or at least hoped for — decades ago when it began.

Somehow I imagined the internet as a giant library: the Library of Alexandria for the modern age. The Encyclopedia Galactica.

Others thought it might better be the world's largest catalog. The Sky Mall.

I've personally scanned and cleaned up many old children's science-related documents and uploaded them to the Internet Archive.


The late Ken Perkins of Australia had a collection of 80,000 records when he passed away in 2018:

* https://www.abc.net.au/news/2018-06-18/australias-largest-re...

Though this article from 2016 indicates one Zero Freitas has six million "in a 25,000-square-foot former candle factory in São Paulo":

* https://www.reedyoung.com/vinyl/

* https://en.wikipedia.org/wiki/Zero_Freitas

A 2013 article about John Tefteller, "who has stockpiled over 75,000 blues and rock & roll 78s":

* https://www.factmag.com/2013/10/01/tommy-johnson-blues-becom...

See also this 2009 documentary about Paul Mawhinney, whose collection of three million items was sold to Freitas:

* https://www.youtube.com/watch?v=SwXayHbUQ2o

* https://en.wikipedia.org/wiki/Record-Rama


It's amazing to see how much care and work is put into efforts like these by people who clearly care about what they're doing.

If corporate America had their way, those records would have had DRM, there'd have been half a dozen different formats, and they'd have stopped working after twenty years.

A silly example, but looking at how records are preserved today, then imagining how media of today might be preserved for a century from now, should be something we think about when we're choosing what to buy, yet most people keep buying from companies that just disable our media whenever it benefits them.


This made me remember what happened with what.cd. I never had an account myself, but I feel like we truly have failed to build a common, lasting platform for storing arbitrary data.

BitTorrent was ahead of its time; the tech is IMO really cool. It is just too bad that it does not really make anyone rich and thus cannot be accepted.


https://archive.org/details/whatcdcrawl

Although I doubt there's anything in there that isn't also in the Goodbye Release:

https://twitter.com/whatcd/status/923941782253121539

> I feel like we truly have failed to build a common, lasting platform for storing arbitrary data.

As wonderful as What.CD was, and as much of a void as it left, it wasn't that. I'm increasingly convinced that any general purpose metadata store is going to be insufficiently granular for true specialists, while conversely, most specialist metadata stores are going to be overwhelming for a casual user. For example, What.CD's most successful successor has not one, not two, not seven, but forty-two different releases of Wish You Were Here. And even that is a small portion of the 768 versions listed on Discogs[0].

But does a casual user even care? Hell no. They'll keep listening to the CD they got at Wal-Mart some time in the mid-'90s, happily, and none the wiser. This only becomes a problem when, eventually, there is something that casual user does care deeply about – model trains, or Funko Pop! figures – but they're turned off (or banned) from contributing to the site because they got in a flame war with a moderator over the omission of one particular re-release of Star Wars: Episode V. Über-metadata sites are either insufficiently granular (see: Wikipedia), insufficiently well-curated (see: Internet Archive), or inaccessible (in both meanings of the word; see: the hypothetical combination of all specialist, private filesharing sites).

[0]: https://www.discogs.com/Pink-Floyd-Wish-You-Were-Here/master...


I thought there was something in the early 2000s that used a laser (an array of them, actually) to map the grooves on a vinyl (or wax) record with perfect precision that allowed for the virtual audio processing of the record in higher quality than any physical stylus or pre-amp system could deliver. (There was also the side-benefit that the reading process didn't involve physical wear on the record or cylinder to optically record the grooves.)

Also, not being a vinyl expert, I would like to learn more about the different types of stylus and how that affects playback... does anyone have any links?


Optical turntables have major issues with dust and dirt. The records need to be extensively cleaned prior to being played [1], even for unplayed "virgin" vinyl. 78rpm (shellac) records are older than vinyl records, and tend to have more dust / dirt, so would require a lot of cleaning prior to archiving.

Although the more expensive of the optical turntables (the LT-master) has 78rpm capabilities, such capabilities are optional even on this high-end version.

I think it's basically down to costs - the costs of cleaning each record (I know they're cleaned already, but the process may need to be more involved to remove all dust), along with the tens-of-thousands for the player, probably don't make this economical on the scale the Internet Archive are looking for.

[1] https://en.wikipedia.org/wiki/Laser_turntable#Performance


Here's what our digitization expert, George Blood, has to say: LPs are soft and played with a very small diamond. 78s are made from shellac which is hard and abrasive. Originally they were played with steel needles at 10x the tracking weight used by our modern equipment. They also use a much larger diamond with 5-10x the surface area. The tracking force is spread over a larger area than vinyl playback. The contact pressure is less than 10% of vinyl playback.
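For anyone who wants to check the arithmetic in that quote: contact pressure is just tracking force divided by contact area, so the comparison with vinyl reduces to a ratio of ratios. Below is a minimal sketch, not anything the Internet Archive or George Blood published. The function name is made up for illustration, and equal tracking force on both setups is an assumption; only the 5-10x contact area comes from the quote.

```python
def relative_contact_pressure(force_ratio: float, area_ratio: float) -> float:
    """Return p_78 / p_LP, using p = F / A.

    force_ratio: tracking force on the 78 divided by tracking force on the LP.
    area_ratio:  stylus contact area on the 78 divided by that on the LP.
    """
    if area_ratio <= 0:
        raise ValueError("area_ratio must be positive")
    return force_ratio / area_ratio


# Assumption: the same tracking force on both setups (force_ratio = 1.0).
# The 5x and 10x area ratios are the range given in the quote.
for area_ratio in (5.0, 10.0):
    share = relative_contact_pressure(1.0, area_ratio)
    print(f"area x{area_ratio:g}: {share:.0%} of LP contact pressure")
```

With equal force this gives 20% at 5x the area and 10% at 10x, so the quoted "less than 10%" would sit at the top of that area range or imply a lighter relative tracking force. Pass a different force_ratio to see how sensitive the figure is.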


There is such a laser record player, which costs $15k [1] and can only play black vinyl records [2].

[1] https://diffuser.fm/laser-turntable/ [2] https://www.theabsolutesound.com/articles/elp-lt-1lrc-laser-...


I suspect you really in fact do want to record the audio as it was intended to be played back. With the mass of the tone arm, the tracking and such as the tone arm swings closer to the inner radius of the disc, the RIAA equalization....

It seems that if you are strictly extracting the topography of the record, that's good for printing a new one, but wouldn't it require all kinds of simulation to play back accurately?


I just donated to the Internet Archive. I had not before, and they keep doing such great work.


Records from this era (1898-1950s) had at least 2 different groove widths. If the playback stylus is thicker than the groove, the record is damaged. This is what has prevented me from listening to my great-grandparents' collection. Every time I go to purchase 78 RPM styli, I'm dumbfounded at the options and don't know which stylus to use for each record.


Fascinating that we have so much history recorded on what started as insect secretions (shellac).


Very cool. It's the kind of volunteer gig I could see being involved in.

Thinking back on the records I remember looking at in record stores back in the day, and how few of them made it to CD or streaming services, hopefully the 33 1/3 world has a big back-up effort. One huge problem I can see is the collision between archiving/access and copyright.


I'm a little surprised to learn that they use normal styluses instead of reading the grooves optically.


These are not "normal styli", at least not for the old shellac records. It was very educational to watch this Techmoan video - https://youtu.be/AYn7aZZBMRA - and learn that the original shellac records were harder than the steel styli; the styli wore down in one or two playings and so were bought in bulk, and the tracking force was on the order of 100 grams.

Why does this matter? Because it probably keeps down the fine high-frequency crackle from very minor dust and surface imperfections. But I can see that you wouldn't subject a precious antique 78 to that kind of playback these days.


More than that, they use 4 different styluses and let you choose which one you think sounds best for a particular record.

I think it's kind of sad that this is a purely volunteer effort and not an official effort of the Library of Congress.


I'm on the opposite side; I think it's an amazing example of private action for a public service.


I'm annoyed that it is necessary though, as the official institution that is tasked with preserving recorded works is failing to do its job.


I would encourage you to contact the LoC to express your dismay and your desire to see them improve:

https://www.loc.gov/contact/

You will want to address your correspondence to Carla Hayden, the current Librarian of the LoC.


I expect it's better to contact your member of Congress to request a budget increase for the LoC. According to Wikipedia, the LoC has only a $6-$8 million/year budget for digitization.


That is also a necessary effort, and I appreciate you pointing it out. My apologies for not enumerating in my above comment.


I'm not. I think the LoC is a single point of failure for curation. No single institution is going to preserve everything, so having multiple institutions, public and private, is far better than relying on a single public institution to get things right.


Here's what George Blood, the digitizer, reports: "laser turntables are designed to play 33rpm discs. Most will not spin at 78rpm. There is broad agreement they do not sound as good as stylus playback. I.R.E.N.E. (https://en.wikipedia.org/wiki/IRENE_(technology)) captures the disc as an image, then transforms that into sound. It is not way slower than real time, and while it is arguable that the resolution of the image is superior to stylus playback, the software is not yet able to produce results as good as stylus playback. And it introduces its own, idiomatic noise spectrum. On the other hand it is far and away the best system for ‘playing’ damaged or compromised discs, such as lacquers that are beginning to delaminate."


Here's what George Blood, the IA digitization expert, said: "LPs are soft and played with a very small diamond. 78s are made from shellac which is hard and abrasive. Originally they were played with steel needles at 10x the tracking weight used by our modern equipment. They also use a much larger diamond with 5-10x the surface area. The tracking force is spread over a larger area than vinyl playback. The contact pressure is less than 10% of vinyl playback."


Any dirt in the grooves screws up the playback when using a laser. Remember, the pits that are read on a CD/DVD are under a clear coating and cannot get dirt in them.


They do thoroughly clean each record before it is played.


Yes, I saw that. Was replying to the person that questioned why they do not use lasers. I have a record collection and so need to upgrade my kit to that level of cleaning magic ;)


4 styluses... Why not a laser reader[0]?

[0] https://en.wikipedia.org/wiki/Laser_turntable


Ok, I turned to the expert himself, George Blood, who responded: "laser turntables are designed to play 33rpm discs. Most will not spin at 78rpm. There is broad agreement they do not sound as good as stylus playback. I.R.E.N.E. (https://en.wikipedia.org/wiki/IRENE_(technology)) captures the disc as an image, then transforms that into sound. It is not way slower than real time, and while it is arguable that the resolution of the image is superior to stylus playback, the software is not yet able to produce results as good as stylus playback. And it introduces its own, idiomatic noise spectrum. On the other hand it is far and away the best system for ‘playing’ damaged or compromised discs, such as lacquers that are beginning to delaminate."


I work with the Internet Archive, and asked George Blood about the wear and tear using a stylus. Here's what he had to say: "LPs are soft and played with a very small diamond. 78s are made from shellac which is hard and abrasive. Originally they were played with steel needles at 10x the tracking weight used by our modern equipment. They also use a much larger diamond with 5-10x the surface area. The tracking force is spread over a larger area than vinyl playback. The contact pressure is less than 10% of vinyl playback."


Laser readers are worse than using a needle because you can't rely on the laser to push dust and detritus out of the way, nor to ignore smaller faults that would not even register with a stylus, so the recording ends up requiring more noise reduction and attention than one made with bog-standard, low-priced styli.

Previous discussion: https://news.ycombinator.com/item?id=14964384


Are they digitizing cylinders?


I believe so. At least, they accept donations of wax cylinders.

https://help.archive.org/hc/en-us/articles/360017876312-How-...


Does someone here have some stellar techno LPs to upload? Might be the first.



