
This is the first time I've seen a specification 3881 pages long!


Check out the 5252-page "Intel® 64 and IA-32 Architectures Software Developer’s Manual Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" :)

https://www.intel.com/content/www/us/en/developer/articles/t...

Direct link: https://cdrdv2.intel.com/v1/dl/getContent/671200


Or the ~11k-page ARM specification.

That's actually two specs in one; both ARM64 and ARM32 are part of it.


Didn't someone try to calculate how many printed pages some set of browser specifications would be?


There is a 5000-page standard for DOCX that I used for a Word export feature. It was mostly devoid of detail, and I reverse-engineered Word's output files countless times to figure out the actual format. IIRC there was a single 14,000-page PDF.

https://ecma-international.org/publications-and-standards/st...


It's been that way for a while? 4.0 and 5.0 were ~2.8k pages. Even 2.1 was 1420 pages.

https://www.bluetooth.com/specifications/specs/?types=adopte...


All the wireless standards are like that. IEEE 802.11 from 2012 is nearly 2800 pages, and I'm sure the latest version has far exceeded that.

...and the GSM/UMTS/LTE/NR standards are at least an order of magnitude bigger.


If I remember correctly, the entirety of the original GSM spec is ~9000 pages; things just got crazier (by orders of magnitude) from there.

That's comparing apples to oranges, though. Those standards also specify the interaction between network components, not just between your phone and the network.

Mobile phone standards are more like the entire RFC collection than like the 802.11 specifications.


The UEFI specification is also over 2300 pages long now. For comparison, Open Firmware (IEEE 1275) was 268 pages.


Things are far more complicated these days than in the 90s. These specifications still seem to lack important details, which you notice if you try to implement the spec.


It is crazy. It does not make it easy for devs. Imagine this running in a monolithic kernel.


A lot of it is classic mode; the spec has accumulated a lot of cruft over the years.


Wait for the AI-generated version.


Written by AI?

Sizes like that nicely lock newcomers out of the market, as it can't be entered without strong financial backing.


You don't need to implement the full spec. Most devices only support the parts relevant to them. Hardware in general is very expensive, though, so I doubt a very long spec that helps you achieve compatibility with existing devices is the thing holding you back.


Do you really need a database for this? On a Unix system, you should be able to: CRUD users, CRUD files and directories, and grant permissions to files or directories.

Is there some decade-old software that provides a UI or an API wrapper around these features as a "Google Drive" alternative? Maybe over the SMB protocol (Samba)?


How would you implement things like version history or shareable URLs to files without a database?

Another issue would be permissions: if I wanted to restrict access to a file to a subset of users, I’d have to make a group for that subset. Linux supports a maximum of 65536 groups, which could quickly be exhausted for a nontrivial number of users.


As for the permissions, using ACLs would work better here; then you don't need a separate group for every grouping.
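
For example, with the standard Linux ACL tools (setfacl/getfacl from the acl package). A rough sketch in Python, with a made-up path and user:

    import subprocess

    def share(path: str, user: str) -> None:
        # add (or update) a read-only ACL entry for this one user
        subprocess.run(["setfacl", "-m", f"u:{user}:r", path], check=True)

    def unshare(path: str, user: str) -> None:
        # remove that user's ACL entry again
        subprocess.run(["setfacl", "-x", f"u:{user}", path], check=True)

    def who_has_access(path: str) -> str:
        # getfacl prints the owner, group, and every extra ACL entry
        return subprocess.run(["getfacl", path], check=True,
                              capture_output=True, text=True).stdout

    share("/srv/files/report.pdf", "alice")
    print(who_has_access("/srv/files/report.pdf"))

Note there's no group creation anywhere: each grant is a per-user entry on the file itself.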


TIL about ACLs! I think that would nicely solve the group permission issue.


The final project for my senior year filesystems class thirty years ago was to implement ACLs on top of a SunOS 4 filesystem. That was a fun project.


Write up? Code? :D


Then let me also introduce you to extended attributes, aka xattrs. That's how the data for SELinux is stored.
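
A quick taste (Linux only; Python exposes the syscalls directly, and user-namespace attribute names must start with "user."). The file name here is made up:

    import os

    os.setxattr("document.txt", "user.note", b"draft")   # attach metadata
    print(os.listxattr("document.txt"))                  # ['user.note']
    print(os.getxattr("document.txt", "user.note"))      # b'draft'
    os.removexattr("document.txt", "user.note")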


There is no support for writing multiple xattrs in one transaction.

There is no support for writing multiple xattrs and file contents in one transaction.

Journaled filesystems that immediately flush xattrs to the journal do have atomic writes of single xattrs; so you'd need to stuff all the data into one xattr value and serialize/deserialize it (with e.g. JSON, or potentially Arrow IPC with Feather mmap'd from xattrs; edit: but getxattr() doesn't support mmap. And mind the xattr storage limits: ext4: 4K, XFS: 64K, btrfs: 16K).

Atomicity (database systems) https://en.wikipedia.org/wiki/Atomicity_(database_systems)
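
A minimal sketch of that single-xattr workaround (Linux only, Python stdlib; the attribute name and fields are made up). Each write_meta() replaces the whole value in one setxattr() call:

    import json, os

    ATTR = "user.metadata"

    def read_meta(path: str) -> dict:
        try:
            return json.loads(os.getxattr(path, ATTR))
        except OSError:        # attribute not set yet
            return {}

    def write_meta(path: str, meta: dict) -> None:
        blob = json.dumps(meta).encode()
        if len(blob) > 4096:   # stay under the smallest (ext4) limit
            raise ValueError("metadata too large for a single xattr")
        os.setxattr(path, ATTR, blob)

    meta = read_meta("document.txt")
    meta["shared_with"] = ["alice", "bob"]
    meta["revision"] = meta.get("revision", 0) + 1
    write_meta("document.txt", meta)

The read-modify-write sequence itself still isn't transactional, of course; only the final write is.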


Back up files the way Emacs, Vim, etc. do it: a consistent scheme for naming the copies. As for shareable URLs, they could be links.

The file system is already a database.
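
The backup-naming part is a few lines if you copy the Emacs scheme (report.txt -> report.txt.~1~, report.txt.~2~, ...). A sketch; the function name is made up:

    import re, shutil
    from pathlib import Path

    def backup(path: str) -> Path:
        p = Path(path)
        pattern = re.compile(re.escape(p.name) + r"\.~(\d+)~")
        taken = [int(m.group(1)) for f in p.parent.iterdir()
                 if (m := pattern.fullmatch(f.name))]
        dest = p.with_name(f"{p.name}.~{max(taken, default=0) + 1}~")
        shutil.copy2(p, dest)   # copy2 preserves timestamps/permissions
        return dest

    backup("report.txt")   # -> report.txt.~1~ the first time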


OK, this product will be for projects with fewer than 65k users.

For naming, just name the directory the same way on your file system.

Shareable URLs can be a hash of the path with some kind of HMAC to prevent scraping.

Yes, if you move a file, you can create a symlink to preserve it.
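
A sketch of the HMAC idea in Python (the secret and route are placeholders). Signing an expiry timestamp into the token even gets you expiring links without a database:

    import hmac, hashlib, time
    from urllib.parse import quote

    SECRET = b"server-side-secret"   # hypothetical; load from config, not source

    def make_share_url(path: str, ttl: int = 86400) -> str:
        expires = int(time.time()) + ttl
        sig = hmac.new(SECRET, f"{path}|{expires}".encode(),
                       hashlib.sha256).hexdigest()
        return f"/share?path={quote(path, safe='')}&expires={expires}&sig={sig}"

    def verify(path: str, expires: int, sig: str) -> bool:
        expected = hmac.new(SECRET, f"{path}|{expires}".encode(),
                            hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, sig) and time.time() < expires

    print(make_share_url("/docs/report.pdf"))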


Encode paths by algorithm/encryption?


This wouldn’t be robust to moving/renaming files. It also would preclude features like having an expiration date for the URL.


Well, sure, there's a bevy of features you're missing out on, but it would work. An object store plus file metadata solves both of those, though that feels like cheating.


Use a symlink in that case to keep the redirect.


> How would you implement things like version history

Filesystem or LVM snapshots immediately come to mind

> or shareable URLs to files without a database?

Uh... is the path to the file not already a URL? URLs are literally an abstraction over a filesystem hierarchy already.


> Filesystem or LVM snapshots immediately come to mind

I use ZFS snapshots and like them a lot for many reasons. But I don't have any way to quickly see the individual versions of a file without wading through a lot of snapshots where the file is the same, because snapshots are at the filesystem level (or more specifically in ZFS, at the “dataset” level, which is somewhat like a partition).

And also, because I snapshot at set intervals, there might be a version of a file that I wanted to go back to but don't have a snapshot of at that exact moment. So I only have a history of what the file was a bit earlier or a bit later than some specific moment.

I used to have snapshots trigger automatically every 2 minutes and snapshot cleanup trigger hourly, daily, weekly and monthly. In that setup there was a fairly high chance that if I made some mistake editing a file, I also had a version that kept the edits from right before, as long as I discovered the mistake right away.

These days I snapshot automatically a couple of times per day and clean up every few months with a few keystrokes. Mainly because, at the moment, the files I store on the servers don't need such fine-grained snapshots.

Anyway, the point is that even if you snapshot frequently, it's not going to be particularly ergonomic to find the version you want. So maybe the “Google Drive” UI would also have to check each revision to see if it was actually modified and only show the ones that were. And even then it might not be the greatest experience.
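
For what it's worth, that "only show revisions that actually changed" pass is simple to sketch against the hidden .zfs/snapshot directory ZFS exposes per dataset. Paths here are made up, and it assumes snapshot names sort chronologically:

    import hashlib
    from pathlib import Path

    def distinct_versions(dataset_root: str, rel_path: str):
        last = None
        for snap in sorted((Path(dataset_root) / ".zfs" / "snapshot").iterdir()):
            f = snap / rel_path
            if not f.is_file():
                continue                  # file didn't exist in this snapshot
            digest = hashlib.sha256(f.read_bytes()).hexdigest()
            if digest != last:            # skip snapshots where it's unchanged
                yield snap.name, digest
                last = digest

    for name, digest in distinct_versions("/tank/files", "notes/todo.txt"):
        print(name, digest[:12])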


If you are on Windows with a Samba share hooked up to ZFS, you can actually use "Previous Versions" in File Explorer for a given folder and your snapshots will show up :) There are some guides online on setting it up.


Take a look at "Cockpit", because if such a thing existed, that's where it "should" be.

https://cockpit-project.org/applications

--

    With no command line use needed, you can:

    Navigate the entire filesystem,
    Create, delete, and rename files,
    Edit file contents,
    Edit file ownership and permissions,
    Create symbolic links to files and directories,
    Reorganize files through cut, copy, and paste,
    Upload files by dragging and dropping,
    Download files and directories.


> Do you really need a database for this?

I have no idea how this project was designed, but a) it's to be expected that disk operations can and should be cached, and b) syncing file shares across multiple nodes can easily involve storing metadata.

In either case, once you realize you need to persist data, you'd be hard-pressed to justify not using a database.


I don't know of one. I've thought about this before, though with Python and fsspec. Having a Google Drive-style interface that can run on local files, or on any filesystem of your choice (SSH, S3, etc.), would be really great.


I'm unironically convinced that a basic Samba share with Active Directory ACLs is probably the best possible storage system... but the UI for managing permissions sucks, and most people don't have enough access to set it up the way they want.

Broadly: for all the configuration HashiCorp Vault makes you do, you can achieve a much more useful set of permissions with a Samba file share and ACLs (certainly it makes it easy to grant targeted access to specific resources, and with IIS and Kerberos you even have an HTTP API).


Perhaps they are using MongoDB GridFS instead of storing files on disk.


I need to point out that the time when a service's tenant — be it for files, email, or whatever else — automatically meant an OS user account for that user was also decades ago.


You expose Samba shares outside your home network?


I do, password-protected of course. It is the only "native" way I found to access files on my server from my iPhone (via the Files app) without downloading a third-party app.


I really hope you lock it down to something like Tailscale so that you have a private area network and your Samba share isn’t open to the entire world.

Samba is a complicated piece of software built around protocols from the '90s. It’s designed around the old idea of physical network security, where it’s isolated on a LAN, and it has a long, long history of serious critical security vulnerabilities (e.g. here’s an RCE from this month: https://cybersecuritynews.com/critical-samba-rce-vulnerabili...).


It seems like every network filesystem is irredeemably terrible. SMB and NFS are the stuff of security nightmares, chatty performance issues, and awkward user ID mapping. WebDAV is a joke. SSHFS is slow. You can get really crazy with CephFS or GlusterFS, and for all that complexity, you don't get much farther away from the SMB/NFS issues either.

My solution: Share nothing and use rsync.


Well, one problem is that the filesystem in general is a terrible abstraction, both in terms of usability and in terms of not fitting well with how you design network applications.

I’d say Dropbox et al. are closer to a good design, but their backend is insanely optimized to make it work, and proprietary. There’s an added challenge that everything these days is behind a NAT, so you usually end up needing a central rendezvous server where nodes can find each other.

Since you’re looking at rsync but want something closer to Dropbox, I’d say look at Syncthing. It’s designed in a way that makes personal file sharing secure.


I think you should figure out how to quit while you're ahead. I wouldn't expose Samba to most of the devices on my LAN, never mind the internet.


Search for WannaCry. You may rethink your setup.


... well, it makes sense to be able to do a "join" between the `users` and `documents` collections and use the full expressive range of an aggregation pipeline (and it's easy to add additional indices to MongoDB collections, have transactions, and even add replication - not easy with a generic filesystem),

put all kinds of versioned metadata on docs without coming up with strange encodings, and even though POSIX (and Node.js) offers a lot of FS-related features, it probably makes sense to keep things reeeeally simple,

and it's easy to hack on this even on Windows.


An SCP or FTP client maybe?


Definitely. Though Samba supports authentication natively; with SCP and SFTP you'll need another admin server to create users.


With SAMBA you just get boring old authentication, but with SCP you need to file a Form-72B with Site Command, ensure all new users pass a Class-3 memetic hazard screening, and then hope that the account doesn't escape containment and start replicating across subnets.

Sure, it's more overhead, but you can't put a price on preventing your NAS from developing sentience.


Can you name a single Google Drive clone that doesn’t use a database?

Would love to see your source code for your take on this product.


The Synology Drive version mirrors the filesystem, though I’m sure it has a database for sharing metadata. Is that what they mean?


I would say that basically all these software options use a database for things like preferences and user management.

Using a database isn’t some kind of heavy-handed horrendous thing depending on the implementation (e.g., as long as it leaves your content files alone).


Nextcloud too.

There is a database in most if not all useful cases, but there could also be the actual files separately.


I would rather have more integrations with third parties than this.

My main problem is that I have to put in a lot of effort to not use Gmail for my business, because most third parties (like CRMs) work only with Gmail, or work better with it.

Fastmail team, how about a Gmail-compatible API?


Do you mean accessing Fastmail as a client for your Gmail-hosted inbox? How about using the fetch feature, which moves all emails to your Fastmail inbox, and the 'send as' feature, where you can send emails out through your Gmail SMTP via Fastmail? Those two features already exist.


Although when setting this up, Gmail really makes you feel like they might pull support at any minute.

I wonder if Fastmail could log you in to Gmail in a way that's consistent with Google's security model, similar to how you can "log in with Google" on many services. I'd much prefer it over the app password thing.


Now that you mention it, I remember that's how Fastmail works today. They use app passwords for non-Google-hosted email inboxes, and 'login with Google' for Google-hosted inboxes.


Not exactly. I use Fastmail to avoid using Google.

However, there are tons of services that require Google (or Outlook) to work with your emails.


What a bummer that the website https://www.calligraphr.com uses a subscription model. I could impulsively pay $100 to get my handwriting as a TTF font and be quite happy about it.


TFA goes into this in some depth: there's an option to subscribe for one month with a one-time payment. After the month is up, your account automatically reverts to the free plan and you get an email with your fonts attached.


The subscription is only for backups and ongoing changes - you get to keep your font forever. I think the author mentions that the whole experience cost them about $10.


Is there a way to efficiently use Lit without using a bundler?


Lit's just a JavaScript library published as standard modules, so it doesn't require a bundler. Using Lit efficiently is the same as using any other library.

HTTP/3, import maps, and HTML preloads can make unbundled app deployment almost reasonable in some cases. You'd still miss out on the minification and better compression from a bundle. Shared compression dictionaries will make this better, though.


Nice, thank you for the insight.

I can tell you that the on-premises deployments will be on AWS accounts; we can manage the resources ourselves.

We have a few fanouts that can be refactored. So Redis/Valkey in place of SQS is OK; hopefully it can also cover our SNS use case.

I am afraid Kubernetes is overkill for our Lambda needs.

If we manage to bundle our whole app on one server and have only 1-2 on-premise clients, do you still suggest Kubernetes, or is a simpler rsync to all servers enough?

Also, should we have a separate database instance for each client, or a sharded Postgres cluster? The latter seems more manageable.


> If we manage to bundle our whole app on one server and have only 1-2 on-premise clients, do you still suggest Kubernetes, or is a simpler rsync to all servers enough?

rsync is nice and simple. Personally I'd say at least use Ansible, with its built-in rsync support. Then you can do more than just copy files.

> Also, should we have a separate database instance for each client, or a sharded Postgres cluster? The latter seems more manageable.

Depends on the size. Run Postgres separately for your on-prem clients. For your cloud clients, I'd say keep them on the same server until you start to get over 100GB-1TB of data, then start to think about sharding. RDS gets super expensive, so sharding too early may be uneconomical.

> I am afraid Kubernetes is overkill for our lambda needs.

For just Lambda, I agree. But if you're running everything outside of AWS (i.e. racking servers), then it shines, because then you have your app, Postgres, Valkey, etc., all balanced between servers.


Thank you again, this is gold. We'll do a POC with 1-2 services and see how it goes.


You're welcome! Always happy to give advice. Let's have a call next time. I'm adam@ (company domain in bio)


Thank you for your input.

All your points are valid and can be included in the contract:

- We'll be the ones to choose the cloud provider,
- We'll take servers that are big enough,
- The client should upgrade their server according to our specs,
- Upgrades are mandatory.

Any idea what other (simpler) recovery plans we could have besides on-premise?


You could demonstrate the app working on a different cloud platform. If you can make it work on Azure, for example, that could satisfy them that you have a backup plan.


Easy way to do this: search for the word "unsubscribe" in your email and delete everything that matches.

I did this 4 years ago on my personal email address, and I have never had to recover any of those emails.


Thanks to LLMs, many of the spam messages I receive use synonyms for "unsubscribe" instead of the magic word itself. I once talked to a B2B outreach company, and they touted the fact that they basically rewrite all of their emails in minor ways to evade spam filters. They pitched it as "personalization", but in reality it was just spam-filter dodging.


I would say the "unsubscribe" rule still catches about 80-90% of the spam for me. (I thought the US had a law that any promotional email must include an "unsubscribe" link.)

Then my "uninteresting sender" rules catch most of the remaining spam/uninteresting emails. These are accounts like "noreply" that automated emails often come from.

I had to set up a very special rule for a single company because they successfully dodged my other filters but always started with "Because you're a valued Vanguard client, we thought you'd be interested in this information."

More details: https://blog.leftium.com/2023/11/automatic-inbox-cleanup-wit...


> I thought the US had a law that any promotional email must include a link with "unsubscribe"

I think they need to offer a way to unsubscribe, but it doesn't have to be a link. So the circumlocution suffices, or so it seems.


This x100. I move it all to a folder called “Unsubscribe” and go through and unsubscribe from everything once in a while.

You can also make it a bit smarter by searching for the "List-Unsubscribe" header instead. Fewer false positives when someone forwards you an email that happens to contain the word unsubscribe.
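
If your provider only gives you IMAP, the same trick works with a stock RFC 3501 HEADER search (an empty search string matches any message that has the header at all). A sketch; the server, login, and folder name are made up, and the "Unsubscribe" folder must already exist:

    import imaplib

    M = imaplib.IMAP4_SSL("imap.example.com")
    M.login("me@example.com", "app-password")
    M.select("INBOX")

    # every message carrying a List-Unsubscribe header
    _, data = M.search(None, "HEADER", "List-Unsubscribe", '""')
    for num in data[0].split():
        M.copy(num.decode(), "Unsubscribe")             # file it away
        M.store(num.decode(), "+FLAGS", r"(\Deleted)")  # and drop it from INBOX

    M.expunge()
    M.logout()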


You have to be a little careful about this. That works for semi-ethical marketing departments, but for actual spammers it can send a signal that there's a warm body behind the email address, making it far more valuable and more likely to receive even more spam in the future.


I made a filter so the "unsubscribe" emails never reach my inbox (and thus never trigger a new mail notification)

The emails are not deleted; they are labeled and skip the inbox.

https://blog.leftium.com/2023/11/automatic-inbox-cleanup-wit...


A few years ago, I did the same and started unsubscribing from newsletters as soon as they arrived. Now I keep only emails in my inbox that require action - everything else I archive or delete.


Despite being terrible, Yahoo Mail has a bulk "delete all from sender and block" button that's way more convenient than Gmail's. I found this out when helping an elderly family member with 200K unread spam emails. She's blocked thousands of addresses on her own now.


I unsubscribe a lot every few years, tbh, but deleting might work better for the scam emails that mention unsubscribe just to appear legit.


By "unsubscribe" you mean "mark as spam", right? Unless you actually manually subscribed to the lists of course.


If you expand “More” in Gmail, there’s a Manage Subscriptions view that shows a list of senders along with unsubscribe buttons.


Is this kind of rack tech used at an industrial level?


If you are interested in the topic, check out the autobiography of the biggest manufacturer of LSD in history: "The Rose of Paracelsus" by William Leonard Pickard [1].

It is both poetic and fascinating. It's not an easy read but I recommend it.

[1] https://www.goodreads.com/book/show/28930020-the-rose-of-par...


There is a story in Dr. James Ketchum's memoir of a barrel containing 40 lbs of LSD turning up in his offices. He worked at the Edgewood Arsenal as part of the US military. This was enough LSD to make several hundred million people trip, and worth nearly $1 billion at street value. Are you suggesting that Pickard manufactured more than that?


Easily. The DEA alleged he made 2 lbs every 5 weeks, from the late '80s to the late '90s. They also seized 'up to' 80 lbs.


Where's the demand for all that LSD? I can't imagine more than 30 million people ever having done it, and probably fewer than 2 million habitual users in the US.


LSD production was highly concentrated; the DEA claimed that after Pickard's arrest, availability dropped by 99.5% and took quite a while to recover. His supply was basically the entirety of demand at the time.


Street prices went from $1/dose to well over $10, basically overnight.


You can increase the amount you take over time, and obviously lots of people take it more than once in their life without being habitual users.

But maybe the guy was a bit weird and produced more than was actually needed.


Jam band tours


I'm not sure there are habitual users in the way there are habitual users of tobacco or alcohol.


In Australia, during university, around the second summer of love, we ended up doing acid at least once a week (mostly weekends) for nearly two years.

Probably about half the time in the city at clubs and parties, and half the time out in the forests and on remote beaches.

Strawberry double dips were prevalent, along with a few other designs, microdots, etc.

I'm pretty sure it gave me a perspective still today, that I would not otherwise have.

I don't think it caused any "damage". I still graduated with one of the hardest 4-year degrees and went on to use the degree in a career. I don't have any regrets about any of it or the events that unfolded.

There were, however, many, many notable incidents and events, most of them more than a little bit funny, even today.

I don't really remember exactly how or why it ended; it was like organic decay. I think it just ran its course with the availability of LSD and us having the time and places to do it, and eventually life just took over.

But, definitely, there were some regular repeat customers.


