
The reason is that you don't have to deal with it. Amazon isn't magic and they use the same hardware available to everyone else. Sure, there's some scale involved when it comes to labor and power and bandwidth, but they can't undercut what you can do yourself.

However, spraying files everywhere is a pain! MogileFS makes it a lot better, but you're still in charge of monitoring it and making sure it's healthy. With only two boxes, you have to be always on call so that you can order another box from your provider fast.

Plus, there's the issue of multiple data centers. S3 doesn't just make redundant copies of your data. It makes copies across data centers. So, you're paying $0.10/GB for data in, but you don't pay anything extra when it replicates those copies to several data centers.

You also have to realize that you have to pay for excess capacity anytime you're doing your own storage system. If you like to keep a 50% buffer (a reasonable size), you're going to be paying 1.5x the base cost of $0.10/GB that you've come up with.
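To make the buffer arithmetic concrete, here's a quick sketch. The function name and the idea of expressing the buffer as a fraction of used capacity are my own framing for illustration; the $0.10/GB and 50% figures are the ones from this thread.

```python
def cost_per_used_gb(base_cost_per_gb: float, buffer_fraction: float) -> float:
    """Cost per GB actually in use when spare capacity is kept on top.

    buffer_fraction is spare capacity relative to used capacity
    (0.5 means 50% headroom). Hypothetical helper, for illustration only.
    """
    if buffer_fraction < 0:
        raise ValueError("buffer_fraction must be non-negative")
    return base_cost_per_gb * (1.0 + buffer_fraction)


base = 0.10    # $/GB, the self-hosted estimate discussed above
buffer = 0.50  # 50% headroom
print(f"${cost_per_used_gb(base, buffer):.2f}/GB")  # prints "$0.15/GB"
```

So before labor, monitoring, or a second data center enter the picture, the headroom alone moves the self-hosted number from $0.10/GB to $0.15/GB.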

And then there's the issue of monitoring it and making sure that, if you see a spike in storage usage, you can add drives fast enough...

You pay for a bit of convenience with S3. I'm not going to argue that it's cheaper, but it's definitely a lot less headache. Are you going to colo several boxes in different data centers, constantly monitor the storage, make sure that they serve the files properly, make sure that more copies get replicated if one server dies, replace drives as they fail, and add more servers as needed?

If you're on a large scale, I'd say you should do your own storage because you can justify making that someone's job (or a large enough portion of their job). I'm not sure I agree with SmugMug using S3, but I'm not sure I disagree either - it allows them to concentrate on what they want to do. Remember, for every tech person on HN, there are 100 who will say they're doing backups and aren't (ok, maybe not true, but you have to find an employee to manage your storage whom you trust as much as Amazon).

However, most people don't have that much to store. If you're storing 100GB of data, you'd then be paying for multiple servers all with RAID and managing MogileFS or the like for what? 20% savings? $150/year? I'm as cheap as the next person, but I also like sleep. I don't want a pager calling me telling me that one of my two file stores is down and that I need to provision and configure a new box at 2am. And do you want to focus your time on creating a compelling product that your customers think is awesome or do you want to spend your time creating an awesome file store that works really well? Life has tradeoffs. You're not wrong, but I don't see Amazon as ripping people off with their pricing and I don't mind someone profiting from giving me a hassle-free, no-lock-in solution.

EDIT: I personally think your estimate for buying boxes and colo'ing them is a tad low, so my 20% might be your 50%, and by your numbers it might make more sense. Maybe I've just seen crappy colo offers. Link if you know good ones! I love being proved wrong.



+1 for "I'm as cheap as the next person, but I also like sleep."

Pretty much sums it up.


Thank you for such a detailed and reasoned answer. I'm learning a hell of a lot from this thread already.


Well, I think what it really comes down to is what your business is. If your business is storage, get good at doing your own storage. If your business is web applications, get good at web applications. Then there are companies on the cusp. Flickr is clearly storage heavy enough that they need to be a storage company. 37signals, on the other hand, stores attachments and some pictures, but their primary function isn't storage - it's the interaction with that stored content.

Is storage so central to your business that you're willing to put a lot of labor behind solving it? Or is storage important to your business, but as long as you get something reliable it doesn't have to be the most efficient possible because it's a small part of your business relative to the other things you do (like HTML, CSS, Ruby/Python/Perl/et al., MySQL/PostgreSQL)? Your time might be better spent on other company work than on storage.



