While S3 might be "extremely durable" from a technology standpoint, its unceremonious dumping of Wikileaks as a customer shows it to be politically fragile.
You might think of Wikileaks as "extreme," but this is an organization that was neither convicted of nor even charged with breaking any laws, and Amazon dumped it as a customer on very vague TOS grounds following pressure from Sen. Joe Lieberman.
This could be an issue for reasons like...
- You make a Web app that Hollywood deems to somehow encourage or abet piracy
- You provide a service used by a customer deemed to be politically controversial
- You facilitate financial transactions deemed to be potentially helpful to "terrorists" or the wrong sort of activists (e.g. Wikileaks).
Werner, since you submitted this entry from your personal blog, maybe you could clarify what safeguards Amazon has put in place to prevent a repeat of the Wikileaks situation. Many companies will stand behind a customer barring a court order, but for Amazon this clearly is not the case. How do you decide when to abandon a customer?
I think you nailed the TINY niche of things that Amazon could take issue with and pull your site down over. So now back to the 99.9999999% of content producers on the internet: S3 is "extremely durable".
I've heard of other (large) providers pulling down sites that contained "objectionable" content. Wikileaks just had the media's attention at the time.
At least one customer has had data pulled without a court order. So for your 99.9999999% estimate to hold, Amazon would need at least 1B S3 customers, which is way too optimistic.
Are those really issues for a static site? (Heck, would any of your examples be a static site in the first place?) And if a static site is dumped by Amazon S3, is that really so bad? You're an rsync and a DNS edit away from the site being back up - that's the beauty of a static site: it's just files in directories.
If you go to the bottom of the page at http://allthingsdistributed.com/, you'll see that the blog actually requires "Movable Type Pro".
AWS doesn't provide a way to serve from S3 without the help of a CNAME redirect, which means that you're out of luck if you want to use the Jekyll+S3 setup with a naked domain name (naked, as in no "www" or "blog" subdomain). It also means that you're going to have to get some other server (Google Apps can do it) to redirect your *.domain.com queries to www.domain.com. And then your users' DNS is running all over the place, incurring, in my opinion, unneeded delay.
Actually that was my fault. I had just switched off the redirect of allthingsdistributed.com to www.allthingsdistributed.com and as such you ended up at the old MT installation. That is now corrected.
You are correct; to map to an S3 bucket you need a CNAME. But DNS doesn't allow the apex to be a CNAME, so you will need to redirect that. Route 53 solves that for EC2 with the help of ELB, but there is no such solution for S3 (yet).
I am using the www subdomain as much as possible, so the redirect only happens if a visitor actually types in the apex name, in all other cases they will get where they need to be directly. But I agree that it would be better to solve this at a different level.
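In zone-file terms, the constraint looks roughly like this (domain, bucket name, region endpoint, and redirect host are all placeholders):

```
; A subdomain can be a CNAME to the S3 website endpoint
; (the bucket name must match the hostname):
www.example.org.  IN  CNAME  www.example.org.s3-website-us-east-1.amazonaws.com.

; The zone apex must carry SOA/NS records, so it cannot be a CNAME.
; It needs an A record for some host that answers HTTP requests to
; example.org with a redirect to www.example.org:
example.org.      IN  A      192.0.2.10
```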
http://www.gwern.net is built on Hakyll as well; I'm currently hosting the static files on NFSN, but they're noticeably more expensive than Amazon S3 and I've been thinking of doing the same thing. What did you have to do to get S3 working?
I'm also using Hakyll for my site (http://www.wunki.org). Used Jekyll before, but I didn't think the markdown libraries were up to par with Pandoc. You can browse the source code of my site here: https://github.com/wunki/www.wunki.org.
I also threw together a script (./publish) that first gzips the static files and then uploads them to S3 with the correct headers (gzip and cache-control). Finally, it invalidates the old files on CloudFront. Combined, I get a very fast site while keeping the cost low. Again, you can find it all in the GitHub repository.
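The flow described (gzip, upload with headers, invalidate CloudFront) can be sketched like this, assuming the AWS CLI; the bucket and distribution id are placeholders, and the actual ./publish script in the repository may use different tooling:

```shell
# Sketch of a gzip-then-upload publish step. Bucket and distribution id
# are hypothetical; the AWS CLI commands are echoed rather than executed
# so the sketch is safe to run -- drop the echos to actually deploy.
set -e

SITE_DIR=_site
BUCKET="s3://www.example.org"   # hypothetical bucket
DIST_ID="E123EXAMPLE"           # hypothetical CloudFront distribution

# Demo input so the sketch is self-contained.
mkdir -p "$SITE_DIR"
echo '<html><body>hello</body></html>' > "$SITE_DIR/index.html"

# Pre-compress text assets in place: S3 serves objects verbatim, so
# compression has to happen before upload, and Content-Encoding must be
# set as object metadata at upload time.
find "$SITE_DIR" -name '*.html' -o -name '*.css' -o -name '*.js' |
while read -r f; do
  gzip -9 -c "$f" > "$f.gz" && mv "$f.gz" "$f"   # keep the original URL
done

# Upload with the gzip and cache headers, then invalidate the old copies.
echo aws s3 sync "$SITE_DIR" "$BUCKET" \
    --content-encoding gzip --cache-control "max-age=300"
echo aws cloudfront create-invalidation --distribution-id "$DIST_ID" --paths "/*"
```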
Aha! That's how he did it. I couldn't figure out how commenting worked. He does have a comment count link at the top of the page. Is that fragment from Disqus also?
Any open source alternatives to Disqus that I could host myself? I don't want to use Disqus - apart from wanting to "own" my comments, it is also incompatible with my browser configuration (blocking 3rd party cookies).
It doesn't sound like he coded all that much from the post. He even mentions that Cactus is a little bit too much work since there's not much of an existing community surrounding it. I was just surprised that it was the CTO of Amazon after I read the post.
Actually I did do some coding for the conversion, etc. :-) But when doing something new, I like to be able to look at how other people solved similar problems. And it is a bit early for Cactus in that respect, and Liquid feels much simpler than Django templates.
The extension and plugin mechanisms will make it easier for me to start adding my own code without having to modify the core framework. But it is always more fun to add these kinds of things if there is a community to give you feedback.
This is more about the S3 part than the jekyll one.
But yes, I guess there is some movement back in time. I'd say the two main reasons are a return to more minimal blogging (i.e. something like tumblr as opposed to blogrolls, widgets, plugins etc.), and the fact that you can do some dynamic stuff in the client now via JavaScript (e.g. comments with Disqus).
I started using a method really similar to this to host a blog a few months ago, shortly after the S3 static website feature was released. However, shortly after a post ended up on the front page of hacker news, requests to anything on the S3 bucket started responding with 503 errors.
Not entirely sure what the issue was, since I use S3 to host static assets for other sites that see similar traffic levels and haven't gotten any 503 errors. And clearly ATD seems to be handling the HN traffic just fine.
It's an interesting use case. However, if you have almost-static content you can also make use of heavy caching: a low-end dynamic site with a powerful caching/delivery layer. I guess this is also doable with Amazon CloudFront. It is all about how comfortable it is to update your site.
This is awesome. I was going to put my blog onto GitHub (you know, being a hacker it just makes sense since I already pay for it anyway), but it is intriguing to be able to put it on S3, especially with CloudFront.