It's not about suing, but defining expectations about how you can rely on a service.
For example, my team has people across the world for HW bringup, so we can't allow our code hosting or CI to be down for more than a few hours. Of course, backups have different uptime requirements, but as with everything, it's a tradeoff between features, of which an SLA is one.
Tarsnap's features are granularity of cost, reliability of storage, and encryption, but not 99.999% uptime.
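To put that 99.999% figure in perspective, here is a rough sketch of how an uptime percentage translates into a yearly downtime budget. The function name and the 365-day year are assumptions for illustration, not taken from Tarsnap or any particular SLA.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Hours per year a service may be down while still meeting the target."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


for target in (99.0, 99.9, 99.999):
    minutes = allowed_downtime_hours(target) * 60
    print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Five nines works out to roughly five minutes a year, whereas "down for no more than a few hours" is closer to a 99.9% target.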
> It's not about suing, but defining expectations about how you can rely on a service.
Meeeeh, my ISP cut off 100+ fiber connections in my town and spent three weeks fixing it. My neighbor has a business line, and there's an SLA on those that, among other things, requires them to reestablish his connection within 3-5 hours. It took them over 500 hours, so that SLA is useless for anything but forcing compensation.
The problem is that the SLA should give an indication of available resources, but in reality it's mostly a contractual thing for most companies: they'll pay the "fine" or refund a customer if they fail to hit their SLA, and that's about it. Tarsnap most likely has better availability than many midsize competitors simply because it's just one person who really cares about it. Doesn't help if he's hit by a bus though.
SLAs can be meaningless like that. However, the better ISPs have a backup system in place that doesn't use the same fiber/wires. Sure, the backup might be a radio or satellite feed and so be slower, but it will get/keep you online. This costs a lot more per month though, so if you are not paying for that service, your SLA will probably just be "we give you a free month" (which hurts them enough that they will do some things to prevent downtime, but not enough that they put redundant fiber paths in the ground).
A company could... if you have N users, you pay M for storage per user, and downtime costs you X, then it could be that a discount of Y means (M - Y) * N = X.
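As a rough sketch of that relation, solving (M - Y) * N = X for Y gives Y = M - X / N. The function name and the numbers below are made up purely for illustration.

```python
def break_even_discount(n_users: int, price_per_user: float, downtime_cost: float) -> float:
    """Solve (M - Y) * N = X for Y, the per-user discount."""
    if n_users <= 0:
        raise ValueError("n_users must be positive")
    return price_per_user - downtime_cost / n_users


# Hypothetical figures: N = 1000 users, M = 10.0 per user, X = 4000.0
y = break_even_discount(n_users=1000, price_per_user=10.0, downtime_cost=4000.0)
print(f"per-user discount Y = {y:.2f}")
```

With a fixed X, the per-user term X / N shrinks as N grows, so the same X is spread over more users.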
Agree. You get a discount if something breaks. But an SLA really only works for larger services where the cost of fixing something is small when compared to the discounts.
I know it's a joke, but I think if an SLA involved putting a CTO in stocks and throwing eggs at him then that'd encourage me to sign up for the service. Especially if the video of it were posted after every incident.
Instead we get refunded some pitiful amount when our business is seriously disrupted for an extended period of time.