> A better rule is for one service to own writes for a table, and other services can only read that table,
Been there: how do you handle schema changes?
One of the advantages of having a separate schema per service is that services can communicate only via APIs, which decouples them and allows you to deploy them independently, which is at the heart of microservices (and continuous delivery).
The way I see it today: everyone complains about microservices, 12 factor apps, kubernetes, docker, etc., and I agree they are overengineering for small tools, services, etc., but if done right, they offer an agility that monoliths simply can't provide. And today it's really all about moving fast(er) than yesterday.
For us, we started off with a world where each service communicates with the others only via RabbitMQ, so everything is fully async. So theoretically each service can be down for as long as it likes with no impact on anyone else; then it comes back up, starts processing messages off its queue, and no one is the wiser.
Our data is mostly append-only, or if it's being changed, there is a theoretical final "correct" version that it should converge to. So to "get" data, you subscribe to messages about some state of things, and each service is responsible for managing its own copy in its own db. This worked well enough until it didn't, and we had to start doing true-ups from time to time to keep things in sync, which was annoying but not particularly problematic, since we design on the assumption that everything is async and convergent.
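A toy sketch of the convergent, async model described above: each service keeps its own copy of the data and applies messages off a queue whenever it is up, so catching up after downtime is safe. The queue and message shapes here are invented for illustration (a plain deque standing in for RabbitMQ), not their actual setup.

```python
# Toy sketch: versioned upserts off a queue, converging to a final state.
from collections import deque

queue = deque()  # stand-in for a RabbitMQ queue; messages wait here

def publish(entity_id, version, data):
    queue.append({"id": entity_id, "version": version, "data": data})

def drain(local_db):
    """Process everything on the queue; keep only the newest version."""
    while queue:
        msg = queue.popleft()
        current = local_db.get(msg["id"])
        if current is None or msg["version"] > current["version"]:
            local_db[msg["id"]] = msg  # converge toward the final state

service_b_db = {}
publish("order-1", 1, {"status": "pending"})
drain(service_b_db)

# Service B goes down; messages pile up on its queue...
publish("order-1", 2, {"status": "paid"})
publish("order-1", 3, {"status": "shipped"})

# ...it comes back, drains the queue, and converges on the final state.
drain(service_b_db)
assert service_b_db["order-1"]["data"] == {"status": "shipped"}
```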
The optimization (or compromise) we decided on was that all of our services use the same db cluster, and if the db cluster goes down, everything is down. Therefore, since we can assume the db is always up even when a service is down, we consider it an acceptable constraint to provide a readonly view into other services' dbs. Any writes are still sent async via MQ. This eliminates our sync-drift problem while still allowing for performant joins, which http APIs are bad at and which our system uses a lot.
So then back to your original question: the way that this contract can break is via schema changes. For us, since we use postgres, we created database views that we expose for reading, and postgres view updates are constrained to always be backwards compatible from a schema perspective. So now our migration path is:
- service A has some table of data that you'd like to share
- write a migration to expose a view for service A
- write an update for service B to depend upon that view
- service B now needs some more data in that view
- write a db migration for service A that adds that missing data, but keeping the view fully backwards compatible
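The migration path above can be sketched end to end. This uses sqlite3 as a stand-in for Postgres (in Postgres you'd use `CREATE OR REPLACE VIEW`, which itself enforces that existing columns keep their names and types); the table, view, and column names are invented for illustration.

```python
# Sketch of the view-based migration path, with sqlite3 standing in
# for Postgres. All names (orders, orders_v1, ...) are made up.
import sqlite3

db = sqlite3.connect(":memory:")

# Service A owns the table and exposes only a read-only view of it.
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER, internal_notes TEXT)")
db.execute("INSERT INTO orders (total, internal_notes) VALUES (42, 'private')")
db.execute("CREATE VIEW orders_v1 AS SELECT id, total FROM orders")

# Service B reads only through the view, never the table.
assert db.execute("SELECT id, total FROM orders_v1").fetchall() == [(1, 42)]

# Service B now needs more data: service A's migration adds a column to
# the view while keeping existing columns and their types intact.
db.execute("ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'USD'")
db.executescript("""
    DROP VIEW orders_v1;
    CREATE VIEW orders_v1 AS SELECT id, total, currency FROM orders;
""")

# Old queries from service B still work, unchanged...
assert db.execute("SELECT id, total FROM orders_v1").fetchall() == [(1, 42)]
# ...and service B can be updated to use the new column when it's ready.
assert db.execute("SELECT currency FROM orders_v1").fetchall() == [("USD",)]
```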
> So then back to your original question: the way that this contract can break is via schema changes. For us, since we use postgres, we created database views that we expose for reading, and postgres view updates are constrained to always be backwards compatible from a schema perspective. So now our migration path is:
> - service A has some table of data that you'd like to share
>
> - write a migration to expose a view for service A
>
> - write an update for service B to depend upon that view
>
> - service B now needs some more data in that view
>
> - write a db migration for service A that adds that missing data, but keeping the view fully backwards compatible
I don't think I understand. Do you need to update (and deploy) service B every time you perform a view update (from service A), even though it's backward compatible?
If service B needs some new data that the view isn't providing, then you first run the migration on service A to update that view and add a column. Then you can update service B to utilize that column.
If you don't need the new column, then you don't need to do anything on service B, because you know that existing columns on the view won't get removed and their type won't change. You only need to make changes on service B when you want to take advantage of those additions to the view.
This only works if you apply backward-compatible changes all the time. Sometimes you do want to make incompatible changes in your implementation. Database tables are an implementation detail, not an API, and yet that's exactly what you're exposing as a view.
But hey, every team and company has to find their strategy to do things. If this works for you, that's great!
I would never claim that our setup uses microservices. Probably just more plainly named "services".
And yes, that is correct, we agree that once we expose a view, we won't remove columns or change types of columns. Theoretically we could effectively deprecate a column by having it just return an empty value. Our use cases are such that changes to such views happen at an acceptable rate, and backwards incompatible changes also happen at an acceptable rate.
Our views are also often joins across multiple tables, or computed values, so even if a view is often quite close to the underlying tables, it is intended to be used as an abstraction on top of them. The views are designed first from the perspective of: what form of data do other services need?
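A small sketch of that kind of view: a join plus a computed value, shaped around what consumers need, with a deprecated column kept around as an always-empty value so old readers don't break. Again sqlite3 stands in for Postgres, and all table and column names here are invented.

```python
# Sketch: a consumer-shaped view (join + computed column + deprecated
# column), using sqlite3 as a stand-in for Postgres. Names are made up.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer_id INTEGER,
                           amount_cents INTEGER);
    INSERT INTO customers VALUES (1, 'acme');
    INSERT INTO invoices VALUES (10, 1, 2500), (11, 1, 500);

    -- The exposed shape: one row per customer, totals precomputed,
    -- and legacy_code kept only so existing readers keep working.
    CREATE VIEW customer_billing AS
    SELECT c.id   AS customer_id,
           c.name AS customer_name,
           SUM(i.amount_cents) / 100.0 AS total_billed,  -- computed value
           ''     AS legacy_code                          -- deprecated, always empty
    FROM customers c JOIN invoices i ON i.customer_id = c.id
    GROUP BY c.id, c.name;
""")

rows = db.execute("SELECT * FROM customer_billing").fetchall()
assert rows == [(1, 'acme', 30.0, '')]
```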
Yeah... The issue lies in cases where the decomposition is so extreme that you end up unable to deploy independently.
And even with the benefit of a microservice owning its own relational DB, schema changes still aren't easily reversible. Especially when new, not-pre-defined data has been flowing in.
Stateless microservices are great, in the sense that you don't have to build multiple versions of APIs... but stateful microservices are just a PITA.
No, because that's not the service API. That's just a view over a table - an internal data structure used to represent some business domain model which should be properly exposed through some implementation-agnostic API. Service B should not care how service A implements it.
And the fact that you must keep backward compatibility (see the OP answer above) at the implementation level shows how fragile this approach is - you will never be able to change a database schema as you wish, because you have consumers that rely on the internal details of your implementation - a table. If you want to change a field from char to int, you can't. Why is it important for service B to know that level of detail? An API could still expose the domain model as char if you want to, maybe introducing new fields, new methods, whatever. Or maybe nothing at all - maybe it's not necessary, because the database field is never exposed but only used internally (!!).
On the other hand, if you expose a database agnostic API (e.g., http, rpc, ... whatever) you can even swap the underlying database and nobody will notice.
A good rule of thumb is: if I change the implementation, do I need to ask/tell another team? If the answer is yes, that is not a microservice.
> a view would allow you to change a field's type.
Which proves my point: if I change a field's type (in the table), I will have to change the view's type. I need to change 2 services because one of them changed an implementation detail.
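For what it's worth, here is what the quoted claim looks like in practice: the table's column type changes, but the view keeps exposing the old type via a cast, so the table migration and the view update ship together in service A's migration and readers of the view see no change. sqlite3 stands in for Postgres (where this would be `ALTER TABLE ... ALTER COLUMN ... TYPE` plus a view rebuild); all names are invented.

```python
# Sketch: change a column's type in the table while the view preserves
# the old exposed type via CAST. sqlite3 stands in for Postgres.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, sku TEXT)")  # sku starts as text
db.execute("INSERT INTO products (sku) VALUES ('12345')")
db.execute("CREATE VIEW products_v1 AS SELECT id, sku FROM products")

# Migration: service A decides sku should really be an integer internally.
db.executescript("""
    CREATE TABLE products_new (id INTEGER PRIMARY KEY, sku INTEGER);
    INSERT INTO products_new SELECT id, CAST(sku AS INTEGER) FROM products;
    DROP VIEW products_v1;
    DROP TABLE products;
    ALTER TABLE products_new RENAME TO products;
    -- The view re-exposes sku as text, preserving the old contract.
    CREATE VIEW products_v1 AS SELECT id, CAST(sku AS TEXT) AS sku FROM products;
""")

# The table now stores an integer, but service B still sees a text sku.
assert db.execute("SELECT sku FROM products").fetchall() == [(12345,)]
assert db.execute("SELECT sku FROM products_v1").fetchall() == [('12345',)]
```

Note this cuts both ways, as the comment above says: consumers of the view are unaffected, but the owning service did have to rewrite the view as part of the same migration.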