Modules, not microservices (newardassociates.com)
1073 points by PretzelFisch on Jan 3, 2023 | 660 comments



Microservices, while often sold as solving a technical problem, usually solve a human problem: scaling up an organization.

There are two technical problems that microservices purport to solve: modularization (separation of concerns, hiding implementation, documented interfaces and all that good stuff) and scalability (being able to increase the amount of compute, memory and IO available to the specific modules that need it).

The first problem, modularization, can be solved at the language level. Modules can do that job, and that's the point of this blog post.

The second problem, scalability, is harder to solve at the language level in most languages outside those designed to run in a distributed environment. But most people need it a lot less than they think. Normally the database is your bottleneck; if you keep your application servers stateless, you can just run lots of them. The database can eventually become a bottleneck too, but you can scale databases a long way.

The real reason that microservices may make sense is that they keep people honest around module boundaries. They make it much harder to retain access to persistent in-memory state, harder to navigate object graphs to take dependencies on things they shouldn't, and harder to create PRs with complex changes on either side of a module boundary without a conversation about designing for change and future-proofing. Code ownership by teams is something you need as an organization scales, if only to reduce the amount of context switching developers have to do when they are treated as fully fungible; owning a service is more defensible than owning a module, since the team then owns release schedules and quality gating.

I'm not so positive on every microservice maintaining its own copy of state, potentially with its own separate data store. I think that usually adds more ongoing complexity in synchronization than it saves by isolating schemas. A better rule is that one service owns writes for a table, and other services may only read that table, and even then perhaps not all columns or all of the non-owned tables. Problems with state synchronization are among the most common failure modes in distributed applications: queues get backed up, retries of "bad" events cause blockages, and so on.


I just want to point out that for the second problem (scalability of CPU/memory/IO), microservices almost always make things worse. Making an RPC necessarily implies serialization and deserialization of data, and nearly always also means sending data over a socket. On top of that, each service carries a constant footprint overhead for the RPC machinery and the other things (health checking, stats collection, etc.) that are typically bundled into every service. And even if the extra CPU/memory isn't a big deal for you, doing RPCs is going to add latency, and if you get too many microservices the latency numbers can really start adding up and be very difficult to fix later.

Running code in a single process is MUCH lower overhead because you don't need to transit a network layer and you're generally just passing pointers to data around rather than serializing/deserializing it.
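
As a rough illustration of that overhead (a minimal Go sketch with invented names, not a benchmark of anyone's stack): the in-process path is a plain function call on data the caller already holds, while the RPC path pays for encoding, a socket round trip, and decoding on every call.

  package main

  import (
      "bytes"
      "encoding/json"
      "fmt"
      "net/http"
  )

  type Order struct {
      ID    string  `json:"id"`
      Total float64 `json:"total"`
  }

  // In-process: the "module" is just a function; the caller passes a pointer
  // and nothing is copied or encoded.
  func applyDiscount(o *Order) {
      o.Total *= 0.9
  }

  // Over the wire: the same logic behind an HTTP endpoint forces a
  // serialize -> socket -> deserialize round trip on every call.
  func applyDiscountRPC(url string, o *Order) error {
      body, err := json.Marshal(o) // serialize
      if err != nil {
          return err
      }
      resp, err := http.Post(url, "application/json", bytes.NewReader(body)) // network hop
      if err != nil {
          return err
      }
      defer resp.Body.Close()
      return json.NewDecoder(resp.Body).Decode(o) // deserialize
  }

  func main() {
      o := &Order{ID: "42", Total: 100}
      applyDiscount(o) // nanoseconds, nothing copied or encoded
      fmt.Println(o.Total)
      // applyDiscountRPC("http://discount-svc/apply", o) // hypothetical service:
      // same work plus encoding and a network round trip on every call
  }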

There are definitely some cases where using microservices does make things more CPU/memory efficient, but it's much rarer than people think. An example where you'd actually get efficiency would be something like a geofence service (imagine Uber, Doordash, etc.) where the geofence definitions are probably large and have to be stored in memory. Depending on how often geofence queries happen, it might be more efficient to have a small number of geofence service instances with the geofence definitions loaded in memory rather than having this logic as a module that many workers need to have loaded. But again, cases like this are much less common than the cases where services just lead to massive bloat.

I was working at Uber when they started transitioning from monolith to microservices, and pretty much universally, splitting logic into microservices required provisioning many more servers and was disastrous for end-to-end latency.


I was working at Amazon when they started transitioning from monolith to microservices, and the big win there was locality of data and caching.

All of the catalog data was moved to a service which only served catalog data so its cache was optimized for catalog data and the load balancers in front of it could optimize across that cache with consistent hashing.

This was different from the front end web tier which used consistent hashing to pin customers to individual webservers.

For stuff like order history or customer data, those services sat in front of their respective databases and provided consistent hashing along with availability (providing a consistent write-through cache in front of the SQL databases used at the time).
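
For anyone who hasn't seen the technique: consistent hashing maps both cache nodes and keys onto the same hash ring, so a given key always lands on the same node (which keeps that node's cache warm for it), and adding or removing a node only remaps a small slice of the keys. A minimal sketch in Go, not Amazon's implementation:

  package main

  import (
      "fmt"
      "hash/fnv"
      "sort"
  )

  // Ring is a minimal consistent-hash ring: nodes and keys hash onto the same
  // 32-bit circle, and a key is served by the first node at or after its hash.
  type Ring struct {
      points []uint32
      nodes  map[uint32]string
  }

  func hashOf(s string) uint32 {
      h := fnv.New32a()
      h.Write([]byte(s))
      return h.Sum32()
  }

  func NewRing(nodes []string, replicas int) *Ring {
      r := &Ring{nodes: map[uint32]string{}}
      for _, n := range nodes {
          for i := 0; i < replicas; i++ { // virtual nodes smooth the distribution
              p := hashOf(fmt.Sprintf("%s#%d", n, i))
              r.points = append(r.points, p)
              r.nodes[p] = n
          }
      }
      sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
      return r
  }

  // Lookup returns the node responsible for key; the same key always maps to
  // the same node, which is what keeps that node's cache warm for it.
  func (r *Ring) Lookup(key string) string {
      h := hashOf(key)
      i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
      if i == len(r.points) {
          i = 0 // wrap around the ring
      }
      return r.nodes[r.points[i]]
  }

  func main() {
      ring := NewRing([]string{"cache-a", "cache-b", "cache-c"}, 100)
      fmt.Println(ring.Lookup("item:B000123"))  // always lands on the same cache node
      fmt.Println(ring.Lookup("customer:789")) // likely a different node, also stable
  }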

I wouldn't call the areas where it makes things more efficient rare; they're actually common, and it comes from letting your data dictate your microservices rather than letting your organization dictate your microservices (although if teams own the data, then they should line up).


I think Amazon (and Google and other FAANG entities) operate at a different scale than 99.9% of the rest of the world. The problems they face are different, including the impact of data locality.

I've seen many systems with a few hundred to a few thousand users, not hundreds of millions or billions of users. There can also be teams of 5-10 developers who manage 20+ microservices. I still don't think those projects have the same needs; they could have done something else.


I started at Amazon in 2001 when we had around 400 servers running the website; that was large, but it was around the scale that most Fortune 500 companies operate at.

I'll agree that a team of developers with 20 microservices sounds obviously wrong (which is really my point that the site should be divided around data and the orgs should probably reflect the needs of the data--not the other way around). And Amazon back then didn't have that problem.


Thank you for pointing out caching. I was going to reply, at some point, that architecture in a well-designed distributed system considers both locality of data and the ability to cache data as a means of improving both latency and throughput, and often improves them appreciably.

But I don't have to now.


Most web sites I've seen in industry have just two servers, and only for active-active availability. Because of this, cache locality makes very little difference, because both nodes need to cache common data anyway.

Cache locality starts making a difference when you have over ten servers, and a significant fraction of the data is cacheable.

For "transactional" Enterprise CRUD apps, often there is very little that can be cached due to consistency concerns.


I don't see the affinity between microservices and locality. Any kind of distributed storage, including cache, is going to include a choice about how data are partitioned, ordered, and indexed. If you want certain records to be in memory on the same nodes, you should not need a separate cache cluster to accomplish that, let alone a separate cache service implementation.


Microservices are less efficient, but are still more scalable.

Servers can only get so big. If your monolith needs more resources than a single server can provide, then you can chop it up into microservices and each microservice can get its own beefy server. Then you can put a load balancer in front of a microservice and run it on N beefy servers.

But this only matters at Facebook scale. I think most devs would be shocked at how much a single beefy server running efficient code can do.


You know, I don't really think microservices are fundamentally more scalable. Rather, they expose scaling issues more readily.

When you have a giant monolith with the "load the world" endpoint, it can be tricky to pinpoint that the "load the world" endpoint (or, as is often the case, endpoint*s*) is what's causing issues. Instead, everyone just tends to think of it as "the x app having problems."

When you bust the monolith into x/y/z, and x and z get the "load the world" endpoints, that starts the fires of "x is constantly killing things and it's only doing this one thing. How do we do that better?"

That allows you to better prioritize fixing those scaling problems.


It sounds like creating a problem, then spending time=money on fixing it and calling it a win?

There is a point when it all starts to make sense. But that point is when you're in a billions-worth business, with hundreds of devs, etc. And going there has a large cost, especially for small/medium systems. And that cost is not one-off - it's a day-to-day cost of introducing changes. It's orders of magnitude cheaper and faster (head-count wise) to make changes in e.g. a single versioned monorepo where everything is deployed at once, as a single working, tested, migrated version, than to do progressive releases for each piece while keeping it all backward compatible at the micro level. Again - it does make sense at scale (hundreds-of-devs kind of scale), but saying your 5-dev team moves faster because they can work on 120 microservices independently is complete nonsense.

In other words, microservices make sense when you don't really have other options and you have to do it; they're not a good start-with default at all. Frankly, Sam Newman says as much in "Building Microservices", and so do people who know what they're talking about. For some reason juniors want to start there and look at anything non-microservice as legacy.


> It sounds like creating a problem, then spending time=money on fixing it and calling it a win?

It sort of is.

It's not a perfect world. One issue with monoliths is that organizations like to take an "if it ain't broke, don't fix it" attitude towards things. Unfortunately, that leads to spotty service and sometimes expensive deployments. Those aren't always seen as being "broken", just temporary problems that you can get through if you pull enough all-nighters.

It takes a skilled dev to really sell a business on improving monoliths with rework/rewrites of old APIs, even if it saves money, time, and all-nighters. It's simply hard for a manager to see those improvements as being worth it over the status quo, especially if running the monolith on big iron masks the resource usage/outages caused by those APIs.


"It's simply hard for a manager to see those improvements as being worth it over the status quo"

That is often the case, because new solutions tend to come with their own problems.


Another way to look at this is microservices reduce the blast radius of problems.


How so? If functionality A is critical to functionality B, how will wrapping it in an HTTP call (microservices) reduce the damage from breaking functionality A?

I can see an advantage regarding resource hogging, but the flip side is the extra point of failure of network calls in microservices.

Not saying which is better, but deployment is orthogonal to logical dependence and correctness.


Most features in a modern app are not critical functionality, though.

For instance, in a shopping site, why should a crash in the recommendations engine result in a non-functional webpage (rather than a working purchase page with no recommendations)?

Personally I think microservices start to make sense when you have several hundred developers (an environment I'm currently keen to never enter again - $work has 5 devs and might one day have 6).


That makes sense. I am still trying to understand why it differs between a monolith and microservices. The app in the monolith can make calls to non-critical functionality time-limited and fault-tolerant, just like a network call has a timeout and can return nothing (in a simplified manner, it can wrap that call with a timer and an exception handler); see the sketch below.
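
A minimal sketch of that wrapper in Go (module and function names invented): the monolith gives the non-critical module a deadline and treats a timeout or error the same way it would treat a failed network call, i.e. as "no recommendations".

  package main

  import (
      "context"
      "errors"
      "fmt"
      "time"
  )

  // recommend stands in for a non-critical module call inside the monolith
  // (hypothetical; it may be slow or fail outright).
  func recommend(ctx context.Context, userID string) ([]string, error) {
      select {
      case <-time.After(50 * time.Millisecond): // pretend work
          return []string{"book-1", "book-2"}, nil
      case <-ctx.Done():
          return nil, ctx.Err()
      }
  }

  // recommendationsOrNothing gives the module a strict time budget and swallows
  // errors, so a broken recommender degrades the page instead of breaking it.
  func recommendationsOrNothing(userID string) []string {
      ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
      defer cancel()
      recs, err := recommend(ctx, userID)
      if err != nil {
          if errors.Is(err, context.DeadlineExceeded) {
              // log the timeout and move on; the purchase page still renders
          }
          return nil
      }
      return recs
  }

  func main() {
      fmt.Println(recommendationsOrNothing("user-42")) // nil on failure, page still works
  }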

I agree that microservices are suitable for large organizations, where the organization practically has multiple products (which could be purchased from a vendor or sold to another company).


If your monolithic service OOMs, hits a long GC pause that causes dependent requests to time out, locks a shared file descriptor, or runs into a bunch of other things, then the monolithic service as a whole can fault or stall even if other threads/tasks are still executing. Classes of errors like OOMs go away when multiple processes are executing.


A monolith can also scale vertically, with mechanisms to redeploy on fatal errors. If it all starts failing, you may have a problem. But you can get the same problems with a microservice that is in the critical path.

Networks can have unexpected delays, routing errors and other glitches. At least with a monolith you can often find a stack trace for debugging. I have seen startups that have limited traceability and logging when using microservices.

When a small startup has to manage "scalable" K8s infrastructure in the cloud, distributed tracing and monitoring is often not prioritized when you are a team of 5 developers trying to find a product market fit.

I am not against microservices (I work with them daily), but you just trade one type of stability problem for another.


Right I'm not advocating for one over the other, I was just explaining issues solved by microservices. Now instead of the OOM Killer taking your service down, you have a flaky NIC on another microservice box and now you need to figure out how to gracefully degrade.

I love working with microservices at the scale of $WORK, but we're Big Tech. I can't imagine why a 5 person startup would want k8s and microservices. You don't need that scale until you have more than 2 teams, and you're pushing at the very least 15 engineers at that point and usually the sales and marketing staff to make that investment worth it.


"Classes of errors such as OOMs go away when multiple processes are executing"

You're going to have to explain what you mean by that a bit more... You surely can't mean it as it is written.


I don't think it was well expressed, but to reuse my last example: OOM-killer ending the recommendations process mid-request is less of a big deal if the main store server can keep running and serving traffic.

If the recommendations team write code that causes the OOM-killer to end their process, making them run it on separate infrastructure insulates your "main store team" from the bugs they write.


It was about the OOM killer as the sibling comment says, yeah. I'm surprised you're so incredulous. OOM Killer and GC stalls are some things I've run up against in my career frequently. I'm sorry my comment didn't live up to your expectations, it was hastily typed on mobile.


His point was that the comment was unclear if you'd also read it hastily :-)

I imagine his logic was something like: "How can OOMs happen less often if you run more processes (possibly on the same machine)?", while your comment actually wants to say: "if a specific service is affected by an OOM, with microservices only that specific microservice goes down, since it's probably running on its own hardware".


Resource hogging is a huge class of errors, though. Everything from a bad client update DDoSing a feature, file handles, memory leaks, log storage (a little outdated now perhaps), and so many more...


Sometimes. If the services are all interrelated, you're as dead in the water with microservices as you would be in a monolith.


Sure, it depends on the architecture. When the auth service is down, everything else should be down. But when the "optional feature" service is down, a core component should be unaffected by that.

Split up services where it makes sense and don't over do it. That's how I design my projects.


When the auth service is down, only new authentications should fail. Existing auth sessions should continue to function just fine. This exact failure mode occurred at Google in early 2021, which caused a fairly big outage, but not as big as it could have been because of this design choice.
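
One common way to get that property (a sketch under the assumption that sessions are carried as signed tokens; I'm not claiming this is Google's design): every service can validate an existing token locally with key material it already holds, so only minting new sessions depends on the auth service being up.

  package main

  import (
      "crypto/hmac"
      "crypto/sha256"
      "encoding/hex"
      "fmt"
      "strings"
  )

  // Hypothetical key material distributed to services ahead of time, so that
  // validating an existing session never requires calling the auth service.
  var sessionKey = []byte("key-shared-out-of-band")

  // sign is what the auth service does when it mints a new session.
  func sign(payload string) string {
      m := hmac.New(sha256.New, sessionKey)
      m.Write([]byte(payload))
      return payload + "." + hex.EncodeToString(m.Sum(nil))
  }

  // validate is what every other service does on each request: a purely local
  // check that keeps working while the auth service is down.
  func validate(token string) (string, bool) {
      i := strings.LastIndex(token, ".")
      if i < 0 {
          return "", false
      }
      payload, sig := token[:i], token[i+1:]
      m := hmac.New(sha256.New, sessionKey)
      m.Write([]byte(payload))
      want := hex.EncodeToString(m.Sum(nil))
      return payload, hmac.Equal([]byte(sig), []byte(want))
  }

  func main() {
      t := sign("user=42;exp=1735689600") // issued while the auth service was up
      fmt.Println(validate(t))            // still verifiable during an auth outage
  }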


For many websites, there is a considerable amount of content that should be available even if auth is down. Think e-shops, news portals, etc.


You design your microservices so that they gracefully degrade.

So if a database service is not available you simply return stale, cached data until the service is back up.
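
A minimal sketch of that kind of fallback in Go (names invented; a real version would also bound staleness and emit metrics): keep the last good value per key, and when the fresh read fails, serve the stale copy instead of an error.

  package main

  import (
      "errors"
      "fmt"
      "sync"
      "time"
  )

  type entry struct {
      value   string
      fetched time.Time
  }

  // StaleCache returns fresh data when the backing service answers, and falls
  // back to the last good value when it does not.
  type StaleCache struct {
      mu    sync.Mutex
      items map[string]entry
      fetch func(key string) (string, error) // the call to the backing service
  }

  func (c *StaleCache) Get(key string) (string, bool) {
      v, err := c.fetch(key)
      c.mu.Lock()
      defer c.mu.Unlock()
      if err == nil {
          c.items[key] = entry{value: v, fetched: time.Now()}
          return v, true
      }
      if old, ok := c.items[key]; ok {
          return old.value, true // degrade gracefully: stale but usable
      }
      return "", false // nothing cached yet; this request genuinely fails
  }

  func main() {
      calls := 0
      c := &StaleCache{
          items: map[string]entry{},
          fetch: func(key string) (string, error) {
              calls++
              if calls > 1 {
                  return "", errors.New("database service unavailable") // simulated outage
              }
              return "price: 9.99", nil
          },
      }
      fmt.Println(c.Get("sku-1")) // fresh value
      fmt.Println(c.Get("sku-1")) // stale value, but the page still renders
  }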


> I don't really think microservices are fundamentally more scalable

It depends on what you are scaling. I think microservices are fundamentally more scalable for deployment, since changes can be rolled out only to the services that changed, rather than everywhere. Unless your language and runtime support hot-loading individual modules at runtime.


I disagree, in my opinion micro-services hinder scalability of deployment, and development - at least the way I see most businesses use them. Typically they break out their code into disparate repositories, so now instead of one deployment you have to run 70 different CI/CD pipelines to get 70 microservices deployed, and repo A has no idea that repo B made breaking changes to their API. Or lib B pulled in lib D, which now pollutes the classpath of lib A, which has a dependency on lib B. Often you need to mass-deploy all of your microservices to resolve a critical vulnerability (think Log4Shell).

The solution to this is to use the right tool: a build system that supports monorepos, like Bazel. Bazel solves this problem wonderfully. It only builds / tests / containerizes / deploys (rules_k8s, rules_docker) what needs to be rebuilt, retested, recontainerized, and redeployed. Builds are much faster, developers have god-like visibility into all of an organization's code, and they can easily grep the entire code base and be assured their changes do not break other modules if bazel test //... passes. It is language agnostic, so you can implement your services in whatever language best suits them. It allows you to more easily manage transitive dependencies and manage versions globally across your org's codebase.

Of course Bazel has a steep learning curve so it will be years before it is adopted as widely as Maven, Gradle etc. But in the banks I've worked at it would've saved them tens of millions of dollars.

Also git would need to catch up to large codebases. I think Meta released a new source control tool recently that is similar to git but could handle large monorepos.


Man, I wish my colleagues would read your comment (and at least question their beliefs for one brief moment)…

> I disagree, in my opinion micro-services hinder scalability of deployment

…and of anything related to testing:

- Want to fire up the application in your pipeline to run E2E tests? Congratulations, you must now spin up the entire landscape of microservices in k8s. First, however, you need to figure out which versions of all those microservices you want to test against in the first place, since every service is living in a separate repository and thus getting versioned separately.

- Want to provide test data to your application before running your tests? Well, you're looking at 100 stateful services – good luck with getting the state right everywhere.


Breaking API changes do not happen often, at least in my projects. We mostly add new endpoints for new features (a non-breaking change).

We keep object definitions in a separate repo to avoid duplicating them and generate a Maven dependency from that. Pretty simple actually. We mainly run on Java, though. It would be more complicated when there are multiple languages.

Protobuf or JSON Schema could come to the rescue if needed.


If you have many app servers and they all run copies of the same app, you can roll out new versions to a few servers at a time. You just have to handle the DB update first, but you need to do that with microservices anyway (they might use smaller databases, which makes it somewhat easier).


Git and/or feature flags exist for this reason. Adding a network layer isn't the fundamental insight here, but it can cause additional consequences.


I'm talking about the scalability of actually delivering new code to servers (or "serverless" runtimes). Feature flags don't help with that.

Admittedly, it isn't a scalability problem you will run into right away.

But when you need to roll out an emergency fix, there is a big difference between deploying to thousands of servers that all have everything, and ten servers running a single service.


That’s a really interesting point - something that could probably be addressed by module-level logging and metrics. That said, even as a pro-monolith advocate, I can see why it’s preferable to not allow any one module/service to consume all the resources for your service in the first place. The service boundaries in microservice architectures can help enforce resource limits that otherwise go unchecked in a larger application.


It's one I've run into a few times in my company (which has a large number of these types of endpoints).

The silly thing is that something like the JVM (which we use) really wants to host monoliths. It's really the most efficient way to use resources if you can swing it. The problem is that when you give the JVM 512GB of RAM, it hides the fact that you have a module needlessly loading up 64GB of RAM for a few seconds. Something that only comes to a head when that module is run concurrently.


Microservices enable independent scale _domains_. This means they can be more scalable and dense, but that's not necessarily the case.


> Servers can only get so big. If your monolith needs more resources than a single server can provide, then you can chop it up into microservices

200 threads, 12TB of RAM, with a pipe upwards of 200GB/s. This isn't even as big as you can go, this is a reasonable off the shelf item. If your service doesn't need more than this, maybe don't break it up. :)

I believe that this level of service can no longer accurately be described as "micro".


This cannot be emphasized enough. The top of the line configuration you can get today is a dual EPYC 9654. That nets you 384 logical cores, up to 12 TB of RAM, and enough PCIe lanes to install ~40 NVMe drives, which yields 320 TB of storage if we assume 8 TB per drive. All for a price that is comparable to a FAANG engineer's salary.


And you're missing one little thing - if you can get all of your processing done on two of those, you save way more than 4 engineers' salaries in development/maintenance costs.

Let alone - AWS lets you get that machine for less than a junior engineer's salary ($7.50 per hour around the clock comes to roughly $67k a year, about what a $32 hourly wage costs).


How about 2 Erlang OTP nodes (homogeneous)? I don't have a real-world use case at that load, but I often imagine I would keep a 1:2 RAM ratio to be able to scale vertically each time. For example, start with 1TB(A):2TB(B); if that's not enough, scale A to 4TB. When the load starts to exceed that, scale B to 8TB, and so on, alternately.


Helps to have a language that natively uses more CPU cores and/or training for the devs.

Ruby, Python, PHP and Node.js startups have to figure out how to use the cores while C++, Rust, Erlang and Go have no issues running a single process that maxes out all 64 CPU cores.


This is exactly what I do. When it comes to your regular backend business server, I write a stateful multithreaded monolith in C++ sitting on the same computer as the database, hosted on some multicore server with gobs of RAM (those are cheap now). Performance is insane and is enough to serve any normal business for years or decades to come.

So it does not work for FAANG companies but what do I care. I have the rest of the world to play with ;)


Even for non-FAANG, less-than-a-million-user business applications, there are two problems: 1. Your feature/function scope is not fully defined at the start and is not static till the end of life. Software has to evolve with the business. In this case, it is easier to build a loosely coupled, shared-nothing architecture that can scale easily than to build a shared-everything, all-in-one-binary monolith architecture. 2. Your customer base isn't one size fits all. You usually have different growing businesses that need solutions at different scale points but still with very high unit economics. This means you need an incremental scaling solution – this is where old-school big-chassis systems built blade-scalable server architectures. But because of custom/proprietary backplane designs they become unmanageably complex and buggy.

Instead, if you build an architecture that can scale the number of corporate users by adding cheap $2k pizza box 1u servers as the company grows, that's much more attractive. Also, you can keep your systems design flexible enough to recompile and take advantage of advancements in hardware tech every 18 months – this gives you better operating margins as your own business starts to grow.


Sorry, but this sounds like your typical FUD with absolutely no basis. Almost like it was written by a bullshit generator or a bot.


>So it does not work for FAANG companies but what do I care. I have the rest of the world to play with ;)

As long as hype chasers in middle management don't get in the way after convincing themselves they too must be like FAANG with a few orders of magnitude less of a consumer base.


The middle management, especially half-baked engineers who drank the Kool-Aid and became managers, are hard to reason with.

They want to be both the architect and the manager, anything you say gets overruled, and since they are the boss it's hard to ignore them.

This service is a monolith because it has 10K lines of code and it needs to be broken up. The product is at the MVP stage, it's rock solid on Java Spring, and it hardly ever crashes. We are never going to lose data based on the design choices we made. None of that matters.

We need zero-downtime upgrades, when we had zero customers.


When I do work for my clients I usually bypass that level.


It doesn't really matter that much whether a language is using all cores from one process or whether there are multiple processes.


We have all sorts of problems maxing out all the cores on a c6.24xl if you're memory or network constrained. Even if CPU constrained it can be hard.


You can have more than one server per monolith.

I don't think you actually understand what microservices are. You don't use a load balancer to balance between different services. A load balancer balances traffic between servers of the same service or monolith.

Microservices mean the servers of different services run different code. A load balancer only works together with servers running the same code.


>A load balancer only works together with servers running the same code.

Uh - what?

>A load balancer balances traffic between servers

Correct.

> of the same service or monolith

Incorrect.

Load balancers used to work solely at Layer 4, in which case you’d be correct that any 80/443 traffic would be farmed across servers that would necessarily need to run the same code base.

But modern load balancers (NGINX et al), especially in a service mesh / Kubernetes context, balance load at the endpoint level. A high-traffic endpoint might have dozens of pods running on dozens of K8s hosts having traffic routed to them by their ingress controller. A low-traffic endpoint's pod might only be running on a few or one (potentially different) hosts.

The load balancer internally makes these decisions.

I hope this clears up your misunderstanding.


Each set of upstream hosts in nginx is a single instance of load balancing. You aren't load balancing across services, you're splitting traffic by service and then load balancing across instances of that service.

The split is inessential. You can just as easily have homogeneous backends & one big load balancing pool. Instances within that pool can even have affinity for or ownership of particular records! The ability to load balance across nodes is not, as you claimed, a particular advantage of microservices.


> The ability to load balance across nodes is not, as you claimed, a particular advantage of microservices.

Are you replying to the right comment? I made no such claim.


I don't really think of route based "load balancing" as load balancing. That's routing, or a reverse proxy. Not load balancing. Load balancing is a very specific type of reverse proxy.

The point is, if a client makes a request to a server, the response should always be the same, no matter where the load balancer sends the request to. Which means it should run the same code.

Nginx doesn't even mention route based or endpoint based load balancing in their docs. Maybe they don't consider it load balancing either.

https://www.nginx.com/resources/glossary/load-balancing/


https://www.nginx.com/blog/nginx-plus-ingress-controller-kub...

Describes exactly what I’m talking about.


That is still routing. Not load balancing. The load balancing is only between pods of the same service.

Just because Nginx is doing it, doesn't mean it's load balancing. It's an extra function tacked onto a load balancer.


Friend, you don’t know what you’re talking about and if linking NGINX documentation literally describing load balancing algorithms applied across Kubernetes pods hosting endpoints doesn’t clear things up for you, I don’t think anything will.


Yeah, it's bin packing, not straight efficiency. Also, people seem to exaggerate latency for RPC calls. I often get the feeling that people who make these latency criticisms have been burned by some nominal "microservices" architecture in which an API call is made in a hot loop or something.

Network latency is real, and some systems really do run on a tight latency budget, but most sane architectures will just do a couple of calls to a database (same performance as monolith) and maybe a transitive service call or something.


A couple of calls is normal. But if you make everything a microservice, and there are dependencies between them, then by design it's destined that some hotter loop will eventually contain an RPC.


> Microservices are less efficient, but are still more scalable.

Not at all. You can run your monolith on multiple nodes just as you can do with microservices. If anything, it scales much better as you reduce network interactions.


You can run multiple instances of most stateless monoliths


It's also way easier to design for. The web is an embarrassingly parallel problem once you remove the state. That's a big reason why you offload state to databases - they've done the hardest bits for you.


It is about striking a balance.

The "little big" concept, or "mono-micro" as people like to call it, is where it's at. Spending too much time making a complex environment benefits no one. Spending too much time making a complex application benefits no one.

Breaking down monoliths into purposed tasks and then creating pod groups of small containers is where it's at. Managing a fleet on Kubernetes is easier than managing one in some configuration management stack, be it Puppet or Salt - you have too many dependencies. Only your core infrastructure (Kubernetes itself) should be a product of the configuration management.


Running a single server often can’t meet availability expectations that users have. This is orthogonal to scalability. You almost always need multiple copies.


You can run most applications on a single server - IBM S/390 big iron has very high reliability, redundancy and I/O bandwidth.


Yes but it’s literally a single point of failure. You probably want at least two servers in separate physical locations. Also how do you do deployments without interruption of service on a single server?


> it’s literally a single point of failure.

Those machines are made very redundant.

All components in a Parallel Sysplex configuration can operate in redundant mode.

Redundant power. Redundant central processor complex. Redundant multichip module (what would be a chipset on microcomputer platforms). Redundant RAM Memory. Redundant Storage. Redundant internal interconnects. Redundant network adapters.

That's why corporations that require high availability to not get sued for millions/billions use them.


Redundant physical location. That means different power dependencies and different cities.


> Redundant physical location. That means different power dependencies and different cities.

Geographically Dispersed Parallel Sysplex.


Blue/green deployments and request draining like with any other deployment topology.


What if I have $500/month to spend and still need high reliability and redundancy?


That's a budget for hosting a local craft store's website, not a "high availability" system


> That's a budget for hosting a local craft store's website, not a "high availability" system

I'm not sure about that: you could still put something together within that budget, say, a few different VPSes across Hetzner and Contabo (or different regions within the same provider's offerings), with some circuit breaking and load balancing between those. Probably either a managed database offering, or a cluster of DBs running on similar VPSes.

Of course, this might mean that you have 1-3 instances of a service instead of 10-30, but as long as availability is the goal and not necessarily throughput, that can go pretty far.


If you're an amateur something that just does it for fun - sure

Otherwise what you have is a budget that is lower than the possible implications of temporary downtime. That doesn't make sense in the real world.


> If you're an amateur something that just does it for fun - sure

  sed "s/amateur/comparatively poor, from a third world country, without VC money, or have cheap labor/g"
Not everyone can afford advanced tools or platforms, or even to use something like AWS/Azure/GCP. Some of those can indeed be amateur use cases (e.g. a side project or bootstrapped SaaS), others are simply about stretching your money for any number of considerations (e.g. a non-profit, a limited budget, etc.), but it's definitely possible. In some countries it probably makes more sense to just build your own solution, as long as you're not doing anything too advanced.

500 USD a month would get you approximately the following resources (taxes vary) on the aforementioned platforms:

  Contabo
  Nodes: 15 to 83 (depending on configuration)
  CPU: 150 to 332 cores
  RAM: 664 to 900 GB
  SSD: 4150 to 6000 GB
  
  Hetzner
  Nodes: 7 to 110 (depending on configuration)
  CPU: 86 to 192 cores
  RAM: 192 to 384 GB
  SSD: 2200 to 4800 GB
  
  (this includes regular VPS packages, not storage optimized ones, or dedicated hosting etc.)
I'm not sure about you, but in my experience that could be enough for some pretty decent systems, albeit some storage heavy workloads would need the storage packages instead of the regular VPS ones. It's mostly a matter of picking a suitable topology and working towards your goals.

> Otherwise what you have is a budget that is lower than the possible implications of temporary downtime. That doesn't make sense in the real world.

This (depending on the circumstances) does sound like a good point! Maybe "the real world" isn't the best wording, though, and choosing "enterprise settings" or anything along those lines would be more suitable.


I could probably design and deploy an HA system for way less. Maybe less than $200/month. It wouldn't be the most performant, but would be HA in three regions.

But it leads me back to my original statement - extreme requirements for uptime don't come out of nothing.

If you're in a location where IT related labor is extremely cheap - you're just going to have people keep one server up.

I know I used to do exactly that, because the server was more than my annual income. But that didn't last long. After the first 20 minute downtime, the budget for HA solution was allocated. But before a certain point downtime wasn't expensive.

Non-profits would probably be the only reasonable exception, where HA and low budgets could coincide. Otherwise - nah...


Those are all fair points, perhaps even more so given the trend of compute and other resources generally becoming more cheap with time (things like Wirth's law and limited IPv6 support aside), thanks for expanding on your arguments!


>> Otherwise what you have is a budget that is lower than the possible implications of temporary downtime. That doesn't make sense in the real world.

> Maybe "the real world" isn't the best wording, though, and choosing "enterprise settings" or anything along those lines would be more suitable.

This is the point - for-profit corporations, by definition, don't want to waste money, and if "high reliability" isn't required (or they don't know about it), they don't waste money. Most of the time, they don't waste the money.

However, if "being cheap" would means billions in lost income (and they know about it), they really want to have reliable, redundant infrastructure and systems around.


well, to be fair, you could host a highly available local craft store website for under 500 dollars on $cloudprovider easily. You could also trivially get regional redundancy.


Go to DigitalOcean or a similar provider. Launch a managed HA DB (starts at ~$120/month). Launch an autoscaling HA K8s cluster (starts from ~$80), deploy 1-2 stateless pods. There you go.


That's what I do. But it's not IBM.


> Servers can only get so big. If your monolith needs more resources than a single server can provide, then you can chop it up into microservices and each microservice can get its own beefy server. Then you can put a load balancer in front of a microservice and run it on N beefy servers.

I've almost never seen situations where a single request needs more resources than the entire server has (outside of large GPT models for text, though maybe that's because I couldn't afford beefy machines for that myself).

Instead, if your monolith needs X resources to run (overhead) and Y resources to serve your current load, then in case of increased load you can just setup another instance of your monolith in parallel with another set of X + Y resources (same configuration) and it will generally almost double your capacity.

Now, there can be some issues with this, such as needing to ensure either stateless APIs or sticky sessions, but both are generally regarded as solved problems (with a little bit of work). Monoliths themselves shouldn't be limited to running just a single instance and aren't that different from a scalability perspective than microservices.

Where microservices excel, however, is that you can observe individual services (e.g. systemd services or them running in containers) better and see when a particular service is misbehaving or scale them separately, as well as decrease that X overhead since each service has a smaller codebase when you have lots of instances running. This does come at the expense of increased operational complexity and possibly noisy network chatter, especially if you've drawn your service boundaries wrong.

However, at the same time I've seen actual monoliths that can never have more than one instance running due to problematic architecture, so therefore I propose the following wording (that I've heard elsewhere):

  SINGLETONS - a monolithic application that can ever only have a single instance running, for example, when business processes are stored in memory for a bit, or have user sessions or something like that stored locally as well; these will ONLY ever scale VERTICALLY, unless you re-architect them
  MONOLITHS - applications that contain all of your project's logic in a single codebase, although multiple instances can be launched in parallel, depending on your needs; can be scaled BOTH VERTICALLY and HORIZONTALLY; they have more overhead though and observability can be a bit challenging
  MICROSERVICES - applications that contain a part of the total project's logic, typically across multiple separate codebases, possibly with shared library code, pieces of your project can be scaled separately, BOTH VERTICALLY and HORIZONTALLY; they are operationally more complex, can involve more network chatter and while you can observe how services perform, now you need to deal with distributed tracing
Of course, there can be more nuance to it, like modular monoliths, that still have one codebase, but can have certain groups of functionality enabled or disabled. I actually wrote about that approach a while back, calling them "moduliths": https://blog.kronis.dev/articles/modulith-because-we-need-to...

I don't actually expect anyone to use these particular terms, but I dislike when someone claims that monoliths have the issues of these "singleton" applications when in fact that's just because they've primarily worked with bad architectures. Sometimes they wouldn't need to shoot themselves in the foot with microservices if they could just extract their session data into Redis and their task queues into RabbitMQ. Other times, microservices actually make sense.


You can horizontally scale a monolith without going full microservices.


I really miss the old days when, if an app was down, it was obvious... Now apps just go unresponsive because of connectivity issues between services... We used to have frameworks that could communicate errors in one log; it was a beautiful thing. As cloud infrastructure became powerful enough to better host so-called "monolithic" systems, it was all scrapped to move to microservices, which were made for less powerful infrastructure. Everything was refactored to the point where it's expensive to go back... Pretty crazy how much money the industry wastes on refactoring...

On top of that, the tendency to get complacent with unstructured data in a lot of systems is really creating a very complicated lock-in when systems are developed for unique services on each cloud provider... Bowls of spaghetti.

Hire Solutions Architects for dev projects, make your apps future proof... I warn you. Too much microservice customization leads to vendor lock in and expensive operational costs... This is why a lot of apps get sunsetted early.


I think all your points are correct but none of them address scalability. What do you do if your one efficient machine is too small for your problem? To be fair, not a lot of problems reach that point in practice, definitely not every problem a microservice is written for.

It might still be completely viable to rewrite something so that it's only 10% as efficient but you can scale it horizontally easily.


I do think using something like FlatBuffers or Cap'n Proto, where the SerDe is removed (aside from the travel over the network), could be a huge win. Another thing I have always wanted to try is to develop a system using microservices with an RPC layer that uses immutable objects, so you can later combine the services and convert RPCs to function calls.
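
That "convert RPCs to function calls later" idea mostly comes down to hiding the transport behind an interface (a Go sketch with invented names, not a specific framework): callers depend on the interface, and whether an implementation crosses a process boundary becomes a wiring decision.

  package main

  import "fmt"

  // Pricer is the seam: callers never know whether the implementation is a
  // local call or an RPC.
  type Pricer interface {
      Quote(sku string) (float64, error)
  }

  // LocalPricer runs in-process; once services are merged back together this
  // is all that remains, with no serialization at all.
  type LocalPricer struct{}

  func (LocalPricer) Quote(sku string) (float64, error) { return 9.99, nil }

  // RemotePricer would wrap a generated client (FlatBuffers, Cap'n Proto, gRPC,
  // ...); stubbed here because the transport is beside the point.
  type RemotePricer struct{ endpoint string }

  func (r RemotePricer) Quote(sku string) (float64, error) {
      // encode request, call r.endpoint, decode response...
      return 0, fmt.Errorf("remote call to %s not implemented in this sketch", r.endpoint)
  }

  func checkout(p Pricer, sku string) {
      if price, err := p.Quote(sku); err == nil {
          fmt.Printf("%s costs %.2f\n", sku, price)
      }
  }

  func main() {
      checkout(LocalPricer{}, "sku-1")                          // combined deployment
      checkout(RemotePricer{endpoint: "pricing:8080"}, "sku-1") // split deployment
  }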


So tell me, who is making microservices for calls that could be done wholly in memory? How are you handling rollbacks and such for this in memory logic?

Aren't you still writing out to external data stores? If that's the case then it's really not a comparison of in-process to an RPC, it's just two RPC hops now, no?


When you move to microservices -- or rather, when you split your data across several DBs -- you sometimes end up basically reimplementing DB synchronization in business logic layers that could (sometimes, and depending on scalability constraints) be solved by using DB transactions.

So instead of a single DB transaction, often with a single SQL roundtrip, microservices can sometimes force through coordination of separate transactions to several DBs.
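
Concretely, what gets lost is the shape below (a generic database/sql sketch in Go with an invented schema, not anyone's production code): one transaction against one database means the inventory decrement and the order insert commit or roll back together. Split those tables across two services with two databases and you re-implement that atomicity yourself with sagas, outbox tables, or compensating actions.

  package main

  import (
      "context"
      "database/sql"
      // a real program would also import a database driver and open the
      // connection with sql.Open before calling placeOrder
  )

  // placeOrder does the whole write as one transaction against one database:
  // either both statements take effect or neither does. With "inventory" and
  // "orders" owned by separate services and databases, that guarantee is gone
  // and has to be rebuilt in application code.
  func placeOrder(ctx context.Context, db *sql.DB, orderID, sku string, qty int) error {
      tx, err := db.BeginTx(ctx, nil)
      if err != nil {
          return err
      }
      defer tx.Rollback() // harmless no-op once Commit has succeeded

      if _, err := tx.ExecContext(ctx,
          `UPDATE inventory SET stock = stock - $1 WHERE sku = $2 AND stock >= $1`,
          qty, sku); err != nil {
          return err
      }
      if _, err := tx.ExecContext(ctx,
          `INSERT INTO orders (id, sku, qty) VALUES ($1, $2, $3)`,
          orderID, sku, qty); err != nil {
          return err
      }
      return tx.Commit()
  }

  func main() {
      _ = placeOrder // wiring to a live database is omitted in this sketch
  }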


This has been my major criticism of them; you cement the design of the app by creating an organizational structure around your software components.

What happens if you need to redesign the architecture to meet new needs? That's right; it's not going to happen, because people will fight it tooth and nail.

You also cannot produce any meaningful end-user value without involving several teams.

Microservices is just "backend, frontend, and database team" reincarnated.

My take: do Microservices all you want, but don't organize teams around the services!


Implicit in this discussion is the belief that growth inevitably means more people, and that the collateral damage of more people is worth the cost of spending so much time corralling said people. So industry clings to frameworks with a single blessed path, languages that provide only one way to do things, and capped expressiveness (just mention macros and people start frothing at the mouth).

This isn't easily fixable, but I'd like technologists to at least be able to perceive the extent to which the surrounding business culture has permeated into technical culture. I'm probably considered old AF by a lot of you (41) but I'm becoming increasingly interested in tools/methodologies that enable fewer devs to do a lot more, even if it means that there are sharp edges.


Fewer devs doing a lot more has always been my goal, and in hardware we would call this a vertical scaling versus horizontal scaling problem.

Horizontal scaling always incurs communication overheads faster, and as we know, developers have high communication overhead to begin with.

Human scaling is its own problem set and companies like to ramp projects up and then down again. Most of those people only think about how to get more people on board, not how to run the ship once attrition starts to condense responsibilities and reduce redundancy, especially for unpleasant tasks.

I’ve been trying to beat the idea of ergonomics into a smart but mule-headed developer (too old not to know better) for years now and it wasn’t until a combination of team shrink and having to work with code written by a couple of people who emulate his development style that all of a sudden he’s repeating things I said four years ago (of course with no self awareness of where those ideas came from). It remains to be seen if it affects how he writes code, or whether he starts practicing Do As I Say, Not As I Do, but I’m a short-timer now so it’s mostly an anthropological study at this point.


It seems natural and correct to me that the business concerns and technical concerns would be mixed together. Engineers should understand the business needs of the organization and account for them, just as leadership should understand technical needs.

As projects get larger, the problems become more organizational in nature.


Conway's law: tech follows communication patterns (i.e. org setup).

Hence, if you want a certain architecture, you likely need to execute a "reverse Conway's law" reorg first: get the org into the target configuration, and the software will follow.


It’s okay for one team to have several logical units of code. In fact from a Conway First model, it’s helpful because it allows horse trading when it turns out the team competencies don’t match responsibilities very well. I’m not dismantling my team, I’m just transferring module ownership to another team.

What does not work is splitting ownership of many things for a long time. It's too many plates to spin. But some things can be shared by a few people without tipping over into anarchy.


Conway's law is an adage that states organizations design systems that mirror their own communication structure. It is named after the computer programmer Melvin Conway, who introduced the idea in 1967. His original wording was:

  Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.

  — Melvin E. Conway
https://en.wikipedia.org/wiki/Conway's_law


> My take: do Microservices all you want, but don't organize teams around the services!

Exactly. In order to reap the benefits of modular/Microservices architecture, you need teams to be organized around product/feature verticals.

IMO it’s not about the internals of a product/service architecture, but about encapsulating, isolating, and formalizing inter-organizational dependencies.


Organizing teams around microservices makes ... a lot of sense?

Also - any talk of redesign should be treated as a major lift for any product/service that has real paying clients. The risk of breaking things is huge, and the fact that during that time you won't be delivering incremental value is also going to look bad.


One of the principal engineers at Amazon called that “Org-atecture”.


Amazon is a good example of a huge organization absolutely crippled by micro-service fiefdoms driven by sr.mgr whims and not strong technical guidance from such PEs.


Depends entirely on the org. My org actually broke SOA (thousands of services and some intense technical debt) and is now in a years long process of moving toward a centralized service orchestration model. This couldn’t have happened without PE, Sr PE and DE involvement.


Now that sounds interesting. I assume you're not counting each API endpoint as a service, so can you shed any more light on this? The scale sounds mind-boggling. Thousands of separate services being pulled (more) together. Can you give an idea of the size or scope of these services?


>1000 services, each with many APIs :)

Amazon's warehouses have incredible domain complexity which drives the complexity of the software architecture. It's also one of the oldest parts of the company with a direct lineage going back 20+ years. (for example, our inventory databases still have the same schema and are still using a RDBMS, albeit RDS Postgres instead of Oracle).

About five years ago we started rearchitecting to a centralized orchestration service which is being used to abstract this complexity from process teams. This is to avoid projects where 20+ service teams must be engaged just to launch a new solution.


And look how much damage it's done to their business.


Without the organizational benefits of a microservice, you are just making intermodular calls that are really slow and fail intermittently.


> What happens if you need to redesign the architecture to meet new needs?

How often does major reorganization happen? For most companies the answer is never.

The "what if" argument is what leads to all sorts of premature optimizations.


I've had three reorgs in the past six months. Each one changed our projects and priorities.

Yeah, didn't get a lot done...


> How often does major reorganization happen? For most companies the answer is never.

For most companies – regularly.


The kind of reorg we're talking about here is not the typical musical chairs, but a full realignment. This happens a lot less frequently in my experience.


Reorgs usually happen higher up the tree in larger companies; I've been through a bunch and it's never affected any part of our work.


Reorgs rarely happen.

But the need for them does :)


> but don't organize teams around the services

This is actually the whole point (see: Reverse Conway).


Indeed, you might as well build a distributed monolith if your priority is absolute flexibility and the ability to re-design on the fly.


"This has been my major criticism of them; you cement the design of the app by creating an organizational structure around your software components."

Conway's Law is basically "You don't have a choice." You will cement the design of your app around your organizational design. (At least, beyond a certain org size. If you have only 4 engineers you don't really have the sort of structure in question here at all.) je42 narrowly beats me to the point that if this is a problem you can try to match your organization to your problem, but that takes a fairly agile organization.

"What happens if you need to redesign the architecture to meet new needs? That's right; it's not going to happen, because people will fight it tooth and nail."

Unfortunately, in the real world, this is not so much "a disadvantage to the microservice approach" as simply "an engineering constraint you will have to work around and take account in your design".

Despite what you may think after I've said that, I'm not a microservice maximalist. Microservices are a valuable tool in such a world but far from the only tool. As with everything, the benefits and the costs must be accounted for. While I've not successfully rearchitected an entire organization from my position as engineer, I have had some modest, but very real success in moving little bits and pieces around to the correct team so that we don't have to build stupid microservices just to deal with internal organizational issues. I don't mean this to be interpreted as a defeatist "you're doomed, get ready for microservices everywhere and just deal with it"; there are other options. Or at least, there are other options in relatively healthy organizations.

But you will have code that matches the structure of your organization. You might as well harness that as much as you can for the benefits it can provide because you're stuck with it whether you like it or not. By that I mean, as long as you are going to have teams structured around your services whether you like it or not, go in with eyes open about this fact, and make plans around it to minimize the inevitable costs while maximizing the benefits. Belief that there is another option will inevitably lead to suboptimal outcomes.

You can't engineer at an organizational level while thinking that you have an option of breaking the responsibility and authority over a codebase apart and somehow handing them out to separate teams. That never works long term. A lot of institutional malfunctioning that people correctly complain about on HN is at its core people who made this very mistake and the responsibility & authority for something are mismatched. Far from all of it, there are other major pathologies in organizations, of course. But making sure responsibility & authority are more-or-less in sync is one of the major checklist items in doing organization-level engineering.


You never want code with shared ownership to tolerate feature creep. That’s impossible to keep on the rails. If you’re going to use SOLID anywhere, it’s in shared code.

If your org doesn’t suffer feature creep willingly, I believe that means you can have a lot more flexibility with respect to Conway’s Law. A low-churn project can maintain a common understanding. Not all of them will of course, but it’s at least possible. Feature factories absolutely cannot, and you shouldn’t even try.

What really kills a lot of projects though is an inverted dependency tree. And by inverted I mean that the most volatile code is at the bottom of the call graph, not the top. In this case every build, every deploy, or in the microservices scenario every request, can have different behavior from the previous one because some tertiary dependency changed under you. Now you need all sorts of triggers in your CI/CD pipeline to guarantee that integration tests are rerun constantly to figure out where the regressions are being introduced. Your build pipeline starts to look like one of those server rooms with loose wires everywhere and a bucket under the AC unit. Nothing is stable and everything is on the verge of being on fire at a moment’s notice.


I mentioned this elsewhere, but I think it's a good idea to read the actual text of conways law:

> Any organization that designs a system (defined broadly) will produce _a design whose structure is a copy of the organization's communication structure_.

Organization is the primary method by which we organize communication (indeed, that's its only point). That's why architecture follows organization. It is possible to "break" that boundary by organizing other forms of communication, for example the feature teams of SAFE (not a proponent of safe, just an example).


This is the "reverse Conway's law" that some others are referring to, and I alluded to. I've had success doing it in small places here and there. I'm yet to convince a management organization to do a really large-scale reorg because of some major pathology, though. Though I haven't tried often, either. Really just once. The costs are automatically so high for such a proposal that it is difficult for mere engineering needs to get to net-positive because it's hard to prove the benefits.


Part of the issue is that not everyone understands the trade-offs or perhaps believes the trade-offs so you get teams/leaders haphazardly ignoring expert guidance and crafting careless micro-services around team boundaries that do not line up with value.


I find the responsibility & authority angle is very helpful in these conversations. It doesn't take a lot of convincing (in a reasonably healthy organization) to say that these should be broadly aligned, and that if you can show people a place where they are either currently grossly out of sync, or are going to be out of sync if we move forward with some particular iteration of the plan, they start thinking more about how to bring them in sync.

The good news is, pretty much everyone has been on at least one side of that balance being broken (that is, having responsibility without authority, or being subject to someone else's authority without responsibility), and a lot of people have been on both, and once you bring it to their attention that they're creating another one there is a certain amount of motivation to avoid it, even if it is a bit more work for them. You can actually use people's experiences with bad organization to convince them to not create another one, if you are careful and thoughtful.


> This has been my major criticism of them; you cement the design of the app by creating an organizational structure around your software components.

That will always happen, and I've seen it happen in a monolith too (expressed as module ownership/ownership over part of the codebase).

It's inevitable in almost all organizations.


How do you organize teams if not around services? As the GP points out, the whole point of SOA is to scale people. Yes, this makes rearchitecture hard, but that is always the case at scale. The problem of territoriality and resistance to change needs other solutions (tldr; clueful management and mature leadership level ICs who cut across many teams to align architecture vision and resolve disputes between local staff engineers)


Exactly my question. The teams are going to be organized around something. Even if not intentionally, it'll be true in practice, as people are going to end up knowing things. And if you fight that, it only gets worse.

Long ago I consulted for one company that reconstituted teams for every new project, so people got swapped around every few months, usually ending up in a new chunk of the code. Whatever the theoretical benefits, it meant that nobody felt a sense of ownership for any of the code. Combine that with aggressively short project deadlines and an absolute refusal to ever adjust them and the code rapidly dropped in quality. After all, you'd be on something else soon, so any mess wasn't your long-term problem. And why should you bother given that the code you started with wasn't pristine?

As far as I can tell, the best it gets is organizing around the most durable features of the business environment. E.g., audiences and their real-world needs. Next, long-lived solutions to those needs. And then you find the common and/or large needs of those teams and spin up service teams around them. And in the background, some strong culture around what the priorities really are, so that service teams don't get so self-important that they become the tail wagging the dog.

But I love all the war stories here and would love to hear more, as I think there are no perfect global solutions to organizational problems. I think it's mostly in the details of circumstance, history, and the people involved.


Organize around delivering value to the customer. This is the obvious split for different products, and it is natural for something like AWS where each service is a relatively small team. Sometimes the customer is internal and they need an API, and the internal customer decides when they are satisfied, not the API passing tests. SOA will arise, but services need to be big rather than micro, have defined internal customers and their own team.

Better to have teams with frontend, backend, designer, etc. who collectively complete features, rather than hoping a bunch of services integrate into a solution for the customer at the end. It is much easier to get the architecture right when people talk first and often about what the customer needs more than implementation details.

I think once you have decided how data is going to be persisted and what the customer needs to accomplish, the architecture falls into place. The application code should be stateless so it is never a scaling issue. When the code is isolated by customer feature, the blast radius for future changes is much smaller.

Microsoft, Apple, Amazon I think do a good job at preaching the customer first model and organizing around that.


Organize teams around products. The point is to increase value delivered to the customer through the product, not to produce left-pad-as-a-service.


Agreed. I've worked at a scarce few companies who organized this way, but it seemed to work better than "frontend team, backend team, devops team." When each piece of the stack is a different team, it creates unnecessary hostility and miscommunication. Aligning engineers around products, such that your iOS dev and your backend dev are on the same team, seems to run more smoothly. YMMV.


I think it runs a bit deeper than that. Microservices are how the business views itself.

Look, once upon a time managers designed the system that workers implemented. The factory assembly line, or the policy manual.

But now the workers are CPUs and servers. The managers designing the system the workers follow are coders. I say coders are the new managers.

Now this leaves two problems. The first is that there are a lot of "managers" who are now doing the wrong job (which explains perfectly why Steve Jobs got involved in emails about hiring coders). But that's a different post.

For me the second problem is that microservices represent the atomic parts of a business. They should not be seen as a technical solution to anything, and because of the first problem (managers aren't managing workers anymore) there is no need to have the microservices built along Conway's Law.

And so if you want to build a microservice it is not enough to consider the technical needs, it is not enough to automate what was there, you need to design an atomic building block of any / your company. They become the way you talk about the company, the way you think about the processes in a company.

And mostly the company becomes programmable. It is also highly likely the company becomes built mostly of interchangeable parts.


> there is no need to have the microservices built along Conways Law

This is a misunderstanding of Conway's Law. Your code _will_ reflect the structure of your organization. If you use microservices so will their architecture. If you use modules, so will the modules.

If you want a specific architecture, have your organization reflect the modules/microservices defined in that architecture.


But, I am trying to feel my way towards the idea that that was true when managers arranged systems for workers to follow, and then we came along to automate the current process. But if we have a system where the workers are the CPUs, then the people doing the managing are the coders.

The point being that if workers are CPUs and coders are managers, then why worry about how the managers of the coders are arranged? Get rid of that management layer. Conway is not implying the financiers of the organisation affect the architecture.

This basically means that the microservices a company is built out of should more readily align to the realities of the business model. And instead of shuffling organisations around it would behove executives to look at the microservices.

One can more easily measure orders transferred, etc, if the boundaries are clearer.

Plus Conway's law is just a ... neat idea, not a bound fate.

There is a caveat with the architecture reflecting the organisation of the software teams, but that usually follows the other to-be-automated structure as well.


Managers arranging work is _not_ why Conway's law is true though. I think it behooves one to look at the actual text of the "law":

> Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.

It doesn't matter WHO defines or does the work (the coders themselves in a team or the manager). It matters how communication is arranged. And communication naturally aligns itself along organizational and team boundaries.

When a group/person/whatever writes an API they are MOSTLY writing a document for other human beings. That is an act of communication. If you meet with someone every day in a stand-up you're likely to put a less structured API between yourselves than if you meet with someone once a month, or once a year. You're also going to create a finer grained subdivision of the API for them as you're likely working on closely related parts of the problem.

Organization creates communication, and communication creates architecture. Therefore organization creates architecture.

> Plus conway is just a ... neat idea, not a bound fate

It is in fact bound fate. Code is written to communicate with others.

People come together, they decide to solve a problem. They divide up that problem and create functions (an API), objects (an API), modules (an API), and services/microservices (an API). At each layer they've written something that communicates with others. The structure of that code reflects the necessary communication to the other parties for their part of the problem. Thus the code is structured following the lines of communication in the organization.

Organization is how we structure communication (that's its primary point) and thus organization = architecture + some wiggle room for communication paths defined outside that (say a SAFe feature team).


> It matters how communication is arranged.

Just to add it here, how evaluations and goals are set matter too. Those tend to be even more constrained by the formal organization than communication.


The systems are organized not only around the programmers' managers, but around their managers too, all the way up the chain.


Very good point. Tying that back into the Conway's law angle on all of this...

I've been a couple levels up the chain at times in my career. When I've seen technical decisions at this level done well is when those people realize that their organizational decisions are inevitably technical decisions because of Conway's law. When the impact of Conway's law is not explicitly considered, things will often look random at the IC level. Too often, organizational leaders don't realize they are dictating technical outcomes with decisions that appear to just be about reporting structure.

An example is when a major new product feature comes along that is big enough to need multiple teams to build/own it for the long term. Where are those teams placed in the organization? Conway's law helps us see that a huge part of the future architecture is being determined by answering that question. Placing those new teams in an existing org mainly responsible for UX will result in a very different architecture than putting the new teams in an org mainly responsible for monetization. Can the teams be split across those orgs? Well, can you envision a reasonable architecture with an enforced divide along those lines? Without recognition of Conway's law, that last question might not be considered and the decision to split or not will be based on other factors.

Unfortunately that requires multidisciplinary thinking (technical architecture, resource allocation, communication structure, people development, ...). Such broad thinking is not always easy. When leadership starts neglecting one or more of these disciplines, more and more of the people doing the work on affected teams start perceiving the decisions as arbitrary and counter productive. Organizational churn (e.g., teams repeatedly changing location within the org over the course of a year or two) is often leadership looking at a different subset of these dimensions over time, yielding different "correct" answers.


I think microservices turn this around - the idea is that if we ignore them as technical units and treat them as organisational units we can invert Conway's law - it's much easier to change how microservices talk to each other than it is to change how people do.

But if you change the dataflow in a few microservices so that the accounts team no longer deals / works directly with the sales team, you have changed the organisation.

plus it's way easier to monitor activity etc


Exactly. Conway's Law is descriptive not prescriptive!


Quite. And many people still seem to confuse this. The way to exploit Conway's law is not to do anything _technical_ but rather to do something _organisational_. The "Reverse Conway Manoeuvre" is the term, IIRC, for this. Rather than trying to "build something along the lines of Conway's law" (which doesn't really make sense) -- one should determine a desired architecture and then set up the _organisation_ and teams to mirror that.


> There's two technical problems that microservices purport to solve

There's a third technical problem that microservices solve, and it's my favorite: isolation. With monoliths, you provide all of your secrets to the whole monolith, and a vulnerability in one module can access any secret available to any other module. Similarly, a bug that takes down one module takes down the whole process (and probably the whole app when you consider cascading failures). In most mainstream languages, every module (even the most unimportant, leaf node on the dependency tree) needs to be scoured for potential security or reliability issues because any module can bring the whole thing down.

This isn't solved at the language level in most mainstream languages. The Erlang family of languages generally address the reliability issue, but most languages punt on it altogether.
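
To make the isolation point concrete, here's a minimal Java sketch (the env var name is hypothetical): in a single process, nothing stops an unrelated module from reading credentials that were only ever meant for the persistence layer.

    public class LeakyModule {
        public static void main(String[] args) {
            // Any module in the monolith can read any secret the process was given,
            // whether or not it has a legitimate reason to.
            String dbPassword = System.getenv("DB_PASSWORD"); // hypothetical env var
            System.out.println("Secret reachable from anywhere in the process: " + (dbPassword != null));
        }
    }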

> The real reason that microservices may make sense is because they keep people honest around module boundaries.

Agreed. Microservices, like static type systems, are "rails". Most organizations have people who will take shortcuts in the name of expedience, and systems with rails disincentivize these shortcuts (importantly, they don't preclude them).


>> because they keep people honest around module boundaries

Imagine a world where every pip/nuget/cargo package was a k8s service called out-of-process and needed to be independently maintained. We would have potentially hundreds of these things running, independently secured via mTLS, observability and metrics for each, and all calls run out of process. This is the abysmally slow hellscape that some are naively advocating for without realizing it.

It is relatively obvious what should be a library (json, numpy, NLog, etc), and what should be a service (Postgres, Kafka, Redis, NATS, etc.) when dealing with 3rd party components. It is also obvious that team scaling is not determined by whether code exists in a library or service from this example since all 3rd party code is maintained by others.

However, once we are all in the same organization, we lose the ability to correctly make this determination. We create a service when a library would do in order to more strictly enforce ownership. I think an alternate solution to this problem is needed.


But of course team scaling is determined this way. Merely releasing a library version has zero business impact. To have impact, library teams must go around to every consumer team and beg them to accept a version bump. This is a) shitty work for which it is impossible to retain people, and b) intractable without an enormous labor force once the number of consumers is big enough.

Services can be continuously deployed by small owning teams.


If library teams could force consumers of the library to upgrade the problem is seemingly solved.

Is this correct or are there other aspects that you are considering?


> Microservices [..] actually solve for a human problem in scaling up an organization.

So does modularity.

"The benefits expected of modular programming are: (1) managerial_development time should be shortened because separate groups would work on each module with little need for communication..."

On the Criteria To Be Used in Decomposing Systems into Modules, D.L. Parnas 1972.

http://sunnyday.mit.edu/16.355/parnas-criteria.html


Parnas was writing under an assumption of BDUF, big-bang releases, and expensive hardware instances. As soon as you want to decouple module deployments and accommodate changing requirements over time you need something else. That "something else" might be "making sure your modular monolith has a good enough test suite that any team can deploy it with only their changes" or it might be "microservices for all", but Parnas' assumption that independently verified modules will need to be built once by independent teams then function correctly when assembled has been comprehensively squashed in the last 50 years.

He's right as far as Conway's Law goes, though.


Here I guess BDUF = big design up front.

Edit: failed to spell "I" correctly. Ouch.


Can you explain why you can deploy microservices independently but not modules?


You can deploy modules independently, but the technical and organisational measures you need in place to do it safely are an extra step you need to take whereas with microservices they're built in. Modules live in the same execution context, so you need a good shared ownership model for that context itself.

The point is that Parnas never conceived of doing it because he was writing about a world where the interfaces between modules were known ahead of time and were static.


> the technical and organisational measures you need in place to do it safely are an extra step you need to take whereas with microservices they're built in

They aren't built in, it's just that the need for them is impossible to ignore. Developers (and management) can't help but recognize and respect modularity in microservices because the separation of services and the APIs between them make the modularity obvious. When the separation between modules only exists in code, and even then only when seen through the eyes of someone who understands the modular architecture of the codebase, it is easily ignored, inevitably forgotten, and might as well not exist at all.


A module is not just code in a separate file. Litmus test: if you cannot have two modules be written in separate languages - you don't have modules.

If all modules have to be deployed using the same build even though different builds of the modules would have been API compatible - you don't have modules.


That sounds way too strict to me. A C++ app might use the C++ ABI to communicate with its modules, which would preclude writing them in any other language without jumping through too many hoops. But that's an ABI constraint and says nothing about the actual modularity of the application.

IMO a module is defined as something that has a self-contained API, and versioning rules for that API if the module is evolving.


You need to do additional steps in both cases:

With modules you need some sort of wrapper application to bring it all together.

With microservices you need some sort of network layer so that the microservices can talk to each other.


And also with modules you need the business processes to coordinate deployments across teams, because they all live in the same wrapper application. That's what stops them being independent.

You don't restart the network every time you deploy a microservice.


>And also with modules you need the business processes to coordinate deployments across teams, because they all live in the same wrapper application. That's what stops them being independent.

If we're being technical some languages support hot swapping modules so no restart would be needed. Setting that aside, restarting an application isn't anything that needs coordination today. You wouldn't even restart the application. You'd deploy a new version, swap over, and shut down the old version.

>You don't restart the network every time you deploy a microservice.

No, but something changes in the network configuration so that the other microservices are aware of the deployment.


Same thing with microservices, unless you do a blue-green deployment, a planned shutdown, a load balancer between releases, ...


The key word is "independently".


There is no independently in distributed systems.


But there are rules of blame.

With microservices, as long as you maintain the contract with caller services, you can deploy whenever you want. If you have some kind of issue, your team is solely responsible. If the caller services do weird things, they are responsible for fixing them.

If you are pushing changes to a module as part of a more monolithic, or wrapper, service - if you do a deploy and break the whole big service, you are now responsible for the entirety of any issue, which is now more likely due to integration risk and unrelated changes from others, especially because now there needs to be coordination across many teams integrating - hopefully via tests and observability. But this requires a high degree of maturity for automated quality assurance and site reliability. If you don't have that, the operational risks are very high. So the alternative is having some ops-like function, or other integration team, responsible. Or doing more top-down waterfall coordination.

Given the service maturity needed is rare, microservices distributes that operational ownership in a way where there is more accountability.


Unless there is infrastructure in place, good luck replicating those dropped requests as services are replaced.

Infrastructure that someone has to take care of.


Every organisation I've worked at that had microservices has had all of them released every other Thursday at the same time by the same pipeline because they are tied to the organisation scrum schedule. Also, most changes are part of epics that span multiple microservices. If you are gonna argue that properly done microservices don't have that problem you are free to practice your perfect microservices alongside perfect scrum, oop, and communism.


I was arguing the opposite, coordinated deployments already presumes more maturity than I was considering. I've only been at one company with same-day deployments. All the other places have been free-for-alls where each group determines their own release schedule, with their own pipelines. Managing deployments between dependent services thus requires significant coordination overhead (or none, because you are stuck with what you get until the other teams deliver your needs on their own timeline).

In chaotic environments like that, microservices help with defining clear ownership of operational responsibility, though at the cost of clear responsibility for the overall distributed system. This comes at the cost of making it hard to develop, let alone ship, changes that impact multiple microservices (outside the scope of a single team, or closely related sibling teams).

The main point I was making was that, with a more monolithic system, managing that kind of organizational dysfunction necessitates top-down, waterfall-like control. So moving to microservices enables non-coordinated agility, removing one layer of coordination problems (at the cost of other problems).


You can't solve the halting problem either, and yet somehow useful work gets done.


For code modularity to serve the same purpose, I think there needs to be explicit language-level support, because on a medium-sized or larger project, when the modularity exists only in the minds of developers, it might as well not exist at all. You need virtually all of your developers to understand the design and respect the seriousness of violating it, and beyond a certain project size, that isn't possible.

Developers do a much better job with microservices. I think it's easy for them to respect the seriousness of designing and changing the API of a microservice. In contrast, they often don't respect or even realize the seriousness of making a change that affects the modular design of code.

Language-level support for defining and enforcing module interfaces might help developers invest the same level of care for module boundaries in a modular monolith as they do in a microservices architecture, but I've yet to work in a language that achieves this.
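
For what it's worth, Java's module system (JPMS) is one attempt at this kind of language-level enforcement; a minimal sketch with hypothetical package names:

    // module-info.java for a hypothetical billing module
    module com.example.billing {
        // Only the API package is visible to other modules; internals stay hidden.
        exports com.example.billing.api;
        // Dependencies have to be declared explicitly.
        requires com.example.core;
    }

Whether teams adopt and respect it is still a cultural question, but at least the compiler refuses access to non-exported packages.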


You would hope so... but it's not necessarily the case.

Often people are too lazy to issue an API version change on a "bugfix", and take down multiple applications.

Funny thing, a modularized monolith may have identified some of those issues.


Off topic, but 70s computing papers are just the best aren't they?


Totally agree. Also, microservices shine IF you need different release schedules for two services. If they are managed by the same project/team, the effort you pay is high and the advantage may not pay the bill, so be careful in that scenario.


yeah i've always thought that microservices shine when the org chart defines them. I've been on teams where we were responsible for multiple microservices and it was so annoying to have to bounce back and forth between code bases and deployments, and then it was super easy to intertwine different microservices when you were in them so often. I feel like if you don't have a team dedicated full time per microservice then you probably shouldn't be using that architecture.


Modules aren't an alternative to microservices in a reasonable way though. And while modules solve the modularization problem at the code level, they don't really solve modularization at the service level. The main alternative to microservices is monoliths, and for many applications I far prefer microservices to monoliths. I want modularization in how I scale my app; I don't want to spin up a whole bunch of servers that can serve almost any possible function in production. I'd prefer more delineated responsibilities. I don't think modularization solves this if, after the modules, you just throw all your modules in one big bucket and serve that up.


I think what is missing here is that it doesn't have to be an either-or at the organizational level.

If you have a particular piece of the system that needs to be scaled, you can take that module out when it becomes necessary. You can alter your system such that you deploy the entire codebase but certain APIs are routed to certain boxes, and you can use batch processing patterns with a set of code outside the codebase.

You can have an admin application and a user application, and both of them are monoliths. They may or may not communicate using the database, events, or APIs.

However, you don't make this split based on the single-bounded-context guideline.


Just because the monolith image contains all code doesn't mean each deployment needs to run all that code, or even the same version of the code. A deployment can run only a small portion of the code, determined by deployment args. It can even run an older version than other deployments.


That's just a microservice enabled by configuration.


That's exactly what the comment you're replying to is saying though. You're talking about scalability, which they say modules don't exactly solve.

And I do agree that most people need less of those than they think.


> scalability (being able to increase the amount of compute, memory and IO to the specific modules that need it).

I gave a talk [1] about scalability of Java systems on Kubernetes, and one of my recommendations is that Java-based systems - or any system on a runtime similar to the JVM, like CLR or even Go - should be scaled diagonally. Efficiency is the word.

While horizontal scaling can easily address most performance issues and load demand, in most cases it is the least efficient way for Java (and again, I risk saying .NET and Go), as these systems struggle with CPU throttling and garbage collectors.

In short, and exemplifying: one container with 2 CPUs and 2GB of RAM will allow the JVM to perform better, in general, than 2 containers with 1 CPU/1GB RAM each. That said, customers shouldn't be scaling horizontally to any amount more than what is adequately reasonable for resiliency, or unless the bottleneck is somewhere like disk access. For performance on the other hand, customers should be scaling vertically.

And Kubernetes VPA is already available. People just need to use it properly and smartly. If a Kubernetes admin believes a particular pod should double in number of replicas, the admin should consider: "would this microservice benefit even more from 1.5x more CPU/memory than from 2x more replicas?" and I dare say that, in general, yes.
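
As a quick sanity check of what the runtime is actually working with, here's a small sketch (not tied to any framework) showing that the JVM derives its defaults from the CPU and memory limits the container exposes, which is what makes the 2 CPU/2GB vs 2x 1 CPU/1GB comparison matter:

    public class ContainerSizing {
        public static void main(String[] args) {
            // Modern JVMs are container-aware: GC thread counts and the default heap
            // are derived from the limits the pod exposes.
            int cpus = Runtime.getRuntime().availableProcessors();
            long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
            System.out.println("JVM sees " + cpus + " CPUs and ~" + maxHeapMiB + " MiB max heap");
        }
    }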

[1] https://www.youtube.com/watch?v=wApqCjHWF8Q


Basically any software, even something in C or C++ with manual memory management, will scale more efficiently vertically than horizontally. That is, it would be better to run one instance twice as big than to run two instances on the same machine. There is almost always some overhead or cache that will make the performance of fewer instances better. The only exception is resource contention, but it is generally not too hard to fix the important cases.

My default recommendation has always been to make instances "as big as possible, but no bigger". You may need 3 instances for redundancy, fault tolerance and graceful updates, but past that you should probably scale up to near the size of your machine. There are obviously lots of complications and exceptions (for example, maybe using just 3 instances uses 90% of your machines so you can't bin pack other processes there, so it is better to use 4 instances at 70% as the machines will be used more efficiently), but bigger by default is generally a good option.


> make instances "as big as possible, but no bigger"

Exactly.

But I am seeing more and more of the opposite: make instances as small as possible, and scale out (not up) if/when needed.


It's really not a big surprise, tbh.

Making data available to a thread on the same core has exceptionally high bandwidth. Just that alone will help with scaling.


Twitter seems to have adopted Microservices in 2014, and in 2013 they had ~200M MAU (presumably using a monolith architecture).

Even if Microservices are better for scale, most companies will never experience the level of scale of 2013 Twitter.

Are Microservices beneficial at much smaller levels of scale? Ex: 1M MAU


> most companies will never experience the level of scale of 2013 Twitter

I fully agree with your argument. Then again, as mentioned elsewhere in this discussion, microservices are often not introduced to solve a scalability problem but an organizational one and there are many organizations that have more engineers than Twitter (had).

Personally, I still don't buy that argument because by solving one organizational problem one risks creating a different one, as this blog post[0] illustrates:

> […] Uber has grown to around 2,200 critical microservices

Unsurprisingly, that same post notes that

> […] in recent years people have begun to decry microservices for their tendency to greatly increase complexity, sometimes making even trivial features difficult to build. […] we experienced these tradeoffs first hand

I'm getting very strong Wingman/Galactus[1] vibes here.

[0]: https://web.archive.org/web/20221105153616/https://www.uber....

[1]: https://www.youtube.com/watch?v=y8OnoxKotPQ


Github got absolutely enormous as a Rails monolith, plus some C extensions in a few hot paths.


From what I can tell, Github transitioned over to micro-services ~2020 (https://www.infoq.com/presentations/github-rails-monolith-mi...). By this point it had already grown to a scale that very few companies will ever reach.

I'm not sure at what point the monolith codebase became absolutely enormous, but I would bet that GitHub grew to 1M active users with a monolith just fine.

Micro-services might be necessary for the companies like Github, Twitter, Google that grow to be huge, but monoliths seem to work just fine for the vast majority of other companies.


One other thing they give you at a large organization is flexibility in your stack

If a team wants to try out a different language, or hosting model, or even just framework/tooling, those things can be really hard to do within the same codebase; much easier when your only contract is handling JSON requests. And if your whole organization is totally locked into a single stack, it's hard to keep evolving on some axes

(I'm generally against microservices, but this is one of the more compelling arguments I've heard for them, though it still wouldn't mean you need to eagerly break things up without a specific reason)


A tech zoo is sometimes considered an anti-pattern in microservices. By introducing a different language into your organization, you decrease the mobility of developers between code bases and dilute technical knowledge.


Not everybody should pull in their favorite stack just for fun, but it seems valuable to have the option of trying out new things that could turn out to be a better way forward for the org.


Rewriting a module from one tech stack to another is not much harder when that module was a part of a monolith and not a separate service, except that you haven't paid the upfront cost of bootstrapping a new service, putting in RPC calls, etc. And in any case, starting a project as microservices is already a bad practice due to a number of reasons, the most important for me is that it's hard to change module boundaries, which you will most likely get wrong in a new project.


Not worshiping the Microservice doesn't mean that you avoid services in different languages.

Not all engineers need to move around - DS and MLEs are generally useless on the frontend, and vice versa.


Forcing consistent language, runtime, tooling etc is generally a good thing as it reduces ongoing operational burden as well as some retention & hiring dilemmas.


Strongly typed languages with support for binary modules are just as good at keeping people honest.

Each team gets to distribute libraries over repos (COM, JAR, DLL, whatever), no way around that unless they feel like hacking binaries.


Services generally have a stateless request/response architecture, inherited from the success of HTTP, and contra the experiences of CORBA and COM, the latter being much better suited to Office Object Linking and Embedding and VB visual controls - local, assumed reliable, not across a network boundary.

Creating an object-oriented API for a shared library which encapsulates a module to the degree a service boundary would is not trivial and it's very rarely done, never mind done well. Most OO libraries expose an entire universe of objects and methods to support deep integration scenarios. Maintaining that richness of API over time in the face of many consumers is not easy, and it (versioning) is a dying art in the eternal present of online services. The 90s aren't coming back any time soon.

If you own a library which is used internally, and you add a feature which needs a lot more memory or compute, how do you communicate the need to increase resource allocation to the teams who use the library? How do you even ensure that they upgrade? How do you gather metrics on how your library is used in practice? How do you discover and collect metrics around failure modes at the module boundary level? How do you gather logs emitted by your library, in all the places it's used? What if you add dependencies on other services, which need configuring (network addresses, credentials, whatever) - do you offload the configuration effort on to each of the library users, who need to do it separately, and end up with configuration drift over time?

I don't think binary drops work well unless the logic they encapsulate is architecturally self-contained and predictable; no network access, no database access, no unexpected changes in CPU or memory requirements from version to version.

There's plenty of code like this, but it's not usually the level of module that we consider putting inside a service.

For example, an Excel spreadsheet parser might be a library. But the module which takes Excel files uploaded by the user and streams a subset of the contents into the database is probably better off as a service than a library, so that it can be isolated (security risks), can crash safely without taking everything down, can retry, can create nice logs about hard to parse files, can have resource metrics measured and growth estimated over time, and so on.


So much for the theory, most services I see in the wild are stateful.

As for scalability, most orgs aren't Facebook nor Google scale, regardless of how much they want it to be true.

All of those issues can be tackled when proper architecture design is done, instead of coding on the go with a "we'll see" attitude.


People keep regurgitating this "you ain't going to be Google" mantra, but I worked there, and in reality the generic microservice stack is in a totally different league of complexity and sophistication from what Google and co. have. This argument is basically a reductio ad absurdum.


We keep regurgitating this, because most microservices are the 2nd coming of big data that fits into a USB stick.


The argument is simply the empirical observation that the vast majority of microservices deployments don't operate anywhere near the scale that actually requires microservices and is not going to operate at that scale in the foreseeable future. When scalability is the primary argument in favor of microservices, how is that absurd?


So all your microservices implement sagas or other synchronisation patterns that ensure 100% data consistency?


If you distribute the state over ALL your services and need 100% data consistency you’re holding it wrong


Ok, how big a percentage of them implement data consistency in your company? :)


You'll need to support loading two versions of each DLL into memory and invoking both on some percentage of servers to be able to replicate a microservice though.

The important part of microservices isn't just API boundaries, it's lifecycle management. This CAN be done with a DLL or JAR, but it's MUCH harder today.


I don't know about you, but I could deploy multiple versions of my Java webapp to Apache Tomcat over 15 years ago... and gracefully sunset the old version.


Not at all.

Sure if one likes to make things harder than they are supposed to be, there are lots of paths down that road.


And then each DLL creates its own thread pool... well, usually multiple.


Which is irrelevant, as many microservices do the same inside of them, and so do the libraries that they consume.


If it's .NET or Java, that's very unlikely.


> But most people need it a lot less than they think. Normally the database is your bottleneck and if you keep your application server stateless, you can just run lots of them

At my last job, there were quite a few times where being able to scale some small "microservice instance" up from 2 -> 4 instances or 4 -> 8 or 8 -> 12 was a lot easier/quicker than investigating the actual issue. It'd stop production outages/hiccups. It was basically encouraged.

Not sure how that can be done with a "it's all modules in a giant monolith".


A few ways, the easiest being to scale up the whole monolith with more instances. Another way is to run multiple "services" using the same codebase, so you have workload segmentation, either via synchronous network calls or async worker systems.


devil's advocate:

> A few ways, the easiest being to scale up the whole monolith with more instance

as far as I know, there's no way to granularly scale up a monolith. if the monolith has 20 or 50 or 100 modules and you need 1 or 2 of them scaled, you have to scale up the entire thing which is huge, expensive, etc.

> Another way is run multiple "services" using the same codebase, so you have workload segmentation, either via synchronous network calls, or async worker systems.

this is interesting. a monolith with some form of IPC? why not do microservices at that point? that sounds like microservices-ish?


> you have to scale up the entire thing which is huge, expensive, etc.

Yes, yet people still do it that way. This is tradeoff against the costs of microservices. I'm not saying it is worth it, but yes, sometimes you can just inefficiently throw hardware resources at scaling problems.

> this is interesting. a monolith with some form of IPC? why not do microservices at that point? that sounds like microservices-ish?

Because you are incrementally changing an existing monolith, but still getting some of the benefits of scaling distinct workloads independently. If you do this right, it does lend itself to reworking the new "service" to become its own microservice. Or you can just keep going with the shared codebase.


>I'm not saying it is worth it, but yes, sometimes you can just inefficiently throw hardware resources at scaling problems

Microservice scaling is also an inefficient hardware scaling option.


is this a common nomenclature/design? how does a monolith that makes IPC/RPC calls to submodules not basically microservices?


yes, it's a very common design, sometimes called a "distributed monolith." it's not microservices because the number of additional services is usually small, and the codebase is still tightly coupled to itself (even if well factored in terms of modules). i.e., everything is still tested together, and there's still a single monolithic build and deployment process, and no team can go off and decide to write a service in a different language.


so for example in the context of Java, you have one main App/main() entry, probably binding to a port (let's say 80, serving HTTP requests) acting as a router

it brokers them to subclasses/handlers for CRUD/RPC/plumbing to the database/other services

then in a separate process App/main() you have a cron job or a queue of some sort

you bundle these together as a monolith into one main()?


In the case of Java, imagine we have a big Maven project that has a bunch of different well factored modules that form the "libraries" of our application. At first, in our Maven project we also have a single module which is the deployable for "the monolith" -- it consumes all the other "lib" modules and gets packaged into a fat jar.

We deploy a few instances of the monolith fronted by a load balancer. There's not much scaling logic here -- when we see p75 request latency reach a certain threshold, we add another monolith instance.

A while later, we realize that the /foo route is getting a ton of spikey traffic, which makes it hard to scale. So, in the load balancer, we set up a new pool that just targets /foo, and in our Maven project we create a new module that will be the deployable for this pool. Now, /foo still requires most of our libs, so the jar we're bundling is still "the monolith" in essence, but maybe it doesn't require a few dependencies, and so is a little slimmer. We deploy that and everything is scaling great!

Then, we discover that in our main monolith, the background threads we were using to send emails are getting really clogged up, which is adversely affecting performance. We decide we want to move this work to an external persistent queue, like Kafka. Now, for our consumer, we create one more Maven module for a new jar. This time, we're lucky. Emails only need 2 of our libraries and so this is actually a pretty small deployable, but it's still being built from the same core set of libraries as the monolith.


so you end up with a monolith that might have a config flag or something to determine its purpose when it comes online, and then it can go down a few different code paths. pretty cool. is this something that is done on the fly and then later you don't let that code pattern stay alive (aka, is it not a smell/bandaid)?

my concern would be if you accidentally create bugs by basically creating new mini-monolith flavors as the modules were never expected (in the beginning of their creation) to run in a "partial" context


I was part of a project that did exactly this - same monolithic code base with config flags that transformed it into web server, worker or scheduler. It allows scaling all parts separately, but I can confirm it's easy to introduce bugs if you're not careful. Since you're sharing the same DB, migrations also need to be backwards-compatible.

A cool side-effect is that you can usually run the whole thing in one app for development by just enabling all profiles - as opposed to some microservice architectures where you need dozens of containers and DBs for replicating inter-service bugs.
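
A minimal Java sketch of that pattern (the role names and start methods are hypothetical): every deployment runs the same artifact, and a flag decides which subset of the code each instance actually starts.

    public class Monolith {
        public static void main(String[] args) {
            // Same build artifact everywhere; the ROLE flag picks the workload.
            String role = System.getenv().getOrDefault("ROLE", "all");
            switch (role) {
                case "web":       startWebServer(); break;
                case "worker":    startQueueWorkers(); break;
                case "scheduler": startScheduler(); break;
                default:          // local development: run everything in one process
                    startWebServer(); startQueueWorkers(); startScheduler();
            }
        }
        static void startWebServer()    { System.out.println("HTTP endpoints up"); }
        static void startQueueWorkers() { System.out.println("consuming background jobs"); }
        static void startScheduler()    { System.out.println("cron-style jobs scheduled"); }
    }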


The way it usually works, if 1 of your 100 modules needs to be scaled, it probably means the overall monolith only needs a small amount of extra resources, since that module is only using 1% of the total. So it's not like you need to double the instances of the whole thing.

The benefit though is you get a bit of 'free' scaling. Most of the modules don't even have to consider scaling at all, as long as they are growing ~average or slower. So this saves a lot of operational burden for many teams. Conversely with microservices every team has to consider scaling, no matter how simple their service is.

(If you do have 1 module out of 100 that takes up 40% of the resources, then it may be appropriate to split it up. Monolith doesn't literally mean "exactly one service", after all, just a small number of big ones.)


> The way it usually works, if 1 of your 100 modules needs to be scaled, it probably means the overall monolith only needs a small amount of extra resources, since that module is only using 1% of the total. So it's not like you need to double the instances of the whole thing.

I could be wrong but doesn't monolith usually refer to one really heavy app? As soon as you go from needing 1 instance to 2 (because 1 of the 100 inner "modules" needs to be scaled), I would guess most of the time there's a high expense (lots of memory) in duplicating the entire monolith? Unless you can scale threads or something instead of the entire process... Or if the entire process doesn't initiate 100 modules as soon as it is started (although I imagine it would in most cases)


>doesn't monolith usually refer to one really heavy app?

No. Backend server clusters have been around almost as long as the internet has existed.

>I would guess most of the time there's a high expense (lots of memory) in duplicating the entire monolith?

That's right, but there's a gotcha. Every time you split your monolith into parts, that doesn't mean that each part will consume 1/n of the original monolith's startup resources. There will be a floor of memory usage. Consider 1 monolith that eats up 1GB to start up vs 4 microservices that use 512MB each. Right from the start you've doubled the memory footprint from 1GB to 2GB. That only gets worse the more services are created. Another problem is that microservice/cloud folks love to create anemic machines, usually setting up 512MB RAM and half a CPU core for a service that needs 380MB minimum :) That's 75% overhead and 25% working memory. It's bonkers.


You do lose some memory to holding unnecessary code and whatever data caching that a module does. But it's not generally very much and large data caching can be pushed to Redis. If you're not buying your servers by the rack or data center it's ultimately negligible compared to development cost.


> as far as I know, there's no way to granularly scale up a monolith. if the monolith has 20 or 50 or 100 modules and you need 1 or 2 of them scaled, you have to scale up the entire thing which is huge, expensive, etc.

not necessarily. yes, you do pay for executable bloat and some memory overhead from loading code into memory, but the argument here is that you can still deploy a "Service B" from the same deployable that only services certain kinds of requests in order to scale parts of the application horizontally.

> this is interesting. a monolith with some form of IPC? why not do microservices at that point? that sounds like microservices-ish?

no, because again, you're still deploying the same single codebase, it's just the instance on the other end of the queue is configured just to consume and process messages.


Working at a huge company I feel like every service owning its own data store (or at least only accessing/manipulating another’s through its API, not directly in the underlying store) is the only sensible way to do it, because you can never really envision what clients you will have ahead of time or what they will want to do, and that’s an easy recipe for headaches. But the smaller the company gets the further I’d go along the spectrum from full isolation to isolated services with shared stores to monolith.


I'm glad you made this point.

This happens a lot. Organisational problems conflated with technical ones. Technical solutions to organisational politics. Etc. It's often easier to admit technical challenges than organisational ones. Also, different people get to declare technical and organisational challenges, and different people get to design the solutions.

There's also a dynamic where concepts are created by the technical 1% doing vanguard or challenging tasks. The guys responsible for scaling youtube or whatnot. Their methods and ideas become famous and are then applied to less technically demanding tasks, then in less technical organisations entirely.

I think if we can be honest at the actual problem at hand, 80/20 fixes will emerge. IE, the "real" value is not the architecture per se, but the way it lets you divide the responsibilities in the organisation.



>scalability (being able to increase the amount of compute, memory and IO to the specific modules that need it).

I would like someone to spell this out. It seems to me people are claiming that if a single binary serves some CPU-bound requests and some memory-bound requests, and you give it more memory, then the memory gets "wasted" on the CPU-bound part. Or if you give it more CPU, the CPU gets wasted on the memory-bound part. But this kind of assignment of resources to code paths seems to be a consequence of microservices. In a single computer, single binary situation resources should not get used up unless the workload actually wants to use them. A compute-heavy thread doesn't cost heap. A big heap doesn't slow down a compute-heavy thread. What am I missing?


I’d love for a language / framework that allows for an application to be composed of “modules” that can either be run in a single process, or deployed as multiple independently scalable processes, with a mostly transparent RPC system requiring minimal boilerplate.

My IDE should be able to easily traverse the call graph. My development environment should be simple to setup.

I’ve worked on microservices that required an insane amount of boilerplate to do simple things. Like 7 layers of controllers, clients, services, data services, etc, just to fetch a simple piece of data. And the developer experience of running dozens of services in a Kubernetes cluster running on my dev machine was awful.

Does anything like this exist?

I only dabbled many years ago but Erlang/OTP comes to mind.

And tRPC for TypeScript calls in browser and server.


I made a framework along those lines for work which we used for many years to mediate ~1000 namespaced commands across dozens of repos. Whether it ran distributed or bundled (I called it "mega-service" mode) was completely controlled by config. We mostly used ZMQ for inter-service messaging, but I had implemented http, redis pub/sub, grpc, etc at various times to prove the concept.

> https://github.com/NathanRSmith/lib-courier-js

Interestingly, we're pursuing a monorepo & multi-monolith setup for the next version of our platform. So lib-courier is no longer necessary to stitch it all together. It was fun while it lasted though. Once you understood the routing algo & code patterns, lots of stuff "just worked".


The industry has been trying to do "transparent remoting" for decades now. DCOM and CORBA were both basically that.

It turns out that "transparent RPC" is basically a contradiction in terms. As soon as you start doing things across process boundaries, and even more so across network boundaries, it requires a very different approach for API design - something that's very cheap locally, like passing objects by reference, becomes expensive and full of footguns.

If you reduce the feature set to the point where it can be transparently mapped to either local or RPC - which is, basically, function calls processing and returning data organized into arrays & trees (but not graphs) - there's still the issue that RPC has so many more failure points that you have to handle that would never light up in local.

This is all still doable; I have my doubts about how practical the end result would be, though.


By “mostly transparent” I was mostly thinking type safety, ergonomics, no hand written boilerplate, etc.

Totally fine to support a limited set of primitives and collections passed by value, and require developers consider the failure modes.

In my experience HTTP+JSON or gRPC are usually lacking in at least one of those respects. If you constrain all the services to be the same language it becomes easier.


Hey, we have a system like that at my company. Microservices are self contained and responsible for exposing their interface via SDKs created with a tRPC like library. Jump to definition works across the entire monorepo like a charm. Could you take a look and give me any feedback? https://github.com/rectech-holdings/umbrella-corp-boilerplat...


Both Modularization and Microservices solve scaling problems.

Modularization allows development to scale.

Microservices allow operations to scale.


> A better rule is for one service to own writes for a table

This breaks down when the database is essentially a generic graph. The worst solution I've seen to this is to have another service responsible for generic write operations and any service that wants to write data goes through that service -- you're essentially re-introducing the problem you're purporting to solve at a new layer with an added hop and most likely high network usage. The best solution I've seen is to obviously have the monolith. The enterprise solution I've seen, while not good by any means but nearly essential for promoting team breakdown and ownership, is to just let disparate services write as they see fit, supported by abstraction libraries and SOPs to help reinforce care from engineers.


An added network call is often better than a disjointed view of the database schema.


That's a difficult claim to make without knowing the workload.


> I'm not so positive on every microservice maintaining its own copy of state, potentially with its own separate data store. I think that usually adds more ongoing complexity in synchronization than it saves by isolating schemas. A better rule is for one service to own writes for a table, and other services can only read that table, and maybe even then not all columns or all non-owned tables.

I’ll go one step further and say that you should treat your data stores as services in and of themselves, with their own well-considered interfaces, and perhaps something like PostgREST as an adapter to the frontend if you don’t really need sophisticated service layers. The read/write pattern you recommend is a great fit for this and can be scaled horizontally with read replicas.


> A better rule is for one service to own writes for a table, and other services can only read that table,

Been there: how do you handle schema changes?

One of the advantages that having a separate schema per service provides is that services can communicate only via APIs, which decouples them allowing you to deploy them independently, which is at the heart of microservices (and continuous delivery).

The way I see it today: everyone complains about microservices, 12 factor apps, kubernetes, docker, etc., and I agree they are overengineering for small tools, services, etc., but if done right, they offer an agility that monoliths simply can't provide. And today it's really all about moving fast(er) than yesterday.


For us, we started off with a world where each service communicates with the others only via RabbitMQ, so all fully async. So theoretically each service should be able to be down for however long it likes with no impact to anyone else; then it comes back up and starts processing messages off its queue and no one is the wiser.

Our data is mostly append-only, or if it's being changed, there is a theoretical final "correct" version of it that we should converge to. So to "get" data, you subscribe to messages about some state of things, and then each service is responsible for managing its own copy in its own db. This worked well enough until it didn't, and we had to start doing true-ups from time to time to keep things in sync, which was annoying, but not particularly problematic, as we design to assume everything is async and convergent.

The optimization (or compromise) we decided on was that all of our services use the same db cluster, and that if the db cluster goes down, it means everything is down. Therefore, if we can assume the db is always up, even if a service is down, we consider it an acceptable constraint to provide a read-only view into other services' dbs. Any writes are still sent async via MQ. This eliminates our sync drift problem, while still allowing for performant joins, which HTTP APIs are bad at and our system uses a lot of.

So then back to your original question: the way that this contract can break is via schema changes. So for us, since we use Postgres, we created database views that we expose for reading, and our view updates are constrained to always be backwards compatible from a schema perspective. So now our migration path is (rough SQL sketch after the list):

- service A has some table of data that you'd like to share

- write a migration to expose a view for service A

- write an update for service B to depend upon that view

- service B now needs some more data in that view

- write a db migration for service A that adds that missing data, but keeping the view fully backwards compatible
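In SQL terms that path looks roughly like this (table, view and role names are made up):

    -- steps 1-2: service A exposes a read-only view for service B to consume
    CREATE VIEW orders_shared AS
        SELECT id, customer_id, status
        FROM orders;
    GRANT SELECT ON orders_shared TO service_b;

    -- steps 4-5: service B needs created_at too, so extend the view.
    -- CREATE OR REPLACE VIEW only allows adding columns at the end while
    -- keeping existing names and types, which is exactly the compatibility rule.
    CREATE OR REPLACE VIEW orders_shared AS
        SELECT id, customer_id, status, created_at
        FROM orders;

Step 3, updating service B to read from the view, is an ordinary deploy on B's side.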


> So then back to your original question: the way that this contract can break is via schema changes. So for us, since we use Postgres, we created database views that we expose for reading, and our view updates are constrained to always be backwards compatible from a schema perspective. So now our migration path is:
>
> - service A has some table of data that you'd like to share
>
> - write a migration to expose a view for service A
>
> - write an update for service B to depend upon that view
>
> - service B now needs some more data in that view
>
> - write a db migration for service A that adds that missing data, but keeping the view fully backwards compatible

I don't think I understand. Do you need to update (and deploy) service B every time you perform a view update (from service A), even though it's backward compatible?


if service B needs some new data from the view that isn't being provided, then you first run the migration on service A to update that view and add a column. Then you are able to update service B to utilize that column.

If you don't need the new column, then you don't need to do anything on service B, because you know that existing columns on the view won't get removed and their type won't change. You only need to make changes on service B when you want to take advantage of those additions to the view.


This only works if you apply backward compatible changes all the time. Sometimes you do want to make incompatible changes in your implementation. Database tables are an implementation detail, not an API, yet that's exactly what you end up exposing through the view.

But hey, every team and company has to find their strategy to do things. If this works for you, that's great!

It's just not a microservice by definition.


I would never claim that our setup uses microservices. Probably just more plainly named "services".

And yes, that is correct, we agree that once we expose a view, we won't remove columns or change types of columns. Theoretically we could effectively deprecate a column by having it just return an empty value. Our use cases are such that changes to such views happen at an acceptable rate, and backwards incompatible changes also happen at an acceptable rate.

Our views are also often joins across multiple tables, or computed values, so even if a view is often quite close to the underlying tables, it is intentionally used as an abstraction on top of them. The views are designed first from the perspective of: what form of data do other services need?
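A made-up example of what such a view can look like, including the empty-value trick for deprecating a column mentioned above:

    CREATE VIEW customer_summary AS
        SELECT c.id               AS customer_id,
               c.email,
               count(o.id)        AS order_count,    -- computed value
               sum(o.total_cents) AS lifetime_cents, -- computed value
               ''::text           AS legacy_region   -- deprecated, kept for compatibility
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.email;

Consumers never learn whether order_count is a real column, a trigger-maintained counter, or this aggregate.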


Yeah... The issue lies in cases where the decomposition is so extreme, that you end up not able to deploy independently.

And even with the benefit of a microservice owning its own relational DB, schema changes still aren't easily reversible. Especially when new, unanticipated data has been flowing in.

Stateless microservices are great, in the sense that you don't have to build multiple versions of APIs... but stateful microservices are just a PITA.


what if we put the table behind a view? would that count as communication via an API?


No, because that's not the service API. That's just a view over a table - an internal data structure used to represent some business domain model which should be properly exposed through some implementation-agnostic API. Service B should not care how service A implements it.

And the fact you must keep backward compatibility ( see the OP answer above) at the implementation level shows how fragile this approach is - you will never be able to change a database schema as you wish just because you have consumers that rely on the internal details of your implementation - a table. If you want to change a field from char to int, you can't. How is it important for service B to know that level of detail? An API could still expose a domain model as char, if you want to, and maybe introducing new fields, new methods, whatever way. Or maybe nothing at all, maybe it's not necessary because the database field is never exposed but only used internally (!!).

On the other hand, if you expose a database agnostic API (e.g., http, rpc, ... whatever) you can even swap the underlying database and nobody will notice.

A good rule of thumb is: if I change the implementation, do I need to ask/tell another team? If the answer is yes, that is not a microservice.


a view would allow you to change a field's type.

something like this

CREATE VIEW table_read_api AS SELECT CAST(now_int AS char) AS was_char FROM the_table;

not sure the view counts as an internal structure, because it is exposed via the API as-is.

SQL allows you to swap the underlying database, at least if no vendor-specific SQL features are used.


> a view would allow you to change a field's type.

Which proves my point: if I change a field type ( in the table ), I will have to change the view type. I need to change 2 services because one of them changed an implementation detail.


before the change

    CREATE TABLE the_table (was_char char);

    CREATE VIEW table_read_api AS SELECT was_char FROM the_table;

a client consumes data via:

    SELECT was_char FROM table_read_api;

after the change (done in one transaction, since in Postgres the view has to be dropped and recreated when the column type underneath it changes):

    BEGIN;
    DROP VIEW table_read_api;
    ALTER TABLE the_table ALTER COLUMN was_char TYPE int USING was_char::text::int;
    ALTER TABLE the_table RENAME COLUMN was_char TO now_int;
    CREATE VIEW table_read_api AS SELECT CAST(now_int AS char) AS was_char FROM the_table;
    COMMIT;

the client uses the same query

    SELECT was_char FROM table_read_api;

the type of the column has not changed - it is still a char; only the "internal implementation" changed


Those human scaling problems are also security scaling problems; the Unix process model (and, in fact, the von Neumann architecture) doesn't separate modules by security capability, nor is it practical to make it do so.

Microservices allow your permissions to be clear and precise. Your database passwords are only loaded into the process that uses them. You can reason about "If there's an RCE in this service, here's what it can do". Trying to tie that to monoliths is hard and ugly.


I don't understand the argument around microservices purporting to fix scalability. Say your system has modules A, B and C in one app, and that A requires more resources while B and C are relatively unused. Won't B and C just run less and thereby use appropriately fewer resources from within the same system as A? Are microservices just an aesthetic separation (it feels nice to know you're "only" scaling up A)?


It doesn't necessarily fix it, but it wastes fewer resources and is easier to manage than the bin packing you'd otherwise need.


Another benefit of microservices is it allows you to have a permission boundary, which can restrict the scope of damage if a single server/service is compromised.

And for that matter it can (but not necessarily) limit how much damage a bad code release can do.

Of course you don't get those benefits for free just by using microservices, but implementing those kinds of boundaries in a monolith is a lot harder.


The same techniques used to restrict intrusion can be applied to a monolith.

Microservices aren't inherently more secure.


Really? How do you make it so that different modules in the same process have different IAM credentials and can't get the credentials for a different module? How do you make sure a buffer overflow in your analytics module doesn't allow an attacker to read memory from your login module? How do you make sure an RCE in your image upload code doesn't give an attacker access to credentials for your payment processor?

Maybe it is possible with some low-level system calls, but at the very least it is a lot more difficult than using separate VMs or containers.


Giving me an example that is created to work explicitly at the system/container level isn't the "gotcha" you think it is. (IAM profiles have their own limitations.)

Separate process vaults, HSMs and other techniques of offloading security credentials - are the same for microservice architecture, as "monolith".

The implication that anything other than a microservice architecture must be exclusively an uber executable doing literally everything, is naive.


> Separate process vaults, HSMs and other techniques of offloading security credentials

How do you give one module access to the vault/HSM without also giving any other code in the same process access, even in the event of a security compromise? And that still doesn't address the problem of a vulnerability anywhere in the monolith potentially exposing any sensitive data in the process's memory (such as user data, including passwords).

> The implication that anything other than a microservice architecture must be exclusively an uber executable doing literally everything, is naive.

Ok, replace "microservices" with "service oriented architecture". I'm comparing specifically against a monolith where you have an "uber executable doing literally everything".


> I'm comparing specifically against a monolith where you have an "uber executable doing literally everything".

I doubt that anyone here is advocating for a system that is literally everything in one OS process. I bet even you would say that a process with an RDBMS connection is a monolith, even though it doesn't fit your definition.

As for modules - they can run as a separate process on the same machine, to isolate security critical elements.

Conversely, I doubt that any reasonable microservices architecture has every instance of a microservice in its own subnet with a firewall and strict network access permissions... including one off generated API keys.

Most, at best, use a static API key per microservice inside one large "secure" network... which leads me back to my point - they're not exactly easier to secure. Methods may differ a little bit, but the techniques are the same.

Then your example of a buffer overflow is going to be as bad for microservices, as for monolith.

Unless you're going to invest in a variety of systems, programming languages and OSes in your stack, your one buffer overflow turns into a buffer overflow on every other microservice... making the claim of "easier to isolate" a little bit delusional. (Classic example: the Log4j bug, where monolith or microservices, once you're breached it's game over.)


You shoot yourself in the foot pretty hard regarding point 2 (scalability) if you have your microservices share a DB.


FWIW, most of Google's Ads services share a database - the catch is it's F1 on top of Spanner. Latency is highish, but throughput is good.

In the outside world, for an application that may truly need to scale, I'd go MySQL -> Vitess before I'd choose separate data stores for each service with state. But I'd also question if the need to scale that much really exists; you can go a long way even with data heavy applications with a combination of sharding and multitenancy.
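For the multitenancy half, here's a minimal sketch of what I mean (hypothetical schema): every row carries a tenant discriminator, every index and query is scoped by it, and tenant_id later doubles as a natural shard key.

    CREATE TABLE events (
        tenant_id  BIGINT NOT NULL,
        event_id   BIGINT NOT NULL,
        created_at TIMESTAMP NOT NULL,
        payload    TEXT,
        PRIMARY KEY (tenant_id, event_id)
    );
    CREATE INDEX events_by_tenant_time ON events (tenant_id, created_at);

    -- every query is tenant-scoped, so each tenant's working set stays small
    SELECT event_id, payload
    FROM events
    WHERE tenant_id = 42
      AND created_at >= '2023-01-01';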


Or hardware. I worked on an HPC product which used "microservices" of a kind but many of them were tied to specific hardware nodes. So much of what we think of as "microservices" relies on a fairly narrow set of assumptions, and mostly makes sense in specific contexts, i.e. cloud, APIs, etc.


Microservices, by definition, do not share a DB. If they do then you just have multiple processes in the same larger service.


What it is, is using the wrong solution for a problem. Conversely, the number of devs who justify monolithic apps is absurd; the justification of microservices is just as problematic.

It is about striking a balance. No reason should something be overly compounded or overly broken up.


I very much agree with you on points 1 & 2. I'll add a 3rd - DRYing up repeated functionality, such as repeated auth logic in completely separate applications.


> usually actually solve for a human problem in scaling up an organization.

It's invented by a software outsourcing firm to milk billable hours from contracts.


I am working on a project that uses a microservice architecture to make the individual components scalable and separate the concerns. However one of the unexpected consequences is that we are now doing a lot of network calls between these microservices, and this has actually become the main speed bottleneck for our program, especially since some of these services are not even in the same data center. We are now attempting to solve this with caches and doing batch requests, but all of this created additional overhead that could have all been avoided by not using microservices.

This experience has strongly impacted my view of microservices and for all personal projects I will develop in the future I will stick with a monolith until much later instead of starting with microservices.


Yes. If you design a distributed system you need to consider the network traffic very carefully, and choose your segmentation in such a way that you minimize traffic and still achieve good scalability.

For this reason, I've been trying to push for building a monolithic app first, then splitting into components, and introducing libs for common functionality. Only when this is all done, you think about the communication patterns and discuss how to scale the app.

Most microservice shops I've been in have instead done the naïve thing; just come up with random functionally separate things and put them in different micro services; "voting service", "login service", "user service" etc. This can come with a very very high price. Not only in terms of network traffic, but also in debuggability, having a high amount of code duplication, and getting locked into the existing architecture, cementing the design and functionality.


> Only when this is all done, you think about the communication patterns and discuss how to scale the app.

The main thing is that regardless of scaling, the app should always be able to run/debug/test locally in a monolithic thing.

Once people scale they seem to abandon the need to debug locally at their peril.

Scaling should just be a process of identifying hot function calls and when a flag is set, to execute a call as a network rpc instead.


The execution model of local calls is quite different from that of remote calls. Because of these differences, many efforts have been made, but "transparent remoting" is still not achieved [1].

[1] https://scholar.harvard.edu/waldo/publications/note-distribu...


Thanks for the interesting read.

I agree with sentiment of the paper in that it can never be fully transparent. When I say "it should just be a process of identifying hot function calls..." this is admittedly a little contrived - there would need to be a more thorough analysis.

The main point seems to be that partial failure is too difficult to handle without explicitly declaring some objects as "remote".

> Merging the models by making local computing follow the model of distributed computing would require major changes in implementation languages (or in how those languages are used) and make local computing far more complex than is otherwise necessary

Given that the paper is from 1994, I'm wondering if there have been language changes that make this more achievable.

What comes to mind is the event loop. The fact that in Node.js, almost every function call is now an asynchronous promise due to the non-blocking paradigm coupled with the convenience of async/await syntax. These event loop runtimes provide a universal way to hook into asynchronous calls.

I would imagine that in 1994, envisioning the code added to the languages of the day needed to handle timeout and asynchronous invocations would have felt overwhelming.

Also thinking about ORMs and databases. It seems TOPLink for Smalltalk was the first, released in 1995, after this paper. ORMs seem kind of like "transparent remoting", as they usually can run against an in-memory database or a db on the local system, or over a TCP connection, which then needs to deal with latency and failure. And for any app talking to a DB with an ORM, that means the failure cases of the ORM must pervade the whole stack.


Thanks for the thoughtful reply.

> Given that the paper is from 1994, I'm wondering if there have been language changes that make this more achievable.

As the paper argues, it is not possible to make remote calls look like local ones. But it is trivially possible to make local calls look like remote ones! Use call-by-value parameter passing, and make all calls asynchronous. The most successful solution I have heard of is Erlang's immutable-message-passing model.

It has been many years since I used it, but I think TOPLink Session and UnitOfWork interfaces threw exceptions signaling remote execution. This is like Erlang - you work with remote APIs; local execution is a special case.


Everything you mention (network traffic cost, code duplication (we instead resorted to auto-generating some shared code between services based on an OpenAPI spec), getting locked into the architecture) applies to our use case...

And it is reassuring to hear that you seem to have success in avoiding these issues with a monolithic architecture, as I thought I was oldschool for starting to prefer monoliths again.


I don’t mean to be snarky but how is that an “unexpected consequence”? Were the pros and cons just never considered when deciding to use micro services? Additional networks calls are one of the most obvious things to go on the cons list!


Someone must have read a blog post about microservice and decided it would look good on their resume.


Unfortunately this is also partly correct - The team was young and people were eager to play around and learn how to work with microservices.


This is one of the reasons I dislike working in teams that lack 'old people'. Young devs still have a lot of mistakes to make that old devs have already made or seen. The young ones seem to see them as slow and difficult, but they also create stability and control.

In a start-up, having a team of young people will allow you to move fast, pivot and deliver. What you usually end up with though, is a tower built from spaghetti that's serving actual users. I see this again and again and then people get surprised that the software doesn't scale.


My experience of being the old guy in the team (older also than the manager) was that the young whippersnappers[0] drive new tech adoption regardless of the views of the old fart; the manager waves it through to keep the young-un's happy, and the old fart doesn't find out until it's a fait accompli.

My experience was also that the young manager was very suspicious of the old fart. I've never had any problem working for managers younger than me; but some young managers don't know what to make of a team member that knows more than they do.

[0] No insult intended; almost everything I learned after the first 10 years, I learned from people younger than me.


I'm so conflicted over becoming an older engineer (having recently moved back from a leadership role). On one hand, I have a wealth of knowledge and can see around corners that the youngsters have no chance of doing. OTOH, it is absolutely exhausting explaining in excruciating detail why I am right time-after-time as new mids join who just want to experiment. That and I am absolutely petrified of my alacrity fading with time.


I hear ya. If you feel you've lost your perspicacity, fear not; it's always in the last place you look.

But... yeah. A colleague (25+ years) has to deal with this more than I do, but have hit it some over the years, and it happens more the older I get.

Often he or I may be 'right' but only because of existing constraints in a system. Someone with a different view may even be 'correct' in a nuts-and-bolts technical sense. Yes, A is 8% faster than B. But... we have an existing system in B that meets our current needs, and team who know B, have written it, and maintained it for a couple years. Switching tech stacks to something no one knows for an 8% network speed boost is not remotely helpful.


Agreed. This is a corollary of the Second-system effect [1]. By the time developers design their third system they have become relatively "old".

[1]: https://en.wikipedia.org/wiki/Second-system_effect


This is funny, I'm currently working on a second system that replaced the first. It's awful and I think we need to build the third version.


The most amazing kicker is that a lot of 5 YoE get senior status, while still making ridiculous mistakes.


You are right; looking back I also ask why it was not considered more in depth before. I was not involved in the decision-making process as I joined recently, and to be honest I would maybe not have thought about it either (but will certainly in the future). I think the main reason this became such a big problem is that we underestimated the number of calls that would be made between the services. Service A was initially supposed to call Service B only as a last resort, so the 10 ms or so were fine, but in the end the answers that Service B gave turned out to be very useful in our process, so more and more calls were made and now they are strongly connected, but still separate services, which is ... not ideal.


Could this be solved by consolidating service A and service B into one unit while maintaining the overall architecture? I know it's just an example but the whole point of microservices is to have the flexibility to rearchitect pieces without having to rewrite the entire project, so potentially poor initial choices can be reworked over time.


Just wait until you decide you need some kind of distributed transaction support across multiple microservices… You should aim to spend a considerable amount of time in the design and discovery phases.


> one of the unexpected consequences is that we are now doing a lot of network calls between these microservices

Not trying to be harsh here, but not expecting an increase in network calls in a system where each component is tied together with... network calls sounds a bit naive.

> We are now attempting to solve this with caches and doing batch requests

So you have built a complex and bottle-necked application for the sake of scalability, then having to add caching and batching just to make it perform? That sounds like working backwards.

Obviously, I have no clue on the scale of the project you are working on, but it sure sounds like you could have built it as a monolith in half of the time with orders of magnitude more performance.

Scalability is a feature, you can always add it in the future.


Scalability is not a feature. If you "need" to scale but don't then you're not delivering to your target market, not delivering is not a lack of a feature it's a net loss to the organization. If you're Twitter/Instagram/Uber/whatever you cannot tell your users to not post or like or request a ride because "right now we don't have the scalability feature"


I don't completely agree here.

Yes, if you can't scale fast enough as you need to, it can hurt your business. Not being able to keep up with demand is a (luxury) problem that every business faces, not just in tech. They would often be called 'growing pains' in a business, and though they are bad, they rarely contribute to the failure of a company.

Starting a startup/service/platform with microservices before you even understand the bottlenecks/market fit/customers is usually not a good idea. You can come a very, very long way with a monolith before you hit performance and scalability limits. And once you do, you can always start breaking things up into smaller services for scalability. Obviously you need to make sure you are scaling in time to keep up with demand.

'Nail it, then scale it', and 'premature optimization is the mother of all f-ups' are popular sayings for a reason.


I'm not making a case in favor of starting with microservices for your startup, that'd be insane, to say the least. I'm just disagreeing with "scalability is a feature". A feature is every addition on top of your minimum viable product. If at some point it becomes apparent that the business needs scalability, then scalability becomes your minimum viable product.


> However one of the unexpected consequences is that we are now doing a lot of network calls between these microservices

I wouldn't call that entirely unexpected. :-) It's a rather well-known issue:

  Microservices

  grug wonder why big brain take hardest problem, factoring system correctly, and introduce network call too

  seem very confusing to grug
(from https://grugbrain.dev)


> However one of the unexpected consequences

How the hell... like... who decided to do Microservices in the first place if they didn't know this? This is such a rookie mistake. It's like somebody right out of high school just read on a blog the new way is "microservices" and then went ahead with it.


There is a high amount of correlation between people that start with microservices (instead of using them to solve specific problems after they have the problem) and people that lack any awareness about the resources they need.

And both are way more common than they should be.


>How the hell... like... who decided to do Microservices in the first place if they didn't know this?

Microservices can have real design utility and simultaneously be a major fad.

Lurking reddit and HN you can watch development fashions come and go. It's really really profoundly hard to hold a sober conversation on splitting merits from limitations in the middle of the boom.

tl;dr: gartner_hype_cycle.png


well, this is a surprisingly common case in my experience, except that the people were not "right out of high school" but "right out of a company funded seminar on microservices".


Here is an analogy that can inform the implications of this (so-called, imo) architecture:

Imagine if you are responsible, at runtime, for linking object files (.o) where each object is just the compilation of a function.

Now why would anyone think this is a good idea (as a general solution)? Because in software organizations, the “linker’s” job is supposed to be done by the (“unnecessary weight”) software architect.

Microservices primarily serve as a patch for teams incapable of designing modular schemas. Because designing schemas is not entry level work and as we “know” s/e are “not as effective” after they pass 30 years of age. :)

> monolith

Unless monolith now means not-microservice, be aware that there is a range of possible architectures between a monolith and microservices.


No personal project has any business using microservices, unless it's specifically as a learning toy. Use a monolith. Monoliths can scale more than you can (provided you manage not to limit yourself to the single-thread single-process version of a monolith). Microservices are an organisational tool for when you need to release chunks independently.

I first wrote a program which ran on a web server about 25 years ago. In that time, computers have experienced about ten doublings of Moore's law, i.e. are now over a thousand times faster. Computers are very fast if you let them be.


Twitter has a similar issue according to Musk https://twitter.com/elonmusk/status/1592176202873085952


The tweet is obviously wrong.

> I was told ~1200 RPCs independently by several engineers at Twitter, which matches # of microservices. The ex-employee is wrong.

> Same app in US takes ~2 secs to refresh (too long), but ~20 secs in India, due to bad batching/verbose comms.

RPCs are on the server side. Why would the app take longer to refresh in India than in the US?

Some more explanation: https://twitter.com/mjg59/status/1592380440346001408


Yep, this is trivial to falsify, Musk jumped to an incorrect conclusion based on some anecdotes that he did not follow: https://twitter.com/Popeska/status/1592179502838435847


Musk may just be wrong but isn’t it also possible that the RPCs need to communicate with an edge node in India. Perhaps government regulation requires them to store/serve some user data from India if serving Indian customers? Or something like that?


They might be using cheaper caching in India and dedicate the more expensive ones to areas of the world they get profit from.


An alternative view of this is - Elon isn't a software engineer, so don't listen to him at all.


Musk last wrote code in 1995, most likely.


In which case, the next step for your org is a mandate that all microservices be available in all datacenters.

Let each microservice owner figure out how to achieve their latency+reliability SLA's in every location - whether replicating datastores, caching, being stateless, or proxying requests to a master location.


"Unexpected consequences" for a core piece of the technical approach suggests there are architects that are in the wrong role.


Some of the services not being in the same datacentre seems orthogonal. If that solves a problem you have, wouldn't it still be an issue in a non-microservices design?


If the services need to be in separate data centers then how would a monolith be a solution? Monoliths can't span data centers


I'm assuming that those services don't actually NEED to be in a separate data centre ...


>We are now attempting to solve this with caches and doing batch requests, but all of this created additional overhead that could have all been avoided by not using microservices.

> especially since some of these services are not even in the same data center.

I think you need to answer why? If you can't put all of the services in one data center, then by definition you can't write a monolith either. If the monolith would happily run in one datacenter, then you should have all instances of your microservices in that one datacenter.

It surprised me that you would conclude that this is a problem with microservices. It's like if a particular architect always punches you in the groin in every meeting, and you've concluded that architects are bad people, rather than that this one architect is a bad person.


Would be interesting to learn how your (or any other) team defines borders of a microservice. Iow, how “micro” they are and in which aspect. I guess without these details it will be hard to reason about it.

At my last job we created a whole fleet of microservices instead of a single modular project/repo. Some of them required non-trivial dependencies. Some executed pretty long-running tasks or jobs for which network latency is insignificant and will remain so by design. Some were a few pages long; some consisted of similar-purpose modules with shared parts factored out. But there were few or no flows like "ah, I'll just ask M8 and it will ask M11 and it will check auth and refer to a database" - i.e. no calls as trivial as foo(bar(baz())) but done all over the infra.


Did you have a common, centralized data store or did each and every microservice manage their own instance?

(Because this is at the same time one of the defining elements of this architecture... and the first one to be opted out when you actually start using it "for real").


Yes and no. The "central" part of data flowed naturally through services (i.e. passed in requests and webhooks, not in responses). Microservices maintained local state as well, though it was mostly small and disposable due to the whole infra being designed crash-only. For example, we didn't hesitate to shut something down or hot-fix it, except for bus-factor periods when there were only a few of them. They could also go down by themselves or by upstream, and routers avoided them automatically.


I agree that monoliths are the way to start many times, especially if you're not actually sure you'll ever need to scale. One reason we do sometimes plan on microservices from the start with projects is separation of security concerns. Easier to lock a public-facing microservice down tight and have what's ostensibly a non public-facing monolith call out to it.

Lots of ways to solve a problem, though.


Yeah, I think this is one of the very few places where splitting out something as a microservice makes sense. For example, you (mostly) never want to open/process/examine/glance at user-provided PDFs on a box with any sort of unfiltered network access. Ideally you do what you need to do within a sandbox that has _no_ network access, but that's really hard to do performantly.

The primary reason for this is that PDFs can contain executable code and the common tools used to process them are full of unpatched CVEs.


Unexpected?!

Either way, one of my biggest pet peeves is the near-ubiquitous use of HTTP & JSON in microservice architectures. There's always going to be overhead in networked services, but this is a place where binary protocols (especially ones like Cap'n Proto) really shine.


> especially since some of these services are not even in the same data center.

Why is the other service in another data center? Does it need to be in another data center? If it does, how will a monolith help?


May I ask why they are not even in the same data center?


I really like modular designs, but this article is missing some key limitations of monolithic applications, even when they are really well modularized (this is written mostly from the perspective of a Java developer):

* they force alignment on one language or at least runtime

* they force alignment of dependencies and their versions (yes, you can have different versions e.g. via Java classloaders, but that's getting tricky quickly, you can't share them across module boundaries, etc.)

* they can require lots of RAM if you have many modules with many classes (semi-related fun fact: I remember a situation where we hit the maximum number of class files a JAR loaded into WebLogic could have)

* they can be slow to start (again, classloading takes time)

* they may be limiting in terms of technology choice (you probably don't want to have connections to an RDBMS and Neo4j and MongoDB in one process)

* they don't provide resource isolation between components: a busy loop in one module eating up lots of CPU? Bad luck for other modules.

* they take a long time to rebuild and redeploy, unless you apply a large degree of discipline and engineering excellence to only rebuild changed modules while making sure no API contracts are broken

* they can be hard to test (how does DB set-up of that other team's component work again?)

I am not saying that most of these issues cannot be overcome; to the contrary, I would love to see monoliths being built in a way where these problems don't exist. I've worked on massive monoliths which were extremely well modularized. Those practical issues above were what was killing productivity and developer joy in these contexts.

Let's not pretend large monoliths don't pose specific challenges and folks moved to microservices for the last 15 years without good reason.


Mostly valid, but...

On the RAM front, I am now approaching terabyte levels of services for what would be gigabyte levels of monolith. The reason is that I have to deal with mostly duplicate RAM - the same 200+ MB of framework crud replicated in every process. In fact a lot of microservice advocates insist "RAM is cheap!" until reality hits, especially forgetting the cost is replicated in every development/testing environment.

As for slow startup, a server reboot can be quite excruciating when all these processes are competing to grind & slog through their own copy of that 200+ MB and get situated. In my case, each new & improved microservice alone boots slower than the original legacy monolith, which is just plain dumb, but it's the tech stack I'm stuck with.


>As for slow startup, a server reboot can be quite excruciating when all these processes are competing to grind & slog through their own copy of that 200+ MB and get situated.

You are writing microservices and then running them on the same server??


There are multiple hosts, but yeah I doubt our admins would go for 1 service per host, plus they'd just be on VM's sharing the same hardware anyhow.


> they force alignment on one language or at least runtime

How is this possibly a down-side from an org perspective? You don't want to fracture knowledge and make hiring/training more difficult even if there are some technical optimizations possible otherwise.


The capabilities of the language and the libraries available for it can sometimes be a good reason for dealing with multiple languages.

E.g. if you end up having a requirement to add some machine learning to your application, you might be better off using Tensorflow/PyTorch via Python than trying to deal with it in whatever language the core of the app is written in.


Different domains will have different requirements. Doing a monolith, doesn't preclude you from building domain specific services external to the monolith (Model serving service for example).

Take an RDBMS as an example, that is a service external to the monolith often written in a very different language to the monolith.


>You don't want to fracture knowledge and make hiring/training more difficult

these are not maxims of development, there can be reasons that make these consequences worth it. Furthermore you can still use just a single language with microservices*, nothing is stopping you from doing that if those consequences are far too steep to risk.

*:you can also use several languages with modules by using FFI and ABIs, probably.


In an organization that will be bankrupt in three years it doesn’t matter. But if you can pour that much energy into a doomed project you’re a steely eyed missile man. Or dumb as a post. Or maybe both at once.

This is the Pendulum Swing all over again. If one language and runtime is limiting, forty is not liberating. If forty languages are anarchy, switching to one is not the answer. This is in my opinion a Rule of Three scenario. At any moment there should be one language or framework that is encouraged for all new work. Existing systems should be migrating onto it. And because someone will always drag their feet, and you can’t limit progress to the slowest team, there is also a point in the migration where one or two teams are experimenting with ideas for the next migration. But once that starts to crystallize any teams that are still legacy are in mortal danger of losing their mandate to another team.


>they force alignment on one language or at least runtime

A sane thing to do.

>they force alignment of dependencies and their versions

A sane thing to do. Better yet to do it in a global fashion, along with integration tests.

>they can require lots of RAM if you have many modules with many classes

You can't make the same set of features, built in a distributed manner, consume _less_ RAM than the monolith counterpart, given you're now running dozens of copies of the same Java VM + common dependencies.

>they can be slow to start

Correct.

>they may be limiting in terms of technology choice

Correct.

>they don't provide resource isolation between components

Correct.

>they take long to rebuild an redeploy, unless you apply a large degree of discipline and engineering excellence to only rebuild changed modules while making sure no API contracts are broken

I think the keyword is the WebLogic Server mentioned before. People don't realise that a monolith architecture doesn't mean legacy technology. Monolith web services can and should be built with, for example, Spring Boot. Also, most of the time, comparisons are unfair. In all projects I've worked on I'm yet to see a microservice installation paired feature-wise with its old monolith cousin. Legacy projects tend to be massive, as they're made to solve real world problems while evolving over time. MS projects are run for a year or two and people start comparing apples to oranges.

>they can be hard to test

If another team's component breaks integration, the whole build stops. I think fail-fast is a good thing. Any necessary setup must be documented, in whatever architectural style. It can be worse in a MS scenario, where you are tasked to fix a dusty, forgotten service with an empty README.

If anything, monolithic architecture brings lots of awareness. It's easier to get how things are wired and how they interact together.


> a sane thing to do

Imagine your application consists of two pieces - a somewhat simple CRUD part that needs to respond _fast_, and a huge batch-processing infrastructure that needs to work as efficiently as possible but doesn't care about single-element processing time. And suddenly 'the sane thing to do' is not the best thing anymore. You need different technologies, different runtime settings and sometimes different runtimes. But most importantly, they don't need constraints imposed by an unrelated (other) part of the system.


You're absolutely right. My comment is directed at the view that using a single language for a given project backend is a bad thing per se. Online vs batch processing is the golden example of domains that should be separated into different binaries - call them microservices or services or just Different Projects with Nothing in Common. Going further than that is where the problems arise.


>>they force alignment of dependencies and their versions

>A sane thing to do. Better yet to do it in a global fashion, along with integration tests.

But brutally difficult at scale. If you have hundreds of dependencies, a normal case, what do you do when one part of the monolith needs to update a dependency, but that requires you update it for all consumers of the dependency's API, and another consumer is not compatible with the new version?

On a large project, dependency updates happen daily. Trying to do every dependency update is a non-starter. No one has that bandwidth. The larger your module is, the more dependencies you have to update, and the more different ways they are used, so you are more likely to get update conflicts.

This doesn't say you need microservices, but the larger your module is, the further into dependency hell you will likely end up.


> A sane thing to do.

This is incredibly subjective, and contingent on the size and type of engineering org you work in. For a small or firmly mid-sized shop? yea I can 100% see that being a sane thing to do. Honestly a small shop probably shouldn't be doing microservices as a standard pattern outside of specific cases anyway though

As soon as you have highly specialized teams/orgs to solve specific problems, this is no longer sane.


> Honestly a small shop probably shouldn't be doing microservices as a standard pattern outside of specific cases anyway though

And yet, that is exactly what gets done


> unless you apply a large degree of discipline and engineering excellence to only rebuild changed modules while making sure no API contracts are broken

Isn't that exactly what's required when you're deploying microservices independently of each other? (With the difference of the interface not being an ABI but network calls/RPC/REST.)


I used to be monolith-curious, but what sold me on micro-services is the distribution of risk. When you work for a company where uptime matters having a regression that takes down everything is not acceptable. Simply using separate services greatly reduces the chances of a full outage and justifies all other overhead.


Why would using microservices reduce the chance of outages? If you break a microservice that is vital for the system, you are as screwed as with a monolith.


Sure, but not all micro-services are vital. If your "email report" service has a memory leak (or many other noisy-neighbor issues) and is in a crash loop, then that won't take down the "search service" or the "auth service", etc. Many other user paths will remain active and usable. It compartmentalizes risk.


Proper design in a monolith would also protect you from failures of non-vital services (e.g. through exception capture).

So it seems like we’re trying to compensate bad design with microservices. It’s orthogonal IMO.


How does exception capture protect from all failures? The most obvious one I don't see it relating to is resource utilization, CPU, memory, threadpools, db connection pools, etc etc.

> we’re trying to compensate bad design

No I think we're trying to compensate for developer mistakes and naivety. When you have dozens to hundreds of devs working on an application many of them are juniors and all of them are human and impactful mistakes happen. Just catching the right exceptions and handling them the right way does not protect against devs not catching the right exceptions and not handling them the right way, but microservices does.

Maybe you call that compensating for bad design, which is fair and in that case yes it is! And that compensation helps a large team move faster without perfecting design on every change.


With microservices you have to make a tradeoff - a monolith is inherently more testable at the integration level than a microservice-based architecture.

There's a significant overhead to building and running tests at the API level, which includes API versioning... and there's less need to version APIs inside a monolith.


You have fifty (or 10,000) servers running your critical microservice in multiple AZs. You start a deployment to a single host. The shit hits the fan. You rollback that one host. If it looks fine, you leave it running for a few hours while various canaries and integration tests all hit it. If no red flags occur, you deploy another two, etc. You deploy to different AZs on different days. You can fail over to your critical service in different AZs because you previously ensured that the AZs are scaled so that they can handle that influx of traffic (didn't you?). You've tested that.

And that is if it makes it to production. Here is your fleet of test hosts using production data and being verified against the output of production servers.


If you have a truly modularized monolith, you can have a directed graph of dependent libraries, the leaves of which are different services that can start up. You can individually deploy leaf services and only their dependent code will go out. You can then rationalize which services can go down based on their dependency tree. If email is close to a root library, then yes a regression in it could bring everything down. If email is a leaf service, its code won’t even be deployed to most of the parallel services.

You can then have a pretty flexible trade-off between the convenience of having email be a rooted library and the trade-off of keeping it a leaf service (the implication being that leaf services can talk to one another over the network via service stubs, REST, what have you).

This is SOA (Service Oriented Architecture), which should be considered in the midst of the microservice / monolith conversation.


> Simply using separate services greatly reduces the chances of a full outage and justifies all other overhead.

Or maybe run redundant monolith failover servers. That should work the same as microservices.


Then when you need to hotfix, you need to rebuild a giant monolith that probably has thousands of tests and easily a 20-30 minute regression suite.


I have seen exactly this. Waiting for a 60+ min CI build during an outage is not a good look.


However, there's another "bonus" here: you have integration tests with better coverage.

Microservices don't make builds radically faster for the majority. People still split systems into larger services.


How do microservices help here? You can deploy a monolith 10 times and have the same risk distribution.


It's not about replication, it's about compartmentalizing code changes / bugs / resource utilization. If you deploy a bug that causes a service crash loop or resource exhaustion isolating that to a small service reduces impact to other services. And if that service isn't core to app then the app can still function.


> they force alignment on one language or at least runtime

You can have modules implemented in different languages and runtimes. For example you can have calls between Python, JVM, Rust, C/C++, Cuda etc. It might not be a good idea in most cases but you can do it.

Lots of desktop apps do this.


And in runtimes like BEAM, JVM and .NET, multiple languages are even supported out of the box, plus FFI.


Please check your assumptions. Why do you think 2 modules cannot be in different runtimes? How do you think JNI works?

You can absolutely call js running in V8 vm from scala running in jvm. No networking needed, hell not even IPC is needed.

And when you deploy this you don't have to deploy all modules' HTTP servers (for external requests into the system) and queue consumers in the same container, only a single module's. So no busy loops affect other modules, unless as a result of a direct API call from module to module. If anything it encourages looser coupling, as you are incentivized to use indirect communication through the queue over direct API calls.


> hell not even IPC is needed.

Uh... what's the trick? I don't see how you can have V8 and the JVM communicate without something that's inter-process.


Why? They're both just libraries. Load them both into the process and see what happens. At worst you'll have a bit of fighting over process-global state like signal handlers, but at least the JVM is designed to allow those to compose.


I suspect that most people would be better off favoring inlined code over modules and microservices.

It's okay to not organize your code. It's okay to have files with 10,000 lines. It's okay not to put "business logic" in a special place. It's okay to make merge conflicts.

The overhead devs spend worrying about code organization may vastly exceed the amount of time floundering with messy programs.

Microservices aren't free, and neither are modules.

[1] Jonathan Blow rant: https://www.youtube.com/watch?v=5Nc68IdNKdg&t=364s

[2] Jon Carmack rant: http://number-none.com/blow/john_carmack_on_inlined_code.htm...


I’ve said this before about applying Carmack’s architectural input on this topic:

Games are highly stateful, with a game loop that iterates over the same global in-memory data structure as fast as it can. You have (especially in Carmack era games) a single thread performing all the game state updates in sequence. So shared global state makes a ton of sense and simplifies things.

Most web applications are highly stateless with request-oriented operations that access random pieces of permanently-stored data. You have multiple (usually distributed) threads updating data simultaneously, so shared global state complicates things.

That game devs gravitate towards different patterns for their code than web service devs should not be a surprise.


I felt suspicious as soon as I saw Jon Carmack’s website being mentioned in a conversation about Microservices.


I hope this quote from the Carmack essay shows that his argument isn't necessarily restricted to game development:

    ---------- style C:
     
    void MajorFunction( void ) {
            // MinorFunction1
     
            // MinorFunction2
     
            // MinorFunction3
     
    }
> I have historically used "style A" to allow for not prototyping in all cases, although some people prefer "style B". The difference between the two isn't of any consequence. Michael Abrash used to write code in "style C", and I remember actually taking his code and converting it to "style A" in the interest of perceived readability improvements.

> At this point, I think there are some definite advantages to "style C", but they are development process oriented, rather than discrete, quantifiable things, and they run counter to a fair amount of accepted conventional wisdom, so I am going to try and make a clear case for it. There isn't any dogma here, but considering exactly where it is and isn't appropriate is worthwhile.

> In no way, shape, or form am I making a case that avoiding function calls alone directly helps performance.


I love Carmack but I always thought his conclusion there was unsatisfying. I think the issue with really long functions is that they increase scope - in both the code and mental sense.

Now your MinorFunction3 code can access all the local variables used by MinorFunction1 and MinorFunction2, and there's no easy list of things that it might access which makes it harder to read.

Separate functions do have a nice list of things they might access - their arguments!

Of course this technically only applies to pure functions so maybe that's why it doesn't matter to Carmack - he's used to using global variables willy nilly.

Also sometimes the list of things a function might need to access gets unwieldy, which is when you can reach for classes. So no hard and fast rule but I think increased scope is the issue with it.


I used to be a fan of style C, but these days, I prefer either A or B, with the condition that no MinorFunction should be less than 5 lines of code. If a function is that small, and it's called from only one place, then it doesn't need to be a function.

Using A or B results in self-documenting code and I think DOES (or at least, CAN) improve readability. It also can help reduce excessive nesting of code.


Counter point - https://github.com/microsoft/TypeScript/blob/main/src/compil... - your task is to just make it a little bit faster. Where do you begin with a 2.65mb source file?

It’s easy to miss the point of what the OP is saying here and get distracted by the fact this file is ridiculously huge. This file used to be a paltry 1k file, a 10k file, a 20k SLOC file… but it is where it is today because of the OP suggested approach.


Counter point: you have 2650 files, with a couple of 10 line functions in each. Your task is to just make it a little bit faster. Where do you start?

Answer: the same place - with a profiler. Logical code organization matters a lot more than “physical” break to specific files.

I have inherited a Java class-spaghetti project in the past, with hundreds (perhaps thousands) of short classes, each of which doesn't do much but sits in its own file - and I would much prefer to work on an SQLite-style codebase, even if I have to start with the amalgamation file.


> your task is to just make it a little bit faster. Where do you begin

With a trace from a profiler tool, which will tell you which line number is the hot spot. If run from any modern IDE, you can jump to the line with a mouse-click.

In essence, the file boundaries ought not to make any difference to this process.


>> In essence, the file boundaries ought not to make any difference to this process.

I find this the strongest counter argument. File boundaries shouldn’t matter - but in practice they do.

The ide will give up doing some of its analysis because this is not an optimised use case, the IDE vendors don’t optimise for it. Some IDEs will even give up just simple syntax highlighting when faced with a large file, never mind the fancy line by line annotations.


Do you have a feel for how large a file has to be for an IDE to choke on it? My vim setup can open a 13GB file in about ten seconds, with all syntax highlighting, line numbers, etc working. Once the file is open, editing is seamless.

Bitbucket is the biggest offender for me. It starts to disable syntax highlighting and context once the file hits about 500KB, which should be trivial to deal with.


Where do you begin if the code was in hundreds of separate modules? It's not clear if it's easier. It would take time to load the code into your brain regardless of the structure.

By the way, JavaScript parser in esbuild is a 16 thousand lines of code module too:

https://github.com/evanw/esbuild/blob/0db0b46399de81fb29f6fc...


Yeah, I think this a very much worst case scenario though.

<rant> You will (for most statements) both be able to find a best and worst case. That's the catch with most generalized statements/ principles, e.g., DRY. The challenge is to find a good "enough" solution because perfection is usually either unfeasibly expensive or impossible (different viewpoints, ...) </rant>

Though it's kinda hilarious that the source code of a MS project is not natively viewable on a MS platform.


If GitHub (or whatever you use) says 'sorry we can't display files that are this big', that should be your hard limit...


I do not agree. I believe that professionally "most people" don't work on the same code every day and don't work alone. Modules are a means of abstraction and a classic application of "divide et impera", and you'll need them pretty soon to avoid keeping the whole thing in your head. But different cultures of programming have a different meaning of what a module is, so, maybe, I'm misunderstanding your point


I strongly disagree on this one. 10+K line files are absolutely unreadable most of the time. Separating business logic from other parts of the application helps with maintaining it and lets everything evolve in parallel, without mixing things up. It also helps to clearly see where business logic happens.


I'm in between. 10K line files are usually extremely messy, but they can be written not to be - a large number of well-organized, well-encapsulated <100 LOC classes can be very readable even if smashed together in one file. It just so happens that people who tend to write readable self-contained classes just don't put them in 10 KLOC files, but rather split them. And vice versa, which creates the association "10 KLOC files are unreadable", where it's not the length of the file, but rather the organization itself.

Same for business logic - very clear separation can be cumbersome sometimes, but otherwise it becomes messy if you're not careful. And careful people just tend to separate it.


I disagree with this stance. Creating a file and naming it gives it a purpose. It creates a unit of change that tools like git can report on.


A line is a unit of change that git can report on.

If it's a separate file that is scoped to some specific concern, sure. But it's that grouping by concern that is key. Not separation into another file. Extracting random bits of code into separate files would be _worse_.


> A line is a unit of change that git can report on.

Yes and no.

Git doesn't store lines, it stores files. Git diff knows how to spit out line changes by comparing files.

So to run git blame on a 10k line file you're reading multiple versions of that 10k file and comparing. It's slow. Worse still is that trying to split said file up while trying to preserve history won't make the git blame any faster.


Yes and yes. While I agree with the general points, note that they didn't say "unit that git stores", but "unit git can report on". Git can totally report on lines as a unit of change.


git diff understands function boundaries, and for many languages will “report” equally well on a single file.

It's a good idea to break things down into files along logical boundaries. But git reporting isn't a reason.

edit: "got diff" -> "git diff". DYAC and responding from mobile!


Git diff absolutely does not understand function boundaries; its diff algorithms routinely confuse things like adding a single new function, thinking that the diff should begin with a "}" instead of a function definition.


It varies a bit with language and tooling, but 10k lines is around the place where the size of your file by itself becomes a major impediment on finding anything and understanding what is important.

A 10k line file is not something that will completely destroy your productivity, but it will have an impact and you'd better look out for it growing further, because completely destroying your productivity is not too far away. It is almost always good to organize your code when it reaches a size like this, and the exceptions are contexts where you can't, never contexts where it isn't worth it.


I generally agree. My argument is that 10K lines written one way can certainly be more readable than 10 files x 1K lines written in a different way, so the real differentiator is the encapsulation and code style, not KLOC/file per se.


What muddies the waters here is languages like Java, where "10k lines" means "you've got a 10kLOC class there", and ecosystems like PHP's where while there's nothing in the language to require it, people and teams will insist on one class per file because hard rules are easier to understand and enforce than "let's let this area evolve as we increase our understanding".

As long as what's there is comprehensible, being able to evolve it over time is a very useful lever.


Honest question: do you think the same exact 10+K lines of code are easier to read spread across 1,000 files? And why do you think the overhead of maintaining the extra code for module boundaries is worth it?

EDIT: And what editor do you use? I'm wondering if a lot of these differences come down to IDEs haha


The right answer is 20 files with 500 lines in each - i.e. a few pages of clean/readable/logical/well-factored code. Obviously it depends on the code itself - it's fine to have longer files if the contents are highly correlated. Stateful classes should be kept short however, as the cognitive load is very high.

I also find that updating code to take advantage of new/better language features, coding styles, etc. is impossible to do on a large code base all at once. However, sprinkling these kinds of things randomly leads to too much inconsistency. A reasonable sweet spot is to make each file self-consistent in this regard.

My experience stems from larger 500+ person-year projects with millions of lines of code.


The right answer is that there is no right answer. You shouldn't divide your code based on arbitrary metric like size, you should divide it based on concepts/domains.

If a particular domain gets big enough it probably means it contains sub-domains and can benefit from being divided too. But you cannot make that decision based on size alone.


Sure, but no problem domain is going to lead you to 10,000 single-line files. Similarly, it will likely lead to very few 10K-line files. There will likely be a way to factor things into reasonably sized chunks. File sizes are not going to be that highly coupled to the problem domain, as there are multiple ways to solve the same problem.


Sure, but it can lead to 1000 lines which some people still think is too much.

The point is that a numeric upper bound on LoC is inherently subjective and pointless. Instead of measuring the right thing (concepts) you're measuring what's easy to measure (lines).

In fact, it usually makes things worse. I've seen it over and over: you have N tightly coupled classes in a single file which exceeds your LoC preference.

Instead of breaking the coupling you just move those classes into separate files. Boom, problem solved. Previously you had a mess, now you have a neat mess. Great success!


10+K lines spread across 1k files is just as bad as 10+K line files IMO.

I tend to ensure each file serves exactly one purpose (e.g. in C# one file = one class, with only a few exceptions).

I use VS Code, but in every IDE with a file opening palette it's actually really fast: you want to look for the code to, let's say, generate an invoice, just search for "invoice" in the list of files and you'll find it immediately.

(Also modules have their own problem, I was mainly talking in a general way since that's what the parent comment was talking about.)


> Honest question: do you think the same exact 10+K lines of code are easier to read spread across 1,000 files?

This is a fair point but assumes 1 particular use case. It is easier if you are just concerned with a bit of it. If you need to deal with all of it, yeah, good fucking luck. 10k LOC file or 1k 100 LOC files.


> EDIT: And what editor do you use? I'm wondering if a lot of these differences come down to IDEs haha

Yes, Java developers can't do anything without their IDE. It helps them mask the 30000 nested directories they've created to "organize" the code.


And of course it's lot easier to read 200k+ LoC shattered around twenty repos.


The problem arises when you need to read the code of other modules or services. If you can rely on them working as they should, and interact with them using their well-defined and correctly-behaving interfaces, you won't need to read the code.

I'm a proponent of keeping things in a monolith as long as possible. Break code into files. Organize files into modules. A time may come when you need to separate out services. Often, people break things out too early, and don't spend enough effort on thinking how to break things up or on the interfaces.


> and interact with them using their well-defined and correctly-behaving interfaces, you won't need to read the code.

Don't you want determinism and deterministic simulations? If you do, you'll also need stub implementations (mocks, dummies) for your interfaces.

Some notes on that: https://blog.7mind.io/constructive-test-taxonomy.html

> A time may come when you need to separate out services.

Most likely it won't if you organise properly. For example, if each of your components is an OSGi module.


> The problem arises when you need to read the code of other modules or services. If you can rely on them working as they should, and interact with them using their well-defined and correctly-behaving interfaces, you won't need to read the code.

You can say the exact same thing about C-headers though.


No, to me that's equally as bad. But 100k lines split across 500 well-named files is a lot easier to work with than 10+K line files or multi-repo code.


With an IDE you can just look at the class hierarchy/data types rather than the files. As long as those are well organized, who cares how they span files?

For instance, in C# a class can span multiple files using "partial" or you can have multiple classes in a single file. It's generally not an issue as long as the code itself is organized. The only downside is the reliance on an IDE, which is pretty standard these days anyway.


Respectfully, rants by niche celebrities are not something we should base our opinions on.

If you're a single dev making a game, by all means, do what you want.

If you work with me in a team, I expect a certain level of quality in the code you write that will get shipped as a part of the project I'm responsible for.

It should be structured, tested, malleable, navigable and understandable.


I feel like this is a knee jerk reaction to the hyperbole of the parent comment rather than the contents of the actual linked talks. I'm watching Jonathan Blow's talk linked above and your comment does not seem relevant to that. Jonathan's points so far seem very reasonable. Rather than arguing for 10000 lines of code it's arguing that there is such a thing as premature code split. Moving code into a separate method has potential drawbacks as well.

One suggested alternative is to split reusable code into a local lambda first and lift it into a separate code piece only once we need that code elsewhere. It seems to me that such an approach would limit the complexity of the code graph that you need to keep in your head. (Then again, when I think about it, maybe the idea isn't really that novel.)
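To make that concrete, here's a rough TypeScript sketch of the progression being described (the names and types are made up for illustration, not taken from the talk):

  interface Row { label: string; amount: number; }

  function renderReport(rows: Row[]): string {
    // Step 1: reusable-looking logic starts life as a local lambda, scoped
    // to this function only, so nothing else can grow a dependency on it yet.
    const formatCell = (value: number): string => value.toFixed(2);

    return rows.map(r => `${r.label}: ${formatCell(r.amount)}`).join("\n");
  }

  // Step 2 (later, and only once a second caller actually needs it): lift the
  // lambda out into a shared, exported function and import it from both sites.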


So you think it’s easier to keep in your head lambdas in a 10k line file vs methods split by functionality across a number of smaller files?


> It should be structured, tested, malleable, navigable and understandable.

Great comment!

I personally find that most codebases are overstructured and undertested :)

In my experience, module boundaries tend to make code less malleable, less navigable, and less understandable.


>It should be structured, tested, malleable, navigable and understandable.

People have different thresholds for when their code reaches these states though, especially "understandable".

You can meaningfully define these (and test for them) on a small scale, in a team, but talking about all developers everywhere, these are all very loosely defined.


It's a valid hypothesis, but without empirical data the question is not easy to settle (and neither Carmack nor Blow are notable authorities on systems that have to be maintained by changing teams of hundreds of people that come and go, maintaining a large codebase over a couple of decades; if anything, most of their experience is on a very different kind of shorter-lived codebases, as game engines are often largely rewritten every few years).

Also, the question of inlined code mostly applies to programming in the small, while modules are about programming in the large, so I don't think there's much relationship between the two.


> neither Carmack nor Blow are notable authorities on systems that have to be maintained by changing teams of hundreds of people that come and go

Most systems don't have to be maintained by hundreds of people. And yet they are: maybe because people don't listen to folks like Carmack?

We like stories about huge teams managing huge codebases. But what we should really be interested in is practices that small teams employ to make big impact.


I didn't mean a team of hundreds maintaining the product concurrently, but over its long lifetime. A rather average codebase lifetime for server software is 15-20 years. The kind of codebase that Carmack has experience with has a lifetime of about five years, after which it is often abandoned or drastically overhauled, and it's not like games have an exceptional quality or that their developers report an exceptionally good experience that other domains would do well to replicate what games do. So if I were a game developer I would definitely be interested in Carmack's experience -- he's a leading expert on computer graphics (and some low-level optimisation) and has significant domain expertise in games, but he hasn't demonstrated some unique know-how in maintaining a very large and constantly evolving codebase over many years. Others have more experience than him in domains that are more relevant to the ones discussed here.


I can't say I agree with all your "okays", although If you prefix them with "In some cases it's okay", then I understand where you're coming from.

The problem is when it's OK and for how long. If you have a team of people working with a codebase with all those "okays", then they have to be really good developers and know the code inside out. They have to agree when to refactor the business logic out instead of adding a hacky "if" condition nested in another hacky "if" condition that depends on another argument and/or state.

I guess what I'm trying to say is that if those "okays" are in place, then there's a whole bunch of unwritten rules that come into play.

But I agree that microservices certainly aren't free (I'd say they are crazy expensive) and modules aren't free either. But all those "okays" can end up costing you your codebase also.


> It's okay not to put "business logic" in a special place

It's not. This is the thing where you start thinking "YAGNI", yadayada, but you inevitably end up needing it. Layering with at least a service and a database/repositories is a no brainer for any non-toy app considering the benefits it brings.

> It's okay to have files with 10,000 lines

10,000 lines is a LOT. I consider files to become hard to understand at 1,000 lines. I just wc'd the code base I work on; we have like 5 files with more than 1,000 lines and I know all of them (I cringed reading the names), because they're the ones we have the most problems with.


As someone who has personally dealt with files as large as 60K lines, I disagree completely. I believe instead that structure and organization should be added as a business and codebase scales. The mistake I think most orgs make is that, as they grow more successful, they don't take the time to reorganize the system to support the growth, so, as the business scales 100x in employee count, employee efficiency is hampered by a code organization that was optimized for being small and nimble.

It gets worse when the people who made the mess quit or move on, leaving the new hires to deal with it. I've seen this pattern enough times to wonder if it gets repeated with most companies or projects.

I do agree that microservices and/or modules aren't magical solutions that should be universally applied. But they can be useful tools, depending on the situation, to organize or re-organize a system for particular purposes.

Anecdotally, I've noticed that smart people who aren't good programmers tend to be able to write code quickly that can scale to a particular point, like 10k-100k lines of code. Past that point, productivity falls rapidly. I do believe that part of being a skilled developer is being able to both design a system that scales to millions of lines of code across an organization, and to operate well on one designed by someone else.


Well said. You will see very fast if a dev is experienced or not by looking at code organization and naming. Although I deal with experienced ones that just like to live in clutter. You can be both smart and stupid at the same time.


> It's okay to have files with 10,000 lines.

Ever since my time as a mathematician (I worked at a university) and using LaTeX extensively, I never understood the "divide your documents/code into many small files" mantra. With tools like grep (and its editor equivalents), jumping to definition, ripgrep et al., I have little problem working with files spanning thousands of lines. And yet I keep hearing that I should divide my big files into many smaller ones.

Why, really?


I think what goes wrong with dividing is people truly just splitting the code into multiple files.

I think the code should be split conceptually: not just copy/pasting parts of the code from the main file into submodules, but splitting the code along functional boundaries or concepts, so that each file does one thing and those concepts compose into more abstract concepts.

So that when I try to debug something I can decide where I want to zoom in and thus be able to quickly identify the needed files.


I think the big benefit is not the actual split into files, but the coincidental (driven both by features of some languages and also mere programmer convenience) separation of concerns, somewhat limiting the interaction between these different files.

If some grouping of functions or classes is split out in a separate file where the externally used 'interface' is just a fraction of these functions or classes, and the rest are used only internally within the file, then this segmentation makes the system easier to understand.


One big reason is source control. Having many smaller files with well defined purpose reduces the number of edit collisions (merges) when working in teams.

Also, filenames and directories tree act as metadata to help create a mental map of the application. The filesystem is generally well represented in exploratory tools like file browser and IDE. While the same information can be encoded within the structure of a single file, one needs an editor that can parse and index the format, which may not be installed on every system.


> Also, filenames and directories tree act as metadata to help create a mental map of the application

Correct. But this is an argument against splitting: once your folder structure reflects your mental model, you should no longer split, no matter how big individual files get. Splitting further will cause you to deviate from your mental model.

Also, it seems like we're arguing against a strawman: saying "big files are okay" is not the same as "you should only have big files". What people mean is that having a big file does not provide enough justification to split it. But it is still a signal.


Well, Git is pretty good at merging when the changes are in different places of the same file. Though your second point is a very good one.

FWIW, sometimes when I worked on a really large file, I put some ^L's ("new page" characters) between logical sections, so that I could utilize Emacs' page movement commands.


It's not just git, it's pull requests and code reviews that block on concurrent file updates. The workflow friction multiplies with the number of devs sharing the same code area. Small files minimize this by making more granular work units.


At the risk of getting lost in a swamp of not particularly good answers, it's most useful if you have scope control: If you have language keywords that allow you to designate a function as "The context/scope of this function never escapes this file," then multiple files suddenly become very useful, because as a reader you get strong guarantees enforced by a compiler, and a much easier time understanding context. The same can be said of variables and even struct fields. In very large programs it can also be useful to say, "The scope of this function/variable/etc never escapes this directory".

If everything is effectively global to begin with, you're right, it might as well all be in one file. In very large programs the lack of scope control is going to be significant problem either way.

This is where object-oriented programming yields most of its actual value - scope control.
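For what it's worth, here is a minimal TypeScript/ES-module sketch of that kind of compiler-enforced file scope (the file name and functions are hypothetical):

  // pricing.ts - only the exported surface ever escapes this file.

  // Not exported: no other file can import or call this, which is the
  // "context/scope never escapes this file" guarantee described above.
  function applyVat(net: number, rate: number): number {
    return net * (1 + rate);
  }

  // The single entry point other files are allowed to depend on.
  export function quote(net: number): number {
    return applyVat(net, 0.2);
  }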


This is one of alarmingly few sane, articulate comments amid an extremely weird conversation about how maybe a giant mess of spaghetti is good, actually.

Yeah sure overengineering is a thing, but you're way off the path if you're brushing aside basic modularization.


Because one day someone else may need to read and understand your code?


But is it generally easier to read and understand, say 10 files of 1000 lines or 100 files of 100 lines or 1000 files of 10 lines, compared to one 10,000 lines file? (I don't know the answer, and don't have any strong opinion on this.)


Navigating between files is trivial in most real IDEs. I can just click through a method and then go back in 1 click


But navigating between functions / classes / paragraphs / sections is also trivial in most real editors.


But this is precisely my question: in what way does splitting the code into many small files help with that? Personally I find jumping between many files (especially when they are located in various directories on various levels in the fs) pretty annoying...

Of course, that probably depends on the nature of the project. Having all backend code of a web app in one file and all the frontend code in another would be very inconvenient. OTOH, I have no problems with having e.g. multiple React components in one file...


Obviously you can split your code into many files in a way that obfuscates the workings of the program pretty much completely. And I think you can write a single 10kloc file that is so well organized that it is easy to read. I just never have seen one...

I believe that files offer one useful level of compartmentalizing code that makes it easier for other people to understand what is going on by just looking at the file structure, before opening a single file. The other guy can't grep anything before they have opened at least one file and found some interesting function/variable.


Mathematicians rarely collaborate by forking and improving other papers. They rewrite from scratch, because communicating ideas isn't considered the important part, getting credit for a new paper is.


Fair enough, though I've been working as a programmer for over 6 years now (and I've been programming on and off, as a hobby, for over 3 decades).


I often want to have multiple parts of the code open at once. Sometimes 5-10 different parts of the code (usually not so many as that, but it depends what I'm doing) so I can flip between them and compare code or copy and paste. Most editors I've used don't have good support for that within a single file (and having 5 tabs with the same name isn't going to make it very easy to tell which is which).


Very good point, though this is really a deficiency of "most editors". ATM I have 300+ files open in my Emacs (which is no wonder given that I work on several projects, and my current Emacs uptime is over 6 days, which is even fairly short). Using tabs is bad enough with 5+ files, tabs with 300+ would be a nightmare.

That said, Emacs can show different parts of the same file in several buffers, a feature which I use quite a bit (not every day, but still). And of course I can rename them to something mnemonic (buffer name ≠ file name). So I personally don't find this convincing at all.


Because often you don't know what to grep for, or the search is too general and returns lots of irrelevant results, or perhaps you're just in the process of onboarding onto a new project and you want just to browse the code and follow different logical paths back and forth...

and when dealing with the code that's well-organized and grouped into logically named files and dirs, you simply can navigate down the path and when you open a file all related code is there in one place without extra 10k lines of misc. code noise.


Just one of many reasons: parallel compilation.


Another excellent point, but only applicable to compiled languages (so still not my case).


Just two of many reasons: parallel linting, parallel parsing.


Ah. Silly me. Still, linting/parsing 5000 sloc is almost instantaneous even on my pretty old laptop.


5000 LoC isn't that much.

There are many other reasons why separation is better for humans (separation is organization) but these arguments about parallelism are valid for androids, humanoids and aliens.


It is a kind of cult really. Along with the rise of modern editors which somehow crap out at relatively large files. So small files are sold as a well organized, modular, logically arranged codebase. You see a lot of these adjectives used to support short files. None of them seem to be obviously true to me.


Generally speaking those 10k-line files are shit code, regardless of context

Generally speaking it's only Enterprise(tm) code that has miniscule (not small, miniscule) files with shit code

As long as it's good code, no one cares. If you aren't working on Enterprise(tm) code, odds are any bad code you look at will have way too many lines in whatever abstraction unit is being used


If you change 1 .c file, your compiler needs to recompile the .o file for that file and link it with the unchanged .o files.

If you have everything into one .c file, you need to recompile the whole thing every change.


If the file only contains static methods, it doesn't matter too much. However a class with mutable state should be kept pretty small imo.


This only applies to OOP, no?


Or module level state (Go, Python) which is in many ways even worse
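A tiny TypeScript equivalent of what that looks like (hypothetical module), just to show why it's awkward: the state is shared by every importer and is hard to reset or isolate in tests.

  // counters.ts - module-level mutable state: every importer shares it,
  // it survives across calls, and tests can't easily reset it.
  let requestCount = 0;

  export function trackRequest(): number {
    requestCount += 1;
    return requestCount;
  }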


grep can search multiple files. Tags can jump between files.

LaTeX is an obvious case: why pay to recompile the whole huge project, for each tiny change?


Actually, LaTeX has to recompile everything even after a tiny change (there are rare exceptions like TikZ, which can cache pictures, but even then it can put them into some file on its own AFAIR, so that I can still have one big file).

Now that I think about it, LaTeX is a very bad analogy anyway, since LaTeX documents usually have a very linear structure.


Broadly my heuristic for this is, "Would it make sense to run these functions in the other order?".

If you split up MegaFunction(){} to Func1(){} Func2(){}, etc, but it never makes sense to call Func2 except after Func1, then you haven't actually created two functions, you've just created one function in two places.

Refactoring should be about logical separation not about just dicing a steak because it's prettier that way.


I think that's a reasonable heuristic, but I'd also say you have to take into account the human readability aspect of it. It sometimes does make sense IMO to split solely for that, if it allows you to "reduce" a complicated/confusing operation to a string name, leading to it being easier to understand at a glance.


You shouldn't split MegaFunction into Func1 and Func2; you should do:

  MegaFunction()
  {
      Func1();
      Func2();
      ...
      FuncN();
  }
People will call MegaFunction() but it will be logically split internally.


This is how it's taught in school but the argument is that you shouldn't do that, instead just inline Func1 and Func2 and comment it better.

By chopping up MegaFunction like this, you've not actually separated Func1 and Func2 if they aren't really independent; they're just MegaFunction in disguise, now split across two places, making it more difficult, not easier, to reason about.

If you need the state (implied or explicitly passed in and out) from having executed Func1 to run Func2 then you're just creating spaghetti code. You're taking a big ball of mud and smearing it around instead of actually tackling the abstraction.

This is how you end up with Func(a, b, c, d, &e, &f) which ends up changing your MegaFunction state.

Or just as bad, Func1,2,N are all private functions, never called anywhere else outside the class MegaFunction is in, so logically (and from the point of view of testability) it's no different to having them all inline.

If you're creating a function that's only ever called once, then instead of a function call you're probably better off with a comment to "name" that block and explain the process instead.
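As a small, hypothetical TypeScript sketch of that "comment instead of a single-use function" style:

  interface Order { id: string; items: { price: number; quantity: number }[]; }
  interface Receipt { orderId: string; total: number; }

  function processOrder(order: Order): Receipt {
    // --- validate the order ---
    // Inlined rather than extracted into a validateOrder() that would only
    // ever be called here; the comment "names" the block instead.
    if (order.items.length === 0) {
      throw new Error("empty order");
    }

    // --- compute the total ---
    let total = 0;
    for (const item of order.items) {
      total += item.price * item.quantity;
    }

    return { orderId: order.id, total };
  }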


> you shouldn't do that, instead just inline Func1 and Func2 and comment it better.

Frankly It Depends(tm). Sometimes you can do this and not pass too much state, sometimes you can not. Sometimes your state is in a class, or global, and you just pass the class around.

I have somewhere a 3000 line state machine with the app state in a class - I just pulled out logically connected states into auxiliary files when it went over 1k lines. In my case it's easy to comprehend because groups of states are kinda separated logically.


As someone who works at a place that previously lived by:

>It's okay to not organize your code. It's okay to have files with 10,000 lines. It's okay not to put "business logic" in a special place. It's okay to make merge conflicts.

I absolutely disagree. It's "okay" if you're struggling to survive as a business and worrying about the future 5+ years out is pointless since you don't even know if you'll make it the next 6 months. This mentality of there being no need for discipline or craftsmanship leads to an unmanageable codebase that nobody knows how to maintain, everybody is afraid to touch, and which can never be upgraded.

You don't see the overhead of throwing discipline out the window because it's all being accrued as technical debt that you only encounter years down the road.


I think we're all confused over the definition. Also one might understand what all the proponents are talking about better if they think about this more as a process and not some technological solution:

https://github.com/tinspin/rupy/wiki/Process

All input I have is you want your code to run on many machines, in fact you want it to run the same on all machines you need to deliver and preferably more. Vertically and horizontally at the same time, so your services only call localhost but in many separate places.

This in turn mandates a distributed database. And later you discover it has to be capable of async-to-async = no blocking ever anywhere in the whole solution.

The way I do this is I hot-deploy my applications async. to all servers in the cluster, this is what a cluster node looks like in practice (the name next to Host: is the node): http://host.rupy.se if you click "api & metrics" you'll see the services.

With this not only do you get scalability, but also redundancy and development is maintained at live coding levels.

This is the async. JSON over HTTP distributed database: http://root.rupy.se (2000 lines hot-deployable and I can replace all databases I needed up until now)


How can I sell your idea?

I easily find my way in messy code with grep. With modules, I need to know where to search to begin with, and in which version.

Fortunately, I have never had the occasion to deal with microservices.


I'm unsure how to "sell" this idea. I don't want to force my view on others until I truly understand the problem that they're trying to solve with modules/microservices.

For searching multiple files, ripgrep works really well in neovim :)

[1] https://github.com/BurntSushi/ripgrep


> With modules, I need to know where to search to begin with

You can just grep / search all the files.

> in which version.

grep / search doesn't search through time whether you're using one file or many modules. You probably want git bisect if you can't find where something is anymore (or 'git log' if you have good commit messages).


grep can search multiple files at once.


I actually use ripgrep.

But modules can be in different repositories I haven't cloned yet.


Just use a proper IDE. It doesn't care about how your code is structured and can easily show you what you look for in context.

(And other tools like symbol search https://www.jetbrains.com/help/idea/searching-everywhere.htm...)


and it also does not magically guess what is in modules you haven't fetched yet. (I use LSP when I can)


Of course it can't know about non-existent sources. But when the sources are there, it's light years ahead of a simple text search that is grep/ripgrep.


Repo search.


> It's okay to have files with 10,000 lines. It's okay not to put "business logic" in a special place.

Couldn't disagree more. As usual, it's a tradeoff. You could spend an infinite amount of time refactoring already fine programs. But complex code decreases developers productivity by orders of magnitude. Maybe it's not always worth refactoring legacy code, but you're always much better off if your code is modular with good separation of concerns.


... then before you know it your code-base is 10 million lines long and you have no idea where anything is, what the side-effects of calling X are, what's been deprecated, onboarding is a nightmare, retention of staff is difficult, etc. You may be right for the smallest of applications, or whilst you're building an MVP, but if your application does anything substantial and has to live forever (a web app, for example), then you will have to get organised.


I think we are conflating (at least) two different issues here because we don't have a good way to deal with them separately. Closure on one hand and implicit logic on the other.

Most of the time when I split out a method, what I want is a closure that is clearly separated from the rest of the code. If I can have such closures, where input and output are clearly scoped, without defining a new method, that might be preferable. Lambdas could be one way to do it, but they still inherit the surrounding closure, so it's typically not as isolated as a method.
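A hypothetical TypeScript example of that difference: a lambda can silently reach into the surrounding scope, while a separate function has to take everything through its signature.

  function importUsers(rows: string[]): number {
    let skipped = 0;

    // Lambda: looks self-contained, but it can still touch `skipped` (and
    // any other local) through the enclosing closure, so it isn't isolated.
    const parseRow = (row: string): string[] | null => {
      if (row.trim() === "") {
        skipped++;               // hidden dependency on surrounding state
        return null;
      }
      return row.split(",");
    };

    rows.forEach(parseRow);
    return skipped;
  }

  // Separate function: everything it reads or writes must pass through the
  // signature, which is the clearly scoped input/output described above.
  function parseRowIsolated(row: string): string[] | null {
    return row.trim() === "" ? null : row.split(",");
  }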


> I think we are conflating (at least) two different issues

For sure, I have absolutely no idea how your comment relates to what I wrote.


Oops, it looks like I answered the wrong comment here. :/


It happens to the best of us! :)


> It's okay to have files with 10,000 lines.

I find there are practical negative consequences to having a 10,000 line file (well, okay; we don't have those at work, we have one 30k line file). It slows both the IDE and git blame/history when doing stuff in that file (i.e. when I look at the history of the code while making some decisions). These might not be a factor depending on your circumstances (e.g. a young company where git blame is less likely to be used, or something not IDE-driven). But they can actually hurt, apart from pure code concerns.


I think you are missing one point: mental load. I doubt people keep all their files in one directory or all their emails in one folder or just have one large drawer with every possible kitchen utensil in a pile. The same is true for code. Organizing your code around some logical divisions allows for chunking. I will agree that some people take it too far and focus too much on "perfect". But even some rudimentary directories to separate "areas" of code can save a lot of unnecessary mental gymnastics.


Not sure I totally agree, but one strong point against factoring in-line code out into a function:

You have to understand everything that calls that function before you change it.

This is not always obvious. It takes time to figure out where it's called. IDEs make this easier but not bullet proof. Getting it wrong can cause major, unexpected problems.


Organizing code and services so that a rotating crew of thousands of engineers can be productive is critical to companies like Amazon and Google and Netflix. Inlining code is a micro-optimization on top of a micro-optimization (that is, choosing to [re]write a service in C/C++), not an architectural decision.


I agree, but the problem is that eventually it becomes not okay. So it requires a bit of nuance to understand when it is and when it isn't.

Unfortunately most engineers don't like nuance, they want one-size-fits-all solutions.


Modules are a lot cheaper if you have a solver for them.

Microservices are the same modules. Though they force-add distribution, even where it can be avoided, which is fundamentally worse. And they make integration and many other things a lot harder.


What is a "solver"? Do you have any resources where I can learn about them?


Well, anything that allows you to express the wiring problem in formal terms and solve it. A dependency injection library. The implicit resolution mechanism in Scala. They may solve the basic wiring problem.

distage can do more, like wire your tests, manage component lifecycles, and consider configurations while solving the dependency graph, to alter it in a sound manner.
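For anyone wondering what the "wiring problem" looks like, here is a tiny hand-wired TypeScript example (made-up classes); a solver/DI library derives this construction order and graph for you, and can swap implementations (e.g. stubs in tests) without touching the call sites:

  class Database {}
  class UserRepository { constructor(readonly db: Database) {} }
  class UserService { constructor(readonly repo: UserRepository) {} }

  // The manual wiring a solver would generate from the constructors above.
  const db = new Database();
  const service = new UserService(new UserRepository(db));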


Blow and Carmack are game programmers. They are brilliant, but their local programs and data are tiny compared to distributed systems over social graphs, where N^2 user-user edges interact.


Their programs interact with different APIs and constraints on many different sets of hardware. 5 different major hardware targets, 8 operating systems, and hundreds of versions of driver software. It’s hard to do all that well and keep things validated to continue running with minimal ability to update and support it. Web programs barely work on two browsers. Server targets are generally reduced to a single target (Docker or a specific distribution of Linux).

Their tiny data is rendering millions of polygons with real-time lighting, asset loading in the background, low latency input handling, etc. at 60fps+. If my API responds to the user in 100ms, they’re ecstatic. If their game responds to the user in 100ms (5-10 dropped frames) even a few times, they’re likely to ask for a refund or write a bad review, hurting sales.

The constraints are different, but game programmers do the same kinds of optimization social networks do, just for different problems. They avoid doing the N^2 version of lighting, path finding, physics simulations, etc. Social networks avoid it when searching through their graphs.

I think the web and other big tech companies should try thinking about problems the way game programmers do.


Seems like you're talking about algorithms issues, not code complexity. If your code needs to scale (at all, never mind quadratically) with the size of the data, you're doing something very wrong.


You say that like real-time multiplayer gaming doesn't exist or something. Both of them have worked on those.... I think Carmack invented a lot of the techniques we use for those.

Yeah sure, the scale is smaller, but you can't get away with making the user wait 10 seconds to render text and images to a screen either. I think the software world might be a lot better place if more developers thought like game developers.


Let's not turn it into a penis measuring contest, please. Code organization and requirements differ significantly between different programming niches, and today's accepted practices are not some randomly invented caprices, but the result of the slow (and painful) evolution we've been fighting through over the past decades. Each niche has optimized over time for its own needs and requirements. My web APIs have hundreds of controllers and keeping them in separate files makes it way easier to manage. I know that because we used to keep it all in a single file and it sucked, so over time I learned not to do it anymore. Does it mean that embedded systems devs should organize their code in the same way? I have no idea, that's up to them to decide, based on their specific environment, code and experience.


> Each niche has optimized over time for its own needs and requirements.

Sure, but those "needs and requirements" aren't necessarily aligned with things that produce good software, and I think a lot of software development these days is not aligned. Further, I think the evolutionary path of the web in particular has produced a monstrosity that we'd be better off scrapping and starting over with at this point, but that's a tangential discussion.


> My web apis have hundreds of controllers

That you probably don't even need, but there's a paradigm of "every function should be its own class" that some devs seem to follow that I will never understand.


I don't do the one function one class mantra, but I absolutely need the separate controllers to group methods, because each set of them does different things and returns different data. If in my 20+ years of web dev I learned one thing, it's that trying to be too smart with code optimizations and mixing different logic together is never a good idea - it will always backfire on you and everything you "saved" will be nullified by the extra time and effort when you're forced to untangle it in future. The whole point of what I wrote was that there's no recipe that can be just uncritically applied anywhere; you need to adapt your style to your particular needs and experience. If you don't need many controllers, great for you... but don't presume you can just copy/paste your own experience onto every other project out there, or that we are all stupid for doing it differently...


You greatly underestimate the complexity of games, and greatly overestimate the complexity of working with distributed systems over social graphs


I've worked extensively in both. Both are complex. Games typically have a complexity that requires more careful thinking at the microscopic level (we must do all this game state stuff within 16ms). Web service complexity requires careful thinking about overall systems architecture (we must be able to bring up 10x normal capacity during user surges while avoiding cost overruns). The solutions to these problems overlap in some ways, but are mostly rather different.


The article above, and most if not all the comments I read right before posting this, seem to be very quiet about what I thought was one of the main "distinctive elements" of Microservices.

I.e. the idea that each microservice has direct access to its own, dedicated, maybe duplicated storage schema/instance (or if it needs to know, for example, the country name for ISO code "UK" it is supposed to ... invoke another microservice that will provide the answer for this).

I always worked in pretty boring stuff like managing reservations for cruise ships, or planning production for the next six weeks in an automotive plant.

The idea of having a federation of services/modules constantly cross-calling each other in order to just write "Your cruise departs from Genova (Italy) at 12:15 on May 24th, 2023" is not really a good fit for this kind of problem.

Maybe it is time to accept that not everyone has to work on the next version of Instagram and that Microservices are probably a good strategy... for a not really big subset of the problems we use computers for?


Country codes, and other lookup tables, could easily be handled by a repository of config files. Or, if your company uses a single programming language, a library.

One strategy I've used is to designate one system as a source of truth (usually an ERP system) and periodically query its database directly to reload a cache. Every system works off their own periodically refreshed cache. Ideally, having all the apps query a read replica would prevent mistakes from taking down the source of truth.

I haven't done this, but I think using Postgres with a combination of read-only foreign tables and materialized views could neatly solve this problem without writing any extra code.

I don't know how far this would scale. I do know that coordination and procedures for using/changing the source of truth will fall apart long before technical limitations.
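A rough TypeScript sketch of that periodic-refresh pattern (the table, client type and interval are invented for illustration):

  // Each consuming app keeps its own in-memory copy of the lookup data and
  // refreshes it from a read replica of the source of truth (e.g. the ERP).
  type ReplicaClient = { query(sql: string): Promise<{ code: string; name: string }[]> };

  let countryNames = new Map<string, string>();

  async function refreshCountryCache(replica: ReplicaClient): Promise<void> {
    const rows = await replica.query("SELECT code, name FROM countries");
    const next = new Map<string, string>();
    for (const row of rows) {
      next.set(row.code, row.name);
    }
    countryNames = next;
  }

  // e.g. refresh every 10 minutes; reads in between never hit the source of truth:
  // setInterval(() => refreshCountryCache(replicaClient), 10 * 60 * 1000);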


Fair enough. The catch, though is that ... I am the guy working on the ERP, in your example ¯\_(ツ)_/¯


You can make spaghetti from anything.

A micro service implementation of that would be CQRS, where the services to write updates, backend process (eg, notifying the crew to buy appropriate food), and query existing records are separated.

You might even have it further divided, eg the “prepare cruise” backend calls out to several APIs, eg one related to filing the sailing plans, one related to ensuring maintenance signs off, and one related to logistics.
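For readers unfamiliar with the acronym, a minimal TypeScript sketch of that command/query split (the interfaces and names are invented for illustration):

  // Command side: owns writes to bookings.
  interface BookingCommands {
    createBooking(cruiseId: string, passengerId: string): Promise<string>;
    cancelBooking(bookingId: string): Promise<void>;
  }

  // Query side: read-only, typically backed by its own denormalised view
  // that already contains the port name, country and local departure time.
  interface BookingQueries {
    getBooking(bookingId: string): Promise<BookingView | null>;
  }

  interface BookingView {
    bookingId: string;
    departure: string; // e.g. "Genova (Italy) at 12:15 on May 24th, 2023"
  }

  // A separate backend process would subscribe to booking events and handle
  // side effects such as notifying the crew about provisioning.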


They solve specific problems. If they don't solve a problem you have, then using them is probably a mistake.

The thing is that the framing of "the problems we use computers for" misses the entire domain of problems that microservices solve. They solve organisational problems, not computational ones.


I have exactly the opposite problem though: the kind of problems I have worked on so far would not be "solved" by leveraging "large number of developers that can independently work on an equally large number of small, modular programs with a very well defined, concise interface".

And this is not because "my stuff is complicated and your stuff is a toy", either. It's more like "ERP or Banking Systems" were deployed decades ago, started as monoliths and nobody can really afford to rewrite these from scratch to leverage Microservices (or whatever the next fad will be). (I am also not sure it is a good idea in general for transactions that have to handle/persist lots of state, but this could be debated).

The problem, in fact, is that "new guys" think that Microservices will magically solve that problem, too, want to use these because they are cool and popular (this year), and waste (and make me waste) lots of time before admitting something that was clear from day 1: Microservices are not a good fit for these scenarios.

(I still think that "these scenarios" are prevalent in any company which existed before the 90s, but I might be wrong or biased on this).


Microservices as a named architectural pattern are over a decade old at this point. Anyone jumping on them because they're the new hotness is more than a little behind the times.

> I have exactly the opposite problem though: the kind of problems I have worked on so far would not be "solved" by leveraging "large number of developers that can independently work on an equally large number of small, modular programs with a very well defined, concise interface".

If you don't have multiple teams working on it, and you don't have specific chunks of functionality that you need to scale orthogonally, then you don't have the problems microservices solve. So don't use them. That seems uncontroversial to me.

> I still think that "these scenarios" are prevalent in any company which existed before the 90s, but I might be wrong or biased on this

This is survivorship bias, in a sense. Microservices only make sense where access to CPU and network is cheap and on-demand (largely). That's only started to be an assumption you could really make since Amazon's EC2 convinced the world's CIOs that fungible tin was better than their own racks.

That means you don't see microservice architectures in older companies, and you don't see problems being solved with them even where it might have made sense retrospectively because IT Ops would never have been set up to enable it.

Today you don't need a big rewrite to justify introducing microservices where they're needed. That's a straw man. All you need is a team that will be able to move faster if their deployment is decoupled from everyone else's.

But fundamentally if your problem area smells like "bunch of business rules and a database" then, again: if you don't have the problem that the architecture solves, don't use it.


> Anyone jumping on them because they're the new hotness is more than a little behind the times.

cough Aging directors in Health and Finance cough


> Instagram

Does anyone have a microservice "map" of Instagram? I feel that would be helpful here.


Instagram, Dropbox, many of the major tech companies still use a monolith.

Or you can think of it as trunk/branch architecture. One main "trunk" service and other branch services augmenting it, which is a simpler thing to reason about.

Now imagine a small shop of 20 devs deciding to build something more complicated.


They actually use a python monolith unless it's changed recently. See e.g. https://instagram-engineering.com/static-analysis-at-scale-a...


Meta deploys as "microservices" with multiple monorepos...

Binaries are deployed and scaled independently as thrift services.

Tons of rpc.


The distinctive elements are that they compile separately and have versioning on their API calls.

No, you don't use separate microservices for writing out that text message.

The idea is pretty simple instead of writing one big program, you write many smaller programs.

It's useful around the 3 to 4 separate developers mark.

It avoids you having to recompile for any minor change, allows you run the tests for just the part you changed and allows you to test the microservices in isolation.

If you have a production issue, the exception will be in a log file that corresponds to a single microservice or part of your codebase.

Microservices are a hard form of encapsulation and gives a lot of benefits when the underlying language lacks that encapsulation. e.g. Python.


No, you don't use separate microservices for writing out that text message.

But in order to find out that Genova is the name of the port from where the cruise is departing, the appropriate time (converted to the timezone of the port, or the timezone of the client who will see this message, depending on what business rule you want to apply) and that Genova is in IT=Italy... how many microservices do I have to query, considering that port data, timezones, ISO country codes and the dep. date/time of the cruise are presumably managed in at least four different "data stores"?


1 microservice. It's up to the software engineer to scope out the microservices properly.


I've never seen that in practice. You don't have a database for each individual lambda. That's insanity. You can have multiple microservices point to a shared datasource; it's not illegal.


I agree, and yet most microservice zealots seem to have a different opinion on this.

e.g.: https://www.baeldung.com/cs/microservices-db-design

"2.1. Fundamentals By definition, microservices should be loosely coupled, scalable, and independent in terms of development and deployment. Therefore, the database per service is a preferred approach as it perfectly meets those requirements. Let’s see how it looks:"

Please understand that I have worked only on monoliths and will probably retire while still working on monoliths. This kind of absurd position only comes up when someone comes to my office with some grand plan to convert the application I work on to something "more microservice oriented".


Whilst I don't know much about cruises. Let me make up an example for you.

Let's suppose we are Acme Cruise Lines running a Cruiseliner:

Microservice - National Coastguard Ship Arrival System Feed Handler

Database - Logs incoming messages on the feed

Microservice - Asian and Australian Joint Ship Monitoring System

Database - Logs incoming messages on the feed

Microservice - Cruiser Arrival and Departure Times

Database - Cruiser Arrival and Departures Times in a Standard Format

Microservice - Customer Bookings and Payments

Database - Customer Bookings and Payments

Microservice - Fuel Management System

Database - Ship Fuel Levels & Costs of fuel at Various Ports.

It's that high level of split up.

(AWS Lambdas aren't quite the same thing as microservices.)


You are absolutely right: you do not know much about cruises.

The things you listed are ... 4-5 different applications, mostly running directly on the ship(s) and what is conspicuously missing are the parts that are managed shoreside, like:

Itinerary planning (your product is one or more cruises: therefore you need to preplan the itineraries, which ships to use, when you will enter each port and when you will leave it, and so on... try to imagine this like a mix between hotel management and flight company management).

Inventory management (i.e. Reservation).

You mention "bookings" like a microservice. This could work for a ferry line, where you are basically selling a ticket for a single trip, most of the time with no personal accommodation (except maybe for the car).

A Cruise booking usually comes with a ton of ancillary services (e.g. I live in Berlin and I want to take a cruise in the Caribbean... therefore I need a flight to get there and back. This could be provided by a charter flight or a normal airline... in either case the ticket will be part of the booking itself) - take into account that cruise customers are mostly middle-aged or older and relatively affluent, therefore the last thing they would like to do is to create their own itinerary by booking services on 4-5 different websites.

(But I suppose it is better to stop there, we are really OT now)


"4-5 different websites" No, no, no. Those are internal APIs. You still only present one website to the end user.


Sorry, you misunderstood.

When you book a cruise, your expectation is to get everything (including hotel stays before or after the cruise, along with flights to and from it) as a single package provided by the cruise vendor (or a travel agency). So the "reservation system" must take care of all of that.

When I said "4-5 different websites" I was trying to explain the point that a 60 yo high-income guy is usually not interested in getting the cruise itself on Carnival.com, then go look for excursions on CarribeansExcursion.net and to book flights on Lufthansa.de or AirFrance.fr.

It was a remark on the way the Cruise business works, and why it is so, not about architecture.

(But, once again, I really believe that we are way off topic... personally I do not really feel like creating a "Ask HN: are microservices a good choice for the cruise industry?" but if someone feels like submitting one I will try to contribute).


I think most criticisms around microservices are about good practices and skills beating microservices in theory.

And the virtue of microservices is that they create hard boundaries no matter your skill and seniority level. Any unsupervised junior will probably dissolve the module boundaries. But they can't simply dissolve the hard boundary of having a service in another location.


Those boundaries also increase the cost of development. If you are institutionally incapable of enforcing coding standards such that you can't prevent juniors from coupling your modules, perhaps it's worth it. But there are more efficient ways to build and run an engineering organization.

The best place for such boundaries is at the team/division/org level, not team-internal or single dev-internal like microservices implies with its name. Embrace Conway's law in your architecture, and don't subdivide within a service.


"If you are institutionally incapable of..."

This is the road to bullshit. Of course no manager or CEO will admit that their team/company is that. Admitting technical non-excellence is nearly impossible. Organisation inadequacy... impossible.

So the best course of action is to pretend your organisational methods are really just software architecture... and back to square one.


This is something I'm painfully familiar with, and it's starting to seem to me that the "definition creep" I thought was the result of marketing (such as we see with terms like "AI" and "ML") actually mostly comes from this.

If you are a dumpster-fire S&P 500 company CTO and there is a new shiny thing that would actually improve things, you are probably more likely to redefine that new term into the horse-shit you are currently doing than to actually do it.


If you can't admit the problem. Can't stare it down. Can't comprehend it. Then I don't know how you can ever solve it. I try to avoid these places, not easy though.


Are there any examples of large organizations that don't have this problem? I can imagine places with small teams that are basically isolated that can operate efficiently, but once you scale to the point where dozens to hundreds of teams need to cooperate, it seems like all you have is tradeoffs and fundamental coordination and scaling issues. I've never heard of a big org that didn't have some flavor of dysfunction resulting from that kind of complexity. Some places seem to be "less bad" but that doesn't mean good or efficient.


Jim Keller mentioned this problem in the hardware space: basically, what happens when people breach the interface design and couple things, and how that coupling limits your ability to grow the design.

When he helped set the tone for the Zen architecture he took AMD's existing people, set them on a more aggressive path, and one of the cardinal tenets was that you could not cheat on the interfaces. This is one of the nuggets you can pull from his interviews.

It's possible. It happens. And the end results can be industry changing.


Any way you slice it, it's hard to manage/align/coordinate 100 devs.

If done right, microservices are a way to transform part of your organizational challenge into a technical one, which for many organizations is the right move.

The biggest issue is that if you aren't large enough to have challenging organizational issues, it's much easier to just solve the very solvable organizational issues than to implement and use microservices.


> If done right, microservices are a way to transform part of your organizational challenge into a technical one, which for many organizations is the right move.

Famous last words, if done right... Or you just multiply your organizational issue with a technical one.


Sure, poorly implemented solutions rarely solve problems well.

But implementing microservices is not an unsolvable problem. It's a problem that 1000s of organizations have solved.


I haven't seen one that has done it well personally. Missing from this is so much of what "doing it right" actually involves. There are so many dragons: vendor lock-in, logging, debugging, development (can I run the application on my laptop?), versioning. How far will out-of-the-box tooling get me vs what I have to build? Etc etc etc.

When the "new shiny" effect wears off you usually find a turd that smells worse than what came before. Which is why we see this thread ever month or two and will until the tooling catches up and companies stop creating the distributed turds or the method falls out of grace because people finally realize you can scale simple systems pretty far vertically.


Are organizations ever capable of enforcing coding standards even remotely close to what microservices _should_ provide? Because I have not seen it.

However, this is muddied by the fact that I almost never see microservices; I see a lot of distributed monoliths though.


As for orgs capable of adhering to standards: I've seen success in small teams. I've never seen it in a large org though; it's always a dumpster fire. Especially anything that grows really fast: culture gives way to randomness, and it's a hodgepodge of understanding and methods that are all over the map.


I guess the birth of the term "distributed monolith" gives the lie to the idea that service boundaries can't be breached by devs. :)


I always love that circular logic.

"Hey our developers can't make modules correctly! Let's make the API boundaries using HTTP calls instead, then they'll suddenly know what to do!"

And that unsupervised junior? At one place I joined, that unsupervised junior just started passing data he "needed" via query string params in ginormous arrays from microservice to microservice.

And it wasn't a quick fix because instead of using the lovely type system that TELLS you where the stupid method has been used, you've got to go hunting for http calls scattered over multiple projects.

All you've done is make everything even more complicated, if you can't supervise your juniors, your code's going to go sideways whatever.

Microservices don't solve that at all and it's pure circular logic to claim otherwise. If your team can't make good classes, they can't make good APIs either. And worse still, suddenly everything's locked in because changing APIs is much harder than changing classes.


If the modules are distributed in binary form, for the languages that support them, the junior will have no way around it unless they feel like learning about hacking binaries.
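
In Java terms, for instance, a minimal sketch (module and package names invented): a module descriptor that only exports its API package means other modules can't even compile against the internals, whether the module ships as source or as a jar.

    // billing/src/main/java/module-info.java (hypothetical layout)
    module com.example.billing {
        // only the public contract is visible to other modules
        exports com.example.billing.api;

        // com.example.billing.internal is NOT exported: other modules cannot
        // compile against it, and reflective access is refused by default
        requires com.example.common;
    }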


I have never seen a "hard boundary" in a small to medium sized company. Everyone still ends up communicating, and there are simply not enough resources to maintain truly separate services.

The rigor of maintaining resilient, backward-compatible, versioned internal APIs is too resource- and time-consuming to do well. All I see is hack and slash, and tons of technical debt.

It seems like in the last couple of years it has started sinking in that distributed systems are hard.


Huh? They can, and will, just add the things they want to the rest api of the microservice A and then call them from B. That doesn't change with microservices.

Look up a thing called "distributed monolith".


That is harder when each service and team has their own repo and review process. It doesn't mean changes won't just get merged, but there is increased friction and less trust around random PRs from engineers who aren't actively working on the same team, so those PRs might get more scrutiny.


Then why can't you have separate repos and review processes for modules? This has nothing to do with microservices vs modules.


Agreed - there is no reason for that not to be the case. It just is that in practice (and not as some essentialism of how organizations need to structure things) microservices tend to proliferate git repos and modules tend to be part of monorepos. But, sure, there is no need for that to be the case. So in a world like that, the repo separation is what creates the friction for junior devs submitting changes across repos. With monorepos it is easier.


> But they can't simply dissolve the hard boundary of having a service in another location.

I've seen people doing that a few times already. They start changing due to some uninformed kind of "convenience", and then you look at the services API and it makes no sense at all.


Yeah, I feel that happens all the time.


Time to throw the cat amongst the proverbial pigeons and start the year 2023 off with discord and disharmony.

Microservices are a solution to a problem. TDD is a solution to a problem, the same problem. Both are solutions that themselves create more, and worse, problems. Thanks to the hype driven nature of software development the blast radius of these 'solutions' and their associated problems expands far beyond the people afflicted by the original problem.

That problem? Not using statically typed languages.

TDD attempts to reconstruct a compiler, poorly. And Microservices tries to reconstruct encapsulation and code structure, poorly. Maybe if you don't use a language which gets hard to reason about beyond a few hundred lines you won't need to keep teams and codebases below a certain size. Maybe if you can't do all kinds of dynamic nonsense with no guardrails you don't have to worry so much about all that code being in the same place. The emperor has no clothes, he never had.

Edit: to reduce the flame bait nature of the above a bit. Not in all cases, I'm sure there are a very few places and scales where microservices make sense. If you have one of those and you used this pattern correctly, that's great.

And splitting out components as services is not always bad and can make a lot of sense. It's the "micro" part of "microservices" that marks out this dreadful hype/trend pattern I object to. It's clearly a horrible hack to paper over the way dynamically typed codebases become much harder to reason about and maintain at scale: adding a bunch of much harder problems (distributed transactions, networks, retries, distributed state, etc.) in order to preserve teams' sanity instead of just using tooling that can enforce some sense of order.


Microservices has little to do with tech. It is a way of organizing teams of people, funnelling all communication between teams through well defined specifications instead of ad-hoc meetings. It is not clear where static typing, or lack thereof, comes into play here.

TDD is a method of documenting your application in a way that happens to be self-verifying. You could use a Word document instead, but lose the ability for the machine to verify that the application does what the documentation claims that it should. Static typing does provide some level of documentation as well, but even if you have static typing available static typing isn't sufficient to convey the full intent that your documentation needs to convey to other developers.


# Microservices

I think at the root this idea that Microservices provide this technical solution to a social issue, coordinating teams at scale, hides the driving motivation.

Why can't the teams work on the same codebase and the same database? (again note my edited in disclaimer in the GP post that maybe at certain rare scales it's necessary). What is a well defined specification and why is it only a REST/proto/whatever network API? Why not a defined interface or project/library? Why aren't teams at this scale working in languages that support defining these things in code without adding a complicating, latency and error-inducing, network call?

Statically typed languages are just superior at encoding this "well defined specification". It's a large part of their reason for being. Sure you can still bypass it and do reflection or other naughtiness, but we shouldn't pretend a network call is the only solution.
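
To make that concrete, a rough sketch in Java (all names invented for illustration): the "well defined specification" between two teams can just be an interface in a shared contract module, with each team keeping its implementation behind it.

    // contract module, jointly owned / reviewed by both teams
    public interface ReservationService {
        /** Returns a confirmation id, or throws if the cabin is unavailable. */
        String reserveCabin(String cruiseId, String cabinClass, int passengers);
    }

    // team A's in-process implementation; team B only ever sees the interface
    final class DefaultReservationService implements ReservationService {
        @Override
        public String reserveCabin(String cruiseId, String cabinClass, int passengers) {
            // persistence and business rules live here, hidden from callers
            return "CONF-" + cruiseId + "-" + passengers;
        }
    }

No network hop, but the same contract-shaped boundary.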

# TDD

Testing is certainly a way to provide a self-verifying application. TDD is a toxic, hype-driven, mess that has led many good developers astray. Static languages have their own devils here, mocking being chief among them (not to mention the Java clown brigade and their own enthusiastic adoption of -- originally Smalltalk conceived -- TDD).

I added TDD both to annoy people but also to illustrate the general point. These are both concepts derived from the obvious drawbacks of dynamically typed languages that are then treated as generalist answers to complexity in software.

If you'll allow me some pseudo-intellectual waffle: the complexity of a software system is C_dynamic + C_software = C_total, that is, the complexity of dynamically typed languages and their lack of rigour, plus the complexity of the software and the domain as a whole, gives you total complexity. These approaches primarily target the dynamic complexity. Therefore introducing them, with all their attendant drawbacks, in places where total complexity doesn't include the dynamic complexity (because tools and approaches that remove that source of complexity are already in use) actually worsens the entire field.


> Why can't the teams work on the same codebase and the same database?

The same database is tricky because one team might want to change the schema, breaking another team's work. You then need to have meetings to discuss the changes that will work for all parties and you've broken "communicate only by contract", and therefore no longer doing microservices. If you can ensure that there is no way for teams to trample over each other then you could use the same database.

The same codebase is more feasible as long as the team boundary separation is clear. You are likely already doing microservices with other teams when you import third-party libraries. In practice, though, much like the database problem maintaining boundary separation is difficult if you can easily reach into other team's code, so often each team's work will be parted into distinct codebases using IPC between them to ensure that the separation is unbreakable. This certainly isn't a requirement, though, just one way to keep people in line.

> Testing is certainly a way to provide a self-verifying application. TDD is a toxic, hype-driven, mess

Testing is a development tool to help test the functionality of a feature. If you are faced with a dynamic language you very well might write tests to ensure that types are handled correctly. Static types can, indeed, relieve the need for some tests.

TDD, on the other hand, is a documentation tool to relay to other developers what functionality the code is intended to provide and how it is meant to be used. It is strange to think that documentation is toxic. Before TDD we wrote "Word documents" containing the same information, but often found that the code didn't do what the documentation said it did. What TDD realized is that if the documentation is also executable, then you can have the machine prove that the application behaves according to spec. It's not exactly magic.
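
A minimal sketch of what that executable documentation can look like (JUnit 5, with a hypothetical Discount class standing in for the code under test):

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import static org.junit.jupiter.api.Assertions.assertThrows;
    import org.junit.jupiter.api.Test;

    // hypothetical class under test
    class Discount {
        double apply(double total) {
            if (total < 0) throw new IllegalArgumentException("negative total: " + total);
            return total >= 100 ? total * 0.9 : total;
        }
    }

    // reads as a spec; the machine keeps it honest
    class DiscountSpec {
        @Test
        void ordersOfOneHundredOrMoreGetTenPercentOff() {
            assertEquals(90.0, new Discount().apply(100.0), 0.001);
        }

        @Test
        void negativeOrderTotalsAreRejected() {
            assertThrows(IllegalArgumentException.class, () -> new Discount().apply(-1.0));
        }
    }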


# Microservices

Agreed, the same database gets difficult, but where two things need to change together there's coordination overhead; that's inescapable. You can in general use e.g. schemas to keep domains fairly separate. Whether it's your HTTP API changing or a change that needs to be coordinated in the same process, I don't think you have a way to avoid these meetings. (Or, like the 3rd-party team I work with, you just change your HTTP API and break production without telling us :shrug:.)

I guess my point is languages like Java and C# are much maligned for their "boilerplate" and it doesn't let 10x genius programmers go brrrrrr etc. But that boilerplate serves a purpose. Projects (in C#, I can't speak to Java but I believe it has a similar concept) compile to separate .dll's, or .so's. In general a project will publicly expose interfaces, i.e. contracts, and keep its internals encapsulated. Crucially projects all live in the same codebase in the same monolith, they're folder/directory level separation, but on steroids. There are tricks with attributes and reflection that can break this separation but with the additional constraint that dependencies cannot be circular you're forced to make decisions about where exactly things live. This is where the dynamic crowd throw up their hands and say "there are so many layers of pointless abstraction!". I'd agree these languages tend to too many layers but the layers provide the foundation which can support a project growing from 1-2 devs all the way through to 50+.

I'm maybe finally becoming a grumpy old programmer and I'm glad much of the consensus is swinging away from the no-types, no schema, no rules position of the past decade. People forgot Chesterton's fence and decided the layers didn't provide anything, then they added them back, as network boundaries, which was indescribably worse and more of a headache.

# Testing

I guess it's a semantics debate really. I think the red-green-refactor loop of TDD doesn't bring much. I've never seen it applied well on a statically typed codebase and where I've seen advocates try and apply it I've seen them tunnel-vision to suboptimal outcomes in order to fulfil the ceremonial needs of the process.

I think a few tests that take place at the project/dll/so boundary and use as much of the real dependencies as possible are far preferable to high coverage. This is probably a no-true-Scotsman area of debate. Maybe TDD has been done properly and productively by some groups, I've never seen it and no one I've met ever has (check out my confirmation bias!).


> Whether it's your HTTP API interface changing

Interfaces don't change under microservices. Communication by contract enshrines a contract. You must ensure that your API does not break legacy users no matter what changes you want to make going forward. You have committed to behaviour forevermore once you submit the contract to other teams.

This may be another reason why IPC is often preferred over straight function calls as some languages, particularly those with static types, make extending functionality without breakage quite hard. An HTTP API can more easily resort to tricks to return different results to different callers.
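
For example (a sketch with the JDK's built-in HttpServer; the header name and payloads are invented): the handler can branch on a version header, keeping old callers on the old response shape while newer callers get more.

    import com.sun.net.httpserver.HttpServer;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class VersionedApi {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/customer/42", exchange -> {
                String v = exchange.getRequestHeaders().getFirst("Api-Version");
                // legacy callers keep the original shape; newer callers get the extra field
                String body = "2".equals(v)
                        ? "{\"id\":42,\"name\":\"Ada\",\"tier\":\"gold\"}"
                        : "{\"id\":42,\"name\":\"Ada\"}";
                byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, bytes.length);
                exchange.getResponseBody().write(bytes);
                exchange.close();
            });
            server.start();
        }
    }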

> I think the red-green-refactor loop of TDD doesn't bring much.

You find little value in writing a specification for your work or you find little value in automatic validation that the program works according to your spec?

"Red-green-refactor" is a little more specific in that it says that you should write a spec only for the work you know you are going to work on in the short term, whereas TDD in general leaves room for things like writing the entire application's spec before getting down to business. I think this is most practical in the real world, generally speaking. Often you don't know what your entire application should do in the beginning, making writing a full spec unrealistic.

> I've never seen it applied well on a statically typed codebase

Interesting, as I have only ever seen specifications (be it TDD or "Word documents") written for work that is done in statically typed languages. In my experience, the dynamic programming languages tend to attract the cowboy coders who don't understand why other (future) developers need more complete documentation or why specs are important and have no care to see that those things are done.


Apologies for the delay in a reply. I think this has been a constructive discussion certainly for how I think about things.

> Interfaces don't change under microservices. Communication by contract enshrines a contract. You must ensure that your API does not break legacy users no matter what changes you want to make going forward. You have committed to behaviour forevermore once you submit the contract to other teams.

I think this entails a couple of things.

Firstly, it's a trade-off. If we discover some incorrect assumption is baked in to both sides of an interface we're kind of stuck and have a hard time with the migration path going forward. This isn't unsolvable and paths exist to correct it, but they're substantially harder than just updating both parts in the same commit. It moves coordination overhead as a trade-off.

Secondly, what I think actually happens is that most companies using microservices aren't this disciplined and just end up bodging or implicitly coupling services. Probably the pattern I'm talking about is "distributed monolith". Not the good kind where you just scale out your monolith, but where you drink the microservices kool-aid and create "microservices" by adding network boundaries at random. This is the real-world application of microservices I have seen where I've seen it. I don't doubt some people can use it correctly with proper rigour. But my theory is the pattern took hold because a "distributed monolith" appears to help tame the complexity of a dynamic monolith. I find it incredibly hard to believe the niche pattern of microservices done correctly would have become so popularized otherwise.

> You find little value in writing a specification for your work or you find little value in automatic validation that the program works according to your spec?

I find little value in providing that specification through the TDD approach or associated ceremony. I prefer as few tests as possible at as high a level as possible with as few mocks as possible. I'll take a single test that takes 15 seconds to run over 1000 tests running in milliseconds.

Again for me this is a question of "what popularized this pattern?".

I think there's a lot of healthy debate to be had about how much testing is needed. I think TDD appeared to provide the kind of guarantees that are really helpful in a dynamic language and gained a lot of advocates that way. When I had to write some (type hint free) Python I naturally defaulted to writing TDD since keeping a large system in your head and error free is basically impossible without an exhaustive suite of tests. It's an approach that works very well in one context that applied to other contexts unquestioningly delivers a lot of pain.

Maybe you're lucky to have only worked with very deliberate and rigorous engineers in your career. Outside the top tier we're working with architecture astronauts and people who jump on whatever hype cycle happens to be passing which explains my anger with these concepts.


> Apologies for the delay in a reply.

Not at all. I'm in no rush.

> Secondly, what I think actually happens is that most companies using microservices aren't this disciplined and just end up bodging or implicitly coupling services.

If an organization is sufficiently large you may have no way to get another team on the line to even try. At that scale the other teams may as well work for other companies and that is what microservices models. If your organization is small, I tend to agree that you won't succeed. You can't beat Conway's Law.

> I find little value in providing that specification through the TDD approach or associated ceremony.

But, ultimately, what's the difference between writing your spec in an executable way or writing it in a Word document beyond the superficial differences in the languages? What is communicated to other developers ends up being the same.

> I prefer as few tests as possible at as high a level as possible with as few mocks as possible. I'll take a single test that takes 15 seconds to run over 1000 tests running in milliseconds.

Seemingly not all that common, interestingly, but Go comes to mind as a language that provides constructs to define separation between specification (TDD) and developer tests. Conceivably you could exclude your TDD specs from execution, only running the tests that help in your development process.

The value of TDD isn't in the execution, but in what is communicated to other developers. That the specs are executable is merely a nice side benefit to help with confirming that the implementation conforms to the spec. You still get 90% of the benefit of TDD even if you never run the executable, albeit granted at that point it is just a fancy Word document.

> I think there's a lot of healthy debate to be had about how much testing is needed.

Perhaps, but TDD isn't about testing. TDD is about providing documentation. I think there is less room for debate there. I'm not sure anyone who has ever inherited a codebase has wished it gave less insight into what the 'business needs' of the program are. Again, a Word document can provide the same documentation, but if you're going to write that Word document anyway why not go the extra mile and gain the additional benefits that come with execution?


I don't think this is true at all.

A compiler can't check that your logic is correct. There may be a bit of overlap in the things verified by tests and a compiler but they don't solve the same problem.

How are static types a requirement for encapsulation? Dynamic languages are perfectly capable of providing encapsulation. Statically typed languages are also perfectly capable of having very poor encapsulation.


The assertion that TDD is a solution to dynamic typing falls apart when you consider that the TDD dogma was born out of Java programmers. I won't argue that Java has the most expressive type system, but it _is_ statically typed.


My understanding is that it's kind of a 3-pronged thing, It originates (or is "rediscovered") by Kent Beck working in, at-the-time, Smalltalk. It has huge adoption in the RonR community. And lastly the usual Java suspects who never met a bad pattern they couldn't massively overadopt, e.g. the so-called "Uncle" Bob who never met a pattern he couldn't jam into code to overcomplicate it.


You know, I knew there was a major association between TDD and XP (wonder why!) and I had always heard of XP being Java-related, but looking deeper I'm not sure that's as true as I thought.


> It has huge adoption in the RonR community

Is that even true? DHH (RoR creator) hates TDD.


It's been hard to find a timeline or more general history of TDD, and I should probably have used the word "had" above. But my understanding is RSpec[0] was a huge driver of the test-driven development hype cycle. While DHH[1] has, as of 2014, come out against TDD and was probably never 100% on board, he is not the whole community, and that still leaves a gap of 9 years where, in my understanding, it was broadly popularised as both "the way you should be doing things in Rails" and, as with most things in Rails, "the way everyone should be doing things".

[0]: https://www.stevenrbaker.com/tech/history-of-rspec.html

[1]: https://dhh.dk/2014/tdd-is-dead-long-live-testing


Not to mention that before TDD the same documentation was provided as plain text. TDD simply realized that if the documentation was also executable that the implementation could be automatically validated for compliance against the documentation. The idea that you are reconstructing the compiler doesn't even track on the surface.

It is true that one may use testing to stand in for the lack of static typing, which is no doubt what the parent is talking about, but TDD != Testing. Testing long predates TDD.


Dynamic typing + Microservices + Unit Tests just blows everything out of the water on Development Speed / Time to Market.

Most important thing for lots of startups and companies is Time to Market.

Traditional static-typing-based approaches are just a bad joke (3 times slower on average) in comparison.

That's why we have the whole microservices and dynamic typing thing going on, because businesses that use it beat up businesses that don't. It's pretty simple really.


> That problem? Not using statically typed languages.

TDD was invented by a Java programmer. How does that fit into your world view?


Java's original type system, while static, was far from being powerful enough to provide strong guarantees of safety.

Saying age is an int is great, but Java doesn't let you say age is an int ranging from 0 to 130.

You can of course create an age object, but the constraints on that object cannot be expressed within the type system.

So you have to add unit tests instead, the unit tests in effect extend the type system out to be as powerful as type systems from the 80s.

Yay, progress! :/
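
Concretely, the workaround ends up looking something like this (a sketch; the 0 to 130 range is just the example above), with the rule enforced at runtime instead of by the type checker:

    public final class Age {
        private final int value;

        public Age(int value) {
            // the compiler can't see this rule, so it's checked at runtime
            // (and typically pinned down further by unit tests)
            if (value < 0 || value > 130) {
                throw new IllegalArgumentException("age out of range: " + value);
            }
            this.value = value;
        }

        public int toInt() { return value; }
    }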


> Java doesn't let you say age is an int ranging from 0 to 130

This sounds good on paper, but good luck passing your custom types to third party libraries. Or even doing something as simple as calculating average age.

You do realize the more powerful the type system is, the more closely it is going to resemble programming language, which means that structures you build on top of it will contain bugs? You're not really solving bugs, you're just pushing them to another layer.


> This sounds good on paper, but good luck passing your custom types to third party libraries. Or even doing something as simple as calculating average age.

Ada solved this in the 80s, it isn't some unresolved field of comp sci. (Java already does this for array bounds!)

> You do realize the more powerful the type system is, the more closely it is going to resemble programming language, which means that structures you build on top of it will contain bugs? You're not really solving bugs, you're just pushing them to another layer.

Unit Tests are no different, except with worse syntax than built in language support.

At least with support in the type system you can turn off range checks for production builds. Meanwhile unit tests only run when invoked, giving less confidence than type checks that undergo simulated usage as part of daily test builds.

There is of course a gradient, JavaScript exists on one side of that gradient (unit tests are needed for everything, starting with what type of object is even being returned), Java exists in the middle, and you have actual type safe languages farther along.

The point I was originally aiming to make is that Java is statically typed, but its type system is not so powerful as to negate the need for unit tests.


> Ada solved this in the 80s, it isn't some unresolved field of comp sci

Every solution comes at a cost. Where is Ada now?

> Unit Tests are no different, except with worse syntax than built in language support.

And I'm not advocating for unit tests, nor do I treat them as replacement for types. The fact that they occasionally overlap doesn't mean they serve same purposes.

> At least with support in the type system you can turn off range checks for production builds

Sounds like dynamic typing with extra steps.


> And I'm not advocating for unit tests, nor do I treat them as replacement for types. The fact that they occasionally overlap doesn't mean they serve same purposes.

So you don't advocate for unit tests, you don't want a powerful type system, what do you want?

> Every solution comes at a cost. Where is Ada now?

A source of features for newer languages, thus is the circle of programming language life.

> Sounds like dynamic typing with extra steps.

Static assertions based on build flags have been around for a very long time, in all sorts of languages. They are a performance/safety trade off.


> So you don't advocate for unit tests, you don't want a powerful type system, what do you want?

"Write tests. Not too many. Mostly integration."

Leave static typing for performance-critical sections.

> They are a performance/safety trade off.

So is dynamic typing. Only performance in this case means developer performance. Turns out most of our software is not landing airplanes, and developer performance is way more important than occasional non-critical bug that affects 5 users and gets fixed within a day.


The biggest draw of microservices to those just adopting them is not scalability or separation of concerns, but independent deployability (move fast and deploy new features in a particular area unencumbered).

Good LUCK getting that property with a monolithic or modular system. QE can never be certain (and let's be honest, they should be skeptical) that something modified in the same codebase as something else does not directly break another unrelated feature entirely. It makes their life very difficult when they can't safely draw lines.

Two different "modules" sharing even a database when they have disparate concerns is just waiting to break.

There's a lot of articles lately dumping on microservices and they're all antiquated. News flash: there is no universal pattern that wins all the time.

Sometimes a monolith is better than modules is better than microservices. If you can't tell which is better or you are convinced one is always better, the problem is with you, not the pattern.

Microservices net you a lot of advantages at the expense of way higher operational complexity. If you don't think that trade off is worth it (totally fair), don't use them.

Since we are talking about middle ground, one I'd like to see one day is a deploy that puts all services in one "pod", so they all talk over Unix sockets and remove the network boundary. This allows you to have one deploy config, specify each version of each service separately, and therefore deploy whenever you want. It doesn't give you the scalability part as much, but you could add the network boundary later.
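
(For what it's worth, the Unix-socket part needs nothing exotic; a minimal client-side sketch in Java 16+, with a made-up socket path and payload:)

    import java.net.StandardProtocolFamily;
    import java.net.UnixDomainSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Path;

    public class LocalCall {
        public static void main(String[] args) throws Exception {
            // each co-located service listens on its own socket file inside the pod
            var address = UnixDomainSocketAddress.of(Path.of("/tmp/orders.sock"));
            try (SocketChannel channel = SocketChannel.open(StandardProtocolFamily.UNIX)) {
                channel.connect(address);
                channel.write(ByteBuffer.wrap("GET /health\n".getBytes(StandardCharsets.UTF_8)));
            }
        }
    }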


This sounds like Bell Labs' Plan9


I am "too young" to get the reference but I looked it up and once again smiled at how little I know :)


As a performance geek, I like the idea of modules because, especially with languages with heavy runtimes like Java and .Net, packing more code into a single process brings with it some non-trivial performance benefits. And, of course, library calls can be 1000x cheaper than network calls! But there are also major downsides:

1. Deployment. Being able to deploy code rapidly and independently is lost when everything ships as a monolith.

2. Isolation. My process is GC-spiraling. Which team's code change is responsible for it? Since the process is shared across teams, a perf bug from one team now impacts many teams.

3. Operational complexity. People working on the system have to deal with the fact that many teams' modules are running in the same service. Debugging and troubleshooting get harder. Logging and telemetry also tend to get more complicated.

4. Dependency coupling. Everyone has to use the exact same versions of everything and everyone has to upgrade in lockstep. You can work around this with module systems that allow dependency isolation, but IMO this tends to lead to its own complexity issues that make it not worthwhile.

5. Module API boundaries. In my experience, developers have an easier time handling service APIs than library APIs. The API surface area is smaller, and it's more obvious that you need to handle backwards compatibility and how. There is also less opportunity to "cheat", or break encapsulation, with service boundaries compared to library boundaries.

In practice, for dividing up code, libraries and modules are the less popular solution for server-side programming compared to services for good reasons. The downsides are not worth the upsides in most cases!


This article is all over the place. The author acknowledges that microservices are about organizational clarity, then writes "In theory, anyway" without elaboration, and then goes on to talk about the latency impact of network-level IPC.

Why do we care about network latency, when we JUST established that microservices are about scaling large development teams? I have no problem with hackers ranting about slow, bloated and messy software architecture...but this is not the focus of discussion as presented in the article.

And then this conclusion:

> The key is to establish that common architectural backplane with well-understood integration and communication conventions, whatever you want or need it to be.

...so, like gRPC over HTTP? Last time I checked, gRPC is pretty well understood from an integration perspective. Much better than Enterprise Java Beans from the last century. Isn't this ironic? And where are the performance considerations for this backplane? Didn't we criticize microservices before because they have substandard performance?


Just saying "gRPC over HTTP" doesn't solve any of the sentence you quoted.

> common architectural backplane with well-understood integration and communication conventions, whatever you want or need it to be

Regardless of the tech used to implement, this paradigm needs to be solved to have a good system. That backplane is not an implementation, but a set of understood guiderails for inter-module communication.

Even with gRPC, both sides need to know what to call and what to provide and expect in response. That's the "conventions" part. Having consistency is more important than the underlying tech. Plain REST over HTTP works just as well as gRPC.


> That backplane is not an implementation, but a set of understood guiderails for inter-module communication.

Then why is the author even discussing microservices in the first place? Following your logic, they are an implementation detail just like modules.


There are two things that people often misunderstand about Microservices - there is no single definition about what they actually are, and -- arguably more importantly -- there exists no single rationale about why you would want to move to Microservice architecture in the first place.

Take for example Gartner's definition:

> A microservice is a service-oriented application component that is tightly scoped, strongly encapsulated, loosely coupled, independently deployable and independently scalable.

That's not too controversial. But... as a team why and when would you want to implement something like this? Again, let's ask Gartner. Here are excerpts from "Should your Team be using Microservice Architectures?":

> In fact, if you aren’t trying to implement a continuous delivery practice, you are better off using a more coarse-grained architectural model — what Gartner calls “Mesh App and Service Architecture” and “miniservices.”

> If your software engineering team has already adopted miniservices and agile DevOps and continuous delivery practices, but you still aren’t able to achieve your software engineering cadence goals, then it may be time to adopt a microservices architecture.

For Gartner, the strength of Microservice Architecture lies in delivery cadence (and it shouldn't even be the first thing you look at to achieve this). For another institution it could be something else. My point is that when people talk about things like Microservices they are often at cross-purposes.


I want modules that are highly decoupled. But it's often more painful than extracting some API from a monolith and defining a message passing strategy, interfaces, etc.

Another way to put it is that teams that share parts of the same codebase introduce low level bugs that affect each other, and most organizations are clueless about preventing it and in some cases do not even detect it.


I've been doing modules for decades.

Part of the reason is that anything that leaves the application package increases the error potential exponentially.

Also, modules "bottleneck" functionality, and allow me to concentrate work into one portion of the codebase.

I'm in the middle of "modularizing" the app I've been developing for some time.

I found that a great deal of functionality was spread throughout the app, as it had been added "incrementally," as we encountered issues and limitations.

The new module refines all that functionality into one discrete codebase. This allows us to be super-flexible with the actual UI (the module is basically the app "engine").

We have a designer, proposing a UX, and I found myself saying "no" too often. These "nos" came from the limitations of the app structure.

I don't like saying "no," but I won't make promises that I can't keep.

BTW: The module encompasses interactions with two different servers. It's just that I wrote those servers.


After scanning 300+ comments, my conclusion is that not only is there no conclusion in this debate, there is no discernible framework that would help generate one.

Most likely the question is not well defined in the first place.


Maybe that's because, AFAIK, the various frameworks out there for building applications are strongly opinionated toward a given paradigm. We have libraries, but we don't really have composable components that can be assembled into arbitrary architectures.

It's like we're stuck assembling applications by stacking pre-existing Lego sets together with home-made, purpose-built, humongously-sized Lego bricks made of cardboard acting as glue.

I've ranted a bit about that here: https://news.ycombinator.com/item?id=34234840


As a senior software engineer, the most tiresome type of software dev to deal with is not the junior developer, it's the highly opinionated intermediate-level dev. They say things like "we'll obviously build the system using modern microservices architecture using node.js" before they even know the requirements.


+1 on this; folks with just enough experience to have strong yet naive opinions are the bane of my existence. It is almost akin to a religious discussion, where folks back up their suppositions with blind faith and defend those views to the death. Sorry if this is not adding a lot to the conversation, you just really struck a nerve!


They’re likely just trying to pad their resume for the next gig.


Reminds me of the time at a previous job where a poorly supervised engineer developed a complex application using LabVIEW (an utterly inappropriate use of the technology) and then took a new job with National Instruments, leaving a shop full of C and Ada programmers to maintain it.


If they wrote that using a bunch of LabVIEW microservices you could replace the services one-by-one with C/Ada until the whole thing was no longer LabVIEW.

This is generally what I think the best part about microservices is: an easy way to improve upon the MVP in any dimension. Re-writing a whole monolith can take forever and probably will introduce a ton of bugs. But incrementally re-writing microservices won't stop feature development and is easy to A/B test for correctness.


For the unfamiliar, National Instruments makes LabVIEW.


> At the heart of microservices, we often find...

> ... the Fallacies of Distributed Computing.

I feel like I’m taking crazy pills, but at least I’m not the only one. I think the only reason this fallacy has survived so long this cycle is because we currently have a generation of network cards that is so fast that processes can’t keep up with them. Which is an architectural problem, possibly at the application layer, the kernel layer, or the motherboard design. Or maybe all three. When that gets fixed there will be a million consultants to show the way to migrate off of microservices because of the 8 Fallacies.


I think a lot of microservices are defined as separate apps and code repositories. This sounds good for initial build and deployment but the long term maintenance is the issue. Developers like the idea of independence of other teams writing parts of the overall solution but that trade-off can mean a lot of overhead in maintaining different code repositories of the same stack.

When a critical vulnerability comes out for whatever language you are using, you now have to patch, test and deploy X apps/repos vs much fewer if they are consolidated repositories written modularly. Same can be said for library/framework upgrades, breaking changes in versions, deprecated features, taking advantage of new features, etc.

Keeping the definition of runtimes as modular as the code can be instrumental in keeping a bunch of related modules/features in one application/repository. One way is with k8s deployments and init params, where the app starts only specific modules, which then lends itself to being scaled differently. I'm sure there are home-grown ways to do this too without k8s.
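
A rough sketch of that idea (the env var and module names are made up): the same binary reads a startup parameter and only boots the modules that deployment is responsible for.

    import java.util.Set;

    public class Main {
        public static void main(String[] args) {
            // e.g. MODULES=billing,notifications java -jar app.jar
            Set<String> enabled = Set.of(
                    System.getenv().getOrDefault("MODULES", "web,billing,notifications").split(","));

            if (enabled.contains("web")) { startHttpServer(); }
            if (enabled.contains("billing")) { startBillingWorkers(); }
            if (enabled.contains("notifications")) { startNotificationConsumer(); }
        }

        // placeholders standing in for each module's entry point
        static void startHttpServer() { /* ... */ }
        static void startBillingWorkers() { /* ... */ }
        static void startNotificationConsumer() { /* ... */ }
    }

Each k8s Deployment (or plain VM) then sets MODULES differently and scales on its own, while the code stays in one repo and one build.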


This article builds up a bunch of straw man arguments based on what other people claimed in blog posts and then attacks them. In the real world, where microservices really made a difference, they enabled being more tolerant of a mix of different quality/maturity/velocity in different parts of the overall system while still keeping things as healthy as possible. If you have a monolithic execution binary that hosts all the logic, then it all has to be of high quality or the entire thing can be taken down by one piece of bad-quality code. Having a mix of different quality/maturity/velocity in different parts of your system is a business reality if your system models a complex, multi-pronged business that's evolving, growing and maturing on different timelines. If this core value prop isn't needed, then you can probably solve your problems without microservices just as well or even better.


I was initially quite enthusiastic about microservices as I saw Unix philosophy ingrained in it. Especially that each service would be lightweight and small. Instead, what I see is each service tending towards more complexity and code mass because people started adding more ideas to them like DDD. So, on top of the network and authentication code that now need to be added to each service, people started defining classes for each domain object, adding validation code and unit tests for them, layering each service like application, infrastructure, domain etc. Now we are building systems that are more complex both as individual services and aggregates. Much of that complexity does not serve any functional purpose and its utility is difficult to measure in other respects.

I'm glad that this post was written so that we can look at widely accepted ideas a little more critically.


I don't think microservices are the answer to everything, but I don't see how monoliths can keep up with developer velocity when an organization reaches thousands of developers.

Monoliths are way slower to deploy than microservices, and when you have hundreds or thousands of changes going out every day, this means lots of changes being bundled together in the same deployment, and as a consequence, lots of bugs. Having to stop a deployment and roll it back every time a defect is sent to production would just make the whole thing completely undeployable.

Microservices have some additional operational overhead, but they do allow much faster deployments and rollbacks without affecting the whole org.

Maybe I am biased, but I would love an explanation from the monolith-evangelist crowd on how to keep a monolith able to deploy multiple times a day and capable of rolling back changes when people are pushing hundreds of PRs every day.


> I don't see how monoliths can keep up with developer velocity when an organization reaches thousands of developers.

Are those thousands of developers working on a single product? If so, then I'd argue that you have way too many developers. At that point, you'd need so many layers of management that the overall vision of what the product is gets lost.


Not necessarily. But if the solution in this case is to start breaking up the monolith into smaller services owned by specific product teams, then you are moving towards microservices.


> but I don't see how monoliths can keep up with developer velocity when an organization reaches thousands of developers.

You're likely right.

I think what most detractors of microservices are pointing to is that most companies don't reach thousands of devs in size. Or even hundreds.


There are so many misconceptions about what microservices are or what problems they are trying to solve. Most people don't even experience the problems (yet or ever) they are meant to solve, and they go straight to micro. 5-person teams making 5 services to power their product :facepalm:. A relatively simple B2B web application without any serious traffic also does not need microservices to handle its load.

People just read up on whatever seems to be the newest, coolest thing. The issue is that microservices articles usually come from FAANG/ex-FAANG. These companies are solving problems that 99% of others do not.

As engineers we should be looking for the most effective solutions to a given business problem. Sadly, I see engineers with senior/staff titles just throwing cool tech terms/libs around. Boring tech club ftw


> Boring tech club ftw

As a CTO, I couldn't agree more. For our internal product, we use 100% boring technologies. The most "modern" you'll find is a React SPA.

I sigh when clients want to go the microservices route for a team of just a few developers. When they want to use NextJs for their tables-and-forms app. When they choose Kubernetes instead of a couple of EC2 instances.

Don't get me wrong, these technologies are great for us because we can charge more for the wasted human time developing these overengineered solutions. But I always, for my peace of mind, try to talk them out of them. Sometimes works, sometimes doesn't, at the end of the day it's their money.


I have done a lot of service development for what would be called microservices.

The article gets it right in my opinion.

1. It has a lot to do with organisational constraints.

2. It has a lot to do with service boundaries. If services are chatty they should be coupled.

3. What a service does must be specified in terms of which data it takes in and what data it outputs. This data can and should be events.

4. Services should rely on and work together through messaging: queues, topics, streams, etc.

5. Services are often data enrichment services, where one service enriches some data based on an event/data (a sketch follows at the end of this list).

6. You never test more than one service at a time.

7. Services should not share code which is volatile or short-lived, i.e. updated frequently.

8. Conquer and divide. Start by developing a small monolith for what you expect could be multiple services. Then divide the code, and divide it so each coming service owns its own implementation, as per not sharing code between them.

9. IaaS is important. You should be able to push-deploy and have a service set up with all of its infrastructure dependencies.

10. Domain boundaries are important. Structure teams around them, each based on a certain capability, e.g. Customers, Bookings, Invoicing. Each team owns a capability and its underlying services.

11. Make it possible for other teams to read all your data. They might need it for something they are solving.

12. Don't use Kubernetes or any other orchestrator unless you can't do what you want with a cloud provider's PaaS. Kubernetes is a beast and will put you to the test.

13. Services will not solve your problems if you do not understand how things communicate, fail and recover.

14. Everything is eventually consistent. The mindset around that will take time to cope with.

A lot more...
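
To make point 5 concrete, a rough sketch assuming Kafka as the messaging layer (topic names and the enrichment step are invented):

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class EnrichmentService {
        public static void main(String[] args) {
            Properties consumerProps = new Properties();
            consumerProps.put("bootstrap.servers", "localhost:9092");
            consumerProps.put("group.id", "booking-enricher");
            consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            Properties producerProps = new Properties();
            producerProps.put("bootstrap.servers", "localhost:9092");
            producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
                 KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
                consumer.subscribe(List.of("bookings.created"));
                while (true) {
                    for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                        // enrich the incoming event (hand-waved here) and publish it as a new event
                        String enriched = record.value() + ",\"region\":\"EU\"";
                        producer.send(new ProducerRecord<>("bookings.enriched", record.key(), enriched));
                    }
                }
            }
        }
    }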


Are there tools for modules similar to what microservices have for observability, discoverability and documentation? I'm thinking something like Backstage [1] that was featured on HN lately, but for modules.

[1]: https://backstage.io/


> At the heart of microservices, we're told we'll find...

Well, with our 20+ product teams all serving different workflows for 3 different user types, separate microservices are doing wonders for us for exactly the things you've listed.

My comment should just stop here to be honest.


In theory microservices are cool. In practice, they're not.

Microservices and modularity are orthogonal; they're not the same thing.

Modularity is a business concept; microservices are an infrastructure concept.

For example, I could have a module A which is deployed as microservices A1 and A2. In this case, A is an almost abstract concept.

And of course, I could deploy all modules A, B, C as 1 big service (monolithic).

Moreover, I could share one microservice X across all modules.

All the confusion about microservices comes from the misconception that microservice = module.

Worse, most of the "expert advice" I've learnt actually ties Domain-Driven Design to microservices. They're not related, again.

Microservices, to me, are about scale. Scaling the infrastructure. Scaling the team (a management concept).


I have to believe these anti-microservices articles tend to be written by people who just don't need microservices, and maybe who also don't have much experience applying them to useful effect. Amazon, cited in the article as an originator of the practice, is a perfect example of where microservices are virtually unavoidable and central to the success of the company. There is no way Amazon could have been built on a small number of monoliths. None.


This article seems to have missed 16 years of event-based architectures, domain-driven design, bounded contexts, CQRS, and pretty much every reason we use microservices.


I'm surprised none of the big 3 cloud providers have come out with a cloud-native language where modules or even classes get deployed as microservices that scale horizontally.

Then you can have a mono repo that deploys to multiple micro services / cloud functions / lambdas as needed depending on code changes and programmers don’t have to worry about RPC or json when communicating between modules and can just call the damn function normally.


There are fundamental problems with distributedness (which may be reduced to time/timeout handling) which make such a task extremely difficult.

Though there were many attempts to do it, I would just mention Erlang and Akka.

The answer to your question is close to the answer for "what's wrong with Akka"


If you want to scale horizontally then why not just run multiple copies of your monolith? That will work just fine unless you're really big, in which case you'll probably want to write your own solution rather than depend on a cloud provider anyway.


Multiple copies but with traffic shaping is one of the simplest things that could work.


The 2 generals problem illustrates the problems with this. You can’t just call a function over an unreliable communication channel and expect it to work the same way.

Solving this in the general case for every function call is difficult. Pure functions are idempotent, so you can retry everything until nothing fails.

But once you add side effects and distributed state, we don’t know how to solve this in a completely generalized and performant way.


I've said this from the start: cloud functions should be a compilation target, not directly developer-facing. Cut out the monorepo middleman.


Isn't that basically how AWS Lambda operates, using JSON?


I didn't get this article. It seems like it contains two things: 1. It was already invented before 2. All the wrong reasons why people decide to use microservices.

But the author clearly avoided the real reasons why you actually need to split stuff into separate services: 1. Some processes shouldn't be mixed in the same runtime. A simple example: batch/streaming vs 'realtime', or important vs not important. 2. Some things need a different stack, runtime or framework, and it's much easier to separate them than to try to make them coexist.

And regarding the 'it was already in the Simpsons' argument, I don't think it should even be considered an argument. If you are old enough to remember EJB, you don't need it explained why it was a bad idea from the start, and why services built on EJB were never scalable or maintainable. So even if EJB claimed to cover the same features as microservices do now, I'm pretty sure EJB won't be the framework of choice for anybody today.

Obviously considering microservices as the only _right_ solution is stupid. But same goes for pretty much any technology out there.


Idk, I feel like one of the benefits of microservices is isolation; not in design or architecture as many comments say, but isolation of one service from the effects of another.

If I run a monolith and one least-used module leaks memory real hard, the entire process crashes even though the most-used/more important modules were fine.

Of course it's possible to run modularised code such that they're sandboxed/resources are controlled - but at that point it's like...isn't this all the same concept? Managed monolith with modules vs microservices on something like k8s.

I feel like rather than microservices or modules or whatever we need a concept for a sliding context, from one function->one group of functions->one feature->one service->dependent services->all services->etc.

With an architecture like that it would surely be possible to run each higher tier of context in any way we wanted; as a monolith, containerised, as lambdas. And working on it would be a matter of isolating yourself to the context required to get your current task done.


If you really require that sort of isolation, then microservices are a bug-ridden, ad hoc implementation of half of Erlang.


On the subject of modules, I often recommend “Composite / Structured Design” and/or “Reliable Software Through Composite Design” by Glenford Myers.

It’s old. The examples are in PL/I. But his framework for identifying “Functional Strength” and Data Coupling is something every developer should have. Keep in mind this is before functional programming and OOP.

I personally think it could be updated around his concepts of data homogeneity. Interfaces and first class functions are structures he didn't have available to him, but don’t require any new categories on his end, which is to say his critique still seems solid.

Overall, most Best Practices stuff seems either derivative of or superfluous to this guy just actually classifying modules by their boundaries and data.

I should note, I haven't audited his design methodologies, which I'm sure are quite dated. His taxonomy concerning modules was enough for me.

"The Art of Software Testing" is another of his. I picked up his whole corpus concerning software on thriftbooks for like $20.


I also realized recently that PHP-FPM can just prefork a number of processes that equals N x (# of cores). And the OS scheduling and waking up threads on I/O is fast enough that you don't really need evented stuff at all. The advantages are that there are no memory leaks, there is isolation, one hung process doesn't bring down the whole server, and you don't have to rewrite your scripts to worry about sharing a process.

Having said that, if you want to eke out another 3x throughput improvement, then by all means, grab your OpenSwoole or ReactPHP or AMPHP and go to town. But PHP already has Fibers, while OpenSwoole still has coroutines. Oh yeah, and the try/catch in OpenSwoole is broken so good luck catching stuff.

https://twitter.com/openswoole/status/1576948742909198337


In my experience of building stuff in a way that I guess you would describe as microservices, but purely by happenstance (i.e. I didn't set out to "do microservices"), at the heart of microservices are queues.

Queues are awesome, just use queues almost all the time and then either build micro services or don’t.


Answer is no.

It is about where the shoe fits. If you become too heavily dependent on modules you risk module incompatibility due to version changes. If you are not the maintainer of your dependent module you hold a lot of risk. You don't get that with microservices.

If you focus too much on microservices you introduce virtualized bloat that adds too much complexity and complexities are bad.

Modules are like someone saying it is great to be monolithic. No one should outright justify an overly complicated application or a monolithic one.

The solution is to build common modules that are maintainable. You follow that up with multi-container pods and have them talk at a low level between each other, as sketched below.

Striking that exact balance is what is needed, not finding odd justifications for failed models. It is about, "What does my application do?" and answering with which design benefits it the most.
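A rough sketch of that pod shape (names and images are made up):

    # one pod, two cooperating containers sharing a network namespace
    apiVersion: v1
    kind: Pod
    metadata:
      name: checkout
    spec:
      containers:
        - name: app
          image: example/checkout-app:1.0
          ports:
            - containerPort: 8080
        - name: pricing
          image: example/pricing-module:1.0
          ports:
            - containerPort: 9090

Because the containers share the pod's network namespace, "app" reaches "pricing" on localhost:9090 with no service discovery or mesh hop in between.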


You want to get away from sweeping generalisations because once the build/test/package/deployment tax of a modules approach bites, you do want to go monorepo microservices - it’s free clawback of time wasted building and deploying all the parts of a huge system that you didn’t change in your PR.


The fundamental generalisation that does hold is that you never want another team on your route to production.


Couldn’t agree more.

The way I usually describe my preferred heuristic to decide between modules and microservices is:

If you need to deploy the different parts of your software individually, and there's an opportunity cost in simply adopting a release train approach, go for microservices.

Otherwise, isolated modules are enough in the vast majority of cases.


The biggest appeal to me for microservices (which might be in the list in terms of "maintainability" but isn't explicitly called out) is that it enforces the modularization. Yes I want modules. But no, I don't have the discipline to actually keep a code base modular. Platforms and languages have evolved for rapid development and convenience, and the modularization they do encourage is driven by realities that aren't architectural, for example compilation units or deployment.

A failed lookup of a function is greeted by "Do you want to import X so you can call foo()?". Having a battery of architectural unit tests or linters ensuring that module foo doesn't use module bar feels like a crutch.
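To be fair, the tests themselves are cheap to write; a minimal sketch using ArchUnit with made-up package names:

    import com.tngtech.archunit.core.domain.JavaClasses;
    import com.tngtech.archunit.core.importer.ClassFileImporter;
    import com.tngtech.archunit.lang.ArchRule;
    import org.junit.jupiter.api.Test;

    import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

    class ModuleBoundaryTest {
        @Test
        void orderingMustNotDependOnBilling() {
            // import the compiled classes once, then assert on the dependency graph
            JavaClasses classes = new ClassFileImporter().importPackages("com.example");
            ArchRule rule = noClasses().that().resideInAPackage("..ordering..")
                    .should().dependOnClassesThat().resideInAPackage("..billing..");
            rule.check(classes);
        }
    }

But it's still opt-in discipline: the rule only exists if someone writes it and keeps it current, which is exactly the crutch feeling.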

Now, it might seem like making microservices just to accomplish modularization is massive overkill and a huge overhead for what should be accomplished at the language level - and you'd be right.

But that leads to the second largest appeal, which is closely related. The one thing that kills software is the big ball of mud where you can't really change that dependency, move to the next platform version or switch a database provider. Even in well-modularized code, you still share dependencies. You build all of it on react, or all the data is using Entity Framework or postgres. Because why not? Why would you want multiple hassles when one hassle is enough? But this really also means that when something is a poor fit for a new module, you shoehorn that module to use whatever all the other modules use (Postgres, Entity Framework, React...). With proper microservices, at least in theory you should be able to use multiple versions of the same frameworks, or different frameworks altogether.

It should also be said that "modules vs microservices" is also a dichotomy that mostly applies in one niche of software development: Web/SaaS development. Everywhere else, they blur into one and the same, but sometimes surfacing e.g. in a desktop app offloading some task to separate processes for stability or resource usage (like a language server in an IDE).


While I mostly agree, I don't think it's so black and white on the enforcement part and I actually think that a lot of recent developments actually put holes into this.

The typical example is when you have to explain that something is the job of both X and Y; usually this means that X and Y are breaking those boundaries. Just make a semi-private (or even public) API only used for that thing and you have a broken boundary again. Or push it onto a message queue, etc.

I think it certainly helps, but then again it doesn't prevent it. With modules you get much the same effect.

So in the end you get more spots where things can go wrong and more operational complexity. If you stick to using them right, I think you can also stick to using modules right, with less complexity, better performance, easier debuggability and fewer moving parts.

Hackers will find a way to do hacky things everywhere. ;)

Also this whole discussion reminds me of Linus discussing how he thinks micro kernels add complexity a very long time ago. Not sure if they should be considered modules or microservices though.

Sharing global state and so on, while maybe it shouldn't be done lightly or without thinking about it, can and does make sense. And in most modern environments it's not like the most quoted issues can happen too easily.

Also I strongly agree with pointing out that this is actually a niche topic. It's mostly big because that niche is where probably the majority of HN and "startup" people spend their time.


Software architecture can be tailored to a specific use case to best fit an application. Rather than strictly align to some theoretical design principle, one can consider the use case and end goals and make the architecture match it.

But in general, just write some damn code. Presumably you have to write code, because building a software engineering department is one of the most difficult things you can do in order to solve a business problem. Even with the smartest engineers in the world (which you don't have), whatever you ship is inevitably going to end up an overly complex, expensive, bug-riddled maintenance nightmare, no matter what you do. Once you've made the decision to write code, just write the damn code, and plan to replace it every 3-5 years, because it probably will be anyway.


Ideally, what I want is the ability to decide later whether a particular call should be local or RPC, and I should not have to care about any of it until I need to start scaling. I should also not need to specify locally what is going on - my reference infrastructure should handle turning the call into a jump-call or a net-call. Ideally, the fiber system that simulates multithreading for me should have an extensibility point that allows my RPC infrastructure to hook into it and "do the right thing" by treating RPC calls as the equivalent of cross-proc co-routines, especially if my OS lets me transparently share memory between two processes: in a way that would let me share pointers and modules across them.

Someday I will find a way to untangle this wishlist enough to turn it into a design.


> What I want is the ability to decide later whether a particular call should be local, or RPC

Using hindsight, most of the systems that pretend that the network part of invoking RPCs is "easy" and "simple" and "local" end up being very complex, slow and error prone themselves.

See DCOM, DCE, CORBA and EJB's RMI.

You need instead an efficient RPC protocol that doesn't hide the fact that it is an RPC protocol - like Cap'n Proto (its "time-traveling RPC" feature is very interesting).


IMO the real problem with microservices comes from dealing with distributed state - going from one data store with usually strong consistency guarantees - to many distributed stores that have no way of ensuring they are in sync is the hard part.

RPC vs local calls is trivial in comparison and you can get that level of transparency out of the box with functional programming - it's just data in data out.


Ideally yes. But there are two major differences between a local and a remote call:

1) A remote call can – inherently – fail. However, some local calls never fail. Either because they are designed to never fail or because you have done the required checks before executing the call. A remote call can fail because the network is unreliable and there is no way around that (?).

2) A remote call – inherently – can be very slow. Again because of the unpredictable network. A local call may be slow as well but usually either because everything is slow or because it just takes a while.

So if you have a call that may or may not be local you still have to treat it like a remote call. Right?

I think having a certain set of calls that may or may not be executed locally is not that bad. Usually it will just be a handful of methods/functions that should get this "function x(executeLocally = false|true)" treatment - which is prob. an acceptable tradeoff.


The cool part about "some local calls never fail" is that, because of scale-out and out-of-order multi-threaded execution, even "local calls" (for some definitions of local) can fail as your preconditions get invalidated in other threads, unless you're employing mutexes/barriers.

Modern computers are a network mesh of small D/G/CPUs, and it's coming back to bite us sometimes.


CORBA made rpc calls opaque, which was nice in theory, but there are faults that need to be handled only for remote calls. It’s a lot of extra code to handle those faults that you don’t need for local functions.


DCOM and CORBA are the two tarpit nightmares that you want to do everything you can never to have anything to do with.


I think dcom does this


The easiest approach I've found to this whole debate: start with a monolith, making notes about where you think a server is most likely to have bottlenecks.

Push the monolith to production, monitoring performance, and if and when performance spikes in an unpleasant way, "offload" the performance intensive work to a separate job server that's vertically scaled (or a series of vertically scaled job servers that reference a synced work queue).

It's simple, predictable, and insanely easy to maintain. Zero dependency on third party nightmare stacks, crazy configs, etc. Works well for 1 developer or several developers.
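For the synced work queue itself, the lowest-dependency option is usually a table in the database you already run; a rough sketch, assuming Postgres (table and column names are made up):

    -- enqueue, inside the monolith's normal transaction
    INSERT INTO jobs (kind, payload, status)
    VALUES ('resize_image', '{"image_id": 42}', 'pending');

    -- claim one job at a time; each job server runs this in a polling loop
    UPDATE jobs
       SET status = 'running', claimed_at = now()
     WHERE id = (SELECT id
                   FROM jobs
                  WHERE status = 'pending'
                  ORDER BY id
                  LIMIT 1
                    FOR UPDATE SKIP LOCKED)
    RETURNING id, kind, payload;

SKIP LOCKED keeps concurrent job servers from grabbing the same row, and because the enqueue rides along in the same transaction as the rest of the write, there's no extra broker to keep in sync.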

A quote I heard recently that I absolutely love (from a DIY construction guy, Jeff Thorman): "everybody wants to solve a $100 problem with a $1000 solution."


This is something folks have been doing long before the microservices hype.

That said, server bottlenecks are not the only thing (micro)services are trying to address.


> This is something folks have been doing long before the microservices hype.

Yes, and that's the point.

> server bottlenecks are not the only thing (micro)services are trying to address.

The only other real advantage is scale of network throughput and isolation from other processes (in case some service is particularly volatile or prone to errors). Even those are a stretch as they're both solved/solvable by modern infra and just isolating code into its own job server (technically a "microservice").


The only good reason to have a micro-service is so you don't have to write and maintain it yourself.


I think microservices are fine for businesses which have a general idea of how 'big' things could get with respect to their platforms. But if you're more in a company that has many one-off programs, scripts, and the like then maybe microservices aren't a thing that you need. Better organization is good, but that shouldn't be just a microservices thing; everyone benefits when you've organized not just individual projects and their source code, but also documented what does what in external documents that can be updated to explain each thing's purpose. There's nothing worse than looking at source code or a program by name only to ask, "what is this thing?"


It's easy to crap on EJB, lots to disagree with, but vendors like WebLogic were trying to do interesting things with them when they were ascending. I recall they had a nifty feature where if you were trying to call a remote EJB and the container knew it was deployed locally, it would automatically do a cheaper local call instead of RMI. It was awkward as hell, but it did do that, and it was faster. J2EE also had the concept of people _roles_ as part of its prescribed SDLC, something we could benefit from exploring, especially the _deployer_ role.

Ideally we could flexibly deploy services/components in the same way as WebLogic EJB. Discovery of where components live could be handled by the container and if services/components were deployed locally to one another, calls would be done locally without hitting the TCP/IP stack. I gather that systems like Kubernetes offer a lot of this kind of deployment flexibility/discovery, but I'd like to see it driven down into the languages/frameworks for maximum payoff.

Also, the right way to do microservices is for services to "own" all their own data and not call downstream services to get what they need. No n+1 problem allowed! This requires "inverting the arrows"/"don't call me, I'll call you" and few organizations have architectures that work that way - hence the fallacies of networked computing reference. Again, the services language/framework needs to prescribe ways of working that seamlessly establish (*and* can periodically/on-demand rebroadcast) data feeds that our upstreams need so they don't need to call us n+1-style.

Microservices are great to see, even with all the problems, they DO solve organizational scaling problems and let teams that hate each other work together productively. But, we have an industry immaturity problem with the architectures and software that is not in any big players' interest in solving because they like renting moar computers on the internet.

I have no actual solutions to offer, and there is no money in tools unless you are lucky and hellbent on succeeding like JetBrains.


Yes, generally that's what we want -- modularity. I think the article touches upon a key truth, that There is Nothing New Under the Sun -- we always deal with complexity and we always find the same ways of structuring our responses to it.

One thing I've observed in a microservice-heavy shop before was that there was the Preferred Language and the Preferred Best Practices and they were the same or very similar across the multiple teams responsible for different things. It led to a curious phenomenon, where despite the architectural modularity, the overall SAAS solution built upon these services felt very monolithic. It seemed counter-productive, because it weakened the motivation to keep separation across boundaries.


100% true for certain classes of problem.

If I want to calculate the price of a stock option, that's an excellent candidate to package into a module rather than to expose as a microservice. Even if I have to support different runtimes as presented in the article, it's trivial.
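To make that concrete: a pure pricing function has no state, no I/O and no timing concerns, so it packages as a library and can be called in-process from anywhere. A rough Black-Scholes sketch (illustrative only, not production numerics):

    public final class BlackScholes {
        private BlackScholes() {}

        /** European call price; a pure function of its inputs, no shared state. */
        public static double callPrice(double spot, double strike, double rate,
                                       double vol, double yearsToExpiry) {
            double d1 = (Math.log(spot / strike) + (rate + 0.5 * vol * vol) * yearsToExpiry)
                    / (vol * Math.sqrt(yearsToExpiry));
            double d2 = d1 - vol * Math.sqrt(yearsToExpiry);
            return spot * cdf(d1) - strike * Math.exp(-rate * yearsToExpiry) * cdf(d2);
        }

        // Abramowitz & Stegun approximation of the standard normal CDF.
        private static double cdf(double x) {
            if (x < 0) return 1.0 - cdf(-x);
            double t = 1.0 / (1.0 + 0.2316419 * x);
            double poly = t * (0.319381530 + t * (-0.356563782
                    + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
            return 1.0 - Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI) * poly;
        }
    }

Nothing in there cares whether it runs in-process or behind a network hop, which is why wrapping it in a service only adds cost.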

A different class of problem that doesn't modularize well, in shared library terms, is something with a significant timing component, or something with transient states. Perhaps I need to ingest some data, wait for some time, then process the data, and then continue. This would likely benefit from being an isolated service, unless all of the other system components have similar infrastructure capabilities for time management and ephemeral storage.


It's not really about what's better, it's about whats been standardized upon.

I'm sure I could gain many of the advantages of microservices through an OSGi monolith, however, an OSGi monolith is not the hot thing of the day, I'm likely to be poorly supported if I go down this route.

Ideally I also want some of my developers to be able to write their server on the Node ecosystem if they so choose, and I don't want updating the language my modules run on (in this case the JVM) to be the biggest pain of the century.

Besides, once my MAU is in the hundreds of thousands, I probably want to scale the different parts of my system independently anyway - so different concerns come in to play.


This article nails it, but I still like microservices because I’ve yet to see a team doing a modular architecture in memory keep from creating a spaghetti mess of interdependence.

Yes, the same often happens in microservices, but the extra complexity of a distributed system provides a slightly stronger nudge to decouple, which means some teams at the margin do achieve something modular.

I’m something of a skeptic on modular monoliths until we as an industry adopt practices that encourage decoupling more than we currently do.

Yes, in theory they’re the same as microservices, without the distributed complexity, but in practice, microservices provide slightly better friction/incentives to decouple.


Modules vs Microservices, January 2023 edition

I've done monoliths and microservices. I've worked in startups, SMEs and at FAANGS. As usual, nothing in this article demonstrates that the person has significant experience of running either in production. They may have experience of failure.

In my experience, microservices are simply one possible scaling model for modules. Another way to scale a monolith is to just make the server bigger: the One Giant Server model.

If you have a fairly well defined product, that needs a small (<30) number of engineers, then the One Giant Server model might be best for you. If you have a wide feature set, that requires >50 engineers, then microservices is probably the way to go.

There is no noticeable transition from a well implemented monolith with a small team into a well implemented Giant Server with a small team. Possibly some engineers are worrying about cold start times for 1TB of RAM, but that's something that can happen well ahead of time, and hardware qualification is something one needs to do for microservices too. Some of the best examples of Giant Servers are developed by small teams of very well paid developers.

The transition from a monolith to a set of microservices, however is a very different affair. Unfortunately, the kind of projects that need to go microservices are often in a very poor state. Many such adventures one reads about are having to go to microservices because they've already gone to One Giant Server and those are now unable to handle the load. Usually these stories are accompanied by a history of blog posts about moving fast and breaking things, yolo, or whatever is cool. The transition to microservices is difficult because there are no meaningful modules: no modules, or modules that all use each other's classes and functions.

I don't believe that, once a particular scale is reached, microservices are a choice. They are either required or they are not. Either you can scale successfully with One Giant Server, or you can't.

The problem is that below a certain scale, microservices are a drag. And without microservices, it's very easy for inexperienced teams to fail to keep module discipline.


I am a front end dev, so microservice architecture is not something I am super familiar with in my day-to-day, but I occasionally do work in our back end project, which is a service-oriented Java project. The project is broken down into different services, but they are aggregated in "parent" projects, where the parent declares the modules in the pom.xml (in the "modules" xml declaration).

I like that architecture - the services are abstracted with clearly defined boundaries and they are easy to navigate / discover. Not sure if Java modules satisfy the concerns of the author or other HN users, but I liked it.
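For anyone unfamiliar, that aggregation looks roughly like this in the parent pom.xml (module names are illustrative):

    <!-- parent pom.xml: one reactor build over all the service modules -->
    <project xmlns="http://maven.apache.org/POM/4.0.0">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.example</groupId>
      <artifactId>backend-parent</artifactId>
      <version>1.0.0</version>
      <packaging>pom</packaging>

      <modules>
        <module>billing-service</module>
        <module>inventory-service</module>
        <module>shared-api</module>
      </modules>
    </project>

Each child keeps its own sources and dependencies, but a single build at the parent compiles and tests them all with consistent versions, which is a lot of the discoverability being described.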


A benefit of microservices is that you can use specialised languages for different parts of the system. The core of your system might be written in Java but you might want to use Python for an ML heavy module. Nonetheless, I tend to agree that in an organisation with 100 microservices, probably the ideal way would be to have 4 or 5, based on exceptions where you have modules that need a specific hardware profile (e.g. a lot of RAM) or a different programming language than your core system. Everything else could go into modules in a megaservice.


While we are on the topic, I would like to point out that the decoupled nature of microservices is governed by queuing theory. Microservices are especially susceptible to positive feedback loops and cascading failures.

Going back to systems thinking, flow control (concurrency and rate limiting) and API scheduling (weighted fair queuing) are needed to make these architectures work at any scale. Open source projects such as Aperture[0] can help tackle some of these issues.

[0] https://github.com/fluxninja/aperture
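The flow-control half doesn't need heavy machinery to get started; a minimal concurrency-limiting sketch in plain Java (the permit count and timeout are arbitrary illustrations):

    import java.util.concurrent.Semaphore;
    import java.util.concurrent.TimeUnit;
    import java.util.function.Supplier;

    final class ConcurrencyLimiter {
        private final Semaphore permits;

        ConcurrencyLimiter(int maxInFlight) {
            this.permits = new Semaphore(maxInFlight);
        }

        /** Runs the call if a permit frees up quickly; otherwise sheds load instead of queueing forever. */
        <T> T callOrShed(Supplier<T> call) throws InterruptedException {
            if (!permits.tryAcquire(50, TimeUnit.MILLISECONDS)) {
                // failing fast here is what breaks the positive feedback loop
                throw new IllegalStateException("overloaded, shedding request");
            }
            try {
                return call.get();
            } finally {
                permits.release();
            }
        }
    }

Weighted fair queuing needs more than this, of course, but bounding in-flight work and shedding the rest is the first line of defence against cascading failure.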


There is a time and place, but as soon as you have a few teams that build things differently, or someone wants to upgrade how things are done on a very big project, you will wish you used services instead of modules.


We've done the round trip of splitting up a monolith into microservices and then going back the other way. Network API overhead being the biggest reason. This kind of hell is totally unnecessary until it absolutely is (i.e. hard information theory requirements demand you spread your app across more than 1 physical computer).

Monolith is better in almost every way. The only thing that still bothers us is the build/iteration time. We could resolve this by breaking our gigantic dll into smaller ones that can be built more incrementally. This is on my list for 2023.


The worst software I have ever seen in my 30+ years career is a 20 years old micro-services system. Basically the worst spaghetti code you can imagine, distributed over 150+ micro-services.

I have worked on a monolith that solved the exact same problem. And it was straightforward to maintain and upgrade.

I feel sorry for future developers who will have to take over and maintain micro-services systems created today. Teams that can’t create maintainable, well designed monoliths, will create an even bigger cluster f** using micro-services.


Bring back the monoliths! Divided up into modules and each of those modules being developed by another team, but essentially still the same deliverable (binary).

You only need to agree on an API between the modules and you're good to go!

Microservices suck dick and I hate the IT industry for jumping on this bandwagon (hype) without thoroughly discussing the benefits and drawbacks of the method. Debugging in itself is a pain with Microservices. Developing is a pain since you need n binaries running in your development environment.


You want Elixir and Erlang/OTP process trees, not Microservices.


Which is the actor model with supervision hierarchy (for clarity). I happen to agree, the actor model is the best approach to writing micro-services in my humble opinion. I would still call them 'micro services' though. Has the term 'micro-services' been overly constrained to RESTful only APIs? If so that would be a shame.


As a general rule of thumb, fully explore and exhaust your "monolith" options before you switch to Microservices. They quite often create more problems than they solve.


You don’t have to necessarily decide one way or the other. I have systems where the decision to put the module in the same process or to call it via RPC is done at runtime. Java’s dynamic proxies help with this, but it can be done in any language. The only downside is that one has to think about the size of messages crossing the API and that they need to be immutable values, not references to something in the process.
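A minimal sketch of that shape with java.lang.reflect.Proxy; the RemoteInvoker here is a stand-in for whatever RPC transport is actually in use, not a specific library:

    import java.lang.reflect.Method;
    import java.lang.reflect.Proxy;

    public final class ServiceBinder {

        /** Stand-in for the real RPC transport (hypothetical). */
        public interface RemoteInvoker {
            Object call(Method method, Object[] args) throws Exception;
        }

        /** Hands back either the in-process implementation or a proxy that routes every call over RPC. */
        @SuppressWarnings("unchecked")
        public static <T> T bind(Class<T> api, T localImpl, RemoteInvoker rpc, boolean remote) {
            if (!remote) {
                return localImpl;  // plain in-process dispatch, no serialization
            }
            return (T) Proxy.newProxyInstance(
                    api.getClassLoader(),
                    new Class<?>[] { api },
                    (proxy, method, args) -> rpc.call(method, args));
        }
    }

Callers only ever see the interface, so whether a given binding is in-process or remote becomes a wiring decision, and since arguments may cross a process boundary they are treated as immutable values either way.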


There is another downside, much more problematic in my experience - the failure modes of a distributed system (even over multiple processes in the same machine) are very different. The other process might be killed by OOM manager, user initiated kill signal, or a bug. All of a sudden, asking an object for its string identifier xyz.Name() becomes a possibly failing operation even though it succeeded a microsecond ago.


Yes, there are downsides to distributed systems. In some cases, they are still necessary. This approach allows us to make the decision to be distributed at runtime instead of at design time.

For our services, there is never a single point of failure. We use Resilience4j for retries, caching inside the dynamic proxy handler (using annotations to describe what is cacheable), and annotations to describe what can be done asynchronously.

In the cases where we use these, the caller is aware (via the interface documentation) that this is a distributed system where sometimes invocations happen really fast and reliably.


I was referring to your "only downside" statement (size and mutability of messages). No, that's far from the only downside, and in my experience, not even the most important one.

> In the cases where we use these, the caller is aware (via the interface documentation) that this is a distributed system where sometimes invocations happen really fast and reliably.

This is at odds with

> I have systems where the decision to put the module in the same process or to call it via RPC is done at runtime.

Either it's the caller's choice (based on the documentation), or it's decided at runtime, but it can't be both and be equally reliable.


I meant "only downside" in terms of how to design the API parameters and return values.

I did not say it is the caller's choice nor that it is equally reliable. It is a system composition decision. It is a system that is sometimes distributed and treated as such from a reliability point of view. Sometimes the actual implementation is more reliable and faster.


The best experience I've had with modular code in the Java space was OSGi. Very cool to be able to upload a new jar and have it start working just like that. Microservices and EDA... not really a fan. Yes you can release faster, but operationalizing is a massive nightmare to me. Of course one size doesn't fit all, but I'm pretty convinced this is only a popular solution for now.


Nah, I want micro services. Not everyone does however.


The original hard limitation was that DBMSes did not scale horizontally. This is no longer the case, so for 90+% of use cases a monolith is fine.


Right.

We delivered many talks on that subject and implemented an ultimate tool for that: https://github.com/7mind/izumi (the links to the talks are in the readme).

The library is for Scala, though all the principles may be reused in virtually any environment.

One of the notable mainstream (but dated) approaches is OSGi.


I've sadly never seen micros achieve the purported benefits. All I've seen is problems. Monolith, or a set of distinct systems which are owned by teams with little to no blur (just agreed contracts), anything else gets painfully messy and you'd have been better with code modules.


Microservices might be a hack to gain modularity when nothing else works.

https://michaelfeathers.silvrback.com/microservices-and-the-...


Microservices are a different perspective than modules. There is a business perspective, referencing DDD; there is a data perspective; there is also an operations and maintenance perspective; and there is also a technical architecture perspective.


If the same thing is said about A and B, it is not guaranteed that A=B, unless that thing specifies equality. Anyway I agree with the article that microservices are basically modules. They extend the domain of modules. They're usually overhyped.


Transaction boundaries are a critical aspect of a system.

I've often noticed that these boundaries are not considered when carving out microservices.

Subsequently, workarounds are put in place that tend to be complicated as they attempt to implement two phase commits.


ah come on, there is a reason why we start most estimates with "it depends".

It's on the same page as "yes, we could have written this in assembler better" or "this could simply be a daemon, why is it a container?"

As if an agile, gitops based, rootlessly built, microservice oriented, worldwide clustered app will magically solve all your problems :D

If i learned anything it's to expect problems and build a stack that is dynamic enough to react. And that any modern stack includes the people managing it just as much as the code.

But yes, back when ASP.NET MVC came out I too wanted to rebuild the world using C# modules.


This is the usual: "don't use a one size fits all solution, sometimes microservices are good, sometimes they're bad. Just be smart and think about why you're doing things before doing them".


The maintenance cost is a debt that must be paid by all means, and microservices may be the cleanest solution if there are teams that develop businesses completely independent of each other.


Unfortunately, people keep forgetting that microservices also require organizational change. Otherwise, it's just a heavy overhead.



You know what's missing from all microservices discussions? An actual definition. At what point does a microservice stop being micro? 5kloc of python equivalent? 10k? 25k?


Wear shoes that fit; not shoes you think you'll grow into.


Shoes you’ll grow into is advice for children.

Mastery is substantially about figuring out what rules of thumb and aphorisms are in place to keep beginners and idiots from hurting themselves or each other, and which ones are universal (including some about not hurting yourself, eg gun safety, sharps safety, nuclear safety).


Microservices are an optimization. Do not prematurely optimize.


How about "Architecture decisions based on the needs of our company, and by extension our software, not blog posts from Hacker News"


What I want is modules that I can instantly convert to services without changing any code using the module.

And this is how I work all the time.


I usually say that microservices are a good concept. Do everything to enable the split of your monolith, then ..don't.


Layers, Not Modules, Not Microservices. Like donkeys, onions, caching, networking, my feelings and perceptions.


Modules can not constrain resource boundaries, microservices can. This is often overlooked.


Monoliths r hard. Microservices r hard. Pick your poison and get good.


Perhaps we should coin the phrase 'embedded microservices'.


I've been thinking for a while about how to architect systems and I wonder if perhaps we could generalize something like Fuchsia's device driver stack design [1] for arbitrary systems and eliminate this monolithic vs microservice applications debate altogether.

In Fuchsia, the device driver stack can be roughly split into three layers:

* Drivers, which are components (~ libraries with added metadata) that both ingest and expose capabilities,

* A capability-oriented IPC layer that works both inside and across processes,

* Driver hosts, which are processes that host driver instances.

The system then has the mechanism to realize a device driver graph by creating device driver instances and connecting them together through the IPC layer. What is interesting however is that there's also a policy that describes how the system should create boundaries between device driver instances [2].

For example, the system could have a policy where everything is instantiated inside the same driver host to maximize performance by eliminating inter-process communication and context switches, or where every device driver is instantiated into its own dedicated driver host to increase security through process isolation, or some middle ground compromise depending on security vs performance concerns.

For me, it feels like the Docker-like containerization paradigm is essentially an extension of good old user processes and IPC that stops at the process boundary, without any concern about what's going on inside it. It's like stacking premade Lego sets together into an application. What if we could start from raw Lego bricks instead and let an external policy dictate how to assemble them at run-time into a monolithic application running on a single server, micro-services distributed across the world or whatever hybrid architecture we want, with these bricks being none the wiser?

Heck, if we decomposed operating systems into those bricks, we could even imagine policies that would also enable composing from the ground up, with applications sitting on top of unikernels, microkernels or whatever hybrid we desire...

[1] https://fuchsia.dev/fuchsia-src/development/drivers/concepts...

[2] https://fuchsia.dev/fuchsia-src/development/drivers/concepts...


How do I horizontally scale my modules across multiple machines? How do I release a new version of my module without waiting for hundreds of other teams to all fix their bugs? .99^100^365 is a very small number.


You scale the component containing your module to multiple nodes, same with microservices, same with monoliths. The only reason it might be hard is if some other module in the component is aggressively preallocating resources even when lightly used, and that is a problem to be solved by itself, orthogonal to the deployment strategy.

With a good branching strategy.


There will always be bugs in production. Achieving perfection is not something you should require.


Not every company can afford to run crappy code in production. PayPal and banks, for example. Being able to quickly roll back changes while still keeping developer velocity and moving forward is important, and it is very difficult when thousands of changes are going out every day in the same monolith.


You can forward fix instead of rolling back.


But not all forward fixes are quick to write or even locate. Also, monoliths take much longer to go through CI (build + all the tests), and deployment tends to be much slower.

In cases when a rollback is necessary, you'll often find yourself in a cycle where a breakage goes out, the deployment is rolled back, a patch is applied, a new deployment goes out and a new breakage happens, the deployment is rolled back, and so on. I am exaggerating how often this happens, but it does happen when you have thousands of engineers, and it happened often enough at my workplace that developers were unhappy and management pushed for breaking up the monolith.


Microservices are one of the many AbstractAdapterVistorFactories of this generation of programmers.

Just because you can, doesn't mean you should.


So the time has come that people FINALLY see this! Modules! Inside! A! Monolith! Monolith is dead! Long live the monolith!


Conway's law



