Ask HN: Monorepo vs. Multi-Repo
18 points by johnnypangs on Dec 14, 2021 | 17 comments
I’m working at a medium-sized organization (about 30 developers) inside a large company, and we’re considering moving to a microfrontend implementation. We are debating between a monorepo and a multi-repo strategy, and I wanted some advice (and maybe some anecdotes) from people who have gone through this. Thanks!



I strongly prefer monorepo setups. We used to have a multi-repo setup but ended up transitioning to a monorepo and have been extremely happy with our decision.

Multi-repo setups induce a lot of overhead, especially if you don't already have people on your team who know how to manage them. Monorepo setups are just /easier/ on average. To illustrate, here's a simple situation you might run into: you're attempting to upgrade two software components at the same time. For a multi-repo: you open up PRs with the updates for each repo, great. Now you want to test that the two PRs work correctly together. So you might have a third repository with integration tests that can stand up both components with references to your other repositories via submodules. You create a branch on your integration testing repository and check out the appropriate branches of each submodule. You'll also need to keep these submodules updated whenever you update the corresponding PRs. If both updates need to be atomic then you'll need something to make sure both PRs are merged before any releases get cut. You'll have to start throwing some serious automation on top of all of this or it'll become too annoying to manage. For a monorepo: you create one PR with changes to both components. All of your tests run within the monorepo.
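
To make that concrete, the multi-repo coordination step looks roughly like this (repo, path, and branch names are made up for illustration):

    # multi-repo: point the integration repo's submodules at both PR branches
    cd integration-tests
    git checkout -b upgrade-both
    (cd component-a && git fetch origin && git checkout origin/feature/upgrade-a)
    (cd component-b && git fetch origin && git checkout origin/feature/upgrade-b)
    git add component-a component-b
    git commit -m "Point submodules at both upgrade PRs"
    # ...and repeat the submodule bump every time either PR changes

    # monorepo: one branch, one PR, every test sees both changes at once
    git checkout -b upgrade-both
    git commit -am "Upgrade components A and B together"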


I've experienced both, and I prefer the mono repo.

Our challenges with multiple repos mostly revolved around builds and orchestration. We had to apply all build/deploy changes to all repos, and that increased the chance of doing some small thing wrong. Finding what exactly is wrong with one repo that should be the same as all the others was like one of those "find the differences" pictures. Really annoying.

This is more microservices-related than multi-repo-related, but making sure all the different services were released in sync was hard and annoying and often caused issues. E.g. a specific API was updated and released, but then a consumer had to be rolled back, and the rolled-back version wasn't compatible with the new API, so neither the current version nor the rollback was an option for that consumer. Rolling back the API instead would require rolling back all its consumers, but crap, one of the consumers had applied a big migration to our core database and rolling that back would take forever. And so on.

Just tons of little edge cases that went wrong at the worst time because it was so hard to foresee all the issues.

Monorepo and monolith are so comfy. Want to share code? Move it up one or two directories and import from there. No issues with two bundled React versions in two builds. Easy to refer to code from other teams, and never an issue that someone forgot to add you to that one repo almost no one uses but that you need to commit to during firefighting.
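
As a toy illustration of that "move it up and import it" flow (paths and names here are made up):

    repo/
      packages/shared/formatPrice.ts   <- promoted out of one app
      apps/checkout/src/cart.ts
      apps/search/src/results.ts

    // in apps/checkout/src/cart.ts
    import { formatPrice } from "../../../packages/shared/formatPrice";

Both apps now import the same file at the same commit, so there's no version skew to manage.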

I'm not saying multi repos/micro services can't work, but it's hard - you need strong processes that prevent people from being lazy, you need monitoring and management well defined, you need extra tooling that's aware of the repo structure, you need a strong story around migrations, and so much more.

I currently work with a monorepo monolith that over a thousand devs contribute to daily and it's kinda comfy. It feels much easier to deal with too many devs in one repo (primarily via strong compartmentalization) than to deal with too many repos/services.


I will probably get whipped for this, but what about submodules?

If I am working on a bigger project with multiple smaller projects I create a folder structure locally anyway!

I can commit them too and when pulling I can just choose to only pull the one submodule instead of all of them?

Working on them is essentially the same, you just need to update the main repo once in a while or run git pull independently.
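
For reference, the day-to-day commands for that workflow are roughly (repo, submodule, and branch names are placeholders):

    # clone the umbrella repo but only pull the submodule you care about
    git clone https://example.com/org/main-repo.git
    cd main-repo
    git submodule update --init app-frontend

    # work inside the submodule like a normal repo, then record the new
    # commit pointer in the umbrella repo once in a while
    (cd app-frontend && git checkout main && git pull)
    git add app-frontend
    git commit -m "Bump app-frontend submodule"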


You had 1 problem: Too many git repos. Now you have 2 problems: Too many git repos + a monorepo on top of all those repos.

We tried this where I work to solve the issue of having 10x more git repos than people on the team. We got rid of the submodules as fast as we introduced them, they solved nothing and only caused more headaches.

If you want to switch to a monorepo while retaining the history of the multi-repos, use git subtree. You want to get rid of the other repos, not keep them around.
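
A minimal sketch of the subtree route, with hypothetical repo names and paths:

    # run inside the new monorepo, once per old repo
    git subtree add --prefix=services/billing \
        https://example.com/org/billing.git main

    # billing's full history is now part of the monorepo,
    # so the old repo can be archived after the migration

Unlike a submodule, there is no separate repo left over to keep in sync afterwards.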


you could create submodules of submodules to make an everlasting chain too :D

tbh I believe this is an organisational problem and the usefulness of the monorepo just has to be questioned in this case. If you had, let's say, 30 devs in your org, there would have been 300 repositories. Since they are only containers for something, the question is: what does each repo actually represent?

What we are trying to do is organise things that belong together, therefore I suggest a domain-driven approach where stuff that belongs together is simply grouped together. Kind of like what a workspace ends up being anyway.

eg:

    Org
      Infra
        DB-Servers
        App-Servers
        xyz-Servers
      App1
        x
        x
      App2
      etc.

Now the question I would like to throw into the room is: if you need all repositories at the same time in order to build the software in front of you, why has it not been a monorepo to begin with? Subtrees do seem to be the answer here, I completely agree. At least this way you have -some- possibility to work on a subset of the code.

But if you use libraries/modules/packages to build the product, no matter what language, testing and integration should be done at the package level, at least at first anyway, right?

So it's not really a tech problem, but an organisational one.

What submodules give you is a repository that acts kind of like a readme, and you can even deploy documents alongside those submodules easily. By adding and removing other repositories you also get the possibility to fade out tech you are no longer using, and with things like background CI workers you can even keep it nicely updated. Basically built by tech to be used by humans.

edit: I forgot to mention that if one of those repos is on GitHub, it for instance does not need to be cloned locally but can simply be browsed with a mouse, and in the end you only clone the repository you actually want to work on.


All very true, I'm not suggesting a monorepo for the whole org (that's dumb) but per project.

If you have libraries shared across different applications, treat those libraries as fully independent products. No different from something you'd download from NPM.

Anything else is too tightly integrated, and if it is tightly integrated, you want a monorepo, not a repo with submodules.

edit: At least for C++ I typically use submodules to store libraries from their own git repos, but these are very carefully versioned. If there's a good package management solution for the language, this is typically not necessary.
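
For the C++ case, "carefully versioned" mostly just means pinning each submodule at a release tag, something like (library name and tag hypothetical):

    git submodule add https://example.com/libs/foo.git third_party/foo
    (cd third_party/foo && git checkout v1.2.3)
    git add third_party/foo
    git commit -m "Pin third_party/foo at v1.2.3"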


Git submodules are only a viable tool for truly external dependencies that need to be built from source, in which case adding a submodule and a special case for your configure/make scripts is one of the easiest ways to add a dependency to your project.

If you're using them for dependency management of internal projects you're going to have a bad time, especially as things get more complex. The only advantage is if you need to keep particular components open source. Otherwise just use a monorepo.


You will not have the developer tooling and know-how to pull off a good monorepo workflow. Avoid it. I can assume this because, well, if you have to ask then the answer is a solid no, don’t do it.

Nothing is worse than a half-baked monorepo solution.


I don't think the suggestion was to invent this from scratch; there are many OSS tools designed for this (Yarn workspaces, Nx, Bit, Rush... the list goes on).
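
For example, with Yarn workspaces the root package.json is enough to get the basic layout going (the globs are placeholders):

    {
      "private": true,
      "workspaces": ["apps/*", "packages/*"]
    }

Each app and shared package keeps its own package.json underneath, and the tools above handle linking, hoisting, and running tasks across them.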


Philosophically I think the monorepo is better. However, it is my experience that most commonly available dev tooling assumes multi-repo setups, and that ends up requiring you to write a lot more tooling and glue for CI/CD than you might want to be doing.

Monorepos can encourage a lot of good practices and provide some pretty large benefits, which is why many large engineering orgs gravitate to them. But most of those orgs also have the capacity to absorb the tooling costs they bring.


Both approaches have their good and bad sides. There is, however, a way to combine the advantages of both: keeping a monorepo synchronized with multiple individual repositories via Git X-Modules (https://gitmodules.com).

I am a part of the team behind it, so AMA about the tool :-)


I have the impression the monorepo solves the problems which arise from a distributed monolith. Any thoughts on this?


I think a better way to look at it is that it does not introduce those issues in the first place.

It might seem like a nitpick, but incidental complexity, like solving all the issues surrounding multi-repos, is usually introduced accidentally by not fully appreciating the tooling choices you have made. It's how simple products become overengineered nightmares.

Incidental complexity is any problem not directly related to solving your core business problems.


It solves the problem of code change visibility and synchronicity. Organizational architecture is not the same as software architecture regardless of how close they may look at times.


>microfrontend implementation

What does this mean?

> We are debating between a monorepo and a multi-repo strategy

Why? What's the debate like?


I'm the original poster. Microfrontends basically split up a frontend into individually deployable parts, much like microservices. We've decided to do this because of the number and size of the teams working on our frontend.

For a little more on microfrontends, this is a good description of the idea: https://micro-frontends.org/

As for the debate: we have a monorepo/microfrontend setup now, it is quite a bit of work to maintain, and certain parts of it are starting to show the lack of resources put into it. We can't easily deploy parts individually, our tests are very slow, and crucially our organization thinks there is significant risk in a deployment, so they have set up a very complicated and long QA process for us to get deployments to production.

Realistically I think it is an organizational issue. I've been leaning towards multi-repo because of the restricted resources, since it would be easier to put some of this work on the teams building the individual parts themselves.


This is a communication problem, not a technology problem. Either could work or fail. Trade-offs will have to be made.

Complexity doesn't go away, it's moved to another layer :)



