Let's face it: monorepos are just huge monoliths. If you want modular software, you WILL have to pay the price of fragmentation.
Someone needs to think about the boundaries between different parts of a system. Whether those boundaries are defined by functions, classes, packages or services doesn't matter that much. Yes, it is always a pain.
Shipping the entire forest of modules in a single repo seems like a good compromise. It stops being a good compromise when you start creating a complicated network of soft dependencies between those modules. And when you have everything in a single repository, it's hard NOT to do that.
By the way, there's nothing wrong with monoliths. They're just better if you design them as such. There's nothing wrong with microservices as well. It shouldn't be a surprise though that neither of them are actually silver bullets.
I understand why these solutions were created. I see it as an ephemeral thing. Once someone figures out the tooling and practices to automate all of that painful management locally, developers will always prefer that. Then when it's fast and easy, we'll break it once again (like we did with classes, packages, containers, etc).
Monorepos are a tooling nightmare. Monorepos are a bandwidth black hole.
Monorepos make some things possible that are just not possible or very very hard when you have multiple independent repos that are built and tested independently.
Namely,
a) they allow you to test the effects of a change on all the components that depend on you, before you merge your change. This reduces the noise caused by regressions (API or behaviour) introduced in one component that is depended on by many consumers.
b) they are a practical way to ensure that all components have up-to-date internal dependencies: by placing the burden of API and behaviour breakage on the author of the change, you don't end up with hundreds of teams each struggling to keep up with dependencies that keep breaking their builds on every update, and consequently hating the teams that release those changes.
None of these things is a big deal unless you're a huge company with hundreds of teams.
I think in theory some tooling and workflow could provide all or most of the benefits of monorepos without the downsides.
Until then, monorepos are likely going to be a bad choice for small companies.
I agree with the other commenter - in my view, a monorepo is the _best_ choice for a small company. I guess this depends on what tooling is available for your language / ecosystem of choice though. In my experience of TypeScript and Java with monorepos, you definitely need to know how to configure the tooling properly (which is certainly "overhead"), but it massively reduces the maintenance cost and increases the consistency of your tooling config. Spreading out over loads of repos means you need to share artefacts, which means package managers and package manager hosts, and a whole suite of release CI/CD which gets out of sync almost immediately.
It's also getting a lot better: Gradle works amazingly well for a monorepo even with dozens of developers committing to it every day, thanks to shared caching, and Nx/Turborepo/others are making the story for front-end/TS much better too.
The biggest issue is that every CI provider speaks with "repo=project" language, so your way forward with monorepos is using stuff like Bazel.
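To make the "repo=project" friction concrete, the usual workaround short of Bazel is per-project path filtering; a hypothetical GitHub Actions workflow (service and path names invented) might look like:

```yaml
# Hypothetical per-service pipeline: runs only when files under the
# service's directory (or shared libs it uses) change.
name: billing-ci
on:
  push:
    paths:
      - "services/billing/**"
      - "libs/common/**"   # hand-maintained dependency list -- drifts easily
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make -C services/billing test
```

The catch is that those path lists are maintained by hand and drift out of sync with the real dependency graph, which is exactly the bookkeeping Bazel automates.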
Bazel is extremely cool! But you end up with, like, a handful of people on the team who can write Bazel and everyone else cargo-culting their way through it.
There are a lot of JS "monorepo management" tools but they all seem concerned about the release phase for a lot of libraries that really should just be one library instead of 300 npm packages or whatever.
> Bazel is extremely cool! But you end up with, like, a handful of people on the team who can write Bazel and everyone else cargo-culting their way through it.
It's not that difficult, and the docs are outstanding. It can and will be worse with your own Bash-isms, which likely won't have any docs at all.
I think Bazel is doing a lot of good stuff... but I think it suffers a bit from the same thing as Angular, where it makes a lot of its own terminology.
There is another aspect where if you are using a language like Python or JS then you have to kind of swim upstream to get an existing project onto Bazel. Far from impossible but if you just look at the default Bazel stuff without pulling in third-party libs it's pretty tedious to get a project with a good amount of dependencies working.
>Can you explain how a monorepo for microservices is a tooling nightmare and bandwidth sink?
bandwidth:
I meant "bandwidth" literally. When your codebase becomes huge, your VCS repo size becomes enormous and it gets harder and harder to keep a full checkout on all the development machines, especially if they are over the WAN (e.g. at home on your laptop).
This has fuelled solutions like sparse checkouts (e.g. Microsoft's VFS for Git, now Scalar) and remote development (like TFA, but also Google's Cider and srcfs, etc.).
tooling:
Naïve monorepo tooling (which I've seen in various companies I worked for) simply performs a full build of the whole monorepo for each CI execution. At first this is just fine, since you can parallelize builds and call it a day; but after a while the builds just don't scale anymore, flaky tests become an increasing frustration, etc.
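The alternative to full rebuilds is change-based test selection: compute the set of modules transitively affected by a commit and build only those. A minimal sketch of that graph walk (the module names are made up for illustration):

```python
from collections import defaultdict

# Toy dependency graph: module -> modules it depends on.
deps = {
    "app":     ["api", "ui"],
    "api":     ["core"],
    "ui":      ["core"],
    "core":    [],
    "scripts": [],
}

def affected(changed, deps):
    """Return every module that transitively depends on a changed module."""
    # Invert the graph: module -> modules that depend on it directly.
    rdeps = defaultdict(set)
    for mod, ds in deps.items():
        for d in ds:
            rdeps[d].add(mod)
    todo, seen = list(changed), set(changed)
    while todo:
        mod = todo.pop()
        for dependent in rdeps[mod] - seen:
            seen.add(dependent)
            todo.append(dependent)
    return seen

print(sorted(affected({"core"}, deps)))     # ['api', 'app', 'core', 'ui']
print(sorted(affected({"scripts"}, deps)))  # ['scripts'] -- nothing else to rebuild
```

This is roughly what `bazel query "rdeps(...)"` computes from BUILD files, except Bazel derives the graph from declared dependencies instead of a hand-written dict.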
The tooling that can help scale large monorepos does exist, but it requires buy-in and comes with its own learning curve and tradeoffs. One well-known such tool is Bazel (https://bazel.build), with its remote builds and remote caches. These tools are hard to set up, although the folks at https://www.buildbuddy.io/ can help smaller startups by offering a managed service.
(Again, I'm talking about really large monorepos. A monorepo which includes a dozen or so modules, for which you can easily perform a full re-build on a single CI worker and on your laptop is not the kind of repo that creates tooling nightmares.)
Thank you. Monorepos of _that_ size are definitely far more complex and come with the issues you mention. Although, I feel like you'll only hit that wall once you are _very_ large as a company, and at that point you'll also have the resources to climb it.
For small-to-medium scale, a monorepo has been a blessing after dealing with multi-repo systems for years.
That said, I cannot pretend I don't see the problems with suboptimal tooling when working in a medium-size codebase. "Big", "medium" and "small" are quite subjective things.
At $work we have a monorepo whose git repo grew to 1GB, and where CI turn-around is so slow, and glitches so frequent, that it often takes hours or even days to land some code to prod. Developers instinctively react by making bigger and bigger changes, because the very thought of going through the PR/review/CI/merge cycle once again terrifies them. It's all compounded by a security policy that forces a code review approval every time the source code changes, including when you have to apply fixes to build failures induced by a component you don't own.
All of these things can and should be fixed. But this is work, and not urgent work, so it's often not done at the same pace as other stuff. This induces fatigue in the team and, as things usually go, people tend to blame the easiest thing to blame: the monorepo.
That's why I try to phrase the problem to be a tooling problem and not a monorepo problem.
Clearly if you don't have a problem with your monorepo, you either already have good tooling, or you don't need good tooling.
You don't need a monorepo to have integration tests. You can very well have a CI that clones a bunch of stuff and builds and tests them all together.
The issue that monorepo is solving is regarding write operations: FOO depends on BAR, which depends on BAZ. If you need to change BAZ to develop what you want on FOO and they're all on multiple repos, you'd have to pull request your way from the bottom to the top of the dependency graph. This is what causes the friction that monorepos avoid, and this is the hard part of such workflow to automate with multiple repos.
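To make that friction concrete: with one repo per component, every change at the bottom of the chain requires a serialized sequence of release/version-bump PRs, one per level. A toy sketch using the names from the comment above:

```python
# Toy multi-repo dependency chain: each component depends on the next one.
# FOO -> BAR -> BAZ, so changing BAZ forces a release of BAZ, then a
# version-bump PR in BAR, then another in FOO -- each waiting on the last.
chain = ["FOO", "BAR", "BAZ"]

def pr_sequence(changed, chain):
    """PRs needed, bottom-up, after changing one component in the chain."""
    i = chain.index(changed)
    # Release the changed component first, then bump each dependent above it.
    return list(reversed(chain[: i + 1]))

print(pr_sequence("BAZ", chain))  # ['BAZ', 'BAR', 'FOO'] -- three serialized PRs
print(pr_sequence("BAR", chain))  # ['BAR', 'FOO']
```

In a monorepo the same change is a single atomic commit touching all three components, which is precisely the write-side friction the comment describes.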
That's exactly how I understand the benefits of monorepo and it seems like a terrible idea.
You might spend one week building the next version of your component and three months updating all of the dependents. Then do it again. And again. You are working at a fraction of your productivity. I wonder if that's why Google needs thousands of engineers.
Whereas I just updated Stripe from 2.x.x to 5.x.x in one of the projects I'm working on, because the new version has features that I needed. I never wasted time updating to the intermediate versions until I had a need to.
It also limits your ability to break backwards compatibility in new versions, because of course we all come up with great designs right from the beginning.
I get the security and performance benefits of keeping all dependencies up to date, but man, the time sink and limitations seem so not worth it.
> you start creating a complicated network of soft dependencies between those modules. And when you have everything in a single repository, it's hard NOT to do that.
That is what the visibility [1] in Bazel solves. You can't import other people's code unless they say you can by making their code visible to yours.
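A hypothetical BUILD file sketch of that mechanism (the target and package names here are invented): only the packages listed under `visibility` may depend on the target; anyone else gets a build-time error instead of a silent coupling.

```starlark
# libs/billing/BUILD
cc_library(
    name = "billing",
    srcs = ["billing.cc"],
    hdrs = ["billing.h"],
    # Only these consumers may import this library; everyone else is
    # rejected by Bazel with a visibility error.
    visibility = [
        "//services/payments:__pkg__",          # that one package
        "//services/invoicing:__subpackages__",  # that package and below
    ],
)
```

This turns the soft dependencies the parent comment worries about into explicit, reviewable allow-lists.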