
Engineering orgs don't usually stall because the infrastructure can't scale — AWS will sell you all the compute you want. They stall because the number of people who can hold an accurate model of the system in their heads stays roughly flat while the system's actual complexity compounds underneath them. The tells are everywhere once you look: incidents that take hours to diagnose and minutes to fix, onboarding that takes six weeks instead of two, four overlapping observability tools nobody's brave enough to kill, ownership maps that haven't matched reality since the last reorg.
View original source — Hacker Noon ↗


