At Valon, we care deeply about maintaining a uni-directional dependency graph that matches our intended architecture. With over 17,000 files in our monorepo, managed by over 50 engineers across 9 teams and growing, it’s imperative that our monorepo avoid becoming an infamous “spaghetti mess” where development crawls to a halt. With so many engineers writing code every day, it’s also important we put the power in their hands to understand our dependency graph, which dependencies are valid or invalid, how to eliminate invalid dependencies, and why this work is so important.
In this blog post, I’ll outline the ways our Platform team has tackled these goals.
What are Invalid Dependencies?
A dependency means one file depends on another. In our Python codebase, every import statement is a declaration of a dependency from the file the import statement resides in to the file that is being imported.
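To make this concrete, here is a minimal sketch (not Valon's actual tooling) of reading dependency edges out of a file's import statements using Python's standard ast module:

```python
import ast

def import_edges(filename: str, source: str) -> list[tuple[str, str]]:
    """Return (importing_file, imported_module) dependency edges for one file."""
    edges = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Import):
            # "import a.b" declares a dependency on module a.b
            for alias in node.names:
                edges.append((filename, alias.name))
        elif isinstance(node, ast.ImportFrom) and node.module:
            # "from a.b import c" declares a dependency on module a.b
            edges.append((filename, node.module))
    return edges
```

Collecting these edges across every file in the repository yields the full dependency graph.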
When we talk about an “invalid” dependency, we are talking about a dependency we don’t want to exist in our code base. The number one reason for not wanting a dependency to exist is because it causes a cycle in our dependency graph. However, engineers can mark any dependency as invalid. Broadening our definition allows us to get ahead of future circular dependencies, and also helps us maintain a simpler dependency graph.
If a circular dependency is detected, we require an engineer to declare one or more edges invalid, such that the dependency graph without those invalid edges (our “ideal” dependency graph) is unidirectional. An engineer can, however, declare any dependency invalid, even if it doesn’t create a cycle.
Varying the Granularity of Edges
With respect to invalid edges and dependencies, in addition to preventing file-level circular dependencies, we also prevent package and domain circular dependencies. This helps us scale as our code base grows because it gives us the ability to zoom out and view the repository at a higher level, without getting bogged down in too many details.
The visualization on the left is unidirectional, but only at the file-level. We strive for the visualization on the right, which is a package-level unidirectional dependency graph.
We have over 250 packages, so we take it even one step further and roll up packages into “parent packages,” and parent packages into “domains,” to provide multiple zoomed-out views of our monorepo, all of which are required to have unidirectional dependencies between them.
What is a package? A package is a folder that has a special file called _package_config. The package config specifies which team owns the code within, what domain it belongs to, review rules, registrations (a topic for another day), what other packages this package depends on, and code health violation trackers. Our linter runs special analysis on each of our packages to, for example, ensure there are no circular dependencies between them.
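The real _package_config format isn't shown here, but as a purely hypothetical illustration, a package config might carry fields like these:

```python
# Hypothetical example only -- these field names are illustrative, not
# Valon's actual _package_config schema.
PACKAGE_CONFIG = {
    "owner": "payments-team",                     # team that owns this code
    "domain": "servicing",                        # domain the package belongs to
    "depends_on": ["core.models", "core.utils"],  # declared package dependencies
    "invalid_dependency_count": 3,                # tracked code-health violations
}
```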
Why is Eliminating Invalid Dependencies so important?
At Valon, we believe we can maintain the highest development speed by keeping the majority of our code in a monorepo. While monorepos are beneficial, they are not without their risks, one of the main ones being code entanglement due to poor modularization and tightly-coupled code. Circular dependencies are one of the biggest indicators of code entanglement and disorganization. Maintaining a simple and uni-directional dependency graph not only helps simplify an engineer’s mental model of a code base, but has many concrete benefits as well, such as:
- Faster IDE responsiveness by being able to leverage cached mypy results
- Faster and cheaper CI times due to fewer tests running
- Fewer deployments due to code changes not being detected and thus more operational simplicity, cost savings, and faster time to deploy
- Fewer code changes that cross team boundaries and so less time waiting for a review
Reducing complexity makes our monorepo more reliable and easier to debug.
Preventing new Invalid Dependencies
The first thing we focused on was stopping the bleeding and preventing new invalid dependencies. This is critical when working in a code base with many other engineers. Education helps, but once you hit a large enough engineering organization, it is not enough to rely on alone.
There are a few ways we help prevent new invalid dependencies, each managed by different linting rules.
Violation Counts
The first way is by tracking the current count of violations in every package’s package config file. This stops the bleeding, raises awareness, and provides a mechanism for teams to temporarily increase violations for business critical reasons.
Every individual file-to-file invalid dependency is counted, which provides visibility to the engineer if they happen to increase the coupling in a merge request.
Package to Package Level Dependencies
Creating a brand new invalid dependency between two packages is tracked in a separate file that requires the Platform team to review. Adding an invalid edge between two files, when an invalid package level dependency already exists, is sometimes hard to prevent without a concerted effort (the coupling already exists). However, adding an invalid edge between two packages is even more discouraged, since it creates a coupling that didn’t exist before.
Requiring Platform team review provides us an opportunity to make sure the engineer is aware of the downsides and can help them consider alternative options. We do this by maintaining a json file which has a mapping of each package to package level invalid dependency that our linter requires to be kept in sync with the codebase.
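As a rough sketch of how such a sync check could work (the JSON layout and function names here are assumptions, not Valon's actual implementation):

```python
import json

def check_in_sync(recorded_json: str, observed: set[tuple[str, str]]) -> list[str]:
    """Compare the reviewed JSON map of package-level invalid dependencies
    against what the linter actually observes in the codebase."""
    # JSON shape assumed: {"importing_package": ["imported_package", ...], ...}
    recorded = {
        (src, dst)
        for src, dsts in json.loads(recorded_json).items()
        for dst in dsts
    }
    errors = []
    for src, dst in sorted(observed - recorded):
        errors.append(f"new invalid dependency {src} -> {dst}: "
                      "add it to the JSON file (requires Platform review)")
    for src, dst in sorted(recorded - observed):
        errors.append(f"stale entry {src} -> {dst}: "
                      "dependency no longer exists, remove it")
    return errors
```

Because the JSON file must change in the same merge request as the new coupling, the review requirement can't be bypassed silently.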
Preventing Circular Loops
If the linter detects a new package to package circular dependency, it will spit out all the packages and edges involved, and require the engineer to mark at least one edge as invalid to break the cycle.
Tracking the Size of our Largest Strongly Connected Components
Not all circular dependencies are created equal. Small loops (A -> B -> A) are minimally impactful, but large loops (A -> B -> C -> … -> Z -> A) can have detrimental effects on our continuous integration (CI) times due to our dependency graph being used to determine which tests to run.
An engineer can unknowingly create a huge loop in a pull request that itself doesn’t even add any invalid dependencies. For example, let’s say Z -> A is an invalid dependency and we have the following dependencies: Z -> A -> B -> … -> X -> Y. An engineer writes a pull request that adds Y -> Z which is perfectly valid, however, they have just created a loop that connects every node.
We don’t want to prevent a legitimate pull request from being merged, but we do want to know when a minor invalid dependency becomes a major invalid dependency. It gives us a chance to tag the teams that own the packages Z and A about the issue and the new impact, since that might change the prioritization of a fix.
To prevent too many interruptions, we use a threshold and keep track of all “strongly connected components” (graph-speak for a bunch of nodes that are connected in a circular loop, where every vertex is reachable from every other vertex) that have more than 25 nodes in them. This linter works at the file level, since a big loop even within a single package can have negative consequences.
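The SCC check itself is classic graph theory. A simplified sketch using Tarjan's algorithm (recursive here for brevity; a production linter on 17,000+ files would want an iterative version):

```python
def large_sccs(graph: dict[str, list[str]], threshold: int = 25) -> list[set[str]]:
    """Return all strongly connected components with more than `threshold` nodes."""
    index: dict[str, int] = {}
    low: dict[str, int] = {}
    on_stack: set[str] = set()
    stack: list[str] = []
    sccs: list[set[str]] = []
    counter = [0]

    def strongconnect(v: str) -> None:
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a strongly connected component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return [c for c in sccs if len(c) > threshold]
```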
When we first introduced this linter over a year ago, our largest loop comprised 620 files. It is now down to 172, despite our repository doubling in size.
Helping Teams Prioritize Invalid Dependencies
We currently have 1,633 invalid file-to-file imports that would need to be removed to reach our ideal state. Many of these will require medium-to-large cross-team efforts to solve. To help prioritization efforts, it is very important that we provide every team with an understanding of the impact of each invalid dependency in which their code is involved. There is too much code—much of which requires specific domain knowledge to solve correctly—for one team to tackle the entire effort. This is why so much of our strategy is about empowering other engineers to solve the problem.
We measure the impact of each invalid edge by counting how many unnecessary transitive dependees are added and how many downstream dependencies are affected.
Example One
The invalid edge from File C to File D unnecessarily pulls in 3 transitive dependees (A, B and C), but it only affects one dependency. Tests from A, B and C will run unnecessarily but only when D is edited. For the sake of these examples, we’ll call this a rare event with a large impact on CI speed.
Example Two
The invalid edge from File A to File B unnecessarily pulls in tests from one transitive dependee (A), but it will happen any time B, C or D are edited. The invalid edge will frequently cause a small impact on CI.
Example Three
The invalid edge from File D to File A creates a loop and thus every node is both a transitive dependee and dependency. Any time any of these files are edited, every test covering them will run. This will frequently cause a large impact on CI.
Impact Score
“Impact Score” is a formula we came up with that gives each edge a weight so that example three from above would rise to the top, and examples one and two would be treated equally.
The formula involves a few variables:
importing_node_transitive_dependencies_count_invalid_graph
imported_node_transitive_dependees_count_invalid_graph
importing_node_transitive_dependencies_count_ideal_graph
imported_node_transitive_dependees_count_ideal_graph
The impact score is then calculated as:
impact score = (importing_node_transitive_dependencies_count_invalid_graph - importing_node_transitive_dependencies_count_ideal_graph) * (imported_node_transitive_dependees_count_invalid_graph - imported_node_transitive_dependees_count_ideal_graph)
Example One’s Score
Importing node is File C, the imported node is File D
importing_node_transitive_dependencies_count_invalid_graph = 1
imported_node_transitive_dependees_count_invalid_graph = 3
importing_node_transitive_dependencies_count_ideal_graph = 0
imported_node_transitive_dependees_count_ideal_graph = 0
Impact score = (1 - 0) * (3 - 0) = 3
Example Two’s Score
Importing node is File A, the imported node is File B
importing_node_transitive_dependencies_count_invalid_graph = 3
imported_node_transitive_dependees_count_invalid_graph = 1
importing_node_transitive_dependencies_count_ideal_graph = 0
imported_node_transitive_dependees_count_ideal_graph = 0
Impact score = (3 - 0) * (1 - 0) = 3
Example Three’s Score
Importing node is File D, the imported node is File A
importing_node_transitive_dependencies_count_invalid_graph = 4
imported_node_transitive_dependees_count_invalid_graph = 4
importing_node_transitive_dependencies_count_ideal_graph = 0
imported_node_transitive_dependees_count_ideal_graph = 0
Impact score = (4 - 0) * (4 - 0) = 16
Example Four
Here is an additional example that shows how counts from the ideal graph affect the score. There is basically only one extra node (b) that would ever have tests run unnecessarily.
Importing node is a, the imported node is b
importing_node_transitive_dependencies_count_invalid_graph = 2 (b,c)
imported_node_transitive_dependees_count_invalid_graph = 1 (a)
importing_node_transitive_dependencies_count_ideal_graph = 1 (c)
imported_node_transitive_dependees_count_ideal_graph = 0
Impact score = (2 - 1) * (1 - 0) = 1 * 1 = 1
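Putting the formula into code makes it easy to verify against the worked examples. This is a direct transcription of the arithmetic above, not Valon's production implementation:

```python
def impact_score(importing_deps_invalid: int, imported_dependees_invalid: int,
                 importing_deps_ideal: int, imported_dependees_ideal: int) -> int:
    """Weight for one invalid edge: extra transitive dependencies the
    importing node gains, times extra transitive dependees the imported
    node gains, relative to the ideal graph."""
    return ((importing_deps_invalid - importing_deps_ideal)
            * (imported_dependees_invalid - imported_dependees_ideal))
```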
Dependency Set Impact
One limitation to impact score is that it calculates each invalid dependency in isolation—it removes all other invalid edges from the graph. We do this because if our graph were riddled with invalid edges, removing one might show no benefit. For example:
If we calculated invalid Edge 1 where our ideal graph counts included invalid edges 2 and 3, the impact would be 0.
Impact score is still very helpful, but it could take a long time to see real-world impact, and we risk developers becoming frustrated and losing trust in our numbers if they spend a lot of effort removing high-impact edges but don’t see any actual impact on CI times.
This is where our next score comes into play: we highlight sets of invalid dependencies that can be removed and that would have an impact on real-life merge requests.
For this metric, we run a cron every night that goes through the last 24 hours of CI pipelines. For each pipeline, we gather the set of files that were edited and calculate how many tests ran. We then walk the graph breadth first, starting at the edited nodes, and each time an invalid edge is encountered, we chop it off and calculate the new test impact. We usually won’t see an impact for the first couple of invalid edges removed, but eventually we’ll have a minimum set of edges that can cause a decent test file reduction. Due to the explosive nature of set combinatorics, we can’t calculate the impact for every possible set, but it still provides useful information, even if incomplete.
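A simplified sketch of the core walk (graph representation and names here are assumptions): starting from the edited files, we traverse the dependee graph breadth-first, optionally skipping a candidate set of invalid edges to measure how many fewer files would be affected:

```python
from collections import deque

def affected(dependees: dict[str, list[str]], edited: set[str],
             removed_edges: set[tuple[str, str]] = frozenset()) -> set[str]:
    """All files transitively depending on `edited`, ignoring removed edges.

    `dependees` maps a file to the files that import it; an edge
    (importing, imported) in `removed_edges` is treated as cut.
    """
    seen, queue = set(edited), deque(edited)
    while queue:
        node = queue.popleft()
        for dep in dependees.get(node, []):
            if (dep, node) in removed_edges or dep in seen:
                continue  # edge was chopped off, or already visited
            seen.add(dep)
            queue.append(dep)
    return seen
```

Comparing `affected(...)` with and without a candidate edge set gives the test-file reduction that set would have delivered on a real pipeline.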
These metrics are then available to all the teams in Datadog. For example, this graph is showing us that over the past week (the time range selected), we would have run 71,000 fewer test files if we only removed two invalid dependencies! You can also run the command on a specific MR. For example, running peach analyze-imports --mr 35608 results in:
What is peach? “peach” is our custom CLI tool that offers a broad range of functionality to improve the development experience, from analyzing invalid dependencies, to generating database migrations, linting and auto-formatting, and much more.
Visualizing Strongly Connected Components
When you want to eliminate or avoid a strongly connected component increase, it can help to visualize the graph in order to see the impact breaking certain edges would have. For example, this was a visualization generated by using the command peach analyze-imports -m front_porch.modules.offboarding_payments.offboarding_payments_service -g ../imgs --highlight_loops
(to visualize the SCC that the offboarding_payments_service file was a part of):
The edges in red are invalid. You can see that the team needed to cut all three edges in order to completely break the loop.
Another example is us focusing on how to best break apart our largest SCC with the minimal amount of effort. Using the command peach analyze-imports -m a.file.in.largest.loop --highlight_loops -g ../imgs results in the following visualization:
One big bottleneck is loan_accounting_service, the node on the right with a ton of incoming edges, many of them red.
To break that apart, you’d need to remove a lot of invalid edges. However, there is an entire cluster of nodes on the left that is pulled in by only two red edges. Now we have a place to start looking, where we can make a big impact with (hopefully) relatively low effort.
Conclusion
In conclusion, maintaining a clean and efficient dependency graph in our monorepo is crucial for sustaining productivity and preventing technical debt. By empowering engineers with the tools and knowledge to identify, understand, and resolve invalid dependencies, we’re ensuring that our codebase remains scalable and resilient as we grow. The strategies we’ve implemented not only prevent future issues but also provide actionable insights to prioritize and tackle existing challenges. This proactive approach keeps our development process smooth, enabling us to continue delivering high-quality software while avoiding the pitfalls of a tangled codebase.
If you’re interested in these problems and in joining the Valon team, check out our open career opportunities.