A well-organized codebase directly affects the performance and scalability of a project, and of the organization as a whole.
A good codebase should achieve the following:
- Boost cooperation among developers: developers should be able to collaborate on the same codebase without conflicts or hurdles.
- Increase productivity: developers should be able to write code efficiently, without unnecessary obstacles.
- Remove procedural overhead: developers should focus on implementing business logic rather than spending time working around setup challenges.
- Improve traceability: changes should be easy to trace in the version control system, so that it is quick to identify what changed and who is accountable for it.
Codebase management is commonly divided into two approaches, polyrepo and monorepo. We are going to discuss both from the perspective of web development, that is, the frontend and JavaScript ecosystem.
Without a good understanding of both, it is harder to design a large-scale frontend system.
Polyrepo
As the name suggests, polyrepo (poly meaning many, repo meaning repository) is a way of organizing code in which each component or module is stored in its own separate repository.
The codebase of each project, library, and application is stored separately, giving teams the autonomy to make independent decisions about packages, library adoption, build, test, and deployment strategies.
This is the traditional way of maintaining code in large organizations. While it provides autonomy, that same autonomy is the source of many of its challenges.
Autonomy comes from isolation, and isolation hinders collaboration and code sharing and creates a lot of redundancy in the code.
The following are the disadvantages of using a polyrepo:
- Challenging code sharing: To share code between repositories, you would probably need to create a separate repository for the shared code. For other repos to depend on it, you then need to set up package publishing, add committers to the repository, and configure the CI and tooling environment. On top of that, third-party library versions drift apart across repositories and are hard to reconcile.
- Major duplication of code: Since teams don't want to deal with the trouble of setting up a shared repository, they just build their own implementations of common services and components in each repository. Not only does this waste time up front, but as the components and services change, it also makes maintenance, security, and quality control more difficult.
- Expensive cross-repo modifications to consumers and shared libraries: If a developer finds a serious problem or needs to make a breaking change in a shared library, they must set up their environment to apply the change across several repositories with fragmented revision histories, and then coordinate the versioning and release of the packages.
- Heterogeneous tooling: Different projects use different commands for running tests, building, serving, linting, deploying, and other tasks. This inconsistency increases the mental burden of remembering which commands to use on each project and makes it hard to maintain a smooth workflow across projects.
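To make the version inconsistency problem from the list above concrete, here is a minimal sketch that reports packages requested with different version ranges in different repositories. It assumes you have already collected the `dependencies` map from each repository's package.json; the function name `findVersionDrift`, the repository names, and the versions are all made up for illustration.

```typescript
// Hypothetical input: the "dependencies" map from each repository's package.json.
type DependencyMap = Record<string, string>;

// package name -> version range -> repositories that request that range
type VersionUsage = Map<string, Map<string, string[]>>;

// Returns only the packages that are requested with more than one version range.
function findVersionDrift(repos: Record<string, DependencyMap>): VersionUsage {
  const usage: VersionUsage = new Map();

  for (const repoName of Object.keys(repos)) {
    const deps = repos[repoName];
    for (const pkg of Object.keys(deps)) {
      const range = deps[pkg];
      const ranges = usage.get(pkg) ?? new Map<string, string[]>();
      const users = ranges.get(range) ?? [];
      users.push(repoName);
      ranges.set(range, users);
      usage.set(pkg, ranges);
    }
  }

  const drift: VersionUsage = new Map();
  usage.forEach((ranges, pkg) => {
    if (ranges.size > 1) {
      drift.set(pkg, ranges);
    }
  });
  return drift;
}

// Illustrative data only; the repository names and versions are invented.
const exampleRepos: Record<string, DependencyMap> = {
  "web-app": { react: "^18.2.0", lodash: "^4.17.21" },
  "admin-app": { react: "^17.0.2", lodash: "^4.17.21" },
  "design-system": { react: "^18.2.0" },
};

const drift = findVersionDrift(exampleRepos);
drift.forEach((ranges, pkg) => {
  ranges.forEach((users, range) => {
    console.log(`${pkg}@${range}: ${users.join(", ")}`);
  });
});
```

In a polyrepo, a report like this has to be assembled by visiting every repository. This sketch only compares the range strings literally, so it does not tell you whether two different ranges could still resolve to the same installed version.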
Polyrepo is still widely used in many organizations, but as the codebase grows, it becomes challenging to maintain the code.
Monorepo
As the name suggests, monorepo (mono meaning single, repo meaning repository) is the practice of storing all codebases in a single repository.
All the projects, libraries, common components, shared state, and tools live in a single repository in different subfolders, with well-defined relationships between them, so that they work together like a well-oiled machine.
A monorepo should not be mistaken for a monolith, and it is more than just code colocation. It is an approach that promotes collaboration, scalability, and efficient development processes, allowing each project to be developed independently and in isolation while still maintaining its relationships with the others.
With efficient, well-configured tools that track dependencies, developers can manage and deploy changes across multiple projects so that a change affects only the projects that depend on it, which makes the overall development process more streamlined.
Although there are many benefits to monorepos, having the proper tools is essential to their effectiveness. The tools in your workspace should support you in maintaining its speed, readability, and manageability as it expands.
For example, in a polyrepo, if you want to use a library in another project, you have to publish the package and then add it to your project using a package manager. If there is an issue in the library, you have to go back to the library, fix the issue, test, build, and publish an updated version so that it can then be used in the project.
In a monorepo, we can use tools like webpack, which provides import aliases (the `resolve.alias` option). With an alias we can give the library a `nickname`, import it into our project directly from source, and develop both side by side, removing the time-consuming distribution step. This makes it easy to manage and update the library as needed.
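The alias approach could look roughly like this in a webpack configuration. This is a minimal sketch under assumptions: the layout (an application in `apps/web` and a shared library in `libs/ui`) and the `@acme/ui` nickname are invented for illustration, and loaders and other settings a real project needs are left out.

```typescript
// apps/web/webpack.config.ts (hypothetical location inside the monorepo)
import * as path from "path";

const config = {
  entry: "./src/index.ts",
  resolve: {
    extensions: [".ts", ".tsx", ".js"],
    alias: {
      // "@acme/ui" is an illustrative nickname, not a published package.
      // It points at the library's source folder in the same repository.
      "@acme/ui": path.resolve(__dirname, "../../libs/ui/src"),
    },
  },
};

export default config;
```

With this in place, application code can write `import { Button } from "@acme/ui"` (where `Button` is a hypothetical export) and webpack resolves it to the local source instead of a published package. The TypeScript compiler does not read webpack's aliases, so a matching `paths` entry in `tsconfig.json` is usually needed as well.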
Monorepo provides a variety of benefits like:
- Ease of refactoring: Because all projects share a single version history, it is easier to trace who changed what across the codebase and to refactor across project boundaries in one change.
- Greater flexibility in collaborations: code can be easily shared and reused across different projects within the repository. It is easier to add new projects to the same repo with the available tooling.
- Better developer mobility: Developers who move to a different project get an experience similar to the previous one.
- Consistency: Allows consistency across the repository in terms of design, development, guidelines, and practices that will be followed.
You might be wondering whether keeping projects in multiple subfolders creates a lot of redundancy for the package manager, for example the same package being downloaded into every project's node_modules folder.
That is not the case. With the proper tooling, common packages are kept in a single shared node_modules folder from which every project can import them, while each project can still have its own node_modules for the packages specific to it.
The tooling can also be configured to run the different projects together with their dependencies in isolation, and to detect code changes efficiently in order to determine which projects need to be recompiled.
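The change detection described above can be pictured as a walk over the project dependency graph. The following is a simplified sketch, not how any particular monorepo tool implements it; the `ProjectGraph` type, the function name `findAffectedProjects`, and the project names are assumptions made for illustration.

```typescript
// Hypothetical project graph: each project lists the projects it depends on.
type ProjectGraph = Record<string, string[]>;

// Returns the changed projects plus every project that depends on them,
// directly or transitively. Only these need to be rebuilt or retested.
function findAffectedProjects(graph: ProjectGraph, changed: string[]): string[] {
  // Invert the graph: project -> projects that depend on it.
  const dependents = new Map<string, string[]>();
  for (const project of Object.keys(graph)) {
    for (const dependency of graph[project]) {
      const list = dependents.get(dependency) ?? [];
      list.push(project);
      dependents.set(dependency, list);
    }
  }

  // Breadth-first walk from the changed projects through their dependents.
  const affected = new Set<string>(changed);
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of dependents.get(current) ?? []) {
      if (!affected.has(dependent)) {
        affected.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return Array.from(affected).sort();
}

// Illustrative graph; the project names are invented.
const graph: ProjectGraph = {
  "ui-kit": [],
  "auth": [],
  "web-app": ["ui-kit", "auth"],
  "admin-app": ["ui-kit"],
  "docs-site": [],
};

console.log(findAffectedProjects(graph, ["auth"]));   // [ 'auth', 'web-app' ]
console.log(findAffectedProjects(graph, ["ui-kit"])); // [ 'admin-app', 'ui-kit', 'web-app' ]
```

A change to `auth` leaves `admin-app` and `docs-site` untouched, which is why a large repository does not have to rebuild everything on every commit.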
Properly configured linters can also notify you when part of the codebase has grown large enough to be refactored into a separate module or library.
Challenges with the monorepo:
- Maintaining the code can become challenging, since many developers may contribute to the same area of the codebase.
- Determining the release schedule and managing dependencies between different components can also be complex. Proper configuration of CI/CD pipelines and a well-organized version control system can help streamline the development process and ensure smooth deployment.
- Because it is a single codebase, there is a chance of inadvertent changes, since any developer can change any file. Requiring reviews through the permission and review features that version control platforms offer can solve this. Defining code owners is also a good practice.
- A single organization-wide monorepo is not the only option. You can also create domain-centric or business-centric monorepos to isolate parts of your codebase and improve modularity and maintainability, which keeps the repositories scalable and easier to manage.
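The code owner idea from the list above can be sketched as a lookup from file paths to the teams that must review a change. This is a deliberately simplified model that matches on path prefixes and lets the longest prefix win; real code owner files on hosting platforms typically use glob patterns with their own precedence rules. The type and function names, the paths, and the team handles are all hypothetical.

```typescript
interface OwnershipRule {
  pathPrefix: string; // folder inside the monorepo, e.g. "libs/ui/"
  owners: string[];   // hypothetical team handles
}

// Returns the owners of the most specific (longest) matching prefix,
// or an empty list when no rule covers the file.
function findOwners(rules: OwnershipRule[], filePath: string): string[] {
  let best: OwnershipRule | undefined;
  for (const rule of rules) {
    const matches = filePath.startsWith(rule.pathPrefix);
    if (matches && (best === undefined || rule.pathPrefix.length > best.pathPrefix.length)) {
      best = rule;
    }
  }
  return best ? best.owners : [];
}

// Collects the distinct owners that should review a set of changed files.
function requiredReviewers(rules: OwnershipRule[], changedFiles: string[]): string[] {
  const reviewers = new Set<string>();
  for (const file of changedFiles) {
    for (const owner of findOwners(rules, file)) {
      reviewers.add(owner);
    }
  }
  return Array.from(reviewers).sort();
}

// Illustrative rules; the folders and team names are invented.
const rules: OwnershipRule[] = [
  { pathPrefix: "apps/web/", owners: ["@web-team"] },
  { pathPrefix: "libs/", owners: ["@platform-team"] },
  { pathPrefix: "libs/ui/", owners: ["@design-system-team"] },
];

console.log(requiredReviewers(rules, ["libs/ui/src/Button.tsx", "apps/web/src/main.ts"]));
```

Here a change under `libs/ui/` is routed to the design system team rather than the broader platform team, because the more specific rule takes precedence.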
List of tools for the Monorepo:
- Bazel (by Google)
- Nx (by Nrwl)
- Rush (by Microsoft)
- Turborepo (by Vercel)
- Lerna
- moon (by moonrepo)
Conclusion
Both code management approaches come with their own advantages and disadvantages, and there is no single strategy that can be applied to decide which one to use.
Therefore, it is important to carefully evaluate the specific requirements and goals of the project before choosing between polyrepo and monorepo.
A polyrepo can also be migrated to a monorepo incrementally.