Design Document: Third-Party Code Inclusion Strategy
Author: GloriousTacoo, Lead Developer
Status: FINAL
Version: 2.0
Date: 2025-09-20
Disclaimer: This document was mostly written by AI. I'm not a good technical writer.
1. Problem Statement
We require a rigorous, auditable, and maintainable strategy for including third-party code in the Pound Virtual Machine. The current approach lacks formal standardization, creating risks to system integrity, security, and long-term maintainability. Each third-party inclusion represents a potential attack surface, a source of unpredictable behavior, and a maintenance burden that must be managed with extreme prejudice. We cannot afford the luxury of casual dependency management in a system that demands absolute reliability. The current ad-hoc approach must be replaced with a formal methodology that prioritizes safety, auditability, and control above all other considerations.
2. Glossary
Third-party code: any software component not developed in-house as part of the Pound project, including libraries, frameworks, and tools that are incorporated into our build system.
Git submodule: a Git mechanism that links to an external repository while recording a version reference within our main repository, so that an external codebase is included as a nested repository.
Submodule pinning: the practice of locking a submodule to a specific commit so that unexpected updates cannot introduce instability or security vulnerabilities.
Cryptographic integrity verification: the use of Git's built-in content hashing, in which every commit is identified by a hash of its contents and history, to ensure that submodule code has not been tampered with.
Dependency manifest: a document that tracks all third-party dependencies, their versions, origins, and security status.
License compatibility: the ability to combine software under different licenses without violating the terms of any of them.
3. Breaking Changes
Any transition to a new third-party inclusion strategy will require immediate and comprehensive refactoring of the existing build system. All current third-party dependencies must be converted to Git submodules, with non-compliant components either brought into compliance or removed from the project. The CMakeLists.txt files throughout the project hierarchy will require complete rewriting to build from submodule sources rather than vendored copies. Continuous integration pipelines must be updated to properly initialize and update submodules. This is not a gradual transition but a complete overhaul of our dependency management philosophy that will affect every aspect of the build process.
4. Success Criteria
A successful third-party inclusion strategy must provide complete auditability of all dependencies, with clear documentation of the origin, version, and purpose of each external component. The build process must be fully reproducible on every supported platform, so that any two builds of the same source with the same platform and toolchain produce identical binaries. Security vulnerabilities in third-party components must be detectable through automated scanning, with clear processes for updating affected dependencies. The strategy must minimize the risk of dependency conflicts and version incompatibilities while maintaining build performance. All third-party code must be subject to the same rigorous coding standards and assertion framework as our own code, with no exceptions granted for external components.
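One way to test the reproducibility criterion is to build the same source twice and compare the outputs byte for byte. The sketch below is only an illustration: the function names are hypothetical, and it assumes each build produces a single artifact file whose path is known.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(artifact_a: Path, artifact_b: Path) -> bool:
    """True when two build outputs are byte-for-byte identical."""
    return sha256_of(artifact_a) == sha256_of(artifact_b)
```

A CI job could run two clean builds into separate directories and fail when `builds_match` returns False for the resulting binaries.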
5. Proposed Design
Our philosophy must be that third-party code is not trusted by default but rather treated as potentially hostile until proven otherwise through rigorous evaluation and continuous monitoring. The strategy must prioritize control over convenience, ensuring that every line of third-party code included in the project is explicitly approved, documented, and monitored throughout its lifecycle. We will implement a defense-in-depth approach where each dependency is evaluated not just for functionality but for security posture, maintenance status, and compatibility with our fail-fast philosophy. Git's built-in cryptographic integrity verification will serve as our primary defense against tampering, with all submodules pinned to specific commits to prevent unexpected changes.
6. Technical Design
The recommended approach is to manage all third-party code as Git submodules within the repository structure. Each dependency will be added as a submodule in the 3rd_party directory, pinned to a specific commit that has been reviewed and approved for inclusion.
The process begins with identifying the specific version needed, rather than simply using "latest", which is not reproducible. For each dependency, the appropriate commit hash is identified and documented, ensuring that the exact same code is used across all builds. The submodule is added with git submodule add, the reviewed commit is then checked out inside the submodule, and the resulting reference is committed in the main repository. Note that git submodule add does not itself take a commit hash; the main repository records whichever submodule commit is checked out when the change is committed. This creates a persistent reference to that exact version of the external repository. All submodules must be documented with a comprehensive README file that includes the repository URL, commit hash, license information, and purpose within the project.
The build system will be configured to build directly from the submodule directories, with CMakeLists.txt files in the main project referencing these submodule paths. Automated verification scripts will run during the CMake configuration phase to ensure all submodules are properly initialized and have not been modified unexpectedly.
The process for updating submodules must be clearly documented, including steps for notification, review, testing, and integration of new versions. Custom patches or modifications to submodule code must be avoided wherever possible. When they are necessary, they must be clearly documented, tracked separately, and ideally submitted upstream for inclusion in future releases. The build system will enforce strict version control through commit pinning, preventing automatic updates that could introduce unexpected behavior or security vulnerabilities. Once the submodules have been fetched, builds need no further network access and remain reproducible regardless of the availability of external servers.
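The add-and-pin procedure can be sketched as a small helper that builds the Git invocations in order. This is a minimal illustration, not the project's tooling: `pin_commands`, the commit message format, and the example URL are hypothetical, and the hash check assumes a repository using Git's default SHA-1 object format (40 hex characters).

```python
import re
import shlex
from typing import List

# Assumption: SHA-1 object format, so a full commit hash is 40 hex digits.
FULL_SHA1 = re.compile(r"^[0-9a-f]{40}$")


def pin_commands(url: str, name: str, commit: str) -> List[List[str]]:
    """Build the git invocations that add a submodule and pin it to `commit`."""
    if not FULL_SHA1.match(commit):
        raise ValueError("pin to a full 40-character commit hash, not a branch or tag")
    path = f"3rd_party/{name}"
    return [
        # Register the submodule; this clones the upstream default branch.
        ["git", "submodule", "add", url, path],
        # Move the submodule checkout to the reviewed commit.
        ["git", "-C", path, "checkout", "--detach", commit],
        # Stage the updated commit reference and the .gitmodules entry.
        ["git", "add", ".gitmodules", path],
        ["git", "commit", "-m", f"3rd_party: add {name} at {commit[:12]}"],
    ]


if __name__ == "__main__":
    # Hypothetical URL and hash, printed rather than executed.
    for cmd in pin_commands("https://example.invalid/some-lib.git", "some-lib", "0" * 40):
        print(shlex.join(cmd))
```

Returning argument lists instead of running them keeps the sketch free of side effects; a real script could pass each list to `subprocess.run(cmd, check=True)`.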
7. Components
The strategy involves several key components working together to ensure safe dependency management. The 3rd_party directory serves as the centralized location for all submodule references, with each dependency maintained as a separate nested repository. The Git submodule system provides the core mechanism for including external code while maintaining cryptographic integrity verification. The build system, primarily CMake, will be configured to build directly from submodule sources, with no external network access during compilation. A dependency manifest will maintain metadata about each included component, including repository URLs, commit hashes, license information, and security status. Automated tooling will continuously monitor these dependencies for security vulnerabilities and compliance with our coding standards. Documentation requirements mandate that each dependency include clear attribution, modification logs, and integration notes.
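To make the manifest concrete, one possible shape for a single entry is sketched below. The field names, the `validate` helper, and the specific rules it applies are assumptions made for illustration; the document only specifies which kinds of metadata the manifest carries.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ManifestEntry:
    """One dependency in a hypothetical manifest format."""
    name: str
    path: str             # location of the submodule inside the repository
    url: str              # upstream repository URL
    commit: str           # pinned commit hash
    license: str          # license identifier as reviewed
    purpose: str          # why the project needs this dependency
    security_status: str  # result of the most recent review or scan


def validate(entry: ManifestEntry) -> List[str]:
    """Return a list of problems; an empty list means the entry looks complete."""
    problems = []
    if not entry.path.startswith("3rd_party/"):
        problems.append(f"{entry.name}: path must be under 3rd_party/")
    # Assumption: SHA-1 object format, so a full hash is 40 hex digits.
    if not re.fullmatch(r"[0-9a-f]{40}", entry.commit):
        problems.append(f"{entry.name}: commit must be a full 40-character hash")
    for field_name in ("url", "license", "purpose", "security_status"):
        if not getattr(entry, field_name).strip():
            problems.append(f"{entry.name}: {field_name} is empty")
    return problems
```

Rejecting anything other than a full hash keeps branch names and tags, which can move, out of the manifest.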
8. Dependencies
This strategy depends on several critical elements to be effective. The Git version control system must be available and properly configured for all developers and build environments. The build system must be capable of handling submodule initialization and updates as part of the build process. Automated security scanning tools must be integrated into the development workflow to continuously monitor for vulnerabilities in third-party components. Developer training and enforcement mechanisms are essential to ensure compliance with the new standards. Legal review processes must be in place to verify license compatibility and ensure that all third-party inclusions meet our open-source requirements. The continuous integration system must be configured to properly initialize submodules and verify their integrity before building.
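As one example of a pre-build check that CI could run alongside submodule initialization, the sketch below reads the text of a .gitmodules file and reports any submodule registered outside 3rd_party. The function names are hypothetical, and the parser relies on .gitmodules using the simple section and key = value layout that Git writes by default.

```python
import configparser
from typing import Dict, List


def parse_gitmodules(text: str) -> Dict[str, Dict[str, str]]:
    """Map each submodule name to its settings (path, url, ...) from .gitmodules text."""
    parser = configparser.ConfigParser(interpolation=None)
    # Git indents keys with a tab; strip it so configparser does not
    # mistake an indented key for a continuation of the previous value.
    parser.read_string("\n".join(line.strip() for line in text.splitlines()))
    modules = {}
    for section in parser.sections():
        if section.startswith('submodule "') and section.endswith('"'):
            name = section[len('submodule "'):-1]
            modules[name] = dict(parser[section])
    return modules


def outside_third_party(modules: Dict[str, Dict[str, str]]) -> List[str]:
    """Names of submodules whose path is not under 3rd_party/."""
    return sorted(
        name
        for name, cfg in modules.items()
        if not cfg.get("path", "").startswith("3rd_party/")
    )
```

A CI step could read `.gitmodules` from the repository root and fail the job when `outside_third_party` returns a non-empty list.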
9. Major Risks & Mitigations
The primary risk is that submodules may introduce complexity in dependency management, particularly for developers unfamiliar with Git submodules. This will be mitigated through comprehensive documentation, automated verification scripts, and clear procedures for submodule initialization and updates. Another significant risk is the potential for divergence between the main repository and submodule references, leading to build failures. This will be addressed through automated verification during the CMake configuration phase and clear error messages when submodules are not properly synchronized. There is also a risk that upstream repositories may disappear or become unavailable, making it impossible to clone submodules. This will be mitigated by maintaining a comprehensive dependency manifest with all necessary information and considering periodic archival of critical dependencies. The risk of license incompatibility will be addressed through thorough review of all submodule licenses before inclusion and continuous monitoring for license changes.
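The synchronization check with clear error messages could be built on the output of `git submodule status`, which prefixes each line with `-` for an uninitialized submodule, `+` when the checked-out commit differs from the one recorded in the main repository, and `U` for merge conflicts. The sketch below is a hypothetical helper that turns that output into messages; it assumes submodule paths contain no spaces.

```python
from typing import List

# Prefix characters documented for `git submodule status`.
STATUS_MEANING = {
    "-": "not initialized",
    "+": "checked-out commit differs from the pinned commit",
    "U": "has merge conflicts",
}


def submodule_problems(status_output: str) -> List[str]:
    """Turn `git submodule status` output into human-readable error messages."""
    problems = []
    for line in status_output.splitlines():
        if not line.strip():
            continue
        # First character is the status flag; the rest is "<hash> <path> [(describe)]".
        flag, fields = line[0], line[1:].split()
        if flag in STATUS_MEANING and len(fields) >= 2:
            problems.append(f"{fields[1]}: {STATUS_MEANING[flag]}")
    return problems
```

The input could come from `subprocess.run(["git", "submodule", "status", "--recursive"], capture_output=True, text=True, check=True).stdout`, and CMake could invoke such a script at configure time through `execute_process`. Note that this status output does not flag uncommitted edits inside a submodule's working tree, so detecting local modifications would need a separate check.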
10. Out of Scope
This strategy does not address the evaluation of specific third-party libraries for technical suitability, as that is covered by separate architectural review processes. The approach does not provide guidance on negotiating licenses or legal agreements with third-party vendors, as those matters fall under legal review. The strategy does not cover the integration of proprietary or commercially licensed components, as Pound is an open-source project with specific licensing requirements. Continuous integration testing of third-party components is considered part of the broader testing strategy and not specifically addressed here. The strategy also does not address the long-term maintenance of deprecated third-party dependencies, as those should be removed rather than maintained.
11. Alternatives Considered
The primary alternative to Git submodules is direct vendoring, which involves copying third-party source code directly into the project repository. While this approach provides complete control over dependencies and eliminates external network dependencies during builds, it significantly increases repository size and makes updates more cumbersome. Another alternative is CMake FetchContent, which downloads dependencies during CMake's configure step. This approach offers convenience but sacrifices reproducibility and control, as a fresh configure becomes dependent on external network availability and the continued existence of remote repositories. Package managers like vcpkg or Conan were also considered but rejected due to their complexity and the additional attack surface they introduce. Each of these alternatives was ultimately deemed unsuitable for a safety-critical system where cryptographic integrity verification and reproducibility are paramount.
12. Recommendation
After careful analysis of all alternatives, Git submodules represent the superior approach for the Pound project. While they introduce some complexity in dependency management, they provide the level of cryptographic integrity verification, version control, and reproducibility demanded by a safety-critical system. The current approach of mixed dependency management methods must be replaced entirely with this standardized submodule strategy. The benefits of Git's built-in integrity verification, clear version tracking, and simplified updates far outweigh the complexity of submodule management. This approach aligns perfectly with our fail-fast philosophy and ensures that every line of code in the project, whether first-party or third-party, is subject to the same rigorous standards of safety and reliability. The transition to this approach should begin immediately with a complete inventory of all current third-party dependencies and their conversion to Git submodules.