But Why?#
Difficulties in toolchain maintenance#
Supporting shared toolchains is easy#
...but complicated!
- You've had to turn off a feature on one platform because a dependency was not supported or available for that platform.
- A platform is new and the compiler chain requires bootstrapping.
- The time a developer needed a patch applied for their specific use case.
- Did you know that developers were copying libraries from your toolchain into product releases? Let's support that and avoid duplication of effort!
- The time a certain customer made a deal with your company to better prioritize your product's quality on their difficult-to-work-with platform over your developers' preferred platform.
- A commercial tool is only available in binary form, and its installer only speaks "/usr/local".
- We want to avoid conflicts with OS vendor-provided packages, but also want the benefit of "just works" package installation picking up all required OS dependencies.
- You need to add support for a new OS distro/version/architecture, but need your tools to work consistently with your existing platforms.
The system has to outlast the original maintainer#
- Existing tool build configurations are not well documented, if at all.
- Existing build system is highly customized for your specific prefix/repository.
What's wrong with $PATH?#
Relying on the environment provided is a great solution... for containers. In a container environment you are incentivized to install only the desired tool versions that are appropriate for running the application or task. Whichever tool your default $PATH locates should be the correct choice.
On a VM or bare-metal host, with tools installed in /usr/local, or perhaps in automounted NFS repositories, $PATH has to be set carefully to find the right tool installations, and it's not set by default. Additionally, a user may still run a tool that isn't in their $PATH, and what then? If that tool relies on other tools (such as gcc needing as and ld from binutils), how does it find the right ones? How can that user be confident it will work correctly and reliably?
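One illustration: you can ask gcc which companion tools it would actually run. On a host with several toolchains installed, the answer depends entirely on how the environment happens to be set up (paths shown are illustrative):

# Ask gcc which assembler and linker it will actually invoke.
$ gcc -print-prog-name=as
/usr/bin/as
$ gcc -print-prog-name=ld
/usr/bin/ld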
Why Conan?#
Conan is designed for building, installing, and packaging 3rdParty software, especially open-source software.
Conan doesn't completely satisfy our toolchain installation requirements on its own, but it gets us about 90% of the way there. Check out these features:
- Integrity validation for original tool sources and derived package builds
- DESTDIR installation management built-in
- Built-in custom deployer hook exactly where we need it in the process (see the sketch after this list)
- Integration with many 3rdParty build systems
- Conan Center - Extensive database of build recipes for thousands of open-source tools
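As a rough sketch (the version number and staging folder are hypothetical), a single Conan 2 command can fetch a tool from Conan Center, build it from source when no binary exists, and deploy the result outside the Conan cache:

# Hypothetical: fetch CMake as a tool, build from source if needed,
# and copy the finished package out of the cache into a staging folder.
$ conan install --tool-requires=cmake/3.27.1 --build=missing \
      --deployer=full_deploy --deployer-folder=/tmp/toolchain-stage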
Profiles#
Do you need to build your tools for multiple platforms? Conan profiles help with that! Swap in a different compiler or compiler version, or turn on debugging symbols, code coverage, or sanitizers, all without modifying your software!
[settings]
arch=armv8
build_type=Release
compiler=gcc
compiler.cppstd=gnu17
compiler.libcxx=libstdc++11
compiler.version=11
os=Linux
[options]
*:install_prefix=/opt/toolchain
[buildenv]
LDFLAGS="-fsanitize=address -static-libasan"
[runenv]
LD_PRELOAD=/opt/toolchain/lib64/libasan.so.8
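Assuming the profile above were saved as, say, linux-armv8-asan (name hypothetical), selecting it at build time is a single flag:

# Hypothetical profile name; the host profile controls what we build for.
$ conan create . --profile:host=linux-armv8-asan --build=missing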
System Package Integration#
Conan can speak Yum (RedHat), Apt (Debian), and more! While this feature is intended to provide a working build and runtime for native Conan packages, we can leverage this metadata to connect our system package builds to OS-provided packages, not just other Conan packages.
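For example, Conan can be told to actually install (rather than merely check for) the system packages a recipe declares in its system_requirements() method; the configuration keys below are Conan 2's, while the package reference is hypothetical:

# Allow recipes to install their declared OS packages via yum/apt.
$ conan install --requires=sometool/1.0 \
      -c tools.system.package_manager:mode=install \
      -c tools.system.package_manager:sudo=True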
Why System Packaging?#
Ease of Use#
When building custom container images using our toolchain packages, we want the installation process to be as straightforward as possible. In an enterprise Engineering organization, we want each product team to be able to customize their sandbox or CI environments with minimal friction.
When we yum install -y toolchain-gcc in a fresh container, we want a working gcc when it's done! We don't want to have to add other system dependencies that we happen to know GCC requires.
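Because the RPM carries its dependency metadata, yum resolves the rest for us. For example (package name follows this system's convention, output illustrative):

# What does the package declare? yum pulls these in automatically.
$ rpm -qR toolchain-gcc
binutils
glibc-devel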
As we want to support the Developer Sandbox with DevContainers, it's important to make it as easy as possible for each developer to install our preferred and trusted packages to ensure a reliable result.
When developers find working with the toolchain difficult or unproductive, they are often entirely capable of implementing their own.
When developers roll their own toolchains, the impact looks like:
- Duplication of effort building and deploying the tools/libraries
- Unreproducible builds, requiring "tribal knowledge"
- "Works on my machine" tests failing in CI/production
- Software supply chain vulnerabilities
SBoM / Software Composition Analysis#
Dependency tracking tools like Black Duck, syft, and Artifactory are able to scan container images and record installed RedHat RPMs, Debian .deb, and macOS .dmg out of the box.
In contrast, it is difficult to determine the originating source of files that are simply copied into a container image. This results in incorrect dependencies being detected, creating a mess that must be manually sorted out by DevOps or Release Management.
In one example, a product team includes a particular open-source JavaScript library in their product. The SCA file system scan found that library, but identified the component source for that file as Firefox, because Firefox also includes that library. This caused all known Firefox CVEs to be associated with that product. Quality metrics were a mess, resulting in a quality escalation that delayed the product release.
Even worse, a false positive could be hiding a real vulnerability: when a file is attributed to the wrong component, the true component's known CVEs go untracked. Black hats don't put their faith in your SBoM! OS packaging significantly reduces both false positives and false negatives.
Relying on file-system scans for your SBoM means a lot of manual data correction within your SCA tool, sometimes on a file-by-file basis. Leveraging the system package manager creates a vital linkage in your release provenance documentation. It ensures accurate results, and saves time, especially in unplanned escalations.
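For example, a scanner like syft builds its inventory from the image's package database in a single step (image name illustrative):

# Generate an SPDX SBoM from a container image; installed RPMs/debs are
# read from the package database rather than guessed from loose files.
$ syft registry.example.com/product-image:1.0 -o spdx-json > sbom.spdx.json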
graph LR
A[Docker] --> B[RPM];
B --> C[OS Release, Yum];
B --> D[Application];
D --> E[3rdParty];
Integrity Verification#
- System packages can (and should) be signed, ensuring they come from trusted sources at install time.
- After installation, the installed contents can be verified to be unchanged as well, as shown below.
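With RPM, for example, each check is a single command (package name and version are illustrative):

# Check the GPG signature on a package file before installing it.
$ rpm --checksig toolchain-gcc-13.2.0-1.x86_64.rpm
# Verify installed files (digests, sizes, permissions) against the RPM database.
$ rpm -V toolchain-gcc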
Common Issues#
Avoiding package name conflicts#
What if we want RedHat's system python installed as well as our own? IT needs their python version on every system for Ansible to work reliably. RedHat's own yum tool depends on python as well...
What if we want to support multiple toolchains on the same host or in the same container image? Cross-compiling, product branching...
In this system, we name each package using a convention that associates it with the toolchain/prefix it supports.
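For example (names hypothetical), encoding the prefix into the package name lets multiple toolchains coexist with each other and with the OS vendor's python3:

# Each name carries its toolchain prefix, so nothing collides with
# RedHat's own python3 package or with a second toolchain's build.
$ yum install -y toolchain-python3       # installs under /opt/toolchain
$ yum install -y toolchain2024-python3   # installs under /opt/toolchain2024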