All software comprises components. Given the prevalence of supply chain attacks against modules and libraries, it is important to know what parts your code uses. This is where the Software Bill of Materials (SBOM) comes into play. An SBOM is a list of all components your code relies on. So much for the theory. In practice, what counts as a component? Do you count all the header files of C/C++ as individual components or just as parts of the library? Is it just the run-time environment or do you also list the moving parts you used to create the final application? The latter includes all tools used in intermediate build steps. A good choice is to focus on the application in its deployed state.

How do you get all the components used in the run-time version of your application? The build process is the first source you need to query. The details depend on the build tool you use. You need to extract the version, the source of the packaged component (this can be a link to the download source or a package URL), the name of the component, and hashes of either the files or the component itself. If you inspect the CycloneDX or SPDX specifications, you will find that many more data fields are specified. You can group components, name authors, add a description, describe complex distribution processes, attach generic properties, and add more details. Build systems and package managers do not provide easy access to all of this information. The complete view requires using the original source for the component, the operating system, and external data sources (for example, the CPE or links to published vulnerabilities). Don’t get distracted by the sheer amount of information that can be included in your manifest. Focus on the relevant and easy-to-get data about your components.
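To make the minimal data set concrete, here is a small sketch that assembles the four fields named above (name, version, source, hashes) into a CycloneDX-style JSON document. The field names follow my reading of the CycloneDX JSON format and should be checked against the specification version you target. The helper functions `make_component` and `make_bom`, the component `libexample`, and its package URL are hypothetical and exist only for illustration.

```python
import hashlib
import json


def make_component(name, version, purl, hashes):
    """Return a dict shaped like a CycloneDX component entry.

    `hashes` maps an algorithm label (for example "SHA-256") to a hex digest.
    """
    return {
        "type": "library",
        "name": name,
        "version": version,
        "purl": purl,  # package URL that identifies the source of the component
        "hashes": [
            {"alg": alg, "content": digest}
            for alg, digest in sorted(hashes.items())
        ],
    }


def make_bom(components):
    """Wrap component entries in a minimal CycloneDX-style envelope."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",  # assumption: adjust to the version you validate against
        "version": 1,
        "components": list(components),
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for the real library file.
    data = b"placeholder library bytes"
    component = make_component(
        name="libexample",  # hypothetical component
        version="1.2.3",
        purl="pkg:generic/libexample@1.2.3",
        hashes={
            "SHA-256": hashlib.sha256(data).hexdigest(),
            "SHA3-256": hashlib.sha3_256(data).hexdigest(),
        },
    )
    print(json.dumps(make_bom([component]), indent=2))
```

Everything beyond these fields (authors, descriptions, properties) can be added later once the basic inventory is reliable.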

Hashes of your components are important. When focussing on the run-time of your application, make sure you identify all parts of it. To give an example for C/C++ code, you can identify all libraries your application loads dynamically. Then calculate the hashes of the libraries on the target platform. SBOMs can be different for various target platforms. If you use containers, then you can fix the component versions. Linking dynamically against libraries means that your code will use different incarnations on different systems. Make sure that you calculate more than one hash for your manifest. MD5 and SHA-1 are legacy algorithms. Do not use them. Instead, use SHA-2 with a digest of 256 bits or more and one hash of the SHA-3 family. Using two hashes from different families guards against a weakness, such as a practical collision attack, being found in one of them. In addition, SHA-3 is not prone to content appending (length extension) attacks.
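Calculating two digests per file is cheap, because both can be fed from the same read pass. The following is a minimal sketch using Python's standard `hashlib` module. The function name `hash_file`, the `ALGORITHMS` mapping, and its labels are my own choices for illustration and are not prescribed by any SBOM format.

```python
import hashlib

# Label used in the manifest -> algorithm name understood by hashlib.
ALGORITHMS = {
    "SHA-256": "sha256",     # SHA-2 family
    "SHA3-256": "sha3_256",  # SHA-3 family
}


def hash_file(path, chunk_size=65536):
    """Return {label: hex digest} for the file at `path`.

    The file is read once in chunks, so large libraries do not have to
    fit into memory, and every hasher sees the same bytes.
    """
    hashers = {label: hashlib.new(name) for label, name in ALGORITHMS.items()}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {label: hasher.hexdigest() for label, hasher in hashers.items()}
```

Calling `hash_file` on each dynamically loaded library on the target platform yields one pair of digests per component. Run it on every target platform separately, since the library files, and therefore the digests, can differ.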