Software Composition Analysis 101: Knowing what is inside your apps

The term Software Composition Analysis (SCA) is relatively new to the security world, but similar approaches have been used since the early 2000s to run security checks on open-source components, and SCA has evolved from them. It is the process of identifying and listing all the components and versions present in a codebase, then checking each one for outdated or vulnerable libraries that may pose security risks to the application. SCA tools can also check for legal issues that arise from using open-source software under different licensing terms and conditions.

But how do SCA tools work, and how can they help identify and remediate issues in the open-source libraries used in your codebase? To generate a graph of the components that make up a piece of software, along with any issues related to them, a tool needs at least three pieces of information:

  • Application Manifest – a file that gives instructions on how the software should work, lists the dependencies it requires, and declares required permissions and version compatibility.
  • Vulnerability Data Sources – databases of vulnerability information, which can be private or public. The most common public one is the National Vulnerability Database (NVD).
  • Dependency Metadata – the metadata for the dependencies in your code, such as version, packaging, and license.
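
To make the idea concrete, here is a minimal sketch of how a tool might combine a manifest with a vulnerability data source. Everything in it is an assumption made for illustration: the package names, the `EXAMPLE-...` advisory identifiers, the pip-style `name==version` manifest format, and the simple dotted-number version scheme. Real SCA tools handle many manifest formats and far richer version rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """One entry from a (hypothetical) vulnerability data source."""
    advisory_id: str   # made-up identifier, not a real CVE
    package: str       # affected package name
    fixed_in: tuple    # first version that contains the fix


def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def parse_manifest(lines):
    """Read 'name==version' pins, skipping blank lines and comments."""
    deps = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        deps[name.strip().lower()] = parse_version(version.strip())
    return deps


def find_vulnerable(deps, advisories):
    """Report every dependency pinned below the version that fixes an advisory."""
    findings = []
    for adv in advisories:
        installed = deps.get(adv.package)
        if installed is not None and installed < adv.fixed_in:
            findings.append((adv.package, installed, adv.advisory_id))
    return sorted(findings)


# Hypothetical inputs: an application manifest and two advisories.
MANIFEST = ["# example pins", "examplelib==1.4.2", "otherlib==2.0.0"]
ADVISORIES = [
    Advisory("EXAMPLE-0001", "examplelib", parse_version("1.5.0")),
    Advisory("EXAMPLE-0002", "otherlib", parse_version("1.9.0")),
]

findings = find_vulnerable(parse_manifest(MANIFEST), ADVISORIES)
for package, version, advisory_id in findings:
    print(f"{package} {'.'.join(map(str, version))} is affected by {advisory_id}")
```

In this sketch only `examplelib` is reported, because `otherlib` is already pinned above the fixed version. The dependency metadata here is reduced to the version number; a fuller model would also carry packaging and license fields.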

With these three pieces of information, you can better understand how your application is built and which open-source software it contains. You can also identify which libraries are outdated or have known vulnerabilities. That’s great, but there are some more complex issues to be aware of when applying this technique or using an SCA tool to find problems in third-party software.

First, there are direct dependencies, which are the ones you call directly in your code. Those are easier to list, version and fix. But they can in turn depend on other libraries, which depend on others, and so on. These are called indirect or transitive dependencies. If one of those is outdated or vulnerable, it is hard to fix directly unless you own or maintain the code that pulls it in.
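
The difference between direct and transitive dependencies can be sketched as a graph traversal. The example below assumes a small, made-up dependency graph (the package names are hypothetical) and uses a breadth-first search, which is one reasonable way to do this and not necessarily how any particular SCA tool works. It records the path that pulls each package in, since that path tells you which direct dependency you would need to upgrade.

```python
from collections import deque

# Hypothetical graph: each package maps to the packages it depends on.
DEPENDENCY_GRAPH = {
    "my-app": ["web-framework", "http-client"],
    "web-framework": ["template-engine", "http-client"],
    "http-client": ["tls-lib"],
    "template-engine": [],
    "tls-lib": [],
}


def resolve_paths(graph, root):
    """Breadth-first search from root; returns {package: shortest path from root}."""
    paths = {root: [root]}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in graph.get(current, []):
            if child not in paths:  # also guards against dependency cycles
                paths[child] = paths[current] + [child]
                queue.append(child)
    return paths


def classify(graph, root):
    """Split everything reachable from root into direct and transitive sets."""
    paths = resolve_paths(graph, root)
    direct = set(graph.get(root, []))
    transitive = set(paths) - direct - {root}
    return direct, transitive, paths


direct, transitive, paths = classify(DEPENDENCY_GRAPH, "my-app")
print("direct:", sorted(direct))
print("transitive:", sorted(transitive))
for package in sorted(transitive):
    print(package, "is pulled in via", " -> ".join(paths[package]))
```

Here `tls-lib` never appears in the application's own manifest, yet it ships with the application because `http-client` requires it. A scan that stops at the manifest would miss it.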

Second, it is common for security researchers to make a vulnerability public by requesting a CVE and providing details on how the attack works and how to fix it. But in the open-source world, not every vulnerability gets a CVE. Why? Mainly because the CVE process is slow and centralized. And there is usually little benefit for a developer in reporting a security bug unless they need someone else to fix it.

And last but not least, the effort to remediate issues found in third-party software is much higher than the effort to identify them. Remediation requires running a series of unit and regression tests to ensure that everything still works as intended, and most of the time those tests either don’t exist or aren’t automated.

In a nutshell, SCA tools and techniques are here to stay, and their usage is increasing in organizations where security is a priority. AppSec teams can’t keep up with all the new vulnerabilities being published daily. Look for solutions that adapt to your own way of building software: they need to cover the programming languages used in your organization and identify indirect dependencies, rather than relying only on public CVEs to find issues.