Software composition analysis
{{Use dmy dates|date=February 2023}}
{{Short description|Software Composition Analysis}}
Software composition analysis (SCA) is a practice in the fields of information technology and software engineering for analyzing custom-built software applications to detect embedded open-source software and determine whether those components are up to date, contain security flaws, or carry licensing requirements.
{{Cite journal
|last1=Prana|first1=Gede Artha Azriadi
|last2=Sharma|first2=Abhishek
|last3=Shar|first3=Lwin Khin
|last4=Foo|first4=Darius
|last5=Santosa|first5=Andrew E
|last6=Sharma|first6=Asankhaya
|last7=Lo|first7=David
|date=July 2021
|title= Out of sight, out of mind? How vulnerable dependencies affect open-source projects
|journal=Empirical Software Engineering
|volume=26
|issue=4
|pages=1–34
|publisher=Springer
|doi=10.1007/s10664-021-09959-3
|s2cid=197679660
|url=https://ink.library.smu.edu.sg/sis_research/6048
}}
Background
It is a common software engineering practice to develop software by using different components.
{{Cite journal
|last1=Nierstrasz|first1=Oscar
|last2=Meijler|first2=Theo Dirk
|date=1995
|title= Research directions in software composition
|journal=ACM Computing Surveys
|volume=27
|issue=2
|pages=262–264
|publisher=ACM
|doi=10.1145/210376.210389
|s2cid=17612128
|doi-access=free
}} Using software components segments the complexity of larger elements into smaller pieces of code and increases flexibility by enabling easier reuse of components to address new requirements.
{{Cite book
|last1=Nierstrasz|first1=Oscar
|last2=Dami|first2=Laurent
|date=January 1995
|title= Object-oriented software composition
|pages=3–28
|publisher=Prentice Hall International
|citeseerx=10.1.1.90.8174
}} The practice has widely expanded since the late 1990s with the popularization of open-source software (OSS) to help speed up the software development process and reduce time to market.
{{Cite journal
|last1=De Hoon|first1=Michiel JL
|last2=Imoto|first2=Seiya
|last3=Nolan|first3=John
|last4=Miyano|first4=Satoru
|date=February 2004
|title= Open source clustering software
|journal=Bioinformatics
|volume=20
|issue=9
|pages=1453–1454
|doi=10.1093/bioinformatics/bth078
|bibcode=2004Bioin..20.1453D
|citeseerx=10.1.1.114.3335
}}
However, using open-source software introduces risks for the software applications being developed. These risks can be organized into five categories:
{{Cite book
|last1=Duc Linh|first1=Nguyen
|last2=Duy Hung|first2=Phan
|last3=Dipe|first3=Vu Thu
|title=Proceedings of the 2019 8th International Conference on Software and Computer Applications
|chapter=Risk Management in Projects Based on Open-Source Software
|date=2019
|pages= 178–183
|doi=10.1145/3316615.3316648
|isbn=9781450365734
|s2cid=153314145
|chapter-url=https://dl.acm.org/doi/pdf/10.1145/3316615.3316648
}}
- OSS version control: risks of changes introduced by new versions
- Security: risks of vulnerabilities in components, tracked as Common Vulnerabilities and Exposures (CVEs)
- License: risks of intellectual property (IP) legal requirements
- Development: risks of incompatibility between the existing codebase and open-source software
- Support: risks of poor documentation and obsolete software components
Shortly after the foundation of the Open Source Initiative in February 1998,{{cite web |url=http://opensource.org/history |title=History of the OSI |date=19 September 2006 | publisher=Opensource.org}} the risks associated with OSS were raised
{{Cite journal
|last1=Payne|first1=Christian
|date=2002
|title= On the security of open source software
|journal=Information Systems Journal
|volume=12
|pages= 61–78
|doi=10.1046/j.1365-2575.2002.00118.x
|s2cid=8123076
|url=https://flosshub.org/sites/flosshub.org/files/Payne2002_ISJ12_SecurityOSS.pdf
}} and organizations tried to manage them by tracking the open-source components used by their developers in spreadsheets and documents.
{{Cite journal
|last1=Kaur|first1=Sumandeep
|date=April 2020
|title= Security Issues in Open-Source Software
|journal=International Journal of Computer Science & Communication
|pages=47–51
|url=http://csjournals.com/IJCSC/PDF11-2/8.%20Suman.pdf
}}
For organizations using open-source components extensively, there was a need to automate the analysis and management of open-source risk. This resulted in a new category of software products called software composition analysis (SCA).
SCA strives to detect all the third-party components in use within a software application to help reduce risks associated with security vulnerabilities, IP licensing requirements, and obsolescence of the components being used.
Principle of operation
SCA products typically work as follows:
{{Cite journal
|last1=Ombredanne|first1=Philippe
|date=October 2020
|title= Free and Open Source Software License Compliance: Tools for Software Composition Analysis
|journal=Computer
|volume=53
|issue=10
|pages=262–264
|doi=10.1109/MC.2020.3011082
|s2cid=222232127
|doi-access=free
}}
- An engine scans the software source code and the associated artifacts used to compile the software application.
- The engine identifies the OSS components and their versions, and usually stores this information in a database, creating a catalog of the OSS in use in the scanned application.
- This catalog is then compared to databases referencing known security vulnerabilities for each component, the licensing requirements for using the component, and the historical versions of the component.{{citation needed| reason=reference was to blog|date=January 2024}} For security vulnerability detection, this comparison is typically made against known security vulnerabilities (CVEs) that are tracked in the National Vulnerability Database (NVD). Some products use an additional proprietary database of vulnerabilities. For IP / Legal Compliance, SCA products will extract and evaluate the type of licensing used for the OSS component.
{{Cite book
|last1=Duan|first1=Ruian
|last2=Bijlani|first2=Ashish
|last3=Xu|first3=Meng
|last4=Kim|first4=Taesoo
|last5=Lee|first5=Wenke
|title=Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
|chapter=Identifying Open-Source License Violation and 1-day Security Risk at Large Scale
|date=2017
|pages=2169–2185
|publisher=ACM
|doi=10.1145/3133956.3134048
|isbn=9781450349468
|s2cid=7402387
|chapter-url=https://dl.acm.org/doi/pdf/10.1145/3133956.3134048
}} Versions of components are extracted from popular open-source repositories such as GitHub, Maven, PyPI, NuGet, and many others.
- The results are then made available to end users in different digital formats. The content and format depend on the SCA product and may include guidance for evaluating and interpreting the risk, as well as recommendations, especially concerning the legal requirements of open-source components such as strong or weak copyleft licensing. The output may also contain a software bill of materials (SBOM) detailing all the open-source components used in a software application and their associated attributes.
{{Cite journal
|last1=Arora|first1=Arushi
|last2=Wright|first2=Virginia
|last3=Garman|first3=Christina
|date=2022
|title= Strengthening the Security of Operational Technology: Understanding Contemporary Bill of Materials
|journal=Journal of Critical Infrastructure Policy
|volume=3
|pages=111–135
|doi=10.18278/jcip.3.1.8
|url=https://www.jcip1.org/uploads/1/3/6/5/136597491/jcip_3.1_online.pdf#page=117
}}
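The steps listed above can be illustrated with a minimal sketch. It is not the implementation of any particular SCA product: the `name==version` manifest format, the component names, the vulnerability identifier and the two in-memory lookup tables are all hypothetical stand-ins for the scanning engine and the external databases described in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: str


# Hypothetical stand-ins for the databases an SCA product consults.
# The identifiers and licenses below are invented for illustration.
KNOWN_VULNERABILITIES = {
    Component("examplelib", "1.0.0"): ["CVE-0000-0001"],
}
LICENSES = {
    "examplelib": "GPL-3.0-only",
    "otherlib": "MIT",
}


def scan_manifest(text: str) -> list[Component]:
    """Scan a 'name==version' manifest and build the component catalog."""
    catalog = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue  # this sketch skips blank and unpinned entries
        name, version = line.split("==", 1)
        catalog.append(Component(name.strip().lower(), version.strip()))
    return catalog


def analyze(catalog: list[Component]) -> list[dict]:
    """Compare each cataloged component with the vulnerability and license data."""
    return [
        {
            "name": component.name,
            "version": component.version,
            "vulnerabilities": KNOWN_VULNERABILITIES.get(component, []),
            "license": LICENSES.get(component.name, "unknown"),
        }
        for component in catalog
    ]


manifest = """
examplelib==1.0.0
otherlib==2.3.1  # pinned
"""
report = analyze(scan_manifest(manifest))
for entry in report:
    print(entry)
```

Real products identify components from many more artifact types than a single manifest, and match versions against ranges rather than exact values; the sketch only shows the catalog-then-compare structure.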
Usage
As SCA impacts different functions in organizations, different teams may use the data depending on the organization's size and structure. The IT department will often use SCA for implementing and operationalizing the technology, with common stakeholders including the chief information officer (CIO), the chief technology officer (CTO), and the chief enterprise architect (EA).{{cite web| title=Software bill of materials: Managing software cybersecurity risks| author1=Bailey, T.| author2=Greis, J.| author3=Watters, M.| author4=Welle, J.| url=https://www.mckinsey.com/capabilities/risk-and-resilience/our-insights/cybersecurity/software-bill-of-materials-managing-software-cybersecurity-risks| publisher=McKinsey & Company| date=19 September 2022| access-date=6 January 2024}} Security and license data are often used by roles such as the chief information security officer (CISO) for security risks, and the chief IP or compliance officer for intellectual property risk management.{{cite book |last=Popp |first=Karl Michael |author-link= |date= 30 October 2019|title= Best Practices for commercial use of open source software|url= https://books.google.com/books?id=w1a6DwAAQBAJ |publisher=BoD – Books on Demand, 2019 |page=10 |isbn=9783750403093}}
Depending on the product's capabilities, SCA can be implemented directly within the integrated development environment (IDE) of a developer who uses and integrates OSS components, or as a dedicated step in the software quality control process.
{{Cite book
|last1= Imtiaz|first1=Nasif
|last2=Thorn|first2=Seaver
|last3=Williams|first3=Laurie
|title=Proceedings of the 15th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
|chapter=A comparative study of vulnerability reporting by software composition analysis tools
|date=October 2021
|pages=1–11
|publisher=ACM
|doi=10.1145/3475716.3475769
|arxiv=2108.12078
|isbn=9781450386654
|s2cid=237346987
|chapter-url=https://dl.acm.org/doi/abs/10.1145/3475716.3475769
}}
{{Cite book
|last1=Sun|first1=Xiaohan
|last2=Cheng|first2=Yunchang
|last3=Qu|first3=Xiaojie
|last4=Li|first4=Hang
|title=2021 IEEE 4th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC)
|chapter=Design and Implementation of Security Test Pipeline based on DevSecOps
|date=June 2021
|volume=4
|pages=532–535
|publisher=IEEE
|doi=10.1109/IMCEC51613.2021.9482270
|isbn=978-1-7281-8535-4
|s2cid=236193144
|chapter-url=https://ieeexplore.ieee.org/document/9482270
}}
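When SCA runs as a dedicated quality control step, its findings are commonly turned into a pass or fail decision. The following sketch assumes a hypothetical findings format (a list of dictionaries with `component`, `id` and `severity` keys) and a four-level severity scale; neither is taken from a specific product.

```python
# Hypothetical severity scale, ordered from least to most severe.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def gate(findings, threshold="high"):
    """Return the findings whose severity is at or above the threshold."""
    limit = SEVERITY_ORDER[threshold]
    return [f for f in findings if SEVERITY_ORDER[f["severity"]] >= limit]


def main(findings, threshold="high") -> int:
    """Print blocking findings and return a process-style exit status."""
    blocking = gate(findings, threshold)
    for f in blocking:
        print(f"BLOCKING: {f['component']} {f['id']} ({f['severity']})")
    return 1 if blocking else 0


# Invented sample findings for illustration.
findings = [
    {"component": "examplelib 1.0.0", "id": "EXAMPLE-1", "severity": "critical"},
    {"component": "otherlib 2.3.1", "id": "EXAMPLE-2", "severity": "low"},
]
exit_code = main(findings)
print("exit status:", exit_code)
```

In a build pipeline, the returned value would typically be passed to `sys.exit` so that a non-zero status stops the build.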
SBOMs, which SCA products can generate, are required in some countries, such as the United States, to help enforce the security of software delivered by vendors to government agencies.{{cite journal| title=Software Bill of Materials Elements and Considerations| url=https://www.federalregister.gov/documents/2021/06/02/2021-11592/software-bill-of-materials-elements-and-considerations| journal=Federal Register| date=2 June 2021| access-date=6 January 2024}}
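As an illustration of what such an SBOM can contain, the sketch below serializes a component catalog to JSON. The layout is loosely modelled on the CycloneDX JSON format and uses package URLs (purl) as identifiers, but the chosen field set is an assumption and is not validated against any schema; the component data and the `build_sbom` helper are hypothetical.

```python
import json


def build_sbom(components):
    """Build a minimal SBOM-like document from a list of component records."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": c["name"],
                "version": c["version"],
                # Package URL of the form pkg:<ecosystem>/<name>@<version>
                "purl": f"pkg:{c['ecosystem']}/{c['name']}@{c['version']}",
                "licenses": [{"license": {"id": c["license"]}}],
            }
            for c in components
        ],
    }


# Invented catalog entries for illustration.
catalog = [
    {"ecosystem": "pypi", "name": "examplelib", "version": "1.0.0", "license": "MIT"},
    {"ecosystem": "npm", "name": "otherlib", "version": "2.3.1", "license": "Apache-2.0"},
]
sbom = build_sbom(catalog)
print(json.dumps(sbom, indent=2))
```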
Another common use case for SCA is technology due diligence. Prior to a merger and acquisition (M&A) transaction, advisory firms review the risks associated with the software of the target firm.
{{Cite book
|last1=Serafini|first1=Daniele
|last2=Zacchiroli|first2=Stefano
|title=The 18th International Symposium on Open Collaboration
|chapter=Efficient Prior Publication Identification for Open Source Code
|date=September 2022
|volume=4
|pages=1–8
|publisher=ACM
|doi=10.1145/3555051.3555068
|arxiv=2207.11057
|isbn=9781450398459
|s2cid=251018650
|chapter-url=https://dl.acm.org/doi/abs/10.1145/3555051.3555068
}}
Strengths
The automatic nature of SCA products is their primary strength. Developers do not have to perform extra manual work when using and integrating OSS components.
{{Cite book
|last1=Chen|first1=Yang
|last2=Santosa|first2=Andrew E
|last3=Sharma|first3=Asankhaya
|last4=Lo|first4=David
|title=Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Practice
|chapter=Automated identification of libraries from vulnerability data
|date=September 2020
|pages=90–99
|doi=10.1145/3377813.3381360
|isbn=9781450371230
|s2cid=211167417
|url=https://ink.library.smu.edu.sg/sis_research/5501
|chapter-url=https://dl.acm.org/doi/pdf/10.1145/3377813.3381360
}} The automation also applies to indirect references to other OSS components within code and artifacts.
{{Cite book
|last1=Kengo Oka|first1=Dennis
|chapter= Software Composition Analysis in the Automotive Industry
|title=Building Secure Cars
|date=2021
|pages=91–110
|publisher=Wiley
|doi=10.1002/9781119710783.ch6
|isbn=9781119710783
|s2cid=233582862
|url=https://ieeexplore.ieee.org/document/9821841
}}
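Detecting indirect references amounts to walking a dependency graph. The sketch below uses a breadth-first traversal over a hypothetical, hand-written graph; the component names are invented, and real tools derive the graph from package metadata or build artifacts rather than from a literal dictionary.

```python
from collections import deque

# Hypothetical dependency graph: each key lists the components it uses directly.
DEPENDENCIES = {
    "app": ["web-framework", "json-parser"],
    "web-framework": ["http-client", "template-engine"],
    "http-client": ["tls-lib"],
    "json-parser": [],
    "template-engine": [],
    "tls-lib": [],
}


def transitive_dependencies(root, graph):
    """Return every component reachable from root, mapped to its depth."""
    depth = {}
    queue = deque((dep, 1) for dep in graph.get(root, []))
    while queue:
        name, level = queue.popleft()
        if name in depth:
            continue  # already reached by a path of equal or shorter length
        depth[name] = level
        queue.extend((dep, level + 1) for dep in graph.get(name, []))
    return depth


found = transitive_dependencies("app", DEPENDENCIES)
# Components at depth 1 are declared directly; deeper ones are indirect.
indirect = sorted(name for name, level in found.items() if level > 1)
print(indirect)
```

Here `tls-lib` is never declared by the application itself, yet the traversal reports it because it is reached through `web-framework` and `http-client`.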
Weaknesses
Conversely, current SCA products may have some key weaknesses:
{{Cite book
|last1=Rajapakse|first1=Roshan Namal
|last2=Zahedi|first2=Mansooreh
|last3=Babar|first3=Muhammad Ali
|title=Proceedings of the 15th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
|chapter=An Empirical Analysis of Practitioners' Perspectives on Security Tool Integration into DevOps
|date=2021
|pages=1–12
|doi=10.1145/3475716.3475776
|arxiv=2107.02096
|isbn=9781450386654
|s2cid=235731939
|chapter-url=https://dl.acm.org/doi/pdf/10.1145/3475716.3475776
}}
- Each product uses its own proprietary database of OSS components, and these databases can vary dramatically in size and coverage.
{{Cite book
|last1=Imtiaz|first1=Nasif
|last2=Thorn|first2=Seaver
|last3=Williams|first3=Laurie
|title=Proceedings of the 15th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
|chapter=A comparative study of vulnerability reporting by software composition analysis tools
|date=2021
|pages=1–11
|doi=10.1145/3475716.3475769
|arxiv=2108.12078
|isbn=9781450386654
|s2cid=237346987
|chapter-url=https://dl.acm.org/doi/pdf/10.1145/3475716.3475769
}}
- Vulnerability data may be limited to vulnerabilities officially reported in the NVD, which can be months after a vulnerability was originally discovered.{{Cite web|url=https://owasp.org/www-community/Component_Analysis|title=Component Analysis|website=owasp.org}}
- Lack of automated guidance on actions to take based on SCA reports and data
{{Cite book
|last1=Foo|first1=Darius
|last2=Chua|first2=Hendy
|last3=Yeo|first3=Jason
|last4=Ang|first4=Ming Yi
|last5=Sharma|first5=Asankhaya
|title=Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
|chapter=Efficient static checking of library updates
|date=2018
|pages=791–796
|doi=10.1145/3236024.3275535
|isbn=9781450355735
|s2cid=53079466
|chapter-url=https://dl.acm.org/doi/pdf/10.1145/3236024.3275535
}}
{{Cite web
|last1=Millar|first1=Stuart
|date=November 2017
|title= Vulnerability Detection in Open Source Software: The Cure and the Cause
|publisher=Queen's University Belfast
|url=https://pureadmin.qub.ac.uk/ws/portalfiles/portal/128394396/SMillar_13616005_VulnerabilityDetectionInOSS.pdf
}}