February 16, 2022

Driving Developer Productivity via Automated Dependency Tracking

Ankhuri Dubey, Managing Director, CI/CD Engineering

With the proliferation of reusable libraries, managing library dependencies properly has become a challenge. Lack of transparency into the composition of software systems affects many tasks in the software development lifecycle (SDLC) that developers perform every day, from using external libraries to upgrading them across one or more software products. For teams and developers that vend software libraries, it is challenging to track usage and version spread across the software products that use these libraries. Manually gathering this information and keeping it up-to-date is error-prone, time-consuming, and not scalable. Because we do not know which libraries or dependencies are used where, our ability to track and assess the impact of derived software components that are buggy, stale, have critical vulnerabilities, or need to be retired entirely is limited. For the developer, library version upgrades are done manually, requiring tedious and coordinated changes. This issue compounds as software products need continuous refactoring to reflect an engineering team’s learnings from user feedback and operational issues, which often involves changes to software dependencies.

The Solution

The Continuous Integration/Continuous Delivery (CICD) Platform team at Goldman Sachs (GS) is focused on improving the day-to-day experience of developers via CICD platform solutions. To solve the dependency tracking challenge for developers, we launched support for dependency tracking with the help of Software Bill of Materials (SBOM) artifacts for internal, vendor, and open source software libraries. An SBOM is effectively a nested inventory: a list of the ingredients that make up software components. SBOMs provide answers to questions such as "What are my software component's dependencies?" and "Which software components depend on library X?". Our team uses CycloneDX as the SBOM implementation given its rich tooling across all the build tools and programming languages used within Goldman Sachs. SBOM manifests are extracted and published in the CICD pipeline using the build artifacts and build tools specific to each programming language (e.g. pom.xml, requirements.txt, package.json). The resulting dependency information is available to developers through a GitLab project badge. Now, a library provider can see which software products depend on their component, grouped by version. This dependency graph is captured and kept up-to-date automatically on a recurring basis, without any developer interaction or changes to individual software products, since SBOM generation is included in CI pipelines by default. Periodic SBOM generation is also executed across all our software products so that we also capture SBOMs for products that are not under active development. We currently have SBOMs captured for software products across Java (Maven and Gradle), Python (pip), JavaScript (npm), C#, and a few other languages and build tools.
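To make the two SBOM questions concrete, here is a minimal Python sketch, not the platform's actual implementation. It assumes each product publishes a CycloneDX JSON document whose `metadata.component` names the product and whose `components` array lists its libraries. The product names, library versions, and helper functions (`load_inventory`, `build_indexes`) are hypothetical and exist only for illustration.

```python
import json
from collections import defaultdict

# Two hand-written, hypothetical SBOMs in CycloneDX JSON form.
SBOM_DOCUMENTS = [
    """
    {
      "bomFormat": "CycloneDX",
      "specVersion": "1.4",
      "metadata": {"component": {"type": "application", "name": "payments-api", "version": "2.3.0"}},
      "components": [
        {"type": "library", "name": "log4j-core", "version": "2.14.1"},
        {"type": "library", "name": "guava", "version": "31.0-jre"}
      ]
    }
    """,
    """
    {
      "bomFormat": "CycloneDX",
      "specVersion": "1.4",
      "metadata": {"component": {"type": "application", "name": "risk-report", "version": "0.9.1"}},
      "components": [
        {"type": "library", "name": "log4j-core", "version": "2.17.1"}
      ]
    }
    """,
]


def load_inventory(sbom_text):
    """Return (product_name, {library_name: version}) for one SBOM document."""
    bom = json.loads(sbom_text)
    product = bom["metadata"]["component"]["name"]
    libraries = {c["name"]: c["version"] for c in bom.get("components", [])}
    return product, libraries


def build_indexes(sbom_texts):
    """Build a forward index (product -> libraries) and a reverse index (library -> products)."""
    dependencies = {}
    dependents = defaultdict(dict)
    for text in sbom_texts:
        product, libraries = load_inventory(text)
        dependencies[product] = libraries
        for name, version in libraries.items():
            dependents[name][product] = version
    return dependencies, dict(dependents)


dependencies, dependents = build_indexes(SBOM_DOCUMENTS)

# "What are my software component's dependencies?"
print(dependencies["payments-api"])

# "Which software components depend on library X?"
print(dependents["log4j-core"])
```

The reverse index is the part that manual tracking struggles with: it can only be built by reading every product's SBOM, which is why generating SBOMs by default in every pipeline matters.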

SBOM Usage

Developers have found dependency information captured via SBOMs useful in a variety of ways:

Dependents Lookup - Java (Maven)

In Q1 2021, a functional bug was discovered in a version of an internally produced library used by potentially hundreds of internal software products. By the time the issue was identified, the library had long been in use, and analyzing the blast radius was not a trivial task. Using SBOM dependency data together with the deployment profile of these applications (externally hosted or internet-facing applications are more sensitive than those hosted and used internally), the team was able to identify the list of impacted products within a minute and, using that list, completed an impact analysis in a matter of hours. This analysis would otherwise have been a multi-month exercise. The team used this information to reach out to more than 100 software product owners and request an upgrade to the patched version of the impacted library. The image below shows the version spread and number of dependents for each version of an internal library; all this data is available to engineers at the click of a button. More recently, in Q4 2021, when the log4j vulnerability became public, data captured via SBOMs was instrumental in identifying and remediating the impacted software products.
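A lookup of this kind can be sketched in a few lines of Python. The example below is illustrative only: the usage records, exposure labels, and affected version numbers are made up, and the real analysis also drew on deployment metadata that is not shown here. It groups dependents by library version (the version spread) and lists impacted products with internet-facing ones first.

```python
from collections import defaultdict

# Hypothetical records: (product, deployment exposure, version of the library in use).
USAGE = [
    ("payments-api", "internet-facing", "1.4.0"),
    ("risk-report", "internal", "1.4.0"),
    ("trade-blotter", "internal", "1.5.2"),
    ("client-portal", "internet-facing", "1.3.1"),
]

# Assumed set of library versions that contain the bug.
AFFECTED_VERSIONS = {"1.3.1", "1.4.0"}


def version_spread(usage):
    """Group dependent products by the library version they use."""
    spread = defaultdict(list)
    for product, _exposure, version in usage:
        spread[version].append(product)
    return dict(spread)


def impacted_products(usage, affected_versions):
    """Return impacted records, internet-facing products first, then by name."""
    hits = [record for record in usage if record[2] in affected_versions]
    return sorted(hits, key=lambda r: (r[1] != "internet-facing", r[0]))


for version, products in sorted(version_spread(USAGE).items()):
    print(version, len(products), products)

for product, exposure, version in impacted_products(USAGE, AFFECTED_VERSIONS):
    print(f"{product} ({exposure}) uses affected version {version}")
```

Sorting by exposure first reflects the prioritization described above: the same bug is more urgent in an internet-facing product than in an internal one.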

Dependencies Lookup - Java (Maven)

Another engineering team was moving their build pipelines from on-premise to the cloud. The on-premise and cloud build ecosystems each have their own artifact repositories, and the dependencies available in these repositories might not be the same. This team used the dependency information captured in SBOMs to identify all the missing (direct and transitive) dependencies in the on-premise artifact repository, and added them prior to their migration, saving developers many hours of analysis.
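The check described here amounts to a graph walk followed by a set difference. The Python sketch below assumes the dependency graph has already been read from the `dependencies` section of a CycloneDX SBOM (each entry has a `ref` and a `dependsOn` list) into a plain dictionary. The package URLs, the repository contents, and the helper `transitive_dependencies` are hypothetical.

```python
# Hypothetical dependency graph: ref -> list of refs it depends on.
GRAPH = {
    "pkg:maven/com.example/app@1.0.0": [
        "pkg:maven/com.example/core@2.1.0",
        "pkg:maven/org.slf4j/slf4j-api@1.7.32",
    ],
    "pkg:maven/com.example/core@2.1.0": [
        "pkg:maven/com.google.guava/guava@31.0-jre",
    ],
}

# Hypothetical list of artifacts already present in the repository being checked.
AVAILABLE_IN_REPOSITORY = {
    "pkg:maven/com.example/core@2.1.0",
    "pkg:maven/org.slf4j/slf4j-api@1.7.32",
}


def transitive_dependencies(graph, root):
    """Collect every direct and transitive dependency of root (root itself excluded)."""
    seen = set()
    stack = list(graph.get(root, []))
    while stack:
        ref = stack.pop()
        if ref in seen:
            continue  # already visited; also guards against cycles
        seen.add(ref)
        stack.extend(graph.get(ref, []))
    return seen


required = transitive_dependencies(GRAPH, "pkg:maven/com.example/app@1.0.0")
missing = sorted(required - AVAILABLE_IN_REPOSITORY)
print(missing)
```

Here the only missing artifact is a transitive one (guava, pulled in through core), which is exactly the kind of gap that is easy to overlook when only direct dependencies are reviewed.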

Future Roadmap

The dependency graph is critical for another vital aspect of modern software development: continuous refactoring. Code needs to be kept up-to-date as it accumulates debt, ends up using deprecated classes or APIs, or introduces risks due to incomplete or invalid API usage. Equally, the owners of an internal library might want to push a critical bug fix or remediate a vulnerability associated with a specific version, and that change needs to be applied across all dependent software products. In the past, this was typically done by sending emails and chasing individual teams to fix their code. To help developers address this challenge and provide a mechanism to execute improvements across products, the CICD Platform team will be introducing support for automated code refactoring for dependency version upgrades and configuration artifacts (for pre-defined and widely used files such as log4j.properties) in 2022. This feature is similar to the Dependabot feature provided by GitHub. Comprehensive automated test coverage improves the efficacy of these automated code changes and increases the confidence of developers who are accepting the merge requests. Below is a sample automated merge request created for a version upgrade of an internally used library.
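As a rough illustration of the file-rewriting step behind an automated version upgrade, the Python sketch below bumps an exactly pinned version in a pip requirements.txt file. It is an assumption-laden example, not the planned feature: the function name `bump_pinned_version` and the sample file contents are hypothetical, only `package==version` pins are handled, and creating the branch and merge request is out of scope.

```python
import re


def bump_pinned_version(requirements_text, package, new_version):
    """Rewrite `package==<old>` to `package==<new_version>`.

    Returns (updated_text, changed). Only exact `==` pins are handled;
    trailing comments and environment markers on the line are preserved.
    """
    pattern = re.compile(
        rf"^(?P<name>{re.escape(package)})==(?P<old>[^\s;#]+)", re.IGNORECASE
    )
    changed = False
    updated_lines = []
    for line in requirements_text.splitlines(keepends=True):
        match = pattern.match(line)
        if match and match.group("old") != new_version:
            # Keep the package name as written and everything after the old version.
            line = f"{match.group('name')}=={new_version}{line[match.end():]}"
            changed = True
        updated_lines.append(line)
    return "".join(updated_lines), changed


# Hypothetical requirements file for a dependent product.
SAMPLE = "flask==2.0.1\nrequests==2.25.1  # pinned\nrequests-toolbelt==0.9.1\n"

updated, changed = bump_pinned_version(SAMPLE, "requests", "2.26.0")
print(updated)
print("changed:", changed)
```

Returning a `changed` flag lets a caller skip products that are already on the target version, so merge requests are raised only where the file actually differs.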

We hope you found this blog post informative! If you would like to learn more about exciting opportunities at Goldman Sachs, we invite you to explore our careers page.


See https://www.gs.com/disclaimer/global_email for important risk disclosures, conflicts of interest, and other terms and conditions relating to this blog and your reliance on information contained in it.

© 2024 Goldman Sachs. All rights reserved.