November 19, 2021

Builds Today, Breaks Tomorrow: The Mystery of the Disappearing NPM Dependency

Bella Wiseman, Lead Engineer, Open Source Program Office

This blog post is a recap of a talk I gave at the Open Source Strategy Forum in November 2021. You can watch the video replay here.


The Mystery

Several teams are reporting sudden unexpected JavaScript build failures. The first clue: all these teams use GatsbyJS, an open source JavaScript framework. The second clue: an error message that states that smartwrap-1.2.5 is unavailable for download due to a GPL 2.0 license.

Despite all these clues, mysteries linger: What is smartwrap, and why are we depending on it? Why did the build suddenly fail now, when it succeeded an hour ago? How could we fix this before today's release? What is GPL, and why are the artifacts licensed under this license not permitted for use?

First - a note on GPL that is a simplification, but will suffice for this blog post. (If you have a real-life legal question, seek answers from a lawyer, not a blog post. :) ) Broadly speaking, there are two categories of open source licenses: permissive and copyleft. Permissive open source licenses, like Apache 2.0 and MIT, allow use of the open source software with very few obligations. By contrast, copyleft licenses, such as GPL, require "derivative" software to distribute its source code to its users (i.e. to open source it). Many companies do not permit GPL dependencies in internally developed software because of the logistical complexity of ensuring that all the derivative source code is promptly open sourced under a GPL-compatible license.

Now, on to the build failures! Below is the dependency graph we found when debugging the issue earlier this year. This deeply nested GPL dependency was identified during the build, and prevented the dependent products from being built.

While the dependency graph looks straightforward, teams were nevertheless mystified. They were not aware that they depended on smartwrap, and had never encountered a build failure for it before. No code had changed since the last build, and the open source and SDLC teams confirmed that the GPL policy had not changed either. So what had happened?

To solve the mystery, we first need some background on versioning in npm. In npm, dependencies are specified using semantic versioning in a package.json file. In semantic versioning, the version is specified as x.y.z, with x being the major version, y the minor version, and z the patch version. In the npm world, it is common practice to allow ranges of versions. For example, ^1.0.0 means "1.0.0 or any later minor or patch version with the same major version" (e.g. 1.0.1, 1.1.0, 1.2.0... or 1.13.9, but not 2.0.0). This is so common that ^x.y.z is actually the default range added to your package.json when you run npm install --save package.

While this is great for keeping your dependencies up-to-date, if handled incorrectly, it can lead to builds that are not reproducible over time. You run the risk of pulling in new versions of dependencies without warning and at any time. Even if you choose to use exact versions in your package.json, you are still impacted by this behavior, as your dependencies almost certainly use version ranges, and the exact version of your transitive dependencies will be ambiguous.
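The caret rule, and the way it lets the same package.json resolve differently over time, can be sketched in a few lines of JavaScript. This is a simplified illustration and not npm's actual resolver: it only understands plain x.y.z versions and caret ranges, the function names (parseVersion, compareVersions, satisfiesCaret, highestMatch) are my own, and the version lists are made up.

```javascript
// Simplified sketch: plain "x.y.z" versions only (no pre-release tags,
// no build metadata, no range operators other than "^").
function parseVersion(text) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (!match) throw new Error(`not a plain x.y.z version: ${text}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Returns -1, 0 or 1, comparing major, then minor, then patch.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// True if `version` falls inside the caret range, e.g. "^1.0.0".
// A caret range allows changes that do not modify the left-most
// non-zero number of the base version.
function satisfiesCaret(version, range) {
  if (!range.startsWith("^")) throw new Error(`not a caret range: ${range}`);
  const base = parseVersion(range.slice(1));
  const v = parseVersion(version);
  if (compareVersions(v, base) < 0) return false;
  if (base[0] > 0) return v[0] === base[0];
  if (base[1] > 0) return v[0] === 0 && v[1] === base[1];
  return v[0] === 0 && v[1] === 0 && v[2] === base[2];
}

// Picks the highest published version inside the range, or null if none match.
function highestMatch(published, range) {
  const matches = published.filter((v) => satisfiesCaret(v, range));
  matches.sort((a, b) => compareVersions(parseVersion(a), parseVersion(b)));
  return matches.length > 0 ? matches[matches.length - 1] : null;
}

// Hypothetical registry contents at two points in time.
const lastWeek = ["1.0.0", "1.1.0", "1.2.0"];
const today = [...lastWeek, "1.13.9", "2.0.0"];

console.log(highestMatch(lastWeek, "^1.0.0")); // 1.2.0
console.log(highestMatch(today, "^1.0.0")); // 1.13.9
```

Nothing in the project changed between the two calls. Only the list of published versions did, and that was enough to change which version gets picked.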

As we discovered during our investigation, a classic example is... Gatsby! Specifically, when we looked at the package.json for gatsby-recipes, we found this:

"dependencies": {
    "@babel/core": "^7.12.3",
    "@babel/generator": "^7.12.5",
    "@babel/helper-plugin-utils": "^7.10.4",
     ...
    "@babel/types": "^7.12.6",
    "@graphql-tools/schema": "^7.0.0",   <-----
    "@graphql-tools/utils": "^7.0.2",
    "@hapi/hoek": "8.x.x",
    "@hapi/joi": "^15.1.1",
    "better-queue": "^3.8.10",
    ...

In the above incident, we dug down, and found that graphql-tools/schema had released a patch version which added a dependency on a new package, value-or-promise. Since graphql-tools/schema and value-or-promise were both licensed under the permissive MIT license, adding this dependency appeared at first glance to be a trivial change. However, deeply nested within the dependency tree of value-or-promise were several GPL open source projects, including smartwrap, which are blocked for use within the company. Thus, in the course of a couple of hours, several teams saw their builds start to fail, though not a single line of their code had changed.

Mystery solved! Just yesterday, Gatsby did not rely on smartwrap. When the build failures started happening, Gatsby did rely on smartwrap. No change occurred in Gatsby or smartwrap - the only change was that graphql-tools/schema added a new dependency on value-or-promise, which pulled in smartwrap as a transitive dependency. And thus concludes our mystery. The end.

Just kidding! You are most likely not a detective. And while solving mysteries is fun, you likely still have some more questions that will now keep you up at night as a software engineer: How can I stop this from happening to my team? How can I make sure it doesn’t happen right before the big release tomorrow? How did you fix your build? And if you're an open source maintainer: how can I make sure changes to my open source project don’t have unintended impact?

Don't Be Taken By Surprise

Let's start with prevention. While this dependency would come in scope eventually on version upgrades, most teams would not want it to randomly fail their build at an unplanned time when they may be under deadline for a release. Nor would they want the build failing on the server, but succeeding on their local machine. To eliminate the element of surprise, you will want to follow some best practices in your JavaScript projects.

Since npm 5, npm by default creates a package-lock.json on the first run of a build in an environment. This file contains the exact version of each dependency in your tree. Subsequent builds will download and use the exact versions of the dependencies in package-lock.json, even if a newer dependency version becomes available. Thus, builds will be reproducible, as long as the package-lock.json is not modified. Conversely, if there is no package-lock.json in the project directory, npm will download or reuse any version of each dependency that is compatible with the version range; the exact version used can change depending on when you run the build, and what packages are already installed. (Please note: this greatly oversimplifies the algorithm that npm uses to determine the version of a package to use.)

However, generating the lock file is not enough. To ensure reproducible builds in your build pipeline, you must commit your package-lock.json to source control. While many auto-generated files should not be committed to source control, package-lock.json is an exception. The clear consensus and best practice in the Node community is to commit this file. Once you do, your builds will be reproducible, and you will be in control of when to update your dependencies. This guidance applies to other lock files as well, like pnpm-lock.yaml, yarn.lock and npm's older npm-shrinkwrap.json.

Checking in your lock file will only work if it doesn't get overwritten during your build. To ensure this won't happen, make sure that your build pipeline runs npm ci rather than npm install. This provides a number of additional benefits, including faster builds. If you follow these best practices, every change to your dependency graph will have a corresponding change in your lock file, making dependency changes significantly easier to detect and debug.
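To illustrate how a committed lock file makes dependency changes visible, here is a minimal sketch that compares two parsed lock files and reports what was added, removed, or changed. It assumes the lock file layout written by npm 7 and later (a top-level "packages" object keyed by paths such as node_modules/name); the function name diffLockfiles and the package names and versions in the sample data are hypothetical.

```javascript
// Sketch: compare the "packages" maps of two parsed package-lock.json
// objects. Assumes the npm 7+ layout (lockfileVersion 2 or 3).
function diffLockfiles(before, after) {
  const oldPkgs = before.packages || {};
  const newPkgs = after.packages || {};
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const added = [];
  const removed = [];
  const changed = [];

  for (const [path, entry] of Object.entries(newPkgs)) {
    if (path === "") continue; // the "" key is the root project itself
    if (!has(oldPkgs, path)) {
      added.push(`${path}@${entry.version}`);
    } else if (oldPkgs[path].version !== entry.version) {
      changed.push(`${path}: ${oldPkgs[path].version} -> ${entry.version}`);
    }
  }
  for (const [path, entry] of Object.entries(oldPkgs)) {
    if (path !== "" && !has(newPkgs, path)) {
      removed.push(`${path}@${entry.version}`);
    }
  }
  return { added, removed, changed };
}

// Hypothetical lock file contents before and after a dependency update.
const lockBefore = {
  lockfileVersion: 2,
  packages: {
    "": { name: "my-site" },
    "node_modules/some-lib": { version: "7.1.3" },
  },
};
const lockAfter = {
  lockfileVersion: 2,
  packages: {
    "": { name: "my-site" },
    "node_modules/some-lib": { version: "7.1.4" },
    "node_modules/new-helper": { version: "1.0.0" },
  },
};

console.log(diffLockfiles(lockBefore, lockAfter));
```

In day-to-day work, the diff your source control shows for package-lock.json gives you the same information; the point is that a newly introduced transitive dependency shows up as an explicit, reviewable change.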

Quick - Fix the Build!

In the short-term, you can remove your dependency on the package with the license that you are unable to use by pinning to the last known good version of the intermediate dependency (in this case, version 7.1.3 of graphql-tools/schema, the version before value-or-promise was introduced). This is not a long-term solution, as you will be permanently pinned to an old version of a package, which is problematic for security and many other reasons. However, it can save the day in the short-term and is a useful technique to know about.

If you use vanilla npm, there are tools that will edit package-lock.json before each install, providing a quick way to override dependency versions. If you use a monorepo management tool, you may have more options. For example, in Rush, one can override dependency versions globally by following these instructions. The benefits of monorepos shine through here, since you can override versions in one central location, rather than doing it for each package (which would take a while, and delay that release you're trying to get out).
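As one possible way to express such a pin, newer npm releases (8.3 and later, which came out after the incident described here) support an "overrides" field in package.json. The sketch below adds an override to a parsed package.json object. The helper name withOverride and the sample project are hypothetical; only the pinned package and version come from the incident above.

```javascript
// Sketch: return a copy of a parsed package.json with one more entry in
// its "overrides" field (supported by npm 8.3 and later). The input
// object is not modified.
function withOverride(packageJson, name, version) {
  return {
    ...packageJson,
    overrides: { ...(packageJson.overrides || {}), [name]: version },
  };
}

// Hypothetical project; the gatsby range is a placeholder.
const original = {
  name: "my-site",
  dependencies: { gatsby: "^3.0.0" },
};

// Pin the intermediate dependency to the last known good version.
const pinned = withOverride(original, "@graphql-tools/schema", "7.1.3");
console.log(JSON.stringify(pinned, null, 2));
```

After writing a change like this back to package.json, you would still need to run npm install once to regenerate the lock file and commit the result. Yarn offers a similar field called "resolutions".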

Fix the Underlying Issue

In many cases, the maintainer does not have full insight into the impact that a small pull request can have on their users. Reaching out to the maintainer to collaborate on solving your issue is always an ideal solution. In this case, an issue was raised in value-or-promise, noting that the intermediate dependency changesets/cli (see below) was incorrectly marked as a "dependency" when it should have been a "devDependency". The maintainer fixed the issue within 2 days, and now value-or-promise (and Gatsby) no longer pull in any GPL transitive dependencies. In the graph below from https://www.npmjs.com/package/smartwrap, you'll note the spike, and subsequent decline, in usage of smartwrap, as the dependency was introduced, and then removed.

Maintaining Open Source Packages

If you maintain an open source package that is licensed under a permissive license (e.g. Apache or MIT), and you want it to be widely used by commercial enterprises, here are a few things to be aware of. First, adding a new dependency can be a breaking change, particularly if it pulls in a package that has a less permissive license (e.g. GPL). To ensure you are not pulling in licenses that are less permissive, you need to traverse the entire dependency graph. Luckily, deps.dev from Google can help you analyze the dependency graph. This is something you may want to consider automating as well, so that new dependencies are scanned for each new pull request.
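To give a feel for what traversing the entire dependency graph means in practice, here is a minimal sketch of such a scan. It is not how deps.dev or any particular scanner works: the graph is a plain in-memory object, the function name findBlockedPaths is my own, and the package names and licenses in the sample graph are invented for illustration.

```javascript
// Sketch: depth-first walk over a dependency graph, reporting the path
// from the root to every package whose license is on a blocked list.
// `graph` maps a package name to { license, dependencies }.
function findBlockedPaths(graph, root, blockedLicenses) {
  const blocked = new Set(blockedLicenses);
  const visited = new Set(); // guards against cycles and repeated work
  const results = [];

  function visit(name, path) {
    if (visited.has(name)) return;
    visited.add(name);
    const node = graph[name];
    if (!node) return; // package missing from the graph: nothing to check
    const here = [...path, name];
    if (blocked.has(node.license)) results.push(here.join(" > "));
    for (const dep of node.dependencies || []) visit(dep, here);
  }

  visit(root, []);
  return results;
}

// Hypothetical graph: every direct dependency looks permissive, but a
// copyleft package sits four levels down.
const sampleGraph = {
  "my-app": { license: "MIT", dependencies: ["web-framework"] },
  "web-framework": { license: "MIT", dependencies: ["promise-helper"] },
  "promise-helper": { license: "MIT", dependencies: ["release-cli"] },
  "release-cli": { license: "MIT", dependencies: ["text-wrapper"] },
  "text-wrapper": { license: "GPL-2.0", dependencies: [] },
};

console.log(findBlockedPaths(sampleGraph, "my-app", ["GPL-2.0", "GPL-3.0"]));
// [ 'my-app > web-framework > promise-helper > release-cli > text-wrapper' ]
```

Reporting the full path, rather than just the offending package, matters: it tells you which intermediate dependency to pin or which maintainer to contact.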

That's a Wrap!

If you use npm, your dependency graph is always changing... and that's a good thing! Using some of the best practices above can help you keep your dependencies up-to-date in a controlled manner, and help you navigate unexpected changes to your dependency graph, so that you can keep delivering value to your users... and building awesome software!

