Poorly managed code security and the leakage of sensitive data in code can have serious repercussions, which calls for precautionary tools. Secrets are fundamental to productivity in collaborative and complex software development cycles, but if handled improperly, they can put one's entire infrastructure at risk. Researchers have found that thousands of secrets are leaked every day. Attackers have also realized that these secrets are a treasure trove, as they can frequently unlock systems upstream and downstream from the code itself.
CatchIT is a security scanner developed by Tech Risk Advisory at Goldman Sachs that detects sensitive information exposed in code repositories. If someone checks a secret with a known pattern into a public or private repository, CatchIT catches the secret as it is checked in, making it possible to mitigate the impact of the leak. Repository maintainers are notified about any commits that contain a secret, and they can quickly view all detected secrets. CatchIT was recently open sourced and contributed to FINOS (Fintech Open Source Foundation).
At Goldman Sachs, not only do we leverage existing open source software, but we are also devoted to open sourcing projects that originate within our firm. Our Open Source Program Office (OSPO) and developer community work closely to advance our open source contributions.
CatchIT detects sensitive information in source code with a strong emphasis on low execution time, CI/CD integration, a high degree of customization and a low false-positive rate. It is a simple yet powerful framework that helps developers and organizations mitigate the risk of leaked credentials while minimizing disruption to the developer experience. It can be embedded as an ad hoc job in the CI/CD pipeline, as a Python zip application or as a Docker image, which eliminates the need to deploy or maintain a dedicated server. It is a regex-based scanner that leverages Linux commands such as “grep” and “find” to search for pre-defined regular expressions. In addition to its pattern-based mechanism, CatchIT uses the entropy of each finding and a confidence factor, assigned per regular expression, to prioritize results and classify them into distinct categories. CatchIT scans for sensitive code, passwords, AWS account IDs and GCP keys, as well as sensitive files such as KEY and PEM files, among others. It provides its results in JSON format. Currently, the tool contains regular expressions in two categories to identify the following secrets and files:
We will continue to improve and expand our regular-expression-based ruleset to accommodate new secrets introduced in different cloud-based and on-premises environments.
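To make the regex-plus-entropy idea described above more concrete, here is a minimal Python sketch. It is not CatchIT's actual implementation: the rule names, patterns, confidence values, score thresholds and the functions shannon_entropy, classify and scan_text are all hypothetical, chosen only to illustrate how a per-rule confidence factor and the entropy of a match might be combined to rank findings and emit them as JSON.

```python
import json
import math
import re
from collections import Counter

# Hypothetical ruleset: rule name -> (compiled pattern, confidence factor).
# These patterns and confidence values are illustrative assumptions only.
RULES = {
    "aws_access_key_id": (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), 0.9),
    "generic_password": (
        re.compile(r"password\s*[=:]\s*(\S+)", re.IGNORECASE),
        0.5,
    ),
}


def shannon_entropy(text):
    """Return the Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return sum(
        (n / total) * math.log2(total / n) for n in Counter(text).values()
    )


def classify(entropy, confidence):
    """Map entropy and rule confidence to a category (assumed thresholds)."""
    score = entropy * confidence
    if score >= 3.0:
        return "high"
    if score >= 1.5:
        return "medium"
    return "low"


def scan_text(text, path="<memory>"):
    """Apply every rule to every line and return a list of findings."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule_name, (pattern, confidence) in RULES.items():
            for match in pattern.finditer(line):
                # Prefer the first capture group when the rule defines one.
                candidate = match.group(1) if pattern.groups else match.group(0)
                entropy = shannon_entropy(candidate)
                findings.append({
                    "path": path,
                    "line": line_no,
                    "rule": rule_name,
                    "entropy": round(entropy, 3),
                    "category": classify(entropy, confidence),
                })
    return findings


if __name__ == "__main__":
    sample = 'key = "AKIAIOSFODNN7EXAMPLE"\npassword = "hunter2"\n'
    # The matched secret itself is deliberately left out of the report.
    print(json.dumps(scan_text(sample, path="config.py"), indent=2))
```

In this sketch a long, random-looking match under a high-confidence rule ranks above a short, low-entropy match under a generic rule, which is one plausible way to keep false positives from crowding out real leaks.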
We believe that collaboration is one of the key factors in securing supply chains and this inspired us to share CatchIT with the community as open source.
We encourage developers and businesses to explore and use CatchIT as a risk mitigation component within their Software Development Life Cycle. Your feedback, issues and contributions are more than welcome. You can explore the CatchIT code base on GitHub, where you will also find the project issue backlog and more information about contributing to CatchIT. You can read more about the FINOS Contribution and Contributor License Agreement requirements in the community section of the FINOS GitHub.
We look forward to hearing from you!
See https://www.gs.com/disclaimer/global_email for important risk disclosures, conflicts of interest, and other terms and conditions relating to this blog and your reliance on information contained in it.