September 20, 2022

How Legend has Empowered Global Markets Engineering at Goldman Sachs and the Derivatives Industry as a Whole

Shilpi Singh, Mariania Cassimiro, Prashant Nirwal, Toh Newin, Tom Barkes, Eric Esterkin

John Waldron, President and Chief Operating Officer of Goldman Sachs, has said that "high quality data is one of the most important commercial drivers for automation and scale". Given the growing volume and complexity of financial services data, combined with increased demand for actionable data from stakeholders, the firm developed Legend, an open source data platform that breaks down data silos and connects business and technology teams, resulting in improved data governance for automation, transparency and consistency.

An introduction to Legend and financial service industry collaboration 

Trading by definition involves multiple parties. There is a natural limit to how much efficiency can be wrought within a single party; data quality also depends on the data coming from a counterparty. To improve data quality across institutional boundaries, counterparties require a few components:

  1. Common languages through which to represent financial instruments
  2. Shared models and definitions of instruments represented in common Domain Specific Languages (DSLs)
  3. Tooling and platforms for managing financial instrument models.

Ideally, this tooling should be licensed as open source so that industry participants face few barriers to adoption and can contribute new features.

In 2020, Goldman Sachs open sourced our internal data modeling platform, Legend. With Legend, the industry gained access to the first component, a DSL in which to compose financial instruments, in the form of PURE, the Legend DSL. It also gained the third, through the overall Legend suite, to which more modules and features have subsequently been contributed. On the second item, shared definitions, Goldman Sachs and other industry participants use the public instance of Legend Studio hosted in FINOS to model collaboratively.

The pilot use case, hosted on FINOS infrastructure in the public cloud, builds extensions to the Common Domain Model (CDM) developed by the International Swaps and Derivatives Association (ISDA).

Average Model Diagram. Averaging model developed by participants of the FX Options Working Group

The combination of an open sourced DSL (Legend Pure), open source model definitions (the CDM), and an open sourced suite of products and tools to manage these models and drive their definitions back into physical systems in the front, middle, and back offices will, among other benefits, accelerate industry interoperability, especially in trading workflows, and drive efficiencies and better data quality across trading partners.

To learn more about Legend and our open sourcing partnership, check out our launch video. There will also be opportunities at the upcoming Open Source Strategy Forum in New York City on December 8 to connect with individuals close to this project and to hear about the collaborative efforts you could get involved with through the FINOS Financial Objects SIG and more.

Our team is also growing and we have several exciting opportunities!

Commodity Payout created by participants of Commodity Reference Data Working Group

Legend not only provides a user interface to collaboratively express data models, but also has a powerful query generation capability with connectivity to commonly used data stores. It is possible for end users to seamlessly navigate data models and create complex queries and REST APIs without having to worry about database connectivity or writing SQL.

How did we get here: a sneak peek into expanding efficiencies and driving data quality within Global Markets at Goldman Sachs

For heavy data producers, like the Global Markets Division, Legend, and data models specifically, have materially improved data quality, resulting in new commercial opportunities, improved risk management and reduced operational overhead by decommissioning legacy systems.

The Global Markets Division at Goldman Sachs enables clients to buy and sell global financial products and manage risk across fixed income, equity, currency and commodity products, including complex derivatives. Global Markets produces a significant amount of transactional data for consumption not only by internal teams (including Global Markets itself), but also by external stakeholders, including clients and regulators. This diverse set of stakeholders, combined with new demands from clients and regulators, requires high quality data at scale for the firm to remain competitive in the marketplace. Below are a few representative examples of how Legend data models facilitate this scale for the firm.

Modeled Derivatives data delivers scale and improved time to market for new revenue opportunities 

Post trade execution functions are similar to many other data distribution types with opaque traceability, potential loss of data across system hops and difficult change management from a developer experience perspective.  Within the derivatives space specifically, scale is critical, with hundreds of products and many different consumers including clients of the firm, global financial and non-financial regulators, and internal teams who manage risk.  Historically, for every new product request or new regulation, business users relied on manual code changes from engineers to execute a transaction and fulfill all the downstream obligations. 

By converging post trade infrastructure around common data models and creating model-to-model mappings between front office product models and the consumer models, Global Markets is able to represent hundreds of products across all major markets in about 30 different data models, scaling the workflows to meet demand from all stakeholders.
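As a rough illustration of what a model-to-model mapping does, the sketch below translates a front office product record into a consumer-facing record. It is written in plain Python rather than in Legend's own mapping syntax, and every class, field, and function name in it is hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FrontOfficeSwap:
    """Hypothetical front office (producer) model."""
    trade_id: str
    notional: float
    ccy: str
    trade_dt: date
    cpty_code: str


@dataclass(frozen=True)
class ReportedTrade:
    """Hypothetical consumer model, for example a reporting feed."""
    identifier: str
    notional_amount: float
    notional_currency: str
    trade_date: str
    counterparty: str


def map_swap_to_reported_trade(src: FrontOfficeSwap) -> ReportedTrade:
    """Declare, in one place, how each consumer field derives from the source."""
    return ReportedTrade(
        identifier=src.trade_id,
        notional_amount=src.notional,
        notional_currency=src.ccy.upper(),    # normalize the currency code
        trade_date=src.trade_dt.isoformat(),  # ISO 8601 date string
        counterparty=src.cpty_code,
    )


swap = FrontOfficeSwap("T-001", 5_000_000.0, "usd", date(2022, 9, 20), "CP42")
reported = map_swap_to_reported_trade(swap)
```

Because the mapping is the only place where the two models meet, a new consumer needs a new mapping rather than changes to the producer's code.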

Data consumers leverage automated rules based on cardinality, data type, and enumerations to validate the quality of the data upstream at the source rather than downstream after processing. Change management is transparent, with agreed-upon data definitions and lineage remaining intact. Engineering dependencies are reduced, which improves time to market, lowers operational overhead and speeds the delivery of solutions to stakeholders.
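The three kinds of rules mentioned above (cardinality, data type, and enumeration) can be sketched as follows. This is a minimal, self-contained Python illustration and not Legend's validation engine; `PropertySpec`, `Currency`, `validate`, and the sample property names are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional


class Currency(Enum):
    """Hypothetical enumeration of permitted currency codes."""
    USD = "USD"
    EUR = "EUR"
    GBP = "GBP"


@dataclass(frozen=True)
class PropertySpec:
    """Hypothetical description of one property in a data model."""
    name: str
    data_type: type
    min_count: int
    max_count: Optional[int]  # None means unbounded


def validate(record: Dict[str, List[Any]], specs: List[PropertySpec]) -> List[str]:
    """Return a list of rule violations; an empty list means the record passed."""
    errors: List[str] = []
    for spec in specs:
        values = record.get(spec.name, [])
        # Cardinality rule: number of values must fall within [min, max].
        if len(values) < spec.min_count:
            errors.append(f"{spec.name}: expected at least {spec.min_count} value(s)")
        if spec.max_count is not None and len(values) > spec.max_count:
            errors.append(f"{spec.name}: expected at most {spec.max_count} value(s)")
        for value in values:
            if issubclass(spec.data_type, Enum):
                # Enumeration rule: value must be a member of the enum.
                try:
                    spec.data_type(value)
                except ValueError:
                    errors.append(f"{spec.name}: {value!r} is not a valid {spec.data_type.__name__}")
            elif not isinstance(value, spec.data_type):
                # Data type rule.
                errors.append(f"{spec.name}: {value!r} is not of type {spec.data_type.__name__}")
    return errors


specs = [
    PropertySpec("tradeId", str, 1, 1),
    PropertySpec("notional", float, 1, 1),
    PropertySpec("currency", Currency, 1, 1),
    PropertySpec("brokers", str, 0, None),
]
good = {"tradeId": ["T-001"], "notional": [5_000_000.0], "currency": ["USD"]}
bad = {"tradeId": [], "notional": ["5m"], "currency": ["JPY"]}
good_errors = validate(good, specs)
bad_errors = validate(bad, specs)
```

Running rules like these where the data is produced means a malformed record is rejected before any downstream consumer sees it.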

Recently, as featured in Bloomberg, Legend enabled Goldman Sachs to serve BlackRock's equity swap business via Veris, Axoni's distributed ledger network for equity swaps. While the industry "has encountered operational complexities in keeping up with trade volumes", firms leveraging model-to-model mappings based on the International Swaps and Derivatives Association's Common Domain Model (CDM) experience "scalability while mitigating risks in the investment life cycle". This same data modeling process is applied across products and regions, offering a competitive advantage to the Global Markets Division.

Modeled Comprehensive Capital Analysis and Review (CCAR) data improves transparency by replacing siloed data tools with an automated and consolidated solution

While the above showcases Legend's impact in providing scale and driving new growth opportunities, Global Markets has also leveraged data models to evolve the firm's ability to assess and react to its own risk profile using CCAR data.  Accurate CCAR data facilitates effective capital planning processes and ensures sufficient capital to absorb losses during stressful conditions, while meeting obligations to creditors and counterparties.  Historically, Global Markets risk management teams sourced the various CCAR inputs from spreadsheets, Tabular Datasets and disparate databases using custom code to transform and aggregate the data into a single source.  Engineers and business users exchanged data lineage and transformation knowledge by consistently communicating with one another via phone and email, a time consuming process which delayed analyses.

By converging the disparate data within a unified data model and exposing the results through self-service queries, Legend improved transparency on the data transformations, milestoned the data for historical comparison and reduced the dependency on engineers and custom code.  
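To make "milestoned for historical comparison" concrete, the sketch below keeps every version of a value together with the date range over which it was valid, so a consumer can ask what the data looked like on a given date. It is a generic Python illustration of the idea, not Legend's milestoning implementation; the class, function, and key names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class MilestonedValue:
    """One version of a value, valid over [valid_from, valid_thru)."""
    key: str
    value: float
    valid_from: date
    valid_thru: Optional[date]  # None means this version is still current


def as_of(rows: List[MilestonedValue], key: str, on: date) -> Optional[float]:
    """Return the value for `key` that was valid on the date `on`, if any."""
    for row in rows:
        still_valid = row.valid_thru is None or on < row.valid_thru
        if row.key == key and row.valid_from <= on and still_valid:
            return row.value
    return None


# Hypothetical history: the value was revised on 1 April 2022.
history = [
    MilestonedValue("stress_loss", 100.0, date(2022, 1, 1), date(2022, 4, 1)),
    MilestonedValue("stress_loss", 120.0, date(2022, 4, 1), None),
]
before_revision = as_of(history, "stress_loss", date(2022, 3, 15))
after_revision = as_of(history, "stress_loss", date(2022, 6, 1))
```

Since superseded versions are closed off rather than overwritten, a past submission can be reproduced by querying as of its submission date.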

As both a producer and consumer of the data, Global Markets has now significantly streamlined its operations, increased transparency in an automated manner and accelerated risk analysis. Capturing persisted history enables consumers to validate historical submissions, while collaborating and agreeing with other teams on the purpose and usage of the data ensures reporting consistency. The data consumers have developed a greater understanding of the data, leading to more informed decision-making and risk reduction.

Modeled trade processing data centralizes architecture leading to decommissioning of legacy platforms

Not only have data models driven scale and new business opportunities while at the same time reducing risk, but they have also reduced expenses through the decommissioning of legacy systems. These legacy data platforms utilize expensive infrastructure and require significant engineering support to maintain. In these platforms, opaque code drives the entire data lifecycle, from ingestion to transformation to pipelining to reporting. Data consumers are siloed from producing teams and are largely unable to trace lineage or self-serve the data. Teams of engineers maintain the code base, and knowledge transfer is expensive and time consuming.

The rearchitected solutions are driven by centralized data stores and enterprise data models, greatly simplifying the workflows and reducing dependencies on engineering teams.  These enterprise models are aligned with industry standards, with simple and straightforward change management and out of the box self-service for customers.  Singular databases become the "golden source" of data with producer and consumer model-to-model mappings to drive automation and consistency.  The singular database also drives all downstream pipelines and reporting, offering transparent lineage for front office teams. This also helps to easily reconcile the original terms as agreed upon by the customer with the post-settlement results, without the dependency on engineers.

The transformation from lines of code in legacy platforms to singular databases with data models built on top has saved the firm millions in infrastructure costs and immeasurable engineering overhead. Engineering teams are now free to build rather than support legacy code. This has transformed the job description for engineers in Global Markets.

Bringing it all together to empower engineers and benefit the entire derivatives industry

Through data modeling, engineers at Goldman Sachs are empowering business teams and directly contributing to the firm's bottom line rather than reacting to incidents and manually adjusting code.  As featured on CNBC, the software and language “have grown to become critical tools within our firm across the trade lifecycle that help us price, assess and evaluate risk, clear transactions, and perform regulatory reporting,” and by making it publicly available, “we’ll unlock tremendous value for the industry when we co-develop and share models.”

The collaborative FINOS effort received a big boost of momentum this month with the announcement that three trade associations in partnership, ISDA, ISLA and ICMA, have appointed FINOS as the future repository of the Common Domain Model (CDM). With the open sourcing of the CDM, the ecosystem now has the final piece in place to accelerate collaborative financial instrument modeling - the underlying definitions for a broad set of financial instruments, in particular those related to derivatives.


See https://www.gs.com/disclaimer/global_email for important risk disclosures, conflicts of interest, and other terms and conditions relating to this blog and your reliance on information contained in it.