July 28, 2022

Legend + Snowflake Native Apps = Fast, Easy, Secure Access to Data

Neema Raphael, Chief Data Officer; Abhishek Narang, Managing Director, Data Engineering

Last month, Goldman Sachs' Chief Data Officer, Neema Raphael, gave a keynote presentation at the Snowflake Summit conference. Neema's talk focused on combining the Goldman Sachs Financial Cloud for Data with Legend, our open source data platform, to power a Snowflake application that generates transformational business insights for our clients, business partners and engineers. What used to be a painful, multi-week process requiring support from engineering has become a self-serve, intuitive experience that takes a couple of days. Further, the application offers the best of all worlds: research velocity, performance, and governance. The Legend data platform uses Snowflake native app functionality to combine the governance and benefits of API-style application development with the native performance of database joins and predicate pushdown. Watch the replay of the talk from the Snowflake Summit.


How engineers at Goldman Sachs and Snowflake partnered to unlock native Snowflake performance on top of Legend APIs

Goldman Sachs' data strategy is tied directly to Legend and the firm's Financial Cloud for Data. Our latest offering gives our clients access to curated GS data and provides a client-owned AWS runtime to power the behind-the-scenes data movement. We are also users of Snowflake for relational data warehousing and data engineering.

The idea to combine our capabilities was born from a simple conversation in April 2022. Abhishek relates how the idea came about:

Running our vendor data engineering team, I have personally seen the time and energy required to set up teams with access to new datasets. It is a frustrating process for both business users and engineers.
At a recent Snowflake event, I learned about Native Apps, a marketplace of new capabilities built on top of secure data sharing. Following the event, my manager stopped by my desk to learn more about these features and to brainstorm how Snowflake Native Apps could potentially solve a critical and long-standing issue for our users.
This conversation quickly led to an "aha!" moment: if we can combine Legend with the Snowflake Native App capabilities, we can unlock the full performance of Snowflake while still providing logical separation of client, vendor, and Goldman Sachs data.

This new idea presented a great opportunity to work with internal researchers to understand the impact we could achieve. And the impact really was phenomenal: our research partners could set up new datasets in days instead of months. They could do this in a fully self-serve model while adhering to data governance standards. And our engineering team didn't have to support our business teams every single time they wanted new datasets or insights. A win-win for everyone!

But our ambitions didn't end there. Once we unlocked the foundational set of capabilities from this new solution, we wanted to push the boundaries of this innovation. The next step was to identify a real-world use case that could scale this not just internally at GS, but to external clients of the firm. We decided to leverage data from the newly launched Goldman Sachs Financial Cloud for Data to make the experience of sharing insights with our clients even more seamless. Adding more functionality within our existing application suite is a powerful value proposition for our clients. We are excited to go to market with this service once Snowflake Native Apps are publicly available (expected by the end of this year).

Our problem statement:

A researcher at an institutional investor client who wants to access new datasets from various external sources has to work with a data engineer to complete a multi-step, time-consuming process. Together they need to spend a significant amount of time figuring out how the client's internal data stitches together with third-party data across multiple ecosystems. It is also hard for engineers to understand the intent of quants and researchers as they run into complex extract, transform and load (ETL) problems with multi-hop operations, which lead to unclear ownership and performance issues. Even after all of this is in production, the researcher still does not get lineage.

Our mission:

Allow business users to pull together all the data they need using an intuitive, self-service experience that is highly performant, secure and scalable.

Before: Flow diagram of the multi-step data engineering process, as described in the problem statement above.

Our solution:

Our underlying technology is based on Legend, GS's open source contribution to FINOS. Legend is a single, data-model-driven platform for generating insights. It is platform agnostic and can transpile model queries to SQL with full predicate pushdown. We have contributed a Snowflake native app that has been supercharged with capabilities from Legend (for example, transforming APIs to SQL, which ensures native database performance while GS-specific data models are still enforced). This gives users a simple process to get the data they need to generate insights:

  1. Get access rights to the new data in the Goldman Sachs Financial Cloud for Data and to the new Legend native app
  2. Download the Legend-based native app from the Snowflake Marketplace and join their new datasets
  3. Upload the new data to the Goldman Sachs Financial Cloud for Data – their preferred solution for plotting graphs and generating insights
GS Legend Native App architecture diagram, showing the flow between the client's Snowflake instance, the GS Financial Cloud and the Goldman Sachs instance of Snowflake.
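
To make the idea of transpiling a model query to SQL with predicate pushdown more concrete, here is a minimal sketch in Python. It is not Legend's query language or API; the `Filter` class, the `TRADE_MAPPING` dictionary, the `to_sql` function, and all table and column names are hypothetical, chosen only for illustration. The point it shows is that filters expressed against logical model properties end up in the SQL `WHERE` clause, so the database evaluates them instead of the application fetching every row and filtering afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filter:
    """A predicate on a logical model property (hypothetical structure)."""
    prop: str      # logical property name, e.g. "ticker"
    op: str        # comparison operator, must be in ALLOWED_OPS
    value: object  # literal to compare against, passed as a bind parameter


# Hypothetical mapping from a logical "Trade" model to a physical table.
TRADE_MAPPING = {
    "table": "TRADES",
    "columns": {
        "ticker": "TICKER_SYMBOL",
        "tradeDate": "TRADE_DT",
        "quantity": "QTY",
    },
}

ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">="}


def to_sql(mapping, select, filters):
    """Translate a model-level query into one SQL statement plus bind parameters.

    Every filter becomes part of the WHERE clause, which is what lets the
    database apply the predicates itself (predicate pushdown).
    """
    cols = mapping["columns"]
    select_sql = ", ".join(f'{cols[p]} AS "{p}"' for p in select)
    sql = f"SELECT {select_sql} FROM {mapping['table']}"

    clauses, params = [], []
    for f in filters:
        if f.op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {f.op}")
        clauses.append(f"{cols[f.prop]} {f.op} ?")
        params.append(f.value)

    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = to_sql(
    TRADE_MAPPING,
    select=["ticker", "quantity"],
    filters=[Filter("ticker", "=", "GS"), Filter("quantity", ">", 100)],
)
print(sql)
print(params)
```

In this sketch the caller only ever names model properties such as `ticker`; the mapping decides which physical columns are touched, which is one simple way a data model can stay enforced while the heavy lifting runs inside the database.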

Here’s the impact:

This new app not only allows us to share relevant datasets with our clients; using the same paradigm, clients can also share data internally, with encapsulation and a central integration and control point. Engineering teams no longer need to support business partners with the tedious work of creating and supporting custom data onboarding processes. They can instead focus on higher-value tasks, such as enabling connectivity across datasets through data models and providing easy tools to graduate from the experimentation phase to production.

Internally, we are partnering with researchers in a business of Goldman Sachs that has been ranked #1 by our clients multiple years in a row. Our vendor data acquisition engineering team can now cut its data discovery timeline from months to days.

And we do this while making our technology more performant and maintaining our standards around data governance and security.

After: Client journey map, as described in the text above.

By constantly thinking about new and innovative ways to get the best out of Goldman Sachs' native technologies and those of partners like Snowflake, we are on track to make data consumption, sharing and analysis lightning fast, highly accurate and enormously simple.


See https://www.gs.com/disclaimer/global_email for important risk disclosures, conflicts of interest, and other terms and conditions relating to this blog and your reliance on information contained in it.

This site is for informational purposes only and does not constitute an offer to sell, or the solicitation of an offer to buy, any security. The Goldman Sachs Marquee® platform is for institutional and professional clients only. Some of the services and products described on this site may not be available in certain jurisdictions or to certain types of client. Please contact your Goldman Sachs sales representative with any questions. Nothing on this site constitutes an offer, or an invitation to make an offer from Goldman Sachs to purchase or sell a product. This site is given for purely indicative purposes and does not create any contractual relationship between you and Goldman Sachs. Any market information contained on the site (including but not limited to pricing levels) is based on data available to Goldman Sachs at a given moment and may change from time to time. There is no representation that any transaction can or could have been effected on such terms or at such prices. Please see https://www.goldmansachs.com/disclaimer/sec-div-disclaimers-for-electronic-comms.html for additional information. © 2023 Goldman Sachs. All rights reserved.