February 09, 2022

Enabling Highly Available Trino Clusters at Goldman Sachs

Ramesh Bhanan, Vice President; Siddhant Chadha, Associate; Sumit Halder, Vice President; and Suman Baliganahalli Narayan Murthy, Vice President - Business Platform Engineering

We have been invited to chat about the content in this blog post on the Trino Community Broadcast. Enjoy the live stream on February 17, 2022.


The Challenge

As one of the Data Platform teams at Goldman Sachs (GS), we are responsible for making accurate and timely data available to our analytics and application teams. At GS, we work with various types of data, such as transaction-related data, valuations, and product reference data from external vendors. These datasets can reside in multiple heterogeneous data sources like HDFS, S3, Oracle, Snowflake, Sybase, Elasticsearch, and MongoDB. Each of these sources exposes its datasets differently and must be handled individually. The challenge we encountered was how to consistently make these varied datasets from different sources centrally available to our data science team for analytics purposes.

A few of the goals we wanted to achieve include:

  • Reduce last mile ETL (Extract, Transform, Load) - ETL pipelines require significant effort to create and stabilize. Maintenance is also costly, for example resolving pipeline failures and performing reconciliations.
  • Access data in a unified way - a common language (SQL) to access different types of data sources.
  • Federated Joins - a way to join datasets residing in varied data sources.

The Solution

Trino, an open source distributed SQL query engine, gives users the ability to query various datasets through a single uniform platform. Users can connect to multiple data sources and perform federated joins with its connector-based architecture, eliminating ETL development and maintenance costs.

Integrating Trino into the Goldman Sachs Internal Ecosystem 

Our first step was to integrate Trino within the Goldman Sachs on-premise ecosystem. This meant:

  • Integration with internal authentication and authorization systems.
  • Integration with in-house tracking, monitoring, and auditing systems.
  • Integration with in-house credential stores.
  • Integration with data discovery and cataloguing services.
  • Support for new data source connectors such as SingleStore, Sybase, and more.
  • Updating various existing connectors such as Elasticsearch, MongoDB, etc.

We were able to extend Trino plugins to support all of the above integrations. For example, we implemented the EventListenerFactory interface to gather query-level statistics. As a part of this process, we also made sure that we contributed all of the relevant patches back to the Trino open source community, including the SingleStore connector, changes to the MongoDB connector, and additional features for Elasticsearch.

Achieving Scaling and High Availability

A typical Trino cluster consists of a coordinator and multiple worker nodes. We can scale a cluster by adding more worker nodes, but the coordinator remains a single point of failure. We then revisited our requirements to see what the best path forward was.

We wanted to achieve the following:

  • Scaling
  • High Availability
  • Multi-tenancy
  • Resource isolation
  • Ability to perform blue-green deployments

We quickly realized that for all of the above requirements, we would need a multiple cluster setup. To get around the limitations of having a single coordinator per cluster, we decided to have an Envoy-based proxy layer with a group of clusters behind it.

Trino Ecosystem at Goldman Sachs

The diagram shows how we have integrated the clusters with our in-house authentication, authorization, data cataloging, and monitoring systems. All of these systems form the operational backbone of our Trino offering. We support both cloud and on-premise data sources. We have also established connectivity with our Hadoop Distributed File System (HDFS) store through the Hive connector.

Users connect to Trino through various clients such as the CLI, JupyterHub, SQL editors, custom reporting tools, and other JDBC-based applications. When a user submits a query, the request first lands on the client-side load balancer (a DNS-based LB). The LB routes the request to an underlying Envoy router. The router parses the request headers to determine the user and, based on the user, decides which cluster group the query should be routed to. Once the request lands in a cluster group, it is assigned to one of that group's child clusters. In the example shown in the diagram, when a user (User 1) submits a query, it is routed to cluster group B per the defined routing rules, and can then be sent to child cluster B1, B2, or B3.

The section below explores the routing logic in more detail.

Dynamic Query Routing

Main Components

Envoy Proxy

Envoy is an open source edge and service proxy designed for cloud-native applications. Envoy provides features such as routing, traffic management, load balancing, external authorization, rate limiting, and more. We use Envoy proxy as the Trino gateway. It helps us achieve dynamic routing externally without changing Trino's default behavior.

The control plane is an xDS server that exposes the following discovery APIs:

  • LDS - Listener Discovery Service - using this API Envoy can dynamically add/delete/update the entire listener, including the L4/L7 filters
  • CDS - Cluster Discovery Service - using this API Envoy can dynamically add/update/delete upstream clusters
  • RDS - Route Discovery Service - using this API Envoy can add/update HTTP route tables
  • EDS - Endpoint Discovery Service - using this API Envoy can add/update cluster members

 

Cluster Groups

For our purposes, a cluster group is a logical namespace that maps to multiple Trino clusters. For example, cluster group A can map to child clusters A1 and A2. Any query landing on a cluster group is load-balanced between the child clusters. The routing layer has the intelligence to load balance the queries among the child clusters in a “round robin” fashion. We also integrated our cluster groups with cluster health check APIs to remove unhealthy child clusters while balancing the load.
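As a rough illustration of the load balancing described above, here is a minimal Python sketch of round-robin selection that skips unhealthy child clusters. The ClusterGroup class, the pick method, and the is_healthy callback are hypothetical names chosen for this example; they stand in for the routing layer and the cluster health check APIs and are not the production implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ClusterGroup:
    """A logical namespace that maps to several child Trino clusters."""
    name: str
    children: List[str]
    _next: int = field(default=0, repr=False)  # round-robin cursor

    def pick(self, is_healthy: Callable[[str], bool]) -> str:
        """Return the next healthy child cluster in round-robin order."""
        # Try each child at most once so an all-unhealthy group fails fast.
        for _ in range(len(self.children)):
            candidate = self.children[self._next % len(self.children)]
            self._next += 1
            if is_healthy(candidate):
                return candidate
        raise RuntimeError(f"no healthy cluster in group {self.name!r}")


# Hypothetical example: A2 is reported unhealthy, so it is skipped.
group = ClusterGroup("A", ["A1", "A2", "A3"])
unhealthy = {"A2"}
picks = [group.pick(lambda c: c not in unhealthy) for _ in range(4)]
print(picks)  # ['A1', 'A3', 'A1', 'A3']
```

Keeping the health check as a callback means the same selection logic can be exercised with a fake check in tests and a real health endpoint in a deployment.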

Overall there are three different types of cluster groups:

  1. Default cluster group: This is our primary cluster group. When a user is not explicitly assigned to a cluster group, their queries are routed to the default group. There can only be one default cluster group at a time.
  2. Named cluster group: There can be multiple named cluster groups. When a user is assigned to one of the named cluster groups, their queries are routed to their respective group. This helps us segment the incoming workload.
  3. Fallback cluster group: Traffic will be routed to the fallback cluster group if the default or the assigned cluster group is down or unhealthy. This helps provide resiliency in case of a sudden cluster outage.

When a user wants to perform an ad-hoc analysis and is not sure of their requirements, we route their traffic to the default cluster group. Once they have finalized their requirements and want to isolate their workload from all other traffic, we map them to a dedicated cluster group.
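The three cluster group types can be expressed as a small decision function. The sketch below is illustrative only: resolve_group, its parameters, and the group names are assumptions made for this example, and it treats group health as a precomputed set rather than calling a real health check.

```python
from typing import Dict, Set


def resolve_group(
    user: str,
    user_to_group: Dict[str, str],
    default_group: str,
    fallback_group: str,
    healthy_groups: Set[str],
) -> str:
    """Pick the cluster group for a user's query.

    Users with an explicit mapping go to their named group, everyone else
    goes to the default group, and the fallback group is used when the
    chosen group is down or unhealthy.
    """
    chosen = user_to_group.get(user, default_group)
    if chosen in healthy_groups:
        return chosen
    return fallback_group


# Hypothetical mappings and health state.
mappings = {"user1": "group-b"}
healthy = {"default", "fallback"}  # group-b is currently unhealthy

print(resolve_group("user1", mappings, "default", "fallback", healthy))  # fallback
print(resolve_group("user2", mappings, "default", "fallback", healthy))  # default
```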

Cluster Metadata Service

The Metadata Service provides the Envoy routers with all cluster-related configuration. It contains the definitions of cluster groups and the mappings of cluster groups to child clusters, users to cluster groups, and more. Under the hood, it is a Spring Boot service backed by persistent storage. DevOps engineers and cluster admins use this service to manage the clusters. The Metadata Service exposes APIs for the following operations:

  1. Cluster Group: Add/Update/Delete
  2. Cluster: Add/Update/Delete 
  3. Cluster: Map/Un-map cluster to cluster groups
  4. Cluster: Activate/Deactivate
  5. User to Cluster Mapping: Add/Remove 
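To make the data model behind these operations more concrete, the following Python sketch keeps the same mappings in memory. The real service is a Spring Boot application with persistent storage; the MetadataStore class and all of its method names here are hypothetical and only mirror the operations listed above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MetadataStore:
    """In-memory stand-in for the storage behind the Metadata Service."""
    groups: Dict[str, Set[str]] = field(default_factory=dict)  # group -> clusters
    addresses: Dict[str, str] = field(default_factory=dict)    # cluster -> address
    active: Set[str] = field(default_factory=set)              # activated clusters
    users: Dict[str, str] = field(default_factory=dict)        # user -> group

    def add_group(self, group: str) -> None:
        self.groups.setdefault(group, set())

    def add_cluster(self, cluster: str, address: str) -> None:
        self.addresses[cluster] = address

    def map_cluster(self, cluster: str, group: str) -> None:
        if cluster not in self.addresses:
            raise KeyError(f"unknown cluster {cluster!r}")
        self.groups[group].add(cluster)

    def unmap_cluster(self, cluster: str, group: str) -> None:
        self.groups[group].discard(cluster)

    def set_active(self, cluster: str, is_active: bool) -> None:
        if is_active:
            self.active.add(cluster)
        else:
            self.active.discard(cluster)

    def assign_user(self, user: str, group: str) -> None:
        if group not in self.groups:
            raise KeyError(f"unknown cluster group {group!r}")
        self.users[user] = group

    def remove_user(self, user: str) -> None:
        self.users.pop(user, None)

    def routable_clusters(self, group: str) -> List[str]:
        """Clusters in the group that are currently activated."""
        return sorted(c for c in self.groups[group] if c in self.active)


store = MetadataStore()
store.add_group("B")
for name in ("B1", "B2"):
    store.add_cluster(name, f"{name.lower()}.example.internal:8080")
    store.map_cluster(name, "B")
    store.set_active(name, True)
store.set_active("B2", False)  # e.g. taken out for a blue-green deployment
store.assign_user("user1", "B")
print(store.routable_clusters(store.users["user1"]))  # ['B1']
```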

Router Service

Envoy Control Plane

The Envoy Control Plane is an xDS gRPC-based service responsible for providing dynamic configurations to Envoy. When Envoy starts, it queries the xDS APIs to get the upstream cluster configurations dynamically. The control plane periodically polls the Metadata Service to detect changes in the cluster configurations, so we can add, update, or delete upstream clusters without restarting the Envoy Control Plane.
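One simple way to detect configuration changes between two polls is to diff consecutive snapshots of the cluster list. The sketch below assumes each snapshot is a plain mapping of cluster name to address; diff_clusters and the sample values are hypothetical, and pushing the result to Envoy over xDS is deliberately left out.

```python
from typing import Dict, List, Tuple


def diff_clusters(
    old: Dict[str, str], new: Dict[str, str]
) -> Tuple[Dict[str, str], Dict[str, str], List[str]]:
    """Compare two snapshots of {cluster name: address}.

    Returns (added, updated, removed) so a caller can decide whether a
    new configuration needs to be published.
    """
    added = {name: addr for name, addr in new.items() if name not in old}
    updated = {
        name: addr
        for name, addr in new.items()
        if name in old and old[name] != addr
    }
    removed = sorted(name for name in old if name not in new)
    return added, updated, removed


previous = {"A1": "a1.example.internal:8080", "A2": "a2.example.internal:8080"}
current = {"A2": "a2.example.internal:9090", "A3": "a3.example.internal:8080"}

added, updated, removed = diff_clusters(previous, current)
print(added)    # {'A3': 'a3.example.internal:8080'}
print(updated)  # {'A2': 'a2.example.internal:9090'}
print(removed)  # ['A1']
```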

Upstream Cluster Selection

Envoy provides HTTP filters to parse and modify both request and response headers. We use a custom Lua filter to parse the request and extract the x-trino-user header. The filter then calls the Router Service, which returns the upstream cluster address. A health check is also performed when selecting the cluster.

Challenges with Node Affinity

Unlike simple HTTP requests, which are independent of one another, a single Trino query spans several related HTTP calls. Trino clients, including the JDBC driver, follow this protocol:

  • A POST to /v1/statement runs the query string in the POST body, and returns a JSON document containing the query results. If there are more results, the JSON document contains a nextUri URL attribute.
  • A GET to the nextUri attribute returns the next batch of query results.
  • A DELETE to nextUri terminates a running query.
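The protocol above can be sketched as a small client loop. This is a simplified illustration, not a full Trino client: the send callable is a hypothetical transport injected for testability, the URLs are made up, and details such as authentication, request headers, and retries are omitted. The id, data, nextUri, and error fields are the ones found in Trino's JSON responses.

```python
from typing import Callable, List, Optional

# (method, url, body) -> parsed JSON response; a hypothetical transport.
Transport = Callable[[str, str, Optional[str]], dict]


def run_query(send: Transport, base_url: str, sql: str) -> List[list]:
    """Submit a query and follow nextUri until all result pages are read."""
    rows: List[list] = []
    response = send("POST", f"{base_url}/v1/statement", sql)
    while True:
        if "error" in response:
            raise RuntimeError(f"query failed: {response['error']}")
        rows.extend(response.get("data", []))
        next_uri = response.get("nextUri")
        if next_uri is None:
            return rows  # no nextUri means the query is finished
        response = send("GET", next_uri, None)


# Fake transport that replays canned pages instead of calling a real cluster.
BASE = "http://trino.example.internal:8080"
PAGES = {
    f"{BASE}/v1/statement": {"id": "q1", "nextUri": f"{BASE}/v1/statement/q1/1"},
    f"{BASE}/v1/statement/q1/1": {
        "id": "q1",
        "data": [[1], [2]],
        "nextUri": f"{BASE}/v1/statement/q1/2",
    },
    f"{BASE}/v1/statement/q1/2": {"id": "q1", "data": [[3]]},
}


def fake_send(method: str, url: str, body: Optional[str]) -> dict:
    return PAGES[url]


result = run_query(fake_send, BASE, "SELECT x FROM t")
print(result)  # [[1], [2], [3]]
```

Every GET in this loop targets a URL handed out by the cluster that accepted the original POST, which is why a gateway in front of several clusters has to keep the whole sequence on the same cluster.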

This means that once the initial request has been routed to a cluster, all subsequent nextUri calls for that query must be routed to the same cluster. We solve this by keeping a map from query_id to cluster in our distributed cache layer. The process flow resembles the following:

  1. The Trino client initiates a POST to /v1/statement, which lands on the Envoy gateway.
  2. In on_request(), Envoy parses the request headers, extracts x-trino-user, and calls the Router Service to get the upstream cluster.
  3. Still in on_request(), it sets the cluster_header to the upstream cluster name. Envoy routes the request by reading this header.
  4. In on_response(), we receive the response from the upstream cluster and parse it to extract the query_id and the cluster address.
  5. Still in on_response(), we persist the mapping from query_id to cluster address in the distributed cache.

From this point onwards, all the nextUri calls are checked against this map for routing.
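The flow above can be sketched in Python as follows. The production logic runs inside an Envoy Lua filter backed by a distributed cache; here a plain dict stands in for the cache, pick_cluster stands in for the Router Service call, and the AffinityRouter class is a hypothetical name. The regular expression encodes an assumption about the shape of Trino query IDs appearing in the nextUri path.

```python
import re
from typing import Callable, Dict

# Assumed query ID shape, e.g. 20220209_101530_00001_abcde.
QUERY_ID = re.compile(r"\d{8}_\d{6}_\d{5}_[a-z0-9]{5}")


class AffinityRouter:
    def __init__(self, pick_cluster: Callable[[str], str], cache: Dict[str, str]):
        self.pick_cluster = pick_cluster  # stand-in for the Router Service
        self.cache = cache                # stand-in for the distributed cache

    def on_request(self, method: str, path: str, headers: Dict[str, str]) -> str:
        """Return the upstream cluster name for an incoming request."""
        if method == "POST" and path == "/v1/statement":
            # New query: choose a cluster based on the user header.
            return self.pick_cluster(headers["x-trino-user"])
        # Follow-up call: look up the cluster that owns this query.
        match = QUERY_ID.search(path)
        if match is None or match.group(0) not in self.cache:
            raise LookupError(f"no cluster recorded for {path}")
        return self.cache[match.group(0)]

    def on_response(self, cluster: str, body: dict) -> None:
        """Record which cluster is running the query named in the response."""
        query_id = body.get("id")
        if query_id:
            self.cache.setdefault(query_id, cluster)


router = AffinityRouter(lambda user: "B2", {})
first = router.on_request("POST", "/v1/statement", {"x-trino-user": "user1"})
router.on_response(first, {"id": "20220209_101530_00001_abcde"})
follow_up = router.on_request(
    "GET", "/v1/statement/executing/20220209_101530_00001_abcde/y1/1", {}
)
print(first, follow_up)  # B2 B2
```

Using setdefault keeps the first recorded cluster for a query, so later responses for the same query cannot move it to a different cluster.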

Summary

We use Trino for many applications, from analytics, to data quality, to reporting, and more. Within Goldman Sachs, we have tried to create an ecosystem that can help manage our Trino infrastructure in the most efficient way. With the above architecture, we have achieved that, and will continue to iterate and improve. 



See https://www.gs.com/disclaimer/global_email for important risk disclosures, conflicts of interest, and other terms and conditions relating to this blog and your reliance on information contained in it.
