June 16, 2022

Building Automated Scalable Network Policy Management: A Look into PINACL

Kishore Kumar Anand, VP, Engineering and Irene Nydees, VP, Engineering

At Goldman Sachs, the infrastructure resources in our cloud can be managed using Infrastructure-as-Code (IaC). The engineering teams at Goldman Sachs request these resources, via code, on-demand and expect the provisioning to be completed in a matter of minutes. The various infrastructure resources include compute, network and storage along with other services. This post will explore how network connectivity requirements are fulfilled by PINACL.

PINACL is a GS Accelerate sponsored product that seeks to drive end-to-end automation of network security connectivity policies that control network traffic in an enterprise. PINACL allows connectivity policy to be authored and owned by application developers rather than network operators. It leverages network topology, built for the enterprise by PINACL, to discover the impact on the existing security posture and delivers the user's intent.

Enterprise Requirements

  • Scale: Application developers and network operators submit multiple connectivity requests which results in a large number of connectivity flows.
  • Consistency: Data should be strictly consistent across the system because inconsistencies could lead to unauthorized access and security incidents.
  • Latency: Users would like to realize their infrastructure changes in a timely manner. Since a single connectivity request could impact multiple security enforcement points, the fulfillment system should have high throughput to keep the end-to-end latency low.

Solutions

  • Highly scalable distributed services that asynchronously receive and deliver proper provisioning.
  • PINACL uses CockroachDB as a datastore. CockroachDB is a distributed database with a SQL interface that promises consistency and partition tolerance.

PINACL Architecture

PINACL's architectural and design decisions are influenced by these challenges and solutions listed above. The product's capabilities can be broadly divided into two parts: upstream and downstream systems. 

Upstream System

The upstream system accepts requests through REST API endpoints for realizing any change in the network security posture. It processes the events in an asynchronous manner and adds tasks to the policy fulfillment queue.

Please reed the description.
Please reed the description.

Description of the upstream system: A flow chart  diagram detailing the flow from IAC application developers, network operators , event streams and topology changes funneling into the api server, the processor queue, pipeline processor fulfillment queue. * Each USER (use case) will have a dedicated / independent UPSTREAM SYSTEM processing the requests / changes.


High availability, low coupling, and high cohesion are the driving forces for the design of the upstream system.

Each use case like policy management by network engineers or software developers has been divided into individual services with single responsibility. Each component has its own dedicated upstream service independent of others making it easier to maintain and extend the system to support new use cases in the future. All the upstream systems communicate with the downstream system through the same fulfilment queue.

The upstream system is further divided into two parts—API server and Pipeline processor—to ensure high availability. The multiple instances or replicas of API servers can continue accepting and queuing the orders, even when the pipeline processor is down due to issues with the processing of any request. All the requests will eventually be processed by the pipeline processor.

API Server: Authorized users can interact with PINACL using the RESTful APIs exposed by the server component to CREATE / READ / UPDATE / DELETE the security policies managed by them.

Pipeline Processor: Processes user connectivity intent by first detecting the impacted rules and then with the help of network topology determines the security devices which need to be reprovisioned. These security devices are then added to the fulfillment queue which will be processed by the downstream system.

For example, consider the following network topology where there is a security device (SD1) blocking the network traffic between Host A and Host B.

Please see description below.
Please see description below.

A flow diagram showing the before and after of the connectivity between Host A and Host B with the security device interrupting the flow. In the after portion of the diagram is unblocked i- with the policy flowing through from PINACL allowing the security device being programmed. 


Downstream / Fulfillment System

The downstream system processes the tasks in the policy fulfillment queue by preparing and pushing the latest policies to the security devices.

Please see descriptive text below.
Please see descriptive text below.

A diagram showing multiple upstream systems flowing into a priority based fulfilment queue into orchestraqtors controlled by requesting agents, then down into the vendor provided security devices. 


The rate at which the upstream systems add tasks to the fulfillment queue can be high, driven by the volume of changes it receives, making horizontal scalability the driving factor in designing the downstream system in order to maintain low end-to-end latency. To ensure scalability, the fulfillment system adopts the Orchestrator - Agent pattern with task stealing approach, where the agents request (pull) the tasks from the orchestrator when it's free. We adapted a pull-based approach over a push-based one as the rate of processing and completion of each task varies widely based on the policy size and vendor. The agent nodes are elastic. During heavy load, new agent nodes can be dynamically added to the group to share the load / tasks and increase the throughput. Each task is a unit of work which involves pushing the security policy to a device. 

Orchestrator acts as a mediator to group the same security devices currently pending in the queue to create a single task, when an agent requests for one. It also ensures the same security device is not allocated to different agents at the same time to avoid policy provisioning conflicts.  

Agent composes the policy by optimizing / compressing the flows that will be pushed to a security device from many millions to a few thousand. The agent then translates and pushes the policy to the security device using the driver of the respective vendor. 

Queue Control: Priority can be assigned to the task which influences the order of fulfillment which becomes effective in controlling the queue especially during heavy loads.

Data Management in PINACL

At a high level, the PINACL product needs to keep track of a few key datasets:

  • Connectivity policy contains a collection of rules. Rules reference endpoints (or endpoint groups) and app protocol (or app protocol groups). Endpoint groups are defined as a collection of other endpoint groups (nested) or leaf endpoints. The rules and endpoint group relationships change often. They are entities with strong relationships between them and have their own lifetime and ownership.
  • Network Topology information, which changes less frequently.
  • Upstream system computes the impact caused by policy changes and determines which security devices need to be programmed. This requires joins between the entities and Recursive Common Table Expression, i.e. CTE to resolve nested dependencies. 

Relational Database Choice

Both the upstream and downstream processing requires significant data joins to fulfill the requests. With most document DBs, server-side join is either non-existent or works with only non-sharded collections (e.g. MongoDB's lookup operator) and limited to equi-joins. This requires applications to perform client side joins which are relatively expensive due to network latency, amount of data transferred over the network, and the number of calls to the DB.

While a database which doesn't enforce any schema is easier to develop, there is value in enforcing a schema and referential integrity at the database layer. Datasets tend to outlive the application, so ensuring the data integrity is paramount and having the database prevent data quality issues due to bugs in the application is valuable.

With de-normalized documents in either JSON/BSON format, there is a significant overhead to the dataset size compared to a relational database. It is partly to do with denormalization introducing repeated values and partly to do with JSON/BSON requiring repeated string keys.

Support for ACID transactions simplifies application development without having to detect and workaround various failure scenarios.

Selecting CockroachDB

CockroachDB is an open source database providing support for relational data model, range sharded data partitioning, replication, and high availability. CockroachDB uses MVCC to provide ACID and more specifically serializable isolation level. It supports PostgreSQL wire protocol allowing existing PostgreSQL clients (eg: JDBC driver) to work seamlessly with CockroachDB.

While there are lots of features and benefits with using CockroachDB, here are a few that were important when making a selection:

  • Supports transactions with serializable isolation level which simplifies application development.
  • Leverages its MVCC to support point-in-time queries to reduce contention with other writes in the system.
  • Range partitioning of data allows the data storage layer to scale by addition of more nodes to the cluster.
  • Provides high availability out of the box, as long as the quorum is available i.e. N/2+1 nodes are up, with replication of every range N times across the cluster.
  • Supports online backup by leveraging point-in-time query capabilities.
  • Supports rolling upgrades.
  • Supports IPv4 and IPv6 data types.

What's Next?

Today, PINACL supports multiple security device vendors and with the growing need of multi vendor solutions, we are planning to add support for more vendors. With the increasing adoption of cloud infrastructure along with on-premise, PINACL seeks to extend the support for the hybrid environment. The network topology consists of different kinds of devices which can enforce security like Router ACLs. PINACL is envisioned to become the single solution to manage all kinds of security enforcement points.

PINACL is actively hiring software engineers in Bengaluru, see roles here


About GS Accelerate

Since launching in 2018, Goldman Sachs employees have submitted more than 1,800 ideas, many of which, including PINACL, have been funded, resulting in seven new products available for Goldman Sachs clients and people. GS Accelerate businesses are hiring across teams and functions in offices across the globe including New York, Dallas, and Birmingham. To view open roles, visit here.

PINACL technology is covered by pending and issued patents.


See https://www.gs.com/disclaimer/global_email for important risk disclosures, conflicts of interest, and other terms and conditions relating to this blog and your reliance on information contained in it.

GS DAP® is owned and operated by Goldman Sachs. This site is for informational purposes only and does not constitute an offer to provide, or the solicitation of an offer to provide access to or use of GS DAP®. Any subsequent commitment by Goldman Sachs to provide access to and / or use of GS DAP® would be subject to various conditions, including, amongst others, (i) satisfactory determination and legal review of the structure of any potential product or activity, (ii) receipt of all internal and external approvals (including potentially regulatory approvals); (iii) execution of any relevant documentation in a form satisfactory to Goldman Sachs; and (iv) completion of any relevant system / technology / platform build or adaptation required or desired to support the structure of any potential product or activity. All GS DAP® features may not be available in certain jurisdictions. Not all features of GS DAP® will apply to all use cases. Use of terms (e.g., "account") on GS DAP® are for convenience only and does not imply any regulatory or legal status by such term.
Certain solutions and Institutional Services described herein are provided via our Marquee platform. The Marquee platform is for institutional and professional clients only. This site is for informational purposes only and does not constitute an offer to provide the Marquee platform services described, nor an offer to sell, or the solicitation of an offer to buy, any security. Some of the services and products described herein may not be available in certain jurisdictions or to certain types of clients. Please contact your Goldman Sachs sales representative with any questions. Any data or market information presented on the site is solely for illustrative purposes. There is no representation that any transaction can or could have been effected on such terms or at such prices. Please see https://www.goldmansachs.com/disclaimer/sec-div-disclaimers-for-electronic-comms.html for additional information.
Transaction Banking services are offered by Goldman Sachs Bank USA (“GS Bank”). GS Bank is a New York State chartered bank, a member of the Federal Reserve System and a Member FDIC.
Mosaic is a service mark of Goldman Sachs & Co. LLC. This service is made available in the United States by Goldman Sachs & Co. LLC and outside of the United States by Goldman Sachs International, or its local affiliates in accordance with applicable law and regulations. Goldman Sachs International and Goldman Sachs & Co. LLC are the distributors of the Goldman Sachs Funds. Depending upon the jurisdiction in which you are located, transactions in non-Goldman Sachs money market funds are affected by either Goldman Sachs & Co. LLC, a member of FINRA, SIPC and NYSE, or Goldman Sachs International. For additional information contact your Goldman Sachs representative. Goldman Sachs & Co. LLC, Goldman Sachs International, Goldman Sachs Liquidity Solutions, Goldman Sachs Asset Management, L.P., and the Goldman Sachs funds available through Goldman Sachs Liquidity Solutions and other affiliated entities, are under the common control of the Goldman Sachs Group, Inc.
© 2024 Goldman Sachs. All rights reserved.