June 30, 2022

How to Think Like a Programmer (Part Two)

Denis Olasehinde Akinmolasire, Vice President, Data Engineering

In the previous part we introduced this blog series, which aims to help developers who are new to the software engineering industry and to highlight challenges they may encounter as they begin their journey in this field. The next parts will focus on specific technical areas of software development. This part covers how to model solutions using domain driven development.

I have a business problem, where do I start?

Start off with your domain

At university, you will typically focus on learning the syntax of a particular language, understanding key aspects of operating systems, or studying computational logic and math. The challenge when you get to industry is applying that theory in practice. Most likely, much of your time will be spent with users in your organization, understanding what they do on a day-to-day basis and how software can improve their current functions, for example by reducing the number of steps it takes to complete a process or by speeding up certain processing. The goal is to have a complete understanding of the problem space before writing any lines of code. It is not uncommon for a software engineer to spend a significant amount of time trying to model an organization's function as a process that can be automated through software. One methodology that you may find useful is Domain Driven Development.

Domain Driven Development is an industry design practice of modeling your software within the context of a given domain, based on inputs from domain experts. This domain can be a business, a particular functional area, or even a particular product. Domain Driven Development lends itself very well to concepts such as object-oriented programming, and if your domain is modeled correctly you will be able to explain to a non-technical user how your software works in terms that relate to their domain.

In theory this all makes sense. So how do we apply this in reality? To showcase how we can do this we're going to walk through an example.

Example Problem Statement

In our workflow, data stewards are responsible for maintaining reference data. Data stewards have the capability to add, update, and delete reference data as appropriate. Data analysts, on the other hand, can only view reference data for research purposes. Our data stewards currently maintain two sets of reference data: Financial Products and Client Accounts. Financial products contain information such as International Securities Identification Number (ISIN), market code, and GSN, and take two main forms: Stocks and Bonds. Stocks can be bought and sold by companies, whereas a Bond can be issued by governments. A client account contains information such as account name, client id, and an is active status.

The above problem statement is a very simplified example of a problem that is often found in areas which deal with financial data management. As simple as it may appear to be, there is a significant amount of detail that can be gleaned. Our next objective is to turn this problem statement into a domain model. There are several ways that this can be done. One approach that I've often used is splitting my problem statement into three parts: Entities, Responsibilities, Associations.

Entities

When I encounter a problem statement like this, I first try to identify the individual entities that will make up the domain model. Based on the above problem statement, I've picked out the below entities. At this stage, I don't focus on what each entity will do; I'm just concerned with picking out the concepts that I think are needed in my domain to produce a model that can be turned into an automated software solution.

  • Reference data user
  • Data steward
  • Data analyst
  • Product
  • Account
  • Stock
  • Bond
  • Owner
  • Issuer
  • Entitlement

Responsibilities

After identifying my entities, I tend to think about the responsibilities of each entity. Responsibilities can be thought of as the functions of an entity, and they define the rules that an entity needs to comply with. Based on the problem statement, I start to think about the responsibilities each entity will have. I often find modeling to be a very iterative process; typically when I drill into the responsibilities I find that there are other concepts that I need to model. If you look at the list now, it includes entities that I didn't have originally.

Below you will find my current set of entities with a brief description of what they are responsible for.

  • Reference data user - Someone who uses the reference data in some shape or form.
  • Data steward - A reference data user with add, update, and delete access.
  • Data analyst - A reference data user with only view access.
  • Product - Based on the above problem statement, this is a type of reference data that needs to be maintained. Products have the attributes ISIN, market code, and GSN.
  • Account - The second type of reference data that needs to be maintained.
  • Stock - A type of product.
  • Bond - Another type of product.
  • Owner - A representation of a company that can buy and sell the stock.
  • Issuer - A representation of an entity, such as a government, that can issue bonds.
  • Entitlement - A representation of the type of access (view, add, update, or delete) that a reference data user holds.
  • Reference data repository - A container for all the different types of reference data that needs to be maintained.
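To show how a list like this can carry over into code, here is a minimal Python sketch of the entities and their responsibilities. It is an illustration only, not the model built for this blog: the class and field names are my own assumptions derived from the problem statement, and giving data stewards view access is also an assumption, since the statement only mentions add, update, and delete.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Entitlement(Enum):
    """The types of access a reference data user can hold."""
    VIEW = auto()
    ADD = auto()
    UPDATE = auto()
    DELETE = auto()


@dataclass(frozen=True)
class Owner:
    """A company that can buy and sell a stock."""
    name: str


@dataclass(frozen=True)
class Issuer:
    """An entity, such as a government, that can issue a bond."""
    name: str


@dataclass
class Product:
    """Attributes shared by every financial product."""
    isin: str
    market_code: str
    gsn: str


@dataclass
class Stock(Product):
    owner: Owner


@dataclass
class Bond(Product):
    issuer: Issuer


@dataclass
class Account:
    account_name: str
    client_id: str
    is_active: bool


class ReferenceDataUser:
    """Someone who uses the reference data in some shape or form."""
    entitlements = frozenset()

    def __init__(self, name):
        self.name = name

    def can(self, entitlement):
        """Return True if this user holds the given entitlement."""
        return entitlement in self.entitlements


class DataSteward(ReferenceDataUser):
    # VIEW is an assumption; the problem statement lists add, update, delete.
    entitlements = frozenset(
        {Entitlement.VIEW, Entitlement.ADD, Entitlement.UPDATE, Entitlement.DELETE}
    )


class DataAnalyst(ReferenceDataUser):
    entitlements = frozenset({Entitlement.VIEW})
```

Each class name matches a term a data steward or analyst would recognize, which is what makes the model easy to walk through with a non-technical user.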

Associations

After listing out the responsibilities, I then start to think about how the different entities will interact with each other. There are several techniques that can be used here, such as flowcharts, block diagrams, or entity relationship diagrams. Entity relationship diagrams were created for the purpose of modeling entities, but I often use a UML (Unified Modeling Language) class diagram instead. Object-oriented concepts are often a natural fit for explaining how the different entities that you have come up with interact with each other. In addition, UML provides a good means of modeling entity behavior. In this blog, I use a UML class diagram to highlight the associations in my domain.

UML Diagram summarizing the relationships in a domain model.
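The same kind of association can also be sketched in code. The Python below is a minimal, self-contained illustration, not a transcription of the diagram: it assumes a hypothetical ReferenceDataRepository that holds products keyed by ISIN and checks a user's entitlements before each operation. All names and method signatures are assumptions made for the example, and the entities are trimmed to the fields needed to show the relationships.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Entitlement(Enum):
    VIEW = auto()
    ADD = auto()
    UPDATE = auto()
    DELETE = auto()


@dataclass(frozen=True)
class ReferenceDataUser:
    """A user is associated with a set of entitlements."""
    name: str
    entitlements: frozenset


@dataclass
class Product:
    isin: str
    market_code: str


@dataclass
class ReferenceDataRepository:
    """Container that associates users with the reference data they act on."""
    products: dict = field(default_factory=dict)  # keyed by ISIN

    def _require(self, user, entitlement):
        # Every operation goes through the user's entitlements.
        if entitlement not in user.entitlements:
            raise PermissionError(f"{user.name} lacks {entitlement.name} access")

    def add_product(self, user, product):
        self._require(user, Entitlement.ADD)
        self.products[product.isin] = product

    def view_product(self, user, isin):
        self._require(user, Entitlement.VIEW)
        return self.products[isin]

    def delete_product(self, user, isin):
        self._require(user, Entitlement.DELETE)
        del self.products[isin]
```

With this shape, a user created with every entitlement can add a product, a user holding only VIEW can read it back, and an attempt by that second user to delete it raises PermissionError. The rule from the problem statement lives in one place, the repository, rather than being scattered across callers.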

 

Domain Model Benefits

A domain model can bring you several benefits in terms of potential problem solving, some of which are listed below.

Affirmation of understanding

  • Modeling is a good way to affirm that you have understood the problem. 
  • If you can visualize the problem-space, it makes it a lot easier to come up with the eventual solution.

User engagement

  • A domain model can help you when you interact with your end users.
  • Your end users are not necessarily going to be very technical; so if you're able to come up with a mechanism for visualizing their problem-space it will make it easier for you to capture feedback and facilitate better engagements.
  • Making your users feel part of the design process will gain their buy-in and establish you in your team/organization.
  • The trust you gain will give you more confidence to work on more complicated problems.

Functional clarity

  • Modeling your domain will also help you avoid the pitfall of having to use programming language jargon/syntax to explain key concepts in terms of automation.
  • Confirmation of user workflow: A domain model can help with the conceptual design of potential user workflows, help spot issues/gaps in understanding, and identify areas where the proposed workflow may not work as expected.
  • A domain model can serve as a functional model/architecture for what the eventual solution should aim for. A good functional understanding of your domain helps ensure that the eventual product serves the need it was built for.

Tooling available for me to undertake domain driven development

One tool available to you is Legend, an open source technology created by Goldman Sachs. It is a data management and governance platform that enables technical and non-technical users to create models to produce data-centric applications. Goldman Sachs contributed Legend to the Fintech Open Source Foundation (FINOS) to make it available as an open source product. Legend has features that allow you to create a data model (the final class diagram shown above was produced in Legend).

Legend offers a solution to rapidly growing data challenges by providing efficient and reliable access to accurate, timely, and safe data internally and within the industry.
Pierre De Belen - Head Architect for the Legend Platform.

Summary 

In summary, if you ever have a problem statement and are not sure where to begin, try producing a domain model. You may find it sheds light on certain aspects of the problem. In my current role, I used domain driven development to figure out how to model sequential file processing for a particular vendor that had a concept of full universe and delta files, where files needed to be processed in sequence until certain conditions were met. I was able to produce a solution that not only worked for the vendor concerned but also resulted in a generic model that provided a foundation for solving problems of that nature going forward.
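To make the full universe and delta file idea more concrete, here is a minimal Python sketch of how such a domain might be modeled. It is not the solution described above. Every name and rule in it is a hypothetical assumption for illustration: files carry a numeric sequence, a full universe file replaces the current state, a delta is applied only when it directly follows the last applied file, and a None record in a delta marks a deletion.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VendorFile:
    sequence: int   # hypothetical ordering key supplied by the vendor
    is_full: bool   # True for a full universe file, False for a delta
    records: dict   # key -> record; None in a delta means "delete" (assumption)


@dataclass
class SequentialFileProcessor:
    universe: dict = field(default_factory=dict)
    last_applied: Optional[int] = None
    pending: dict = field(default_factory=dict)  # held files keyed by sequence

    def receive(self, file):
        """Hold the file, then apply every file that is now in sequence."""
        self.pending[file.sequence] = file
        while (nxt := self._next_applicable()) is not None:
            del self.pending[nxt.sequence]
            self._apply(nxt)

    def _next_applicable(self):
        if self.last_applied is None:
            # Nothing loaded yet: only a full universe file can start the chain.
            fulls = [f for f in self.pending.values() if f.is_full]
            return min(fulls, key=lambda f: f.sequence, default=None)
        # Otherwise only the direct successor may be applied.
        return self.pending.get(self.last_applied + 1)

    def _apply(self, file):
        if file.is_full:
            self.universe = dict(file.records)
        else:
            for key, record in file.records.items():
                if record is None:
                    self.universe.pop(key, None)
                else:
                    self.universe[key] = record
        self.last_applied = file.sequence
```

In this sketch a delta that arrives early simply waits in pending until its predecessors have been applied. Files older than the first full universe file are never applied, which a real model would need to handle explicitly.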

Part Three

The next blog will focus on how to solve a production and/or UAT bug, and will demonstrate how component testing can help with this.



See https://www.gs.com/disclaimer/global_email for important risk disclosures, conflicts of interest, and other terms and conditions relating to this blog and your reliance on information contained in it.