Data Lineage 103: Legislative requirements

In ‘Data Lineage 102: Definition and key components’ we aligned the definition of data lineage and specified its key components from the viewpoint of data management. Now it is time to take a look at the requirements from the viewpoint of legislation.

Basel Committee on Banking Supervision’s standard number 239 (BCBS 239)1 and the EU’s General Data Protection Regulation (GDPR)2 as key legislative triggers for data lineage

As an example, we will take the requirements of BCBS 239, “Principles for effective risk data aggregation and risk reporting”, and of the GDPR.

The strangest thing is that you will never find the term ‘data lineage’ mentioned in these regulatory documents.

So where does that leave data management professionals? They need to investigate the requirements and translate them into the language of data management. Let’s do the same for the two regulations mentioned above.

Key data lineage components from data management viewpoint

As I discussed in ‘Data Lineage 102’, data lineage consists of the following interlinked components, also shown in Figure 1 below:

- IT systems (application, database, network segment)

Data flows through the chain of systems or applications in which data is being transformed and integrated.

‘Golden sources’ and reports/dashboards are the two boundaries: they denote, respectively, the point where data is created and its final destination.

- Business process

Business processes comprise the set of activities related to data processing. Business processes usually include references to related applications.

- Data (elements) themselves form the key component of data lineage. Data elements can be specified at different levels of abstraction and detail. Usually, you do it at one of the following data model levels:

  • Conceptual: data elements are presented in the form of terms and related constraints.
  • Logical, application related: data entities & attributes of a specific database and related data transformation rules.
  • Logical, not application related: data entities & attributes and related data transformation rules.
  • Physical: tables & columns & related ETLs (Extract, Transform, Load).

Usually, you would link data elements on different levels of data models. Such a link is sometimes called ‘vertical data lineage’, as opposed to ‘horizontal data lineage’, which represents the path of data from the point of origination to the point of usage. DAMA-DMBOK2 mentions the term ‘linkage’3 between different data model levels. In any case, physical data models are always linked to a specific application.

- Data checks and controls.

In the definition of data lineage specified by the Enterprise Data Management Council, ‘lineage may include a mapping of the data controls’4.

Now let’s plot the BCBS 239 and GDPR requirements onto the scheme of data lineage (Figure 1):

Figure 1. Key components of data lineage from the perspective of data management.
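
To make the idea of ‘vertical data lineage’ more concrete, the following is a minimal sketch in Python, not a reference implementation. It models data elements at the conceptual, logical and physical levels and resolves a business term down to the physical columns that implement it. All class names, function names and sample values (such as `crm_db` or `dim_customer.cust_name`) are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class ModelLevel(Enum):
    CONCEPTUAL = 1
    LOGICAL = 2
    PHYSICAL = 3


@dataclass(frozen=True)
class DataElement:
    name: str
    level: ModelLevel
    # Only physical elements carry an application, since physical
    # data models are always linked to a specific application.
    application: Optional[str] = None


# Vertical lineage: each element points to the elements one level
# down that implement it.
VerticalLinks = Dict[DataElement, List[DataElement]]


def resolve_to_physical(element: DataElement, links: VerticalLinks) -> List[DataElement]:
    """Follow vertical links downward until only physical elements remain."""
    if element.level is ModelLevel.PHYSICAL:
        return [element]
    resolved: List[DataElement] = []
    for child in links.get(element, []):
        resolved.extend(resolve_to_physical(child, links))
    return resolved


# Hypothetical example: one business term, one logical attribute,
# two physical columns in two applications.
term = DataElement("Customer name", ModelLevel.CONCEPTUAL)
attribute = DataElement("Customer.name", ModelLevel.LOGICAL)
crm_column = DataElement("customer.full_name", ModelLevel.PHYSICAL, application="crm_db")
dwh_column = DataElement("dim_customer.cust_name", ModelLevel.PHYSICAL, application="dwh")

links: VerticalLinks = {
    term: [attribute],
    attribute: [crm_column, dwh_column],
}

physical = resolve_to_physical(term, links)
for column in physical:
    print(column.application, column.name)
```

In practice, such links would be maintained in a metadata repository rather than in code; the sketch only shows the shape of the relationship.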

Key data lineage components from legislative viewpoint

There are certain requirements in the legislation that you can interpret as components of data lineage, see Figure 2:

Figure 2. Legislative requirements in relation to data lineage.

Information / reports

BCBS 239 stresses the necessity that ‘the right information needs to be presented to the right people at the right time’5, followed by requirements to ‘distribute risk reports to the relevant parties’6. In Figure 2, this component is mentioned as ‘Dashboards/ Reports’.

Business process

BCBS 239 also specifies that banks need ‘to document and explain all of their risk data aggregation processes whether automated or manual’7.

Business dictionary

BCBS 239 draws the attention of organizations to a business dictionary of ‘the concepts used in a report such that data is defined consistently across the organization’8.

From the data management perspective on data lineage, a business dictionary, which is a set of business terms, corresponds to the conceptual level of data models.

Data elements and business rules at logical level

BCBS 239 points out the requirement to maintain an ‘inventory and classification of risk data items’9, which you could translate as data elements at the logical level of data models. In addition to that, ‘automated and manual edit and reasonableness checks, including an inventory of the validation rules that are applied to quantitative information’10 are also required. ‘The inventory should include explanations of the conventions used to describe any mathematical or logical relationships that should be verified through these validations or checks’11. In the language of data management, this is interpreted as a repository of business rules.
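
As an illustration of what such a repository of business rules might look like, here is a minimal Python sketch. It is only one possible shape, not a structure prescribed by BCBS 239. The rule identifiers, field names (`secured`, `unsecured`, `total`) and the rules themselves are assumptions invented for this example. Each rule carries a plain-language explanation of the relationship it verifies, next to the executable check.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass(frozen=True)
class ValidationRule:
    rule_id: str
    explanation: str  # plain-language description of the relationship being verified
    check: Callable[[Record], bool]


# Hypothetical inventory with one reasonableness check and one
# mathematical relationship.
RULE_INVENTORY: List[ValidationRule] = [
    ValidationRule(
        rule_id="R001",
        explanation="Reasonableness check: exposure amounts must not be negative.",
        check=lambda r: r["secured"] >= 0 and r["unsecured"] >= 0,
    ),
    ValidationRule(
        rule_id="R002",
        explanation="Mathematical relationship: total = secured + unsecured.",
        check=lambda r: math.isclose(r["total"], r["secured"] + r["unsecured"]),
    ),
]


def failed_rules(record: Record, inventory: List[ValidationRule]) -> List[str]:
    """Return the identifiers of all rules that the record violates."""
    return [rule.rule_id for rule in inventory if not rule.check(record)]


good_record = {"secured": 60.0, "unsecured": 40.0, "total": 100.0}
bad_record = {"secured": 60.0, "unsecured": -40.0, "total": 100.0}

print(failed_rules(good_record, RULE_INVENTORY))
print(failed_rules(bad_record, RULE_INVENTORY))
```

Keeping the explanation and the check in the same entry means that the inventory documents the rule and can also be executed against the data.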

Application landscape

One of the BCBS 239 principles states that ‘a bank should design, build and maintain data architecture and IT infrastructure which fully supports its risk data aggregation capabilities and risk reporting practices’12.

GDPR requires that a company should ‘implement appropriate technical and organizational measures to ensure and to be able to demonstrate that processing is performed in accordance with this Regulation’13. Several articles of GDPR, e.g. Articles 24, 25 and 32, focus on the need for appropriate technical and organizational measures to ensure proper processing of personal data.

Even though there is no direct requirement to document how data flows through applications, data management professionals generally ‘translate’ these requirements as such.

Business and technical metadata

Metadata is one of the crucial components of data lineage. Metadata describes all other types of data, including all other components of data lineage mentioned above.

BCBS 239 stresses the necessity of recording business metadata, e.g. in the form of ‘ownership of risk data and information for both the Business and IT function’14. It also recommends documenting ‘integrated data taxonomies and architecture… which includes information on the characteristics of the data (metadata) as well as use of single identifiers and / or unified naming conventions for data including legal entities, counterparties, customers and accounts’15. This last requirement clearly relates to both business and technical metadata.

GDPR sets extended requirements for recording information about the processing of personal data. For example, it requires that ‘each controller… shall maintain a record of processing activities under its responsibility. That record shall contain all of the following information: … (b) the purposes of the processing; (c) a description of the categories of data subjects and of the categories of personal data; (d) the categories of recipients to whom the personal data have been or will be disclosed including recipients in third countries or international organizations; (e) where applicable, transfers of personal data to a third country or an international organization, including the identification of that third country or international organization; (f) where possible, the envisaged time limits for erasure of the different categories of data; (g) where possible, a general description of the technical and organizational security measures.’16. Everything that is mentioned in Article 30 of GDPR can be recognized as business metadata.
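
The items quoted above can be captured as a simple metadata structure. The following Python sketch is an assumption about how one might model such a record, not a format prescribed by GDPR. The class name, field names and sample values are hypothetical. Items that the quote qualifies with ‘where applicable’ or ‘where possible’ are treated as optional.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProcessingRecord:
    """One entry in a record of processing activities (hypothetical model)."""
    purposes: List[str]
    data_subject_categories: List[str]
    personal_data_categories: List[str]
    recipient_categories: List[str]
    third_country_transfers: List[str] = field(default_factory=list)  # where applicable
    erasure_time_limit: Optional[str] = None  # where possible
    security_measures: Optional[str] = None  # where possible

    def missing_items(self) -> List[str]:
        """Names of the unconditional items that are still empty."""
        required = {
            "purposes": self.purposes,
            "data_subject_categories": self.data_subject_categories,
            "personal_data_categories": self.personal_data_categories,
            "recipient_categories": self.recipient_categories,
        }
        return [name for name, value in required.items() if not value]


# Hypothetical entry with the recipients not yet documented.
newsletter = ProcessingRecord(
    purposes=["Sending the monthly newsletter"],
    data_subject_categories=["Subscribers"],
    personal_data_categories=["Name", "Email address"],
    recipient_categories=[],
    erasure_time_limit="2 years after unsubscribing",
)

print(newsletter.missing_items())
```

A check like `missing_items` makes gaps in the business metadata visible before a supervisory authority asks for the record.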

Furthermore, to ensure that data subjects can exercise their rights, a company must understand its metadata and have data lineage capabilities in place. Think, for example, about such rights of the data subject as the ‘right to obtain from the controller the erasure of personal data concerning him or her’17, the ‘right to obtain from the controller restriction of processing’18, or ‘the right to receive the personal data … in a structured, commonly used and machine-readable format and … the right to transmit those data to another controller’19. Knowing how data flows through applications at the physical level seems to be an unavoidable condition for meeting such requests.
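
To illustrate why physical-level lineage matters for such requests, here is a minimal Python sketch that walks horizontal lineage from a source column to every downstream copy and derives the applications an erasure request would touch. The flows, the `application.table.column` naming scheme and the function names are all hypothetical. A real landscape would read these flows from a metadata repository.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical physical data flows: (source column, target column),
# each written as "application.table.column".
FLOWS: List[Tuple[str, str]] = [
    ("crm.customer.email", "dwh.dim_customer.email"),
    ("dwh.dim_customer.email", "marketing.campaign_list.email"),
    ("dwh.dim_customer.email", "reporting.customer_report.email"),
    ("erp.invoice.amount", "dwh.fact_invoice.amount"),
]


def downstream_columns(start: str, flows: List[Tuple[str, str]]) -> List[str]:
    """Breadth-first walk from a column to every column it feeds."""
    graph: Dict[str, List[str]] = {}
    for source, target in flows:
        graph.setdefault(source, []).append(target)

    visited = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in graph.get(current, []):
            if target not in visited:
                visited.append(target)
                queue.append(target)
    return visited


def applications_in_scope(columns: List[str]) -> List[str]:
    """Applications that hold at least one of the given columns."""
    return sorted({column.split(".")[0] for column in columns})


columns = downstream_columns("crm.customer.email", FLOWS)
scope = applications_in_scope(columns)
print(columns)
print(scope)
```

Without documented flows, the same question has to be answered by interviewing application owners one by one.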

Data (quality) controls

BCBS 239 is rather direct about the necessity to ‘measure and monitor accuracy of data’20. It stresses that ‘Banks must produce aggregated risk data that is complete and measure and monitor the completeness of their risk data’21 and that ‘controls surrounding risk data should be as robust as those applicable to accounting data’22. ‘Integrated procedures for identifying, reporting and explaining data errors or weaknesses in data integrity via exceptions reports’23 are to be in place.

GDPR focuses rather on ‘technical and organizational measures to ensure a level of security appropriate to the risk’24 related to the processing of personal data.
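
A completeness measure and an exceptions report can be very simple in principle. The sketch below, in Python, is a minimal illustration under the assumption that completeness means ‘the field is populated’. The field names (`exposure`, `rating`), the sample records and the report format are hypothetical and are not taken from BCBS 239.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def completeness(records: List[Record], field_name: str) -> float:
    """Share of records in which the field is populated (0.0 to 1.0)."""
    if not records:
        return 0.0
    populated = sum(1 for record in records if record.get(field_name) is not None)
    return populated / len(records)


def exceptions_report(records: List[Record], required_fields: List[str]) -> List[str]:
    """One line per missing required value, identified by record position."""
    lines: List[str] = []
    for index, record in enumerate(records):
        for name in required_fields:
            if record.get(name) is None:
                lines.append(f"record {index}: missing '{name}'")
    return lines


# Hypothetical risk data with two gaps.
records: List[Record] = [
    {"exposure": 100.0, "rating": 3.0},
    {"exposure": None, "rating": 2.0},
    {"exposure": 250.0, "rating": None},
    {"exposure": 75.0, "rating": 1.0},
]

exposure_completeness = completeness(records, "exposure")
report = exceptions_report(records, ["exposure", "rating"])
print(exposure_completeness)
print(report)
```

The measure gives the figure to monitor over time, while the report lists the individual errors that someone has to explain.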

After considering the requirements of both regulations, I have come to the following list of components of data flow/lineage that your company should document and maintain:

  • Report (catalogue)
  • Application flow
  • Conceptual level of data model: terms and business dictionary
  • Logical level of data model: data entities and repository of related business / validation rules
  • Physical level of data model: database schemas and ETL repository
  • Business processes
  • Data (quality) checks and controls.

By now, you already know what components of data lineage you should document. Your next question might be “How do I complete that step?”. This is the topic for my next article, Data Lineage 104: Documenting data lineage.

————————————————————————————————————-

References:

1 Basel Committee on Banking Supervision. Principles for effective risk data aggregation and risk reporting (BCBS 239), 2013.

2 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).

3 DAMA International. DAMA-DMBOK: Data Management Body of Knowledge, Second Edition. Bradley Beach, N.J.: Technics Publications, 2017, p. 105.

4 Enterprise Data Management Council. The Standard Glossary of Data Management Concepts, version 0.2.1, 2017, p. 9.

5 BCBS 239, par. 51.

6 BCBS 239, Principle 11.

7 BCBS 239, Principle 3, par. 39.

8 BCBS 239, Principle 6, par. 37.

9 BCBS 239, Principle 8, par. 67.

10 BCBS 239, Principle 7, par. 53b.

11 BCBS 239, Principle 7, par. 53b.

12 BCBS 239, Principle 2.

13 Regulation (EU) 2016/679 (General Data Protection Regulation), Art. 24.

14 BCBS 239, Principle 2, par. 34.

15 BCBS 239, Principle 2, par. 33.

16 Regulation (EU) 2016/679 (General Data Protection Regulation), Art. 30.

17 Regulation (EU) 2016/679 (General Data Protection Regulation), Art. 17.

18 Regulation (EU) 2016/679 (General Data Protection Regulation), Art. 18.

19 Regulation (EU) 2016/679 (General Data Protection Regulation), Art. 20.

20 BCBS 239, Principle 3, par. 40.

21 BCBS 239, Principle 4, par. 43.

22 BCBS 239, Principle 3, par. 36a.

23 BCBS 239, Principle 7, par. 53c.

24 Regulation (EU) 2016/679 (General Data Protection Regulation), Art. 32.

