Data lineage 105: Implementation guidelines


In our previous ‘Data Lineage’ articles, we have discussed:

‘Data Lineage 101’: Why do we need data lineage?

‘Data Lineage 102’: What is data lineage?

‘Data Lineage 103’: What are the key legislation requirements for data lineage?

‘Data Lineage 104’: How can you document data lineage?

Here is a draft of the key steps your team can take to start your data lineage journey.

It is worth mentioning that an established data management framework and collaboration between data management professionals and stakeholders are a prerequisite for successful implementation of data lineage.

The implementation itself can be divided into 7 steps.

1. Identify your key business drivers for data lineage

Your company should have serious reasons to start thinking about documenting data lineage. For example:

  • legislation requirements
  • business changes
  • data quality initiatives
  • supervisory and audit requirements.

If any of these becomes crucial for the business, then you are ready to start discussing data lineage documentation with the top management of your company.

2. Secure the buy-in and involvement of top management

Data management and data lineage both require a lot of resources, human as well as financial, and will consume a lot of time. Without the commitment of top management, such initiatives have no future. Two key groups of benefits support these initiatives:

I. Improved work efficiency and increased revenue in the medium term. This can be achieved rather quickly, for example, just by improving the quality of your data. In more concrete terms, improving data quality can lead to:

  • increased revenue by 15-20% [1]
  • reduced operational costs by 40% [2]
  • decreased IT maintenance cost by 40-50% [3].

These monetary benefits result from reducing the cost of many manual operations with data, an optimized application landscape, etc.

II. Compliance with regulations, e.g. GDPR (the EU General Data Protection Regulation). If you live in the EU, you are probably familiar with the fines your company could receive due to data breaches.

When your top management gives you the ‘green light’ for the data lineage initiative, it is time to think about the scope of your initiative.

3. Scope your data lineage initiative

For each business driver you have chosen, you can find corresponding data sets. This is the first filter that will help you narrow your scope. For example, GDPR focuses on personal data. If you are just starting your data quality initiative, chances are high that the first thing you look at will be customer data.

The second filter is the identification of critical data elements (CDEs) within these data sets. CDEs are data elements that have the biggest impact on the performance of your company and on customer experience. Usually, these are the KPIs used to manage the company.

The techniques to identify these CDEs (KPIs) are rather simple. First, choose the most critical management reports and the KPIs they contain. The difficulties start when you need to identify which source data elements are needed to calculate these CDEs. And this is where the story of data lineage documentation begins. Once you have agreed on the scope of your initiative, you can define the scope of data lineage.
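To make the second filter more tangible, here is a minimal Python sketch of tracing a CDE back to the source data elements needed to calculate it. The element names and the `CALCULATED_FROM` mapping are hypothetical and serve only as an illustration; in practice this information comes out of your own lineage documentation. The sketch also assumes the calculation chain contains no circular definitions.

```python
# Hypothetical mapping: each derived element -> the elements it is calculated from.
# Any element that does not appear as a key is treated as a source data element.
CALCULATED_FROM = {
    "net_revenue_per_customer": ["net_revenue", "active_customers"],
    "net_revenue": ["gross_revenue", "refunds"],
}


def source_elements(element, calculated_from):
    """Return the set of source data elements needed to calculate `element`.

    Assumes there are no circular definitions in `calculated_from`.
    """
    inputs = calculated_from.get(element)
    if inputs is None:
        # Not derived from anything else, so it is a source element itself.
        return {element}
    found = set()
    for item in inputs:
        found |= source_elements(item, calculated_from)
    return found


cde_scope = source_elements("net_revenue_per_customer", CALCULATED_FROM)
print(sorted(cde_scope))  # ['active_customers', 'gross_revenue', 'refunds']
```

The resulting set is a candidate list of source data elements to include in the scope of the initiative for that one KPI.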

4. Define the scope for data lineage

You scope data lineage by using the concepts of ‘horizontal and vertical data lineage’.

Horizontally, the whole scope of data lineage starts with the original data sources and ends at the point of final usage. In large companies, especially those with a lot of subsidiaries, such chains are rather long and complicated. That is why a company very often starts with a limited ‘length’ of data lineage, for example, at some point of data aggregation.

Vertically, you can document data lineage on different levels of data models: conceptual, logical and physical. The number of levels on which you choose to document data lineage will drive the scope of the data lineage project.
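As a rough illustration of how these two choices drive the scope, the Python sketch below counts the links you would have to document for a given ‘length’ of the chain and a given set of data model levels. The stage names in `HORIZONTAL_CHAIN` are hypothetical, and a real chain is rarely a single straight line, so treat this as a simplified way of thinking rather than an estimation method.

```python
# Hypothetical, simplified chain from original source to final usage.
HORIZONTAL_CHAIN = [
    "source_system",
    "staging_area",
    "data_warehouse",
    "group_aggregation",
    "management_report",
]
MODEL_LEVELS = ["conceptual", "logical", "physical"]


def lineage_scope(chain, start, end, levels):
    """List the (from_stage, to_stage, level) links to document between start and end."""
    i, j = chain.index(start), chain.index(end)
    if i >= j:
        raise ValueError("start must come before end in the chain")
    # Consecutive pairs of stages between start and end.
    hops = list(zip(chain[i:j], chain[i + 1:j + 1]))
    return [(a, b, level) for level in levels for a, b in hops]


# Full length of the chain, documented on all three levels.
full = lineage_scope(HORIZONTAL_CHAIN, "source_system", "management_report", MODEL_LEVELS)
# Limited length, documented on the conceptual and logical levels only.
limited = lineage_scope(
    HORIZONTAL_CHAIN, "data_warehouse", "management_report", ["conceptual", "logical"]
)
print(len(full), len(limited))  # 12 4
```

Even in this toy setup, shortening the chain and dropping one level reduces the number of links to document from 12 to 4.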

5. Prepare the business requirements for data lineage

Different groups of stakeholders have different requirements and expectations for data lineage.

There are at least two key groups: business stakeholders, e.g. auditors, business and data analysts, and financial controllers, and technical stakeholders, e.g. IT engineers, database managers, etc.

If your company has little experience with data lineage, the topic remains very abstract. As one of my colleagues put it: ‘Everyone wants data lineage, but no one can exactly explain what they mean by that and what their expectations are’. While conducting interviews with business stakeholders, I came to fully agree with this statement.

There are some specific features when it comes to the requirements of these two groups.

Business stakeholders are mostly interested in:

  • the ability to run root-cause analysis, starting from the end reports and going back to the ‘golden’ source
  • the value of data lineage rather than its design (I explained the differences between these two types of data lineage in ‘Data Lineage 104’.)
  • data lineage on conceptual or logical data model levels

(You will also find an in-depth explanation of the data lineage components at these levels in ‘Data Lineage 104’.)

On the other hand, the technical stakeholders focus on:

  • impact analysis, starting from the source of a data element and following its path to its final destination
  • metadata design lineage
  • data lineage on the physical level.
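The two directions of analysis mentioned above can be sketched as two traversals of the same lineage graph: root-cause analysis walks upstream from a report, while impact analysis walks downstream from a source. The Python example below is only a sketch; the element names in `EDGES` are hypothetical, and real lineage tools store far richer metadata than plain pairs.

```python
from collections import defaultdict, deque

# Hypothetical lineage links: (upstream element, downstream element).
EDGES = [
    ("crm.customer", "dwh.customer_dim"),
    ("billing.invoice", "dwh.revenue_fact"),
    ("dwh.customer_dim", "report.revenue_per_customer"),
    ("dwh.revenue_fact", "report.revenue_per_customer"),
    ("dwh.revenue_fact", "report.monthly_revenue"),
]


def build_index(edges):
    """Index the links in both directions."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return downstream, upstream


def reachable(start, neighbours):
    """Breadth-first search: every element reachable from `start`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


downstream, upstream = build_index(EDGES)

# Root-cause analysis: from an end report back towards the sources.
root_cause = reachable("report.monthly_revenue", upstream)
# Impact analysis: from a source forward to everything that depends on it.
impact = reachable("crm.customer", downstream)

print(sorted(root_cause))
print(sorted(impact))
```

The same documented links serve both stakeholder groups; only the direction in which you read them differs.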

I would advise you to spend time talking with different groups of business stakeholders to clarify their expectations, make them more realistic and align all the requirements in a unified document.

When this is done, you can finally move to deciding how you will document data lineage.

6. Choose the method to document data lineage

I already provided a comparative analysis of the two methods, descriptive and automated, in ‘Data Lineage 104’.

As I have already stressed several times, documenting data lineage is a very time- and resource-consuming task.

First of all, you should assess which of the existing methods is most feasible given your company’s resources.

The level at which you document data lineage will also affect your choice of method. You should also be aware that, regardless of the method you choose, a lot of manual work will still be required to document data lineage.

As soon as the decision is made, you should think about suitable software.

7. Choose a suitable application to document data lineage

Not surprisingly, even large companies document data lineage using MS applications such as Excel, Word, PowerPoint and Visio. If you decide to document data lineage on conceptual or logical levels and are presented with a choice of applications, take a look at applications such as Axon, Collibra, or Erwin. Should you opt for an automated solution, market leaders such as SAS and Informatica would get your attention first. The key providers of automated metadata lineage are listed at metaintegration.com.

At this point, our development adventure into the data lineage world comes to a close. During my personal journey into data lineage I have realized that establishing a data management framework revolves around optimization of data lineage, in other words, the data & information value chain.

If you would like to know more about the documentation of data lineage and the information value chain, you can consult my latest book ‘The Data Management Toolkit’. I am here to help you if the water gets rough or the path overgrown. Good luck creating your data lineage.

——————————————————————————————————–

References:

1-3. BackOffice Associates. “How Data Quality Impacts Business Processes”: boaweb.com/rs/backoffice/images/Infographic-DQ_FINAL.pdf.

