Critical Data Elements – How to Implement Them for Your Business

Recently, the concept of critical data has caught the attention of data management professionals. I myself was no exception, so I decided to dive deeper into this subject and do some research. In this article, I would like to share some results of my research and my experience with:

  • …definitions of critical data and critical data elements (CDEs)
  • …reasons to use CDEs
  • …key challenges with CDEs in practical implementation.

The definition of critical data

As a starting point of my research, I decided to consult the leading data management guides and legislation documentation to see what they had to say about critical data (elements).

The concept of critical data appears in the second edition of DAMA-DMBOK (DAMA-DMBOK2) by DAMA International, in the topics related to the Data Quality Knowledge Area [1]. DAMA-DMBOK2 provides only general characteristics of critical data.

Critical data is specified by its usage, which is ‘regulatory reporting, financial reporting, business policy, ongoing operations, business strategy’ [2]. DAMA-DMBOK2 also stresses that ‘specific drivers for criticality will differ by industry’ [3]. That was all the guide had to say on the topic, so it seemed that if I wanted to go beyond these definitions, I would have to develop the concept of critical data myself.

The critical data concept has also been introduced in the Basel Committee on Banking Supervision’s standard number 239: “Principles for effective risk data aggregation and risk reporting” (BCBS 239 or PERDARR). BCBS 239 speaks about critical data in the following contexts:

  1. ‘data that is critical to enabling the bank to manage the risks it faces’ [4]
  2. ‘data critical to risk data aggregation and IT infrastructure initiative’ [5]
  3. ‘aggregated information to make critical decisions about risk’ [6].

After consulting these guidelines and regulations, I concluded that the concept of critical data is not yet consistently defined across the various sources. For the purpose of this article, we might keep in mind the following:

  • critical data influences the company’s management decisions and performance, both financial and non-financial
  • the criteria of criticality should be developed on a company by company basis.

Now let’s talk about the business value of implementing the critical data elements concept.

Reasons to use CDEs

The key reason to use the concept of CDEs in your practice is to limit the scope of your data management initiatives to the feasible minimum.

Assume that your key data management driver is compliance with a regulation. If you focus on compliance with the EU General Data Protection Regulation, you will deal only with personal data. If you deal with BCBS 239, you will limit your data to that which is related to risk reporting. Still, the scope of the definition of ‘risk’ data is very wide. Therefore, you should focus only on those risk metrics or KPIs that your company is using to manage business risks. The same applies to financial data, which constitutes almost 80% of data circulating within a company.

Even though the definition of CDE is not very well aligned, in theory, everything seems to be rather clear… Until you start implementing the concept in practice. Then it can become rather challenging. Let’s take a closer look at what the main challenges are and how to deal with them.

Key challenges in practical implementation of CDEs

1. How to define CDEs.

For the resolution of this challenge, let’s look back at our first conclusion: critical data elements are those that influence company performance, both financial and non-financial. The easiest way to define your company’s CDEs can be described in several steps:

  • Identify your key drivers for the current data management initiative. Think about compliance with regulations, improvement of customer experience, optimization of decision making and so on. Each business driver will require a specific set of data and/or information. For example, for improvement of customer experience, you will mainly focus on customer data.
  • Specify your key (critical) reports and information. Despite a lot of talk about digitalization, when it comes to decision making, major companies still rely on different reports. Reports are simply containers of information. What you could do is list your reports and choose the most critical ones. Such an analysis will help optimize information delivery in your company.
  • Define your critical data elements. Once the critical reports are specified, you should start analysing the critical data elements. They usually reside within reports in the form of KPIs or metrics. You might count 50-100 such critical data elements.
  • Minimize the number of critical data elements. You can minimize the number of critical data elements by involving subject matter experts.

You might ask me now: why do we need to minimize the number of CDEs? This is the second challenge you deal with: what to do with the CDEs?

2. What are you going to do with CDEs?

We have specified that CDEs are data elements that have the biggest influence on decision making and company performance. This means that the value of the CDEs depends on their reliability. Therefore, you need to ensure that these CDEs are based on correct data and are calculated correctly. What we face here are basic data quality challenges. Your key goal is to check and prove the reliability of the KPIs or metrics that you specify as critical.
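As a rough illustration of these two checks (correct input data and correct calculation), the sketch below validates the source rows and then recomputes a ratio-type KPI independently to compare it with the reported value. Everything here is an assumption made for the example: the field names, the rule that values must be present and non-negative, and the tolerance. Real data quality rules would be agreed with the business.

```python
import math


def check_inputs(rows, required_fields):
    """Return a list of issues: missing or negative values in the source rows."""
    issues = []
    for i, row in enumerate(rows):
        for name in required_fields:
            value = row.get(name)
            if value is None:
                issues.append(f"row {i}: {name} is missing")
            elif value < 0:
                issues.append(f"row {i}: {name} is negative")
    return issues


def recompute_ratio(rows, numerator, denominator):
    """Recalculate the KPI from the source rows (assumes a non-zero denominator total)."""
    total_num = sum(row[numerator] for row in rows)
    total_den = sum(row[denominator] for row in rows)
    return total_num / total_den


def kpi_is_reliable(rows, reported_value, numerator, denominator, rel_tol=1e-6):
    """Check the inputs first, then compare the reported KPI with a recalculation."""
    issues = check_inputs(rows, [numerator, denominator])
    if issues:
        return False, issues
    recomputed = recompute_ratio(rows, numerator, denominator)
    if not math.isclose(recomputed, reported_value, rel_tol=rel_tol):
        return False, [f"reported {reported_value} but recomputed {recomputed}"]
    return True, []


# Hypothetical source rows behind a 'bad loan ratio' KPI.
rows = [
    {"bad_loans": 5.0, "total_loans": 100.0},
    {"bad_loans": 15.0, "total_loans": 300.0},
]
ok, issues = kpi_is_reliable(rows, 0.05, "bad_loans", "total_loans")
print(ok, issues)
```

A failed check returns the list of issues, which gives you a starting point for root-cause analysis.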

3. Recognise ultimate and transitional CDEs and different criteria of their criticality

So far, we have talked about how CDEs usually reside in reports in the form of KPIs or metrics. I call them the ‘ultimate’ CDEs because they are located at the final point of the data/information processing path. Such a path is called ‘data lineage’ or the ‘data/information value chain’. I consider the information value chain to be a set of business capabilities that enable the transformation of raw data into meaningful information to enhance decision making at different organizational levels in the company. In this respect, data lineage is the way to document or record the information value chain.

If we need to ensure that critical data elements are of the required quality, we need to be able to perform root-cause analysis and investigate the whole chain of data being processed and used to derive the specified ultimate CDEs. All data elements that are involved in the calculation of the ultimate CDEs I call the ‘transitional’ ones. The criterion of criticality for the transitional CDEs is their impact on the calculation results of the ultimate ones. The concept of ‘ultimate’ and ‘transitional’ CDEs is illustrated in Figure 1:

Figure 1. The concept of the ‘ultimate’ and ‘transitional’ CDEs.

If you take a look at the illustrated relationships between the ‘ultimate’ and the ‘transitional’ CDEs, you will understand that this is a visualization of data lineage. And this brings you to the next challenge: to ensure ultimate CDEs are trustworthy and auditable, you need to have the whole data lineage in place.
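To make the relationship concrete, here is a minimal sketch that treats documented lineage as a simple mapping from each data element to the elements it is calculated from, and walks it upstream to list the transitional CDEs of one ultimate CDE. The element names and the dictionary representation are assumptions made for illustration. Real lineage usually lives in a metadata repository or a lineage tool.

```python
# Hypothetical lineage: each element maps to the elements it is calculated from.
lineage = {
    "Net profit margin": ["Net profit", "Revenue"],
    "Net profit": ["Revenue", "Total costs"],
    "Total costs": ["Operating costs", "Tax"],
}


def transitional_cdes(ultimate, lineage):
    """Collect every upstream element of `ultimate` by walking the lineage graph."""
    found = []
    stack = list(lineage.get(ultimate, []))
    while stack:
        element = stack.pop()
        if element in found:
            continue  # already visited, reached through another path
        found.append(element)
        stack.extend(lineage.get(element, []))
    return found


upstream = transitional_cdes("Net profit margin", lineage)
print(sorted(upstream))
```

For the ultimate CDE ‘Net profit margin’, the walk returns the five upstream elements, including ‘Revenue’, which is reached through two different paths but listed only once.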

4. Data lineage and CDEs: the ‘chicken or the egg’ dilemma

We have just faced the most critical challenge: data lineage is a prerequisite for managing CDEs! I hope that you are familiar with the data lineage concept. If you need to refresh your knowledge, I can refer you to the set of articles I recently published on the subject (Data Lineage 101, 102, 103, 104 & 105).

The reality is that not many companies have data lineage in place. So, the situation reminds me of the well-known ‘the chicken or the egg’ dilemma. To manage CDEs you need to have data lineage, but to be able to document data lineage, you need to have CDEs to limit the scope.

What should you do in such a situation? One practical tip is the following. If you know your source data elements, you can ask experts to specify the most critical ones and hope that the specified CDEs really have the biggest influence on the calculation results. Such an approach does not remove the need to attempt to document data lineage. The last challenge, which relates to both the data lineage and the CDE concepts, is: on which level of data models should you specify CDEs?

5. Specify the level of data models to document CDEs and data lineage

Dealing with this challenge, you may conclude that the documentation for the ‘ultimate’ and the ‘transitional’ data elements will differ.

‘Ultimate’ data elements, which are often KPIs or metrics in reports, will need to be specified on the conceptual or logical levels of data models. To explain how the ‘transitional’ data elements are processed to derive the ‘ultimate’ ones, you will need to document them as illustrated in Figure 2:

Figure 2: Data model levels to document ultimate and transitional CDEs.

There are some other questions to be answered, such as: who is responsible for documenting ultimate and transitional CDEs? I will come back to the topic of data management related roles in my future articles.

I hope that by now you are reasonably equipped to continue with the practical implementation of the critical data elements concept.

Those who are interested in learning more about the application of the concept of CDEs and the information value chain can consult my new book, The Data Management Toolkit. You can download the first chapter for free HERE, or purchase it on Amazon HERE.

————————————————————————————————————————-

References

1. DAMA International. DAMA-DMBOK: Data Management Body of Knowledge, Second Edition. Bradley Beach, N.J.: Technics Publications, 2017, p.454.

2. DAMA International. DAMA-DMBOK: Data Management Body of Knowledge, Second Edition. Bradley Beach, N.J.: Technics Publications, 2017, p.454.

3. DAMA International. DAMA-DMBOK: Data Management Body of Knowledge, Second Edition. Bradley Beach, N.J.: Technics Publications, 2017, p.454.

4. The Basel Committee on Banking Supervision’s standard number 239: “Principles for effective risk data aggregation and risk reporting” (BCBS 239 or PERDARR), par.16.

5. The Basel Committee on Banking Supervision’s standard number 239: “Principles for effective risk data aggregation and risk reporting” (BCBS 239 or PERDARR), par.30.

6. The Basel Committee on Banking Supervision’s standard number 239: “Principles for effective risk data aggregation and risk reporting” (BCBS 239 or PERDARR), par.52.


