This is part 2 of a four-part series on Data Governance, taken from a paper written by Managing Partner Virginia Flores.
There are numerous problems which can affect data quality. As Figure 1 shows, these issues can be grouped into three basic categories:
Figure 1: Processes affecting data quality
Processes bringing data in from the outside include data feeds and initial data conversions, two of the biggest sources of problems that cause immediate declines in data quality. Simply stated, most companies have no standard checks and balances built into their interfaces or applications to validate imported data, because the data is assumed to come from a “trusted” source. As a result, the data is simply migrated from one point to another, and incorrect data is fed through the organization.
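As a rough illustration of the kind of check an import interface could apply before accepting a feed, here is a minimal Python sketch. The field names, the five-digit postal code rule, and the function names are assumptions made up for this example; they are not taken from the paper or from any particular system.

```python
# Hypothetical required fields for an incoming customer feed (illustrative only).
REQUIRED_FIELDS = ("customer_id", "name", "postal_code")


def validate_record(record):
    """Return a list of reasons why a record fails the checks (empty if it passes)."""
    reasons = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            reasons.append(f"missing {field}")
    # Assumed format rule: a postal code, when present, must be exactly 5 digits.
    postal = str(record.get("postal_code") or "").strip()
    if postal and not (postal.isdigit() and len(postal) == 5):
        reasons.append("postal_code is not 5 digits")
    return reasons


def screen_feed(records):
    """Split a feed into accepted records and rejected (record, reasons) pairs."""
    accepted, rejected = [], []
    for record in records:
        reasons = validate_record(record)
        if reasons:
            rejected.append((record, reasons))
        else:
            accepted.append(record)
    return accepted, rejected


feed = [
    {"customer_id": 1, "name": "Acme", "postal_code": "12345"},
    {"customer_id": 2, "name": "", "postal_code": "12A45"},
]
accepted, rejected = screen_feed(feed)
```

The point of the sketch is that rejected records are held back with a stated reason instead of being passed along, even when the source is considered trusted.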
Processes changing data from within are those processes internal to the application. Some of these processes are routine, while others are brought about by periodic system upgrades, mass data updates, database redesign, and a variety of ad-hoc activities. Unfortunately, in practice most of these procedures are carried out without sufficient time, resources, or the reliable metrics needed to understand all of their data quality implications.
Processes causing data decay are processes that cause accurate data to become inaccurate over time, without any physical changes being made to it. The main reason is that data within an application is static, while the real-time data it represents is always changing. This is shown by the following graphic, where t0 represents the moment in time when the data is entered into the system:
The idea that data quality is a constantly changing target (static versus real-time data) is one of the main issues driving the creation of a data governance infrastructure within organizations.
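One simple way to act on the static versus real-time gap is to flag values whose age since entry (t0) exceeds a threshold, so they can be re-verified. The sketch below assumes hypothetical field names and review periods; the actual periods would depend on how quickly each kind of data changes in a given organization.

```python
from datetime import date, timedelta

# Hypothetical review periods per field (illustrative values, not from the paper).
MAX_AGE = {
    "address": timedelta(days=365),
    "phone": timedelta(days=730),
}


def is_stale(field, entered_on, today):
    """Return True if the value has gone longer than its review period since entry (t0)."""
    return today - entered_on > MAX_AGE[field]


t0 = date(2020, 1, 1)        # moment the data was entered into the system
today = date(2021, 6, 1)     # moment the check is run

address_needs_review = is_stale("address", t0, today)
phone_needs_review = is_stale("phone", t0, today)
```

A flag like this does not say the value is wrong, only that enough time has passed that it should no longer be assumed correct.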
These days, data quality impacts an organization at every level and in every department. Initially, data quality concerns were confined to CRM (Customer Relationship Management) systems. Now the complexity extends beyond structured customer data. Organizations are concerned about governing access to many types of data, including unstructured content, trade secrets, financial data, patient information, video, and audio. Data that is incorrect or inconsistent can have a profound effect on an organization’s day-to-day operations. In work order systems, work orders are not completed correctly because part numbers cannot be matched up. In financial systems, vendors are told “the check is in the mail” when it has actually gone to the wrong address. In time systems, employees are not paid overtime correctly because timesheets were incorrectly coded to the pay differential scale. Many organizations neither recognize nor accept the poor quality of their data, and instead try to divert attention to supposed faults within their systems or processes.
In the next post, we will talk about the importance of a Data Governance Framework.