Decisions are only as good as the information on which they are based. The potential damage to service users arising from poor data quality as well as the legal, financial and reputational costs to the organisation are of such magnitude that organisations must be willing to take the time and give the necessary commitment to improve data quality. Every organization today depends on data to understand its customers and employees, design new products, reach target markets, and plan for the future. Accurate, complete, and up-to-date information is essential if you want to optimize your decision making, avoid constantly playing catch-up and maintain your competitive advantage.
Business leaders recognize the value of big data and are eager to analyse it to obtain actionable insights and improve business outcomes. Unfortunately, the proliferation of data sources and exponential growth in data volumes can make it difficult to maintain high-quality data. To fully realize the benefits of big data, organizations need to lay a strong foundation for managing data quality, with best-of-breed data quality tools and practices that can scale and be leveraged across the enterprise.
What can your organisation do to make data quality a success?
Within an organization, acceptable data quality is crucial to operational and transactional processes and to the reliability of business analytics (BA) / business intelligence (BI) reporting.
Confidence in the quality of the information they produce is a survival issue for government agencies around the world. The Health Information and Quality Authority of Ireland has adopted a business-driven approach to standards for data and information and endorses the “Seven essentials for improving data quality” guide.
Data Quality is central to an effective performance management system throughout the organization. Data quality is a complex measure of data properties from various dimensions and determined by whether or not the data is suitable for its intended use. This is generally referred to as being “fit-for-purpose”. Data is of sufficient quality if it fulfils its intended use (or re-use) in operations, decision making or planning. Maintaining data quality requires going through the data periodically and scrubbing it. Typically this involves updating it, standardizing it, and de-duplicating records to create a single view of the data, even if it is stored in multiple disparate systems.
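The scrubbing steps described above (standardizing, de-duplicating and building a single view across systems) can be sketched in a few lines of Python. This is a minimal illustration only: the record layout, the two source systems (`crm`, `billing`) and the rule "match on email, keep the most recently updated record" are assumptions made for the example, not a prescribed method.

```python
def standardize(record):
    """Return a copy of the record with trimmed, consistently cased fields."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "updated": record["updated"],  # ISO date string, e.g. "2015-03-01"
    }


def single_view(*sources):
    """Merge records from several systems, keeping the newest per email."""
    merged = {}
    for source in sources:
        for raw in source:
            rec = standardize(raw)
            key = rec["email"]  # assumed matching key for this sketch
            # ISO date strings compare correctly as plain strings.
            if key not in merged or rec["updated"] > merged[key]["updated"]:
                merged[key] = rec
    return sorted(merged.values(), key=lambda r: r["email"])


# Hypothetical extracts from two disparate systems.
crm = [
    {"name": "  ann  murphy ", "email": "Ann.Murphy@Example.com ", "updated": "2015-01-10"},
    {"name": "Brian Kelly", "email": "brian.kelly@example.com", "updated": "2014-11-02"},
]
billing = [
    {"name": "Ann Murphy", "email": "ann.murphy@example.com", "updated": "2015-03-01"},
]

customers = single_view(crm, billing)
for customer in customers:
    print(customer)
```

In practice the matching rule is the hard part: real records rarely share a clean key, so the business areas would need to agree on what counts as "the same" customer before any merge is automated.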
Data Quality Management entails the establishment and deployment of roles, responsibilities, policies, and procedures concerning the acquisition, maintenance, dissemination, and disposition of data. A partnership between the business and technology groups is essential for any data quality management effort to succeed. The business areas are responsible for establishing the business rules that govern the data and are ultimately responsible for verifying the data quality. The Information Technology (IT) group is responsible for establishing and managing the overall environment – architecture, technical facilities, systems, and databases – that acquire, maintain, disseminate, and dispose of the electronic data assets of the organization.
A data quality assurance program
Data quality assurance (DQA) is the process of verifying the reliability and effectiveness of data: an explicit combination of organization, methodologies, and activities that exist for the purpose of reaching and maintaining high levels of data quality. To make the most of open and shared data, public and government users need to define what data quality means with reference to their specific aim or objectives. They must understand the characteristics of the data and consider how well it meets their own needs or expectations. For each dimension of quality, consider what processes must be in place to manage it and how performance can be assessed.
Data quality control
Data quality control is the process of controlling the usage of data with known quality measurements for an application or a process. It usually follows a data quality assurance (DQA) process, which consists of discovering and correcting data inconsistencies. Data quality is affected by the way data is entered, stored and managed. Analytics can be worthless, counterproductive and even harmful when based on data that isn’t high quality. Without high-quality data, it doesn’t matter how fast or sophisticated the analytics capability is. You simply won’t be able to turn all that data managed by IT into effective business execution.
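One way to picture data quality control is as a gate: a dataset is released to an application only if its measured quality meets thresholds agreed in advance. The sketch below assumes quality scores between 0.0 and 1.0 have already been produced by an earlier assurance step; the dataset name, dimension names and threshold values are all hypothetical.

```python
class DataQualityError(Exception):
    """Raised when a dataset does not meet the agreed thresholds."""


def check_quality(measurements, thresholds):
    """Return the dimensions whose measured score is below its threshold."""
    return [
        dimension
        for dimension, minimum in thresholds.items()
        # A dimension with no measurement is treated as failing.
        if measurements.get(dimension, 0.0) < minimum
    ]


def release_for_use(dataset_name, measurements, thresholds):
    """Allow the dataset to be used only if every threshold is met."""
    failures = check_quality(measurements, thresholds)
    if failures:
        raise DataQualityError(
            f"{dataset_name} blocked; below threshold on: {', '.join(sorted(failures))}"
        )
    return True


# Hypothetical scores from an earlier assurance step, and agreed minimums.
measured = {"completeness": 0.98, "validity": 0.91}
required = {"completeness": 0.95, "validity": 0.95}

try:
    release_for_use("patient_register", measured, required)
except DataQualityError as err:
    print(err)
```

Treating an unmeasured dimension as a failure is a deliberate choice here: data of unknown quality is kept out of the downstream process until someone has measured it.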
Difference between Data and Information
Data and information are interrelated. In fact, they are often mistakenly used interchangeably.
Data is raw, unorganized facts that need to be processed. Data can be something simple and seemingly random and useless until it is organized.
When data is processed, organized, structured or presented in a given context so as to make it useful, it is called information.
If the information we derive from the data is not accurate, we cannot make reliable judgments or develop reliable knowledge from the information. And that knowledge simply cannot become wisdom, since cracks will appear as soon as it is tested.
Bad data costs time and effort, gives false impressions, results in poor forecasts and devalues everything else in the continuum.
What are the factors determining data quality?
Understanding the difference between data and information is the key to solving data quality problems. To be most effective, the right data needs to be available to decision makers in an accessible format at the point of decision making. The quality of data can be determined through assessment against the following internationally accepted dimensions.
In 1987 David Garvin of the Harvard Business School developed a system for thinking about the quality of products. He proposed eight critical dimensions or categories of quality that can serve as a framework for strategic analysis: performance, features, reliability, conformance, durability, serviceability, aesthetics, and perceived quality.
Agencies create or collect data and information to meet their operational and regulatory requirements. They will define their own acceptable levels of data quality according to these primary purposes. It is often a mistake to stick with old quality measures when the external environment has changed.
Thus dimensions of quality also differ from user to user: completeness, legibility, relevance, reliability, accuracy, timeliness, accessibility, interpretability, coherence and validity. Data also has to be manageable in volume and cost-effective. Clearly these dimensions are not independent of each other. Assessing data against them helps ensure that an organisation has a good level of data quality supporting the information it produces.
The dimensions contributing to data quality
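Some of the dimensions named above, such as completeness, validity and timeliness, can be expressed as simple ratios over a set of records. The sketch below is one possible way to score them; the field names, the deliberately simple email pattern, the one-year freshness window and the sample records are all assumptions chosen for illustration, and each organisation would define its own rules.

```python
import re
from datetime import date

# Deliberately simple pattern; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(records, fields):
    """Share of required field values that are present and non-empty."""
    total = len(records) * len(fields)
    filled = sum(1 for r in records for f in fields if r.get(f) not in (None, ""))
    return filled / total if total else 0.0


def validity(records, field, pattern):
    """Share of non-empty values in `field` that match `pattern`."""
    values = [r[field] for r in records if r.get(field)]
    if not values:
        return 0.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def timeliness(records, field, today, max_age_days):
    """Share of records whose date in `field` is within `max_age_days` of `today`."""
    if not records:
        return 0.0
    fresh = sum(1 for r in records if (today - r[field]).days <= max_age_days)
    return fresh / len(records)


# Hypothetical sample records.
records = [
    {"name": "Ann Murphy", "email": "ann@example.com", "updated": date(2015, 3, 1)},
    {"name": "", "email": "not-an-email", "updated": date(2013, 6, 15)},
    {"name": "Brian Kelly", "email": None, "updated": date(2015, 2, 20)},
    {"name": "Cara Byrne", "email": "cara@example.org", "updated": date(2014, 12, 5)},
]

scores = {
    "completeness": completeness(records, ["name", "email"]),
    "validity": validity(records, "email", EMAIL_PATTERN),
    "timeliness": timeliness(records, "updated", date(2015, 3, 31), 365),
}
for dimension, score in scores.items():
    print(f"{dimension}: {score:.2f}")
```

Scores like these only become meaningful once they are compared with the level of quality the agency has defined as acceptable for its own purpose.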
Master Data Management
Many business problems ultimately trace back to a lack of data governance and poor-quality data. Master data management technology can address a lot of these issues, but only when driven by an MDM strategy that includes a vision that supports the overall business and incorporates a metrics-based business case. Data governance and organizational issues must be put front and centre, and new processes designed to manage data through the entire information management life cycle. Only then can you successfully implement the new technology you’ll introduce in a data quality or master data management initiative.
At its recent Master Data Management Summit in Europe, Gartner recommended a structured approach to implementing master data management: begin with strategy development and planning, then set up a process to govern data. This in turn supports change management of all types and directs the use of data towards strategic business goals. Once in place, data management can be measured, monitored and adjusted to stay on course.
MDM includes the processes, governance, policies, standards and tools needed to manage an organization’s critical data. MDM applications manage customer, supplier, product, and financial data, with data governance services and supporting world-class integration and BI components. Data quality is a first step towards MDM: you can start with one application, knowing that MDM will be introduced as more applications become involved.
Effective data governance serves an important function within the enterprise, setting the parameters for data management and usage, creating processes for resolving data issues and enabling business users to make decisions based on high-quality data and well-managed information assets. But implementing a data governance framework isn’t easy. Complicating factors often come into play, such as data ownership questions, data inconsistencies across different departments and the expanding collection and use of big data in companies.
At its core, data governance incorporates three key areas: people, process and technology. In other words, a data governance framework assigns ownership and responsibility for data, defines the processes for managing data, and leverages technologies that will help enable the aforementioned people and processes. At the end of the day, data quality and data governance are not synonymous, but they are closely related. Quality needs to be a mandatory piece of a larger governance strategy. Without it, your organization is not going to successfully manage and govern its most strategic asset: its data.
Any good active data governance methodology should let you measure your data quality. This is important because data quality has multiple dimensions which need to be managed. Data governance initiatives improve data quality by assigning a team responsible for the data’s accuracy, accessibility, consistency, and completeness, among other metrics. This team usually consists of executive leadership, project management, line-of-business managers, and data stewards. The team usually employs some form of methodology for tracking and improving enterprise data, such as Six Sigma, and tools for data mapping, profiling, cleansing, and monitoring.
ISO 8000 is the international standard that defines the requirements for quality data. Understanding this standard and how it can be used to measure data quality is an important first step in developing any information quality strategy.
- NSW Government, “Standard for Data Quality Reporting”, March 2015
- Health Information and Quality Authority of Ireland, “What you should know about data quality”, 2012