Data Quality in Internal Audit: Why Bad Data Produces Misleading Conclusions

The shift toward data analytics in internal audit has created a new category of audit risk that many functions have not fully addressed: the risk that the data being analysed does not accurately represent the population being audited. Audit analytics is only as good as the data it runs on — and many organisations' data environments contain quality issues that, if undetected, will cause analytical procedures to produce misleading or incorrect conclusions.

The Dimensions of Data Quality

Data quality in the context of audit analytics can be assessed across several dimensions, each of which can independently compromise the reliability of analytical conclusions.

Completeness: Does the dataset contain all the records that should be present? Missing records — through extraction errors, system boundaries that exclude certain transaction types, or deliberate manipulation — will cause analytics to undercount the population being examined. A completeness test compares the number of records in the extract against an independent count from the source system and checks for gaps in sequence numbers or date ranges that should have continuous coverage.
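As a minimal sketch of the completeness test described above, the following Python function compares an extract's record count with an independently obtained source count and lists gaps in a numeric sequence. The function name, the use of integer document numbers, and the assumption that numbering should be continuous between the lowest and highest value in the extract are illustrative, not drawn from any particular system.

```python
from typing import Iterable


def completeness_check(extract_ids: Iterable[int], source_count: int) -> dict:
    """Compare an extract against an independent record count and find sequence gaps."""
    ids = list(extract_ids)
    distinct = set(ids)
    # Numbers absent between the lowest and highest ID present in the extract.
    # Records missing before the first or after the last extracted ID are not
    # visible here; the count comparison is what catches those.
    if distinct:
        missing = sorted(set(range(min(distinct), max(distinct) + 1)) - distinct)
    else:
        missing = []
    return {
        "extract_count": len(ids),
        "source_count": source_count,
        "count_matches": len(ids) == source_count,
        "duplicate_ids": len(ids) - len(distinct),
        "missing_sequence_numbers": missing,
    }


# Hypothetical example: five extracted document numbers, source system reports eight.
result = completeness_check([1001, 1002, 1004, 1005, 1008], source_count=8)
print(result["count_matches"], result["missing_sequence_numbers"])
```

A gap in the sequence is a prompt for enquiry rather than proof of a missing record, since some systems legitimately skip or void numbers.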

Accuracy: Do the values in the dataset correctly represent the underlying transactions? Accuracy issues include data entry errors, system conversion errors, rounding differences, and currency conversion inconsistencies. Accuracy testing typically involves reconciling key totals in the extract against source system reports and tracing a sample of individual records back to source documentation.
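The reconciliation and sample-tracing steps can be sketched as follows. This is an illustration under stated assumptions: amounts arrive as decimal strings, the control total comes from a source system report, and the function names, tolerance, and fixed seed are hypothetical choices rather than a prescribed method.

```python
import random
from decimal import Decimal


def reconcile_total(extract_amounts, control_total, tolerance=Decimal("0.00")) -> dict:
    """Compare the sum of extracted amounts with a control total from the source system."""
    # Decimal avoids the binary floating-point rounding that can create
    # spurious differences when summing currency values.
    extract_total = sum((Decimal(a) for a in extract_amounts), Decimal("0"))
    difference = extract_total - Decimal(control_total)
    return {
        "extract_total": extract_total,
        "difference": difference,
        "reconciles": abs(difference) <= tolerance,
    }


def select_trace_sample(record_ids, sample_size, seed=2024):
    """Pick a reproducible random sample of records to trace to source documents."""
    ids = list(record_ids)
    # A fixed seed lets a reviewer re-create the same sample from the workpapers.
    return random.Random(seed).sample(ids, min(sample_size, len(ids)))


# Hypothetical example: three invoice amounts against a reported total of 400.00.
print(reconcile_total(["100.10", "250.00", "49.90"], "400.00")["reconciles"])
```

Any non-zero tolerance should be set with reference to the engagement's materiality thresholds and documented alongside the result.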

Consistency: Are the same concepts represented consistently across the dataset? Common consistency issues include multiple representations of the same entity (for example, the same vendor coded differently across systems or periods), inconsistent categorisation of similar transaction types, and date field inconsistencies where different systems record dates at different points in the transaction cycle.
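One way to surface the same vendor appearing under different codes is to normalise vendor names and group the codes that share a normalised form. The sketch below assumes vendor records are available as (code, name) pairs; the list of legal suffixes and the function names are illustrative, and a real engagement would likely also compare tax identifiers, bank details, or addresses.

```python
import re
from collections import defaultdict

# Illustrative list only; extend it for the jurisdictions in scope.
LEGAL_SUFFIXES = {"ltd", "limited", "inc", "llc", "plc", "co", "corp"}


def normalise_vendor_name(name: str) -> str:
    """Lower-case the name, strip punctuation, and drop common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def find_duplicate_vendors(vendors) -> dict:
    """Return normalised names that map to more than one vendor code."""
    groups = defaultdict(set)
    for code, name in vendors:
        groups[normalise_vendor_name(name)].add(code)
    return {name: sorted(codes) for name, codes in groups.items() if len(codes) > 1}


# Hypothetical vendor master entries.
vendors = [("V001", "Acme Ltd."), ("V017", "ACME Limited"), ("V020", "Borealis Inc")]
print(find_duplicate_vendors(vendors))
```

Matches produced this way are candidates for review, not confirmed duplicates: distinct legal entities can share a trading name.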

Validity: Do the data values conform to expected formats, ranges, and domain constraints? Validity issues include dates outside plausible ranges, amounts with incorrect sign conventions, codes that do not correspond to valid reference table entries, and records with mandatory fields blank.
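The validity checks listed above translate naturally into record-level rules. The following sketch assumes each record is a dictionary with hypothetical field names (invoice_id, vendor_code, posting_date, amount, doc_type), that posting_date is a datetime.date, and that the sign convention is positive for invoices and negative for credit notes; all of these would need to be confirmed against the actual system.

```python
from datetime import date

# Assumed mandatory fields for an accounts payable extract.
MANDATORY_FIELDS = ("invoice_id", "vendor_code", "posting_date", "amount")


def validate_record(record: dict, valid_vendor_codes: set, earliest: date, latest: date) -> list:
    """Return a list of validity issues found in a single record."""
    issues = []
    for field in MANDATORY_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"blank mandatory field: {field}")

    # Dates must fall inside the period the extract is supposed to cover.
    posting_date = record.get("posting_date")
    if isinstance(posting_date, date) and not (earliest <= posting_date <= latest):
        issues.append("posting_date outside plausible range")

    # Codes must exist in the reference table (here, the vendor master).
    vendor_code = record.get("vendor_code")
    if vendor_code not in (None, "") and vendor_code not in valid_vendor_codes:
        issues.append("vendor_code not in reference table")

    # Assumed sign convention: invoices positive, credit notes negative.
    amount = record.get("amount")
    if amount not in (None, ""):
        if record.get("doc_type") == "invoice" and amount < 0:
            issues.append("invoice with negative amount")
        if record.get("doc_type") == "credit_note" and amount > 0:
            issues.append("credit note with positive amount")
    return issues


# Hypothetical record that breaks three rules.
record = {"invoice_id": "INV-1", "vendor_code": "V999",
          "posting_date": date(1999, 1, 1), "amount": -50, "doc_type": "invoice"}
print(validate_record(record, {"V001"}, date(2023, 1, 1), date(2023, 12, 31)))
```

Returning every issue for a record, rather than stopping at the first, makes it easier to summarise exception counts by rule in the workpapers.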

Timeliness: Does the dataset reflect the state of the population at the appropriate point in time? Data that was current two days ago may have changed significantly in a volatile transaction environment, and analytics designed to test the current state of controls will produce incorrect results if the underlying data reflects a different point in time.
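A simple timeliness check compares the moment the extract was taken with the point in time the test is meant to reflect. In this sketch the function name and the 24-hour default tolerance are assumptions chosen for illustration; the acceptable lag depends on how volatile the underlying transactions are.

```python
from datetime import datetime, timedelta


def check_timeliness(extracted_at: datetime, required_as_of: datetime,
                     max_lag: timedelta = timedelta(hours=24)) -> dict:
    """Measure the gap between the extract time and the required as-of time."""
    # Both datetimes must be either timezone-aware or naive; Python raises
    # TypeError if the two kinds are mixed in a subtraction.
    lag = abs(required_as_of - extracted_at)
    return {"lag": lag, "within_tolerance": lag <= max_lag}


# Hypothetical example: extract taken more than two days before period end.
outcome = check_timeliness(datetime(2024, 3, 29, 18, 0), datetime(2024, 3, 31, 23, 59))
print(outcome["lag"], outcome["within_tolerance"])
```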

Data Validation as a Mandatory Audit Step

Data validation should be a documented audit step that precedes any substantive analytical procedure — not an optional preliminary that delays the real work. Running completeness and accuracy tests, documenting the results, and either resolving material data quality issues before proceeding or qualifying the analytical conclusions to reflect identified limitations is a professional obligation, not a bureaucratic formality.

Many audit functions skip data validation because it feels like a technical preliminary. This is a mistake. Data quality testing is audit work. It provides assurance that the conclusions drawn from subsequent analytics are valid, and it surfaces data integrity issues that are themselves reportable findings.

An audit that uses unvalidated data is like a physical inspection conducted through a dirty window. You may see something meaningful — or you may be looking at an artefact of the glass. You cannot tell which without cleaning the window first.

When Data Quality Issues Are Findings

Data quality problems identified during audit validation are frequently reportable in their own right. An accounts payable dataset with 3% of records showing blank vendor codes is not merely an inconvenience — it indicates a gap in the vendor master validation controls that should be preventing incomplete records from being saved. A general ledger extract that does not reconcile to the published trial balance within defined materiality thresholds suggests a data integrity issue warranting escalation. A payroll system extract where employee IDs do not consistently match HR system records indicates a reconciliation gap between systems that could mask unauthorised payroll changes.
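Two of the examples above, the blank vendor code rate and the mismatch between payroll and HR employee IDs, can be quantified with a few lines of Python. The sketch below assumes records are dictionaries and IDs are comparable values; the field and function names are hypothetical.

```python
def blank_rate(records: list, field: str) -> float:
    """Return the proportion of records where the given field is blank or missing."""
    if not records:
        return 0.0
    # Treat None, empty strings, and whitespace-only strings as blank.
    blanks = sum(1 for r in records if not str(r.get(field) or "").strip())
    return blanks / len(records)


def unmatched_employee_ids(payroll_ids, hr_ids) -> dict:
    """List employee IDs that appear in one system but not the other."""
    payroll, hr = set(payroll_ids), set(hr_ids)
    return {
        "in_payroll_not_hr": sorted(payroll - hr),
        "in_hr_not_payroll": sorted(hr - payroll),
    }


# Hypothetical data: 100 payables records, three with no vendor code.
records = [{"vendor_code": "V001"}] * 97 + [{"vendor_code": ""}] * 3
print(f"{blank_rate(records, 'vendor_code'):.1%}")
print(unmatched_employee_ids(["E1", "E2", "E9"], ["E1", "E2", "E3"]))
```

IDs paid through payroll with no HR record are usually the higher-risk direction and merit individual follow-up, while the blank rate gives a measure that can be tracked across audits.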

Data quality findings in high-risk transaction systems, particularly those supporting financial reporting or processing high-value payments, can be significant control findings in their own right, independent of what substantive analytics subsequently reveal. Reporting these findings clearly, with an explanation of their control significance and not just their analytical inconvenience, ensures that data governance receives appropriate management attention.

Improving Data Quality Over Time

Internal audit can play a constructive role in improving organisational data quality over time by consistently reporting data quality issues, tracking their remediation, and providing management with visibility into the aggregate data quality posture across key systems. Organisations that understand their data quality gaps — and that have governance accountability for addressing them — progressively build the data infrastructure that makes sophisticated analytics reliable and sustainable. This is a long-term contribution that adds value well beyond the individual audit in which the issue was first identified.
