Audit Rating Systems: How to Choose and Apply Them Consistently

Inconsistent audit ratings undermine the credibility of the entire internal audit function. When the same control weakness receives a "high" rating in one audit and a "medium" rating in another — based on the individual auditor's judgement rather than a defined standard — stakeholders lose confidence in the function's ability to provide reliable, comparable assurance. Rating consistency is not a bureaucratic requirement. It is a prerequisite for credibility.

Why Ratings Matter

Audit ratings communicate significance. They tell auditees, management, and governance bodies how serious a finding is, how urgently it needs to be addressed, and how it compares to other findings across the function's portfolio. When ratings are calibrated and consistently applied, the aggregate pattern of ratings provides meaningful governance intelligence — showing which business units or processes have the highest concentration of significant issues and guiding the prioritisation of management attention and audit resources.

When ratings are inconsistent, none of this is possible. Aggregate reporting becomes meaningless. Trend analysis is unreliable. And auditees quickly learn that the severity of a finding reflects the personality of the auditor as much as the nature of the control weakness — which invites negotiation and pressure that audit teams should not have to manage.

Common Rating Systems

Internal audit functions use several different rating approaches, each with distinct advantages and disadvantages.

Satisfactory/Needs Improvement/Unsatisfactory: Simple three-level systems are intuitive and easy to communicate to non-audit audiences. The challenge is that "Needs Improvement" is a very wide band that can cover both minor administrative issues and significant control failures. Without subcategories or additional guidance, application is highly subjective.

High/Medium/Low: Perhaps the most commonly used system. H/M/L is itself a three-level scale, so it offers no finer gradation than Satisfactory/Needs Improvement/Unsatisfactory, and its usefulness depends on how the levels are defined. The critical design requirement is a clear definition of each level in terms of business impact, probability, and control significance, not vague descriptions that leave application to individual judgement.

Numerical scales (1-5 or 1-10): Numerical scales create the appearance of precision but can actually increase inconsistency, because they require finer-grained judgements and make it harder to anchor specific numbers to specific control characteristics. They are most useful when paired with detailed guidance on exactly which combination of factors produces each score.

Red/Amber/Green (RAG): RAG systems are visually intuitive and work well in dashboards and management reporting. The same design requirements apply — without precise definitions, the middle category (Amber) tends to absorb most findings, reducing the system's discriminatory power.
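
One way to check whether Amber is absorbing most findings is to look at the share of findings in each category. The following is a minimal Python sketch, not a prescribed method: the function names and the 60% threshold are assumptions chosen for illustration, and a function would set its own threshold.

```python
from collections import Counter

CATEGORIES = ("Red", "Amber", "Green")


def rag_distribution(ratings):
    """Return the share of findings in each RAG category."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    total = len(ratings)
    return {category: counts[category] / total for category in CATEGORIES}


def amber_dominates(ratings, threshold=0.6):
    """Flag a portfolio where Amber exceeds the (hypothetical) threshold."""
    return rag_distribution(ratings)["Amber"] > threshold


# Illustrative data only: ten findings, seven of them rated Amber.
sample = ["Amber"] * 7 + ["Red"] + ["Green"] * 2
shares = rag_distribution(sample)
flagged = amber_dominates(sample)
```

A high Amber share does not prove the criteria are weak, but it is a prompt to review whether the Amber definition is too broad.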

What Makes a Rating System Work

The difference between a rating system that produces consistent results and one that does not lies almost entirely in the quality of the rating criteria. Effective criteria define the threshold for each rating level in terms of:

The potential impact of the control weakness if exploited — financial, reputational, regulatory, or operational
The likelihood that the weakness will result in a loss or compliance failure without remediation
The pervasiveness of the issue — whether it affects a single transaction, a process, or a systemic control across the organisation
Management's prior awareness of the issue, including whether it is a newly identified risk or a recurrence of a previously reported finding

A finding is not high-rated because it makes the auditor uncomfortable. It is high-rated because it meets defined, documented criteria that the entire function has agreed to apply consistently.
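
Documented criteria of this kind can be written down as an explicit rule, so that the same inputs always produce the same rating. The sketch below is one possible encoding in Python, not a recommended calibration: the 1-3 scales, the weights, and the thresholds are all hypothetical, and the names `Finding` and `rate` are introduced here for illustration.

```python
from dataclasses import dataclass


# All scales and thresholds below are hypothetical, for illustration only.
@dataclass(frozen=True)
class Finding:
    impact: int         # 1 = minor, 2 = moderate, 3 = severe
    likelihood: int     # 1 = unlikely, 2 = possible, 3 = likely
    pervasiveness: int  # 1 = single transaction, 2 = process, 3 = systemic
    recurrence: bool    # True if this repeats a previously reported finding


def rate(finding: Finding) -> str:
    """Return "High", "Medium", or "Low" from the documented criteria."""
    for name in ("impact", "likelihood", "pervasiveness"):
        value = getattr(finding, name)
        if value not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2, or 3, got {value!r}")

    score = finding.impact * finding.likelihood  # ranges from 1 to 9
    if finding.pervasiveness == 3:
        score += 2  # systemic issues weigh more
    if finding.recurrence:
        score += 1  # repeat findings weigh more

    if score >= 6:
        return "High"
    if score >= 3:
        return "Medium"
    return "Low"


# A moderate, possible, process-level issue reported for the first time.
example_rating = rate(Finding(impact=2, likelihood=2, pervasiveness=2, recurrence=False))
```

The point of writing the rule down is less the particular numbers than the fact that two auditors who agree on the four inputs cannot disagree on the rating.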

Quality Control for Rating Consistency

Even with well-designed criteria, rating consistency requires active quality control. Working paper review processes should specifically check that finding ratings are supported by the criteria. Periodic calibration exercises — where the team rates the same set of hypothetical findings independently and then discusses differences — identify and correct divergent interpretations. And trend analysis of ratings by auditor can surface patterns of systematic over- or under-rating that require coaching and correction.
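
The results of a calibration exercise can be summarised with a few lines of code. The Python sketch below assumes an H/M/L scale and that every auditor rates every hypothetical finding. It lists the findings on which auditors disagreed and, for each auditor, the average signed distance from the group median (positive suggests over-rating, negative under-rating). The function name, the data layout, and the use of the median as the reference point are assumptions made for illustration.

```python
from statistics import mean, median

LEVELS = {"Low": 1, "Medium": 2, "High": 3}


def calibration_report(ratings):
    """ratings maps auditor -> {finding_id: level}.

    Assumes every auditor has rated every finding.
    Returns (disputed_finding_ids, bias_per_auditor).
    """
    finding_ids = sorted(next(iter(ratings.values())))
    disputed = []
    deviations = {auditor: [] for auditor in ratings}

    for finding_id in finding_ids:
        scores = {a: LEVELS[r[finding_id]] for a, r in ratings.items()}
        reference = median(scores.values())
        if len(set(scores.values())) > 1:
            disputed.append(finding_id)  # auditors did not all agree
        for auditor, score in scores.items():
            deviations[auditor].append(score - reference)

    bias = {auditor: mean(values) for auditor, values in deviations.items()}
    return disputed, bias


# Illustrative data only: three auditors, three hypothetical findings.
exercise = {
    "Auditor A": {"F1": "High", "F2": "Medium", "F3": "Low"},
    "Auditor B": {"F1": "High", "F2": "High", "F3": "Medium"},
    "Auditor C": {"F1": "High", "F2": "Medium", "F3": "Low"},
}
disputed, bias = calibration_report(exercise)
```

In this made-up data, F2 and F3 are the findings to discuss, and Auditor B rates higher than the group on average. A persistent bias is a starting point for the discussion, not a verdict on the auditor.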
