Audit Sampling in Practice: What Every Internal Auditor Needs to Know

Kamran Iqbal, CIA, CISA, CFE, CRMA | April 2026 | 9 min read
Sampling is the mechanism by which internal auditors draw conclusions about large populations from examination of a subset of items. It is one of the most frequently applied audit techniques — and one of the most frequently misapplied. Auditors who do not understand the statistical logic of sampling cannot properly size their samples, interpret their results, or communicate the limitations of their conclusions. This article covers the essential concepts every internal auditor needs to understand.

Why Sampling Exists — And When It Does Not Apply

Sampling is a response to practical constraints. When a population is too large to examine in its entirety within the available time and resources, sampling allows the auditor to examine a representative subset and draw conclusions about the whole. This is the foundational logic: if the sample is drawn correctly, conclusions about the sample can be extended to the population with a defined level of confidence.

It is important to establish when sampling does not apply. When entire-population testing is practical — particularly with data analytics tools that can process thousands or millions of transactions — sampling may be unnecessary and inferior. A duplicate payment test run across all 50,000 payment transactions in a year is more powerful than a sample of 60. Sampling is appropriate when population testing is genuinely impractical, not as a default methodology when better options exist.
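To illustrate what full-population testing can look like, the following Python sketch checks every payment record for potential duplicates rather than sampling. The field names (payment_id, vendor_id, invoice_no, amount) and the matching rule are assumptions made for illustration; a real test would use whatever fields and tolerances suit the payment system in question.

```python
from collections import defaultdict

def find_duplicate_payments(payments):
    """Return groups of payment IDs that share vendor, invoice number and amount.

    The matching rule is a hypothetical one: invoice numbers are compared
    after trimming whitespace and ignoring case.
    """
    groups = defaultdict(list)
    for p in payments:
        key = (p["vendor_id"], p["invoice_no"].strip().upper(), round(p["amount"], 2))
        groups[key].append(p["payment_id"])
    # Keep only keys that were paid more than once.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

# Illustrative data only.
payments = [
    {"payment_id": 1, "vendor_id": "V001", "invoice_no": "INV-100", "amount": 1250.00},
    {"payment_id": 2, "vendor_id": "V002", "invoice_no": "INV-877", "amount": 310.40},
    {"payment_id": 3, "vendor_id": "V001", "invoice_no": "inv-100 ", "amount": 1250.00},
]
duplicates = find_duplicate_payments(payments)
print(duplicates)
```

Because every record is examined, the result carries no sampling risk; any flagged group still needs follow-up to confirm whether it is a genuine duplicate.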

Statistical vs. Non-Statistical Sampling

The most fundamental distinction in audit sampling is between statistical and non-statistical sampling. Statistical sampling uses probability theory to select items and measure results, allowing the auditor to quantify the sampling risk associated with their conclusions. Non-statistical sampling uses auditor judgement to select items and interpret results, without the mathematical underpinning that would allow sampling risk to be quantified.

Both approaches are permitted under professional standards, but they have different implications. Statistical sampling produces conclusions with explicitly stated confidence levels — "I am 95% confident that the error rate in this population does not exceed 3%." Non-statistical sampling produces conclusions that reflect the auditor's judgement but cannot be expressed with statistical precision. The choice between them should reflect the risk level of the area being tested and the need for defensible, precisely stated conclusions.

Key Statistical Concepts

Confidence level is the degree of assurance that the range estimated from the sample contains the true population characteristic. A 95% confidence level, the most common in audit work, means that if the same sampling procedure were repeated many times on the same population, about 95% of the resulting estimated ranges would contain the true population value. It describes the reliability of the procedure over repeated use, not a guarantee about any single sample. A higher confidence level requires a larger sample size.

Tolerable error (called the tolerable deviation rate in tests of controls and tolerable misstatement in substantive tests, where it is set with reference to materiality) is the maximum error rate or monetary error the auditor is willing to accept in the population while still concluding that controls are effective or the account balance is not materially misstated. A lower tolerable error requires a larger sample size, because the auditor needs more evidence to support a tighter conclusion.

Expected error rate is the auditor's estimate of the error rate likely to exist in the population before testing. When the expected error rate is higher, sample sizes must increase to maintain the same confidence level, because the auditor needs enough sample items to reliably distinguish between the expected error rate and the tolerable error rate.

Sampling risk is the risk that the sample conclusion differs from the conclusion that would be reached by examining the entire population. In tests of controls it has two components: the risk of assessing control risk as too low (over-reliance) and the risk of assessing control risk as too high (under-reliance). The counterparts in substantive testing are the risk of incorrect acceptance and the risk of incorrect rejection. Statistical sampling allows sampling risk to be quantified and controlled; non-statistical sampling does not.

Attribute Sampling for Control Testing

Attribute sampling is used to test the operating effectiveness of controls — to determine whether a control operated as designed throughout the period. Each item in the population is assessed as a deviation (the control did not operate) or a non-deviation (the control operated correctly). The sample result is the deviation rate — the proportion of items where the control did not operate as intended.

Sample sizes for attribute sampling are determined by the confidence level required, the tolerable deviation rate, and the expected deviation rate. Standard attribute sampling tables or formulas produce specific sample sizes for each combination of these factors. A confidence level of 95%, a tolerable deviation rate of 5%, and an expected deviation rate of 1% together produce a required sample size of approximately 93 items.
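The relationship between these three factors and the sample size can be sketched in Python. The approach below is one common way to derive such figures, assuming a binomial model: it finds the smallest sample size for which, if the true deviation rate were equal to the tolerable rate, the chance of seeing no more than the expected number of deviations is at most 1 minus the confidence level. The function names are hypothetical, and published tables may differ slightly because of rounding or a different underlying distribution.

```python
from math import ceil, comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def attribute_sample_size(confidence, tolerable_rate, expected_rate, max_n=5000):
    """Smallest n (and allowed deviations) meeting the stated confidence.

    Assumption: the number of allowed deviations is the expected rate
    applied to the sample size, rounded up.
    """
    risk = 1 - confidence
    for n in range(1, max_n + 1):
        allowed = ceil(round(expected_rate * n, 9))  # round() guards against float noise
        if binomial_cdf(allowed, n, tolerable_rate) <= risk:
            return n, allowed
    raise ValueError("no sample size found below max_n; check the inputs")

size, allowed = attribute_sample_size(0.95, 0.05, 0.01)
print(size, allowed)  # 93 items, 1 allowed deviation

size_zero, allowed_zero = attribute_sample_size(0.95, 0.05, 0.0)
print(size_zero, allowed_zero)  # 59 items, 0 allowed deviations
```

Under these assumptions the sketch reproduces the figure of 93 quoted above, and shows how the size falls to 59 when no deviations are expected.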

When a sample produces deviations, the auditor must evaluate whether the sample deviation rate is sufficiently below the tolerable deviation rate to support reliance on the control, or whether the deviation rate indicates that the control is not operating effectively and cannot be relied upon. This evaluation should consider not just the rate but the nature of the deviations — a systematic pattern of deviations is more concerning than randomly distributed ones of the same frequency.
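The quantitative part of this evaluation can be sketched as the calculation of an upper deviation limit: the highest population deviation rate that remains plausible, at the chosen confidence level, given the deviations found. The Python sketch below assumes a binomial model and solves for the limit by bisection; the function names are hypothetical, and it does not replace the qualitative review of the nature of each deviation.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def upper_deviation_limit(sample_size, deviations, confidence=0.95):
    """One-sided upper limit on the population deviation rate.

    Finds the rate p at which observing `deviations` or fewer in the
    sample would have probability 1 - confidence.
    """
    risk = 1 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection; 60 halvings is ample precision
        mid = (lo + hi) / 2
        if binomial_cdf(deviations, sample_size, mid) > risk:
            lo = mid  # result still too likely at this rate: limit is higher
        else:
            hi = mid
    return hi

tolerable_rate = 0.05
for found in (1, 2):
    limit = upper_deviation_limit(93, found)
    verdict = "supports reliance" if limit <= tolerable_rate else "does not support reliance"
    print(f"{found} deviation(s) in 93: upper limit {limit:.2%}, {verdict}")
```

With a sample of 93, one deviation leaves the upper limit just under the 5% tolerable rate, while a second deviation pushes it above 5%, so the same control would no longer be supported at 95% confidence.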

Monetary Unit Sampling for Substantive Testing

Monetary unit sampling (MUS) — also known as probability-proportional-to-size sampling — is designed for substantive tests of account balances and transaction streams, where the auditor is testing for monetary error rather than deviation rates. In MUS, each monetary unit in the population has an equal probability of selection, meaning that larger-value items have a proportionally higher probability of inclusion in the sample. This property makes MUS particularly effective for testing high-value balances where large individual errors are most significant.

MUS sample sizes are determined by the confidence level, the tolerable monetary error, and the expected monetary error. The technique automatically produces a sample that emphasises higher-value items without the need for separate stratification of the population.
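The mechanics can be sketched in Python under simplifying assumptions. The sample size function below uses a Poisson-based confidence factor and assumes zero expected monetary error, so it understates the size needed when errors are expected. The selection function applies systematic selection over cumulative monetary units from a random start, which is one common way to implement MUS. All names and figures are illustrative.

```python
import random
from math import ceil, log

def mus_sample_size(book_value, tolerable_misstatement, confidence=0.95):
    """Sample size assuming ZERO expected misstatement (a simplification).

    The confidence factor -ln(1 - confidence) is about 3.0 at 95%.
    """
    factor = -log(1 - confidence)
    return ceil(book_value * factor / tolerable_misstatement)

def mus_select(items, sample_size, seed=None):
    """Systematic monetary unit selection.

    items: list of (item_id, amount) with positive amounts.
    Returns the selected item IDs and the sampling interval.
    """
    total = sum(amount for _, amount in items)
    interval = total / sample_size
    rng = random.Random(seed)
    next_hit = rng.random() * interval  # random start within the first interval
    selected, cumulative = [], 0.0
    for item_id, amount in items:
        cumulative += amount
        hit = False
        while next_hit < cumulative:  # one or more selected units fall in this item
            hit = True
            next_hit += interval
        if hit:
            selected.append(item_id)
    return selected, interval

n = mus_sample_size(book_value=1_000_000, tolerable_misstatement=50_000)
print(n)  # 60

# Illustrative ledger: total 100,000, so 10 selections give an interval of 10,000.
ledger = [("INV-A", 50_000.0), ("INV-B", 1_200.0), ("INV-C", 300.0),
          ("INV-D", 8_000.0), ("INV-E", 40_500.0)]
selected, interval = mus_select(ledger, sample_size=10, seed=7)
print(selected, interval)
```

Any item at least as large as the sampling interval is certain to be selected, which is how the technique emphasises higher-value items without separate stratification. An item hit more than once is still examined only once.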

The most common sampling mistake in internal audit is selecting items judgementally and then reporting conclusions as if they were statistically valid. If a sample was not drawn using probability selection methods with defined confidence levels, the conclusions must be expressed as the auditor's judgement rather than as statistical conclusions. Overstating the precision of non-statistical sampling is a professional standards issue.

Selecting the Sample

Once the sample size is determined, the selection method must ensure that every item in the population has a defined, non-zero probability of selection; this is the prerequisite for statistical validity. Random number selection (using random number generators applied to a sequentially numbered population), systematic selection (every nth item, counted from a randomly chosen starting point), and stratified random selection (random selection within defined subgroups) all meet this requirement. Haphazard selection, where the auditor selects items that seem representative, does not meet the statistical requirement and should not be described as random sampling.
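The three probability-based selection methods can be sketched in Python using only the standard library. The function names, the population of numbered items, and the stratum rule are assumptions for illustration. Recording the seed lets a reviewer reproduce the selection exactly.

```python
import random

def simple_random(population, n, seed=None):
    """Random selection without replacement from a numbered population."""
    return random.Random(seed).sample(population, n)

def systematic(population, n, seed=None):
    """Every nth item from a random start.

    A fractional interval is used so that every item has a non-zero
    chance of selection even when the population size is not a
    multiple of n.
    """
    interval = len(population) / n
    start = random.Random(seed).random() * interval
    last = len(population) - 1
    return [population[min(int(start + i * interval), last)] for i in range(n)]

def stratified(population, stratum_of, n_per_stratum, seed=None):
    """Random selection within each subgroup defined by stratum_of(item)."""
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    return {name: rng.sample(members, min(n_per_stratum, len(members)))
            for name, members in strata.items()}

# Illustrative population: 1,000 sequentially numbered items.
population = list(range(1, 1001))
random_sample = simple_random(population, 60, seed=2026)
systematic_sample = systematic(population, 93, seed=2026)
stratified_sample = stratified(
    population,
    stratum_of=lambda item: "high" if item > 900 else "low",  # hypothetical rule
    n_per_stratum=5,
    seed=2026,
)
print(len(random_sample), len(systematic_sample), sorted(stratified_sample))
```

Systematic selection depends on the population order being unrelated to the attribute under test; if the list has a repeating pattern that matches the interval, random or stratified selection is the safer choice.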

Further Reading

For a comprehensive and practical treatment of audit sampling methodology including worked examples, sample size tables, and guidance on evaluating results, the book Internal Audit Sampling in Practice is available from the CTC Global Gumroad store. It covers both attribute and MUS techniques in depth, with practical exercises designed for working audit professionals.
