Evaluation: Testing misstatement

Koen Derks

last modified: 19-08-2021

Hypothesis testing

In an audit sampling test, the auditor generally assigns a performance materiality, \(\theta_{max}\), to the population. It expresses the maximum tolerable misstatement (as a fraction or a monetary amount). The auditor then inspects a sample of the population to decide between the following two hypotheses:

\[H_1:\theta<\theta_{max} \qquad H_0:\theta\geq\theta_{max}.\]

The evaluation() function allows you to make a statement about the credibility of these two hypotheses after inspecting a sample. To do so, you must specify the materiality argument in the function call.

Classical hypothesis testing using the p-value

Classical hypothesis testing uses the p-value to decide whether or not to reject the hypothesis \(H_0\). As an example, consider an auditor who wants to verify whether the population contains less than 5 percent misstatement, implying the hypotheses \(H_1:\theta<0.05\) and \(H_0:\theta\geq0.05\). They have taken a sample of 100 items, of which 1 contained an error. They set the significance level to 0.05, meaning that a p-value below 0.05 is enough to reject the hypothesis \(H_0\).

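library(jfa)
# Classical evaluation: 1 error in a sample of 100, tested against 5 percent materiality
result_classical <- evaluation(materiality = 0.05, x = 1, n = 100)
summary(result_classical)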
## 
##  Classical Audit Sample Evaluation Summary
## 
## Options:
##   Confidence level:               0.95 
##   Materiality:                    0.05 
##   Materiality:                    0.05 
##   Hypotheses:                     H0: T >= 0.05 vs. H1: T < 0.05 
##   Method:                         poisson 
## 
## Data:
##   Sample size:                    100 
##   Number of errors:               1 
##   Sum of taints:                  1 
## 
## Results:
##   Most likely error:              0.01 
##   95 percent confidence interval: [0, 0.047439] 
##   Precision:                      0.037439 
##   p-value:                        0.040428

As we can see, the p-value (0.040428) is lower than the significance level of 0.05, so the hypothesis \(H_0\) is rejected.
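To make the origin of these numbers more concrete, the following base R sketch recomputes the p-value and the upper confidence bound by hand. It assumes that the "poisson" method models the number of errors as Poisson distributed with rate \(n\theta\); this is an illustration under that assumption and not necessarily how evaluation() is implemented internally.

```r
# Base R sketch, no jfa required.
# Assumption: the number of errors follows a Poisson(n * theta) distribution.
materiality <- 0.05
n <- 100
x <- 1

# One-sided p-value: probability of observing x or fewer errors when the
# misstatement sits exactly on the boundary of H0 (theta = materiality).
p_value <- ppois(x, lambda = n * materiality)

# 95 percent upper confidence bound on theta, using the relationship between
# the Poisson distribution and the gamma distribution.
upper_bound <- qgamma(0.95, shape = x + 1, rate = n)

print(round(p_value, 6))
print(round(upper_bound, 6))
```

Under this assumption, both values agree with the p-value and the upper limit of the confidence interval in the summary output above.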

Bayesian hypothesis testing using the Bayes factor

Bayesian hypothesis testing uses the Bayes factor, \(BF_{10}\) or \(BF_{01}\), to make a statement about the evidence provided by the sample in support of one of the two hypotheses \(H_1\) or \(H_0\). The subscript of the Bayes factor denotes which hypothesis it favors: \(BF_{10}\) quantifies evidence for \(H_1\) relative to \(H_0\), and \(BF_{01}\) the reverse. By default, the evaluation() function returns the value for \(BF_{10}\).

As an example of how to interpret the Bayes factor, the value of \(BF_{10} = 10\) (provided by the evaluation() function) can be interpreted as: the data are 10 times more likely to have occurred under the hypothesis \(H_1:\theta<\theta_{max}\) than under the hypothesis \(H_0:\theta\geq\theta_{max}\). \(BF_{10} > 1\) indicates evidence for \(H_1\), while \(BF_{10} < 1\) indicates evidence for \(H_0\).

\(BF_{10}\) Strength of evidence
\(< 0.01\) Extreme evidence for \(H_0\)
\(0.01 - 0.033\) Very strong evidence for \(H_0\)
\(0.033 - 0.10\) Strong evidence for \(H_0\)
\(0.10 - 0.33\) Moderate evidence for \(H_0\)
\(0.33 - 1\) Anecdotal evidence for \(H_0\)
\(1\) No evidence for \(H_1\) or \(H_0\)
\(1 - 3\) Anecdotal evidence for \(H_1\)
\(3 - 10\) Moderate evidence for \(H_1\)
\(10 - 30\) Strong evidence for \(H_1\)
\(30 - 100\) Very strong evidence for \(H_1\)
\(> 100\) Extreme evidence for \(H_1\)
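The table above can be turned into a small lookup. The sketch below uses a hypothetical helper, bf_evidence(), which is not part of jfa. Because the table does not say which category a boundary value such as 3 or 10 belongs to, the helper assumes that a boundary value falls into the category above it.

```r
# Hypothetical helper (not part of jfa): map BF10 to the labels in the table.
# Assumption: a value exactly on a boundary belongs to the higher category.
bf_evidence <- function(bf10) {
  breaks <- c(0.01, 0.033, 0.10, 0.33, 1, 3, 10, 30, 100)
  labels <- c(
    "Extreme evidence for H0",
    "Very strong evidence for H0",
    "Strong evidence for H0",
    "Moderate evidence for H0",
    "Anecdotal evidence for H0",
    "Anecdotal evidence for H1",
    "Moderate evidence for H1",
    "Strong evidence for H1",
    "Very strong evidence for H1",
    "Extreme evidence for H1"
  )
  # findInterval() counts how many breaks are <= bf10, giving an index 0..9.
  ifelse(bf10 == 1,
         "No evidence for H1 or H0",
         labels[findInterval(bf10, breaks) + 1])
}

print(bf_evidence(c(0.2, 1, 10, 250)))
```

The helper is vectorized, so it can label several Bayes factors at once.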

Example

Again, consider the same example of an auditor who wants to verify whether the population contains less than 5 percent misstatement, implying the hypotheses \(H_1:\theta<0.05\) and \(H_0:\theta\geq0.05\). They have taken a sample of 100 items, of which 1 contained an error. The prior distribution is assumed to be a default beta(1,1) prior.

The output below shows that \(BF_{10}=515.86\), implying that there is extreme evidence for \(H_1\), the hypothesis that the population misstatement is lower than 5 percent.

prior <- auditPrior(materiality = 0.05, method = "default", likelihood = "binomial")
result_bayesian <- evaluation(materiality = 0.05, x = 1, n = 100, prior = prior)
summary(result_bayesian)
## 
##  Bayesian Audit Sample Evaluation Summary
## 
## Options:
##   Confidence level:               0.95 
##   Materiality:                    0.05 
##   Materiality:                    0.05 
##   Hypotheses:                     H0: T > 0.05 vs. H1: T < 0.05 
##   Method:                         binomial 
##   Prior distribution:             beta(α = 1, β = 1) 
## 
## Data:
##   Sample size:                    100 
##   Number of errors:               1 
##   Sum of taints:                  1 
## 
## Results:
##   Posterior distribution:         beta(α = 2, β = 100) 
##   Most likely error:              0.01 
##   95 percent credible interval:   [0, 0.046107] 
##   Precision:                      0.036107 
##   BF10:                            515.86
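For one-sided hypotheses like these, the Bayes factor can be written as the ratio of the posterior odds to the prior odds of \(H_1\). The base R sketch below applies that identity to the beta prior and posterior shown in the output. The function name bf10_beta() is hypothetical and not part of jfa, and the package's own computation may be organized differently.

```r
# Hypothetical helper (not part of jfa): BF10 for H1: theta < materiality
# under a beta(a, b) prior and a binomial likelihood.
bf10_beta <- function(materiality, x, n, a = 1, b = 1) {
  # Probability of H1 before and after seeing the sample.
  prior_h1 <- pbeta(materiality, a, b)
  post_h1  <- pbeta(materiality, a + x, b + n - x)

  # Bayes factor = posterior odds / prior odds.
  prior_odds <- prior_h1 / (1 - prior_h1)
  post_odds  <- post_h1 / (1 - post_h1)
  post_odds / prior_odds
}

# Default beta(1, 1) prior, 1 error in 100 items, 5 percent materiality.
bf10 <- bf10_beta(materiality = 0.05, x = 1, n = 100)
print(round(bf10, 2))
```

With the beta(1, 1) prior, the prior odds of \(H_1\) are only 0.05 / 0.95, which is why a posterior probability of roughly 0.96 translates into such a large Bayes factor.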

Sensitivity to the prior distribution

In audit sampling, the Bayes factor depends on the prior distribution for \(\theta\). As a rule of thumb, when the prior distribution is very uninformative with respect to \(\theta\) (as with method = "default"), the Bayes factor tends to overstate the evidence in favor of \(H_1\). You can mitigate this dependency using method = "impartial" in the auditPrior() function, which constructs a prior distribution that is impartial with respect to the hypotheses \(H_1\) and \(H_0\).

The output below shows that \(BF_{10}=47\), implying that there is very strong evidence for \(H_1\), the hypothesis that the population misstatement is lower than 5 percent. Since both priors result in convincing Bayes factors in favor of \(H_1\), the conclusion is robust to the choice of prior distribution.

prior <- auditPrior(materiality = 0.05, method = "impartial", likelihood = "binomial")
result_bayesian <- evaluation(materiality = 0.05, x = 1, n = 100, prior = prior)
summary(result_bayesian)
## 
##  Bayesian Audit Sample Evaluation Summary
## 
## Options:
##   Confidence level:               0.95 
##   Materiality:                    0.05 
##   Materiality:                    0.05 
##   Hypotheses:                     H0: T > 0.05 vs. H1: T < 0.05 
##   Method:                         binomial 
##   Prior distribution:             beta(α = 1, β = 13.513) 
## 
## Data:
##   Sample size:                    100 
##   Number of errors:               1 
##   Sum of taints:                  1 
## 
## Results:
##   Posterior distribution:         beta(α = 2, β = 112.513) 
##   Most likely error:              0.0088878 
##   95 percent credible interval:   [0, 0.041108] 
##   Precision:                      0.03222 
##   BF10:                            47.435
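The prior parameters in this output are consistent with a beta(1, β) distribution whose median equals the materiality, so that \(H_1\) and \(H_0\) are equally likely before seeing the sample. The base R sketch below works under that assumption; it is an illustration rather than a description of how auditPrior() is implemented.

```r
# Base R sketch, no jfa required.
# Assumption: the impartial prior is a beta(1, b) with median at materiality.
materiality <- 0.05
x <- 1
n <- 100

# For a beta(1, b) distribution, P(theta < m) = 1 - (1 - m)^b.
# Setting this equal to 0.5 and solving for b gives:
b <- log(0.5) / log(1 - materiality)

# Prior and posterior probability of H1: theta < materiality.
prior_h1 <- pbeta(materiality, 1, b)
post_h1  <- pbeta(materiality, 1 + x, b + n - x)

# The prior odds are 1, so BF10 equals the posterior odds.
bf10 <- (post_h1 / (1 - post_h1)) / (prior_h1 / (1 - prior_h1))

print(round(b, 3))
print(round(prior_h1, 3))
print(round(bf10, 2))
```

Because the prior odds equal 1 here, the Bayes factor reflects only how far the sample moves the posterior toward \(H_1\), which explains why it is smaller than under the default prior.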

References