Part 5: Accuracy Measurement and Reporting

Social Security Benefits: Accuracy of Benefit Administration.

Defining Accuracy

An observer who had the luxury of perfect information might define perfect accuracy in this context as the condition where everyone in New Zealand who has a legal entitlement to a benefit receives that exact entitlement on the date when it is legally due. However, such perfect accuracy cannot be achieved.

The Ministry has designed its accuracy measure to reflect the legal position that it cannot grant a benefit until application is made for it (see paragraph 2.9 on page 29).

In this part, we look at how the Ministry measures and reports accuracy through its Accuracy Reporting Programme.

The Ministry’s Performance Measure for Accuracy

The performance measure in the Ministry’s Statement of Intent for accuracy of benefits granted to working age beneficiaries is that the percentage of entitlement assessments completed accurately will be no less than 88-90% (92% for people receiving New Zealand Superannuation). This is the only performance measure of accuracy that is reported to Parliament.

The Ministry’s prime focus is thus on accuracy in relation to eligible people who have applied for a benefit. The Ministry’s Accuracy Reporting Programme (ARP) does not address eligible people who have either:

  • made no contact with the Ministry; or
  • contacted the Ministry but not applied for a benefit.

The Ministry’s approach to accuracy is consistent with its legal obligation (see paragraphs 2.9-2.10 on page 29). The Purchase Agreement – in which the Ministry’s performance on accuracy is limited to matters after the application has been received – specifically refers to the accuracy of the processing of applications and reviews14 of benefit entitlement.

The Accuracy Reporting Programme – What It Measures and What It Doesn’t

The ARP measures and reports the accuracy performance measure against an annual target stated in the Output Agreement. The Ministry samples about 250 applications and 250 reviews a year from each of its 13 regions, and checks that these have been processed correctly. Currently, the target is that no less than 88-90% of cases sampled for working age beneficiaries will be completed accurately (92% for people receiving New Zealand Superannuation).

The target has varied over time. In 1996-97, the Department of Social Welfare’s performance target for its Income Support Service was 95% accuracy for applications and 95% for reviews. In that year, the Income Support Service did not meet the target – actual performance was 82% for applications and only 74% for reviews.15 An accuracy target of 80% was set in 1998-99 and has since increased incrementally to the present level as performance has improved.

The Ministry’s reported accuracy performance for 2002-03 was 89.2% (2001-02 – 91.1%) for working-age beneficiaries, and 93.1% (2001-02 – 96.6%) for people receiving New Zealand Superannuation. The Ministry considers that the accuracy target (88-90% for working-age beneficiaries and 92% for superannuitants) is at a level that successive governments have been comfortable with. It has therefore not recently sought to make any fundamental changes to accuracy targets.

Scope of the Accuracy Reporting Programme

Since the scope of the ARP covers only the Ministry’s performance in assessing benefit applications, it does not and cannot estimate the proportion of eligible beneficiaries who have not “applied”. In the Ministry’s terminology, this is a matter of “uptake” rather than “accuracy”.

At an operational level, an application results in the creation of a computer record relating to that person and that transaction. It is from these computer records that the Ministry draws its sample to measure and report on the accuracy of benefit payments.

Consequently if, for example, someone –

  • does not know about their possible eligibility (and so does not contact the Ministry); or
  • knows about their possible eligibility but makes a deliberate choice not to contact the Ministry; or
  • knows about their possible eligibility and contacts the Ministry but receives incorrect advice that causes them not to make an application –

they will not be included in the sample.

People Who Are Eligible But Not Aware of It

An example of a person who is eligible for a benefit but not aware of it is someone on a low income (and not receiving any first-tier benefit) who may be eligible for an Accommodation Supplement (a second-tier benefit) to offset their housing costs. The person may mistakenly believe that they could receive the Accommodation Supplement only if they are receiving one of the first-tier benefits. Their lack of awareness of eligibility results in them bearing the full cost of their housing.

Because the Ministry considers these cases as people who do not take up the benefit rather than people who are being underpaid, it does not regularly collect information on:

  • the likely size of the group;
  • the characteristics of the group; or
  • the average dollar amount that each person in the group should be receiving.

In making this point, we also acknowledge the Ministry’s substantial and ground-breaking work in relation to the Living Standards Research Project. This is an ongoing research programme focusing on developing a comprehensive description of the living standards of New Zealanders, which will enable governments and communities to develop evidence-based policies to address disparities between different groups of New Zealanders.

It is feasible to run a special exercise to collect information that could not be obtained through the ARP. Figure 9 below shows the results of a national survey the Ministry conducted in 1996 to ascertain the proportions of people who were and were not receiving the Accommodation Supplement.16 The three key features of the results are that:

  • Of the people determined to be eligible for the supplement (19% of the sample), just under two-thirds were receiving it.
  • Of the people determined not to be eligible (77% of the sample), a small group was receiving the benefit. The people in that group had (presumably) been granted the benefit on the basis of incorrect information (innocently or falsely given in their application) or on the basis of correct information incorrectly assessed by the Ministry.
  • The eligibility of 4% of the people in the sample could not be determined. Inadequate or incomplete information creates the risk that a person eligible for a benefit might be denied it or a person not eligible for a benefit might be granted it.

Figure 9
Benefit Accuracy Information – 1996 National Survey of Eligibility for the Accommodation Supplement

                                              Proportion of sample %
People eligible for the supplement –
  • receiving it                              12
  • not receiving it                           7

People not eligible for the supplement –
  • receiving it                               3
  • not receiving it                          74

Indeterminate eligibility                      4

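The key ratio quoted above can be checked arithmetically from the Figure 9 shares. The short sketch below (an illustrative calculation, not part of the survey methodology) reproduces the “just under two-thirds” take-up figure.

```python
# Shares of the surveyed sample, as percentages, taken from Figure 9.
eligible_receiving = 12
eligible_not_receiving = 7
not_eligible_receiving = 3
not_eligible_not_receiving = 74
indeterminate = 4  # eligibility could not be determined

# The five shares should account for the whole sample.
assert (eligible_receiving + eligible_not_receiving + not_eligible_receiving
        + not_eligible_not_receiving + indeterminate) == 100

# Take-up among those determined to be eligible (19% of the sample).
take_up = eligible_receiving / (eligible_receiving + eligible_not_receiving)
print(f"Take-up among the eligible: {take_up:.0%}")  # about 63% - "just under two-thirds"
```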

The ARP measures and reports accuracy on the basis of the Ministry’s statutory obligations and Purchase Agreement. It does not, and is not intended to, provide a wider measure of benefits being paid accurately to all who are eligible to apply.

However, it is possible to measure and report data not collected by the ARP by conducting separate special exercises in data collection and analysis.

Unverifiable Items

Some of the cases selected as part of the ARP sample cannot be assessed as correct or incorrect because files or parts of files are found to be missing. The Ministry excludes these cases – which it refers to as “unverifiables” – from the ARP sample results. At the time the decision to exclude unverifiables from the sample results was taken, we agreed to this approach. However, we now take the view that the decision was incorrect, because cases for which papers cannot be found may well have a different error rate from the rate for cases that are well documented. Excluding them may therefore result in incorrect estimates.

Figure 10 on the next page shows the regional accuracy results against the national target for working age beneficiaries, excluding unverifiables, as measured by the ARP.

Figure 10
Regional Accuracy Results for 2001-02 Against the National Target as Measured by ARP Excluding Unverifiables


The national average for unverifiables as a proportion of the sample was 6.6% in 2001-02.17 The Ministry has been trying to reduce the level of unverifiable cases, and it came down to 4% for the 2002-03 year.

We understand that the improved results have been achieved mainly by the Ministry putting in place specific strategies to ensure that the level of unverifiables was reduced in 2002-03. Since July 2002, the Ministry has:

  • Introduced a procedure of sending, on the fourth working day of each month, a summary spreadsheet to each regional ARP liaison person providing the results of the sample and highlighting any outstanding unverifiable cases. The regional liaison person then signs off on the unverifiable spreadsheet and comments on each case referred back to them. They confirm how each case has been addressed with the case manager and advise ARP when additional information will be provided so that the original sample case can be assessed.
  • Reduced the deadline for having cases returned for assessment from 12 months to 30 days, with effect from 1 July 2002.
  • In those cases where no papers whatever could be located, introduced a process by which Benefit Control would check for indicators of fraud.
  • Reviewed results to identify common types of errors and any associated training issues, and provided this information to regional liaison people and the National Training and Helpline teams.


With the present practice of excluding unverifiables from the sample, the results could over-estimate the overall level of accuracy. By changing this practice to include unverifiable cases as errors, there would be a strong incentive for the Ministry to reduce the unverifiables to as low a level as possible. However, there would also be a risk that the results could under-estimate the overall level of accuracy, since it is unlikely that all unverifiable items contain errors.
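The sensitivity of the reported figure to the treatment of unverifiables can be illustrated with a simple bounding calculation. The sketch below is purely illustrative: it pairs the 2002-03 working-age accuracy result (89.2%) with the 2001-02 unverifiable rate (6.6%) as assumed inputs, so the numbers do not correspond to any single reported year.

```python
# Illustrative bounding exercise: if accuracy measured on verifiable cases
# is 89.2% and 6.6% of the sample was unverifiable, the accuracy over the
# whole sample must lie between two extremes.
accuracy_verifiable = 0.892
unverifiable_share = 0.066

# Worst case: every unverifiable case is an error.
lower = accuracy_verifiable * (1 - unverifiable_share)
# Best case: every unverifiable case is correct.
upper = accuracy_verifiable * (1 - unverifiable_share) + unverifiable_share

print(f"Whole-sample accuracy lies between {lower:.1%} and {upper:.1%}")
```

Because the published figure (89.2%) sits above the worst-case bound, excluding unverifiables can only over-estimate accuracy if those cases contain a disproportionate share of errors; the calculation makes the size of that possible over-estimate explicit.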

An alternative approach would be for the Ministry to clearly explain unverifiables and disclose their level in its annual report.

Recommendation 8
We recommend that –
The Ministry explains unverifiables and discloses their level in its annual report. The precise form of the disclosure should be agreed with the Audit Office as part of the annual audit.

Uncertainty of ARP Results

Subject to appropriate treatment of unverifiables, the ARP provides a suitable national measure – its sample size was designed to produce, at national level, a 95% confidence interval of ±1.4% for both applications and reviews of existing benefits. When applications and reviews are aggregated (as at present), the 95% confidence interval for the national figure is ±1%. This means that, if the point estimate (i.e. the result) from the sample was 90% accuracy, the range of accuracy would be between 89% and 91% (i.e. 90% ±1%).

When applied at regional level, the current ARP sample results in a much wider 95% confidence interval of about ±5%. So, for example, while the Waikato region reported an accuracy level of 84.4% for 2001-02, the range of accuracy would be between 80.1% and 88.7%.

This high degree of sampling error makes it difficult to draw useful conclusions from the analysis of ARP data at a regional level. For example, Figure 11 (below) illustrates that at a 95% confidence level there is only one region whose range of possible results does not either exceed the national target or overlap it. This means that it would be unsafe to conclude (except for the one region’s case) that any particular region does not meet the national target.

Regional trend data is also difficult to interpret. For example, if a region produced a point estimate of 90% one year and 85% the next, it would be unsafe to conclude that there had been any change in performance between the two years, notwithstanding the difference between the two estimates. This is because the range that could be inferred from the first estimate would be 85% to 95% (i.e. 90% ±5%).
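The effect of sample size on the width of these intervals can be illustrated with the standard normal-approximation formula for a proportion. This is a simplified sketch: the ARP’s published intervals reflect its actual sampling design (including design effects and the exclusion of unverifiables), so the textbook formula below produces narrower intervals than the ±1% and ±5% figures quoted in this report.

```python
import math


def ci_95(p_hat: float, n: int) -> tuple[float, float]:
    """95% confidence interval for a sample proportion (normal approximation)."""
    half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - half_width, p_hat + half_width)


# Roughly 250 applications + 250 reviews are sampled from each of 13 regions.
regional_n = 250 + 250        # one region's sample
national_n = regional_n * 13  # the aggregated national sample

lo, hi = ci_95(0.90, national_n)
print(f"National (n={national_n}): point estimate 90.0%, CI {lo:.1%} to {hi:.1%}")

lo, hi = ci_95(0.90, regional_n)
print(f"Regional (n={regional_n}): point estimate 90.0%, CI {lo:.1%} to {hi:.1%}")
```

The regional interval is more than three times as wide as the national one for the same point estimate, which is the arithmetic behind the caution urged above about regional comparisons and year-to-year trends.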

National Office provides the regions with their accuracy results, but only as a point estimate, as shown for each region in Figure 10 on page 62. Regions are not told the ranges of results, as shown in Figure 11 below.

Figure 11
Ranges of Regional Accuracy Results for 2001-02 Against the National Target


Regions use their ARP results as one among a number of indicators of their performance in processing accuracy, and they take them into account in making operational changes to service delivery. Some regions aggregate their service centres’ 5+5 checks as an additional regional measure of accuracy. We believe there is merit in this approach, so long as the regions are carrying out checks to ensure that the data from different service centres and teams is consistent.

The Ministry outlined to us a range of other information sources available to Regional Commissioners and their staff that can supplement ARP data in helping them form a view of their region's performance in respect of benefit accuracy. These sources include:

  • The Ministry’s risk management framework (Tickit), which requires service centre managers to certify compliance with key procedures.
  • Work product and process monitoring by a national quality assurance team to ensure consistency of practice across all regions.
  • Periodic reports by Internal Audit that are completed after examinations of both samples of work and operating procedures.
  • A system which provides a central repository for information, training material, and processes to help staff deal with frequently asked questions and requests, enabling them to provide consistent information regardless of where they are based. The system incorporates a number of tools, including one that lodges details of client requests for a review and monitors each review from the date on which the request was received.
  • Reports from the Ministry’s Helpline service that record both the number of calls received from staff and the reasons for the call.
  • Reports on staff turnover rates and average duration of employment.
  • Individual performance assessments for each member of staff.
  • Information provided by regional training co-ordinators and the team coaches located in each service centre.

These additional sources do not provide statistically reliable measures of a region’s performance, but they do give Regional Commissioners:

  • information relevant to the competence and capability of the service centre staff in their regions;
  • assurance about the degree to which their staff are complying with established procedures; and
  • supplementary indicators that would help to disclose any significant problems relating to processing accuracy – and provide a basis for reconsidering any potentially misleading picture produced by ARP results.

To the extent that ARP does not provide a precise measure of accuracy at regional level, that deficiency is partially compensated by the existence of these other indicators of performance. The Ministry expressed confidence that Regional Commissioners would not react to only one indicator but would form their views about the performance of their regions having regard to all the available information.

Because the Ministry has adopted a regional management structure and has delegated to Regional Commissioners the task of deploying their resources to manage performance effectively within their regions, we regard the quality of the management information available to Regional Commissioners as very important.

The Ministry considers that the Government does not require any region to stay within a specified range of accuracy so long as the national target is met. However, even if the Ministry is meeting performance targets at a national level, substandard performance by one or more regions could still adversely affect a considerable number of people. For example, beneficiaries in one region could be disadvantaged if a low level of accuracy in their region led to an above-average level of under- and overpayments, and the associated consequences.

We agree that the additional indicators reduce the risk of relying on ARP alone. However, in our view, it is questionable whether the information currently available to regional managers is sufficient.

For example, if a Regional Commissioner were to be confronted with an apparently adverse change in the ARP results, in circumstances where staff turnover was higher than usual and there was some evidence of non-compliance with established procedures, any conclusion about the actual state of regional performance would still be uncertain. This is because the apparently adverse change in ARP might be wholly caused by sampling error.

Conversely, an apparently positive ARP result that indicated no change or a positive change in the region’s performance might be similarly misleading. When information is imprecise, there is always a risk of responding to false signals.

In the same vein, the Ministry’s National Office needs to consider carefully the implications of the information it receives about regional performance and what management signals it sends in relation to its interpretation of that information.

National Office produces each month a document entitled Regional Performance Summary that contains a range of key performance indicators (KPIs) of regional performance. Each KPI is assigned one of four colours – blue, green, yellow or red – depending on whether the KPI exceeds, meets, is just below, or is significantly below a national performance standard. In response, each Regional Commissioner produces monthly for the National Commissioner a one-page National Overview Report that, among other items, provides commentary on the region’s KPI results and indicates what corrective action is being taken.

In respect of ARP data, the Regional Performance Summary presents year-to-date point estimates for each region. Regions for which the point estimate is below the national target are signified in yellow or red, irrespective of whether the confidence interval overlaps the national target. In our view, this is likely to induce errors of inference.


The ARP was not designed as a statistically valid instrument for measuring regional performance. Because the ARP estimates the level of accuracy of the total “population” of benefits from a relatively small random sample of those benefits, any use of ARP results in respect of regions needs to be approached with caution. At regional level the samples are small, and the smaller the sample on which an estimate is based, the greater the uncertainty of the estimate.

Earlier in the report (paragraph 4.32 on page 50) we pointed to the possibility of using 5+5 data to provide a better picture of accuracy performance at regional level. The foregoing discussion in this section highlights the importance of the issue.

In our view, the relative levels of accuracy between different parts of the country are important. Regions’ ARP results provide an imperfect measure of accuracy, but, in the absence of something better, it is acceptable for regional managers to use them as one of their operational tools. However, it is most important that, in doing so, they are clear about the uncertainties of the data they are using.

Recommendation 9
We recommend that –
The Ministry continues to give the regions their ARP results, but in a form similar to Figure 11 on page 64 – showing each region’s data at a 95% confidence level, and comparative data of other regions.

Recommendation 10
We recommend that –
The Ministry provides all Regional Commissioners and Regional Operations Managers with training on the nature of sampling error and the appropriate interpretation of statistical estimates that include confidence intervals.

Measuring and Reporting the Size of Errors and Fraud

The ARP provides a measure of the proportion of cases found to be accurate rather than any measure of the size of any incorrect payments or fraud. It cannot therefore be used to estimate the total amounts of under- or over-payments or fraud that potentially exist in the benefit system.

Benefit Control reports only the over-payments (including frauds) that it identifies and seeks to recover. It does not estimate the total value of over-payments (including fraudulent over-payments) that may actually exist.

It is nevertheless possible to assess the extent of under- and overpayments or fraud. In December 2001, the Ministry’s internal auditors completed a study of the processing of benefit applications. The study found that 2.7% of the cases sampled had errors that resulted in benefits being paid inaccurately. However, it did not examine whether these payments were ongoing (i.e. would continue to be inaccurate week after week) or were once-only errors.

In practice, a number of factors limit the risk that inaccuracies will be ongoing – for example:

  • There are standard reviews – after 26 weeks or 52 weeks, depending on the benefit type – at which time the beneficiary has to re-declare their core eligibility. These reviews provide an opportunity to pick up an ongoing inaccuracy that has arisen from past provision of incorrect or incomplete information.
  • Beneficiaries receiving certain categories of benefit (such as Unemployment Benefit) are required to have client plans that involve reassessments at much more frequent intervals than the periods between the standard reviews. Any inaccuracies dating from the grant of the benefit may therefore be picked up much sooner.
  • For some (often third-tier) benefits, ongoing inaccuracy is not an issue as they may be once-only payments or they may be paid for only a short time.

Our examination of ARP data confirmed that many errors are small and non-recurring.

In 1996 the Ministry undertook a risk-sizing exercise with the aim of quantifying the amount of over-payment (including those made as the result of fraud) that may exist. The Ministry and we both consider that the method used in the 1996 exercise was flawed. However, we do see value in estimating the extent of fraud at regular intervals by an appropriate method. Such an exercise would involve taking a sample of beneficiaries and measuring the extent to which their payments match their entitlements.


The ARP does not estimate the amount of under- or over-payments. Other available information and our own analysis suggest that the risk of large errors is low. However, it is important to undertake a specific exercise periodically to estimate the amount of over-payments (including fraud). The estimate would also provide the Ministry with a factual basis on which to estimate whether the current level of expenditure on Benefit Control ($38.3 million in 2003-04) yields the greatest cost/benefit.

Recommendation 11
We recommend that –
The Ministry regularly performs a risk-sizing exercise to estimate the amount of over-payments.

Identifying and Reporting the Causes of, and Hardship Caused by, Errors

When under- or over-payments were identified, the Ministry previously did not identify their causes – for example, whether they were because of an error by the beneficiary or by the case manager. In particular, inaccurate payments caused by processing errors were not clearly identified and reported. However, the Ministry has recently begun undertaking an analysis of errors, validating it against other data and communicating its findings to the regions. We applaud this initiative.

A systematic collection of information on errors, their size and cause, and how they were found can provide useful indicators to help management assess for policy purposes:

  • priorities for focusing effort to achieve better accuracy – by identifying processing errors early and by avoiding them; and
  • the hardship that such errors might be causing beneficiaries.

Recommendation 12
We recommend that –
The Ministry continues to explore the collection and analysis of information on errors, their size and cause, and how they were found, and to link this work with enhancements to its information technology systems.

14: A “review” is essentially a decision to change the benefit as the result of a change in any aspect of the beneficiary’s circumstances – for example, to change the amount of accommodation supplement payable because of a change in the beneficiary’s accommodation costs.

15: At that time, the accuracy rates of new applications and reviews of existing benefits were reported separately.

16: The survey was of people in both rental accommodation and their own homes after the 1993 Government decision that accommodation assistance (through a second-tier benefit called the Accommodation Supplement) should be available to low-income families, whether beneficiaries or wage earners, to assist with either rental or home-ownership costs.

17: In previous years the national average was as high as 15%.
