This paper assesses the current quality of administrative (admin) data for deriving the ethnicity of the New Zealand population. It discusses an update to the methods previously used to derive ethnicity information from admin sources (Stats NZ, 2018), comparing the results against the 2018 Census and extending the analysis to more detailed ethnicities and ethnic groups.
Summary
This paper is one of a series of investigations by Stats NZ's census transformation research programme aimed at identifying and exploring the potential for admin data sources to provide census-type information. This work not only informs the direction of future censuses but also underpins the use of admin data in the 'combined' census model of the 2018 and 2023 Censuses, and is the basis for including variables in the experimental administrative population census (APC).
Past evaluations focused primarily on higher-level ethnicity classifications and revealed that consistency varied across admin sources. This report extends that work by examining data quality across all four classification levels of the 2005 New Zealand standard classification of ethnicity, and by including newly integrated sources such as the Household Labour Force Survey (HLFS) and the Department of Internal Affairs' (DIA) Deaths register.
Key findings demonstrate notable improvements in data accuracy since prior analyses, particularly for major ethnic groups at the broader classification levels. The APC emerged as a promising model for producing census-type data, displaying high agreement with the 2018 Census across most ethnic groups. However, disparities remain in the accuracy of data on smaller and non-standardised ethnic groups, especially those categorised as 'not further defined (nfd)' or 'not elsewhere classified (nec)'. Additionally, differences across admin sources, such as the over-representation of 'Other Ethnicity' in data from the Ministry of Social Development (MSD) and the Accident Compensation Corporation (ACC), indicate areas where source ranking and standardisation need refinement.
This report lays the foundation for future work to improve the methods for ranking data sources by ethnic group, with the aim of enhancing data quality, accuracy, and consistency over time. As reliance on admin data grows, these improvements will become increasingly important. Insights from this work also highlight the need for better ethnicity data collection processes across government agencies, supporting a strong, admin-first model for the 2028 Census.
ISBN 978-1-991307-39-2 (online)