The effect of sample size on the efficiency of count data models: applications to Marriage Data
Montshiwa, Volition Tlhalitshi
Moroke, Ntebogang Dinah
MetadataShow full item record
Abstract: Sample size requirements are common in many multivariate analysis techniques as one of the measures taken to ensure the robustness of such techniques, such requirements have not been of interest in the area of count data models. As such, this study investigated the effect of sample size on the efficiency of six commonly used count data models namely: Poisson regression model (PRM), Negative binomial regression model (NBRM), Zero-inflated Poisson (ZIP), Zero-inflated negative binomial (ZINB), Poisson Hurdle model (PHM) and Negative binomial hurdle model (NBHM). The data used in this study were sourced from Data First and were collected by Statistics South Africa through the Marriage and Divorce database. PRM, NBRM, ZIP, ZINB, PHM and NBHM were applied to ten randomly selected samples ranging from 4392 to 43916 and differing by 10% in size. The six models were compared using the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Vuong's test for over-dispersion, McFadden RSQ, Mean Square Error (MSE) and Mean Absolute Deviation (MAD).The results revealed that generally, the Negative Binomial-based models outperformed Poisson-based models. However, the results did not reveal the effect of sample size variations on the efficiency of the models since there was no consistency in the change in AIC, BIC, Vuong's test for over-dispersion, McFadden RSQ, MSE and MAD as the sample size increased.