Distance Functions for Matching in Small Samples
Eva Dettmann, Christian Schmeißer, Claudia Becker
Computational Statistics & Data Analysis,
No. 5,
2011
Abstract
The development of ‘standards’ for the application of matching algorithms in empirical evaluation studies is still an outstanding goal. The first step of the matching procedure is the choice of an appropriate distance function. In empirical evaluation situations, sample sizes are often small. Moreover, the samples consist of variables with different scale levels, which have to be considered explicitly in the matching process. A simulation is performed which is directed towards these empirical challenges and supplements former studies in this respect. The choice of the analysed distance functions is determined by the results of former theoretical studies and recommendations in the empirical literature. Thus, two balancing scores (the propensity score and the index score) and the Mahalanobis distance are considered. Additionally, aggregated statistical distance functions not yet used for empirical evaluation are included. The matching outcomes are compared using non-parametric scale-specific tests for identical distributions of the characteristics in the treatment and the control groups. The simulation results show that, in small samples, aggregated statistical distance functions are the better choice for summarising similarities in differently scaled variables compared to the commonly used measures.
Is there a Superior Distance Function for Matching in Small Samples?
Eva Dettmann, Claudia Becker, Christian Schmeißer
Abstract
The study contributes to the development of ‘standards’ for the application of matching algorithms in empirical evaluation studies. The focus is on the first step of the matching procedure, the choice of an appropriate distance function. Supplementing most former studies, the simulation is closely based on empirical evaluation situations. This orientation towards practice motivates the focus on small samples. Furthermore, variables with different scale levels must be considered explicitly in the matching process. The choice of the analysed distance functions is determined by the results of former theoretical studies and recommendations in the empirical literature. Thus, in the simulation, two balancing scores (the propensity score and the index score) and the Mahalanobis distance are considered. Additionally, aggregated statistical distance functions not yet used for empirical evaluation are included. The matching outcomes are compared using non-parametric scale-specific tests for identical distributions of the characteristics in the treatment and the control groups. The simulation results show that, in small samples, aggregated statistical distance functions are the better choice for summarising similarities in differently scaled variables compared to the commonly used measures.