Result: Propensity Score Matching (PSM) Python-based code

Title:
Propensity Score Matching (PSM) Python-based code
Publisher Information:
Zenodo
Publication Year:
2025
Collection:
Zenodo
Document Type:
E-resource, software
Language:
unknown
DOI:
10.5281/zenodo.15009139
Rights:
Accession Number:
edsbas.29DF6E29
Database:
BASE

Further information

This repository provides several variants of a free, Python-based code for performing propensity score (PS) matching. It is an initiative of the Camargo Cohort Study, developed to share the tool and promote the use of PS matching. The code overcomes compatibility issues with R versions and R packages, and implements (i) logistic regression to compute the PS, (ii) 1:N matching using the K-nearest neighbour (KNN) algorithm with a customisable caliper, (iii) sampling with or without replacement, (iv) visualisations to assess matching quality and (v) statistics to evaluate the balance.

Outputs:
- Matched pairs are stored as a '.csv' file, allowing a Cox regression (Coxreg) to be performed ('SET' in SPSS).
- Diagnostic plots are stored in the specified output folder, providing a view of the standardised mean difference (SMD) and PS distributions.
- Statistics for matching validation: SMD, variance ratio (VR), McFadden's pseudo-R², and now, L1 multivariate imbalance.

The code has been developed using information from the Matplotlib, NumPy and Seaborn libraries, with support and refinements from OpenAI's ChatGPT. No funding was received for conducting this work and there are no financial or non-financial interests to disclose.

Some tips
- It has been tested and works with datasets in SPSS v25, 28 and 29 ('open script'); with Python v3.10 and 3.11; and with R versions 4.3.0 and 4.4.0 together with the 'Reticulate' package, versions 1.39 and 1.40.
- It tolerates missing values acceptably. However, it is desirable to reduce them as much as possible.

Usage
Adapt the code to your own research:
- Replace C:\PATH_TO_YOUR_DATASET.sav with the path to your dataset
- Replace COVS with your covariates (variable name, not label)
- Choose the ratio (1:1, 1:2, ...) and the caliper
- Choose bar colors and adjust the limits of the x-axis and y-axis to the desired range
- Replace C:\PATH_TO_YOUR_FOLDER with the path to your output folder
Run the script in RStudio or SPSS (File / Open script).

Features:
All code variants perform PS matching and store matched pairs.
* Code 1: Sampling without replacement. Five plots showing SMD and PS distributions.
* Code 2: Sampling with replacement. Five plots.
* Code 3: Sampling ...
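
To make the matching and balance-checking steps described above more concrete, the following is a minimal sketch in Python and not the repository's own code. It assumes the propensity scores have already been estimated (for example by logistic regression) and uses a simple greedy nearest-neighbour search in place of the KNN routine. The function names `match_nearest`, `smd` and `variance_ratio`, the example unit ids and scores, and the pooled-variance form of the SMD are all illustrative assumptions.

```python
import statistics


def match_nearest(ps_treated, ps_control, ratio=1, caliper=0.05, replace=False):
    """Greedy 1:N nearest-neighbour matching on the propensity score.

    ps_treated and ps_control map a unit id to its propensity score
    (hypothetical input format). Returns {treated_id: [control_id, ...]}.
    Treated units with no control inside the caliper are left unmatched.
    """
    available = dict(ps_control)  # copy, so the caller's dict is not modified
    matches = {}
    for t_id, t_ps in ps_treated.items():
        # Rank the remaining controls by absolute PS distance to this unit.
        ranked = sorted(available, key=lambda c: abs(available[c] - t_ps))
        # Keep at most `ratio` controls whose distance is within the caliper.
        chosen = [c for c in ranked if abs(available[c] - t_ps) <= caliper][:ratio]
        if not chosen:
            continue
        matches[t_id] = chosen
        if not replace:
            # Without replacement, a matched control cannot be reused.
            for c in chosen:
                del available[c]
    return matches


def smd(treated, control):
    """Standardised mean difference of one covariate (pooled-variance form)."""
    pooled_sd = ((statistics.variance(treated) + statistics.variance(control)) / 2) ** 0.5
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


def variance_ratio(treated, control):
    """Ratio of sample variances of one covariate (treated over control)."""
    return statistics.variance(treated) / statistics.variance(control)


# Illustrative scores only; these are not data from the repository.
treated_scores = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control_scores = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}

pairs = match_nearest(treated_scores, control_scores, ratio=1, caliper=0.05)
print(pairs)  # {'t1': ['c1'], 't2': ['c3']}; t3 has no control within the caliper
```

In this sketch, a smaller caliper rejects more treated units but keeps matched units closer in PS, while `replace=True` lets one control serve several treated units. After matching, `smd` and `variance_ratio` would be applied to each covariate in the matched sample to check the balance.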