Authors: S Malarvizhi

Abstract: Financial statement manipulation continues to undermine the reliability of corporate reporting, while conventional ratio-based detection tools remain largely backward-looking and often insensitive to behaviour-driven precursors. This study proposes a behaviourally enriched, explainable machine learning framework that integrates traditional accounting indicators with managerial and disclosure-based proxies to improve manipulation risk screening. Using a firm-year panel (2016–2023) of listed non-financial companies, the study constructs a binary manipulation label and compares two benchmark models (Beneish-style screening and logistic regression) with three ensemble learners (random forest, XGBoost, and LightGBM). Model evaluation emphasises robustness to class imbalance, using ROC–AUC, precision, recall, F1-score, and confusion-matrix diagnostics. Empirically, behavioural enrichment raises ROC–AUC by approximately 4–7 percentage points across models, and the best-performing LightGBM specification achieves accuracy of 0.95, precision of 0.92, recall of 0.90, F1 of 0.91, and ROC–AUC of 0.98. Relative to the logistic baseline, false negatives decline from 56 to 18 (a reduction of about 68%), meaning fewer manipulating firm-years go undetected and sensitivity for audit use is strengthened. To support audit usability, the framework embeds SHAP-based explainability, which identifies the Earnings Pressure Index and Management Tone Score as dominant predictors alongside the Days Sales in Receivables Index (DSRI) and Asset Quality Index (AQI), indicating that manipulation risk is strongly linked to behaviour rather than being purely numerical. Overall, the study contributes an interpretable early-warning analytics tool that improves both predictive performance and decision transparency for auditors, regulators, and governance stakeholders.
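As a quick consistency check on the figures reported in the abstract, the sketch below recomputes the F1-score from the stated precision and recall, and the relative drop in false negatives from the stated counts. The function names `f1_score` and `relative_reduction` are illustrative helpers introduced here, not part of the study's code; only the four input numbers come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def relative_reduction(before: int, after: int) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


# Figures as reported in the abstract (LightGBM vs. logistic baseline).
precision, recall = 0.92, 0.90
fn_logistic, fn_lightgbm = 56, 18

f1 = f1_score(precision, recall)
fn_drop = relative_reduction(fn_logistic, fn_lightgbm)

print(f"F1 = {f1:.2f}")
print(f"False-negative reduction = {fn_drop:.0%}")
```

Both values round to the reported ones (F1 of 0.91 and a reduction of about 68%), so the headline metrics are mutually consistent.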

DOI: