Authors: Saman Jayasinghe

 

Abstract: Infrastructure-as-Code (IaC) has transformed cloud resource management by allowing developers to define complex environments through machine-readable configuration files. However, this shift-left approach also introduces significant security risks, because a single misconfiguration can propagate vulnerabilities across an entire enterprise. Traditional static analysis tools often struggle with the semantic complexity and variety of IaC frameworks such as Terraform, Ansible, and Kubernetes. This review examines the emergence of Machine Learning (ML) as a robust approach to IaC security. By leveraging Natural Language Processing (NLP), Deep Learning (DL), and anomaly detection, ML-based systems can identify "security smells," predict compliance violations, and detect configuration drift with higher precision than rule-based systems. The article surveys the current state of the art, covering data representation techniques, the integration of Large Language Models (LLMs), and the transition toward self-healing infrastructures. Finally, we discuss the remaining challenges, including data scarcity and adversarial risks, and outline future research directions.

DOI: https://doi.org/10.5281/zenodo.19437671