DySec: A machine learning-based dynamic analysis for detecting malicious packages in PyPI ecosystem

Author Identifier (ORCID)

Chadni Islam: https://orcid.org/0000-0002-6349-6483

Abstract

Malicious Python packages make software supply chains vulnerable by exploiting trust in open-source repositories such as the Python Package Index (PyPI). Without real-time behavioral monitoring, metadata inspection and static code analysis are inadequate against advanced attack strategies such as typosquatting, covert remote access activation, and dynamic payload generation. To address these challenges, we introduce DySec, a machine learning (ML)-based dynamic analysis framework for PyPI that uses eBPF kernel and user-level probes to monitor behaviors during package installation. By capturing 36 real-time features, including system calls, network traffic, resource usage, directory access, and installation patterns, DySec detects threats such as typosquatting, covert remote access activation, dynamic payload generation, and multiphase attack malware. We developed a comprehensive dataset of 14,271 Python packages, including 7,127 malicious sample traces, by executing them in a controlled, isolated environment. Experimental results demonstrate that DySec achieves 96% detection accuracy with an ML inference latency of under 0.5 s after dynamic feature extraction, reducing false negatives by 78.65% compared to static analysis and by 82.24% compared to metadata analysis. During the evaluation, DySec flagged eleven packages that PyPI classified as benign. A manual analysis, including installation behavior inspection, confirmed six of them as malicious. These findings were reported to PyPI maintainers, resulting in the removal of four packages. By detecting malicious install-time behaviors, DySec bridges the gap between reactive traditional methods and proactive, scalable threat mitigation in open-source ecosystems.

Document Type

Journal Article

Date of Publication

1-1-2026

Publication Title

IEEE Transactions on Information Forensics and Security

Publisher

IEEE

School

School of Science

Comments

Mehedi, S. T., Islam, C., Ramachandran, G., & Jurdak, R. (2026). DySec: A machine learning-based dynamic analysis for detecting malicious packages in PyPI ecosystem. IEEE Transactions on Information Forensics and Security, 21, 1316–1331. https://doi.org/10.1109/TIFS.2026.3654388

Copyright

subscription content

Link to publisher version (DOI)

10.1109/TIFS.2026.3654388