A Large Scale Study and Classification of VirusTotal Reports on Phishing and Malware URLs

Euijin Choo, Mohamed Nabeel, Doowon Kim, Ravindu De Silva, Ting Yu, and and Issa Khalil

Abstract
VirusTotal (VT) is a widely used scanning service for researchers and practitioners to label malicious entities and predict new security threats. Unfortunately, it is little known to the end-users how VT URL scanners decide on the maliciousness of entities and the attack types they are involved in (e.g., phishing or malware-hosting websites). In this paper, we conduct a systematic comparative study on VT URL scanners' behavior for different attack types of malicious URLs, in terms of 1) detection specialties, 2) stability, 3) correlations between scanners, and 4) lead/lag behaviors. Our findings highlight that the VT scanners commonly disagree with each other on their detection and attack type classification, leading to challenges in ascertaining the maliciousness of a URL and taking prompt mitigation actions according to different attack types. This motivates us to present a new highly accurate classifier that helps correctly identify the attack types of malicious URLs at the early stage. This in turn assists practitioners in performing better threat aggregation and choosing proper mitigation actions for different attack types.

Reference
Euijin Choo, Mohamed Nabeel, Doowon Kim, Ravindu De Silva, Ting Yu, and and Issa Khalil. 2023. A large scale study and classification of VirusTotal reports on phishing and malware URLs. In Proceedings of the 49th AMC SIGMETRICS Conference. ACM.

Downloads