Phishing

Phishing attacks pose a significant security threat to billions of Internet users. Phishers launch fake websites that closely resemble legitimate ones (e.g., banking websites) and trick users into entering their personal information (e.g., credentials), resulting in substantial financial losses for victims. According to the US Federal Bureau of Investigation, phishing victims lost $10.3 billion in 2022 alone, an increase of 49% compared to 2021. The long-term goal of our research in this area is to (1) better understand phishing attacks, with a specific focus on how attackers build phishing websites, and (2) design more usable and secure anti-phishing defense systems.


Publications

Conferences

Abstract: Phishing attacks have remained a prevalent and widespread cybersecurity threat for several years. This has led to numerous efforts to comprehensively understand the phishing attack ecosystem, with a specific focus on presenting new attack tactics and defense mechanisms against phishing attacks. Unfortunately, little is known about how client-side resources (e.g., JavaScript libraries) are used in phishing websites, compared to those in their corresponding legitimate target brand websites. This understanding can help us gain insights into the construction and techniques of phishing websites and phishing attackers’ behaviors when building phishing websites. In this paper, we gain a deeper understanding of how client-side resources (especially JavaScript libraries) are used in phishing websites by comparing them with the resources used in the legitimate target websites. For our study, we collect client-side resources from both phishing websites and their corresponding legitimate target brand websites over 25 months, covering 3.4M phishing websites (1.1M distinct phishing domains). Our study reveals that phishing websites tend to employ more diverse JavaScript libraries than their legitimate counterparts do. However, the libraries in phishing websites are older (by nearly 21.2 months) and distinct in comparison. For example, Socket.IO is uniquely used in phishing websites to send victims’ information to an external server in real time. Furthermore, we find that although phishing websites have evolved significantly to bypass anti-phishing measures, a considerable portion of them still maintain a basic and simplistic structure (e.g., simply displaying a login form or image). Finally, through HTML structure and style similarities, we can identify the specific target webpages of legitimate brands that phishing attackers reference and mimic in their phishing attacks.
Abstract: VirusTotal (VT) is a widely used scanning service for researchers and practitioners to label malicious entities and predict new security threats. Unfortunately, end users know little about how VT URL scanners decide on the maliciousness of entities and the attack types they are involved in (e.g., phishing or malware-hosting websites). In this paper, we conduct a systematic comparative study on VT URL scanners' behavior for different attack types of malicious URLs, in terms of 1) detection specialties, 2) stability, 3) correlations between scanners, and 4) lead/lag behaviors. Our findings highlight that the VT scanners commonly disagree with each other on their detection and attack type classification, leading to challenges in ascertaining the maliciousness of a URL and taking prompt mitigation actions according to different attack types. This motivates us to present a new, highly accurate classifier that helps correctly identify the attack types of malicious URLs at an early stage. This in turn assists practitioners in performing better threat aggregation and choosing proper mitigation actions for different attack types.
Abstract: Phishing attacks are causing substantial damage despite extensive efforts in academia and industry. Recently, a large volume of phishing attacks has shifted toward adopting HTTPS, leveraging TLS certificates issued by Certificate Authorities (CAs), to make the attacks more effective. In this paper, we present a comprehensive study on the security practices of CAs in the HTTPS phishing ecosystem. We focus on the CAs, critical actors under-studied in previous literature, to better understand the importance of the security practices of CAs and thwart the proliferating HTTPS phishing. In particular, we first present the current landscape and effectiveness of HTTPS phishing attacks compared to traditional HTTP ones. Then, we conduct an empirical experiment on the CAs' security practices in terms of the issuance and revocation of certificates. Our findings highlight serious conflicts between the expected security practices of CAs and reality, raising significant security concerns. We further validate our findings using a longitudinal dataset of abusive certificates used for real phishing attacks in the wild. We confirm that the security concerns of CAs prevail in the wild and that these concerns can be one of the main contributors to the recent surge of HTTPS phishing attacks.
Abstract: As the COVID-19 pandemic started triggering widespread lockdowns across the globe, cybercriminals did not hesitate to take advantage of users' increased usage of the Internet and their reliance on it. In this paper, we carry out a comprehensive measurement study of online social engineering attacks in the early months of the pandemic. By collecting, synthesizing, and analyzing DNS records, TLS certificates, phishing URLs, phishing website source code, phishing emails, web traffic to phishing websites, news articles, and government announcements, we track trends of phishing activity between January and May 2020 and seek to understand the key implications of the underlying trends. We find that phishing attack traffic in March and April 2020 skyrocketed up to 220% of its pre-COVID-19 rate, far exceeding typical seasonal spikes. Attackers exploited victims' uncertainty and fear related to the pandemic through a variety of highly targeted scams, including emerging scam types against which current defenses are not sufficient as well as traditional phishing which outpaced the ecosystem's collective response.

Journals and Magazines

Abstract: The ever-increasing phishing campaigns around the globe have been one of the main threats to cyber security. In response, global anti-phishing entities (e.g., APWG) collectively maintain up-to-date blacklist databases (e.g., eCrimeX) against phishing campaigns, and so do modern browsers (e.g., Google Safe Browsing). However, our findings reveal that such a mutual assistance system has a blind spot when it comes to detecting geolocation-based phishing campaigns. In this paper, we focus on phishing campaigns against the web portal service with the largest number of users (42 million) in South Korea. We harvest 1,558 phishing URLs from varying resources over the span of a full year, of which only a small fraction (3.8%) have been detected by eCrimeX despite a wide spectrum of active fraud cases. We demystify three pervasive types of phishing campaigns in South Korea: i) sophisticated phishing campaigns with varying adversarial tactics such as a proxy configuration, ii) phishing campaigns against a second-hand online market, and iii) phishing campaigns against a non-specific target. Aligned with previous findings, phishing kits that automate the whole phishing campaign are prevalent. In addition, we frequently observe a hit-and-run scam in which a phishing campaign becomes inaccessible right after victimization is complete, each of which is tailored to a single potential victim over a new channel such as a messenger. As part of mitigation efforts, we promptly provide regional phishing information to APWG and immediately lock down a victim’s account to prevent further damage.