Measurement

Research is most impactful when it is predicated on a deep and thorough understanding of the problem space. This understanding has both technical and sociotechnical aspects.

On the technical side, it is important to understand the methods adversaries use, how well those methods work, and how broadly they are deployed. Similarly, it is critical to understand how effective defenses are, both in stopping attacks and in how easily developers and security operators can deploy them. It is also necessary to measure the security delta (i.e., improvement) achieved when potential mitigations are deployed. In this vein of research, we use methods from the security and Internet measurement communities, such as threat modeling, security proofs, and Internet scans.

On the sociotechnical side, it is necessary to understand user perceptions of problems. Without this knowledge, it is far too easy to design systems that, while technically interesting, will not be adopted by users, either because they do not solve the problems that matter most to users or because they solve problems in a way that is incompatible with user workflows. Similarly, it is important to understand how well users understand given technical concepts to ensure that system interfaces are adapted to user mental models, increasing the probability of correct usage. For sociotechnical understanding, we use methods from the human-computer interaction (HCI) community, such as usability studies, interviews, and grounded theory data analysis.


Publications

Conferences

Abstract:  Recent policy initiatives have acknowledged the importance of disaggregating data pertaining to diverse Asian ethnic communities to gain a more comprehensive understanding of their current status and to improve their overall well-being. However, research on anti-Asian racism has thus far fallen short of properly incorporating data disaggregation practices. Our study addresses this gap by collecting 12-month-long data from X (formerly known as Twitter) that contain diverse sub-ethnic group representations within Asian communities. In this dataset, we break down anti-Asian toxic messages based on both temporal and ethnic factors and conduct a series of comparative analyses of toxic messages, targeting different ethnic groups. Using temporal persistence analysis, n-gram-based correspondence analysis, and topic modeling, this study provides compelling evidence that anti-Asian messages comprise various distinctive narratives. Certain messages targeting sub-ethnic Asian groups entail different topics that distinguish them from those targeting Asians in a generic manner or those aimed at major ethnic groups, such as Chinese and Indian. By introducing several techniques that facilitate comparisons of online anti-Asian hate towards diverse ethnic communities, this study highlights the importance of taking a nuanced and disaggregated approach for understanding racial hatred to formulate effective mitigation strategies.
Abstract:  The sharing of Cyber Threat Intelligence (CTI) across organizations is gaining traction, as it can automate threat analysis and improve security awareness. However, few empirical studies exist on the prevalent types of cybersecurity threat data and their effectiveness in mitigating cyber attacks. We propose a framework named CTI-Lense to collect and analyze the volume, timeliness, coverage, and quality of Structured Threat Information eXpression (STIX) data, a de facto standard CTI format, from a list of publicly available CTI sources. We collected about 6 million STIX data objects from October 31, 2014 to April 10, 2023 from ten data sources and analyzed their characteristics. Our analysis reveals that STIX data sharing has steadily increased in recent years, but the volume of STIX data shared is still too low to cover all cyber threats. Additionally, only a few types of threat data objects have been shared, with malware signatures and URLs accounting for more than 90% of the collected data. While URLs are usually shared promptly, with about 72% of URLs shared earlier than or on the same day as VirusTotal, the sharing of malware signatures is significantly slower. Furthermore, we found that 19% of the Threat actor data contained incorrect information, and only 0.09% of the Indicator data provided security rules to detect cyber attacks. Based on our findings, we recommend practical considerations for effective and scalable STIX data sharing among organizations.
Abstract:  There is limited information regarding how users employ password managers in the wild and why they use them in that manner. To address this knowledge gap, we conduct observational interviews with 32 password manager users. Using grounded theory, we identify four theories describing the processes and rationale behind participants' usage of password managers. We find that many users simultaneously use both a browser-based and a third-party manager, using each as a backup for the other, with this new paradigm having intriguing usability and security implications. Users also eschew generated passwords because these passwords are challenging to enter and remember when the manager is unavailable, necessitating new generators that create easy-to-enter and remember passwords. Additionally, the credential audits provided by most managers overwhelm users, limiting their utility and indicating a need for more proactive and streamlined notification systems. We also discuss mobile usage, adoption and promotion, and other related topics.
Abstract:  Code-signing PKI ecosystems are vulnerable to abusers. Kim et al. reported such abuse cases, e.g., malware authors misused the stolen private keys of reputable code-signing certificates to sign their malicious programs. This certified malware exploits the chain of trust established in the ecosystem and helps an adversary readily bypass security mechanisms such as anti-virus engines. Prior work analyzed large corpora of certificates collected from the wild to characterize the security problems. However, this practice was typically performed from a global perspective and often overlooked issues that could arise at a local level. Our work revisits the investigations conducted by previous studies from a local perspective. In particular, we focus on code-signing certificates issued to South Korean companies. South Korea employs the code-signing PKI ecosystem with its own regional adaptations; thus, it is a well-suited candidate for comparison. To begin with, we build a data collection pipeline and collect 455 certificates that were issued to South Korean companies and are potentially misused. We analyze those certificates along three dimensions: (i) abusers, (ii) issuers, and (iii) the life-cycle of the certificate. We first identify that strong government regulation can affect the market share of CAs. We also observe several problems in certificate revocation: (i) certificates issued by local companies that have closed their code-signing business still exist, (ii) only 6.8% of the abused certificates are revoked, and (iii) eight certificates are not revoked properly. All of these could extend the validity of certified malware in the wild. Moreover, we show that the number of abuse cases is high in South Korea, even though it has a small population. Our study implies that Korean security practitioners need to pay immediate attention to code-signing PKI abuse cases to safeguard the entire ecosystem.
Abstract:  To provide secure content delivery, Transport Layer Security (TLS) has become a de facto standard over a couple of decades. However, TLS has a long history of security weaknesses and drawbacks. Thus, the security of TLS has been enhanced by addressing security problems through continuous version upgrades. Meanwhile, to provide fast content delivery globally, websites (or origin web servers) need to deploy and administer many machines in globally distributed environments. They often delegate the management of machines to web hosting services or content delivery networks (CDNs), where the security configurations of distributed servers may vary spatially depending on the managing entities or locations. Based on these spatial differences in TLS security, we find that the security level of TLS connections (and their web services) can be lowered. After collecting the information of (web) domains that exhibit different TLS versions and cryptographic options depending on clients' locations, we show that it is possible to redirect TLS handshake messages to weak TLS servers, which both the origin server and the client may not be aware of. We investigate 7M domains with these spatial differences of security levels in the wild and conduct the analyses to better understand the root causes of this phenomenon. We also measure redirection delays at various locations in the world to see whether there are noticeable delays in redirections.
Abstract:  Phishing attacks are causing substantial damage despite extensive efforts in academia and industry. Recently, a large volume of phishing attacks has shifted toward adopting HTTPS, leveraging TLS certificates issued by Certificate Authorities (CAs), to make the attacks more effective. In this paper, we present a comprehensive study on the security practices of CAs in the HTTPS phishing ecosystem. We focus on the CAs, critical actors under-studied in previous literature, to better understand the importance of the security practices of CAs and thwart the proliferating HTTPS phishing. In particular, we first present the current landscape and effectiveness of HTTPS phishing attacks compared to traditional HTTP ones. Then, we conduct an empirical experiment on the CAs' security practices in terms of the issuance and revocation of the certificates. Our findings highlight serious conflicts between the expected security practices of CAs and reality, raising significant security concerns. We further validate our findings using a longitudinal dataset of abusive certificates used for real phishing attacks in the wild. We confirm that the security concerns of CAs prevail in the wild and these concerns can be one of the main contributors to the recent surge of HTTPS phishing attacks.
Abstract:  Transport Layer Security (TLS) has become the norm for secure communication over the Internet. In August 2018, TLS 1.3, the latest version, which improves on the security and performance of the previous TLS version, was approved. In this paper, we take a closer look at TLS 1.3 deployments in practice regarding adoption rate, security, performance, and implementation by applying temporal, spatial, and platform-based approaches on 687M connections. Overall, TLS 1.3 has been adopted rapidly, mainly because third-party platforms such as Content Delivery Networks (CDNs) make a significant contribution to the Internet. In fact, it deprecates vulnerable cryptographic primitives and substantially reduces the time required to perform the TLS 1.3 full handshake compared to the TLS 1.2 handshake. We quantify these aspects and show TLS 1.3 is beneficial to websites that do not rely on third-party platforms. We also review Common Vulnerabilities and Exposures (CVE) regarding TLS libraries and show that many recent vulnerabilities can be easily addressed by upgrading to TLS 1.3. However, some websites exhibit unstable support for TLS 1.3 due to multiple platforms with different TLS versions or migration to other platforms, which means that a website can show a lower TLS version at a certain time or from a certain region. Furthermore, we find that most of the implementations (including TLS libraries) do not fully support the new features of TLS 1.3 such as downgrade protection and certificate extensions.
Abstract:  Secure messaging tools are an integral part of modern society. While there is a significant body of secure messaging research generally, there is a lack of information regarding users' security and privacy perceptions and requirements for secure group chat. To address this gap, we conducted a survey of 996 participants in the US and UK. The results of our study show that group chat presents important security and privacy challenges, some of which are not present in one-to-one chat. For example, users need to be able to manage and monitor group membership, establish trust for new group members, and filter content that they share in different chat contexts. Similarly, we find that the sheer volume of notifications that occur in group chat makes it extremely likely that users ignore important security- or privacy-related notifications. We also find that participants lack mechanisms for determining which tools are secure and instead rely on non-technical strategies for protecting their privacy, such as self-filtering what they post and carefully tracking group membership. Based on these findings, we provide recommendations on how to improve the security and usability of secure group chat.
Abstract:  As the COVID-19 pandemic started triggering widespread lockdowns across the globe, cybercriminals did not hesitate to take advantage of users' increased usage of the Internet and their reliance on it. In this paper, we carry out a comprehensive measurement study of online social engineering attacks in the early months of the pandemic. By collecting, synthesizing, and analyzing DNS records, TLS certificates, phishing URLs, phishing website source code, phishing emails, web traffic to phishing websites, news articles, and government announcements, we track trends of phishing activity between January and May 2020 and seek to understand the key implications of the underlying trends. We find that phishing attack traffic in March and April 2020 skyrocketed up to 220% of its pre-COVID-19 rate, far exceeding typical seasonal spikes. Attackers exploited victims' uncertainty and fear related to the pandemic through a variety of highly targeted scams, including emerging scam types against which current defenses are not sufficient as well as traditional phishing which outpaced the ecosystem's collective response.
Abstract:  Recent measurement studies have highlighted security threats against the code-signing public key infrastructure (PKI), such as certificates that had been compromised or issued directly to the malware authors. The primary mechanism for mitigating these threats is to revoke the abusive certificates. However, the distributed yet closed nature of the code signing PKI makes it difficult to evaluate the effectiveness of revocations in this ecosystem. In consequence, the magnitude of signed malware threat is not fully understood. In this paper, we collect seven datasets, including the largest corpus of code-signing certificates, and we combine them to analyze the revocation process from end to end. Effective revocations rely on three roles: (1) discovering the abusive certificates, (2) revoking the certificates effectively, and (3) disseminating the revocation information for clients. We assess the challenge for discovering compromised certificates and the subsequent revocation delays. We show that erroneously setting revocation dates causes signed malware to remain valid even after the certificate has been revoked. We also report failures in disseminating the revocations, leading clients to continue trusting the revoked certificates.
Abstract:  Digitally signed malware can bypass system protection mechanisms that install or launch only programs with valid signatures. It can also evade anti-virus programs, which often forgo scanning signed binaries. Known from advanced threats such as Stuxnet and Flame, this type of abuse has not been measured systematically in the broader malware landscape. In particular, the methods, effectiveness window, and security implications of code-signing PKI abuse are not well understood. We propose a threat model that highlights three types of weaknesses in the code-signing PKI. We overcome challenges specific to code-signing measurements by introducing techniques for prioritizing the collection of code-signing certificates that are likely abusive. We also introduce an algorithm for distinguishing among different types of threats. These techniques allow us to study threats that breach the trust encoded in the Windows code-signing PKI. The threats include stealing the private keys associated with benign certificates and using them to sign malware, or impersonating legitimate companies that do not develop software and, hence, do not own code-signing certificates. Finally, we discuss the actionable implications of our findings and propose concrete steps for improving the security of the code-signing ecosystem.
Abstract:  Understanding how people behave when faced with complex security situations is essential to designing usable security tools. To better understand users' perceptions of their digital lives and how they managed their online security posture, we conducted a series of 23 semi-structured interviews with mostly middle-aged parents from suburban Washington state. Using a grounded theory methodology, we analyzed the interview data and found that participants chose their security posture based on the immense value the Internet provides and their belief that no combination of technology could make them perfectly safe. Within this context, users have a four-stage process for determining which security measures to adopt: learning, evaluation of risks, estimation of impact, and weighing trade-offs to various coping strategies. Our results also revealed that a majority of participants understand the basic principles of symmetric encryption. We found that participants' misconceptions related to browser-based TLS indicators lead to insecure behavior, and it is the permanence of encrypted email that causes participants to doubt that it is secure. We conclude with a discussion of possible responses to this research and avenues for future research.
Abstract:  We measure the prevalence and uses of TLS proxies using a Flash tool deployed with a Google AdWords campaign. We generate 2.9 million certificate tests and find that 1 in 250 TLS connections are TLS-proxied. The majority of these proxies appear to be benevolent; however, we identify over 1,000 cases where three malware products are using this technology nefariously. We also find numerous instances of negligent, duplicitous, and suspicious behavior, some of which degrade security for users without their knowledge. Distinguishing these types of practices is challenging, indicating a need for transparency and user awareness.
Abstract:  This paper reports the results of a survey of 1,976 individuals regarding their opinions on TLS inspection, a controversial technique that can be used for both benevolent and malicious purposes. Responses indicate that participants hold nuanced opinions on security and privacy trade-offs, with most recognizing legitimate uses for the practice but also expressing concern about threats from hackers or government surveillance. There is strong support for notification and consent when a system is intercepting their encrypted traffic, although this support varies depending on the situation. A significant concern about malicious uses of TLS inspection is identity theft, and many would react negatively and some would change their behavior if they discovered inspection occurring without their knowledge. We also find that a small but significant number of participants are jaded by the current state of affairs and have lost any expectation of privacy.

Journals and Magazines

Abstract:  Secure messaging tools are an integral part of modern society. To understand users’ security and privacy perceptions and requirements for secure group chat, we surveyed 996 respondents in the US and UK. Our results show that group chat presents important security and privacy challenges, some of which are not present in one-to-one chat. For example, users need to be able to manage and monitor group membership, establish trust for new group members, and filter content that they share in different chat contexts. We also find that respondents lack mechanisms for determining which tools are secure and instead rely on non-technical strategies for protecting their privacy—for example, self-filtering and carefully tracking group membership.

To better understand how these results relate to existing tools, we conduct cognitive walkthroughs (a form of expert usability review) for five popular group chat tools. Our results demonstrate that while existing tools address some items identified in our surveys, this support is partial and is insufficient in many cases. As such, there is a need for improved group chat tools that better align with user perceptions and requirements. Based on these findings, we provide recommendations on improving the security and usability of secure group chat.
Abstract:  TLS inspection—inline decryption, inspection, and re-encryption of TLS traffic—is a controversial practice used for both benevolent and malicious purposes. This article describes measurements of how often TLS inspection occurs and reports on a survey of the general public regarding the practice of TLS inspection. This helps inform security researchers and policymakers regarding current practices and user preferences.