Doowon Kim

Co-Director of the USER Lab
Assistant Professor

Email: doowon@utk.edu
Address: Min H. Kao Building, Room 345
1520 Middle Drive
Knoxville, TN 37996-2250
Phone: 865-974-8061

Education

  • Ph.D. in Computer Science, The University of Maryland, College Park, 2020

Bio

I’m an Assistant Professor in the Department of Electrical Engineering and Computer Science at the University of Tennessee, Knoxville. I received my Ph.D. in Computer Science from the University of Maryland, College Park, in May 2020, where I worked with Prof. Tudor Dumitras.

My research interests include computer security (data-driven security and usable security) and computer networks (Internet measurement). I am interested in identifying the root causes of security threats by understanding the actors involved (e.g., adversaries and end-users) from both data-driven and human-centered perspectives (i.e., usability studies).

I have been awarded the 5th Annual NSA Best Scientific Cybersecurity Paper Award (2017) and the Ann G. Wylie Dissertation Fellowship (2019). My work on the code-signing PKI has been featured in Ars Technica, The Register, Schneier on Security, Threatpost, and other outlets.

Journals and Magazines

Abstract:  The ever-increasing phishing campaigns around the globe have been one of the main threats to cyber security. In response, global anti-phishing entities (e.g., APWG) collectively maintain up-to-date blacklist databases (e.g., eCrimeX) against phishing campaigns, and so do modern browsers (e.g., Google Safe Browsing). However, our findings reveal that such a mutual assistance system has remained a blind spot when detecting geolocation-based phishing campaigns. In this paper, we focus on phishing campaigns against the web portal service with the largest number of users (42 million) in South Korea. We harvest 1,558 phishing URLs from varying resources over the span of a full year, of which only a small fraction (3.8%) have been detected by eCrimeX despite a wide spectrum of active fraudulence cases. We demystify three pervasive types of phishing campaigns in South Korea: i) sophisticated phishing campaigns with varying adversarial tactics such as a proxy configuration, ii) phishing campaigns against a second-hand online market, and iii) phishing campaigns against a non-specific target. Aligned with previous findings, phishing kits that automate the whole phishing campaign are prevalent. Besides, we frequently observe hit-and-run scams, where a phishing campaign becomes inaccessible immediately after victimization is complete; each is tailored to a single potential victim over a new channel like a messenger. As part of our mitigation efforts, we promptly provide regional phishing information to APWG and immediately lock down victims' accounts to prevent further damage.

Conferences

Abstract:  AI-powered coding assistant tools have revolutionized the software engineering ecosystem. However, prior work has demonstrated that these tools are vulnerable to poisoning attacks, in which an attacker intentionally injects maliciously crafted insecure code snippets into training datasets to manipulate the tools. The poisoned tools can suggest insecure code to developers, resulting in vulnerabilities in their products that attackers can exploit. However, it is still little understood whether such poisoning attacks would be practical in real-world settings and how developers address them during software development. To understand the real-world impact of poisoning attacks on developers who rely on AI-powered coding assistants, we conducted two user studies: an online survey and an in-lab study. The online survey involved 238 participants, including software developers and computer science students. The survey results revealed widespread adoption of these tools among participants, primarily to increase coding speed, eliminate repetition, and obtain boilerplate code. However, the survey also found that developers may misplace trust in these tools because they overlook the risk of poisoning attacks. The in-lab study was conducted with 30 professional developers, who were asked to complete three programming tasks with a representative AI-powered coding assistant tool running in Visual Studio Code. The in-lab study results showed that developers using a poisoned ChatGPT-like tool were more prone to including insecure code than those using an IntelliCode-like tool or no tool, demonstrating the strong influence of these tools on the security of generated code. Our study results highlight the need for education and improved coding practices to address the new security issues introduced by AI-powered coding assistant tools.
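
To make the risk concrete, here is a small illustrative sketch (my own example in Python with the pyca/cryptography library, not code from the study): a poisoned assistant might steer a developer toward AES in ECB mode, while the safer pattern uses authenticated encryption.

    # Illustrative sketch only (not from the paper): the kind of insecure
    # snippet a poisoned assistant might suggest, next to a safer alternative.
    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = os.urandom(32)
    msg = b"sixteen byte msg"  # exactly one AES block, so ECB needs no padding

    # Insecure suggestion: ECB mode has no IV, so identical plaintext blocks
    # produce identical ciphertext blocks and leak patterns.
    ecb = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    leaky = ecb.update(msg) + ecb.finalize()

    # Safer pattern: authenticated encryption (AES-GCM) with a fresh nonce.
    nonce = os.urandom(12)
    sealed = AESGCM(key).encrypt(nonce, msg, None)
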
Abstract:  Recent policy initiatives have acknowledged the importance of disaggregating data pertaining to diverse Asian ethnic communities to gain a more comprehensive understanding of their current status and to improve their overall well-being. However, research on anti-Asian racism has thus far fallen short of properly incorporating data disaggregation practices. Our study addresses this gap by collecting 12-month-long data from X (formerly known as Twitter) that contain diverse sub-ethnic group representations within Asian communities. In this dataset, we break down anti-Asian toxic messages based on both temporal and ethnic factors and conduct a series of comparative analyses of toxic messages, targeting different ethnic groups. Using temporal persistence analysis, n-gram-based correspondence analysis, and topic modeling, this study provides compelling evidence that anti-Asian messages comprise various distinctive narratives. Certain messages targeting sub-ethnic Asian groups entail different topics that distinguish them from those targeting Asians in a generic manner or those aimed at major ethnic groups, such as Chinese and Indian. By introducing several techniques that facilitate comparisons of online anti-Asian hate towards diverse ethnic communities, this study highlights the importance of taking a nuanced and disaggregated approach for understanding racial hatred to formulate effective mitigation strategies.
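
As a rough, simplified sketch of the n-gram step (my reconstruction, not the study's pipeline; the documents and group labels below are placeholders), per-group bigram counts can be compared with scikit-learn:

    # Simplified sketch (assumed approach): per-group bigram counts, the raw
    # material for comparing distinctive terms across targeted groups.
    from sklearn.feature_extraction.text import CountVectorizer

    docs_by_group = {
        "generic": ["placeholder messages targeting Asians generically"],
        "chinese": ["placeholder messages targeting the Chinese community"],
    }
    for group, docs in docs_by_group.items():
        vec = CountVectorizer(ngram_range=(2, 2))
        counts = vec.fit_transform(docs).sum(axis=0).A1
        top = sorted(zip(vec.get_feature_names_out(), counts), key=lambda t: -t[1])
        print(group, top[:5])  # five most frequent bigrams per group
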
Abstract:  Phishing attacks have persistently remained a prevalent and widespread cybersecurity threat for several years. This has led to numerous endeavors aimed at comprehensively understanding the phishing attack ecosystem, with a specific focus on presenting new attack tactics and defense mechanisms. Unfortunately, little is known about how client-side resources (e.g., JavaScript libraries) are used in phishing websites compared to their corresponding legitimate target brand websites. This understanding can offer insights into how phishing websites are constructed and into phishing attackers' behaviors when building them. In this paper, we gain a deeper understanding of how client-side resources (especially JavaScript libraries) are used in phishing websites by comparing them with the resources used in the legitimate target websites. For our study, we collect client-side resources from phishing websites and their corresponding legitimate target brand websites over 25 months: 3.4M phishing websites (1.1M distinct phishing domains). Our study reveals that phishing websites tend to employ more diverse JavaScript libraries than their legitimate counterparts do. However, the libraries in phishing websites are older (by nearly 21.2 months) and distinct in comparison. For example, Socket.IO is uniquely used in phishing websites to send victims' information to an external server in real time. Furthermore, we find that a considerable portion of phishing websites still maintain a basic and simplistic structure (e.g., simply displaying a login form or image), while others have significantly evolved to bypass anti-phishing measures. Finally, through HTML structure and style similarities, we can identify the specific target webpages of legitimate brands that phishing attackers reference and mimic in their attacks.
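
A minimal sketch of the kind of client-side inventory such a comparison needs (an assumed approach, not the paper's actual pipeline; the filename pattern is a heuristic): collect script src URLs from a page and guess library names and versions from names like jquery-3.6.0.min.js.

    # Assumed approach, simplified: inventory a page's JavaScript libraries
    # by parsing <script src=...> tags and matching versioned filenames.
    import re
    import requests
    from bs4 import BeautifulSoup

    LIB_RE = re.compile(r"([a-zA-Z.]+)[-.](\d+(?:\.\d+)+)(?:\.min)?\.js$")

    def script_libraries(url: str):
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        libs = []
        for tag in soup.find_all("script", src=True):
            m = LIB_RE.search(tag["src"].split("?")[0])
            if m:
                libs.append((m.group(1), m.group(2)))  # (library, version)
        return libs
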
Abstract:  The sharing of Cyber Threat Intelligence (CTI) across organizations is gaining traction, as it can automate threat analysis and improve security awareness. However, few empirical studies exist on the prevalent types of cybersecurity threat data and their effectiveness in mitigating cyber attacks. We propose a framework named CTI-Lense to collect and analyze the volume, timeliness, coverage, and quality of Structured Threat Information eXpression (STIX) data, a de facto standard CTI format, from a list of publicly available CTI sources. We collected about 6 million STIX data objects from October 31, 2014 to April 10, 2023 from ten data sources and analyzed their characteristics. Our analysis reveals that STIX data sharing has steadily increased in recent years, but the volume of STIX data shared is still too low to cover all cyber threats. Additionally, only a few types of threat data objects have been shared, with malware signatures and URLs accounting for more than 90% of the collected data. While URLs are usually shared promptly, with about 72% of URLs shared earlier than or on the same day as VirusTotal, the sharing of malware signatures is significantly slower. Furthermore, we found that 19% of the threat actor data contained incorrect information, and only 0.09% of the indicator data provided security rules to detect cyber attacks. Based on our findings, we recommend practical considerations for effective and scalable STIX data sharing among organizations.
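
One CTI-Lense-style measurement can be sketched in a few lines (my assumption of the approach, using only the Python standard library): count the STIX object types in a downloaded bundle to see which threat data types dominate.

    # Rough sketch (assumed approach, not CTI-Lense itself): tally STIX 2.x
    # object types in a bundle file to measure coverage by data type.
    import json
    from collections import Counter

    def stix_type_counts(path: str) -> Counter:
        with open(path) as f:
            bundle = json.load(f)
        return Counter(obj.get("type", "unknown")
                       for obj in bundle.get("objects", []))

    # e.g. Counter({'indicator': 9123, 'malware': 411, 'threat-actor': 37})
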
Abstract:  VirusTotal (VT) is a widely used scanning service for researchers and practitioners to label malicious entities and predict new security threats. Unfortunately, end-users know little about how VT URL scanners decide on the maliciousness of entities and the attack types they are involved in (e.g., phishing or malware-hosting websites). In this paper, we conduct a systematic comparative study of VT URL scanners' behavior for different attack types of malicious URLs, in terms of 1) detection specialties, 2) stability, 3) correlations between scanners, and 4) lead/lag behaviors. Our findings highlight that the VT scanners commonly disagree with each other on their detection and attack type classification, leading to challenges in ascertaining the maliciousness of a URL and taking prompt mitigation actions according to different attack types. This motivates us to present a new, highly accurate classifier that helps correctly identify the attack types of malicious URLs at an early stage, which in turn assists practitioners in performing better threat aggregation and choosing proper mitigation actions for different attack types.
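
A hedged sketch of collecting per-scanner verdicts (the endpoint and URL-identifier scheme follow VirusTotal's public v3 API documentation; YOUR_API_KEY is a placeholder):

    # Sketch of pulling per-scanner URL verdicts from VirusTotal's v3 API;
    # the URL id is the unpadded urlsafe-base64 of the URL, per VT's docs.
    import base64
    import requests

    def vt_url_verdicts(url: str, api_key: str) -> dict:
        url_id = base64.urlsafe_b64encode(url.encode()).decode().strip("=")
        resp = requests.get(
            f"https://www.virustotal.com/api/v3/urls/{url_id}",
            headers={"x-apikey": api_key},
            timeout=30,
        )
        resp.raise_for_status()
        results = resp.json()["data"]["attributes"]["last_analysis_results"]
        # e.g. {'ScannerX': 'phishing', 'ScannerY': None, ...}
        return {name: r["result"] for name, r in results.items()}
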
Abstract:  Modern websites rely on various client-side web resources, such as JavaScript libraries, to provide end-users with rich and interactive web experiences. Unfortunately, anecdotal evidence shows that improperly managed client-side resources can open up attack surfaces that adversaries can exploit. However, there is still a lack of comprehensive understanding of web developers' updating practices and of the potential impact of inaccuracies in Common Vulnerabilities and Exposures (CVE) information on the security of the web ecosystem. In this paper, we conduct a longitudinal (four-year) measurement study of the security practices and implications of client-side resources (e.g., JavaScript libraries and Adobe Flash) across the Web. Specifically, we first collect a large-scale dataset of 157.2M webpages of Alexa Top 1M websites over four years in the wild. Analyzing the dataset, we find that an average of 41.2% of websites in each year carry at least one vulnerable client-side resource (e.g., JavaScript or Adobe Flash). We also reveal that vulnerable JavaScript library versions are frequently observed in the wild, suggesting a concerning level of lagging update practices. On average, we observe a window of vulnerability of 531.2 days, affecting 25,337 websites, between the release of a security patch and the update of the unpatched client-side resources. Furthermore, we manually investigate the fidelity of CVE reports on client-side resources, leveraging Proof of Concept (PoC) code. We find that 13 CVE reports (out of 27) have incorrect vulnerable-version information, which may impact security-related tasks such as security updates.
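
The version-lag check can be sketched as a simple comparison (my reconstruction, not the paper's tooling; packaging is a third-party helper library): an observed library version is within the window of vulnerability while it is below the first patched release.

    # Assumed approach, simplified: flag an observed library version as
    # vulnerable if it precedes the first patched release for a CVE.
    from packaging.version import Version

    def is_vulnerable(observed: str, first_patched: str) -> bool:
        return Version(observed) < Version(first_patched)

    # For example, jQuery before 3.5.0 is affected by CVE-2020-11022.
    print(is_vulnerable("3.4.1", "3.5.0"))  # True: inside the window
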
Abstract:  Decompilation is a crucial capability in forensic analysis, facilitating analysis of unknown binaries. The recent rise of Python malware has brought attention to Python decompilers that aim to obtain a source code representation from a Python binary. However, Python decompilers fail to handle various binaries, limiting their capabilities in forensic analysis. This paper proposes a novel solution that transforms a decompilation error-inducing Python binary into a decompilable binary. Our key intuition is that we can resolve the decompilation errors by transforming error-inducing code blocks in the input binary into another form. The core of our approach is the concept of Forensically Equivalent Transformation (FET), which allows non-semantics-preserving transformations in the context of forensic analysis. We carefully define the FETs to minimize their undesirable consequences while fixing various error-inducing instructions that are difficult to resolve when preserving the exact semantics. We evaluate the prototype of our approach with 17,117 real-world Python malware samples that cause decompilation errors in five popular decompilers. It successfully identifies and fixes 77,022 errors. Our approach also handles anti-analysis techniques, including opcode remapping, and helps migrate Python 3.9 binaries to 3.8 binaries.
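
As a small illustration (not the paper's tool), Python's built-in dis module exposes the CPython bytecode that decompilers must reconstruct, which is the level at which error-inducing instruction sequences appear:

    # Illustration only: inspect the bytecode stream a Python decompiler
    # has to turn back into source. The sample function is arbitrary.
    import dis

    def sample(x):
        return [i * x for i in range(3)]

    for ins in dis.get_instructions(sample):
        print(ins.offset, ins.opname, ins.argrepr)
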
Abstract:  Packet stream analysis is essential for the early identification of attack connections while they are in progress, enabling timely responses to protect system resources. However, there are several challenges to implementing effective analysis, including out-of-order packet sequences introduced by network dynamics and class imbalance, with only a small fraction of attack connections available to characterize. To overcome these challenges, we present two deep sequence models: (i) a bidirectional recurrent structure designed for resilience to out-of-order packets, and (ii) a pre-training-enabled sequence-to-sequence structure designed to better deal with unbalanced class distributions using self-supervised learning. We evaluate the presented models using a real network dataset created from month-long real traffic traces collected from backbone links, together with the associated intrusion log. The experimental results support the feasibility of the presented models, with up to 94.8% in F1 score using only the first five packets (k=5), outperforming baseline deep learning models.
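
A minimal sketch of the bidirectional idea (my illustration in PyTorch; the paper's exact architecture and packet features are not reproduced): a bidirectional GRU reads the first k packets in both directions, so the final representation is less sensitive to packet reordering.

    # Illustrative sketch, not the paper's model: a bidirectional GRU over
    # per-packet feature vectors for early attack-connection classification.
    import torch
    import torch.nn as nn

    class PacketSeqClassifier(nn.Module):
        def __init__(self, n_features: int, hidden: int = 64):
            super().__init__()
            self.rnn = nn.GRU(n_features, hidden,
                              batch_first=True, bidirectional=True)
            self.head = nn.Linear(2 * hidden, 1)  # benign vs. attack logit

        def forward(self, x):            # x: (batch, k, n_features)
            _, h = self.rnn(x)           # h: (2, batch, hidden)
            h = torch.cat([h[0], h[1]], dim=1)  # forward + backward states
            return self.head(h)

    model = PacketSeqClassifier(n_features=6)
    logits = model(torch.randn(8, 5, 6))  # 8 connections, first k=5 packets
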
Abstract:  Software systems may contain critical program components such as patented program logic or sensitive data. When those components are reverse-engineered by adversaries, it can cause significant damage (e.g., financial loss or operational failures). While protecting critical program components (e.g., code or data) in software systems is of utmost importance, existing approaches unfortunately have two major weaknesses: (1) they can be reverse-engineered via various program analysis techniques, and (2) when an adversary obtains a legitimate-looking critical program component, he or she can be sure that it is genuine. In this paper, we propose Ambitr, a novel technique that hides critical program components. The core of Ambitr is an Ambiguous Translator that generates the critical program components only when the input is the correct secret key. The translator is ambiguous in that it accepts any input and produces a number of legitimate-looking outputs, making it difficult to know whether an input is the correct secret key or not. The executions of the translator when it processes the correct secret key and any other input are also indistinguishable, making the analysis inconclusive. Our evaluation results show that static, dynamic, and symbolic analysis techniques fail to identify the hidden information in Ambitr. We also demonstrate that manual analysis of Ambitr is extremely challenging.
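
A toy analogue of the ambiguous-translator idea (my illustration, not Ambitr's actual construction): every key decodes to some plausible-looking output through the same execution path, and only the correct key, by assumed construction of the table, selects the real payload.

    # Toy analogue only. Assume the table was laid out so that the correct
    # key's hash lands on the real payload slot; every other key still
    # yields a plausible-looking output.
    import hashlib

    OUTPUTS = [b"config_v1=on", b"retry_limit=3",
               b"cache_ttl=600", b"secret_logic()"]

    def translate(key: bytes) -> bytes:
        # The same instructions run for every key (hash, index, return), so
        # no branch reveals whether the key was correct.
        idx = hashlib.sha256(key).digest()[0] % len(OUTPUTS)
        return OUTPUTS[idx]

    print(translate(b"wrong-key"))  # some legitimate-looking output
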
Abstract:  Server-side malware is one of the prevalent threats that can affect a large number of clients who visit the compromised server. In this paper, we propose Dazzle-attack, a new advanced server-side attack that is resilient to forensic analysis such as reverse-engineering. Dazzle-attack retrieves typical (and non-suspicious) contents from benign and uncompromised websites to avoid detection and to mislead the investigation into erroneously associating the attacks with benign websites. Dazzle-attack leverages a specialized state machine that accepts any input and produces outputs with respect to that input, which substantially enlarges the input-output space and makes reverse-engineering significantly more difficult. We develop a prototype of Dazzle-attack and conduct an empirical evaluation showing that it imposes significant challenges on forensic analysis.
Abstract:  This research explores the possibility of a new anti-analysis technique, carefully designed to attack weaknesses of existing program analysis approaches. It encodes a program code snippet to be hidden, and its decoding process is implemented by a sophisticated state machine that produces multiple outputs depending on its inputs. The key idea of the proposed technique is to decode the program code ambiguously, resulting in multiple decoded code snippets that are challenging to distinguish from each other. Our approach is stealthier than previous similar approaches because its execution does not exhibit different behaviors between correct and incorrect decoding. This paper also presents analyses of the weaknesses of existing techniques and discusses potential improvements. We implement and evaluate the proof-of-concept approach, and our preliminary results show that the proposed technique imposes various new challenges on program analysis techniques.
Abstract:  To provide secure content delivery, Transport Layer Security (TLS) has become a de facto standard over the past couple of decades. However, TLS has a long history of security weaknesses and drawbacks, so its security has been enhanced by addressing security problems through continuous version upgrades. Meanwhile, to provide fast content delivery globally, websites (or origin web servers) need to deploy and administer many machines in globally distributed environments. They often delegate the management of these machines to web hosting services or content delivery networks (CDNs), where the security configurations of distributed servers may vary spatially depending on the managing entities or locations. Based on these spatial differences in TLS security, we find that the security level of TLS connections (and their web services) can be lowered. After collecting the information of (web) domains that exhibit different TLS versions and cryptographic options depending on clients' locations, we show that it is possible to redirect TLS handshake messages to weak TLS servers, which both the origin server and the client may not be aware of. We investigate 7M domains with these spatial differences in security levels in the wild and conduct analyses to better understand the root causes of this phenomenon. We also measure redirection delays at various locations around the world to see whether redirections introduce noticeable delays.
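
A minimal sketch of the per-vantage-point probe (assuming the measurement reduces to recording the negotiated TLS version from different client locations; Python standard library only):

    # Minimal sketch: record the TLS version a server negotiates with this
    # client; running it from several vantage points exposes spatial drift.
    import socket
    import ssl

    def negotiated_tls_version(host: str, port: int = 443) -> str:
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=10) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return tls.version()  # e.g. 'TLSv1.3' or 'TLSv1.2'
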
Abstract:  Phishing attacks are causing substantial damage despite extensive efforts in academia and industry. Recently, a large volume of phishing attacks have transitioned to HTTPS, leveraging TLS certificates issued by Certificate Authorities (CAs), to make the attacks more effective. In this paper, we present a comprehensive study of the security practices of CAs in the HTTPS phishing ecosystem. We focus on the CAs, critical actors under-studied in previous literature, to better understand the importance of CAs' security practices and to thwart proliferating HTTPS phishing. In particular, we first present the current landscape and effectiveness of HTTPS phishing attacks compared to traditional HTTP ones. Then, we conduct an empirical experiment on the CAs' security practices in terms of the issuance and revocation of certificates. Our findings highlight serious conflicts between the expected security practices of CAs and reality, raising significant security concerns. We further validate our findings using a longitudinal dataset of abusive certificates used for real phishing attacks in the wild. We confirm that the security concerns about CAs prevail in the wild and can be one of the main contributors to the recent surge of HTTPS phishing attacks.
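
A hedged sketch of the raw signal behind such a CA study (standard library only; the fields follow Python's ssl.getpeercert() output): record which CA issued a site's certificate and its validity window.

    # Sketch: summarize the issuing CA and validity window of a site's
    # certificate, the kind of raw data a CA-practice study aggregates.
    import socket
    import ssl

    def cert_summary(host: str, port: int = 443) -> dict:
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=10) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
        issuer = dict(rdn[0] for rdn in cert["issuer"])
        return {
            "issuer": issuer.get("organizationName"),
            "not_before": cert["notBefore"],
            "not_after": cert["notAfter"],
        }
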
Abstract:  Transport Layer Security (TLS) has become the norm for secure communication over the Internet. In August 2018, TLS 1.3, the latest version, which improves both the security and the performance of the previous TLS version, was approved. In this paper, we take a closer look at TLS 1.3 deployments in practice regarding adoption rate, security, performance, and implementation by applying temporal, spatial, and platform-based approaches to 687M connections. Overall, TLS 1.3 has been rapidly adopted, mainly due to third-party platforms such as Content Delivery Networks (CDNs) that make a significant contribution to the Internet. TLS 1.3 deprecates vulnerable cryptographic primitives and substantially reduces the time required to perform the full handshake compared to TLS 1.2. We quantify these aspects and show that TLS 1.3 is also beneficial to websites that do not rely on third-party platforms. We further review Common Vulnerabilities and Exposures (CVE) reports for TLS libraries and show that many recent vulnerabilities can easily be addressed by upgrading to TLS 1.3. However, some websites exhibit unstable support for TLS 1.3 due to multiple platforms with different TLS versions or migration to other platforms, meaning that a website may present a lower TLS version at a certain time or from a certain region. Furthermore, we find that most implementations (including TLS libraries) do not fully support the new features of TLS 1.3, such as downgrade protection and certificate extensions.
Abstract:  As the COVID-19 pandemic started triggering widespread lockdowns across the globe, cybercriminals did not hesitate to take advantage of users' increased usage of the Internet and their reliance on it. In this paper, we carry out a comprehensive measurement study of online social engineering attacks in the early months of the pandemic. By collecting, synthesizing, and analyzing DNS records, TLS certificates, phishing URLs, phishing website source code, phishing emails, web traffic to phishing websites, news articles, and government announcements, we track trends of phishing activity between January and May 2020 and seek to understand the key implications of the underlying trends. We find that phishing attack traffic in March and April 2020 skyrocketed up to 220% of its pre-COVID-19 rate, far exceeding typical seasonal spikes. Attackers exploited victims' uncertainty and fear related to the pandemic through a variety of highly targeted scams, including emerging scam types against which current defenses are not sufficient as well as traditional phishing which outpaced the ecosystem's collective response.
Abstract:  Recent measurement studies have highlighted security threats against the code-signing public key infrastructure (PKI), such as certificates that have been compromised or issued directly to malware authors. The primary mechanism for mitigating these threats is to revoke the abusive certificates. However, the distributed yet closed nature of the code-signing PKI makes it difficult to evaluate the effectiveness of revocations in this ecosystem. In consequence, the magnitude of the signed-malware threat is not fully understood. In this paper, we collect seven datasets, including the largest corpus of code-signing certificates, and we combine them to analyze the revocation process from end to end. Effective revocations rely on three roles: (1) discovering the abusive certificates, (2) revoking the certificates effectively, and (3) disseminating the revocation information to clients. We assess the challenge of discovering compromised certificates and the subsequent revocation delays. We show that erroneously setting revocation dates causes signed malware to remain valid even after the certificate has been revoked. We also report failures in disseminating the revocations, leading clients to continue trusting the revoked certificates.
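
The revocation-dating pitfall can be illustrated with hypothetical dates (a simplified model of Authenticode-style checking, not a real verifier): a signature timestamped before the certificate's revocation date is still treated as valid.

    # Simplified illustration with hypothetical dates: signatures made
    # before the chosen revocation date keep verifying, so an erroneously
    # late revocation date leaves signed malware valid.
    from datetime import datetime

    signing_time = datetime(2015, 3, 1)      # timestamp on the signed malware
    revocation_date = datetime(2015, 6, 15)  # date set when the cert was revoked

    still_valid = signing_time < revocation_date
    print(still_valid)  # True -> the revocation did not neutralize this sample
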
Abstract:  Digitally signed malware can bypass system protection mechanisms that install or launch only programs with valid signatures. It can also evade anti-virus programs, which often forego scanning signed binaries. Known from advanced threats such as Stuxnet and Flame, this type of abuse has not been measured systematically in the broader malware landscape. In particular, the methods, effectiveness window, and security implications of code-signing PKI abuse are not well understood. We propose a threat model that highlights three types of weaknesses in the code-signing PKI. We overcome challenges specific to code-signing measurements by introducing techniques for prioritizing the collection of code-signing certificates that are likely abusive. We also introduce an algorithm for distinguishing among different types of threats. These techniques allow us to study threats that breach the trust encoded in the Windows code-signing PKI. The threats include stealing the private keys associated with benign certificates and using them to sign malware, or impersonating legitimate companies that do not develop software and, hence, do not own code-signing certificates. Finally, we discuss the actionable implications of our findings and propose concrete steps for improving the security of the code-signing ecosystem.
Abstract:  Potentially dangerous cryptography errors are well documented in many applications. Conventional wisdom suggests that many of these errors are caused by cryptographic Application Programming Interfaces (APIs) that are too complicated, have insecure defaults, or are poorly documented. To address this problem, researchers have created several cryptographic libraries that they claim are more usable; however, none of these libraries have been empirically evaluated for their ability to promote more secure development. This paper is the first to examine both how and why the design and resulting usability of different cryptographic libraries affects the security of code written with them, with the goal of understanding how to build effective future libraries. We conducted a controlled experiment in which 256 Python developers recruited from GitHub attempted common tasks involving symmetric and asymmetric cryptography using one of five different APIs. We examined their resulting code for functional correctness and security, and compared their results to their self-reported sentiment about their assigned library. Our results suggest that while APIs designed for simplicity can provide security benefits (reducing the decision space, as expected, prevents the choice of insecure parameters), simplicity is not enough. Poor documentation, missing code examples, and a lack of auxiliary features such as secure key storage caused even participants assigned to simplified libraries to struggle with both basic functional correctness and security. Surprisingly, the availability of comprehensive documentation and easy-to-use code examples seems to compensate for more complicated APIs in terms of functionally correct results and participant reactions; however, this did not extend to security results. We find it particularly concerning that for about 20% of functionally correct tasks, across libraries, participants believed their code was secure when it was not. Our results suggest that new cryptographic libraries that want to promote effective security should offer a simple, convenient interface, but this is not enough: they should also, and perhaps more importantly, ensure support for a broad range of common tasks and provide accessible documentation with secure, easy-to-use code examples.
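
One concrete instance of a "simple" API in the sense studied here is pyca cryptography's Fernet recipe, which leaves no insecure parameters to choose (my example for illustration, not a snippet from the study):

    # Example of a minimal-decision-space crypto API: Fernet picks the
    # cipher, mode, IV, and MAC for the caller.
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()           # 32-byte url-safe key, chosen for you
    f = Fernet(key)
    token = f.encrypt(b"attack at dawn")  # AES-CBC + HMAC, fresh IV, implicit
    assert f.decrypt(token) == b"attack at dawn"
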
Abstract:  Many critical communications now take place digitally, but recent revelations demonstrate that these communications can often be intercepted. To achieve true message privacy, users need end-to-end message encryption, in which the communications service provider is not able to decrypt the content. Historically, end-to-end encryption has proven extremely difficult for people to use correctly, but recently tools like Apple's iMessage and Google's End-to-End have made it more broadly accessible by using key-directory services. These tools (and others like them) sacrifice some security properties for convenience, which alarms some security experts, but little is known about how average users evaluate these tradeoffs. In a 52-person interview study, we asked participants to complete encryption tasks using both a traditional key-exchange model and a key-directory-based registration model. We also described the security properties of each (varying the order of presentation) and asked participants for their opinions. We found that participants understood the two models well and made coherent assessments about when different tradeoffs might be appropriate. Our participants recognized that the less-convenient exchange model was more secure overall, but found the security of the registration model to be “good enough” for many everyday purposes.
Abstract:  Vulnerabilities in Android code (including but not limited to insecure data storage, unprotected inter-component communication, broken TLS implementations, and violations of least privilege) have enabled real-world privacy leaks and motivated research cataloguing their prevalence and impact. Researchers have speculated that appification promotes security problems, as it increasingly allows inexperienced laymen to develop complex and sensitive apps. Anecdotally, Internet resources such as Stack Overflow are blamed for promoting insecure solutions that are naively copy-pasted by inexperienced developers. In this paper, we for the first time systematically analyze how the use of information resources impacts code security. We first surveyed 295 app developers who have published in the Google Play market concerning how they use resources to solve security-related problems. Based on the survey results, we conducted a lab study with 54 Android developers (students and professionals), in which participants wrote security- and privacy-relevant code under time constraints. The participants were assigned to one of four conditions: free choice of resources, Stack Overflow only, official Android documentation only, or books only. Those participants who were allowed to use only Stack Overflow produced significantly less secure code than those using the official Android documentation or books, while participants using the official Android documentation produced significantly less functional code than those using Stack Overflow. To assess the quality of Stack Overflow as a resource, we surveyed the 139 threads our participants accessed during the study, finding that only 25% of them were helpful in solving the assigned tasks and only 17% of them contained secure code snippets. In order to obtain ground truth concerning the prevalence of the secure and insecure code our participants wrote in the lab study, we statically analyzed a random sample of 200,000 apps from Google Play, finding that 93.6% of the apps used at least one of the API calls our participants used during our study. We also found that many of the security errors made by our participants appear in the wild, possibly also originating in the use of Stack Overflow to solve programming problems. Taken together, our results confirm that official API documentation is secure but hard to use, while informal documentation such as Stack Overflow is more accessible but often leads to insecurity. Given time constraints and economic pressures, we can expect that Android developers will continue to choose the resources that are easiest to use; therefore, our results firmly establish the need for secure-but-usable documentation.

Workshops

Abstract:  Password-based authentication is one of the most commonly adopted mechanisms for online security. Choosing strong passwords is crucial for protecting one's digital identities and assets, as weak passwords can be readily guessed, resulting in a compromise such as unauthorized access. To promote the use of strong passwords on the Web, the National Institute of Standards and Technology (NIST) provides website administrators with password composition policy (PCP) guidelines. We manually inspect popular websites to check whether their password policies conform to NIST's PCP guidelines by generating passwords that meet each criterion and testing them against 100 popular websites. Our findings reveal that a considerable number of websites (on average, 53.5%) do not comply with the guidelines, which could result in password breaches.
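
A minimal sketch of one conformance probe (criteria paraphrased from NIST SP 800-63B; the probe passwords are illustrative and the check is deliberately simplified): a compliant verifier must accept any password of at least eight characters, including spaces and symbols.

    # Simplified probe: passwords a NIST-compliant site must accept.
    # SP 800-63B requires a minimum of 8 characters and says verifiers
    # should allow arbitrary printable characters, including spaces.
    def nist_acceptable(password: str) -> bool:
        return len(password) >= 8

    probes = [
        "correct horse battery",  # long passphrase with spaces: must be accepted
        "aaaaaaaa",               # meets the length floor
        "Tr0ub4d&3",              # symbols must not be rejected
    ]
    for p in probes:
        print(p, nist_acceptable(p))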