Total: 54
Seunghoon Woo, Eunjin Choi, Heejo Lee, "A large-scale analysis of the effectiveness of publicly reported security patches", Computers & Security, Vol. 148, Jan. 1. 2025. | |
Public vulnerability reports assist developers in mitigating recurring threats caused by software vulnerabilities. However, security patches that lack effectiveness (1) may fail to completely resolve target vulnerabilities after application (i.e., require supplementary patches), or (2) cannot be directly applied to the codebase without modifying the patch code snippets. In this study, we systematically assessed the effectiveness of security patches from the perspective of their reliability and flexibility. We define a security patch as reliable or flexible, respectively, if it can resolve the vulnerability (1) without being complemented by additional patches or (2) without modifying the patch code snippets. Unlike previous studies that relied on manual inspection, we assess the reliability of a security patch by determining the presence of supplementary patches that complement the security patch. To evaluate flexibility, we first locate vulnerable codes in popular open-source software programs and then determine whether the security patch can be applied without any modifications. Our experiments on 8,100 security patches obtained from the National Vulnerability Database confirmed that one in ten of the collected patches lacked effectiveness. We discovered 476 (5.9%) unreliable patches that could still produce security issues after application; for 84.6% of the detected unreliable patches, the fact that a supplementary patch is required is not disclosed through public security reports. Furthermore, 377 (4.6%) security patches were observed to lack flexibility; we confirmed that 49.1% of the detected vulnerable codes required patch modifications owing to syntax diversity. Our findings revealed that the effectiveness of security patches can directly affect software security, suggesting the need to enhance the vulnerability reporting process.
|
|
Choongin Lee, Isa Jafarov, Sven Dietrich, Heejo Lee, "PRETT2: Discovering HTTP/2 DoS Vulnerabilities via Protocol Reverse Engineering", European Symposium on Research in Computer Security (ESORICS), Sep. 16. 2024. | |
HTTP/2, enhancing data transmission speed over HTTP/1.1 with features such as flow control for stream multiplexing, has seen widespread adoption across major web servers. This has exposed numerous vulnerabilities, with denial of service (DoS) particularly prominent due to flawed flow control implementations. Identifying potential weaknesses in the flow control across various HTTP/2 implementations has largely depended on manual inspection. However, the behavioral diversity among implementations poses significant challenges for testing.
In this study, we propose PRETT2, a stateful fuzzing framework targeting denial-of-service (DoS) vulnerabilities in HTTP/2 protocols. Utilizing automated protocol reverse engineering, PRETT2 infers state machines unique to various HTTP/2 implementations. It then executes multiplexed fuzzing that manipulates flow control messages based on the identified state machines. Testing on servers such as Apache and Nginx revealed the capability of PRETT2 to infer multiple state machine types and uncover security vulnerabilities, including CVE-2023-43622 in Apache. This highlights the effectiveness of PRETT2 in identifying and addressing critical security vulnerabilities in HTTP/2.
|
|
Pyeongju Ahn, Yeonseok Jang, Seunghoon Woo, Heejo Lee, "BLOOMFUZZ: Unveiling Bluetooth L2CAP Vulnerabilities via State Cluster Fuzzing with Target-Oriented State Machines", European Symposium on Research in Computer Security (ESORICS), Sep. 16. 2024. | |
Bluetooth technologies are widely utilized across various devices. Despite the advantages, the lack of security in Bluetooth can pose critical threats. Existing approaches that rely solely on Bluetooth specification have failed to bridge the gap between documentation and implemented devices. Therefore, they struggle to (1) precisely generate state machines for target devices and (2) accurately track states during the fuzzing process, resulting in low fuzzing efficiency. In this paper, we propose BloomFuzz, a stateful fuzzer to discover vulnerabilities in Bluetooth Logical Link Control and Adaptation Protocol (L2CAP) layer. Utilizing the concept of the state cluster, which is a set of one or more states with similar attributes, BloomFuzz can generate a target-oriented state machine by pruning unimplemented states (missing states) and addressing states that are implemented but not introduced in the specification (hidden states). Furthermore, BloomFuzz enhances fuzzing efficiency by generating valid test packets for each cluster via cluster-based state machine tracking. When we applied BloomFuzz to real-world Bluetooth devices, we observed that BloomFuzz outperformed existing L2CAP fuzzers by (1) discovering 56 potential vulnerabilities (more than twice compared to existing fuzzers), (2) precisely generating a target-oriented state machine, (3) significantly reducing the probability of test packets being rejected (from 76% to 23%), and (4) producing nine times more valid malformed test packets. Our proposed approach can contribute to preventing threats within L2CAP, thereby rendering a secure Bluetooth environment.
|
|
Heejo Lee, Yoonjong Na, "METHOD AND APPARATUS FOR DETERMINING DEPENDENCIES BETWEEN COMPONENTS OF REUSED OPEN SOURCE SOFTWARE", Domestic, Application, 10-2024-0113268, Aug. 23. 2024. | |
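Korean-language record: Heejo Lee, Yoonjong Na, "Method and Apparatus for Determining Dependencies between Components of Reused Open-Source Software", Republic of Korea, application, 10-2024-0113268, Aug. 23. 2024.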
|
|
Yoonjong Na, Seunghoon Woo, Joomyeong Lee, Heejo Lee, "CNEPS: A Precise Approach for Examining Dependencies among Third-Party C/C++ Open-Source Components", IEEE/ACM Int'l Conf. on Software Engineering (ICSE), Apr. 19. 2024. | |
The rise in open-source software (OSS) reuse has led to intricate dependencies among third-party components, increasing the demand for precise dependency analysis. However, owing to the presence of reused files whose originating components are difficult to identify (i.e., indistinguishable files) and duplicated components, precisely identifying component dependencies is becoming challenging.
In this paper, we present CNEPS, a precise approach for examining dependencies in reused C/C++ OSS components. The key idea of CNEPS is to use a novel granularity called a module, which represents a minimum unit (i.e., set of source files) that can be reused as a library from another project. By examining dependencies based on modules instead of analyzing single reused files, CNEPS can precisely identify dependencies in the target projects, even in the presence of indistinguishable files. To differentiate duplicated components, CNEPS examines the cloned paths and originating projects of each component, enabling precise identification of dependencies associated with them. Experimental results on the top 100 C/C++ software projects show that CNEPS outperforms a state-of-the-art approach by identifying twice as many dependencies. CNEPS could identify 435 dependencies with 89.9% precision and 93.2% recall in less than 10 seconds per application on average, whereas the existing approach achieved only 63.5% precision and 42.5% recall.
|
|
Seunghoon Woo, Eunjin Choi, Heejo Lee, Hakjoo Oh, "V1SCAN: Discovering 1-day Vulnerabilities in Reused C/C++ Open-source Software Components Using Code Classification Techniques", USENIX Security Symposium, pp. 6541-6556, Aug. 9. 2023. | |
We present V1SCAN, an effective approach for discovering 1-day vulnerabilities in reused C/C++ open-source software (OSS) components. Reusing third-party OSS has many benefits, but can put the entire software at risk owing to the vulnerabilities they propagate. In mitigation, several techniques for detecting propagated vulnerabilities, which can be classified into version- and code-based approaches, have been proposed. However, state-of-the-art techniques unfortunately produce many false positives or negatives when OSS projects are reused with code modifications. In this paper, we show that these limitations can be addressed by improving version- and code-based approaches and synergistically combining them. By classifying reused code from OSS components, V1SCAN only considers vulnerabilities contained in the target program and filters out unused vulnerable code, thereby reducing false alarms produced by version-based approaches. V1SCAN improves the coverage of code-based approaches by classifying vulnerable code and then detecting vulnerabilities propagated with code changes in various code locations. Evaluation on popular C/C++ software from GitHub showed that V1SCAN outperformed state-of-the-art vulnerability detection approaches by discovering 50% more vulnerabilities than they detected. In addition, V1SCAN reduced the false positive rate of the simple integration of existing version- and code-based approaches from 71% to 4% and the false negative rate from 33% to 7%. With V1SCAN, developers can detect propagated vulnerabilities with high accuracy, maintaining a secure software supply chain.
|
|
Hyunji Hong, Seunghoon Woo, Sunghan Park, Jeongwook Lee, Heejo Lee, "CIRCUIT: A JavaScript Memory Heap-Based Approach for Precisely Detecting Cryptojacking Websites", IEEE Access, Sep. 6. 2022. | |
Cryptojacking is often used by attackers as a means of gaining profits by exploiting users’ resources without their consent, despite the anticipated positive effect of browser-based cryptomining. Previous approaches have attempted to detect cryptojacking websites, but they have the following limitations: (1) they failed to detect several cryptojacking websites either because of their evasion techniques or because they cannot detect JavaScript-based cryptojacking and (2) they yielded several false alarms by focusing only on limited characteristics of cryptojacking, such as counting computer resources. In this paper, we propose CIRCUIT, a precise approach for detecting cryptojacking websites. We primarily focus on the JavaScript memory heap, which is resilient to script code obfuscation and provides information about the objects declared in the script code and their reference relations. We then extract a reference flow that can represent the script code behavior of the website from the JavaScript memory heap. Hence, CIRCUIT determines that a website is running cryptojacking if it contains a reference flow for cryptojacking. In our experiments, we found 1,813 real-world cryptojacking websites among 300K popular websites. Moreover, we provided new insights into cryptojacking by modeling the identified evasion techniques and considering the fact that characteristics of cryptojacking websites now appear on normal websites as well.
|
|
Hyunji Hong, Seunghoon Woo, Eunjin Choi, Jihyun Choi, Heejo Lee, "xVDB: A High-Coverage Approach for Constructing a Vulnerability Database", IEEE Access, Aug. 10. 2022. | |
Security patches play an important role in detecting and fixing one-day vulnerabilities. However, collecting abundant security patches from diverse data sources is not a simple task. This is because (1) each data source provides vulnerability information in a different way and (2) many security patches cannot be directly collected from Common Vulnerabilities and Exposures (CVE) information (e.g., National Vulnerability Database (NVD) references). In this paper, we propose a high-coverage approach that collects known security patches by tracking multiple data sources. Specifically, we considered the following three data sources: repositories (e.g., GitHub), issue trackers (e.g., Bugzilla), and Q&A sites (e.g., Stack Overflow). From the data sources, we gather even security patches that cannot be collected by considering only CVE information (i.e., previously untracked security patches). In our experiments, we collected 12,432 CVE patches from repositories and issue trackers, and 12,458 insecure posts from Q&A sites. We could collect at least four times more CVE patches than those collected in existing approaches, which demonstrates the efficacy of our approach. The collected security patches serve as a database on a public website (i.e., IoTcube) to proceed with the detection of vulnerable code clones.
|
|
Seunghoon Woo, Hyunji Hong, Eunjin Choi, Heejo Lee, "MOVERY: A Precise Approach for Modified Vulnerable Code Clone Discovery from Modified Open-Source Software Components", USENIX Security Symposium, Aug. 10. 2022. | |
Vulnerabilities inherited from third-party open-source software (OSS) components can compromise the entire software security. However, discovering propagated vulnerable code is challenging as it proliferates with various code syntaxes owing to the OSS modifications, more specifically, internal (e.g., OSS updates) and external modifications of OSS (e.g., code changes that occur during the OSS reuse). In this paper, we present MOVERY, a precise approach for discovering vulnerable code clones (VCCs) from modified OSS components. By considering the oldest vulnerable function and extracting only core vulnerable and patch lines from security patches, MOVERY generates vulnerability and patch signatures that effectively address OSS modifications. For scalability, MOVERY reduces the search space of the target software by focusing only on the codes borrowed from other OSS projects. Finally, MOVERY determines that a function is a VCC when it matches the vulnerability signature and is distinct from the patch signature. When we applied MOVERY to ten popular software projects selected from diverse domains, we observed that 91% of the discovered VCCs had different code syntax from the disclosed vulnerable function. Nonetheless, MOVERY discovered at least 2.5 times more VCCs than existing techniques, with much higher accuracy: MOVERY discovered 415 VCCs with 96% precision and 96% recall, whereas two recent VCC discovery techniques, which hardly consider internal and external OSS modifications, discovered only 163 and 72 VCCs with at most 77% precision and 38% recall.
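The final decision rule described above (a function is a VCC when it matches the vulnerability signature and differs from the patch signature) can be illustrated with a minimal Python sketch. This is not MOVERY's implementation: the function name `is_vulnerable_clone`, the representation of signatures as sets of stripped source lines, and the "all vulnerable lines present, no patch line present" criterion are assumptions made here for illustration only.

```python
def is_vulnerable_clone(function_lines, vuln_signature, patch_signature):
    """Hypothetical decision rule: flag a function when it contains every
    line of the vulnerability signature and none of the patch signature."""
    # Compare on whitespace-stripped, non-empty lines (an assumed normalization).
    lines = {line.strip() for line in function_lines if line.strip()}
    has_vulnerable_code = vuln_signature <= lines
    has_patch_code = bool(patch_signature & lines)
    return has_vulnerable_code and not has_patch_code


# Hypothetical signatures for an unchecked copy and its fix.
vuln_sig = {"memcpy(dst, src, len);"}
patch_sig = {"if (len > cap) return -1;"}

unpatched = [
    "int copy(char *dst, char *src, int len, int cap) {",
    "    memcpy(dst, src, len);",
    "    return 0;",
    "}",
]
patched = unpatched[:1] + ["    if (len > cap) return -1;"] + unpatched[1:]

print(is_vulnerable_clone(unpatched, vuln_sig, patch_sig))  # True
print(is_vulnerable_clone(patched, vuln_sig, patch_sig))    # False
```

Checking against the patch signature is what keeps an already-fixed function, which still contains the vulnerable-looking line, from being reported.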
|
|
Haram Park, Carlos Kayembe Nkuba, Seunghoon Woo, Heejo Lee, "L2Fuzz: Discovering Bluetooth L2CAP Vulnerabilities Using Stateful Fuzz Testing", Annual IEEE/IFIP Int'l Conf. on Dependable Systems and Networks (DSN), Jun. 27. 2022. | |
Bluetooth Basic Rate/Enhanced Data Rate (BR/EDR) is a wireless technology used in billions of devices. Recently, several Bluetooth fuzzing studies have been conducted to detect vulnerabilities in Bluetooth devices, but they fall short of effectively generating malformed packets. In this paper, we propose L2FUZZ, a stateful fuzzer to detect vulnerabilities in the Bluetooth BR/EDR Logical Link Control and Adaptation Protocol (L2CAP) layer. By selecting valid commands for each state and mutating only the core fields of packets, L2FUZZ can generate valid malformed packets that are less likely to be rejected by the target device. Our experimental results confirmed that: (1) L2FUZZ generates up to 46 times more malformed packets with a much lower packet rejection ratio than existing techniques, and (2) L2FUZZ detected five zero-day vulnerabilities from eight real-world Bluetooth devices.
|
|
Hyunji Hong, Seunghoon Woo, Heejo Lee, "Dicos: Discovering Insecure Code Snippets from Stack Overflow Posts by Leveraging User Discussions", Annual Computer Security Applications Conf. (ACSAC), pp. 194–206, Dec. 6. 2021. | |
Online Q&A fora such as Stack Overflow help developers solve the coding problems they face. Despite the advantages, Stack Overflow has the potential to provide insecure code snippets that, if reused, can compromise the security of the entire software. We present Dicos, an accurate approach for discovering insecure code snippets by examining the change history of Stack Overflow posts. When a security issue is detected in a post, the insecure code is fixed to be safe through user discussions, leaving a change history. Inspired by this process, Dicos first extracts the change history from the Stack Overflow post, and then analyzes whether the history contains security patches, by utilizing pre-selected features that can effectively identify security patches. Finally, when such changes are detected, Dicos determines that the code snippet before applying the security patch is insecure. To evaluate Dicos, we collected 1,958,283 Stack Overflow posts tagged with C, C++, and Android. When we applied Dicos to the collected posts, it discovered 12,458 insecure posts (i.e., 14,719 insecure code snippets) with 91% precision and 93% recall. We further confirmed that the latest versions of 151 out of 2,000 popular C/C++ open-source software contain at least one insecure code snippet taken from Stack Overflow and discovered by Dicos. Our proposed approach, Dicos, can contribute to preventing further propagation of insecure codes and thus creating a safe code reuse environment.
|
|
Seunghoon Woo, Dongwook Lee, Sunghan Park, Heejo Lee, Sven Dietrich, "V0Finder: Discovering the Correct Origin of Publicly Reported Software Vulnerabilities", USENIX Security Symposium, pp. 3041-3058, Aug. 11. 2021. | |
Common Vulnerabilities and Exposures (CVEs) are used to ensure confidence among developers, to share information about software vulnerabilities, and to provide a baseline for security measures. Therefore, the correctness of CVE reports is crucial for detecting and patching software vulnerabilities. In this paper, we introduce the concept of “Vulnerability Zero” (VZ), the software where a vulnerability first originated. We then present V0Finder, a precise mechanism for discovering the VZ of a vulnerability, including software name and its version. V0Finder utilizes code-based analysis to identify reuse relations, which specify the direction of vulnerability propagation, among vulnerable software. V0Finder constructs a graph from all the identified directions and traces backward to the root of that graph to find the VZ. We applied V0Finder to 5,671 CVE vulnerabilities collected from the National Vulnerability Database (NVD) and popular Bugzilla-based projects. V0Finder discovered VZs with high accuracy of 98% precision and 95% recall. Furthermore, V0Finder identified 96 CVEs with incorrect information related to their respective VZs. We confirmed that the incorrect VZ causes prolonged patch updates of vulnerable software; the patch update of CVEs with the incorrect VZ information takes 2 years, while the patch update of CVEs with the correct VZ takes less than a year on average. Such incorrectly identified VZ hinders the objective of the CVE and causes confusion rather than “ensuring confidence” among developers. Our analysis shows that V0Finder can enhance the credibility of information provided by the CVEs.
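The backward trace over the propagation graph can be sketched as follows. This is only an illustration of walking reuse directions back to a root, not V0Finder's code: the function name `trace_to_root`, the edge representation `(source, reuser)`, and the software names in the example are all hypothetical.

```python
def trace_to_root(start, reuse_edges):
    """Walk reuse directions backward from `start` and return the nodes
    that have no predecessor (the candidate origins).

    `reuse_edges` holds (source, reuser) pairs, meaning the vulnerable
    code propagated from `source` into `reuser`.
    """
    parents = {}
    for source, reuser in reuse_edges:
        parents.setdefault(reuser, set()).add(source)

    roots, stack, seen = set(), [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guards against revisiting nodes
        seen.add(node)
        if parents.get(node):
            stack.extend(parents[node])
        else:
            roots.add(node)  # no incoming edge: nothing to trace further
    return roots


# Hypothetical propagation: libfoo -> appA -> appB, and libfoo -> appC.
edges = [
    ("libfoo 1.0", "appA 2.3"),
    ("appA 2.3", "appB 0.9"),
    ("libfoo 1.0", "appC 4.1"),
]
print(trace_to_root("appB 0.9", edges))  # {'libfoo 1.0'}
```

In this toy graph every affected program traces back to the same root, which is the situation in which a single VZ can be named.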
|
|
Hajin Jang, Kyeongseok Yang, Geonwoo Lee, Jeremy D. Seideman, Shoufu Luo, Sven Dietrich, Heejo Lee, "QuickBCC: Quick and Scalable Binary Vulnerable Code Clone Detection", IFIP Int'l Conf. on ICT Systems Security and Privacy Protection (IFIP SEC), pp. 66-82, Jun. 22. 2021. | |
Due to code reuse among software packages, vulnerabilities can propagate from one software package to another. Current code clone detection techniques are useful for preventing and managing such vulnerability propagation. When the source code for a software package is not available, such as when working with proprietary or custom software distributions, binary code clone detection can be used to examine software for flaws. However, existing binary code clone detectors have scalability issues, or are limited in their accurate detection of vulnerable code clones. In this paper, we introduce QuickBCC, a scalable binary code clone detection framework designed for vulnerability scanning. The framework was built on the idea of extracting semantics from vulnerable binaries both before and after security patches, and comparing them to target binaries. In order to improve performance, we created a signature based on the changes between the pre- and post-patched binaries, and implemented a filtering process when comparing the signatures to the target binaries. In addition, we leverage the smallest semantic unit, a strand, to improve accuracy and robustness against compile environments. QuickBCC is highly optimized, capable of preprocessing 5,439 target binaries within 111 min, and is able to match those binaries against 6 signatures in 23 s when running as a multi-threaded application. QuickBCC takes, on average, 3 ms to match one target binary. Comparing QuickBCC with other approaches, we found that it outperformed them in terms of performance when detecting well-known vulnerabilities, with an acceptable level of accuracy.
|
|
Seongkyeong Kwon, Seunghoon Woo, Gangmo Seong, Heejo Lee, "OCTOPOCS: Automatic Verification of Propagated Vulnerable Code Using Reformed Proofs of Concept", Annual IEEE/IFIP Int'l Conf. on Dependable Systems and Networks (DSN), pp. 174-185, Jun. 21. 2021. | |
Addressing vulnerability propagation has become a major issue in software ecosystems. Existing approaches hold the promise of detecting widespread vulnerabilities but cannot be applied to verify effectively whether propagated vulnerable code still poses threats. We present OCTOPOCS, which uses a reformed Proof-of-Concept (PoC), to verify whether a vulnerability is propagated. Using context-aware taint analysis, OCTOPOCS extracts crash primitives (the parts used in the shared code area between the original vulnerable software and propagated software) from the original PoC. OCTOPOCS then utilizes directed symbolic execution to generate guiding inputs that direct the execution of the propagated software from the entry point to the shared code area. Thereafter, OCTOPOCS creates a new PoC by combining crash primitives and guiding inputs. It finally verifies the propagated vulnerability using the created PoC. We evaluated OCTOPOCS with 15 real-world C and C++ vulnerable software pairs, with results showing that OCTOPOCS successfully verified 14 propagated vulnerabilities.
|
|
Hyok An, Yoonjong Na, Heejo Lee, Adrian Perrig, "Resilience Evaluation of Multi-Path Routing against Network Attacks and Failures", Electronics, Vol. 10, No. 1240, May. 24. 2021. | |
The current state of security and availability of the Internet is far from being commensurate with its importance. The number and strength of DDoS attacks conducted at the network layer have been steadily increasing. However, the single path (SP) routing used in today’s Internet lacks a mitigation scheme to rapidly recover from network attacks or link failure. If a link failure occurs, it can take several minutes until failover. In contrast, multi-path routing can take advantage of multiple alternative paths and rapidly switch to another working path. According to the level of available path control, we classify multi-path routing into two types, first-hop multi-path (FMP) and multi-hop multi-path (MMP) routing. Although FMP routing supported by networks, such as SD-WAN, shows marginal improvements over the current SP routing of the Internet, MMP routing supported by a global Internet architecture provides strong improvement under network attacks and link failure. MMP routing enables changing to alternate paths to mitigate the network problem in other hops, which cannot be controlled by FMP routing. To show this comparison with practical outcome, we evaluate network performance in terms of latency and loss rate to show that MMP routing can mitigate Internet hazards and provide high availability on global networks spanning 18 participating ASes in six countries. Our evaluation of global networks shows that, if network attacks or failures occur in other autonomous systems (ASes) that FMP routing cannot avoid, it is feasible to deal with such problems by switching to alternative paths by using MMP routing. When the global evaluation is under a transit-link DDoS attack, the loss rates of FMP paths that pass through the transit link are significantly affected, whereas the alternative MMP paths remain stable and operate properly under the attack.
|
|
Seunghoon Woo, Sunghan Park, Seulbae Kim, Hakjoo Oh, Heejo Lee, "CENTRIS: A Precise and Scalable Approach for Identifying Modified Open-Source Software Reuse", IEEE/ACM Int'l Conf. on Software Engineering (ICSE), pp. 860-872, May. 22. 2021. | |
Open-source software (OSS) is widely reused as it provides convenience and efficiency in software development. Despite evident benefits, unmanaged OSS components can introduce threats, such as vulnerability propagation and license violation. Unfortunately, however, identifying reused OSS components is a challenge as the reused OSS is predominantly modified and nested. In this paper, we propose CENTRIS, a precise and scalable approach for identifying modified OSS reuse. By segmenting an OSS code base and detecting the reuse of a unique part of the OSS only, CENTRIS is capable of precisely identifying modified OSS reuse in the presence of nested OSS components. For scalability, CENTRIS eliminates redundant code comparisons and accelerates the search using hash functions. When we applied CENTRIS on 10,241 widely-employed GitHub projects, comprising 229,326 versions and 80 billion lines of code, we observed that modified OSS reuse is a norm in software development, occurring 20 times more frequently than exact reuse. Nonetheless, CENTRIS identified reused OSS components with 91% precision and 94% recall in less than a minute per application on average, whereas a recent clone detection technique, which does not take into account modified and nested OSS reuse, hardly reached 10% precision and 40% recall.
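The idea of matching only the unique part of each OSS can be illustrated with a small Python sketch. It is a deliberate simplification and not CENTRIS itself: here a function counts as "unique" if no other component in the database contains it, the hash is SHA-256, and the reuse threshold of 0.1 is an arbitrary value chosen for the example. All names (`function_hashes`, `unique_part`, `reused_components`) are hypothetical.

```python
import hashlib


def function_hashes(functions):
    """Hash each function body so comparisons become set operations."""
    return {hashlib.sha256(f.encode("utf-8")).hexdigest() for f in functions}


def unique_part(name, components):
    """Hashes found in `name` and in no other component (assumed definition)."""
    others = set().union(*(h for n, h in components.items() if n != name))
    return components[name] - others


def reused_components(target, components, threshold=0.1):
    """Report components whose unique part overlaps the target enough."""
    found = []
    for name in components:
        unique = unique_part(name, components)
        if unique and len(unique & target) / len(unique) >= threshold:
            found.append(name)
    return found


# Hypothetical database: libB also ships the function "shared()".
components = {
    "libA": function_hashes(["a1()", "a2()", "shared()"]),
    "libB": function_hashes(["b1()", "b2()", "shared()"]),
}
# The target copied a1() and shared() from libA and added its own code.
target = function_hashes(["a1()", "shared()", "main()"])

print(reused_components(target, components))  # ['libA']
```

Because `shared()` is excluded from both unique parts, its presence in the target does not cause libB to be reported, which is the kind of false alarm that nested components would otherwise produce.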
|
|
Sunbeom So, Myungho Lee, Jisu Park, Heejo Lee, Hakjoo Oh, "VeriSmart: A Highly Precise Safety Verifier for Ethereum Smart Contracts", IEEE Symposium on Security and Privacy, May. 18. 2020. | |
We present VERISMART, a highly precise verifier for ensuring arithmetic safety of Ethereum smart contracts. Writing safe smart contracts without unintended behavior is critically important because smart contracts are immutable and even a single flaw can cause huge financial damage. In particular, ensuring that arithmetic operations are safe is one of the most important and common security concerns of Ethereum smart contracts nowadays. In response, several safety analyzers have been proposed over the past few years, but state-of-the-art is still unsatisfactory; no existing tools achieve high precision and recall at the same time, inherently limited to producing annoying false alarms or missing critical bugs. By contrast, VERISMART aims for an uncompromising analyzer that performs exhaustive verification without compromising precision or scalability, thereby greatly reducing the burden of manually checking undiscovered or incorrectly-reported issues. To achieve this goal, we present a new domain-specific algorithm for verifying smart contracts, which is able to automatically discover and leverage transaction invariants that are essential for precisely analyzing smart contracts. Evaluation with real-world smart contracts shows that VERISMART can detect all arithmetic bugs with a negligible number of false alarms, far outperforming existing analyzers.
|
|
Seulbae Kim, Heejo Lee, "Software systems at risk: an empirical study of cloned vulnerabilities in practice", Computers & Security, Vol. 77, pp. 720-736, Aug. 1. 2018. | |
With the growth of open source software (OSS), code clones - code fragments that are copied and pasted within or between software systems - are proliferating. Although code cloning may expedite the process of software development, it often critically affects the security of software because vulnerabilities and bugs can easily be propagated through code clones. These vulnerable code clones are increasing in conjunction with the growth of OSS, potentially contaminating many systems. Although researchers have attempted to detect code clones for decades, most of these attempts fail to scale to the size of the ever-growing OSS code base. The lack of scalability prevents software developers from readily managing code clones and associated vulnerabilities. Moreover, most existing clone detection techniques focus overly on merely detecting clones and this impairs their ability to accurately find “vulnerable” clones.
In this paper, we propose VUDDY, an approach for the scalable detection of vulnerable code clones, which is capable of detecting security vulnerabilities in large software programs efficiently and accurately. Its extreme scalability is achieved by leveraging function-level granularity and a length-filtering technique that reduces the number of signature comparisons. This efficient design enables VUDDY to preprocess a billion lines of code in 14 hours and 17 minutes, after which it requires a few seconds to identify code clones. In addition, we designed a vulnerability-preserving abstraction technique that renders VUDDY resilient to common modifications in cloned code, while preserving the vulnerable conditions even after the abstraction is applied. This extends the scope of VUDDY to identifying variants of known vulnerabilities, with high accuracy. An implementation of VUDDY has been offered online for free at IoTcube, an automated vulnerability detection platform. In this study, we describe its principles, evaluate its efficacy, and analyze the vulnerabilities VUDDY detected in various real-world software systems, such as Apache HTTPD server and an Android smartphone.
|
|
Sangwoo Kim, Seokmyung Hong, Jaesang Oh, Heejo Lee, "Obfuscated VBA Macro Detection Using Machine Learning", IEEE/IFIP Int'l Conf. on Dependable Systems and Networks, pp. 490-501, Jun. 28. 2018. | |
Malware using document files as an attack vector has continued to increase and now constitutes a large portion of phishing attacks. To avoid anti-virus detection, malware writers usually implement obfuscation techniques in their source code. Although obfuscation is related to malicious code detection, little research has been conducted on obfuscation with regards to Visual Basic for Applications (VBA) macros.
In this paper, we summarize the obfuscation techniques and propose an obfuscated macro code detection method using five machine learning classifiers. To train these classifiers, our proposed method uses 15 discriminant static features, taking into account the characteristics of the VBA macros. We evaluated our approach using a real-world dataset of obfuscated and non-obfuscated VBA macros extracted from Microsoft Office document files. The experimental results demonstrate that our detection approach achieved an F2 score improvement of greater than 23% compared to those of related studies.
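For readers unfamiliar with the metric, the F2 score is the standard F-beta score with beta = 2, which weights recall more heavily than precision. The sketch below uses the textbook formula; the function name `f_beta` and the sample precision and recall values are illustrative and are not taken from the paper.

```python
def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R)."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# With beta = 2, a detector with high recall scores better than one
# with the same numbers swapped.
high_recall = f_beta(precision=0.5, recall=1.0)
high_precision = f_beta(precision=1.0, recall=0.5)
print(high_recall > high_precision)  # True
```

Favoring recall suits a detection task in which a missed obfuscated macro is costlier than a false alarm.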
|
|
Seulbae Kim, Seunghoon Woo, Heejo Lee, Hakjoo Oh, "VUDDY: A Scalable Approach for Vulnerable Code Clone Discovery", IEEE Symposium on Security and Privacy, May. 22. 2017. | |
The ecosystem of open source software (OSS) has been growing considerably in size. In addition, code clones - code fragments that are copied and pasted within or between software systems - are also proliferating. Although code cloning may expedite the process of software development, it often critically affects the security of software because vulnerabilities and bugs can easily be propagated through code clones. These vulnerable code clones are increasing in conjunction with the growth of OSS, potentially contaminating many systems. Although researchers have attempted to detect code clones for decades, most of these attempts fail to scale to the size of the ever-growing OSS code base. The lack of scalability prevents software developers from readily managing code clones and associated vulnerabilities. Moreover, most existing clone detection techniques focus overly on merely detecting clones and this impairs their ability to accurately find "vulnerable" clones.
In this paper, we propose VUDDY, an approach for the scalable detection of vulnerable code clones, which is capable of detecting security vulnerabilities in large software programs efficiently and accurately. Its extreme scalability is achieved by leveraging function-level granularity and a length-filtering technique that reduces the number of signature comparisons. This efficient design enables VUDDY to preprocess a billion lines of code in 14 hours and 17 minutes, after which it requires a few seconds to identify code clones. In addition, we designed a security-aware abstraction technique that renders VUDDY resilient to common modifications in cloned code, while preserving the vulnerable conditions even after the abstraction is applied. This extends the scope of VUDDY to identifying variants of known vulnerabilities, with high accuracy. In this study, we describe its principles and evaluate its efficacy and effectiveness by comparing it with existing mechanisms and presenting the vulnerabilities it detected. VUDDY outperformed four state-of-the-art code clone detection techniques in terms of both scalability and accuracy, and proved its effectiveness by detecting zero-day vulnerabilities in widely used software systems, such as Apache HTTPD and Ubuntu OS Distribution.
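The function-level fingerprinting and length filtering mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the normalization rules, the SHA-256 hash, and the names `normalize`, `fingerprint`, `build_index`, and `lookup` are assumptions made for illustration, and the security-aware abstraction step is omitted.

```python
import hashlib
import re
from collections import defaultdict


def normalize(function_text):
    """Strip C-style comments and all whitespace, then lowercase.

    This normalization is an assumption for illustration only.
    """
    text = re.sub(r"/\*.*?\*/", "", function_text, flags=re.DOTALL)  # block comments
    text = re.sub(r"//[^\n]*", "", text)                             # line comments
    return re.sub(r"\s+", "", text).lower()


def fingerprint(function_text):
    """Return (length, digest) for one function; the hash choice is assumed."""
    text = normalize(function_text)
    return len(text), hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_index(vulnerable_functions):
    """Group fingerprints of known-vulnerable functions by normalized length."""
    index = defaultdict(dict)
    for name, body in vulnerable_functions.items():
        length, digest = fingerprint(body)
        index[length][digest] = name
    return index


def lookup(index, function_text):
    """Length filter first: only digests in the same-length bucket are compared."""
    length, digest = fingerprint(function_text)
    bucket = index.get(length)
    if bucket is None:
        return None  # no known-vulnerable function has this length
    return bucket.get(digest)
```

Because functions of a different normalized length can never match, most candidates are rejected by a single dictionary lookup before any digest comparison takes place.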
|
|
Hyok An, Heejo Lee, Adrian Perrig, "Coordination of Anti-Spoofing Mechanisms in Partial Deployments", Journal of Communications and Networks, Vol. 18, No. 6, pp. 948-961, Dec. 31. 2016. | |
Internet protocol (IP) spoofing is a serious problem on the Internet. It is an attractive technique for adversaries who wish to amplify their network attacks and retain anonymity. Many approaches have been proposed to prevent IP spoofing attacks; however, they do not address a significant deployment issue, i.e., filtering inefficiency caused by a lack of deployment incentives for adopters. To defeat attacks effectively, one mechanism must be widely deployed on the network; however, the majority of the anti-spoofing mechanisms are unsuitable to solve the deployment issue by themselves. Each mechanism can work separately; however, their defensive power is considerably weak when insufficiently deployed. If we coordinate partially deployed mechanisms such that they work together, they demonstrate considerably superior performance by creating a synergy effect that overcomes their limited deployment. Therefore, we propose a universal anti-spoofing (UAS) mechanism that incorporates existing mechanisms to thwart IP spoofing attacks. In the proposed mechanism, intermediate routers utilize any existing anti-spoofing mechanism that can ascertain if a packet is spoofed and record this decision in the packet header. The edge routers of a victim network can estimate the forgery of a packet based on this information sent by the upstream routers. The results of experiments conducted with real Internet topologies indicate that UAS reduces false alarms by up to 84.5% compared to the case where each mechanism operates individually.
|
|
Jonghoon Kwon, Jehyun Lee, Heejo Lee, Adrian Perrig, "PsyBoG: A scalable botnet detection method for large-scale DNS traffic", Computer Networks, Vol. 97, pp. 48-73, Mar. 14. 2016. | |
Domain Name System (DNS) traffic has become a rich source of
information from a security perspective. However, the volume of
DNS traffic has been skyrocketing, such that security analyzers
experience difficulties in collecting, retrieving, and
analyzing the DNS traffic in response to modern Internet
threats. More precisely, much of the research relating to DNS
has been negatively affected by the dramatic increase in the
number of queries and domains. This phenomenon has necessitated
a scalable approach, which is not dependent on the volume of
DNS traffic. In this paper, we introduce a fast and scalable
approach, called PsyBoG, for detecting malicious behavior
within large volumes of DNS traffic. PsyBoG leverages a signal
processing technique, power spectral density (PSD) analysis, to
discover the major frequencies resulting from the periodic DNS
queries of botnets. The PSD analysis allows us to detect
sophisticated botnets regardless of their evasive techniques,
sporadic behavior, and even normal users’ traffic.
Furthermore, our method allows us to deal with large-scale DNS
data by only utilizing the timing information of query
generation regardless of the number of queries and domains.
Finally, PsyBoG discovers groups of hosts which show similar
patterns of malicious behavior. PsyBoG was evaluated by
conducting experiments with two different data sets, namely DNS
traces generated by real malware in controlled environments and
a large number of real-world DNS traces collected from a
recursive DNS server, an authoritative DNS server, and
Top-Level Domain (TLD) servers. We utilized the malware traces as
the ground truth, and, as a result, PsyBoG performed with a
detection accuracy of 95%. By using a large number of DNS
traces, we were able to demonstrate the scalability and
effectiveness of PsyBoG in terms of practical usage. Finally,
PsyBoG detected 23 unknown and 26 known botnet groups with 0.1%
false positives.
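The power spectral density idea described in the PsyBoG abstract above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published algorithm: it bins query timestamps into a count series, computes a plain periodogram with a naive DFT, and reports the period of the strongest frequency. The function names and the binning parameters are hypothetical.

```python
import cmath


def bin_queries(timestamps, bin_seconds, duration):
    """Turn query timestamps (seconds) into a per-bin count series."""
    bins = [0] * (duration // bin_seconds)
    for ts in timestamps:
        idx = int(ts // bin_seconds)
        if 0 <= idx < len(bins):
            bins[idx] += 1
    return bins


def power_spectrum(series):
    """Periodogram of a mean-centered series via a naive O(n^2) DFT."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    spectrum = []
    for k in range(1, n // 2 + 1):  # skip k = 0, the DC component
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append((k, abs(coeff) ** 2 / n))
    return spectrum


def dominant_period(timestamps, bin_seconds, duration):
    """Period (seconds) of the strongest frequency in the query timing."""
    series = bin_queries(timestamps, bin_seconds, duration)
    k, _power = max(power_spectrum(series), key=lambda item: item[1])
    return len(series) * bin_seconds / k
```

Only the timing of queries enters the computation, which is why the cost depends on the length of the observation window rather than on the number of distinct domains. A production version would use an FFT instead of the naive DFT shown here.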
|
|
Yu Seung Kim, Patrick Tague, Heejo Lee, Hyogon Kim, "A jamming approach to enhance enterprise Wi-Fi secrecy through spatial access control", Wireless Networks, Vol. 21, No. 8, pp. 2631-2647, Nov. 1. 2015. | |
Prevalent Wi-Fi networks have adopted various protections to prevent eavesdropping caused by the intrinsic shared nature of the wireless medium. However, many of them are based on pre-shared secrets that incur key management costs, and are still vulnerable to practical countermeasures. In this study, we investigate the feasibility of using a defensive jamming technique to protect enterprise Wi-Fi networks from potential eavesdroppers. This non-cryptographic approach requires neither a pre-shared key nor high deployment costs. Defensive jammers geographically confine the wireless coverage of a Wi-Fi access point, and thus block message reception outside an arbitrary boundary at the physical layer. We provide a theoretical model for fine-tuning the jamming parameters for jammer placement. We then discuss practical considerations including optimized jammer arrangement algorithms, interference countermeasures to legitimate communications, and countermeasures against advanced attackers.
|
|
Wade Trappe, Lalitha Sankar, Radha Poovendran, Heejo Lee, Srdjan Capkun, "Introduction to the Issue on Signal and Information Processing for Privacy", IEEE Journal of Selected Topics in Signal Processing, Vol. 9, No. 7, pp. 1173-1175, Oct. 1. 2015. | |
Jonghoon Kwon, Dongwon Seo, Minjin Kwon, Heejo Lee, Adrian Perrig, Hyogon Kim, "An incrementally deployable anti-spoofing mechanism for software-defined networks", Computer Communications, Vol. 64, pp. 1-20, Jun. 15. 2015. | |
Internet attacks often use IP spoofing to forge the source IP address of packets, and thereby hide the identity of the source. This causes many serious security problems, such as difficulty in verifying packet authenticity and performing IP traceback. While many IP spoofing prevention techniques have been proposed apart from ingress filtering, none have achieved widespread real-world use. One main reason is the lack of properties favoring incremental deployment, an essential component for new technology adoption. An incrementally deployable protocol should have three properties: initial benefits for early adopters, incremental benefits for subsequent adopters, and effectiveness under partial deployment. Since no previous anti-spoofing solution satisfies all three properties, we propose an anti-spoofing mechanism called “BGP-based Anti-Spoofing Extension” (BASE). BASE is an anti-spoofing protocol designed to fulfill the incremental deployment properties. Furthermore, BASE is designed to work in software-defined networks (SDN). This gives network operators a motivation to adopt BASE in their networks, since SDN supports large-scale network control with simple operations. Based on simulations using a model of Internet connectivity, BASE shows desirable IP spoofing prevention capabilities under partial deployment. We find that just 30% deployment can drop about 97% of attack packets. It is shown that BASE not only provides benefits to early adopters, but also outperforms previous anti-spoofing mechanisms.
|
|
Hongzhe Li, Hyuckmin Kwon, Jonghoon Kwon, Heejo Lee, "CLORIFI: Software Vulnerability Discovery using Code Clone Verification", Concurrency and Computation: Practice and Experience, Vol. 28, No. 6, pp. 1900-1917, Apr. 14. 2015. | |
Software vulnerability has long been considered an important threat to system safety. A vulnerability is often reproduced because of the frequent code reuse by programmers. Security patches are usually not propagated to all code clones; however, they could be leveraged to discover unknown vulnerabilities. Static code auditing approaches are frequently proposed to scan source codes for security flaws; unfortunately, these approaches generate too many false positives. While dynamic execution analysis methods can precisely report vulnerabilities, they are ineffective in path exploration, which limits their ability to scale to large programs. With the purpose of detecting vulnerabilities in a scalable way with greater precision, in this paper, we propose a novel mechanism, called software vulnerability discovery using Code Clone Verification (CLORIFI), that scalably discovers vulnerabilities in real world programs using code clone verification. First, we use a fast and scalable syntax-based way to find code clones in program source codes based on released security patches. Subsequently, code clones are verified using concolic testing to dramatically decrease the false positives. In addition, we mitigate the path explosion problem by backward sensitive data tracing in concolic execution. Experiments have been conducted with real-world open-source projects (recent Linux OS distributions and program packages). As a result, we found 7 real vulnerabilities out of 63 code clones from Ubuntu 14.04 LTS (Canonical, London, UK) and 10 vulnerabilities out of 40 code clones from CentOS 7.0 (The CentOS Project(community contributed)). Furthermore, we confirmed more code clone vulnerabilities in various versions of programs including Rsyslog (Open Source(Original author: Rainer Gerhards)), Apache (Apache Software Foundation, Forest Hill, Maryland, USA) and Firefox (Mozilla Corporation, Mountain View, California, USA).
In order to evaluate the effectiveness of vulnerability verification in a systematic way, we also utilized Juliet Test Suite as measurement objects. The results show that CLORIFI achieves 98% accuracy with 0 false positives.
|
|
Jehyun Lee, Suyeon Lee, Heejo Lee, "Screening smartphone applications using malware family signatures", Computers & Security, Vol. 52, pp. 234-249, Feb. 21. 2015. | |
The sharp increase in smartphone malware has become one of the most serious security problems. Since the Android platform has taken the dominant position in smartphone popularity, the number of Android malware has grown correspondingly and represents a critical threat to smartphone users. This rise in malware is primarily attributable to the occurrence of variants of existing malware. A set of variants stemming from one malware can be considered as one malware family, and malware families cover more than half of the Android malware population. A conventional technique for defeating malware is the use of signature matching, which is efficient from a time perspective but not very practical because of its lack of robustness against malware variants. As a counter approach for handling the issue of variants, behavior analysis techniques have been proposed, but they require extensive time and resources. In this paper, we propose an Android malware detection mechanism that uses automated family signature extraction and family signature matching. The key concept of the mechanism is to extract a set of family representative binary patterns from evaluated family members as a signature and to classify each set of variants into a malware family via an estimation of similarity to the signatures. The proposed family signature and detection mechanism offers more flexible variant detection than does the legacy signature matching, which is strictly dependent on the presence of a specific string. Furthermore, compared with the previous behavior analysis techniques considering family detection, the proposed family signature has higher detection accuracy without the need for the significant overhead of data and control flow analysis. Using the proposed signature, we can detect new variants of known malware efficiently and accurately by static matching.
We evaluated our mechanism with 5846 real world Android malware samples belonging to 48 families collected in April 2014 at an anti-virus company; experimental results showed that our mechanism achieved greater than 97% accuracy in detection of variants. We also demonstrated that the mechanism has a linear time complexity with the number of target applications.
|
|
Jihwan Jeong, Dongwon Seo, Chanyoung Lee, Jonghoon Kwon, Heejo Lee, John Milburn, "MysteryChecker: Unpredictable Attestation to Detect Repackaged Malicious Applications in Android", IEEE MALWARE, pp. 50-57, Oct. 29. 2014. | |
The number of malicious applications, sometimes
known as malapps, in Android smartphones has increased
significantly in recent years. Malapp writers abuse repackaging
techniques to rebuild applications with code changes.
Existing anti-malware applications do not successfully defeat
or defend against the repackaged malapps due to numerous
variants. Software-based attestation approaches widely used in a
resource-constrained environment have been developed to detect
code changes of software with low resource consumption. In this
paper, we propose a novel software-based attestation approach,
called MysteryChecker, leveraging an unpredictable attestation
algorithm. For its unpredictable attestation, MysteryChecker
applies the concept of code obfuscation, which changes the syntax
in order to avoid code analysis by adversaries. More precisely,
unpredictable attestation is achieved by chaining randomly selected
crypto functions. A verifier sends a randomly generated
attestation module, and the target application must reply with
a correct response using the attestation module. Also, the target
application periodically receives a new module that contains a
different attestation algorithm. Thus, even if the attacker analyzes
the attestation module, the target application replaces the existing
attestation module with a new one and the analysis done by
the attacker becomes invalid. Experimental results show that
MysteryChecker is completely able to detect known and unknown
variants of repackaged malapps, while existing anti-malware
applications only partially detect the variants.
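The challenge-response flow described in the MysteryChecker abstract above, in which a verifier sends a randomly generated chain of crypto functions and the application must answer correctly, can be sketched as follows. This is a minimal illustration under assumptions, not the published protocol: the set of hash primitives, the chain length, the nonce handling, and the names `make_module`, `attest`, and `verify` are all hypothetical.

```python
import hashlib
import hmac
import random

# Assumed pool of primitives; the abstract only says "crypto functions".
PRIMITIVES = ("sha224", "sha256", "sha384", "sha512")


def make_module(rng, length=4):
    """Verifier side: pick a random chain of hash names (the 'module')."""
    return [rng.choice(PRIMITIVES) for _ in range(length)]


def attest(module, app_code, nonce):
    """Target side: run the app bytes through the chain, seeded by a nonce."""
    digest = nonce + app_code
    for name in module:
        digest = hashlib.new(name, digest).digest()
    return digest.hex()


def verify(module, nonce, known_good_code, response):
    """Verifier side: recompute over the original app and compare."""
    expected = attest(module, known_good_code, nonce)
    return hmac.compare_digest(expected, response)
```

Since the verifier issues a fresh chain for each round, an attacker who has analyzed one module gains nothing for the next one, which is the property the abstract calls unpredictable attestation.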
|
|
Ilju Seo, Heejo Lee, Seungchul Han, "Cylindrical Coordinates Security Visualization for multiple domain command and control botnet detection", Computers & Security, Vol. 46, pp. 141-153, Oct. 1. 2014. | |
Botnets are one of the most dangerous species of network-based attack. They cause severe network disruptions through massive coordinated attacks, and these disruptions frequently cost enterprises large sums in financial losses. In this paper, we make an in-depth investigation on the issue of botnet detection and present a new security visualization tool for visualizing botnet behaviors on DNS traffic. The core mechanism is developed with the objective of enabling users to recognize security threats promptly and mitigate the damages by only visualizing DNS traffic in cylindrical coordinates. We compare our visualization method with existing ones and the experimental results show that ours has greater perceptual efficiency. The ideas and results of this study will contribute toward designing an advanced visualization technique that offers better security. Also, the approach proposed in this study can be utilized to derive new and valuable insights in security aspects from the complex correlations of Big Data.
|
|
Jehyun Lee, Heejo Lee, "GMAD: Graph-based Malware Activity Detection by DNS traffic analysis", Computer Communications, Vol. 49, No. 1, pp. 33-47, Aug. 1. 2014. | |
Malicious activities on the Internet are one of the most dangerous threats to Internet users and
organizations. Malicious software controlled remotely is addressed as one of the most critical methods
for executing the malicious activities. Since blocking domain names for command and control (C&C) of
the malwares by analyzing their Domain Name System (DNS) activities has been the most effective
and practical countermeasure, attackers attempt to hide their malwares by adopting several evasion
techniques, such as client sub-grouping and domain flux on DNS activities. A common feature of the
recently developed evasion techniques is the utilization of multiple domain names to render malware
DNS activities temporally and spatially more complex. In contrast to analyzing the DNS activities for a
single domain name, detecting the malicious DNS activities for multiple domain names is not a simple
task. The DNS activities of malware that uses multiple domain names, termed multi-domain malware,
are sparser and less synchronized with respect to space and time.
In this paper, we introduce a malware activity detection mechanism, GMAD: Graph-based Malware
Activity Detection that utilizes a sequence of DNS queries in order to achieve robustness against evasion
techniques. GMAD uses a graph termed Domain Name Travel Graph which expresses DNS query sequences
to detect infected clients and malicious domain names. In addition to detecting malware C&C domain
names, GMAD detects malicious DNS activities such as blacklist checking and fake DNS querying. To
detect malicious domain names utilized for malware activities, GMAD applies domain name clustering
using the graph structure and determines malicious clusters by referring to public blacklists. Through
experiments with four sets of DNS traffic captured in two ISP networks in the U.S. and South Korea,
we show that GMAD detected thousands of malicious domain names that had neither been blacklisted
nor detected through group activity of DNS clients. In a detection accuracy evaluation, GMAD showed
an accuracy rate higher than 99% on average, with a higher than 90% precision and lower than 0.5% false
positive rate. It is shown that the proposed method is effective for detecting multi-domain malware
activities irrespective of evasion techniques.
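The graph construction and blacklist-based cluster labeling described in the GMAD abstract above can be approximated with a short Python sketch. Everything specific here is an assumption for illustration: edges link domains queried consecutively by the same client, clusters are plain connected components over edges seen at least `min_weight` times, and the function names are hypothetical. The published clustering method may differ.

```python
from collections import defaultdict


def build_travel_graph(queries):
    """queries: iterable of (client, timestamp, domain) tuples.

    Returns edge counts between domains queried back to back by a client.
    """
    per_client = defaultdict(list)
    for client, ts, domain in queries:
        per_client[client].append((ts, domain))
    graph = defaultdict(lambda: defaultdict(int))
    for events in per_client.values():
        events.sort()
        for (_, a), (_, b) in zip(events, events[1:]):
            if a != b:
                graph[a][b] += 1
    return graph


def clusters(graph, min_weight=2):
    """Connected components over edges observed at least min_weight times."""
    adj = defaultdict(set)
    for a, neighbors in graph.items():
        for b, weight in neighbors.items():
            if weight >= min_weight:
                adj[a].add(b)
                adj[b].add(a)
    seen, components = set(), []
    for node in list(adj):
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            current = stack.pop()
            if current in comp:
                continue
            comp.add(current)
            stack.extend(adj[current] - comp)
        seen |= comp
        components.append(comp)
    return components


def malicious_domains(queries, blacklist, min_weight=2):
    """Flag every domain in a cluster that contains a blacklisted domain."""
    flagged = set()
    for comp in clusters(build_travel_graph(queries), min_weight):
        if comp & blacklist:
            flagged |= comp
    return flagged
```

In this sketch a domain that is not itself blacklisted is still flagged when it repeatedly appears next to a blacklisted one in client query sequences.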
|
|
Dongwon Seo, Heejo Lee, Adrian Perrig, "APFS: Adaptive Probabilistic Filter Scheduling against Distributed Denial-of-Service Attacks", Computers & Security, Vol. 39, Part B, pp. 366-385, Nov. 1. 2013. | |
Distributed denial-of-service (DDoS) attacks are considered to be among the most crucial security challenges in current networks because they significantly disrupt the availability of a service by consuming an extreme amount of resources and/or by creating link congestion. One type of countermeasure against DDoS attacks is a filter-based approach where filter-based intermediate routers within the network coordinate with each other to filter undesired flows. The key to success for this approach is effective filter propagation and management techniques. However, existing filter-based approaches do not consider effective filter propagation and management. In this paper, we define three necessary properties for a viable DDoS solution: how to practically propagate filters, how to place filters to effective filter routers, and how to manage filters to maximize the efficacy of the defense. We propose a novel mechanism, called Adaptive Probabilistic Filter Scheduling (APFS), that effectively defends against DDoS attacks and also satisfies the three necessary properties. In APFS, a filter router adaptively calculates its own marking probability based on three factors: 1) hop count from a sender, 2) the filter router’s resource availability, and 3) the filter router’s link degree. That is, a filter router that is closer to attackers, has more available resources, or has more connections to neighbors inserts its marking with a higher probability. These three factors lead a victim to receive more markings from more effective filter routers, and thus, filters are quickly distributed to effective filter routers. Moreover, each filter router manages multiple filters using a filter scheduling policy that allows it to selectively keep the most effective filters depending on attack situations. Experimental results show that APFS has a faster filter propagation and a higher attack blocking ratio than existing approaches that use fixed marking probability.
In addition, APFS has a 44% higher defense effectiveness than existing filter-based approaches that do not use a filter scheduling policy.
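The three-factor marking probability described in the APFS abstract above can be made concrete with a small sketch. The abstract does not give the formula, so the weighted average below is purely an assumption chosen to reproduce the stated monotonic behavior; the function name, the normalization bounds `max_hops` and `max_degree`, and the equal default weights are all hypothetical.

```python
def marking_probability(hops_from_sender, max_hops, free_resources,
                        degree, max_degree,
                        weights=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical marking probability in [0, 1].

    hops_from_sender: hop count from the packet's sender to this router.
    free_resources:   fraction of filter capacity still available (0..1).
    degree:           number of links this router has to neighbors.
    """
    # Fewer hops from the sender means closer to the attacker: score higher.
    closeness = 1.0 - min(hops_from_sender, max_hops) / max_hops
    resources = min(max(free_resources, 0.0), 1.0)
    connectivity = min(degree, max_degree) / max_degree
    w_hop, w_res, w_deg = weights
    return w_hop * closeness + w_res * resources + w_deg * connectivity
```

With this form, a router that is nearer the sender, less loaded, or better connected marks packets more often, so the victim collects more markings from the routers best placed to filter.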
|
|
Rashad Aliyev, Dongwon Seo, Heejo Lee, "DROP-FAST: Defending against DDoS Attacks using Cloud Technology", Int'l Conf. on Security and Management (SAM), Jul. 24. 2013. | |
DDoS attacks continue to be a major threat to
network security. Several new types of attacks such as Layer-7
attacks (e.g., HTTP flood, Slowloris, RUDY, etc.) have
emerged. We propose a novel DDoS defense mechanism
called DROP-FAST. Our mechanism provides distributed
DDoS defense utilizing multiple replicas of the protected
server throughout a cloud infrastructure. DROP-FAST is
dynamic and can adapt by controlling the number of replicas
on the cloud based on attack strength. The main server is isolated
from the network using replica servers. Service quality features
such as response time, incoming traffic load, and load
sharing are improved due to distribution of attack and
replication of the main server throughout the cloud. We
describe our mechanism in detail and discuss improvements
made over previously existing related works. We set up an
experiment that shows significant improvement of the traffic
load on the main server as a result of utilizing DROP-FAST
mechanism.
|
|
Dongwon Seo, Heejo Lee, Ejovi Nuwere, "SIPAD: SIP-VoIP Anomaly Detection using a Stateful Rule Tree", Computer Communications, Vol. 36, No. 5, pp. 562-574, Mar. 1. 2013. | |
Voice over IP (VoIP) services have become prevalent lately because of their potential advantages such as
economic efficiency and useful features. Meanwhile, Session Initiation Protocol (SIP) is being widely used
as a session protocol for the VoIP services. Many mobile VoIP applications have recently been launched,
and they are becoming attractive targets for attackers to steal private information. In particular, malformed
SIP messages and SIP flooding attacks are the most significant attacks as they cause service disruption
by targeting call procedures and system resources. Although much research has been conducted
in an effort to address the problems, they remain unresolved challenges due to the ease of launching variants
of attacks. In this paper, we propose a stateful SIP inspection mechanism, called SIP-VoIP Anomaly
Detection (SIPAD), that leverages a SIP-optimized data structure to detect malformed SIP messages and
SIP flooding attacks. SIPAD precomputes the SIP-optimized data structure (termed a stateful rule tree)
that reorganizes the SIP rule set by hierarchical correlation. Depending on the current state and the message
type, SIPAD determines the corresponding branches from the stateful rule tree, and inspects a SIP
message’s structure by comparing it to the branches. The SIP-optimized rule tree provides higher detection
accuracy, wider detection coverage and faster detection than existing approaches. Conventional SIP
inspection schemes tend to have high overhead costs due to the complexity of their rule matching
schemes. Experimental results of our SIP-optimized approach, by contrast, indicate that it dramatically
reduces overhead and can even be deployed in resource-constrained environments such as smartphones.
|
|
Hyunsang Choi, Heejo Lee, "Identifying botnets by capturing group activities in DNS traffic", Computer Networks, Vol. 56, No. 1, pp. 20-33, Jan. 12. 2012. | |
Botnets have become the main vehicle to conduct online crimes such as DDoS, spam, phishing and identity theft. Even though numerous efforts have been directed towards detection of botnets, evolving evasion techniques easily thwart detection. Moreover, existing approaches can be overwhelmed by the large amount of data needed to be analyzed. In this paper, we propose a light-weight mechanism to detect botnets using their fundamental characteristics, i.e., group activity. The proposed mechanism, referred to as BotGAD (botnet group activity detector), needs only a small amount of data from DNS traffic to detect botnets, not all network traffic content or known signatures. BotGAD can detect botnets from a large-scale network in real-time even though the botnet performs encrypted communications. Moreover, BotGAD can detect botnets that adopt recent evasion techniques. We evaluate BotGAD using multiple DNS traces collected from different sources including a campus network and large ISP networks. The evaluation shows that BotGAD can automatically detect botnets while providing real-time monitoring in large scale networks.
|
|
Yilin Mo, Tiffany Hyunjin Kim, Kenneth Brancik, Dona Dickinson, Heejo Lee, Adrian Perrig, Bruno Sinopoli, "Cyber-Physical Security of a Smart Grid Infrastructure", Proceedings of the IEEE, Vol. 100, No. 1, pp. 195-209, Jan. 1. 2012. | |
It is often appealing to assume that existing
solutions can be directly applied to emerging engineering domains.
Unfortunately, careful investigation of the unique challenges
presented by new domains exposes its idiosyncrasies,
thus often requiring new approaches and solutions. In this
paper, we argue that the “smart” grid, replacing its incredibly
successful and reliable predecessor, poses a series of new security
challenges, among others, that require novel approaches
to the field of cyber security. We will call this new field
cyber-physical security. The tight coupling between information and
communication technologies and physical systems introduces
new security concerns, requiring a rethinking of the commonly
used objectives and methods. Existing security approaches are
either inapplicable, not viable, insufficiently scalable, incompatible,
or simply inadequate to address the challenges posed
by highly complex environments such as the smart grid. A concerted
effort by the entire industry, the research community,
and the policy makers is required to achieve the vision of a
secure smart grid infrastructure.
|
|
Hyunsang Choi, Bin B. Zhu, Heejo Lee, "Detecting Malicious Web Links and Identifying Their Attack Types", USENIX Int'l Conf. on Web Application Development (WebApps), pp. 11-11, Jun. 15. 2011. | |
Malicious URLs have been widely used to mount various cyber attacks including spamming, phishing and malware. Detection of malicious URLs and identification of threat types are critical to thwart these attacks. Knowing the type of a threat enables estimation of severity of the attack and helps adopt an effective countermeasure. Existing methods typically detect malicious URLs of a single attack type. In this paper, we propose a method using machine learning to detect malicious URLs of all the popular attack types and identify the nature of the attack a malicious URL attempts to launch. Our method uses a variety of discriminative features including textual properties, link structures, webpage contents, DNS information, and network traffic. Many of these features are novel and highly effective. Our experimental studies with 40,000 benign URLs and 32,000 malicious URLs obtained from real-life Internet sources show that our method delivers a superior performance: the accuracy was over 98% in detecting malicious URLs and over 93% in identifying attack types. We also report our studies on the effectiveness of each group of discriminative features, and discuss their evadability.
|
|
Hyundo Park, Hyogon Kim, Heejo Lee, "Is early warning of an imminent worm epidemic possible?", IEEE Network, Vol. 23, No. 5, pp. 14-20, Oct. 2. 2009. | |
This article introduces a novel anomaly detection method that makes use of only matrix operations and is highly sensitive to randomness in traffic. The sensitivity can be leveraged to detect attacks that exude randomness in traffic characteristics, such as denial-of-service attacks and worms. In particular, we show that the method can be used to alert of the imminent onset of a worm epidemic in a statistically sound manner, irrespective of the worm's scanning strategies.
|
|
Hyunsang Choi, Heejo Lee, Hyogon Kim, "Fast Detection and Visualization of Network Attacks on Parallel Coordinates", Computers & Security, Vol. 28, No. 5, pp. 276-288, Jul. 2009. | |
This article presents what we call the parallel coordinate attack visualization (PCAV) for detecting unknown large-scale Internet attacks including Internet worms, DDoS attacks and network scanning activities. PCAV displays network traffic on the plane of parallel coordinates using the flow information such as the source IP address, destination IP address, destination port and the average packet length in a flow. The parameters are used to draw each flow as a connected line on the plane, where a group of polygonal lines form a particular shape in case of attack. From the observation that each attack type of significance forms a unique pattern, we develop nine signatures and their detection mechanism based on an efficient hashing algorithm. Using the graphical signatures, PCAV can quickly detect new attacks and enable network administrators to intuitively recognize and respond to the attacks. Compared with existing visualization works, PCAV can handle hyper-dimensions, i.e., can visualize more than 3 parameters if necessary, which significantly reduces false positives. As a consequence, Internet worms are more precisely detectable by machine and more easily recognizable by human. Another strength of PCAV is handling flows instead of packets. Per-flow visualization greatly reduces the processing time and further provides compatibility with legacy routers which export flow information, e.g., as NetFlow does in Cisco routers. We demonstrate the effectiveness of PCAV using real-life Internet traffic traces. The PCAV program is publicly available.
|
|
Hyundo Park, Peng Li, Debin Gao, Heejo Lee, Robert H. Deng, "Distinguishing between FE and DDoS Using Randomness check", Int'l Conf. on Information Security (ISC), Vol. 5222, pp. 131-145, Sep. 16. 2008. | |
Threats posed by Distributed Denial of Service (DDoS) attacks are becoming more serious day by day. Accurately detecting DDoS becomes an important and necessary step in securing a computer network. However, a Flash Event (FE), which is created by legitimate requests, shares very similar characteristics with DDoS in many aspects, making it hard to distinguish from DDoS attacks. In this paper, we propose a simple yet effective mechanism called FDD (FE and DDoS Distinguisher) to distinguish FE and DDoS. To the best of our knowledge, this is the first effective and practical mechanism that distinguishes FE and DDoS attacks. Our trace-driven evaluation shows that FDD distinguishes between FE and DDoS attacks accurately and efficiently by utilizing only memory of a very small size, making it possible to be implemented on high-speed networking devices.
|
|
Dongwon Seo, Heejo Lee, Ejovi Nuwere, "Detecting More SIP Attacks on VoIP Services by Combining Rule Matching and State Transition Models", IFIP Int'l Conf. on ICT Systems Security and Privacy Protection (IFIP SEC), Vol. 278, pp. 397-411, Sep. 9. 2008. | |
The Session Initiation Protocol (SIP) has been used widely for Voice over IP (VoIP) service because of its potential advantages, economical efficiency and call setup simplicity. However, SIP-based VoIP service has two main security issues: malformed SIP message attacks and SIP flooding attacks. In this paper, we propose a novel mechanism for SIP-based VoIP systems that utilizes a rule matching algorithm and state transition models. It detects not only these two main attacks but also other SIP attacks. Instead of simply combining rule comparison and counting the number of SIP messages, we develop secure RFC 3261 rules based on the existing RFC 3261 rules, so that the proposed mechanism shows a 26% higher detection rate for malformed message attacks. Moreover, we utilize session information and define the features of each state in order to detect abnormal situations including SIP flooding. As a result, it is shown that the proposed mechanism provides not only higher accuracy but also coverage of more SIP attacks, including the two main attacks.
|
|
Keun Park, Dongwon Seo, Jaewon Yoo, Heejo Lee, Hyogon Kim, "Unified Rate Limiting in Broadband Access Networks for Defeating Internet Worms and DDoS Attacks", Int'l Conf. on Information Security Practice and Experience (ISPEC), Vol. 4991, pp. 176-187, Apr. 21. 2008. | |
Internet worms and DDoS attacks are considered the two most menacing attacks on today’s Internet. The traditional wisdom is that they are different beasts and should be dealt with independently. In this paper, however, we show that a unified rate limiting algorithm is possible, which works effectively on both Internet worms and DDoS attacks. The unified approach leads to higher worm traffic reduction performance than that of existing rate limiting schemes geared toward worm mitigation, in addition to the added advantage of dropping most DDoS attack packets. In our experiments with attack traffic generated by attack tools, the unified rate limiting scheme drops 80.7% of worm packets and 93% of DDoS packets, whereas previous worm scan rate limiting schemes drop at most 69.2% of worm packets and 3.4% of DDoS packets. Also, the proposed scheme requires fewer computing resources and is more accurate at dropping attack packets without dropping legitimate packets.
|
|
Minjin Kwon, Kyoochang Jeong, Heejo Lee, "PROBE: A Process Behavior-Based Host Intrusion Prevention System", Int'l Conf. on Information Security Practice and Experience (ISPEC), Vol. 4991, pp. 203-217, Apr. 2008. | |
Attacks that exploit vulnerabilities are now considered a severe threat. Thus, a host needs a device that monitors system activities for malicious behaviors and blocks those activities to protect itself. In this paper, we introduce PROcess BEhavior (PROBE), which monitors processes running on a host to identify abnormal process behaviors. PROBE builds a process tree using only process creation relationships, and then measures each edge weight to determine whether the invocation of each child process causes abnormal behavior. PROBE has low processing overhead compared with existing intrusion detection systems that use sequences of system calls. In the evaluation on a representative set of critical security vulnerabilities, PROBE shows desirable and practical intrusion prevention capabilities, with estimated false-positive and false-negative rates of only 5% each. Therefore, PROBE is a heuristic approach that can also detect unknown attacks, and it is not only lightweight but also accurate.
|
|
Kyoochang Jeong, Heejo Lee, "Code Graph for Malware Detection", Int'l Conf. on Information Networking (ICOIN), Jan. 2008. | |
When an application program is executed for the first time, the results of its execution are not always predictable. Since the host will be damaged by malware as soon as it is executed, detecting and blocking the malware before its execution is the most effective means of protection. In contrast to current research into the detection of malware based on its behavior while being executed, we propose a new mechanism which can preview the effect of a program on a system. The mechanism we developed represents the distinctions between portable executable binaries. It analyzes the instructions related to the system-call sequence in a binary executable and presents the result in the form of a topological graph. This topological graph is called the code graph, and the preview system is called the code graph system. We have tested various real application programs with the code graph system and identified their distinctive characteristics, which can be used for distinguishing normal software from malware such as worm code and botnet programs. Our system detected all known malware used in the experiment, and distinguished 67% of unknown malware from normal programs. In this paper, we show how to analyze the effects of executable binaries before their execution and how normal software can be effectively distinguished from malware by applying the code graph.
|
|
Hyunsang Choi, Hanwoo Lee, Heejo Lee, Hyogon Kim, "Botnet Detection by Monitoring Group Activities in DNS Traffic", IEEE Int'l Conf. Computer and Information Technology (IEEE CIT), Oct. 17. 2007. | |
Recent malicious attempts are intended to obtain financial benefits through a large pool of compromised hosts, which are called software robots or simply “bots.” A group of bots, referred to as a botnet, is remotely controllable by a server and can be used for sending spam mails, stealing personal information, and launching DDoS attacks. The growing popularity of botnets compels us to find proper countermeasures, but existing defense mechanisms can hardly keep up with the speed of botnet technologies. In this paper, we propose a botnet detection mechanism that monitors DNS traffic to detect botnets, which exhibit group activity in the DNS queries sent simultaneously by distributed bots. A few approaches have been proposed based on particular DNS information generated by a botnet, but they are easily evaded by changing bot programs. Our anomaly-based botnet detection mechanism is more robust than the previous approaches, so that variants of bots can be detected by looking at their group activities in DNS traffic. Experiments on a campus network show that the proposed mechanism can detect botnets effectively while bots are connecting to their server or migrating to another server.
|
|
ChangHee Lee, Heejo Lee, "A Password Stretching Method with User Specific Salts", Int'l World Wide Web Conf. (WWW), May. 2007. | |
In this paper, we present a password stretching method based on user-specific salts. Our scheme takes a similar time to stretch a password as a recent password stretching algorithm, but the complexity of a pre-computation attack increases by a factor of 10^8, and the storage needed for the pre-computation results also increases by a factor of 10^8, compared with that algorithm.
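The general idea of stretching a password with a per-user salt can be sketched as follows. This is a minimal illustration that uses PBKDF2 from the Python standard library, not the scheme proposed in the paper; the function names, the 16-byte random salt, and the iteration count are assumptions chosen for the example.

```python
import hashlib
import hmac
import os


def stretch_password(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a key by iterating HMAC-SHA256 over the password and salt (PBKDF2)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def new_user_record(password: str, iterations: int = 200_000) -> dict:
    """Create a stored record with a fresh random salt for this user (assumed layout)."""
    salt = os.urandom(16)
    return {
        "salt": salt,
        "iterations": iterations,
        "hash": stretch_password(password, salt, iterations),
    }


def verify(password: str, record: dict) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    candidate = stretch_password(password, record["salt"], record["iterations"])
    return hmac.compare_digest(candidate, record["hash"])


if __name__ == "__main__":
    alice = new_user_record("correct horse")
    bob = new_user_record("correct horse")
    # Same password, different salts: the stored hashes differ, so one
    # pre-computed table cannot be reused across users.
    print(alice["hash"] != bob["hash"])
    print(verify("correct horse", alice), verify("wrong guess", alice))
```

Because each user has a different salt, an attacker has to repeat the pre-computation per user, which is the effect on attack cost and storage that the abstract describes.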
|
|
Sangki Yun, Hyogon Kim, Heejo Lee, Inhye Kang, "100+ VoIP Calls on 802.11b: The Power of Combining Voice Frame Aggregation and Uplink-Downlink Bandwidth Control in Wireless LANs", IEEE Journal on Selected Areas in Communications, Vol. 25, No. 4, pp. 689-698, May. 2007. | |
The bandwidth efficiency of Voice over IP (VoIP) traffic on the IEEE 802.11 WLAN is notoriously low. VoIP over 802.11 incurs high bandwidth cost for voice frame packetization and MAC/PHY framing, which is aggravated by channel access overhead. For instance, 10 calls with the G.729 codec can barely be supported on 802.11b with acceptable QoS – less than 2% efficiency. As WLANs and VoIP services become increasingly widespread, this inefficiency must be overcome. This paper proposes a solution that boosts the efficiency high enough to support a significantly larger number of calls than existing schemes, with fair call quality. The solution comes in two parts: adaptive frame aggregation and uplink/downlink bandwidth equalization. The former reduces the absolute number of MAC frames according to the link congestion level, and the latter balances the bandwidth usage between the access point (AP) and wireless stations. When used in combination, they yield superior performance, for instance, supporting more than 100 VoIP calls over an IEEE 802.11b link. We demonstrate the performance of the proposed approach through extensive simulation, and validate the simulation through analysis.
|
|
Hyundo Park, Heejo Lee, Hyogon Kim, "Detecting Unknown Worms using Randomness Check", IEICE Trans. on Communications, Vol. E90-B, No. 4, pp. 894-903, Apr. 2007. | |
Since the introduction of the CodeRed and Slammer worms, it has been learned that the early detection of worm epidemics is important in order to reduce the damage resulting from outbreaks. A prominent characteristic of Internet worms is the random selection of subsequent targets. In this paper, we propose a new worm detection mechanism that checks the random distribution of destination addresses in network traffic. The proposed mechanism constructs a matrix from network traffic and checks the rank of the matrix in order to detect the spreading of Internet worms. Based on the fact that a random binary matrix has a high rank, ADUR (Anomaly Detection Using Randomness check) is proposed for detecting unknown worms based on the rank of the matrix. Experiments in various environments demonstrate that the ADUR mechanism effectively detects the spread of new worms in the early stages, even when there is only a single infected host in a monitored network. Also, we show that ADUR is highly sensitive, so that a worm epidemic can be detected quickly, e.g., three times earlier than the infection of 90% of vulnerable hosts.
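The rank property that the abstract relies on can be illustrated with a small sketch. It computes the rank of a binary matrix over GF(2) and compares a random matrix with one built from a few repeated rows. How ADUR actually fills its matrix from traffic is not described in the abstract, so `random_matrix` and `concentrated_matrix` are hypothetical stand-ins for "random-looking" and "concentrated" destination patterns.

```python
import random


def gf2_rank(rows):
    """Rank over GF(2) of a binary matrix whose rows are given as int bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            # Eliminate that bit from every remaining row.
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def random_matrix(n, rng):
    """Hypothetical stand-in for traffic with randomly chosen destinations."""
    return [rng.getrandbits(n) for _ in range(n)]


def concentrated_matrix(n, rng, distinct=3):
    """Hypothetical stand-in for traffic that keeps hitting a few destinations."""
    patterns = [rng.getrandbits(n) for _ in range(distinct)]
    return [rng.choice(patterns) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    print("random matrix rank:      ", gf2_rank(random_matrix(32, rng)))
    print("concentrated matrix rank:", gf2_rank(concentrated_matrix(32, rng)))
```

A matrix with only a few distinct rows can never exceed that number in rank, whereas a random binary matrix is close to full rank with high probability, which is the gap a rank-based check can exploit.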
|
|
Heejo Lee, Minjin Kwon, Geoffrey Hasker, Adrian Perrig, "BASE: An Incrementally Deployable Mechanism for Viable IP Spoofing Prevention", ACM Asia Conf. on Computer and Communications Security (ASIACCS), pp. 20 - 31, Mar. 2007. | |
DoS attacks use IP spoofing to forge the source IP address of packets, and thereby hide the identity of the source. This makes it hard to defend against DoS attacks, so IP spoofing will still be used as an aggressive attack mechanism even in distributed attack environments. While many IP spoofing prevention techniques have been proposed, none have achieved widespread real-world use. One main reason is the lack of properties favoring incremental deployment, an essential component for the adoption of new technologies. A viable solution needs to be not only technically sound but also economically acceptable. An incrementally deployable protocol should have three properties: initial benefits for early adopters, incremental benefits for subsequent adopters, and effectiveness under partial deployment. Since no previous anti-spoofing solution satisfies all three of these properties, we propose a new mechanism called "BGP Anti-Spoofing Extension" (BASE). The BASE mechanism is an anti-spoofing protocol designed to fulfill the incremental deployment properties necessary for adoption in current Internet environments. Based on simulations we ran using a model of Internet AS connectivity, BASE shows desirable IP spoofing prevention capabilities under partial deployment. We find that just 30% deployment can drop about 97% of attack packets. Therefore, BASE not only benefits adopters but also outperforms previous anti-spoofing mechanisms.
|
|
Heejo Lee, Jong Kim, Wanyeon Lee, "Resiliency of Network Topologies under Path-Based Attacks", IEICE Trans. on Communications, Vol. E89-B, No. 10, pp. 2878-2884, Oct. 2006. | |
Network topology has no direct effect on the correctness of network protocols; however, it influences the performance of networks and their survivability when they are under attack. Recent studies have analyzed the robustness of the Internet in the face of faults or attacks which may cause node failures. However, the effect of link failure or a series of link failures has not been extensively examined, even though such a situation is more likely to occur in the current Internet environment. In this paper, we propose an attack-and-failure graph model and practical techniques for attacking strategies against nodes, edges or paths in order to reflect real-life attack scenarios. The resiliency of Internet topologies is examined under the attacking strategies, with various metrics including path-failure ratio and
|
|
Heejo Lee, Jong Kim, Sungje Hong, Sunggu Lee, "Processor Allocation and Task Scheduling of Matrix Chain Products on Parallel Systems", IEEE Trans. on Parallel and Distributed Systems, Vol. 14, No. 4, pp. 394-407, Apr. 2003. | |
|
|
Heejo Lee, Jong Kim, Sungje Hong, Sunggu Lee, "Task Scheduling using a Block Dependency DAG for Block-Oriented Sparse Cholesky Factorization", Parallel Computing, Vol. 29, No. 1, pp. 135-159, Jan. 2003. | |
Block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular subblocks; each block can then be handled as a computational unit in order to increase data reuse in a hierarchical memory system. Also, the factorization method increases the degree of concurrency and reduces the overall communication volume so that it performs more efficiently on a distributed-memory multiprocessor system than the customary column-oriented factorization method. But until now, mapping of blocks to processors has been designed for load balance with restricted communication patterns. In this paper, we represent tasks using a block dependency DAG that represents the execution behavior of block sparse Cholesky factorization in a distributed-memory system. Since the characteristics of tasks for block Cholesky factorization are different from those of the conventional parallel task model, we propose a new task scheduling algorithm using a block dependency DAG. The proposed algorithm consists of two stages: early-start clustering, and affined cluster mapping (ACM). The early-start clustering stage is used to cluster tasks while preserving the earliest start time of a task without limiting parallelism. After task clustering, the ACM stage allocates clusters to processors considering both communication cost and load balance. Experimental results on a Myrinet cluster system show that the proposed task scheduling approach outperforms other processor mapping methods.
|
|
Kihong Park, Heejo Lee, "On the Effectiveness of Route-Based Packet Filtering for Distributed DoS Attack Prevention in Power-Law Internets", Int'l Conf. on ACM SIGCOMM, Vol. 31, No. 4, pp. 15-26, Aug. 2001. | |
Denial of service (DoS) attack on the Internet has become a pressing problem. In this paper, we describe and evaluate route-based distributed packet filtering (DPF), a novel approach to distributed DoS (DDoS) attack prevention. We show that DPF achieves proactiveness and scalability, and we show that there is an intimate relationship between the effectiveness of DPF at mitigating DDoS attack and power-law network topology.
The salient features of this work are two-fold. First, we show that DPF is able to proactively filter out a significant fraction of spoofed packet flows and prevent attack packets from reaching their targets in the first place. The IP flows that cannot be proactively curtailed are extremely sparse so that their origin can be localized (i.e., IP traceback) to within a small, constant number of candidate sites. We show that the two proactive and reactive performance effects can be achieved by implementing route-based filtering on less than 20% of Internet autonomous system (AS) sites. Second, we show that the two complementary performance measures are dependent on the properties of the underlying AS graph. In particular, we show that the power-law structure of Internet AS topology leads to connectivity properties which are crucial in facilitating the observed performance effects.
|
|
Kihong Park, Heejo Lee, "On the Effectiveness of Probabilistic Packet Marking for IP Traceback under Denial of Service Attack", IEEE INFOCOM, Vol. 1, pp. 338-347, Apr. 2001. | |
Effective mitigation of denial of service (DoS) attack is a pressing problem on the Internet. In many instances, DoS attacks can be prevented if the spoofed source IP address is traced back to its origin, which allows assigning penalties to the offending party or isolating the compromised hosts and domains from the rest of the network. Recently, IP traceback mechanisms based on probabilistic packet marking (PPM) have been proposed for achieving traceback of DoS attacks. In this paper, we show that probabilistic packet marking, which is of interest due to its efficiency and implementability vis-à-vis deterministic packet marking and logging or messaging based schemes, suffers under spoofing of the marking field in the IP header by the attacker, which can impede traceback by the victim. We show that there is a trade-off between the ability of the victim to localize the attacker and the severity of the DoS attack, which is represented as a function of the marking probability, path length, and traffic volume. The optimal decision problem, in which the victim can choose the marking probability whereas the attacker can choose the spoofed marking value, source address, and attack volume, can be expressed as a constrained minimax optimization problem, where the victim chooses the marking probability such that the number of forgeable attack paths is minimized. We show that the attacker's ability to hide his location is curtailed by increasing the marking probability; however, the latter is upper-bounded due to sampling constraints. In typical IP internets, the attacker's address can be localized to within 2-5 equally likely sites, which renders PPM effective against single source attacks. Under distributed DoS attacks, the uncertainty achievable by the attacker can be amplified, which diminishes the effectiveness of PPM.
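The tension between a high marking probability and sampling constraints can be made concrete with a short sketch. It assumes the common overwrite-style marking model, in which every router on the path independently overwrites the marking field with probability p; the abstract does not spell out the paper's exact formulation, and the function names are illustrative.

```python
def router_mark_probability(p: float, d: int) -> float:
    """Probability that a packet reaches the victim carrying the mark of the
    router d hops away from the victim (d = 1 is the last router).

    The router must mark the packet (p) and none of the d - 1 routers
    closer to the victim may overwrite that mark.
    """
    if not 0.0 <= p <= 1.0 or d < 1:
        raise ValueError("need 0 <= p <= 1 and d >= 1")
    return p * (1.0 - p) ** (d - 1)


def spoofed_mark_probability(p: float, path_len: int) -> float:
    """Probability that no router on the path marks the packet, so the
    attacker's forged marking value arrives at the victim untouched."""
    if not 0.0 <= p <= 1.0 or path_len < 0:
        raise ValueError("need 0 <= p <= 1 and path_len >= 0")
    return (1.0 - p) ** path_len


if __name__ == "__main__":
    path_len = 20  # assumed path length, for illustration only
    for p in (0.01, 0.04, 0.2):
        spoofed = spoofed_mark_probability(p, path_len)
        farthest = router_mark_probability(p, path_len)
        print(f"p={p:.2f}  spoofed mark survives: {spoofed:.4f}  "
              f"farthest router's mark arrives: {farthest:.6f}")
```

Raising p shrinks the share of packets whose forged mark survives, but on long paths it also makes marks from the routers nearest the attacker increasingly rare, so the victim needs more packets to sample them. This is one way to read the upper bound on the marking probability mentioned in the abstract.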
|
|
Jong Kim, Heejo Lee, Sunggu Lee, "Replicated Process Allocation for Load Distribution in Fault-Tolerant Multicomputers", IEEE Trans. on Computers, Vol. 46, No. 4, pp. 499-505, Apr. 1997. | |
In this paper, we consider a load-balancing process allocation method for fault-tolerant multicomputer systems that balances the load before as well as after faults start to degrade the performance of the system. In order to be able to tolerate a single fault, each process (primary process) is duplicated (i.e., has a backup process). The backup process executes on a different processor from the primary, checkpointing the primary process and recovering the process if the primary process fails. In this paper, we formalize the problem of load-balancing process allocation and propose a new process allocation method and analyze the performance of the proposed method. Simulations are used to compare the proposed method with a process allocation method that does not take into account the different load characteristics of the primary and backup processes. While both methods perform well before the occurrence of a fault, only the proposed method maintains a balanced load after the occurrence of such a fault.
|