Preferred Name

Alex Mitchell

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

ORCID

https://orcid.org/0000-0002-0590-0812

Date of Graduation

12-17-2022

Semester of Graduation

Fall

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Department of Computer Science

Advisor(s)

Xunhua Wang

Abstract

Computer programmers often leave their individual programming styles in source code. Recent studies show that, contrary to popular belief, many such programming styles can survive, in controlled environments, compilation into binary code. From the binary, programming styles can be effectively retrieved for enhanced binary authorship attribution; such binary authorship attribution is often called binary program stylometry. In this thesis, we first perform a white-box impact analysis of various factors in code compilation on programming styles. For the MS Windows platform, we study the impact on programming styles of multiple compilers, including gcc, Clang, and MSVC, their optimization levels, symbol stripping, and the Ghidra decompiler. These factors are then ranked to provide guidance for binary code stylometry. Next, we perform binary stylometry on a set of six real-world crypto ransomware and aim for highly automated classification by leveraging the powerful open-source Ghidra decompiler. The validation accuracy reaches 42.6%. This study can serve as a first step to quickly classify binary malware before labor-intensive analysis is needed.

Available for download on Sunday, December 08, 2024
