Preferred Name

Alex Mitchell

ORCID

https://orcid.org/0000-0002-0590-0812

Date of Graduation

12-17-2022

Semester of Graduation

Fall

Degree Name

Master of Science (MS)

Department

Department of Computer Science

Abstract

Computer programmers often leave traces of their individual programming styles in source code. Recent studies show that, contrary to popular belief, many such programming styles can survive compilation into binary code in controlled environments. Programming styles can be effectively retrieved from the binary for enhanced binary authorship attribution, which is often called binary program stylometry. In this thesis, we first perform a white-box impact analysis of various code compilation factors on programming styles. For the MS Windows platform, we study the impact on programming styles of multiple compilers (gcc, Clang, and MSVC), their optimization levels, symbol stripping, and the Ghidra decompiler. We then rank these factors to provide guidance for binary code stylometry. Next, we perform binary stylometry on a set of six real-world crypto ransomware and aim for highly automated classification by leveraging the powerful open-source Ghidra decompiler. The validation accuracy reaches 42.6%. This study can serve as a first step for quickly classifying binary malware before labor-intensive analysis is needed.
