Preferred Name

Justin Carpenter

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

ORCID

https://orcid.org/0009-0000-3775-5624

Date of Graduation

5-9-2024

Semester of Graduation

Spring

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Department of Computer Science

Advisor(s)

Xunhua Wang

Michael Lam

Brett Tjaden

Abstract

Binary stylometry aims to extract features from a binary computer program and use them to identify the developers of the corresponding source code. Despite the noise introduced during compilation by the compiler, assembler, linker, and library functions, two existing machine-learning-based studies of binary stylometry have reported high success rates (Alrabaee, Shirani, Wang, Debbabi, and Hanna 2018; Caliskan, Yamaguchi, Dauber, Harang, Rieck, Greenstadt, and Narayanan 2018). In this thesis, we first observe that both studies rest on a largely benign security model and assume that the binaries used in testing and prediction are generated in the same way as the training data. As a result, such approaches would not work on binary mutants, which are generated directly from an existing binary through binary instrumentation and resemble the original. Tracing such a mutant with existing binary stylometry techniques leads to the developer(s) of the original binary program rather than to the author of the mutant, defeating the very purpose of binary attribution in many applications. Next, we examine existing general-purpose static binary instrumentation techniques for transforming existing binary programs into new, meaningful mutants for use against binary stylometry. We find them less than ideal in the binary stylometry setting. To demonstrate the practicality of binary instrumentation attacks on binary stylometry, we then instrument, with minimal changes, the security-intensive 2013 NSA Codebreaker Challenge program to strengthen its security design. The success of this effort gives us high confidence in the practicality of the attack.

Available for download on Friday, April 24, 2026
