Scalable data structure detection and classification for C/C++ binaries

Research output: Contribution to Journal › Article › Academic › peer-review


Many existing techniques for reversing data structures in C/C++ binaries are limited to low-level programming constructs, such as individual variables or structs. Unfortunately, without detailed information about a program's pointer structures, forensics and reverse engineering are exceedingly hard. To fill this gap, we propose MemPick, a tool that detects and classifies high-level data structures used in stripped binaries. By analyzing how links between memory objects evolve throughout the program execution, it distinguishes between many commonly used data structures, such as singly- or doubly-linked lists, many types of trees (e.g., AVL, red-black trees, B-trees), and graphs. We evaluate the technique on 10 real-world applications, 4 file system implementations, and 16 popular libraries. The results show that MemPick can identify the data structures with high accuracy.
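To make the idea of classifying a structure from the links between memory objects concrete, here is a minimal sketch. It is not MemPick's algorithm: it looks at a single snapshot rather than at how links evolve during execution, and the function name `classify_shape`, the `(source, field_offset, target)` tuple format, and the label strings are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def _reachable(start, succ):
    """Return the set of nodes reachable from `start` via `succ` edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in succ.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def classify_shape(links):
    """Classify one snapshot of pointer links between heap objects.

    `links` is an iterable of (source, field_offset, target) tuples.
    This tuple format is hypothetical; a real tool would derive such
    edges from a memory trace of the stripped binary.
    """
    edges = {(src, dst) for src, _offset, dst in links}
    nodes = {n for edge in edges for n in edge}
    if not nodes:
        return "empty"

    succ = defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    max_out = max(out_deg.values())

    # Every link has a matching back link: candidate doubly-linked list.
    if all((dst, src) in edges for src, dst in edges):
        is_path = (
            len(edges) == 2 * (len(nodes) - 1)
            and max_out <= 2
            and _reachable(next(iter(nodes)), succ) == nodes
        )
        return "doubly-linked list" if is_path else "graph"

    # One root, every other node has exactly one parent, all reachable.
    roots = [n for n in nodes if in_deg[n] == 0]
    tree_shaped = (
        len(roots) == 1
        and all(in_deg[n] <= 1 for n in nodes)
        and _reachable(roots[0], succ) == nodes
    )
    if not tree_shaped:
        return "graph"
    if max_out <= 1:
        return "singly-linked list"
    return "binary tree" if max_out <= 2 else "n-ary tree"
```

For example, `classify_shape([(1, 8, 2), (2, 8, 3)])` yields `"singly-linked list"`. The sketch deliberately simplifies: circular lists and trees with parent pointers fall into the `"graph"` bucket, and it makes no attempt to tell AVL, red-black, or B-trees apart.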
Original language: English
Pages (from-to): 778–810
Number of pages: 33
Journal: Empirical Software Engineering
Issue number: 3
Early online date: 7 Mar 2015
Publication status: Published - Jun 2015

