Detecting command injection vulnerabilities in Linux-based embedded firmware with LLM-based taint analysis of library functions

Junjian Ye, Xincheng Fei, Xavier de Carné de Carnavalet, Lianying Zhao, Lifa Wu*, Mengyuan Zhang

*Corresponding author for this work

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

As IoT devices have become widespread, the security of embedded firmware has drawn increasing attention. Command injection (CI) is one of the most common types of vulnerabilities in Linux-based embedded firmware. It arises when user input is propagated to command-execution functions without strict sanitization, and it can be detected by static taint analysis. Unfortunately, single-binary taint analysis tools cannot find vulnerabilities caused by custom dynamically linked library functions (DLLFs), which are implemented in external library files, while multi-binary analysis tools are time-consuming. In this paper, we present SLFHunter, an approach that overcomes this challenge by using a Large Language Model (LLM) to analyze sensitive custom DLLFs separately and importing the results into single-binary taint analysis tools. Our approach applies filtering rules to identify sensitive DLLFs that call common sink functions, then analyzes them with LLMs to find sink library functions (SLFs), in which input parameters can be passed into executed command strings. Finally, SLFs are marked as new sinks so that existing tools can discover the CI vulnerabilities they cause. We implemented SLFHunter as a ChatGPT-based module for EmTaint and evaluated it on a dataset of 100 Linux-based embedded firmware samples from 13 vendors. The results show that our prompts can guide ChatGPT 4.0 to identify SLFs with 95% accuracy once improved with a trick we dubbed “double-check”. On our dataset, SLFHunter helps EmTaint find 42 additional CI vulnerabilities at an average added time cost of 89 s, which demonstrates the effectiveness and efficiency of our approach.
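The three steps in the abstract (filter sensitive DLLFs, classify them as SLFs, mark the SLFs as new sinks) can be illustrated with a minimal Python sketch. It is not SLFHunter's implementation: the sink list, the `LibFunction` record, the function names in the sample data, and the `verdicts` table that stands in for the LLM's answers are all assumptions made for illustration, and the "double-check" step is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Assumed set of common command-execution sinks; the paper's exact list may differ.
COMMON_SINKS = {"system", "popen", "execl", "execlp", "execv", "execve", "execvp"}


@dataclass
class LibFunction:
    """Hypothetical record for one function exported by a shared library."""
    name: str
    callees: set[str]      # names of functions this function calls
    pseudocode: str = ""   # decompiled body that would be sent to the LLM


def filter_sensitive(functions: Iterable[LibFunction],
                     sinks: set[str] = COMMON_SINKS) -> list[LibFunction]:
    """Filtering step: keep only functions that call a common sink."""
    return [f for f in functions if f.callees & sinks]


def find_slfs(functions: Iterable[LibFunction],
              is_slf: Callable[[LibFunction], bool]) -> set[str]:
    """Classification step: query the classifier only for sensitive functions."""
    return {f.name for f in filter_sensitive(functions) if is_slf(f)}


# Illustrative sample data (hypothetical function names and bodies).
exports = [
    LibFunction("doSystemCmd", {"vsnprintf", "system"},
                "vsnprintf(buf, 256, fmt, ap); system(buf);"),
    LibFunction("restart_httpd", {"system"},
                'system("killall httpd");'),
    LibFunction("str_trim", {"strlen"}, "..."),
]

# Stand-in for LLM answers: can an input parameter reach the command string?
verdicts = {"doSystemCmd": True, "restart_httpd": False}


def stub_llm(func: LibFunction) -> bool:
    """Placeholder for the LLM query; looks up a fixed verdict."""
    return verdicts[func.name]


slfs = find_slfs(exports, stub_llm)

# Final step: the taint analysis tool treats SLFs as additional sinks.
extended_sinks = COMMON_SINKS | slfs
```

Here `str_trim` is discarded by the filter and never reaches the classifier, `restart_httpd` is queried but rejected because its command string is constant, and only `doSystemCmd` is added to the sink set. In a real setting `stub_llm` would be replaced by a prompt-based query over the decompiled function body.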

Original language: English
Article number: 103971
Pages (from-to): 1-12
Number of pages: 12
Journal: Computers and Security
Volume: 144
Early online date: 27 Jun 2024
Publication status: Published - Sept 2024

Bibliographical note

Publisher Copyright:
© 2024 Elsevier Ltd

Keywords

  • Command injection
  • Embedded firmware security
  • Large Language Model
  • Static taint analysis
