When Threads Meet Interrupts: 
Effective Static Detection of Interrupt-Based Deadlocks in Linux

Chengfeng Ye  Yuandao Cai  Charles Zhang

\textit{The Hong Kong University of Science and Technology}
\{cyeaa, ycaibb, charlesz\}@cse.ust.hk

Abstract

Deadlocking is an unresponsive state of software that arises when threads hold locks while trying to acquire other locks that are already held by other threads, resulting in a circular lock dependency. Interrupt-based deadlocks, a specific and prevalent type of deadlocks that occur within the OS kernel due to interrupt preemption, pose significant risks to system functionality, performance, and security. However, existing static analysis tools focus on resource-based deadlocks without characterizing the interrupt preemption. In this paper, we introduce Archerfish, the first static analysis approach for effectively identifying interrupt-based deadlocks in the large-scale Linux kernel. At its core, Archerfish utilizes an Interrupt-Aware Lock Graph (ILG) to capture both regular and interrupt-related lock dependencies, reducing the deadlock detection problem to graph cycle discovery and refinement. Furthermore, Archerfish incorporates four effective analysis components to construct ILG and refine the deadlock cycles, addressing three core challenges, including the extensive interrupt-involving concurrency space, identifying potential interrupt handlers, and validating the feasibility of deadlock cycles. Our experimental results show that Archerfish can precisely analyze the Linux kernel (19.8 MLoC) in approximately one hour. At the time of writing, we have discovered 76 previously unknown deadlocks, with 53 bugs confirmed, 46 bugs already fixed by the Linux community, and 2 CVE IDs assigned. Notably, those found deadlocks are long-latent, hiding for an average of 9.9 years.

1 Introduction

Deadlocking \cite{7, 15, 21, 33} commonly refers to an unresponsive condition caused by circular lock dependencies, in which, each thread within a group of threads holds a lock while also attempting to acquire another lock that is already held by other threads. In the OS kernel, the interrupt preemption \cite{68} can cause additional lock dependencies, when preempted threads, which have already acquired locks, attempt to acquire additional locks within the interrupt handling procedure \cite{4}. In this paper, we refer to the circular lock dependencies related to the interrupt preemption as interrupt-based deadlocks.

There is abundant evidence that deadlocks can significantly hurt the functionality, performance \cite{5}, and even security of modern software \cite{50-55}. Furthermore, the interrupt-based deadlocks within the OS kernel can pose a more severe danger as they occur within the interrupt context \cite{43}, potentially leading to the entire unresponsive CPU core and even a whole system breakdown \cite{70}. Unfortunately, there are no effective tools for identifying interrupt-based deadlocks, as both the previous static and dynamic approaches have certain limitations.

First, existing static approaches \cite{7, 10, 14, 15, 18, 20-22, 34, 63} primarily focus on detecting resource-based deadlocks \cite{15} caused by thread interleaving, thus overlooking interrupt-based deadlocks. Specifically, most past static analysis tools \cite{14, 15, 18, 34, 35} target deadlock detection in userland software without considering interrupt preemption, a unique feature of the OS kernel. Also, although a few tools \cite{7} aim to detect deadlocks in OS kernel, they still use traditional thread interleaving models by directly treating interrupts as threads, missing lock dependencies caused by interrupt preemption.

Second, dynamic approaches \cite{11, 13, 17, 30, 31, 40, 57, 84, 85} often suffer from low coverage and reliance on the runtime execution environment such as external devices. Specifically, although dynamic fuzzing can detect such deadlocks with the assistance of Lockdep \cite{73}, the built-in Linux kernel lock validator, they require high-quality seed inputs or configurations to guide the execution towards a faulty path. Besides, the interrupt preemption \cite{68} significantly enlarges the exploration space of dynamic concurrent execution, degrading the effectiveness. Moreover, the dynamic triggering of interrupts heavily depends on signals generated by hardware devices, which sometimes are hard to be properly estab-
lished or available [60].

We use an example shown in Figure 1(a), which was introduced in Linux kernel [74] v3.7-rc1 eleven years ago [41] and discovered by us. The example shows an interrupt-based deadlock caused by circular lock dependencies between two threads, Thread A and Thread B. First, in the left-hand side of Figure 1(a), Thread A acquires &dev->irqlock ① at Line 184 and then calls the function clear_work_bit() at Line 195, which further acquires &dev->condlock ② at Line 52. This regular execution flow, establishes a regular lock dependency from &dev->irqlock ① to &dev->condlock ②. Second, on the right-hand side of Figure 1(a), Thread B, in the callback function enc.post.frame_start(), also calls clear_work_bit() at Line 300 and acquires &dev->condlock ②. Since the interrupt handler s5p_mfc_irq() is not disabled at that moment, there is a possibility of preemption by the interrupt handler between Line 52 and Line 54. If the interrupt handler acquires &dev->irqlock ① during this preemption, it creates an interrupt lock dependency from &dev->condlock ② to &dev->irqlock ①. Note that this interrupt lock dependency cannot be captured by existing static methods [7, 14, 15] without characterizing interrupt preemption. Consequently, in certain concurrent scheduling, an interrupt-based deadlock may occur, as depicted in Figure 1(b), resulting in a system halt.

Our Approach. In this work, we present Archerfish, the first static analysis approach to characterize the interrupt preemption to detect the interrupt-based deadlocks in the Linux kernel. At the core, Archerfish is empowered by the extended notion of an Interrupt-Aware Lock Graph (ILG), which captures both kinds of lock dependencies in the Linux kernel, namely regular lock dependencies and interrupt lock dependencies, caused by regular execution flow of threads and interrupt preemption, respectively. By effectively constructing the ILG, Archerfish can reduce the detection of interrupt-based deadlocks to the discovery and validation of dependence cycles. For example, the deadlock in Figure 1(a) can be detected by finding the cycle shown in Figure 1(c).

However, there are three core challenges of effectively analyzing the Linux kernel to identify the lock dependencies and refine deadlock cycles. (1) First, identifying the Interrupt Service Routines (ISRs) is challenging without domain-specific knowledge, as different subsystems in the Linux kernel often have their own unique APIs and callback interfaces for registering ISRs [71]. (2) Second, the interrupt preemption enlarges the concurrency reasoning space of static analysis. As each interrupt-enabled program location can be potentially interrupted [68], it is necessary to consider preemption at each statement to capture all possible interrupt-induced lock dependencies. However, the large concurrency reasoning space can significantly degrade the effectiveness of static analysis. (3) Third, it is crucial to identify feasible lock dependencies, taking into account the feasibility of both interrupts and program paths. However, for example, directly validating the path-feasibility of numerous lock dependencies using SMT solving [82] may cause low efficiency.

To address the three challenges, we design four stages shown in Figure 2 with several tailored techniques.

- First, to address the first challenge, Archerfish...
harnesses the power of Large Language Models (LLMs) to generate interrupt specifications that complement manually modeled ones for identifying interrupts. This is because LLMs have been trained on a vast amount of online textual data, including information related to the Linux kernel [61].

- Second, we employ a summary-based data-flow analysis to compute the lockset [32,81] and the state of interrupt enabling or disabling at relevant statements. To tackle the second challenge of the vast reasoning space of the interrupt preemption, we introduce the notion of a preemption unit, clustering all statements within a critical section. Our key idea is that a critical section [78] shares the same lockset so that any interrupts occurring within the region hold the same set of locks. Thus, separately computing each individual program point within the region is unnecessary during lockset analysis.

- Third, using the computed data-flow facts, Archerfish constructs the ILG and captures both kinds of lock dependencies as graph edges. In particular, it identifies dependency cycles within the ILG, each representing an interrupt-based deadlock.

- Fourth, to overcome the third challenge of expensive deadlock feasibility checking, our key idea is that not all lock dependencies are related to deadlock detection. Thus, we lazily check the feasibility of both interrupts and paths, deferring the costly validation for each discovered cycle instead of early validating each edge during ILG construction.

We have implemented the Archerfish prototype based on the LLVM framework [36] and evaluated it on the Linux kernel v6.4. Archerfish can analyze the Linux kernel (19.8 MLoC) in about one hour with a relatively low positive (49.7%). More excitingly, we have found 76 new and long-latent interrupt-based deadlocks in the Linux kernel and reported all of them to the Linux community. In total, 53 bugs have been confirmed by developers, and 46 of them have already been fixed in the mainstream Linux kernel with 2 CVE IDs assigned.

**Contribution.** We make the following contributions:

- We propose the notion of the Interrupt-Aware Lock Graph (ILG), which captures both kinds of regular and interrupt-related lock dependencies as edges, enabling the reduction of interrupt-based deadlock detection via graph cycle discovery and refinement.

- We present Archerfish, the first static interrupt-based deadlock detection for the Linux kernel, addressing three specific challenges of effectively constructing ILG and refining cycles.

- We evaluate Archerfish on Linux kernel v6.4, where it revealed 76 new interrupt-based deadlocks that across various subsystems in the Linux kernel.

![Figure 2: The design of Archerfish framework.](image-url)

2 Background

We present the interrupts’ concept, the interrupt-based deadlocks, the limitations of existing works, and the threat model in § 2.1, § 2.2, § 2.3, and § 2.4, respectively.

2.1 Interrupts in Linux kernel

**Interrupt Handling.** An interrupt [43] in the Linux kernel is a special event that alters the regular execution flow of a program, and such action is also called interrupt preemption [68]. In contrast, the action that one thread switches to another thread is called thread preemption [72]. Their major difference is that interrupt preemption is asymmetric, indicating that interrupt runs in higher priority and cannot switch back to the previously preempted execution flow until the interrupt finishes its execution [43]. Thus, interrupt preemption could introduce new lock dependencies and cause deadlocks [2].

**Interrupt Priority.** Interrupts in the Linux kernel have two categories: hardware interrupts [42] and software interrupts [44], also known as hardirqs and softirqs, respectively. Hardirqs are triggered by interrupt signals from hardware devices and execute under the context called the hardirq context. In recent Linux kernel, local interrupt is disabled inside hardirq context by default, so hardirqs normally run with the highest priority and cannot be preempted by any other execution units, including other hardirqs [43].

Because hardirqs should execute quickly in response to hardware signals, time-consuming tasks would be deferred to softirqs [44]. Softirqs run in the execution context called the softirq context with lower priority than the hardirq context and can be preempted by hardirqs [44]. Nevertheless, both the hardirqs and softirqs belong to
interrupts and have higher priority than normal kernel threads executing under the process context [23]. The priority-based preemption relationship leads to the specific concurrency model in the Linux kernel. Therefore, any static analyzer that aims to reason about interrupt preemption should consider this relationship.

**Interrupt Service Routine (ISR) Registration.** An ISR [3] is a function in response to interrupt signals from hardware devices or software interrupt. In the Linux kernel, an ISR is typically registered in two ways. First, an ISR can be registered by being assigned to a structure field. In Figure 3(a), function ks_wlan_start_xmit() is registered as a softirq ISR by being assigned to the net_device_ops.ndo_start_xmit structure field, a callback interface responsible for handling network packet transmission [71]. Second, an ISR can be registered by being passed as an argument to a specific API call. For example, as shown in Figure 3(b), the function do_cio InterruptedException() is registered as a hardirq ISR for the interrupt line IO_INTERRUPT using the standard API request_irq() [3]. The large codebase of Linux contains diverse subsystems, each with its own interfaces or APIs for ISR registration. The variation presents a challenge for identifying ISRs and capturing interrupt preemption in static deadlock detection.

### 2.2 Interrupt-Based Deadlocks

Distinguishing interrupt-based deadlocks from other types of deadlocks [14, 15] is the presence of lock dependencies introduced by interrupt preemption [68]. Specifically, interrupt preemption is an asymmetric process [43], in which a preempted thread is required to wait until the ISR completes before it can proceed. Consequently, any locks held by the preempted thread cannot be released until the ISR finishes its execution. When the ISR itself attempts to acquire locks, it can give rise to lock dependencies, potentially resulting in cyclic lock acquisitions.

For example, consider Figure 4(a), in which lock b is held by thread T2 and the thread is preempted by an ISR that acquires another lock a. In this scenario, the interrupt lock dependency b ← a appears, as T2 cannot release lock b until the ISR finishes. When considering the regular lock dependency a → b introduced by T1 together, an interrupt-based deadlock could happen under the specific thread and the interrupt interleaving.

Note that the thread switching cannot introduce the lock dependencies (e.g., b → a) as interrupt preemption. This is because if a thread is holding a lock and is switched out, it could be switched back at any time to release the lock it acquired. This paper refers to the lock dependencies introduced by interrupt preemption as interrupt lock dependencies and deadlocks that involve interrupt lock dependencies as interrupt-based deadlocks.

### 2.3 Limitations of Existing Static Detectors

Existing static deadlock detection techniques [7, 14, 15, 18, 34, 35] do not consider the interrupt lock dependencies and model interrupts in the same way as threads. As a result, they cannot detect interrupt-based deadlocks. Consider the example program in Figure 4(b), where the interrupt in Figure 4(a) is modeled as thread T3 by the existing deadlock detectors and then the interrupt preemption is not captured to identify the lock dependency b ← a. Consequently, the existing static deadlock detectors, which model interrupts as threads, are unable to detect deadlock situations that arise from interrupts.

Some works [38, 75] detect other specific interrupt-related concurrency bugs like data race [46, 75, 80] or atomicity violation [28, 38]. However, the challenges associated with modeling and detecting these types of bugs differ from those encountered in deadlock detection. Therefore, their approaches cannot be directly applied to identify interrupt-based deadlocks effectively.
2.4 Security Impacts of Interrupt-Based Deadlocks in the Linux Kernel

**Threat Model.** We characterize the deadlock vulnerability as potential lock acquisitions that can be exploited by local or remote adversaries. These adversaries have the capability to trigger specific deadlock-introducing ISRs through various means, such as remote network packets, local system call execution, or physical hardware access, manipulating the control flow. When these deadlock-introducing ISRs in the system are periodically activated and happen to preempt the execution of code involving vulnerable and cyclic lock acquisitions, a deadlock situation can arise. Once this deadlock is exploited, the corresponding ISRs can seize CPU cores, causing them to remain indefinitely stuck [2, 4, 5] and leading to issues like lockup [70] or system crashes. A concrete instance of the deadlock (CVE-2023-0160) [56] was reported in the BPF subsystem, allowing low-privileged local attackers to exploit the deadlock and crash the system.

Conventional deadlocks primarily impact blocked threads, allowing other threads to continue execution [7, 15]. In contrast, interrupt-based deadlocks pose a more severe impact by monopolizing the CPU with deadlock-introducing ISRs, resulting in a complete blockage of the entire system’s execution and halting further progress.

3 Archerfish in a Nutshell

We present the problem formulation, the challenges, and our solutions to interrupt-based deadlock detection.

3.1 Problem Formulation and Challenges

In essence, interrupt-based deadlocks are cyclic lock dependencies that occur due to interrupt preemption. To identify and address these deadlocks, we introduce ILG by extending the conventional notion of lock graphs [15, 33], which incorporate both interrupt lock dependencies and regular lock dependencies as lock edges. As a result, the problem of interrupt-based deadlock detection is formulated as the construction of ILG and the discovery of feasible deadlock cycles within the graph. However, three challenges hinder the precise and efficient ILG construction and deadlock cycle validation.

**C1: Lack of ISR Registration Specifications.** Various methods exist to register interrupt handlers in Linux, a complex and feature-rich OS kernel containing many subsystems [74]. As shown in Figure 3 (a), without any domain-specific knowledge, it is difficult to identify that net_device_ops.ndo_start_xmit is a callback interface executed under the softirq context. However, identifying functions executed under interrupt contexts is the prerequisite for capturing interrupt preemption and interrupt lock dependencies. Thus, lacking domain-specific knowledge about ISR registration poses a significant challenge for interrupt-based deadlock detection.

**C2: Large Interrupt-Involving Reasoning Space.** The concurrency reasoning space of thread interleaving already grows exponentially with the number of statements [19, 65], and this complexity is further amplified by the presence of interrupt preemption. In particular, efficiently and precisely capturing lock dependencies becomes challenging when considering the interrupt enabling/disabling state at each statement, and the state could vary under different calling contexts, which require expensive context-sensitive analysis [79]. For example, in the Figure 1(a), clear_work_bit() could be preempted by the ISR s5p_mfs_irq() when called by function enc_post_frame_start(), but not when called by function s5p_mfc_watchdog_worker() because hardirq is already disabled by spin_lock_irqsave() at Line 184. Worse still, the large codebase of the Linux kernel also further complicates the interruption-involving concurrency reasoning space.

**C3: Validation of Feasible Lock Dependencies.** Finally, not all lock dependencies are feasible during runtime execution. First, the interrupt preemption leading to some lock dependencies may be infeasible as the preemption cannot happen before the corresponding ISR is registered [3]. Second, the path conditions of certain lock dependencies could also be infeasible. However, the large number of lock dependencies to validate and the high overhead (e.g., SMT solving [82]) of feasibility checking make efficient validation challenging.

3.2 Four Core Stages in Archerfish

The architecture of Archerfish is shown in Figure 2, consisting of four main stages to address these challenges.

**S1: Interrupt Service Routine Identification (§5.1).** Public LLMs like ChatGPT-4.0 are trained with vast online textual data, including those of Linux kernel [61]. Thus, we can consider LLMs as Linux experts and extract domain-specific knowledge about subsystem ISR registration from them. The subsystem specifications generated by LLMs would complement a set of standard ISR registration specifications modeled by humans.

**S2: Lockset and Interrupt-State Analysis (§5.2).** To effectively reason about the interrupt preemption and capture interrupt lock dependencies, our basic idea is that statements inside the same critical section share the same lockset. Consequently, instead of enumerating the possibility of interrupt preemption at each program location, we treat all statements in a critical section as a preemption unit, calculating a unified state of interrupts for the unit and reducing the preemption reasoning space. Furthermore, we propose a compositional and summary-
based data-flow analysis to compute the lockset and interrupt state at interesting statements.

**S3: ILG Construction (§5.3).** By inspecting the computed data-flow facts, both Interrupt Lock Edge (ILE) and Regular Lock Edge (RLE) are captured to construct an ILG. Specifically, we capture RLEs by examining the lockset on statements of interest, while we capture ILEs by pairwise checking between several pre-computed summaries on preempted threads and ISRs. After constructing ILG, a standard cycle detection algorithm [29] is run to identify potential deadlock cycles.

**S4: Deadlock Cycle Validation (§5.4).** Instead of performing heavyweight feasibility checking on each identified lock dependency edge, we delay the analysis by only performing feasibility checking on those forming a deadlock cycle. At this stage, Archerfish performs happen-before analysis [45, 83] regarding ISR registration to examine the preemption feasibility and path-feasibility checking on each potential deadlock-leading dependency edge. Finally, deadlock reports are generated on validated interrupt-based deadlock cycles.

### 4 Preliminary

In this section, we introduce the definitions used in the paper and formulate the problem we aim to solve.

**Abstract Domain.** Figure 5 shows the basic abstract domain, where the symbol \( s \in S \) represents each statement in a standard LLVM-like language. Specifically, lock- and interrupt-related API call sites are modeled during static analysis. The symbol \( o \in \mathcal{O} \) represents each lock object, and a cyclic dependency among them forms a deadlock. The symbol \( t \in T \) represents a kernel thread executing under the process context, while \( isr \in \mathbb{R} \) represents an ISR [3] executing under the interrupt context, including both hardirq and softirq. Note that identification of ISRs is important for Archerfish to capture interrupt lock dependencies and perform deadlock detection. We give more details of ISR identification in §5.1.

The \( LS \) and \( RS \) represent the lockset and the interrupt states computed by typestate analysis. Lockset \( LS \) maintains whether a specific lock \( o \in \mathcal{O} \) is Acquired, Released or its state is Unknown (⊥). Interrupt state \( RS \) maintains whether a specific \( isr \in \mathbb{R} \) is Enabled, Disabled or its state is Unknown (⊥). To reach flow-sensitivity, we compute a unique lockset \( LS \) and interrupt state \( RS \) at each statement \( s \in S \). For example, at Line 186 of the program in Figure 1(a), we identify that \( LS = (irqlock, A) \) and \( RS = (s5p.mfc.irq(1), D) \), indicating \( irqlock \) is Acquired and \( s5p.mfc.irq() \) is Disabled. Both \( LS \) and \( RS \) are derived from a compositional typestate analysis described in §5.2, forming the foundation of deadlock detection.

Next, we formally define the notion of ILG.

**Statements**

\[ s \in S \]

**Locks**

\[ o \in \mathcal{O} \]

**Threads**

\[ t \in T \]

**Interrupt Handlers**

\[ isr \in \mathbb{R} \]

**Lockset**

\[ LS := S \to \{ D, TY \} \]

**where**

\[ TY = \{ Acquired, Released, ⊥ \} \]

**Interrupt State**

\[ RS := S \to \{ R, ST \} \]

**where**

\[ ST = \{ Enabled, Disabled, ⊥ \} \]

---

**Figure 5:** Basic abstract domain.

**Definition 1.** The Interrupt-Aware Lock Graph (ILG) is defined as a three-tuple, \( G = (N, E_S, E_I) \).

- \( N \) denotes the set of vertices in ILG, where each vertex represents a unique lock object \( o \).
- \( E_S \) represents a set of regular lock edges with the form \( o_1 \rightarrow o_2 \), indicating that during thread or ISR execution, lock \( o_2 \) could be acquired with \( o_1 \) held.
- \( E_I \) represents a set of interrupt lock edges with the form \( o_1 \rightsquigarrow o_2 \), indicating that at certain program points, lock \( o_1 \) is held while an ISR preempts the execution and subsequently acquires \( o_2 \).

At a high level, interrupt lock edges are constructed by pairwise matching between preemption units and summaries of lock acquisition inside ISRs, both of which are constructed in §5.2. We give more details of ILG construction in §5.3. With the definition of ILG, we further define the concept of Interrupt-Based Lock Cycle (ILC).

**Definition 2.** An Interrupt-Based Lock Cycle (ILC) is a cycle on ILG where at least one of the edges on the cycle belongs to \( E_I \), representing a potential deadlock introduced by interrupt preemption.

Figure 1(c) shows an example of ILC constituting an interrupt lock edge \( ○ \rightsquigarrow ○ \) and a regular lock edge \( ○ \rightarrow ○ \). Note that an ILC does not necessarily represent a real interrupt-based deadlock in practice, so further validation is required to reduce false positives. We give a detailed illustration in §5.4. In addition, this paper focuses on detecting interrupt-based deadlocks, despite Archerfish being capable of detecting conventional deadlocks.

### 5 The Approach of Archerfish

In this section, we detail each core stage in Archerfish.

#### 5.1 ISR Identification

Prompt queries with LLVMs allow us to automatically extract ISR registration specifications in Linux subsystems, which supplement manually modeled standard specifications with a complete list shown in appendix A.3.
In the Linux kernel, is a function registered by interface typically executed under “process context”, “hardirq context” or “softirq context”?

(a) Prompt template for ISR registration interface

In the Linux kernel, is a function registered by API argument typically executed under “process context”, “hardirq context” or “softirq context”?

(b) Prompt template for ISR registration API

Figure 6: Prompt templates for LLM-assisted specs.

To construct a prompt query, Archerfish performs static analysis to trace each address-taken function along its def-use chain until reaching one of the two cases:

- **Registration Interface**: If the function pointer $fptr$ is used as $*p = fptr$, we take the type structure of $p$ as a potential ISR registration interface. The type would be used to construct a prompt query via the example template shown in Figure 6(a) to determine under which context the interface is executed.

- **Registration API**: If the function pointer $fptr$ is used as call $bar(..., fptr,...)$, we extract the callee function name and its argument index, treating them as a potential ISR registration API. Similarly, this information would be used to construct a prompt query via the template in Figure 6(b).

LLMs can act as a classifier for the execution context of queried APIs or interfaces. With such specifications, we can identify whether a function is an ISR or a kernel thread by inspecting whether it is passed to these APIs or interfaces. Specifically, if an address-taken function is classified as hardirq or softirq, it would be identified as an $isr \in \mathbb{R}$, otherwise it would be identified as $t \in \mathbb{T}$.

**Example 5.1.** To determine whether the function $\text{ks wlan_start_xmit()}$ in Figure 3(a) is an ISR or a thread, Archerfish traces its def-use chain and reaches the structure field $\text{net_device_ops.ndo_start_xmit}$. This structure field is an interface commonly used by Linux network developers for $\text{NET_TX_SOFTIRQ}$ softirq registration. In our experiment, LLM can correctly recognize that it is executed under the softirq context.

**5.2 Lockset and Interrupt-State Analysis**

Next, we perform a flow- and context-sensitive type-state analysis for computing lockset and interrupt states. Specifically, Archerfish runs forward data-flow analyses for each function following the reverse topological order on Call Graph (CG) [12, 16] and computes function summaries. These summaries can be inline by callers to achieve efficient inter-procedural context-sensitive analysis and be used for efficient ILG construction.

**5.2.1 Intra-Procedural Data-Flow Analysis**

At the entry of an analyzed function, the state of each lock $o$ and $isr$ are initialized as $\bot$. Next, Archerfish proceeds forward along the control flow graph to analyze each instruction sequentially until the function returns. The state at each instruction is initialized as a merge of states at its predecessors, and then the corresponding abstract operation shown in Figure 7 is used. A concrete list of modeled APIs can be found in appendix A.2.

Next, we provide a detailed illustration of the abstract operations. Note that the strong update [25], denoted by the $\leftrightarrow$ operator, is applied to achieve flow-sensitivity.

- **Lock Acquisition and Release**: At each lock-acquisition API call $\text{lock(o)}$, Archerfish applies (1) $o \leftrightarrow A$ on the current lockset, indicating the state of lock $o$ is updated as Acquired. A similar operation (2) is applied at each lock-release API call $\text{unlock(o)}$.

- **Interrupt Disabling and Enabling**: At each ISR-disabling API call $\text{disable_irq(isr)}$, Archerfish performs the operation (3) $isr \leftrightarrow D$ on the current interrupt state to mark $isr$ as Disable. The opposite operation (4) is performed at ISR-enabling API calls.

- **Merging**: The (5) merge operation first retrieves the lockset and interrupt states at predecessor in-
Preempt Summary \( M_1 := (O, S) \rightarrow (R, ST) \),
where \( ST = \{ E, D \} \)

Locking Summary \( M_2 := (O, S) \rightarrow O \)
Return Summary \( M_3 := (RS, LS) \)

Figure 8: Three function summaries

structions, and then performs \( LS_{merge} \) and \( RS_{merge} \) to conservatively merge these states. Specifically, \textit{may} analysis is used to determine whether a lock \( o \) could be \texttt{Acquired} or whether an \texttt{isr} could be \texttt{Enabled}.

**Example 5.2.** If the typestate from one incoming branch is \( \{(o, A), (isr, \perp)\} \) and that from the opposite branch is \( \{(o, R), (isr, E)\} \), the merged results would be \( \{(o, A), (isr, E)\} \), indicating lock \( o \) could be acquired and \texttt{isr} could be enabled at the branches confluence.

### 5.2.2 Function Summary Construction

Figure 8 shows the definition of three function summaries constructed along the data-flow analysis.

- **Preemption Unit Summary** (\( M_1 \)): This summary forms the units of interrupt preempt. Each summary item represents a critical section, denoted by a lock object and a lock statement \((o, s)\). The item also maintains a set of conservative interrupt states, representing the set of ISRs that could preempt this critical section. To do so, we record \texttt{must-Disabled} and \texttt{may-Enabled} state of each already analyzed ISR.

- **Lock Acquisition Summary** (\( M_2 \)): This summary represents lock acquisitions inside a function. Each summary item is also denoted by a lock object and a lock statement \((o, s)\). Different from \( M_1 \), the item in \( M_2 \) records a set of locks that \texttt{must} have been \texttt{Released} before this lock acquisition.

- **Return State Summary** (\( M_3 \)): This summary catches the lockset and interrupt states at the end of a function, which can be inlined by caller functions to update the typestate at call sites.

**Summary Construction.** Algorithm 1 shows how to construct the three summaries. First, for each encountered lock-acquisition API call \( s : lock(o) \), we construct a new item of \( M_1 \), recording the set of ISRs that \texttt{must} have been \texttt{Disabled} and cannot preempt the current critical section (Line 3). Similarly, we construct a new item of \( M_2 \), recording the set of locks \texttt{must} have been \texttt{Released} before lock \( o \) is acquired (Line 4). Second, for each ISR-enabling API call \( s : enable_{irq}(isr) \) (including those inlined from callee’s summary), we first retrieve the set of locks \((o, s')\) guarding \( s \), and update corresponding items of \( M_1 \) to mark that \texttt{isr} may be \texttt{Enabled} inside the critical sections (Lines 5-7). Finally, at the function return, \( LS \) and \( RS \) would be cached as \( M_3 \) (Lines 8-9), used by callers to update typestates at call sites.

### 5.2.3 Function Summary Inlining.

At a call site of an already analyzed function, constructed summaries inside the callee function can be inlined by the caller for context-sensitive inter-procedural analysis. Algorithm 2 shows the detail of the inlining process.

**Summary Inlining.** Archerfish first retrieves the three summaries \( M_1', M_2', M_3' \) constructed in callee function (Line 2). Before inlining the summary, we take two steps. First, if some ISRs have already been disabled at the call site and are not enabled inside the callee’s preempt unit, Archerfish will update the summary item in \( M_1' \) by marking the state as \texttt{Disabled} (Lines 3-5). Second, some locks have already been \texttt{Released} before the call site, so Archerfish needs to insert these released locks into each item of callee’s \( M_2' \) (Lines 6-7). After these, callee’s \( M_1' \) and \( M_2' \) would be inlined (Line 8), and the current lockset and interrupt states are updated by callee’s Return State Summary \( M_3' \) (Line 9).

**Example 5.3.** Figure 9 shows partial analysis results of Figure 1. Once \texttt{clear_work_bits()} is analyzed, its summary \( M_1 \) contains a preemption unit \( \{condlock, s_{22} : \emptyset\} \), indicating that the critical section \( condlock, s_{22} \) does not execute with any ISR explicitly enabled or disabled. Next, the summary is inlined by two caller
5.3 Interrupt Lock Graph Construction

In this stage, Archerfish proceeds to perform ILG construction. The core algorithm is shown in Algorithm 3.

**Algorithm 3: ILG Construction**

```c
1. Function InterruptLockEdgeAnalysis():
2.    foreach $f \in T \cup R$, $isr \in R$ do
3.        if $isr$ has higher priority than $f$ then
4.            retrieve $M_4$ from $f$;
5.            foreach $(o, s) \in M_4$ do
6.                if $(isr, D) \in M_1[(o, s)]$ then
7.                    continue;
8.            retrieve $M_2$ from $isr$;
9.            foreach $(o', s') \in M_2$ do
10.               add $o \sim o'$

11. Function RegularLockEdgeAnalysis($s$):
12.    case $s = lock(o)$ do
13.        foreach $(o', A) \in LS[s]$ do
14.            add $o' \rightarrow o$
15.    case $s = call bar()$ do
16.        retrieve $M_2$ from $bar$;
17.        foreach $o : (o, A) \in LS[s]$ do
18.            foreach $(o', s') \in M_2$ do
19.                if $o \notin M_2[(o', s')]$ then
20.                    add $o \sim o'$
```

**Interrupt Lock Edge Construction.** Interrupt lock dependencies are captured as interrupt lock edges $(E_I)$. As shown in InterruptLockEdgeAnalysis(), pairwise matching is run between each preemption unit $(M_4)$ of lock $o$ inside a function $f$ and each higher priority isr (Lines 2-5). If the isr is not disabled inside the preemption unit, for each lock acquisition $(M_2)$ of $o'$ inside isr, an interrupt lock edge $o \sim o'$ is constructed (Lines 6-10).

**Regular Lock Edge Construction.** Regular lock dependencies are captured as regular lock edges $(E_R)$. As shown in RegularLockEdgeAnalysis(), for a lock statement on lock $o$ and each already acquired lock $o'$, a regular lock edge $o' \rightarrow o$ is constructed (Lines 12-14); while for a call statement, Archerfish performs pairwise matching between each already acquired lock $o$ and each acquisition of lock $o'$ inside the callee function. If $o$ is not released before $o'$ is acquired inside the callee, a regular lock edge $o \rightarrow o'$ is constructed (Lines 15-20).

Once ILG is constructed, we use a standard cycle-detection algorithm [29] to identify an initial set of ILCs.

**Example 5.4.** In Figure 9, by running pairwise matching between summaries $M_1$ of enc_post_frame_start() and $M_2$ of s5p_mfc_irq(), we can construct an interrupt lock edge $condlock \sim irqlk$. By performing pairwise matching between $M_2$ of clear_work_bit() and $LS$ of $s5p_mfc_watchdog_worker()$ at $s_{95}$, we can construct a regular lock edge $irqlk \sim condlock$. As a result, the two lock edges form an ILC, as shown in Figure 1(c).

5.4 Lazy Deadlock Cycle Validation

Finally, each detected ILC is validated in terms of both the happen-before relationship and path feasibility.

5.4.1 Interrupt Happen-Before Validation

Synchronization between threads and ISRs should be characterized to ensure a true deadlock. Specifically, since interrupt preemption cannot happen before it is registered, all statements before ISR registration cannot be preempted. To this end, we perform a static happen-before analysis to validate the feasibility of an interrupt lock edge $o \sim o'$ in terms of ISR registration. Specifically, we perform a forward search from the last statement of the preemption unit of $o$, following the context-sensitive control flow graph. Once the registration statement of the ISR acquiring $o'$ is found during the graph traversal, the interrupt lock edge is considered invalid.
\[ \Phi(\circ) = \bigwedge_{e \in E} \Phi_S(e) \land \bigwedge_{e' \in E} \Phi_I(e') \]

\[ \Phi_S(e) = \bigvee_{p \in P(T_{entry}, s)} \Phi(p) \land \bigvee_{p' \in P(s, s')} \Phi(p') \quad (1) \]

\[ \Phi_I(e) = \forall s \in S_p, \bigvee_{p \in P(T_{entry}, s)} \Phi(p) \land \bigvee_{p' \in P(s, s')} \Phi(p') \quad (2) \]

Figure 10: Path-feasibility constraints of an ILC

5.4.2 Interrupt Path-Feasibility Validation

Figure 10 shows the path constraints \( \Phi(\circ) \) of an ILC (denoted by \( \circ \)), which consist of two parts: path constraints on regular lock edges, denoted as \( \Phi_S(e) \) and the path constraints on interrupt lock edges, denoted as \( \Phi_I(e') \). The Z3 SMT solver is used to determine path feasibility.

Regular Lock Edge. At a high level, the path constraints \( \Phi_S(e) \) of a regular lock edge represent that at least one path starting from the entry point could lead to lock dependency \( o \rightarrow o' \). To do so, we first collect the disjunctive path conditions of each path from the entry to the first lock acquisition site \( s \) of \( o \), as shown in Figure 10 (1). Second, we collect the disjunctive path conditions of each path from \( s \) to the second lock acquisition site \( s' \) of \( o' \), as shown in Figure 10 (2).

Interrupt Lock Edge. The path condition \( \Phi_I(e) \) of an interrupt locking edge \( o \rightarrow o' \) is more complicated, as the preemption at each statement inside the preemption unit with target ISR enabled (denoted by \( S_p \)) should be considered. Consequently, the disjunctive path conditions of each path from entry to each statement \( s \) in \( S_p \) are encoded, as shown in Figure 10 (3). Finally, the disjunctive path conditions of each path from ISR entry to lock acquisition site \( s' \) are encoded, as shown in Figure 10 (4).

Example 5.5. Figure 11 depicts the path conditions on the interrupt lock edge \( s.lock \rightarrow r.lock \) when the critical section of \( s.lock \) in the \( vq.alloc() \) function is preempted by the \( vq.heartbeat() \) ISR. With two valid preemption points within the critical section, the first part constraint (Figure 10 (3)) is formulated as the disjunctive conditions of these two preemption points \( (3) \lor (2) \). The second part constraint (Figure 10 (4)) consists of the disjunctive conditions on program paths from the entry of \( vq.heartbeat() \) to the lock acquisition site of \( r.lock \) \( (3) \). By evaluating the conjunction of these two parts using an SMT solver, the constraints are satisfied, as the second preemption point \( (3) \) is feasible along the path.

\begin{verbatim}
void vq_thread(...) {
    dev->vq = npnull;
    ...
    spin_lock(dev->s_lock);
    dev->vq = kmalloc(...);
    spin_unlock(dev->s_lock);
    ....
}

void vq_heartbeat(...) {
    ...
    if (!dev->vq)
        return;
    spin_lock(dev->r_lock);
    ...
}
\end{verbatim}

Figure 11: An example of path-feasibility validation. The symbol \texttt{new.o} is the return value of \texttt{kmalloc()}.

6 Implementation

Archerfish is implemented on LLVM [36] (4,034 Loc) and uses Z3 [82] as its path condition solver. Several Python scripts (354 Loc) are implemented to perform network interaction with the OpenAI RESTful API [49]. For the pointer aliasing information, we use a standard pointer analysis, following the previous work [21,48].

Lockset and Interrupt-State Analysis. Archerfish performs analysis by modeling a set of locking and interrupt-related APIs in the Linux kernel. Appendix A.2 shows the whole set of these APIs. To avoid these APIs being inlined during compilation, we attach \_attribute\_\( (\text{noinline}) \) attribute on these functions before compilation. We model several heavily used lock types in the Linux kernel, such as \texttt{spin.lock} and \texttt{rw.lock} [69], and model both the hardware and software interrupts (e.g., timer and tasklet [44]).

LLM-Assisted Specification Generation. The program data for constructing prompt queries, including the struct types [47] and function names, are extracted from compiled LLVM IR. Since the LLVM type system does not contain the name of the structure field types, we compile the Linux kernel with the “-g” option and parse LLVM metadata to extract such information. To construct prompt queries, we use RESTful APIs provided by Azure OpenAI Service and ChatGPT-4.0 model [49].

Path Conditions Solving. Path feasibility is validated with Z3 SMT solver, where each variable in the LLVM IR is modeled as a bit vector, on top of which arithmetic computation and guarded constraints can be modeled and solved. Following existing works [7,14], the conditions solving is under-constrained, losing the bounds of external values (e.g., those derived from hardware IO operations or input arguments of root functions). As a result, false positives are possible due to the presence of under-constrained values.
7 Evaluation

This section presents the evaluation results of Archerfish, through investigating the following research questions:

- **RQ1**: How effective is Archerfish in detecting interrupt-based deadlocks? (§ 7.1)
- **RQ2**: How do the stages contribute to enhancing the effectiveness of Archerfish? (§ 7.2)

We also discuss other points, such as the false negatives and generalizability of our approach in § 7.3.

**Experiment Setup.** We built the source code of Linux to LLVM IR using clang-12 and evaluated the latest version Linux (v6.4) at the time of the experiment. Following the conventional setup, we used the built-in `allmodconfig` setting during compilation. To generate subsystem specifications, Archerfish relies on the Azure OpenAI Service APIs [49] and uses the ChatGPT-4.0 model for evaluation. While performing the prompting, we set the temperature as 0.1 and set Top-p as 0.2, to achieve relatively stable and accurate results. All experiments were conducted on an Ubuntu server with two 20-core Intel(R) Xeon CPU@2.20GHz and 256GB of physical memory.

Note that Archerfish is the first static tool for detecting interrupt-based deadlocks in Linux kernel. To the best of our knowledge, no other open-source tools are available that can serve as a basis for comparison. Crucially, Archerfish is specific to detect interrupt-based deadlocks that previous static methods have overlooked.

### 7.1 Effectiveness on Bug Detection

The Archerfish’s effectiveness in detecting interrupt-based deadlocks is evaluated and summarized in Table 1.

<table>
<thead>
<tr>
<th>Description</th>
<th>Results</th>
</tr>
</thead>
<tbody>
<tr>
<td>Linux kernel statistics</td>
<td>Lines of source code 19.8M Number of functions 798.3K</td>
</tr>
<tr>
<td>Performance</td>
<td>Peak memory 38.3G Time spend on Stage I (Spec. generation) 31.0 mins Time spend on Stage II (Data-flow analysis) 28.4 mins Time spend on Stage III (ILG construction) 1.3 mins Time spend on Stage IV (ILC validation) 7.9 mins Total time (Stages I-IV) 68.6 mins</td>
</tr>
<tr>
<td>Precision</td>
<td>Validated ILCs (detected bugs) 151 Real interrupt-based deadlocks (real bugs) 76</td>
</tr>
<tr>
<td>ISR and thread identification</td>
<td>Kernel threads 136,207 Softirq ISRs by manual-model specifications 689 Softirq ISRs by manual-model specifications 1,226 Softirq ISRs by LLM-assisted specifications 578 Hardirq ISRs by LLM-assisted specifications 1,151 Total tokens used for prompt query 752,666</td>
</tr>
<tr>
<td>ILG construction</td>
<td>Preempt units 3,091,821 Preempt units with ISRs enabled 234,547 Total ISRs 3,644 ISRs containing lock acquisitions 1,449 Lock acquisition inside ISRs 5,957 Regular lock edges 1,507,760 Interrupt lock edges 3,962,932</td>
</tr>
<tr>
<td>Interrupt-based deadlocks detection</td>
<td>Detected ILCs 353 Invalid ILCs 202 ILCs pruned by happen-before analysis 67 ILCs pruned by path-feasibility validation 135 Validated ILCs brought by hardirq preemption 52 Validated ILCs brought by softirq preemption 99</td>
</tr>
</tbody>
</table>

21 are due to the case that an ISR cannot be identified and is misclassified as a kernel thread. Such misclassification leads to infeasible preemption scenarios, where the misclassified ISR is deemed able to be preempted by other ISRs. In the remaining 13 cases, a kernel thread is misidentified as an ISR due to incorrect specifications generated by LLMs, also leading to infeasible preemption scenarios. The high proportion of false alarms in this category indicates that correctly identifying ISRs is important to interrupt-based deadlock detection. Second, complex path conditions involving under-constrained values cause 25 false alarms. For example, Archerfish reports a false alarm on the function `gmc_v10_0_process_interrupt()`, which could be executed under both interrupt context and process context, depending on an under-constrained hardware status value received from IO operators. Since the actual execution path is directly controlled by the hardware status, such path constraints cannot be identified by static analysis. The remaining cases are associated with false alias relationships, which are inherent static analysis problems.

**Real-World Impacts.** We manually reviewed all the detected interrupt-based deadlocks and identified 76 of them as genuine and new bugs. These interrupt-based deadlocks are difficult to detect, as they have remained hidden for an average of 9.9 years. We wrote 37 patches to fix these bugs and sent the patches to the corresponding Linux kernel subsystem developers. Currently, 53 of these bugs have been confirmed by the Linux kernel developers, and most of the bug-fixing patches have al-
Table 2: Accuracy of LLM-Assisted Specifications

<table>
<thead>
<tr>
<th></th>
<th>Hardirq</th>
<th>Softirq</th>
</tr>
</thead>
<tbody>
<tr>
<td>Precision</td>
<td>78.6</td>
<td>80.6</td>
</tr>
<tr>
<td>Recall</td>
<td>87.3</td>
<td>88.0</td>
</tr>
<tr>
<td>w/o GPT-Assisted Specification (FP rate: 53.3%)</td>
<td>27</td>
<td>110</td>
</tr>
<tr>
<td>w/ GPT-Assisted Specification (FP rate: 49.7%)</td>
<td>41</td>
<td></td>
</tr>
</tbody>
</table>

ready been merged into the mainstream Linux kernel or corresponding subsystem development branch, resulting in 46 bugs already fixed.

The detected interrupt-based deadlocks cover numerous subsystems of Linux kernel, including various device drivers, network stack, and filesystems. Linux kernel developers highly appreciated our bug-reporting and patch-submitting efforts and demonstrated high interest in our tool with comments like “Will this tool be available for general use? It’s obviously quite handy.”, “Your experimental tool looks really promising.”, “Interesting.”. Furthermore, two bugs inside the SCTP and TIPC network subsystems are assigned 2 CVEs. Details of these detected bugs can be found in appendix A.4.

7.2 Effectiveness on Each Stage

Next, we investigate the key stages of Archerfish.

7.2.1 ISR Identification

We study how LLMs address the first challenge by inferring subsystem ISR registration specifications automatically. As shown in Table 1, among all the 3,644 ISRs identified by Archerfish, 1,729 (47.1%) are derived from the LLM-assisted subsystem specifications. During the process, Archerfish spent 752,666 tokens in prompting with the GPT-4 model, with 1,292 APIs and 19,172 interfaces collected and used in prompting.

Out of all the identified ISRs, 1,267 are softirq ISRs, while 2,377 are hardirq ISRs. ISRs only account for a small portion (2.6%) of all routines, compared to a large number of kernel threads (136,207). Despite their relatively small number, ISRs play a crucial role and lead to many interrupt-based deadlocks.

Accuracy of LLM-Assisted Specifications. To examine the precision of the LLM-assisted specifications, we randomly sampled 150 hardirq and 150 softirq ISRs from the generated specifications and manually inspected their correctness. Additionally, to examine the recall, we collected 150 real hardirq and 150 real softirq ISRs (the ground truth) from subsystem API documentation, code comments, or manual code reviewing, and checked whether they are covered by the LLM-assisted specifications. Table 2 shows that the overall precision and recall are 79.6% and 87.6%, respectively.

Ablation Study 1: LLM-Assisted Specifications. We further performed end-to-end ablation evaluation to examine the direct effect of LLM-assisted subsystem specifications on bug detection. As shown in Figure 12, with the LLM-assisted specifications, Archerfish detects 41 new bugs and misses 27 originally detected bugs, compared to Archerfish without the help of LLMs.

We manually examined all the 27 missed bugs and confirmed they are false warnings. The false warnings arise from situations of infeasible preemption, where an ISR is not correctly identified and mistakenly classified as a kernel thread, thus allowing the ISR to be preempted by other ISRs incorrectly. With the LLM-assisted subsystem specifications, more real ISRs can be identified, thus reducing such infeasible preemptions.

Out of the 41 additionally reported bugs with the help of LLM-assisted specifications, 12 are true positives. The false positive rate of these additionally reported bugs is a bit higher (70.7% vs 53.3%) because the LLM-assisted specifications contain some fake ISRs. When considering together the 27 reduced bugs brought by infeasible preemption, the overall false positive rate remains similar (49.7% vs 53.3%). In conclusion, the specifications help Archerfish capture new preemption and reduce infeasible preemption with high precision, resulting in 12 more real bugs detected while remaining comparable FP rate shown in our evaluation. A case study of a real bug can be found in appendix A.1. To sum up, the effect of LLM-assisted specifications in Stage I of Archerfish is highly effective.

7.2.2 Lockset and Interrupt-State Analysis

We first present some statistical data in data-flow analysis (Stage II). During typestate analysis, 309,821 pre-empt units are constructed, and 234,547 among them have at least one ISR enabled. On the other hand, 5,957 lock acquisition summaries are constructed inside 1,449 ISRs. These lock acquisitions and pre-empt units can introduce interrupt lock dependencies, represented by interrupt lock edges on the ILG. Specifically, a total of 3,962,932 (72.4%) interrupt lock edges and 1,507,760 (27.6%) regular lock edges are constructed. This result indirectly indicates that, despite the number of ISRs be-
ing much smaller than kernel threads, interrupt preemption is a noteworthy source of deadlocks in the Linux kernel that existing deadlock detection tools have ignored.

Ablation Study 2: Preemption Unit. Next, we study the effectiveness of the interrupt preemption unit to address the large concurrency reasoning space. Importantly, the preemption unit can significantly reduce the time costs for interrupt preemption analysis, as only critical sections need to be checked instead of all statements. To demonstrate the importance of this design, we created an ablation group called Archerfish−, where each statement is considered a unique preemption point, and executed Archerfish− with the same experiment setup. Our evaluation results indicate that Archerfish− takes 166.7 minutes to complete the analysis, resulting in a time overhead that is 2.43 times greater than before. The peak memory usage is also up to 130.2G (3.39x), as we require more memory resources to maintain enormous summaries for each statement during typestate analysis. In summary, using the preemption unit can significantly reduce the overhead regarding both time and memory.

7.2.3 Lazy Circular Lock Dependency Validation

We first demonstrate the effectiveness of deadlock cycle validation (Stage IV) to reduce false positives. By performing cycle detection on the ILG, we initially detected 353 ILCs. In total, 202 (57.2%) ILCs are pruned away during validation. Specifically, 67 ILCs are pruned by the interrupt happen-before validation, and 135 ILCs are dropped due to infeasible paths.

Ablation Study 3: Lazy Validation. We next explore the effectiveness of the lazy validation strategy in improving efficiency. As shown in Table 1, the number of lock edges (5,470,692) significantly exceeds the number of finally detected ILCs (353). Consequently, it is inefficient to validate the feasibility of every edge, highlighting the need for the lazy validation strategy.

7.3 Discussion

False Negative Study. We collected the past interrupt-based deadlocks and examined whether Archerfish can find them. Specifically, we searched the commit history of the Linux kernel using keywords, including “interrupt” and “deadlock”, and examined each relevant commit message. We then identified 16 existing interrupt-based deadlocks in the past five years. These deadlocks were detected by Lockdep [73] or manual code review, 14 of which can be initially detected by Archerfish in their corresponding historical versions. We manually checked the two missed bugs and found that their critical ISRs cannot be identified with LLMs, resulting in the missed corresponding interrupt lock dependencies. By manually modeling those ISRs, Archerfish can finally detect those two bugs without inducing false negatives.

Generalizability. Archerfish aims to detect interrupt-based deadlocks on spin_lock and rwlock [69] inside the Linux kernel. However, the lockset analysis, ILG construction, and cycle validation are general to model various locks. We believe that, with further engineering effort, Archerfish can be extended to model other locking mechanisms (e.g., completion variables and waiting queues) or detect interrupt-aware deadlocks in other OS kernels [1]. We leave the extension as our future work.

8 Related Work

Concurrency bug detection in the OS Kernel is an important problem [6,8,9,24,27,80]. Among the diverse types of concurrency errors [15,26,37,38,67,75], including data races [46,75,80], atomicity violation [28,38] and deadlocks [7,14], we focus on presenting the previous deadlock detection work in this section.

Static Deadlock Detection. Archerfish is specialized for detecting interrupt-based deadlocks and is orthogonal to existing works [7,14,15,21,34], which generally focus on deadlocks under thread interleaving and do not consider interrupt preemption. Recently, a few works were proposed for deadlock detection in specialized domains. For example, DeadWait [63] focuses on detecting deadlocks in asynchronous C# programs. Also, the work [58] proposes approaches for static communication deadlock detection in Go programs. Recently, DLOS [7] proposes a deadlock detection tool for OS kernel; however, their algorithm still does not consider interrupt preemption.

Dynamic Deadlock Detection. Dynamic tools like Lockdep [73] and ThreadSanitizer [64] can perform program instrumentation and monitor the runtime locking behaviors for deadlock detection. However, dynamic and static approaches have been separate realms for bug finding, having different merits. For thoroughly interrupt-based deadlocks detection with higher code coverage, a static bug detector has its advantages since there is no reliance on test cases or runtime environment like concurrency fuzzing [8,11,17,27,80] or controlled concurrency testing [57,77]. As an illustration of its effectiveness, Archerfish has uncovered 76 previously unknown and long-latent interrupt-based deadlocks in the Linux kernel despite subjecting the system to stress testing and extensive fuzzing.

Finally, we are the first to use LLMs to infer interrupt specifications. Using LLMs for dynamic or static program analysis is an interesting recent trend [39,59,62,76] on it. We believe that using LLMs for specification generation would bring new inspiration for combining LLMs into security analysis and solving practical problems.
9 Conclusion

We have presented Archerfish, a static and effective approach for interrupt-based deadlock detection in the Linux kernel. Evaluation shows that Archerfish effectively analyzes the Linux kernel in around one hour, uncovering 76 previously unknown interrupt-based deadlocks, with 53 confirmed and mostly fixed (with two assigned CVE IDs), demonstrating its real-world impact.

Acknowledgment

We thank the reviewers for their valuable comments on this work. This work is supported by the PRP/004/21FX grant from the Hong Kong Innovation and Technology Commission and research grants from Huawei and TCL. Yuandao Cai is the corresponding author.

References

[2] InfiniDB. https://github.com/torvalds/linux/commit/10af303192b5c490bb39b29641ecb0ead2e47e.
A Appendix

A.1 Case Study of a Real Bug Detected with LLM-Assisted Specification

```c
static int timbpgio_update_bit(...) {
    struct timbpgio *tgpio = ...
    ...
    spin_lock(&tgpio->lock);
    ......
}
```

```c
static void timbpgio_irq_disable(...) {
    ...
    unsigned long flags;
    spin_lock_irqsave(&tgpio->lock, flags);
}
```

```c
static struct irq_chip timbpgio_irqchip = {
    .name = "GP10",
    .irq_enable = timbpgio_irq_enable,
    .irq_disable = timbpgio_irq_disable,
    .irq_set_type = timbpgio_irq_type,
};
```

Figure 13: A new interrupt-based deadlock detected after applying LLM-assisted specification.

Figure 13 shows one of the newly detected bugs after applying LLM-assisted specification. Since the function `timbpgio_update_bit()` acquires `tgpio->lock` under process context at Line 46, `timbpgio_irq_disable()` could preempt the execution and acquire the lock again under interrupt context at Line 105. In such a scenario, the CPU would spin on the lock indefinitely, resulting in a system hang. The
key to detecting this bug is the expert knowledge that
`timbpio_irq_disable()` could be executed under the
interrupt context, which is ignored by manual modeling.
As `timbpio_irq_disable()` is registered via the
interface `irq_chip->irq_disable`, ChatGPT-4.0 can
infer that this callback interface is executed during
interrupt handling, enabling Archerfish to detect this
self-cyclic interrupt-based deadlock.

A.2 Details of APIs used in Static Lockset
and Interrupt-State analysis

Table 3 shows the list of APIs modeled for the lockset
and interrupt-state analysis. Next, we illustrate some
details of the implementation to model these APIs.

- **Lock/Unlock**: We model three kinds of commonly
  used locks inside the interrupt context, including
  `spin_lock`, `read_lock`, and `write_lock`. One
  special case in modeling is that `read_lock` operations
  allow concurrent accesses, whereas `write_lock`
  operations require exclusive accesses. As a result, two
  `read_lock` acquisitions do not introduce lock
  dependency, and we model these special cases in our tool.

- **Hardirq Enable/Disable**: There are several spe-
  cial cases in the API modeling. First, different
  APIs could disable/enable different sets of
  ISRs. The APIs `*.lock/unlock_irq()` and
  `*.local_irq_disable/enable()` operate on all
  the hardirq ISRs in the system, while APIs
  `enable/disable_irq()` only operate on a desig-
  nated hardirq specified by an irq line number. As
  a result, we perform different operations on different
  APIs to model these differences. Second, APIs end
  with `irqsave()` or `irqrestore()` suffix would
  store/load the interrupt state into/from a local
  flag variable. Thus, we perform a local data-flow
  analysis to track the interrupt state stored inside the
  flag variable.

- **Softirq Enable/Disable**: Compared with hardirq,
  the modeling for softirq is relatively simple. Specif-
  ically, each one of these APIs would disable/enable
  all the softirq ISRs in the system.

A.3 Details of Manually Modeled ISR Regis-
tration APIs in the Linux Kernel

Table 4 shows some standard APIs used by all the sub-
systems in Linux. Since there are well-defined docu-
mentations [42,44], we can manually model them in the
implementation.

- **Hardirq**: Standard APIs such as `request_irq()` are
  used for hardware interrupt handler registration, and
  they are well-defined in the documentation of Linux
 Kernel [42]. Thus, we can manually model these
commonly used hardware ISR registration APIs.

- **Softirq**: The timer, tasklet, and polling are well-
  known types of software interrupt handlers in the
Linux kernel, and corresponding registration APIs
are well-defined in the Linux Kernel documentation
[44]. We also manually model the correspond-
ing APIs.

For subsystem APIs or interfaces that are not well-
defined, we resort to LLM-assisted specifications.

A.4 Detected Deadlocks

Table 5 presents a list of all the bugs detected. Two bugs
detected inside SCTP and TIPC subsystems are assigned
CVE-2024-0639 and CVE-2024-0641, respectively.
Table 5: List of 76 interrupt-based deadlocks detected by Archerfish on Linux kernel v6.4. Each row displays the subsystem name, the preempted function that introduces the deadlock, whether the bug is detected with the help of LLM-assisted specifications, and the status of the bug. ✗ denotes the corresponding bug-fixing patch is already merged into Linux, ◆ denotes developers already confirmed the bug by adding “Acked-by” or “Reviewed-by” tags under the patch and prepared to accept the patch, and ○ denotes the patch is still under review.

<table>
<thead>
<tr>
<th>System</th>
<th>Function</th>
<th>LLM?</th>
<th>Status</th>
</tr>
</thead>
<tbody>
<tr>
<td>i2c</td>
<td>iproc_i2c_rd_reg()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>i2c</td>
<td>iproc_i2c_wr_reg()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>atm</td>
<td>console.show()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>atm</td>
<td>pcolse()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>gpio</td>
<td>timbgpio_update_bit()</td>
<td>✓</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>sun6i_dma_pause()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>sun6i_dma_terminate_all()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>pd_tx_submit()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>pdc_desc_get()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>pdc_desc_put()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>pd_issue_pending()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>pdc_desc_get()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>milbeaut haci channel config()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>milbeaut haci channel pause()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>milbeaut haci channel resume()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>plx_dma_process_desc()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>xgene_dma_cleanup_descriptors()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>k3_dma_terminate_all()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>k3_dma_transfer_pause()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>dma</td>
<td>mtks_hsdma_free_active_desc()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>staging</td>
<td>hostif_sme_work()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>staging</td>
<td>rtss_exclusive_enter_ssi()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>char</td>
<td>kcs_hmc_handle_event()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>IB</td>
<td>read_mod_write()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>IB</td>
<td>ib.device.get_netdev()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>media</td>
<td>enc_post_frame_start()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>media</td>
<td>dvb_dmxdev_is.callback()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>media</td>
<td>dvb_dmxdev_section_callback()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>mfd</td>
<td>pm8xxx_config_irq()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>mfd</td>
<td>pm8xxx_irq_set_type()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>misc</td>
<td>bcm_vk_reset()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>misc</td>
<td>bcm_vk_blk drv access()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>misc</td>
<td>bcm_vk_get_ctx()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>misc</td>
<td>bcm_vk_free_ctx()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>isdn</td>
<td>_hfcpci_softirq()</td>
<td>✗</td>
<td>●</td>
</tr>
<tr>
<td>watchdog</td>
<td>s3c2410wdt_stop()</td>
<td>✗</td>
<td>●</td>
</tr>
</tbody>
</table>

| watchdog | s3c2410wdt_start() | ✗    | ●      |
| tty     | imx_uart_dma_rx_callback() | ✗    | ●      |
| tty     | slgt_compat_iocet() | ✗    | ●      |
| scsi    | mvs_slot_complete() | ✓    | ●      |
| scsi    | mvs_slot_complete() | ✓    | ●      |
| scsi    | qedi_cpu_offline() | ✗    | ●      |
| scsi    | megaraid_reset_handler() | ✗    | ●      |
| scsi    | qedi_tmf_resp_work() | ✗    | ●      |
| scsi    | ips_eh_abort() | ✗    | ●      |
| scsi    | hisi_sas_release_task() | ✗    | ●      |
| scsi    | hisi_sas_slot_task_free() | ✗    | ●      |
| scsi    | hisi_sas_task_deliver() | ✗    | ●      |
| scsi    | hisi_sas_slot_index_free() | ✗    | ●      |
| scsi    | hisi_sas_slot_index_alloc() | ✗    | ●      |
| scsi    | hisi_sas_alloc_dev() | ✗    | ●      |
| scsi    | fcoe_ctlr_flogi_send() | ✓    | ●      |
| scsi    | fcoe_ctlr_flogi_retry() | ✓    | ●      |
| scsi    | fcoe_ctlr_announce() | ✓    | ●      |
| scsi    | fcoe_ctlr_els_send() | ✓    | ●      |
| scsi    | mvs_find_dev_mvi() | ✗    | ●      |
| ocfs2   | o2quq_make_decision() | ✗    | ●      |
| ocfs2   | o2quq_hb_up() | ✗    | ●      |
| ocfs2   | o2quq_hb_down() | ✗    | ●      |
| ocfs2   | o2quq_hb_still_up() | ✗    | ●      |
| ocfs2   | o2quq_conn_up() | ✗    | ●      |
| ocfs2   | o2quq_conn_err() | ✗    | ●      |
| ocfs2   | o2net_debug_add NST() | ✗    | ●      |
| ocfs2   | o2net_debug_del NST() | ✗    | ●      |
| ocfs2   | nst_seq_start() | ✗    | ●      |
| ocfs2   | nst_seq_next() | ✗    | ●      |
| ocfs2   | nst_seq_show() | ✗    | ●      |
| ocfs2   | o2net_debug_add_sc() | ✗    | ●      |
| ocfs2   | o2net_debug_del_sc() | ✗    | ●      |
| ocfs2   | sc_seq_start() | ✗    | ●      |
| ocfs2   | sc_seq_next() | ✗    | ●      |
| ocfs2   | sc_seq_show() | ✗    | ●      |
| scvp    | scvp_auto_asconf_init() | ✗    | ●      |
| ax25    | ax25_check_dama_slave() | ✗    | ●      |
| tipc    | tipc_crypto_key_revoke() | ✗    | ●      |

**Total** | 76