
Implementation

**Figure 1:** Basic flow of control and data in a pH-modified Linux kernel. [Figure: pH-flow.eps]

The pH prototype is implemented as a patch for the Linux 2.2 kernel, and was developed and tested on systems running a pre-release of the Debian GNU/Linux 2.2 distribution [35]. The modified kernel is capable of monitoring every executed system call and of recording a profile for every executable. An overview of the system is shown in Figure 1.

Program profiles for each executable are stored on disk. Each profile contains both a training and a testing array, and so is actually two "profiles" in the terminology of Section 2. When a new executable is loaded via the execve system call, the kernel attempts to load the appropriate profile from disk; if none is present, a new profile is created. If another process runs the same executable, the profile is shared between both processes. To prevent consistency problems due to interleaving, each executing process maintains its own record of recent system calls (its current sequence). When all processes using a given profile have terminated, the updated profile is saved to disk. A loaded profile consumes approximately 80K of non-swappable kernel memory.
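The sharing rule above amounts to reference counting on loaded profiles. The following user-space sketch illustrates that rule only; it is not the pH kernel code. The names `ph_profile_entry`, `ph_profile_get`, and `ph_profile_put` are hypothetical, and the disk load and save steps are reduced to comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PH_MAX_PATH 256  /* illustrative limit, not taken from pH */

/* Hypothetical bookkeeping entry for one loaded profile. */
struct ph_profile_entry {
    char path[PH_MAX_PATH];          /* executable the profile describes */
    int refcount;                    /* running processes sharing it */
    struct ph_profile_entry *next;
};

static struct ph_profile_entry *ph_loaded = NULL;

/* On execve: share an already loaded profile, or create a new entry.
 * A real implementation would first try to read the profile from disk. */
struct ph_profile_entry *ph_profile_get(const char *path)
{
    struct ph_profile_entry *p;

    for (p = ph_loaded; p != NULL; p = p->next) {
        if (strcmp(p->path, path) == 0) {
            p->refcount++;
            return p;
        }
    }
    p = calloc(1, sizeof(*p));       /* zeroed, so path stays terminated */
    if (p == NULL)
        return NULL;
    strncpy(p->path, path, PH_MAX_PATH - 1);
    p->refcount = 1;
    p->next = ph_loaded;
    ph_loaded = p;
    return p;
}

/* On process exit: drop one reference. Returns 1 when the last user is
 * gone and the entry is unloaded (where the save to disk would happen). */
int ph_profile_put(struct ph_profile_entry *p)
{
    struct ph_profile_entry **pp = &ph_loaded;

    assert(p != NULL && p->refcount > 0);
    if (--p->refcount > 0)
        return 0;
    while (*pp != NULL && *pp != p)
        pp = &(*pp)->next;
    if (*pp == p)
        *pp = p->next;               /* unlink from the loaded list */
    free(p);
    return 1;
}
```

Two calls to `ph_profile_get` with the same path return the same entry, and only the second matching `ph_profile_put` unloads it. The per-process current sequence is deliberately kept out of this shared entry.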

We modified the system call dispatcher so that it calls a pH function (pH_process_syscall) prior to dispatching the system call. pH_process_syscall implements the monitoring, response, and training logic. pH is controlled through its own system call, sys_pH, which allows the superuser (root) to take the following actions:

Start or stop monitoring processes.
Set system parameters (see Section 3 for descriptions):
- $delay\_factor$
- $abort\_execve$
- $mod\_minimum$
- $normal\_minimum$
- $normal\_ratio$
- $tolerization\_limit$
- $anomaly\_limit$
Turn on/off logging of system calls to disk (expensive, used for debugging).
Turn on/off logging novel sequences to disk.
Status (prints out current values of system parameters to the kernel log).
Write all profiles to disk.
Reset <pid>: Resets the profile to be empty.
Start normal <pid>: Copies the training array for <pid>'s executable to its testing array, and marks the profile as normal.
Tolerize <pid>: Changes the normal flag for <pid>'s profile to 0, resets its locality frame, and cancels any current delay for it.
Sensitize <pid>: Clears the training array. This mechanism is used to prevent known true positives from being incorporated into the training data.
Turn on/off debugging messages sent to kernel logging facility.
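The four per-process actions in this list (Reset, Start normal, Tolerize, Sensitize) can be sketched as a small command dispatcher. This is a minimal user-space illustration under stated assumptions: the command codes, the structure layouts, the array size, and the function name `ph_control` are all hypothetical, and the locality frame is reduced to a single counter. The actual `sys_pH` interface is not specified in this section.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define PH_ARRAY_BYTES 64            /* illustrative size only */

enum ph_cmd {                        /* hypothetical command codes */
    PH_RESET,
    PH_START_NORMAL,
    PH_TOLERIZE,
    PH_SENSITIZE
};

struct ph_profile {
    unsigned char train[PH_ARRAY_BYTES];  /* training array */
    unsigned char test[PH_ARRAY_BYTES];   /* testing array */
    int normal;                           /* 1 when marked normal */
};

struct ph_task {
    struct ph_profile *profile;
    int lf_anomalies;                /* stand-in for the locality frame */
    long delay;                      /* pending delay, arbitrary units */
};

/* Apply one control action to the task identified by the caller.
 * Only uid 0 (root) may issue commands, as the text requires. */
long ph_control(int caller_uid, enum ph_cmd cmd, struct ph_task *t)
{
    struct ph_profile *p;

    if (caller_uid != 0)
        return -EPERM;
    if (t == NULL || t->profile == NULL)
        return -ESRCH;
    p = t->profile;

    switch (cmd) {
    case PH_RESET:                   /* profile becomes empty */
        memset(p, 0, sizeof(*p));
        break;
    case PH_START_NORMAL:            /* training -> testing, mark normal */
        memcpy(p->test, p->train, sizeof(p->test));
        p->normal = 1;
        break;
    case PH_TOLERIZE:                /* clear flag, frame, and delay */
        p->normal = 0;
        t->lf_anomalies = 0;
        t->delay = 0;
        break;
    case PH_SENSITIZE:               /* discard training data only */
        memset(p->train, 0, sizeof(p->train));
        break;
    default:
        return -EINVAL;
    }
    return 0;
}
```

Note the asymmetry the sketch makes visible: Sensitize leaves the testing array untouched, so behavior already accepted as normal is preserved while the suspect training data is dropped.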

More specifically, we extended the Linux task structure (the kernel data structure used to represent processes and kernel-level threads) with a new structure which contains the following fields: the current window of system calls for the task, a locality frame, and a pointer to the current profile. A profile is a structure containing two byte-arrays for storing pairs (the training and testing arrays) and some additional training statistics described in Section 3.
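As a rough illustration of the per-task state and of the hook placed in front of the dispatcher, consider the following user-space sketch. The field names, the window and locality frame sizes, the handler signature, and the functions `ph_window_push`, `ph_process_syscall_sketch`, and `dispatch` are assumptions made for this example; only the three kinds of fields and the call-before-dispatch ordering come from the text.

```c
#include <errno.h>
#include <stddef.h>

#define PH_WINDOW 9                  /* illustrative window length */
#define PH_LF_SIZE 128               /* illustrative locality frame length */

struct ph_profile;                   /* training/testing arrays, opaque here */

/* Hypothetical per-task extension, mirroring the three fields in the text. */
struct ph_task_state {
    int window[PH_WINDOW];           /* recent system calls, newest first */
    int window_len;                  /* number of valid window entries */
    unsigned char lf[PH_LF_SIZE];    /* locality frame (unused in this sketch) */
    struct ph_profile *profile;      /* shared profile for the executable */
};

typedef long (*syscall_fn)(long arg);

/* Record one system call number, discarding the oldest when full. */
void ph_window_push(struct ph_task_state *s, int nr)
{
    int n = s->window_len < PH_WINDOW ? s->window_len : PH_WINDOW - 1;
    int i;

    for (i = n; i > 0; i--)
        s->window[i] = s->window[i - 1];
    s->window[0] = nr;
    if (s->window_len < PH_WINDOW)
        s->window_len++;
}

/* Stand-in for pH_process_syscall: only the window update is shown.
 * Monitoring, response, and training would follow here. */
void ph_process_syscall_sketch(struct ph_task_state *s, int nr)
{
    ph_window_push(s, nr);
}

/* Simplified dispatcher: the pH hook runs before the handler is called. */
long dispatch(struct ph_task_state *s, syscall_fn *table, int table_len,
              int nr, long arg)
{
    if (nr < 0 || nr >= table_len || table[nr] == NULL)
        return -ENOSYS;
    ph_process_syscall_sketch(s, nr);
    return table[nr](arg);
}

/* Demo handler used only to exercise the dispatcher. */
static long demo_handler(long arg)
{
    return arg + 1;
}
```

Keeping the window in the per-task structure rather than in the shared profile is what lets several processes update one profile without interleaving their call sequences. In this sketch an invalid call number is rejected before the hook runs; whether pH observes such calls is not stated in the text.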


Anil B. Somayaji 2000-06-14