
Introduction

System logs, such as Windows Event Logs or Linux system logs, are an important resource for computer system management. These logs hold textual messages emitted by various sources in the computer system during its day-to-day operation. Emitted messages may be informational, or they may indicate a problem in the system, whether trivial or more serious. Periodic monitoring of system logs by system administrators allows the identification of anomalies and security breaches in the system. In addition, the information in system logs is vital for problem diagnosis. In reality, however, system logs hold a large number of messages, most of which are not interesting to the user. It is time-consuming and sometimes impossible to manually find the valuable messages in this abundance of information.

Previous work on log analysis presents a variety of approaches. One approach is to have a human expert define a set of message patterns to find, along with desired actions to be taken when encountering them ([5], [6], [13]). The effort invested in writing and maintaining these rules is proportional to the number of message types and the rate at which they change. Another approach to log analysis focuses on visualizing the log data in a useful way ([2], [11]). This is achieved, for instance, by showing a succinct representation of the log data, by graphically showing patterns in the data, or by presenting time statistics of messages. Works differ in the type and extent of pattern detection applied to log data. The techniques include analysis of the frequency at which messages occur [11], grouping of time-correlated messages ([10], [7]), and the use of text analysis algorithms to categorize messages ([10], [7]). Unlike the approach we present here, all these works base their analysis only on the log data of the inspected computer system.
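
As an illustration of the rule-based approach described earlier in this section (expert-defined message patterns paired with actions), the following is a minimal Python sketch. It is not taken from any of the cited tools: the rule patterns, the action names, and the function `apply_rules` are all hypothetical and serve only to show why such rule sets need maintenance as message types change.

```python
import re
from typing import List, Pattern, Tuple

# Hypothetical rules: (compiled pattern, action name). Neither the patterns
# nor the action names come from the tools cited in the text.
RULES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"failed login", re.IGNORECASE), "alert"),
    (re.compile(r"disk .* (full|failure)", re.IGNORECASE), "page_admin"),
]


def apply_rules(messages: List[str],
                rules: List[Tuple[Pattern[str], str]] = RULES
                ) -> List[Tuple[str, str]]:
    """Return (action, message) pairs for every message that matches a rule."""
    triggered = []
    for message in messages:
        for pattern, action in rules:
            if pattern.search(message):
                triggered.append((action, message))
                break  # the first matching rule wins; later rules are skipped
    return triggered
```

Every new message type that matters to the administrator requires a new entry in `RULES`, and any message whose wording changes silently stops matching.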
In this paper we present a method for ranking log messages by their estimated value to users, based on information from a large population of computer systems. We generate a new ranked log view, in which the messages are shown in order of rank and in a condensed form. We applied our method to a dataset of the combined Windows Event Log (Security, Application and System messages) taken from 3,000 IBM xSeries servers that are used for diverse purposes. A characteristic Event Log holds between 3,000 and 30,000 messages. We show that using a new feature construction scheme, we can find a structure in the logs of computer systems to improve ranking.

The rest of the paper is organized as follows: In Section [*] we describe our method for scoring log messages and its use of clustering as a building block. In Section [*] we introduce a new feature construction scheme for sample data. This scheme achieves better clustering results in the message ranking scenario. In Section [*] we describe the experiments and analyze the results. We summarize in Section [*].
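
To give a rough feel for why information from a population of systems can help rank messages, the following Python sketch scores each message type on one system by how unusual its count is relative to other systems. This is an illustrative assumption only, not the scoring method of this paper, which is defined in the following sections; the names `rarity_scores` and `ranked_view` and the per-system count dictionaries are hypothetical.

```python
from typing import Dict, List


def rarity_scores(target: Dict[str, int],
                  population: List[Dict[str, int]]) -> Dict[str, float]:
    """Score each message type in `target` by the fraction of population
    systems that emitted that type strictly fewer times (assumed scoring)."""
    if not population:
        raise ValueError("population must contain at least one system")
    scores = {}
    for msg_type, count in target.items():
        # A message type missing from a system's log counts as zero there.
        fewer = sum(1 for system in population
                    if system.get(msg_type, 0) < count)
        scores[msg_type] = fewer / len(population)
    return scores


def ranked_view(target: Dict[str, int],
                population: List[Dict[str, int]]) -> List[str]:
    """Return the target's message types, most unusual first."""
    scores = rarity_scores(target, population)
    return sorted(scores, key=lambda m: scores[m], reverse=True)
```

Under this toy scoring, a message type that is frequent on the inspected system but common everywhere ranks low, while one that is rare across the population ranks high, which a single-system analysis could not determine.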
2007-03-12