The first contribution presented in this paper was a freely distributed filter for Procmail that, running on UNIX, detected both known and previously unknown malicious Windows executables. Its detection rate for new executables was over twice that of traditional signature-based methods: 97.76% compared with 33.96%.
One problem with traditional, signature-based methods is that detecting a new malicious executable requires the program to be examined, a signature to be extracted from it, and that signature to be added to the anti-virus database. During the time it takes to identify and analyze a malicious program and to distribute its signature, systems are at risk from that program. Our methods may provide a defense during that time. With a low false positive rate, the inconvenience to the end user would be minimal, while the filter provides ample defense until an update of the models is available.
Virus scanners are updated about once a month, and 240-300 new malicious executables are created in that time (8-10 a day). Our method may catch roughly 216-270 of those new malicious executables without the need for an update, whereas traditional methods would catch only 87-109. On the data set we tested, our method more than doubles the detection rate of signature-based methods.
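As a rough check on this monthly estimate, the following sketch applies the detection rates reported on the test set (97.76% and 33.96%) to the 240-300 range. The function and constant names are illustrative assumptions, not part of the system. Applied directly, the reported rates give about 235-293 and 82-102 detections, whereas the figures quoted in the text correspond to rates of roughly 90% and 36%.

```python
# Back-of-the-envelope estimate of detections per update cycle.
# All names here are hypothetical; only the numeric inputs come from the text.

MONTHLY_RANGE = (240, 300)   # 8-10 new malicious executables a day over ~30 days
DATA_MINING_RATE = 0.9776    # detection rate of new executables reported in the text
SIGNATURE_RATE = 0.3396      # signature-based detection rate reported in the text


def expected_detections(rate, new_per_month):
    """Expected number of new malicious executables caught in one cycle."""
    return rate * new_per_month


def detection_range(rate, monthly_range=MONTHLY_RANGE):
    """Rounded (low, high) detections for the low and high monthly counts."""
    low, high = monthly_range
    return (round(expected_detections(rate, low)),
            round(expected_detections(rate, high)))


def implied_rate(caught, total):
    """Detection rate implied by a quoted count of caught executables."""
    return caught / total


if __name__ == "__main__":
    print("data mining:", detection_range(DATA_MINING_RATE))
    print("signature:  ", detection_range(SIGNATURE_RATE))
    print("rate implied by 216 of 240:", implied_rate(216, 240))
    print("rate implied by 87 of 240: ", round(implied_rate(87, 240), 3))
```

Either way, the ratio between the two methods stays above two, which is the comparison the estimate is meant to illustrate.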
Secondly, we presented a system that improves its accuracy by regenerating models after receiving borderline cases. This feature is of interest because, as more servers and clients use the system, it will receive additional borderline cases, and training on these cases will increase the accuracy of the filter. Finally, the system has the optional ability to monitor the propagation of malicious attachments. Depending on the user-specified setting, email tracking can be turned on or off. If tracking is turned on, statistics can be generated detailing how a malicious binary attacked a system and propagated. If tracking is turned off, the system loses no accuracy in detecting malicious attachments.
The system that we presented detected malicious Windows binaries from UNIX and, because of its data mining algorithms, detected new examples of similar malicious binaries. It tracked the propagation of email attachments, and with the inclusion of borderline cases it will become more accurate over time. Further work with a larger, more realistic data set can show that the algorithm is practical.