PV Monitoring Based on Linear Regression

Thursday, June 07, 2018 - 4:15 pm–4:40 pm

Wang Bo, Baidu


The PV (page view) curve is one of the most important curves for SREs. Every significant drop in the curve is treated as an incident, so SREs badly need a good anomaly detection algorithm.

Because PV fluctuates between day and night, detection depends heavily on the expected value at each point in time. A moving average is a naïve way to generate expected values, and it suffers from two problems. First, it lags behind the actual trend, so it can miss a drop that occurs during a rising trend. Second, it cannot easily distinguish a drop from the recovery after a rise. More advanced methods such as exponential smoothing have their own shortcomings. When PV is large, the local fluctuations of the curve are relatively small, so the curve is smooth. This inspired us to apply linear regression to generate the expected value. However, ordinary linear regression is susceptible to abnormal values.
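
To make the lag problem concrete, here is a minimal sketch comparing a moving average with a linear extrapolation on a rising trend. The sample values, window length, and function names are hypothetical and chosen only for illustration; they are not taken from the talk.

```python
def moving_average(window):
    """Expected value as the plain mean of the recent window."""
    return sum(window) / len(window)


def linear_forecast(window):
    """Fit y = a + b*t by ordinary least squares over t = 0..n-1,
    then extrapolate one step ahead to t = n."""
    n = len(window)
    t_mean = (n - 1) / 2
    y_mean = sum(window) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(window))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept + slope * n


# Hypothetical PV samples on a steady rise: 1000, 1010, ..., 1090.
history = [1000 + 10 * t for t in range(10)]

ma_expected = moving_average(history)   # mean of the window, behind the trend
lr_expected = linear_forecast(history)  # continues the fitted line

# Suppose the next sample is 1050, roughly 4.5% below where the trend points.
observed = 1050
print("moving average expects", ma_expected, "-> looks like a drop:", observed < ma_expected)
print("linear fit expects", lr_expected, "-> looks like a drop:", observed < lr_expected)
```

In this toy case the moving average sits below the observed value, so a detector built on it would see nothing unusual, while the linear fit places the expected value above the observation.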

In this talk, we will present a method based on robust linear regression to compute expected values. This method resists the impact of anomalies. We will also introduce a statistical hypothesis testing method to detect anomalies, which eliminates the need, common in simpler methods, to set different thresholds at different times.
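
The abstract does not say which robust regression or which test the talk uses. As one possible illustration only, the sketch below swaps in the Theil-Sen estimator (median of pairwise slopes) for the robust fit and a one-sided z-test on the residual, scaled by the median absolute deviation, for the hypothesis test. The data, the significance level `ALPHA`, and all function names are assumptions, and the test assumes residuals are roughly normal.

```python
import math
from itertools import combinations
from statistics import median


def theil_sen_forecast(window):
    """Robust line fit (Theil-Sen, an assumed choice): median of pairwise
    slopes, then median intercept. Returns the one-step-ahead forecast and
    the in-window residuals."""
    n = len(window)
    slopes = [(window[j] - window[i]) / (j - i)
              for i, j in combinations(range(n), 2)]
    slope = median(slopes)
    intercept = median(y - slope * t for t, y in enumerate(window))
    residuals = [y - (intercept + slope * t) for t, y in enumerate(window)]
    return intercept + slope * n, residuals


def robust_sigma(residuals):
    """Scale estimate from the median absolute deviation (MAD); the factor
    1.4826 makes it comparable to a standard deviation under normality."""
    center = median(residuals)
    return 1.4826 * median(abs(r - center) for r in residuals)


def drop_p_value(observed, expected, sigma):
    """One-sided p-value: probability of a value this low or lower if the
    residual were normal with mean 0 and standard deviation sigma."""
    z = (observed - expected) / sigma
    return 0.5 * math.erfc(-z / math.sqrt(2))


# Hypothetical rising PV series with small noise and one earlier anomaly (600).
noise = [1, -1, 2, -2, 0, 1, -1, 2, -2, 0, 1, -1]
history = [1000 + 10 * t + e for t, e in enumerate(noise)]
history[6] = 600  # past incident inside the fitting window

expected, residuals = theil_sen_forecast(history)
sigma = robust_sigma(residuals)

ALPHA = 0.001  # assumed significance level, one value for all times of day
for observed in (expected, 900):
    p = drop_p_value(observed, expected, sigma)
    print(f"observed={observed:.1f} p={p:.3g} anomaly={p < ALPHA}")
```

Because the residual is judged against a scale estimated from the recent window, a single significance level can stand in for separate per-period thresholds, which is the kind of benefit the abstract describes.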

@conference {214971,
author = {Wang Bo},
title = {{PV} Monitoring Based on Linear Regression},
year = {2018},
publisher = {{USENIX} Association},