DOCUMENT METADATA
SLAC Publication: SLAC-PUB-17144
SLAC Release Date: September 29, 2017
Predicting Server Failures with Machine Learning
Lai, Brian.
Unexpected server failures incur a large cost. Using data that is continuously collected by monitoring software, we can more accurately understand the processes that each server is used for. Deviations in server performance help diagnose when servers may malfunction. We demonstrate a machine learning model that can predict whether a server fails within 60 days with high accuracy. Specifically, our models predict the occurrence of hard drive failures, as they constitute over 80% of all server failures within the data center.