Abstract
With rapid developments in the information industry, data centers have become increasingly important for collecting and storing data. The devices in data centers are not only connected to external machines to provide a variety of services but also store vast amounts of data, so device failures in data centers can cause severe and costly economic damage. Various methods have been studied in recent years to effectively predict failures in connected devices. However, in data center-scale systems, failures of any individual device are infrequent, which makes per-device failure prediction difficult. In addition, complex failures may occur within the data center owing to the mix of devices and systems, and in such cases it is difficult to determine the cause of failure. In this study, we present a device hierarchical attention network (DHAN) methodology that predicts failures for all devices by simultaneously using the available information from the devices in the data center. Because the devices in the data center can potentially affect each other, their information is used in a composite manner. Compared with failure prediction using information from a single device, this composite approach was observed to predict failures more effectively. In addition, by extracting attention information from the DHAN model, we identified the devices that play an important role in predicting the failure of a particular device. We then used this information to cluster the devices and reconstruct the DHAN model, which yielded more effective failure predictions. Based on the results presented herein, the proposed approach is expected to support stable maintenance and repair of the system by identifying the potential impact of devices on one another.
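As a rough illustration of how attention weights over devices might be read as importance scores, the following is a minimal sketch of plain dot-product attention pooling. It is not the DHAN architecture, which the abstract does not specify. The device names, feature vectors, `query` vector, and the `device_attention` helper are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def device_attention(device_vectors, query):
    """Dot-product attention over devices.

    device_vectors: dict mapping device name -> feature vector (list of floats)
    query: vector of the same length, standing in for a learned context vector
    Returns (context, weights), where weights maps device name -> attention weight.
    """
    names = list(device_vectors)
    # One scalar score per device: dot product of the query with its features.
    scores = [sum(q * x for q, x in zip(query, device_vectors[n])) for n in names]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the device feature vectors.
    context = [
        sum(w * device_vectors[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return context, dict(zip(names, weights))


# Hypothetical per-device summaries (e.g. encoded log or sensor features).
devices = {
    "switch-a": [0.9, 0.1],
    "server-b": [0.2, 0.8],
    "storage-c": [0.1, 0.1],
}
query = [1.0, 0.0]  # assumed value; in a trained model this would be learned

context, weights = device_attention(devices, query)
# The device with the largest weight contributes most to the pooled context.
most_influential = max(weights, key=weights.get)
```

In a sketch like this, the weights sum to one, so they can be compared across devices. Grouping devices by which neighbours receive high weight is one way the clustering step described above could be approached.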
Original language | English |
---|---|
Article number | 117277 |
Journal | Expert Systems with Applications |
Volume | 203 |
DOIs | |
Publication status | Published - 2022 Oct 1 |
Bibliographical note
Publisher Copyright: © 2022 Elsevier Ltd
All Science Journal Classification (ASJC) codes
- Engineering(all)
- Computer Science Applications
- Artificial Intelligence