Abstract
A new system model reflecting the clustered structure of distributed storage is proposed to investigate the interplay between storage overhead and repair bandwidth as storage node failures occur. Large data centers with multiple racks/disks and local networks of storage devices (e.g., sensor networks) are natural applications of the proposed clustered model. In realistic scenarios involving clustered storage structures, repairing a storage node using intact nodes residing in other clusters consumes more bandwidth than restoring it from intra-cluster nodes. Therefore, it is important to differentiate between intra-cluster repair bandwidth and cross-cluster repair bandwidth in modeling distributed storage. The capacity of the proposed model is obtained as a function of the fundamental resources of distributed storage systems, namely, node storage capacity, intra-cluster repair bandwidth, and cross-cluster repair bandwidth. The capacity is shown to be asymptotically equivalent to a monotonically decreasing function of the number of clusters as the number of storage nodes increases without bound. Based on the capacity expression, the feasible sets of required resources that enable reliable storage are obtained in closed form. Specifically, it is shown that the cross-cluster traffic can be reduced to zero (i.e., intra-cluster local repair becomes possible) by allowing extra resources for storage capacity and intra-cluster repair bandwidth, according to the law specified in the closed form. Network coding schemes with zero cross-cluster traffic are defined as intra-cluster repairable codes, which are shown to be a class of the previously developed locally repairable codes.
Original language | English |
---|---|
Article number | 8360492 |
Pages (from-to) | 81-107 |
Number of pages | 27 |
Journal | IEEE Transactions on Information Theory |
Volume | 65 |
Issue number | 1 |
Publication status | Published - 2019 Jan |
Bibliographical note
Funding Information: Manuscript received September 25, 2017; revised March 9, 2018 and April 30, 2018; accepted April 30, 2018. Date of publication May 17, 2018; date of current version December 19, 2018. This work was supported in part by the National Research Foundation of Korea under Grant 2016R1A2B4011298 and in part by the ICT Research and Development Program of MSIP/IITP (Research on Adaptive Machine Learning Technology Development for Intelligent Autonomous Digital Companion) under Grant 2016-0-00563. This paper was presented at the 2017 IEEE Conference on Communications [1].
Publisher Copyright:
© 1963-2012 IEEE.
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences