Design and evaluation of random linear network coding accelerators on FPGAs

Sunwoo Kim, Won Seob Jeong, Won W. Ro, Jean Luc Gaudiot

Research output: Contribution to journalArticle

6 Citations (Scopus)

Abstract

Network coding is a well-known technique used to enhance network throughput and reliability by applying special coding to data packets. One critical problem in practice, when using the random linear network coding technique, is the high computational overhead. More specifically, using this technique in embedded systems with low computational power might cause serious delays due to the complex Galois field operations and matrix handling. To this end, this article proposes a high-performance decoding logic for random linear network coding using field-programmable gate-array (FPGA) technology. We expect that the inherent reconfigurability of FPGAs will provide sufficient performance as well as programmability to cope with changes in the specification of the coding. The main design motivation was to improve the decoding delay by dividing and parallelizing the entire decoding process. Fast arithmetic operations are achieved by the proposed parallelized GF ALUs, which allow calculations with all the elements of a single row of a matrix to be performed concurrently. To improve the flexibility in the utilization of the FPGA components, two different decoding methods have been designed and compared. The performance of the proposed idea is evaluated by comparing with the performance of the decoding process executed by general-purpose processors through an equivalent software algorithm. Overall, a maximum throughput of 65.98 Mbps is achieved with the proposed FPGA design on an XC5VLX110T Virtex 5 device. In addition, the proposed design provides speedups of up to 13.84 compared to an aggressively parallelized software decoding algorithm run on a quad-core AMD processor. Moreover, the design affords 12 times higher power efficiency in terms of throughput per watt than an ARM Coretex-A9 processor.

Original languageEnglish
Article number13
JournalTransactions on Embedded Computing Systems
Volume13
Issue number1
DOIs
Publication statusPublished - 2013 Aug

Fingerprint

Linear networks
Network coding
Particle accelerators
Decoding
Field programmable gate arrays (FPGA)
Throughput
Embedded systems
Specifications

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture

Cite this

@article{8d42e5cc65d44819b7269b4d949a7fbd,
title = "Design and evaluation of random linear network coding accelerators on FPGAs",
abstract = "Network coding is a well-known technique used to enhance network throughput and reliability by applying special coding to data packets. One critical problem in practice, when using the random linear network coding technique, is the high computational overhead. More specifically, using this technique in embedded systems with low computational power might cause serious delays due to the complex Galois field operations and matrix handling. To this end, this article proposes a high-performance decoding logic for random linear network coding using field-programmable gate-array (FPGA) technology. We expect that the inherent reconfigurability of FPGAs will provide sufficient performance as well as programmability to cope with changes in the specification of the coding. The main design motivation was to improve the decoding delay by dividing and parallelizing the entire decoding process. Fast arithmetic operations are achieved by the proposed parallelized GF ALUs, which allow calculations with all the elements of a single row of a matrix to be performed concurrently. To improve the flexibility in the utilization of the FPGA components, two different decoding methods have been designed and compared. The performance of the proposed idea is evaluated by comparing with the performance of the decoding process executed by general-purpose processors through an equivalent software algorithm. Overall, a maximum throughput of 65.98 Mbps is achieved with the proposed FPGA design on an XC5VLX110T Virtex 5 device. In addition, the proposed design provides speedups of up to 13.84 compared to an aggressively parallelized software decoding algorithm run on a quad-core AMD processor. Moreover, the design affords 12 times higher power efficiency in terms of throughput per watt than an ARM Coretex-A9 processor.",
author = "Sunwoo Kim and Jeong, {Won Seob} and Ro, {Won W.} and Gaudiot, {Jean Luc}",
year = "2013",
month = "8",
doi = "10.1145/2512469",
language = "English",
volume = "13",
journal = "Transactions on Embedded Computing Systems",
issn = "1539-9087",
publisher = "Association for Computing Machinery (ACM)",
number = "1",

}

Design and evaluation of random linear network coding accelerators on FPGAs. / Kim, Sunwoo; Jeong, Won Seob; Ro, Won W.; Gaudiot, Jean Luc.

In: Transactions on Embedded Computing Systems, Vol. 13, No. 1, 13, 08.2013.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Design and evaluation of random linear network coding accelerators on FPGAs

AU - Kim, Sunwoo

AU - Jeong, Won Seob

AU - Ro, Won W.

AU - Gaudiot, Jean Luc

PY - 2013/8

Y1 - 2013/8

N2 - Network coding is a well-known technique used to enhance network throughput and reliability by applying special coding to data packets. One critical problem in practice, when using the random linear network coding technique, is the high computational overhead. More specifically, using this technique in embedded systems with low computational power might cause serious delays due to the complex Galois field operations and matrix handling. To this end, this article proposes a high-performance decoding logic for random linear network coding using field-programmable gate-array (FPGA) technology. We expect that the inherent reconfigurability of FPGAs will provide sufficient performance as well as programmability to cope with changes in the specification of the coding. The main design motivation was to improve the decoding delay by dividing and parallelizing the entire decoding process. Fast arithmetic operations are achieved by the proposed parallelized GF ALUs, which allow calculations with all the elements of a single row of a matrix to be performed concurrently. To improve the flexibility in the utilization of the FPGA components, two different decoding methods have been designed and compared. The performance of the proposed idea is evaluated by comparing with the performance of the decoding process executed by general-purpose processors through an equivalent software algorithm. Overall, a maximum throughput of 65.98 Mbps is achieved with the proposed FPGA design on an XC5VLX110T Virtex 5 device. In addition, the proposed design provides speedups of up to 13.84 compared to an aggressively parallelized software decoding algorithm run on a quad-core AMD processor. Moreover, the design affords 12 times higher power efficiency in terms of throughput per watt than an ARM Coretex-A9 processor.

AB - Network coding is a well-known technique used to enhance network throughput and reliability by applying special coding to data packets. One critical problem in practice, when using the random linear network coding technique, is the high computational overhead. More specifically, using this technique in embedded systems with low computational power might cause serious delays due to the complex Galois field operations and matrix handling. To this end, this article proposes a high-performance decoding logic for random linear network coding using field-programmable gate-array (FPGA) technology. We expect that the inherent reconfigurability of FPGAs will provide sufficient performance as well as programmability to cope with changes in the specification of the coding. The main design motivation was to improve the decoding delay by dividing and parallelizing the entire decoding process. Fast arithmetic operations are achieved by the proposed parallelized GF ALUs, which allow calculations with all the elements of a single row of a matrix to be performed concurrently. To improve the flexibility in the utilization of the FPGA components, two different decoding methods have been designed and compared. The performance of the proposed idea is evaluated by comparing with the performance of the decoding process executed by general-purpose processors through an equivalent software algorithm. Overall, a maximum throughput of 65.98 Mbps is achieved with the proposed FPGA design on an XC5VLX110T Virtex 5 device. In addition, the proposed design provides speedups of up to 13.84 compared to an aggressively parallelized software decoding algorithm run on a quad-core AMD processor. Moreover, the design affords 12 times higher power efficiency in terms of throughput per watt than an ARM Coretex-A9 processor.

UR - http://www.scopus.com/inward/record.url?scp=84883879141&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84883879141&partnerID=8YFLogxK

U2 - 10.1145/2512469

DO - 10.1145/2512469

M3 - Article

AN - SCOPUS:84883879141

VL - 13

JO - Transactions on Embedded Computing Systems

JF - Transactions on Embedded Computing Systems

SN - 1539-9087

IS - 1

M1 - 13

ER -