The decoding complexity of network coding has raised concerns about adopting it in practical P2P systems. To speed up decoding in practical network coding systems, various multi-threaded approaches that exploit hardware-supported thread-level parallelism (TLP) have been proposed. Among these parallel approaches, a dynamic partitioning method is known to be the best solution so far. However, that algorithm changes the workload distribution dynamically, which inherently limits its ability to use SIMD instruction sets, since these are designed to operate on fixed-size data. In this paper, we present a new data manipulation method for utilizing SIMD instruction sets that can be integrated into the dynamic partitioning of thread-level workload distribution. By exploiting both SIMD and thread-level parallelism, we achieve a speed-up of 10.86 with eight threads compared to the serial algorithm.
All Science Journal Classification (ASJC) codes
- Control and Systems Engineering
- Computer Science(all)
- Electrical and Electronic Engineering