The many integrated core (MIC) architecture combines dozens of reduced x86 cores on a single chip to offer a high degree of parallelism. Parallel user applications that run across the many cores of one or more MICs require a series of operations for data sharing and synchronization with the host. In this work, we build a real CPU+MIC heterogeneous cluster and analyze its performance behavior under different communication methods, such as message passing and remote direct memory access (RDMA). Our evaluation results and in-depth studies reveal that (i) aggregating small messages can improve network bandwidth without violating latency restrictions; (ii) although MICs can execute on hundreds of hardware cores, the highest network throughput is achieved when only 4 to 6 point-to-point connections are established for data communication; and (iii) data communication over multiple point-to-point connections between the host and MICs introduces severe load imbalance, which needs to be optimized for future heterogeneous computing.
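The first finding, that small messages can be aggregated without violating latency restrictions, can be illustrated with a minimal sketch. It is not the mechanism evaluated in the paper: the class name `MessageAggregator` and the knobs `max_bytes` and `max_delay_s` are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MessageAggregator:
    """Buffers small messages and emits them as one larger batch.

    max_bytes and max_delay_s are hypothetical tuning knobs chosen for
    illustration; they are not values taken from the paper.
    """
    max_bytes: int = 4096        # flush once this many bytes are buffered
    max_delay_s: float = 0.001   # flush once the oldest message waited this long
    _buffer: List[bytes] = field(default_factory=list)
    _size: int = 0
    _oldest: Optional[float] = None

    def add(self, msg: bytes, now: float) -> Optional[bytes]:
        """Queue msg; return a batch if a flush condition is met, else None."""
        if self._oldest is None:
            self._oldest = now  # arrival time of the oldest buffered message
        self._buffer.append(msg)
        self._size += len(msg)
        size_reached = self._size >= self.max_bytes
        deadline_reached = now - self._oldest >= self.max_delay_s
        if size_reached or deadline_reached:
            return self.flush()
        return None

    def flush(self) -> Optional[bytes]:
        """Return all buffered bytes as one batch and reset the buffer."""
        if not self._buffer:
            return None
        # Plain concatenation drops message boundaries; a real transport
        # would add framing (for example, a length prefix per message).
        batch = b"".join(self._buffer)
        self._buffer.clear()
        self._size = 0
        self._oldest = None
        return batch
```

The size threshold trades per-message overhead for bandwidth, while the delay bound caps how long any single message can wait in the buffer. The deadline is only checked when a new message arrives, so a real implementation would also need a timer to flush an idle buffer.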
| Field | Value |
|---|---|
| Title of host publication | Network and Parallel Computing - 14th IFIP WG 10.3 International Conference, NPC 2017, Proceedings |
| Editors | Xuanhua Shi, Mahmut Kandemir, Hong An, Chao Wang, Hai Jin |
| Number of pages | 5 |
| Publication status | Published - 2017 |
| Event | 14th IFIP WG 10.3 International Conference on Network and Parallel Computing, NPC 2017 - Hefei, China |
| Duration | 2017 Oct 20 → 2017 Oct 21 |
| Series | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Bibliographical note
Funding Information:
Acknowledgement. This research is mainly supported by NRF 2016R1C1B2015312. This work is also supported in part by IITP-2017-2017-0-01015, NRF-2015M3C4A7065645, DOE DE-AC02-05CH11231 and MemRay grant (2015-11-1731). The corresponding author is M. Jung.
© IFIP International Federation for Information Processing 2017.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science (all)