In modern computing hardware, the performance gap between processor and memory is one of the most significant factors limiting overall system performance. Moreover, with the advent of multicore and manycore systems, memory bandwidth per core is steadily decreasing. To address this problem, many researchers have recently turned to Processing-In-Memory (PIM). In PIM, processing elements are attached to the memory side, enabling near-memory processing that suits memory-intensive applications. Prior studies of PIM, however, have considered only single-channel memory systems. In addition, PIM is a new architecture that differs from conventional computing systems, so common data layouts are not necessarily optimal for it, and the optimal data layout must also be studied. In this paper, we propose a multi-channel PIM architecture with PIM-to-PIM communication, because the data required by an operation can be distributed over several channels. To use the multi-channel PIM architecture effectively, we also introduce a data layout that minimizes the number of PIM-to-PIM communications, which are an overhead of the system, and maximizes parallelism to reduce execution time. We evaluate the proposal on vector arithmetic operations. The results show that, with the optimal data layout, execution time improves by about 393% compared to the worst case.