Time-of-Flight (ToF) sensors are widely used in computer vision because they provide depth information in real time. However, the depth map obtained from a ToF sensor is corrupted by acquisition errors and has a lower resolution than images from conventional color cameras. In this paper, we propose a novel framework to fuse and upsample multi-view depth maps obtained from multiple ToF sensors. The proposed method is robust to camera calibration error and can be applied effectively to a Multi-view Video plus Depth (MVD) system. To this end, we perform depth balancing and confidence-map-based multi-view depth fusion. Depth balancing adjusts the distribution of depth values across the ToF sensors so that corresponding points in different depth maps receive coherent depth values. Confidence-map-based multi-view depth fusion corrects depth acquisition errors and aligns the multiple depth maps with the corresponding color image by using only reliable depth values. Experimental results show that the proposed method using multiple ToF sensors outperforms the conventional method based on a 2D-plus-depth system consisting of one color camera and one depth sensor.