In the 5G era, the operational cost of networks is expected to increase significantly because of network densification, concurrent operation at multiple frequency bands, and the simultaneous use of different medium access schemes, all latent components of 5G. To decrease this operational cost, engineers have turned to self-organizing networks (SONs), which facilitate the automatic operation of a network. Challenges have emerged, however, that hinder the current SON paradigm from meeting the requirements of 5G. To overcome these challenges, researchers have proposed a framework for empowering SON with big data. This big-data-empowered SON framework uses machine learning tools to analyze the relationship between key performance indicators (KPIs) and related network parameters (NPs); with those parameters, it develops regression models using Gaussian processes. These models are then applied to the SON engine to further optimize network operation. One problem, however, is that the method for finding the NPs related to a KPI differs from case to case. In addition, the Gaussian process regression model makes it difficult to understand the relationship between a KPI and the various NPs related to that KPI, because it is a single regression. In this paper, to alleviate these two problems, we propose multiple regression models based on MapReduce, in which a KPI is the dependent variable and the NPs are the independent variables. We also describe implementation issues with MapReduce.
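The abstract does not say how the multiple regression is split into map and reduce steps, so the following is only a minimal sketch of one common approach, not the paper's implementation. It assumes ordinary least squares solved through the normal equations: each mapper turns its chunk of (NP values, KPI) records into partial sums of X^T X and X^T y, a reducer adds those partial sums, and the coefficients are solved once at the end. All function names are hypothetical, the records are synthetic, and Python's built-in `map` and `functools.reduce` stand in for a real MapReduce cluster.

```python
from functools import reduce


def map_partial_sums(chunk):
    """Map step: turn one chunk of (np_values, kpi) records into partial sums.

    Returns (X^T X, X^T y) for the chunk, with a leading 1.0 added to each
    row so that the model includes an intercept.
    """
    dim = len(chunk[0][0]) + 1
    xtx = [[0.0] * dim for _ in range(dim)]
    xty = [0.0] * dim
    for np_values, kpi in chunk:
        row = [1.0] + list(np_values)
        for i in range(dim):
            xty[i] += row[i] * kpi
            for j in range(dim):
                xtx[i][j] += row[i] * row[j]
    return xtx, xty


def reduce_partial_sums(left, right):
    """Reduce step: add two partial (X^T X, X^T y) pairs element-wise."""
    xtx = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(left[0], right[0])]
    xty = [a + b for a, b in zip(left[1], right[1])]
    return xtx, xty


def solve_normal_equations(xtx, xty):
    """Solve (X^T X) beta = X^T y by Gaussian elimination with partial pivoting."""
    n = len(xty)
    aug = [list(xtx[i]) + [xty[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; the NPs may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (aug[i][n] - tail) / aug[i][i]
    return beta


def fit_kpi_model(chunks):
    """Fit KPI = b0 + b1*NP1 + ... + bk*NPk from data split into chunks."""
    xtx, xty = reduce(reduce_partial_sums, map(map_partial_sums, chunks))
    return solve_normal_equations(xtx, xty)


# Synthetic records (made up for illustration): kpi = 2 + 3*np1 - 1*np2
records = [((x1, x2), 2.0 + 3.0 * x1 - 1.0 * x2)
           for x1 in range(4) for x2 in range(3)]
chunks = [records[0:5], records[5:9], records[9:12]]
coefficients = fit_kpi_model(chunks)  # [intercept, weight of np1, weight of np2]
```

Because the partial sums are simply added, the fitted coefficients do not depend on how the records are split across mappers, which is what makes this formulation a natural fit for MapReduce. The fitted weights also show directly how each NP contributes to the KPI, in line with the interpretability argument above.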