For a robot to understand a scene, it must infer and extract meaningful information from vision sensor data. Because scene understanding consists of recognizing several visual contexts, we have to extract these contextual cues and understand their relationships. However, extracting context from visual information is difficult because of uncertain information in variable environments, the imperfect nature of feature extraction methods, and the high computational complexity of reasoning over complex relationships. To manage these uncertainties effectively, we adopt a Bayesian probabilistic approach and propose a Bayesian network framework that synthesizes low-level features and high-level semantic cues. We describe how to develop and use an integrated Bayesian network model. Experimental results from two applications show the efficacy of the proposed framework.