Speeding up Inference with User Simulators through Policy Modulation

Hee Seung Moon, Seungwon Do, Wonjae Kim, Jiwon Seo, Minsuk Chang, Byungjoo Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

The simulation of user behavior with deep reinforcement learning agents has shown some recent success. However, the inverse problem, that is, inferring the free parameters of the simulator from observed user behaviors, remains challenging to solve. This is because a new action policy must be optimized for the simulated agent whenever the model parameters change, which is computationally impractical. In this study, we introduce a network modulation technique that can obtain a generalized policy that immediately adapts to the given model parameters. Further, we demonstrate that the proposed technique improves the efficiency of user simulator-based inference by eliminating the need to obtain an action policy for novel model parameters. We validated our approach using the latest user simulator for point-and-click behavior. As a result, we succeeded in inferring the user's cognitive parameters and intrinsic reward settings with less than 1/1000 of the computation required by existing methods.
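The idea of a single policy that adapts to given model parameters can be illustrated with a minimal sketch. The abstract does not describe the network architecture or the inference procedure, so everything below is an assumption made for illustration: the modulation is a feature-wise scale and shift of the hidden layer, the weights are random stand-ins for a policy that would in practice be trained with reinforcement learning over a range of parameters, the inference step is a plain grid search, and all names and sizes (`modulated_policy`, `infer_params`, `OBS_DIM`, and so on) are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained stand-in weights are reproducible

OBS_DIM, PARAM_DIM, HIDDEN_DIM, ACT_DIM = 4, 2, 8, 2  # hypothetical sizes


def rand_matrix(rows, cols):
    """Small random weights standing in for trained ones (assumption)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


# Main policy network: observation -> hidden layer -> action.
W_hidden = rand_matrix(HIDDEN_DIM, OBS_DIM)
W_out = rand_matrix(ACT_DIM, HIDDEN_DIM)
# Modulation network: simulator parameters -> per-unit scale and shift.
W_scale = rand_matrix(HIDDEN_DIM, PARAM_DIM)
W_shift = rand_matrix(HIDDEN_DIM, PARAM_DIM)


def modulated_policy(obs, params):
    """Return an action for `obs` under simulator parameters `params`.

    The hidden activations are scaled and shifted by values computed from
    `params`, so a different `params` gives a different policy with no
    retraining step.
    """
    hidden = matvec(W_hidden, obs)
    scale = [1.0 + s for s in matvec(W_scale, params)]
    shift = matvec(W_shift, params)
    modulated = [math.tanh(g * h + b) for g, h, b in zip(scale, hidden, shift)]
    return matvec(W_out, modulated)


def distance(a, b):
    """Euclidean distance between two action vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def infer_params(observations, observed_actions, candidates):
    """Return the candidate whose policy best reproduces the observed actions.

    Each candidate costs only forward passes through the modulated policy.
    """
    def loss(params):
        return sum(distance(modulated_policy(o, params), a)
                   for o, a in zip(observations, observed_actions))
    return min(candidates, key=loss)


# Synthetic "user" data generated from known parameters (illustrative only).
observations = [[random.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(20)]
true_params = [0.6, 0.2]
observed_actions = [modulated_policy(o, true_params) for o in observations]

# Grid of candidate parameter settings; it contains the generating values.
candidates = [[i / 5, j / 5] for i in range(6) for j in range(6)]
estimate = infer_params(observations, observed_actions, candidates)
print(estimate)
```

The point of the sketch is the cost structure rather than the specific search: evaluating a candidate parameter setting requires only forward passes, whereas without a parameter-conditioned policy each candidate would need its own policy optimization.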

Original language: English
Title of host publication: CHI 2022 - Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450391573
Publication status: Published - 2022 Apr 29
Event: 2022 CHI Conference on Human Factors in Computing Systems, CHI 2022 - Virtual, Online, United States
Duration: 2022 Apr 30 to 2022 May 5

Publication series

Name: Conference on Human Factors in Computing Systems - Proceedings

Conference

Conference: 2022 CHI Conference on Human Factors in Computing Systems, CHI 2022
Country/Territory: United States
City: Virtual, Online
Period: 22/4/30 to 22/5/5

Bibliographical note

Funding Information:
This work was supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07043580), in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2020R1A2C400214612), and in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea government (MSIT) (2020-0-01361).

Publisher Copyright:
© 2022 ACM.

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design
  • Software
