Knowledge Distillation (KD) has been widely studied for recommender systems. KD is a model-agnostic strategy that produces a small but powerful student model by transferring knowledge from a pre-trained large teacher model. Recent work has shown that knowledge from the teacher's representation space significantly improves the student model. The state-of-the-art method, named Distillation Experts (DE), adopts cluster-wise distillation, transferring the knowledge of each representation cluster separately so that the diverse preference knowledge is distilled in a balanced manner. However, it is challenging to apply DE to a new environment, since its performance depends heavily on several key assumptions and hyperparameters that must be tuned for each dataset and each base model. In this work, we propose a novel method, dubbed Personalized Hint Regression (PHR), which distills the preference knowledge in a balanced way without relying on any assumptions about the representation space or any method-specific hyperparameters. To circumvent the clustering, PHR employs a personalization network that enables personalized distillation to the student space for each user/item representation; this can be viewed as a generalization of DE. Extensive experiments on real-world datasets show that PHR achieves performance comparable to, or even better than, DE with all of its hyperparameters tuned by grid search.
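The abstract only outlines the idea, so the following is a minimal PyTorch-style sketch of what a personalized hint regression could look like, not the paper's actual implementation. It assumes a small hypernetwork, conditioned on each user/item's teacher representation, that generates a per-entity projection applied to the student representation, trained with a regression (hint) loss against the teacher representation. The class name PersonalizationNet, the hidden size, the projection direction (student-to-teacher here), and the MSE objective are all assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PersonalizationNet(nn.Module):
    """Hypothetical sketch: a hypernetwork that, for each user/item,
    generates the weights of a personalized linear projection mapping
    the student representation toward the teacher representation,
    avoiding any clustering of the teacher space."""

    def __init__(self, teacher_dim: int, student_dim: int, hidden: int = 64):
        super().__init__()
        self.teacher_dim = teacher_dim
        self.student_dim = student_dim
        # Generates a (teacher_dim x student_dim) projection per entity,
        # conditioned on that entity's teacher representation.
        self.weight_gen = nn.Sequential(
            nn.Linear(teacher_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, teacher_dim * student_dim),
        )

    def forward(self, t: torch.Tensor, s: torch.Tensor) -> torch.Tensor:
        # t: (batch, teacher_dim) teacher reps, s: (batch, student_dim) student reps
        W = self.weight_gen(t).view(-1, self.teacher_dim, self.student_dim)
        # Personalized projection of each student rep into the teacher space.
        s_proj = torch.bmm(W, s.unsqueeze(-1)).squeeze(-1)  # (batch, teacher_dim)
        # Hint-regression loss against the (frozen) teacher representation.
        return F.mse_loss(s_proj, t.detach())
```

In training, such a loss would presumably be added to the student's base recommendation loss; because the projection is generated per representation rather than per cluster, no clustering assumptions or cluster-count hyperparameters are needed, which matches the motivation stated above.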
Publication status: Published - 2022 Mar 5
Bibliographical note
Funding Information: This work was supported by the NRF grant funded by the MSIT (South Korea, No. 2020R1A2B5B03097210), the IITP grant funded by the MSIT (South Korea, No. 2018-0-00584, 2019-0-01906), and the Technology Innovation Program funded by the MOTIE (South Korea, No. 20014926).
© 2021 Elsevier B.V.
All Science Journal Classification (ASJC) codes
- Management Information Systems
- Information Systems and Management
- Artificial Intelligence