Neural ordinary differential equations (NODEs) presented a new paradigm for constructing (continuous-time) neural networks. While they show several good characteristics in terms of the number of parameters and the flexibility in constructing neural networks, they also have a couple of well-known limitations: i) theoretically, NODEs learn only homeomorphic mapping functions, and ii) NODEs sometimes show numerical instability when solving integral problems. To address these limitations, many enhancements have been proposed. To our knowledge, however, integrating attention into NODEs has been overlooked. To this end, we present a novel method, the attentive dual co-evolving NODE (ACE-NODE): one main NODE for a downstream machine learning task and another that provides attention to the main NODE. ACE-NODE supports both pairwise and elementwise attention. In our experiments, our method outperforms existing NODE-based and non-NODE-based baselines in almost all cases by non-trivial margins.
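The abstract gives no equations, so the following is only a minimal sketch of what two co-evolving ODEs with elementwise attention could look like, not the authors' formulation. Everything in it is an assumption made for illustration: the dynamics functions (a single `tanh` layer each), the sigmoid gating `h * sigmoid(a)`, the weight names `W_h` and `W_a`, and the fixed-step explicit Euler solver. Pairwise attention, where the attention state would be a matrix, is not shown.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def co_evolve(h0, a0, W_h, W_a, t1=1.0, steps=100):
    """Integrate two coupled ODEs from t=0 to t=t1 with explicit Euler.

    h: main state, shape (d,)
    a: attention state, shape (d,), used as an elementwise gate (assumed form)
    """
    h, a = h0.copy(), a0.copy()
    dt = t1 / steps
    for _ in range(steps):
        gated = h * sigmoid(a)                      # elementwise attention on h
        dh = np.tanh(W_h @ gated)                   # hypothetical main dynamics
        da = np.tanh(W_a @ np.concatenate([h, a]))  # hypothetical attention dynamics
        # Both derivatives use the states from the start of the step,
        # so the two systems evolve together rather than one after the other.
        h = h + dt * dh
        a = a + dt * da
    return h, a


rng = np.random.default_rng(0)
d = 4
h0 = rng.normal(size=d)
a0 = np.zeros(d)
W_h = 0.5 * rng.normal(size=(d, d))
W_a = 0.5 * rng.normal(size=(d, 2 * d))
h1, a1 = co_evolve(h0, a0, W_h, W_a)
print(h1.shape, a1.shape)
```

In practice an adaptive ODE solver would normally replace the Euler loop; the fixed-step version is used here only to keep the sketch dependency-free.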
|Title of host publication||KDD 2021 - Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining|
|Publisher||Association for Computing Machinery|
|Number of pages||10|
|Publication status||Published - 2021 Aug 14|
|Event||27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2021 - Virtual, Online, Singapore (Duration: 2021 Aug 14 → 2021 Aug 18)|
|Name||Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining|
|Conference||27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2021|
|Period||2021 Aug 14 → 2021 Aug 18|
Bibliographical note
Funding Information:
Noseong Park is the corresponding author. This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)).
© 2021 ACM.
All Science Journal Classification (ASJC) codes
- Information Systems