Learn an internal model of observation to estimate rewards without completely learning the dynamics-->
Eye of the Beholder: Improved Relation Generalization for Text-based Reinforcement Learning Agents. Keerthiram Murugesan, Subhajit Chaudhury, Kartik Talamadupula. Association for the Advancement of Artificial Intelligence (AAAI), 2022. Paper
Neuro-symbolic Approaches for Text-based Policy Learning. Subhajit Chaudhury, Prithviraj Sen, Masaki Ono, Daiki Kimura, Michiaki Tatsubori, and Asim Munawar. Empirical Methods in Natural Language Processing (EMNLP), 2020. Paper