Publications

Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards.
Hao Qin, Kwang-Sung Jun, Chicheng Zhang.
Conference on Neural Information Processing Systems (NeurIPS), 2023.
arXiv | Code