RLux: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation

Chao Yu, Tsinghua University; Yuanqing Wang, Peking University; Zhen Guo, Hao Lin, and Si Xu, Infinigence AI; Hongzhi Zang, Tsinghua University; Quanlu Zhang, Infinigence AI; Yongji Wu, UC Berkeley; Chunyang Zhu and Junhao Hu, Infinigence AI; Zixiao Huang, Tsinghua University; Mingjie Wei, Harbin Institute of Technology; Yuqing Xie, Tsinghua University; Ke Yang, Harbin Institute of Technology; Bo Dai, Beihang University; Zhexuan Xu and Jiakun Du, Tsinghua University; Xiangyuan Wang, Peking University; Xu Fu and Letong Shi, Infinigence AI; Zhihao Liu, Institute of Automation, Chinese Academy of Sciences; Kang Chen, Peking University; Weilin Liu, Infinigence AI; Gang Liu, Tsinghua University; Boxun Li, Infinigence AI; Jianlei Yang, Beihang University; Zhi Yang, Peking University; Guohao Dai, Shanghai Jiao Tong University; Yu Wang, Tsinghua University