A Minibatch Stochastic Gradient Descent-Based Learning Metapolicy for Inventory Systems with Myopic Optimal Policy
- Jiameng Lyu (Corresponding Author)
[email protected]; ORCID: https://orcid.org/0000-0002-4688-5276
Yau Mathematical Sciences Center, Tsinghua University, Beijing 100084, China; and Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China
- Jinxing Xie (Corresponding Author)
[email protected]; ORCID: https://orcid.org/0000-0002-9269-6468
Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China
- Shilin Yuan (Corresponding Author)
[email protected]; ORCID: https://orcid.org/0009-0002-7892-0344
Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China
- Yuan Zhou (Corresponding Author)
[email protected]; ORCID: https://orcid.org/0009-0008-1706-6539
Yau Mathematical Sciences Center, Tsinghua University, Beijing 100084, China; and Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China; and Beijing Institute of Mathematical Sciences and Application, Beijing 100084, China

