Reinforcement Learning
Journal Papers
A. Encapera, A. Gosavi, and S.L. Murray. Total Productive Maintenance of Make-To-Stock Production-Inventory
Systems via Artificial-Intelligence-Based iSMART. To appear in International Journal of Systems Science: Operations & Logistics.
R.J. Lawhead and A. Gosavi. A Bounded Actor-Critic Reinforcement Learning Algorithm Applied to Airline
Revenue Management. Engineering Applications of Artificial Intelligence, 82, 252-262, 2019.
Abhijit Gosavi. Variance-penalized Markov decision processes: Dynamic programming and reinforcement learning techniques.
International Journal of General Systems, 43(6), 649-669, 2014.
Abhijit Gosavi, Susan L. Murray, Jiaqiao Hu, and Shuva Ghosh. Model-building Adaptive Critics for semi-Markov Control.
Journal of Artificial Intelligence and Soft Computing Research, 2(1): 43-58, 2012.
A. Gosavi. Target-sensitive control of Markov and semi-Markov processes, International Journal of Control, Automation, and
Systems, 9(5):1-11, 2011. MATLAB codes available here
K. Kulkarni, A. Gosavi, S. L. Murray, and K. Grantham.
Semi-Markov Adaptive Critic Heuristics with Application to Airline
Revenue Management. Journal of Control Theory and Applications (special issue on Approximate Dynamic Programming), 9(3): 421-430, 2011.
P. Shah, A. Gosavi, and R. Nagi. "A Machine Learning Approach to Optimize Usage of Recycled Material in a Remanufacturing Environment." (pdf file)
International Journal of Production Research. Vol. 48, No. 4, pp. 933–955, 2010.
Zheng Sui, Abhijit Gosavi, and Li Lin. A reinforcement learning approach for inventory replenishment in vendor-managed inventory
systems with consignment inventory, Engineering Management Journal, 22(4), 44-53, 2010.
A. Gosavi. "Reinforcement Learning: A Tutorial Survey and Recent Advances." (pdf file)
INFORMS Journal on Computing, Vol 21(2), pp. 178-192, 2009.
Abhijit Gosavi. "Boundedness of Iterates in Reinforcement Learning." (pdf file)
Systems and Control Letters, 55, pp 347-349, 2006.
Abhijit Gosavi,
"A Reinforcement Learning Algorithm Based on Policy Iteration For Average Reward: Empirical Results
with Yield Management and Convergence Analysis" (pdf file), Machine Learning, 55(1), pp 5-29, 2004.
Abhijit Gosavi,
"Reinforcement Learning for Long-Run Average Cost" (pdf file),
European Journal of Operational Research, 155, pp 654-674, 2004.
Abhijit Gosavi, Tapas K. Das, and Sudeep Sarkar,
"A Simulation-Based Learning Automata Framework
for Solving Semi-Markov Decision Problems Under Long-Run Average
Reward" (pdf file), IIE Transactions, 36, pp 1-11, 2004.
Abhijit Gosavi, Naveen Bandla, and Tapas K. Das, "A Reinforcement Learning Approach to Airline Seat Allocation
for Multiple Fare Classes with Overbooking," IIE Transactions, 34, pp 729-742, 2002.
P. Pontrandolfo, Abhijit Gosavi, O.G. Okogbaa, and Tapas K. Das,
"Global Supply Chain Management: A Reinforcement Learning Approach,"
International Journal of Production Research, vol 40, no 6, pp 1299-1317, 2002.
Tapas K. Das, Abhijit Gosavi, Sridhar Mahadevan, and Nicholas Marchalleck, "Solving Semi-Markov Decision Problems
using Average Reward Reinforcement Learning," Management Science, April, vol. 45, No. 4, pp 560-574, 1999.