All the input parameters are specified in the global.m files. For i = 1, 2, 3, and 4, the file globali.m corresponds to mdpi in the paper. The file to be executed in the MATLAB shell is main.m.
In main.m, one must call whichever globali.m file is to be tested. It is best to download the files and save them as plain-text .m files, since MATLAB cannot run a file saved as an MS Word document.
1. The codes for Q-Learning (for discounted reward) are here.
Please go to qlearn.m to modify the step-size. Also, one must use the right globali.m to obtain the desired result.
2. The codes for Q-Value Iteration for discounted reward MDPs are here.
3. The codes for R-SMART for the MDP using average reward (Abhijit Gosavi, "Reinforcement Learning for Long-Run Average Cost," European Journal of Operational Research, 155, pp. 654-674, 2004) are here.
4. The codes for discounted reward Q-Learning (MDP) that use a simple neuron as a function approximator are here.
5. The codes for Q-Learning for survival probabilities (MDP) are here.
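
For readers who want to see where a step-size enters discounted-reward Q-Learning before opening qlearn.m, the following is a minimal, self-contained sketch. It is not the contents of qlearn.m or of any globali.m. The two-state, two-action MDP, the discount factor lambda, the iteration count itermax, the variable names, and the 1/n step-size rule are all assumptions chosen for illustration.

```matlab
% Illustrative sketch only: a hypothetical 2-state, 2-action MDP.
% P(i,j,a): probability of moving from state i to state j under action a.
% R(i,j,a): immediate reward earned on that transition.
P = zeros(2,2,2);
P(:,:,1) = [0.7 0.3; 0.4 0.6];
P(:,:,2) = [0.9 0.1; 0.2 0.8];
R = zeros(2,2,2);
R(:,:,1) = [6 -5; 7 12];
R(:,:,2) = [10 17; -14 13];
lambda  = 0.8;      % discount factor (assumed value)
itermax = 10000;    % number of simulated transitions (assumed value)

rng(0);             % fix the seed so that runs are repeatable
Q = zeros(2,2);     % Q(state, action), initialized to zero
visits = zeros(2,2);% visit count of each state-action pair
s = 1;              % starting state

for k = 1:itermax
    a = randi(2);                       % explore: each action with probability 1/2
    visits(s,a) = visits(s,a) + 1;
    alpha = 1 / visits(s,a);            % step-size rule (assumed); change this line to test others
    sNext = 1 + (rand > P(s,1,a));      % sample the next state from row s of P(:,:,a)
    target = R(s,sNext,a) + lambda * max(Q(sNext,:));
    Q(s,a) = (1 - alpha) * Q(s,a) + alpha * target;   % Q-Learning update
    s = sNext;
end

[~, policy] = max(Q, [], 2);            % greedy action in each state
disp(Q);
disp(policy');
```

Replacing the alpha line with another decaying rule, for example one of the form A/(B + n), is one way to see how sensitive the learned Q-values are to the step-size on a small problem.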