Markov reward model
In probability theory, a Markov reward model or Markov reward process is a stochastic process which extends either a Markov chain or continuous-time Markov chain by adding a reward rate to each state. An additional variable records the reward accumulated up to the current time.{{Cite book | last1 = Begain | first1 = K. | last2 = Bolch | first2 = G. | last3 = Herold | first3 = H. | doi = 10.1007/978-1-4615-1387-2_2 | chapter = Theoretical Background | title = Practical Performance Modeling | url = https://archive.org/details/practicalperform00bega | url-access = limited | pages = [https://archive.org/details/practicalperform00bega/page/n18 9] | year = 2001 | isbn = 978-1-4613-5528-1 }} Features of interest in the model include the expected reward at a given time and the expected time to accumulate a given reward.{{Cite book | last1 = Li | first1 = Q. L. | chapter = Markov Reward Processes | doi = 10.1007/978-3-642-11492-2_10 | title = Constructive Computation in Stochastic Models with Applications | pages = 526–573 | year = 2010 | isbn = 978-3-642-11491-5 }} The model appears in Ronald A. Howard's 1971 book Dynamic Probabilistic Systems.{{cite book | first = R.A. | last = Howard | author-link1 = Ronald A. Howard | title = Dynamic Probabilistic Systems, Vol II: Semi-Markov and Decision Processes | publisher = Wiley | location = New York | year = 1971 | isbn = 0471416657 | url-access = registration | url = https://archive.org/details/dynamicprobabili00howa }} The models are often studied in the context of Markov decision processes, in which the decision strategy affects the rewards received.
The Markov Reward Model Checker tool can be used to numerically compute transient and stationary properties of Markov reward models.
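As an illustration of the expected reward at a given time, the following is a minimal sketch for the discrete-time case. It assumes a finite chain with transition matrix P, a per-step reward r[i] earned while in state i, and an initial distribution pi0; the function name and argument layout are hypothetical and are not taken from the cited sources.

```python
def expected_accumulated_reward(P, r, pi0, n_steps):
    """Expected total reward collected over the first n_steps steps.

    P   : transition matrix as a list of rows (assumed row-stochastic)
    r   : reward earned per step spent in each state (illustrative convention)
    pi0 : initial state distribution
    """
    n = len(pi0)
    pi = list(pi0)
    total = 0.0
    for _ in range(n_steps):
        # Reward for the current step, averaged over the state distribution.
        total += sum(p * ri for p, ri in zip(pi, r))
        # Advance the distribution by one step: pi <- pi P.
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return total


# Hypothetical two-state example: state 0 earns 1 per step, state 1 earns 0.
P = [[0.9, 0.1],
     [0.5, 0.5]]
reward_after_two_steps = expected_accumulated_reward(P, [1.0, 0.0], [1.0, 0.0], 2)
```

In this example the chain starts in state 0, so the first step contributes 1 and the second contributes the probability 0.9 of still being in state 0.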
Continuous-time Markov chain
The accumulated reward at a time t can be computed numerically over the time domain, or by evaluating the linear hyperbolic system of equations that describes the accumulated reward using transform methods or finite difference methods.{{Cite journal | last1 = Reibman | first1 = A. | last2 = Smith | first2 = R. | last3 = Trivedi | first3 = K. | doi = 10.1016/0377-2217(89)90335-4 | title = Markov and Markov reward model transient analysis: An overview of numerical approaches | journal = European Journal of Operational Research | volume = 40 | issue = 2 | pages = 257 | year = 1989 | url = http://people.ee.duke.edu/~kst/markovpapers/survey.pdf}}
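One possible time-domain computation of the expected accumulated reward can be sketched as follows. The sketch assumes a finite chain with generator matrix Q (rows summing to zero), reward rates r and initial distribution pi0, and integrates the forward equation dπ/ds = πQ together with dy/ds = π(s)·r using a classical fourth-order Runge–Kutta scheme. The function name, the choice of integrator and the step count are illustrative assumptions, not the specific methods surveyed in the cited reference.

```python
def accumulated_reward_ctmc(Q, r, pi0, t, n_steps=1000):
    """Approximate the expected reward accumulated in [0, t].

    Q   : generator matrix as a list of rows (each row assumed to sum to zero)
    r   : reward rate earned per unit time in each state
    pi0 : initial state distribution
    """
    n = len(pi0)
    h = t / n_steps

    def deriv(pi):
        # Kolmogorov forward equation: d(pi)/ds = pi Q.
        return [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]

    def rate(pi):
        # Instantaneous expected reward rate: pi . r.
        return sum(p * ri for p, ri in zip(pi, r))

    pi = list(pi0)
    y = 0.0
    for _ in range(n_steps):
        # Runge-Kutta stages for the state distribution.
        k1 = deriv(pi)
        p2 = [pi[j] + 0.5 * h * k1[j] for j in range(n)]
        k2 = deriv(p2)
        p3 = [pi[j] + 0.5 * h * k2[j] for j in range(n)]
        k3 = deriv(p3)
        p4 = [pi[j] + h * k3[j] for j in range(n)]
        k4 = deriv(p4)
        # The accumulated reward uses the reward rate at the same stage points.
        y += h / 6.0 * (rate(pi) + 2.0 * rate(p2) + 2.0 * rate(p3) + rate(p4))
        pi = [pi[j] + h / 6.0 * (k1[j] + 2.0 * k2[j] + 2.0 * k3[j] + k4[j])
              for j in range(n)]
    return y


# Hypothetical two-state example: reward rate 1 in state 0 and 0 in state 1.
Q = [[-1.0, 1.0],
     [2.0, -2.0]]
reward_up_to_time_one = accumulated_reward_ctmc(Q, [1.0, 0.0], [1.0, 0.0], 1.0)
```

For this two-state example, starting in state 0, the result can be checked against the closed form 2t/3 + (1 − e^(−3t))/9. A fixed-step explicit integrator of this kind may need many steps for stiff chains, where other numerical approaches are generally preferred.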