Dependability state model
A dependability state diagram is a method for modelling a system as a Markov chain. It is used in reliability engineering for availability and reliability analysis.{{cite book
| author = Bjarne E. Helvik
| title = Dependable Computing Systems and Communication Networks
| quote =
| publisher = Gnist Tapir
| year = 2007
| pages =
| url =
| doi =
}}
It consists of creating a finite-state machine which represent the different
states a system may be in. Transitions between states happen as a result of events from underlying Poisson processes with different intensities.
Example
File:Dep-state-model-example.png
A redundant computer system consist of identical two-compute nodes, which each fail with an intensity of . When failed, they are repaired one at the time by a single repairman with negative exponential distributed repair times with expectation .
- state 0: 0 failed units, normal state of the system.
- state 1: 1 failed unit, system operational.
- state 2: 2 failed units. system not operational.
Intensities from state 0 and state 1 are , since each compute node has a failure intensity of . Intensity from state 1 to state 2 is .
Transitions from state 2 to state 1 and state 1 to state 0 represent the repairs of the compute nodes and have the intensity , since only a single unit is repaired at the time.
= Availability =
The asymptotic availability, i.e. availability over a long period, of the system is equal to the probability that the model is in state 1 or state 2.
This is calculated by making a set of linear equations of the state transition and solving the linear system.
The matrix is constructed with a row for each state. In a row, the intensity into the state is set in the column with the same index, with a negative term.
:
0 & -\mu & 0 \\
-\lambda & 0 & -\mu \\
0 & \lambda & 0
\end{bmatrix}.
The identities cells balance the sum of their column to 0:
:
(\lambda) & -\mu & 0 \\
-\lambda & (\lambda+\mu) & -\mu \\
0 & -\lambda & (\mu) \\
\end{bmatrix}.
In addition the equality clause must be taken into account:
:
By solving this equation, the probability of being in state 1 or state 2 can be found, which
is equal to the long-term availability of the service.
= Reliability =
The reliability of the system is found by making the failure states absorbing, i.e. removing all outgoing state transitions.
For this system the function is:
:
R(t) = e^{-\lambda t} \,
Criticism
Finite state models of systems are subject to state explosion. To create
a realistic model of a system one ends up with a model with so many states that it is infeasible to solve or draw the model.