Stochastic control

Stochastic control refers to the general area in which the distributions of certain random variables depend on the choice of some controls, and one looks for an optimal strategy for choosing those controls in order to maximize or minimize the expected value of a random variable.

The random variable to optimize is computed in terms of some stochastic process. It is usually the value of some given function evaluated at the end point of the stochastic process.

The expected value of this random variable, as a function of the starting point of the stochastic process, satisfies a fully nonlinear integro-differential equation. If the stochastic processes involved are diffusions, the equations are classical second order PDEs. If the stochastic process is a Lévy process with jumps, then the equations have a nonlocal part.

Standard stochastic control: the Bellman equation

Consider a family of stochastic processes $X_t^\alpha$ indexed by a parameter $\alpha \in A$, whose corresponding generators are the operators $L^\alpha$. We consider the following dynamic programming setting: the parameter $\alpha$ is a control that can be changed at any time.

We look for the optimal choice of the control, the one that maximizes the expected value of a given function $g$ evaluated at the point where the process $X_t$ first exits a domain $D$. If we call this maximal expected value $u(x)$, as a function of the initial point $X_0 = x$, then $u$ solves the Bellman equation \[ \sup_{\alpha \in A} L^\alpha u = 0 \qquad \text{in } D.\]

This is a fully nonlinear convex equation.
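
The structure of the Bellman equation can be illustrated on a toy discrete model. The following is a minimal sketch, not a general solver: it assumes a random walk on the grid $\{0, 1, \dots, n\}$ in which each control is simply a probability $p$ of stepping right, so that $L^p u(i) = p\,u(i+1) + (1-p)\,u(i-1) - u(i)$, and it solves $\max_p L^p u = 0$ by fixed point iteration. The function name `solve_bellman` and all of its parameters are hypothetical choices made for this example.

```python
import numpy as np

def solve_bellman(g_left, g_right, probs, n=20, tol=1e-12, max_iter=100_000):
    """Discrete Bellman equation max_p L^p u = 0 on {0, ..., n}.

    Assumed toy model: under control p the walk moves right with
    probability p and left with probability 1 - p. The payoff is
    g_left on exit at 0 and g_right on exit at n.
    """
    # Initial guess: linear interpolation of the boundary data.
    u = np.linspace(g_left, g_right, n + 1)
    for _ in range(max_iter):
        # candidates[k, i]: expected value at interior point i if the
        # k-th control is used for the next step.
        candidates = np.array([p * u[2:] + (1 - p) * u[:-2] for p in probs])
        # The controller picks the best control at every point.
        new_interior = candidates.max(axis=0)
        change = np.max(np.abs(new_interior - u[1:-1]))
        u[1:-1] = new_interior
        if change < tol:
            break
    return u
```

For instance, `solve_bellman(0.0, 1.0, [0.3, 0.5, 0.7])` returns the maximal probability of exiting on the right. In this toy model the optimal strategy is to always use the control with the largest drift to the right, so the result agrees with the classical gambler's ruin formula for $p = 0.7$.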

When the operators $L^\alpha$ are second order and uniformly elliptic, the solution is $C^{2,\gamma}$ for some $\gamma \in (0,1)$. This is the result of the Evans-Krylov theorem. When the operators $L^\alpha$ are integro-differential operators, one can still prove that the solution is classical provided their kernels satisfy some uniform assumptions. This is the nonlocal Evans-Krylov theorem.

There are many variants of this problem. If the value of $g$ is given at a previously specified time $T$, then $u(x,t)$ solves the backwards parabolic Bellman equation.
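
As an illustrative sketch of this variant, assume that the process runs in the whole space $\mathbb{R}^n$ and that $u(x,t)$ denotes the maximal expected value of $g(X_T)$ given $X_t = x$ (both are assumptions made here for concreteness). With the same sign convention as in the stationary equation, the terminal value problem would read \[ \partial_t u + \sup_{\alpha \in A} L^\alpha u = 0 \quad \text{in } \mathbb{R}^n \times (0,T), \qquad u(x,T) = g(x).\] The equation is called backwards because the data is prescribed at the final time $T$ and the solution is determined at earlier times.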

Optimal stopping problem

In this case the stochastic process $X_t$ is fixed and the only control is the choice of when to stop it. The solution of this problem is obtained through the obstacle problem.
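
The link with the obstacle problem can be seen in a toy discrete model. The sketch below assumes a symmetric random walk on a finite grid, a payoff $g$ collected when the walk is stopped, and forced stopping at the two end points. The value function then satisfies $u(i) = \max\left( g(i), \tfrac12 (u(i+1) + u(i-1)) \right)$, a discrete analogue of $\min(-Lu, u - g) = 0$. The function name `solve_optimal_stopping` is a hypothetical choice for this example.

```python
import numpy as np

def solve_optimal_stopping(g, tol=1e-12, max_iter=100_000):
    """Discrete obstacle problem for a symmetric random walk.

    Assumed toy model: g[i] is the payoff for stopping at grid point i,
    and the walk is forced to stop at the two end points of the grid.
    """
    g = np.asarray(g, dtype=float)
    u = g.copy()  # start from the obstacle itself
    for _ in range(max_iter):
        # Expected value of taking one more step instead of stopping.
        continuation = 0.5 * (u[2:] + u[:-2])
        # Stop where the payoff beats the continuation value.
        new_interior = np.maximum(g[1:-1], continuation)
        change = np.max(np.abs(new_interior - u[1:-1]))
        u[1:-1] = new_interior
        if change < tol:
            break
    return u
```

The set where the returned `u` equals `g` is the region where stopping immediately is optimal. Elsewhere `u` is discretely harmonic and it is better to let the process run.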

Zero sum games: the Isaacs equation

This is a stochastic game with two players. Each player has one control, and the two players have opposite objectives. The first player chooses a control $\alpha$ to minimize the final expectation, whereas the second player chooses a control $\beta$ (knowing the choice of $\alpha$) to maximize it. The resulting value function $u(x)$ satisfies the Isaacs equation \[ \inf_\alpha \left( \sup_\beta \left( L^{\alpha \beta} u \right) \right) = 0 \qquad \text{in } D.\]
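
The inf-sup structure can again be sketched on a toy discrete model. The code below assumes a random walk on $\{0, \dots, n\}$ in which the pair of controls $(\alpha, \beta)$ only selects the probability `probs[a][b]` of stepping right, and it iterates $u(i) \leftarrow \min_\alpha \max_\beta \left( p_{\alpha\beta}\, u(i+1) + (1 - p_{\alpha\beta})\, u(i-1) \right)$ until it stabilizes. The function name `solve_isaacs` and the table of probabilities are hypothetical choices made for this example.

```python
import numpy as np

def solve_isaacs(g_left, g_right, probs, n=10, tol=1e-12, max_iter=100_000):
    """Discrete Isaacs equation min_a max_b L^{ab} u = 0 on {0, ..., n}.

    Assumed toy model: probs[a][b] is the probability of stepping right
    when the minimizer plays a and the maximizer answers with b.
    """
    probs = np.asarray(probs, dtype=float)  # shape (num_alpha, num_beta)
    u = np.zeros(n + 1)
    u[0], u[-1] = g_left, g_right
    for _ in range(max_iter):
        right, left = u[2:], u[:-2]
        # payoff[a, b, i]: expected value at interior point i after one
        # step played with the pair of controls (a, b).
        p = probs[:, :, None]
        payoff = p * right + (1 - p) * left
        # The maximizer answers last (max over b), then the minimizer
        # picks the least bad option (min over a).
        new_interior = payoff.max(axis=1).min(axis=0)
        change = np.max(np.abs(new_interior - u[1:-1]))
        u[1:-1] = new_interior
        if change < tol:
            break
    return u
```

With `probs = [[0.3, 0.6], [0.5, 0.4]]` and payoffs `0.0` on the left and `1.0` on the right, the maximizer would answer the first row with `0.6` and the second with `0.5`, so the minimizer plays the second row and the value function is that of the symmetric walk.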

Note that any fully nonlinear PDE of second order $F(D^2u, Du) = 0$ with $F$ (degenerate) elliptic and Lipschitz can be written as an Isaacs equation, and therefore derived from a zero sum game, for some appropriate choice of the linear operators $L^{\alpha \beta}$.

If the Lévy processes contain jumps, the corresponding equation will be a fully nonlinear integro-differential equation.