Derivative of the exponential map

{{Short description|Formula in Lie group theory}}

[[Image:poincare.jpg|thumb|Henri Poincaré's investigations into group multiplication in Lie algebraic terms led him to the formulation of the universal enveloping algebra.{{harvnb|Schmid|1982}}]]

In the theory of Lie groups, the exponential map is a map from the Lie algebra {{math|𝔤}} of a Lie group {{math|G}} into {{math|G}}. In case {{math|G}} is a matrix Lie group, the exponential map reduces to the matrix exponential. The exponential map, denoted {{math|exp: 𝔤 → G}}, is analytic and has as such a derivative {{math|{{sfrac|d|dt}}exp(X(t)): T𝔤 → TG}}, where {{math|X(t)}} is a {{math|C1}} path in the Lie algebra, and a closely related differential {{math|dexp: T𝔤 → TG}}.{{harvnb|Rossmann|2002}} Appendix on analytic functions.

The formula for {{math|dexp}} was first proved by Friedrich Schur (1891).{{harvnb|Schur|1891}} It was later elaborated by Henri Poincaré (1899) in the context of the problem of expressing Lie group multiplication using Lie algebraic terms.{{harvnb|Poincaré|1899}} It is also sometimes known as Duhamel's formula.

The formula is important both in pure and applied mathematics. It enters into proofs of theorems such as the Baker–Campbell–Hausdorff formula, and it is used frequently in physics,{{harvnb|Suzuki|1985}} for example in quantum field theory, as in the Magnus expansion in perturbation theory, and in lattice gauge theory.

Throughout, the notations {{math|exp(X)}} and {{math|eX}} will be used interchangeably to denote the exponential given an argument, except where, as noted, the notations have dedicated distinct meanings. The calculus-style notation is preferred here for better readability in equations. On the other hand, the {{math|exp}}-style is sometimes more convenient for inline equations, and is necessary on the rare occasions when there is a real distinction to be made.

==Statement==

The derivative of the exponential map is given by{{harvnb|Rossmann|2002}} Theorem 5 Section 1.2

{{Equation box 1

|indent =:

|equation =

\frac{d}{dt}e^{X(t)} = e^{X(t)}\frac{1 - e^{-\mathrm{ad}_{X}}}{\mathrm{ad}_{X}}\frac{dX(t)}{dt}.               {{EquationRef|(1)}}

|cellpadding= 6

|border

|border colour = #0073CF

|bgcolor=#F9FFF7

}}

;Explanation

{{unordered list

| {{math|1=X = X(t)}} is a {{math|C1}} (continuously differentiable) path in the Lie algebra with derivative {{math|1=X′(t) = {{sfrac|dX(t)|dt}}}}. The argument {{math|t}} is omitted where not needed.

| {{math|adX}} is the linear transformation of the Lie algebra given by {{math|1=adX(Y) = [X, Y]}}. It is the adjoint action of a Lie algebra on itself.

| The fraction {{math|{{sfrac|1 − exp(−adX)|adX}}}} is given by the power series

{{NumBlk|:|\frac{1 - e^{-\mathrm{ad}_{X}}}{\mathrm{ad}_{X}} = \sum_{k = 0}^\infty \frac{(-1)^k}{(k + 1)!}(\mathrm{ad}_X)^k. |{{EquationRef|2}}}}

derived from the power series of the exponential map of a linear endomorphism, as in matrix exponentiation.

| When {{math|G}} is a matrix Lie group, all occurrences of the exponential are given by their power series expansion.

| When {{math|G}} is not a matrix Lie group, {{math|{{sfrac|1 − exp(−adX)|adX}}}} is still given by its power series ({{EquationNote|2}}), while the other two occurrences of {{math|exp}} in the formula, which now denote the exponential map in Lie theory, refer to the time-one flow of the left invariant vector field {{mvar|X}}, i.e. an element of the Lie algebra as defined in the general case, on the Lie group {{mvar|G}} viewed as an analytic manifold. This still amounts to exactly the same formula as in the matrix case. Left multiplication of an element of the algebra {{math|𝔤}} by an element {{math|exp(X(t))}} of the Lie group is interpreted as applying the differential of the left translation {{math|dLexp(X(t))}}.

| The formula applies to the case where {{math|exp}} is considered as a map on matrix space over {{math|ℝ}} or {{math|ℂ}}, see matrix exponential. When {{math|1=G = GL(n, ℂ)}} or {{math|GL(n, ℝ)}}, the notions coincide precisely.

}}

To compute the differential {{math|dexp}} of {{math|exp}} at {{math|X}}, {{math|dexpX: T𝔤X → TGexp(X)}}, the standard recipe

:d\exp_XY = \left.\frac{d}{dt}e^{Z(t)}\right|_{t = 0}, \quad Z(0) = X, \quad Z'(0) = Y

is employed. With {{math|1=Z(t) = X + tY}} the result

{{NumBlk|:|d\exp_XY = e^{X}\frac{1 - e^{-\mathrm{ad}_{X}}}{\mathrm{ad}_{X}}Y|{{EquationRef|3}}}}

follows immediately from ({{EquationNote|1}}). In particular, {{math|1=dexp0: T𝔤0 → TGexp(0) = TGe}} is the identity map because {{math|T𝔤X ≃ 𝔤}} (since {{math|𝔤}} is a vector space) and {{math|TGe ≃ 𝔤}}.
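
The formula can be checked numerically in the matrix case. The following is a minimal sketch, assuming NumPy and SciPy are available; <code>dexp</code> is an ad hoc helper name, not a library function. It compares the series on the right-hand side of ({{EquationNote|3}}) with a central finite difference of the matrix exponential along the path {{math|X + tY}}.

<syntaxhighlight lang="python">
# Numerical check of (3): dexp_X(Y) = e^X (1 - e^{-ad_X})/ad_X (Y).
import numpy as np
from scipy.linalg import expm
from math import factorial

def dexp(X, Y, terms=60):
    """Right-hand side of (3): e^X * sum_k (-1)^k/(k+1)! * ad_X^k(Y)."""
    total = np.zeros_like(Y)
    term = Y.copy()                        # ad_X^0(Y)
    for k in range(terms):
        total += (-1) ** k / factorial(k + 1) * term
        term = X @ term - term @ X         # apply ad_X once more
    return expm(X) @ total

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))
h = 1e-6                                   # central difference along X + tY
numeric = (expm(X + h * Y) - expm(X - h * Y)) / (2 * h)
print(np.allclose(numeric, dexp(X, Y), atol=1e-8))   # True
</syntaxhighlight>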

==Proof==

The proof given below assumes a matrix Lie group. This means that the exponential mapping from the Lie algebra to the matrix Lie group is given by the usual power series, i.e. matrix exponentiation. The conclusion of the proof still holds in the general case, provided each occurrence of {{math|exp}} is correctly interpreted. See comments on the general case below.

The outline of the proof makes use of the technique of differentiation with respect to {{mvar|s}} of the parametrized expression

:\Gamma(s, t) = e^{-sX(t)}\frac{\partial}{\partial t} e^{sX(t)}

to obtain a first order differential equation for {{math|Γ}} which can then be solved by direct integration in {{mvar|s}}. The solution sought is then {{math|1={{sfrac|d|dt}}eX(t) = eX(t) Γ(1, t)}}.

===Lemma===

Let {{math|Ad}} denote the adjoint action of the group on its Lie algebra. The action is given by {{math|1=AdAX = AXA−1}} for {{math|A ∈ G}}, {{math|X ∈ 𝔤}}. A frequently useful relationship between {{math|Ad}} and {{math|ad}} is given by{{harvnb|Hall|2015}} Proposition 3.35, where a proof of the identity can be found. The relationship is simply that between a representation of a Lie group and that of its Lie algebra according to the Lie correspondence, since both {{math|Ad}} and {{math|ad}} are representations with {{math|1=ad = dAd}}.

{{Equation box 1

|indent =:

|equation =

\mathrm{Ad}_{e^X} = e^{\mathrm{ad}_X}, ~~X \in \mathfrak{g}~.               {{EquationRef|(4)}}

|cellpadding= 6

|border

|border colour = #0073CF

|bgcolor=#F9FFF7

}}
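
For matrices, the identity can be verified directly; a small NumPy/SciPy sketch, with the right-hand side of ({{EquationNote|4}}) summed as a power series:

<syntaxhighlight lang="python">
# Sanity check of (4): e^X Y e^{-X} equals e^{ad_X}(Y).
import numpy as np
from scipy.linalg import expm
from math import factorial

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))

lhs = expm(X) @ Y @ expm(-X)               # Ad_{e^X} Y

rhs = np.zeros_like(Y)
term = Y.copy()
for k in range(50):                        # e^{ad_X} Y = sum_k ad_X^k(Y)/k!
    rhs += term / factorial(k)
    term = X @ term - term @ X             # apply ad_X once more
print(np.allclose(lhs, rhs))               # True
</syntaxhighlight>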

===Proof===

Using the product rule twice one finds,

:\frac{\partial\Gamma}{\partial s} = e^{-sX}(-X)\frac{\partial}{\partial t}e^{sX(t)} + e^{-sX}\frac{\partial}{\partial t}\left[X(t)e^{sX(t)}\right] = e^{-sX}\frac{dX}{dt}e^{sX}.

Then one observes that

:\frac{\partial\Gamma}{\partial s} = \mathrm{Ad}_{e^{-sX}}X' = e^{-\mathrm{ad}_{sX}}X',

by {{EquationNote|(4)}} above. Integrating in {{mvar|s}} from {{math|0}} to {{math|1}} and using {{math|1=Γ(0, t) = 0}} yields

:\Gamma(1, t) = e^{-X(t)}\frac{\partial}{\partial t}e^{X(t)} = \int_0^1 \frac{\partial\Gamma}{\partial s}ds = \int_0^1 e^{-\mathrm{ad}_{sX}}X'ds.

Using the formal power series to expand the exponential, integrating term by term, and finally recognizing ({{EquationNote|2}}),

:\Gamma(1, t) = \int_0^1 \sum_{k = 0}^\infty \frac{(-1)^ks^k}{k!} (\mathrm{ad}_X)^k\frac{dX}{dt}ds = \sum_{k = 0}^\infty \frac{(-1)^k}{(k + 1)!}(\mathrm{ad}_X)^k \frac{dX}{dt} = \frac{1-e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}\frac{dX}{dt},

and the result follows. The proof, as presented here, is essentially the one given in {{harvtxt|Rossmann|2002}}. A proof with a more algebraic touch can be found in {{harvtxt|Hall|2015}}; see also {{harvnb|Tuynman|1995}}, from which Hall's proof is taken.

===Comments on the general case===

The formula in the general case is given by{{harvnb|Sternberg|2004}} This is equation (1.11) there.

:\frac{d}{dt}\exp(C(t)) = \exp(C)\,\phi(-\mathrm{ad}(C))\,C'(t),

where

:\phi(z) = \frac{e^z - 1}{z} = 1 + \frac{1}{2!}z + \frac{1}{3!}z^2 + \cdots.

It holds that

:\tau(\log z)\phi(-\log z) = 1

for |z − 1| < 1, where

:\tau(w) = \frac{w}{1 - e^{-w}}.

Here, {{math|τ}} is the exponential generating function of

:(-1)^k b_k,

where {{math|bk}} are the Bernoulli numbers. The formula formally reduces to

:\frac{d}{dt}\exp(C(t)) = \exp(C)\frac{1 - e^{-\mathrm{ad}_{C}}}{\mathrm{ad}_{C}}\frac{dC(t)}{dt}.

Here the {{math|exp}}-notation is used for the exponential mapping of the Lie algebra and the calculus-style notation in the fraction indicates the usual formal series expansion. For more information and two full proofs in the general case, see the freely available {{harvtxt|Sternberg|2004}} reference.
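
The identity relating {{math|τ}} and {{math|φ}} is easy to confirm numerically at a sample point; a sketch in plain complex arithmetic (the function names are ours):

<syntaxhighlight lang="python">
# Check tau(log z) * phi(-log z) = 1 for |z - 1| < 1.
import cmath

def phi(z): return (cmath.exp(z) - 1) / z
def tau(w): return w / (1 - cmath.exp(-w))

z = 0.7 + 0.4j                     # |z - 1| = 0.5 < 1
w = cmath.log(z)
print(abs(tau(w) * phi(-w) - 1))   # ~1e-16, zero up to rounding
</syntaxhighlight>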

===A direct formal argument===

An immediate way to see what the answer must be, provided it exists, is the following. Existence needs to be proved separately in each case. By direct differentiation of the standard limit definition of the exponential, and exchanging the order of differentiation and limit,

:\begin{align}

\frac{d}{dt}e^{X(t)}

&= \lim_{N \to \infty}\frac{d}{dt}\left(1 + \frac{X(t)}{N}\right)^N\\

&= \lim_{N \to \infty}\sum_{k=1}^N\left(1 + \frac{X(t)}{N}\right)^{N-k}\frac{1}{N}\frac{dX(t)}{dt}\left(1 + \frac{X(t)}{N}\right)^{k-1}~,

\end{align}

where each factor owes its place to the non-commutativity of {{math|X(t)}} and {{math|X′(t)}}.

Dividing the unit interval into {{math|N}} sections {{math|1=Δs = {{sfrac|Δk|N}}}} ({{math|1=Δk = 1}} since the sum indices are integers) and letting {{math|N → ∞}}, {{math|Δk → dk}}, {{math|{{sfrac|k|N}} → s}}, {{math|Σ → ∫}} yields

:\begin{align}

\frac{d}{dt}e^{X(t)} &= \int_{0}^1e^{(1-s)X}X'e^{sX}ds = e^X \int_{0}^1 \mathrm{Ad}_{e^{-sX}} X' ds \\

&= e^X \int_{0}^1 e^{-\mathrm{ad}_{sX}} dsX' = e^X \frac{1-e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}\frac{dX}{dt}~.

\end{align}
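
The resulting integral representation can be checked by quadrature; a sketch assuming SciPy's <code>expm</code>, with a midpoint rule standing in for the integral and a finite difference standing in for the left-hand side:

<syntaxhighlight lang="python">
# Quadrature check of d/dt e^{X(t)} = integral_0^1 e^{(1-s)X} X' e^{sX} ds
# at t = 0 along the path X(t) = X + t*Xdot.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
X = 0.5 * rng.standard_normal((4, 4))
Xdot = 0.5 * rng.standard_normal((4, 4))

n = 2000
integral = np.zeros_like(X)
for i in range(n):
    s = (i + 0.5) / n                      # midpoint rule in s
    integral += expm((1 - s) * X) @ Xdot @ expm(s * X) / n

h = 1e-6
numeric = (expm(X + h * Xdot) - expm(X - h * Xdot)) / (2 * h)
print(np.allclose(numeric, integral, atol=1e-5))   # True
</syntaxhighlight>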

==Applications==

===Local behavior of the exponential map===

The inverse function theorem together with the derivative of the exponential map provides information about the local behavior of {{math|exp}}. Any {{math|Ck, 0 ≤ k ≤ ∞, ω}} map {{math|f}} between vector spaces (here first considering matrix Lie groups) has a {{math|Ck}} local inverse, so that {{math|f}} is a {{math|Ck}} bijection in an open set around a point {{math|x}} in the domain, provided {{math|dfx}} is invertible. From ({{EquationNote|3}}) it follows that this will happen precisely when

:\frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}

is invertible. This, in turn, happens when the eigenvalues of this operator are all nonzero. The eigenvalues of {{math|{{sfrac|1 − exp(−adX)|adX}}}} are related to those of {{math|adX}} as follows. If {{math|g}} is an analytic function of a complex variable expressed in a power series such that {{math|g(U)}} for a matrix {{math|U}} converges, then the eigenvalues of {{math|g(U)}} will be {{math|g(λij)}}, where {{math|λij}} are the eigenvalues of {{math|U}}; the double subscript is made clear below. This is seen by choosing a basis for the underlying vector space such that {{math|U}} is triangular, the eigenvalues being the diagonal elements. Then {{math|Uk}} is triangular with diagonal elements {{math|λik}}. It follows that the eigenvalues of {{math|g(U)}} are {{math|g(λi)}}. See {{harvnb|Rossmann|2002}}, Lemma 6 in section 1.2. In the present case with {{math|1=g(U) = {{sfrac|1 − exp(−U)|U}}}} and {{math|1=U = adX}}, the eigenvalues of {{math|{{sfrac|1 − exp(−adX)|adX}}}} are

:\frac{1 - e^{-\lambda_{ij}}}{\lambda_{ij}},

where the {{math|λij}} are the eigenvalues of {{math|adX}}. Putting {{math|1={{sfrac|1 − exp(−λij)|λij}} = 0}} one sees that {{math|dexp}} is invertible precisely when

:\lambda_{ij} \ne k2\pi i, k = \pm1, \pm2, \ldots.

The eigenvalues of {{math|adX}} are, in turn, related to those of {{math|X}}. Let the eigenvalues of {{math|X}} be {{math|λi}}. Fix an ordered basis {{math|ei}} of the underlying vector space {{math|V}} such that {{math|X}} is lower triangular. Then

:Xe_i = \lambda_ie_i + \cdots,

with the remaining terms multiples of {{math|en}} with {{math|n > i}}. Let {{math|Eij}} be the corresponding basis for matrix space, i.e. {{math|1=(Eij)kl = δikδjl}}. Order this basis such that {{math|Eij < Enm}} if {{math|i − j < n − m}}. One checks that the action of {{math|adX}} is given by

:\mathrm{ad}_XE_{ij} = (\lambda_i - \lambda_j)E_{ij} + \cdots \equiv \lambda_{ij}E_{ij} + \cdots,

with the remaining terms multiples of {{math|Emn}} with {{math|Emn > Eij}}. This means that {{math|adX}} is lower triangular with its eigenvalues {{math|1=λij = λi − λj}} on the diagonal. The conclusion is that {{math|dexpX}} is invertible, hence {{math|exp}} is a local bi-analytic bijection around {{math|X}}, when the eigenvalues of {{math|X}} satisfy{{harvnb|Rossmann|2002}} Proposition 7, section 1.2. Matrices whose eigenvalues {{math|λ}} satisfy {{math|{{!}}Im λ{{!}} < π}} are, under the exponential, in bijection with matrices whose eigenvalues {{math|μ}} are not on the negative real line or zero. The {{math|λ}} and {{math|μ}} are related by the complex exponential. See {{harvtxt|Rossmann|2002}} Remark 2c section 1.2.

:\lambda_i - \lambda_j \ne k2\pi i, \quad k = \pm1, \pm2, \ldots, \quad 1 \le i, j \le n = \dim V.

In particular, in the case of matrix Lie groups, since {{math|dexp0}} is invertible, it follows by the inverse function theorem that {{math|exp}} is a bi-analytic bijection in a neighborhood of {{math|0 ∈ 𝔤}} in matrix space. Furthermore, {{math|exp}} is a bi-analytic bijection from a neighborhood of {{math|0 ∈ 𝔤}} in {{math|𝔤}} to a neighborhood of {{math|e ∈ G}}.{{harvnb|Hall|2015}} Corollary 3.44. The same conclusion holds for general Lie groups using the manifold version of the inverse function theorem.

It also follows from the implicit function theorem that {{math|dexpξ}} itself is invertible for {{math|ξ}} sufficiently small.{{harvnb|Sternberg|2004}} Section 1.6.
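
The eigenvalue bookkeeping above can be illustrated numerically. The sketch below (NumPy only; the row-major vectorization is an implementation choice) realizes {{math|adX}} as an {{math|n2 × n2}} matrix, confirms that its spectrum consists of the differences {{math|λi − λj}}, and tests the invertibility criterion:

<syntaxhighlight lang="python">
# Spectrum of ad_X and the invertibility criterion for dexp_X.
import numpy as np

rng = np.random.default_rng(3)
n = 4
X = rng.standard_normal((n, n))
I = np.eye(n)
adX = np.kron(X, I) - np.kron(I, X.T)      # vec(XY - YX) = adX @ vec(Y)

lam = np.linalg.eigvals(X)
diffs = (lam[:, None] - lam[None, :]).ravel()            # all lambda_i - lambda_j
eig = np.linalg.eigvals(adX)
print(max(np.abs(eig - d).min() for d in diffs) < 1e-8)  # True: spectra agree

# dexp_X fails to be invertible iff some lambda_i - lambda_j = 2*pi*i*k,
# k = +-1, +-2, ...
k = diffs.imag / (2 * np.pi)
singular = np.any((np.abs(diffs.real) < 1e-12)
                  & (np.abs(k - np.round(k)) < 1e-12)
                  & (np.round(k) != 0))
print("dexp invertible at this X:", not singular)
</syntaxhighlight>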

===Derivation of a Baker–Campbell–Hausdorff formula===

{{Main|Baker–Campbell–Hausdorff formula}}

If {{math|Z(t)}} is defined such that

:e^{Z(t)} = e^{X} e^{tY},

an expression for {{math|1=Z(1) = log( exp X exp Y )}}, the Baker–Campbell–Hausdorff formula, can be derived from the above formula,

:\exp(-Z(t))\frac{d}{dt}\exp(Z(t)) = \frac{1 - e^{-\mathrm{ad}_{Z}}}{\mathrm{ad}_{Z}}Z'(t).

The left-hand side is easily seen to equal {{mvar|Y}}, since {{math|1={{sfrac|d|dt}}eZ(t) = eXetYY = eZ(t)Y}}. Thus,

:Y = \frac{1 - e^{-\mathrm{ad}_{Z}}}{\mathrm{ad}_{Z}}Z'(t),

and hence, formally,{{harvnb|Hall|2015}} Section 5.5; {{harvnb|Sternberg|2004}} Section 1.2.

:

Z'(t) = \frac{\mathrm{ad}_{Z}}{1 - e^{-\mathrm{ad}_{Z}}} Y \equiv \psi\left(e^{\mathrm{ad}_{Z}}\right)Y, \quad

\psi(w) = \frac{w\log w}{w - 1} = 1 + \sum_{m=1}^\infty \frac{(-1)^{m + 1}}{m(m + 1)}(w - 1)^m, \quad \|w - 1\| < 1.

However, using the relationship between {{math|Ad}} and {{math|ad}} given by {{EquationNote|(4)}}, it is straightforward to further see that

: e^{\mathrm{ad}_{Z}} = e^{\mathrm{ad}_{X}} e^{t\mathrm{ad}_{Y}}

and hence

:Z'(t) = \psi\left(e^{\mathrm{ad}_{X}} e^{t\mathrm{ad}_{Y}}\right)Y.

Integrating in {{mvar|t}} from {{math|0}} to {{math|1}} and using {{math|1=Z(0) = X}} yields

:Z(1) = \log(\exp X\exp Y) = X + \left( \int^1_0 \psi \left(e^{\mathrm{ad}_X} ~ e^{t \,\mathrm{ad}_Y}\right) \, dt \right) \, Y,

an integral formula for {{math|Z(1)}} that is more tractable in practice than the explicit Dynkin's series formula due to the simplicity of the series expansion of {{math|ψ}}. Note this expression consists of {{math|X+Y}} and nested commutators thereof with {{mvar|X}} or {{mvar|Y}}. A textbook proof along these lines can be found in {{harvtxt|Hall|2015}} and {{harvtxt|Miller|1972}}.
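
The integral formula lends itself to direct numerical evaluation. In the sketch below (SciPy's <code>expm</code> and <code>logm</code>; <code>ad_matrix</code> and <code>psi</code> are ad hoc helpers), {{math|adX}} and {{math|adY}} are realized as {{math|n2 × n2}} matrices, {{math|ψ}} is summed as the series above, and {{math|X}}, {{math|Y}} are taken small enough that the series converge:

<syntaxhighlight lang="python">
# Numerical check of Z(1) = X + (integral_0^1 psi(e^{ad_X} e^{t ad_Y}) dt) Y.
import numpy as np
from scipy.linalg import expm, logm

def ad_matrix(X):
    I = np.eye(X.shape[0])
    return np.kron(X, I) - np.kron(I, X.T)   # row-major vectorization

def psi(W, terms=60):
    """psi(W) = 1 + sum_m (-1)^(m+1)/(m(m+1)) (W - 1)^m, for ||W - 1|| < 1."""
    I = np.eye(W.shape[0])
    out, P = I.copy(), I.copy()
    for m in range(1, terms):
        P = P @ (W - I)
        out += (-1) ** (m + 1) / (m * (m + 1)) * P
    return out

rng = np.random.default_rng(4)
X = 0.03 * rng.standard_normal((3, 3))       # small, so the series converge
Y = 0.03 * rng.standard_normal((3, 3))
eadX, adY = expm(ad_matrix(X)), ad_matrix(Y)

n = 500
acc = np.zeros_like(eadX)
for i in range(n):                           # midpoint rule for the t-integral
    t = (i + 0.5) / n
    acc += psi(eadX @ expm(t * adY)) / n

Z = X + (acc @ Y.ravel()).reshape(Y.shape)
print(np.allclose(Z, logm(expm(X) @ expm(Y)), atol=1e-8))   # True
</syntaxhighlight>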

===Derivation of Dynkin's series formula===

[[Image:Eugene Dynkin 2003.jpg|thumb|Eugene Dynkin at home in 2003. In 1947 Dynkin proved the explicit BCH series formula.{{harvnb|Dynkin|1947}} Poincaré, Baker, Campbell and Hausdorff were mostly concerned with the existence of a bracket series, which suffices in many applications, for instance, in proving central results in the Lie correspondence.{{harvnb|Rossmann|2002}} Chapter 2.{{harvnb|Hall|2015}} Chapter 5. Photo courtesy of the Dynkin Collection.]]

Dynkin's formula mentioned may also be derived analogously, starting from the parametric extension

:e^{Z(t)} = e^{tX} e^{tY},

whence

:e^{-Z(t)} \frac{de^{Z(t)}}{dt} = e^{-t \, \mathrm{ad}_{Y}}X + Y ~,

so that, using the above general formula,

:Z' = \frac{\mathrm{ad}_{Z}}{1 - e^{-\mathrm{ad}_{Z}}} ~ \left(e^{-t \, \mathrm{ad}_{Y}}X + Y\right) = \frac{\mathrm{ad}_{Z}}{e^{\mathrm{ad}_{Z}} - 1} ~ \left(X + e^{t \, \mathrm{ad}_{X}}Y\right) .

Since, however,

:\begin{align}

\mathrm{ad_Z}

&= \log\left(\exp\left(\mathrm{ad}_Z\right)\right) = \log\left(1 + \left(\exp\left(\mathrm{ad}_Z\right) - 1\right)\right) \\

&= \sum\limits^{\infty}_{n=1} \frac{(-1)^{n+1}}{n} (\exp(\mathrm{ad}_Z) - 1)^n ~, \quad \|\mathrm{ad}_Z\| < \log 2 ~~,

\end{align}

the last step by virtue of the Mercator series expansion, it follows that

{{NumBlk|:|Z' = \sum\limits^{\infty}_{n=1} \frac{(-1)^{n-1}}{n} \left(e^{\mathrm{ad}_Z} - 1\right)^{n-1} ~ \left(X + e^{t \, \mathrm{ad}_{X}}Y\right)~,|{{EquationRef|5}}}}

and, thus, integrating,

:Z(1) = \int^1 _0 dt ~\frac{dZ(t)}{dt} = \sum^{\infty}_{n=1} \frac{(-1)^{n-1}}{n} \int^1 _0 dt ~\left(e^{t \, \mathrm{ad}_{X}} e^{t\mathrm{ad}_{Y}} - 1\right)^{n-1} ~ \left(X + e^{t \, \mathrm{ad}_{X}}Y\right) .

It is at this point evident that the qualitative statement of the BCH formula holds, namely {{math|Z}} lies in the Lie algebra generated by {{math|X, Y}} and is expressible as a series in repeated brackets {{EquationRef|(A)}}. For each {{mvar|k}}, terms for each partition thereof are organized inside the integral {{math|∫dt tk−1}}. The resulting Dynkin's formula is then

{{Equation box 1

|indent =

|equation = Z = \sum_{k = 1}^\infty \frac{(-1)^{k-1}}{k} \sum_{s \in S_{k}} \frac{1}{i_1 + j_1 + \cdots + i_k + j_k}\frac{[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}]}{i_1!j_1!\cdots i_k!j_k!}, \quad i_r,j_r \ge 0,\quad i_r + j_r > 0,\quad 1 \le r \le k.

|cellpadding= 6

|border

|border colour = #0073CF

|bgcolor=#F9FFF7

}}

For a similar proof with detailed series expansions, see {{harvtxt|Rossmann|2002}}.
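
A brute-force truncation of the formula can be compared against {{math|log(eXeY)}} for small matrices. The sketch below (NumPy/SciPy; <code>dynkin</code> and <code>nested_bracket</code> are ad hoc names) enumerates the sequences {{math|s ∈ Sk}} up to a fixed total degree and builds each bracket as an iterated commutator seeded at its last letter:

<syntaxhighlight lang="python">
# Dynkin's formula, truncated at total degree max_deg.
import numpy as np
from itertools import product
from math import factorial
from scipy.linalg import expm, logm

def nested_bracket(word):
    """[w1, [w2, [..., [w_{n-1}, w_n]...]]] for a list of matrices."""
    out = word[-1]
    for W in reversed(word[:-1]):
        out = W @ out - out @ W
    return out

def dynkin(X, Y, max_deg=4):
    pairs = [(i, j) for i in range(max_deg + 1) for j in range(max_deg + 1)
             if 0 < i + j <= max_deg]
    Z = np.zeros_like(X)
    for k in range(1, max_deg + 1):
        for s in product(pairs, repeat=k):
            deg = sum(i + j for i, j in s)
            if deg > max_deg:
                continue
            word = [M for i, j in s for M in [X] * i + [Y] * j]
            denom = 1
            for i, j in s:
                denom *= factorial(i) * factorial(j)
            Z += (-1) ** (k - 1) / (k * deg * denom) * nested_bracket(word)
    return Z

rng = np.random.default_rng(5)
X = 0.005 * rng.standard_normal((3, 3))      # small, so degree 4 suffices
Y = 0.005 * rng.standard_normal((3, 3))
print(np.allclose(dynkin(X, Y), logm(expm(X) @ expm(Y)), atol=1e-6))  # True
</syntaxhighlight>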

===Combinatoric details===

Change the summation index in ({{EquationNote|5}}) to {{math|1=k = n − 1}} and expand

{{NumBlk|:|\frac{dZ}{dt} = \sum_{k = 0}^\infty \frac{(-1)^{k}}{k + 1}\left\{\left(e^{\mathrm{ad}_{tX}}e^{\mathrm{ad}_{tY}} - 1\right)^kX + \left(e^{\mathrm{ad}_{tX}}e^{\mathrm{ad}_{tY}} - 1\right)^ke^{\mathrm{ad}_{tX}}Y\right\}|{{EquationRef|97}}}}

in a power series. To handle the series expansions simply, consider first {{math|1=Z = log(eXeY)}}. The {{math|log}}-series and the {{math|exp}}-series are given by

:\log(A) = \sum_{k = 1}^\infty \frac{(-1)^{k + 1}}{k}{(A - I)}^k,\quad \text{and}\quad e^X = \sum_{k = 0}^\infty \frac{X^k}{k!}

respectively. Combining these one obtains

{{NumBlk|:|\log\left(e^X e^Y\right) = \sum_{k = 1}^\infty \frac{(-1)^{k + 1}}{k}{\left(e^Xe^Y - I\right)}^k =

\sum_{k = 1}^\infty \frac{(-1)^{k + 1}}{k}\left({\sum_{i = 0}^\infty \frac{X^i}{i!}\sum_{j = 0}^\infty \frac{Y^j}{j!} - I}\right)^k =

\sum_{k = 1}^\infty \frac{(-1)^{k + 1}}{k} \left(\sum_{i,j \ge 0, i + j > 0}^\infty \frac{X^iY^j}{i!j!} \right)^k.|{{EquationRef|98}}}}

This becomes

{{Equation box 1

|indent =

|equation =

Z = \log\left(e^X e^Y\right) = \sum_{k = 1}^\infty \frac{(-1)^{k + 1}}{k} \sum_{s \in S_k} \frac{X^{i_1}Y^{j_1}\cdots X^{i_k}Y^{j_k}}{i_1!j_1!\cdots i_k!j_k!}, \quad i_r, j_r \ge 0, \quad i_r + j_r > 0, \quad 1 \le r \le k,        {{EquationRef|(99)}}

|cellpadding= 6

|border

|border colour = #0073CF

|bgcolor=#F9FFF7

}}

where {{math|Sk}} is the set of all sequences {{math|1=s = (i1, j1, ..., ik, jk)}} of length {{math|2k}} subject to the conditions in {{EquationNote|(99)}}.
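
The expansion ({{EquationNote|98}}) itself is easy to test numerically for small {{math|X}} and {{math|Y}}, summing powers of {{math|eXeY − I}} directly; a NumPy/SciPy sketch:

<syntaxhighlight lang="python">
# Direct check of (98): log(e^X e^Y) = sum_k (-1)^(k+1)/k (e^X e^Y - I)^k.
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(6)
X = 0.05 * rng.standard_normal((3, 3))
Y = 0.05 * rng.standard_normal((3, 3))

A = expm(X) @ expm(Y) - np.eye(3)          # the quantity raised to powers
Z = np.zeros((3, 3))
P = np.eye(3)
for k in range(1, 40):
    P = P @ A                              # (e^X e^Y - I)^k
    Z += (-1) ** (k + 1) / k * P
print(np.allclose(Z, logm(expm(X) @ expm(Y))))   # True
</syntaxhighlight>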

Now substitute {{math|(eadtXeadtY − 1)}} for {{math|(eXeY − 1)}} in the expansion ({{EquationNote|98}}). Equation {{EquationNote|(99)}} then gives

:\begin{align}

\frac{dZ}{dt} = \sum_{k = 0}^\infty \frac{(-1)^{k}}{k + 1} \sum_{s \in S_k, i_{k+1} \ge 0} &t^{i_1 + j_1 + \cdots + i_k + j_k}\frac{{\mathrm{ad}_{X}}^{i_1}{\mathrm{ad}_{Y}}^{j_1}\cdots {\mathrm{ad}_{X}}^{i_k}{\mathrm{ad}_{Y}}^{j_k}}{i_1!j_1!\cdots i_k!j_k!}X \\

{}+{} &t^{i_1 + j_1 + \cdots + i_k + j_k + i_{k + 1}}\frac{{\mathrm{ad}_{X}}^{i_1}{\mathrm{ad}_{Y}}^{j_1}\cdots {\mathrm{ad}_{X}}^{i_k}{\mathrm{ad}_{Y}}^{j_k}{\mathrm{ad}_{X}}^{i_{k+1}}}{i_1!j_1!\cdots i_k!j_k!i_{k+1}!}Y, \quad i_r,j_r \ge 0, \quad i_r + j_r > 0,\quad 1 \le r \le k,

\end{align}

or, with a switch of notation, see An explicit Baker–Campbell–Hausdorff formula,

:\begin{align}

\frac{dZ}{dt} = \sum_{k = 0}^\infty \frac{(-1)^{k}}{k + 1} \sum_{s \in S_k, i_{k+1} \ge 0}

&t^{i_1 + j_1 + \cdots + i_k + j_k}\frac{\left[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}X\right]}{i_1!j_1!\cdots i_k!j_k!}\\

{}+{} &t^{i_1 + j_1 + \cdots + i_k + j_k + i_{k+1}}\frac{\left[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}X^{(i_{k+1})}Y\right]}{i_1!j_1!\cdots i_k!j_k!i_{k+1}!}, \quad i_r,j_r \ge 0, \quad i_r + j_r > 0, \quad 1 \le r \le k

\end{align}.

Note that the summation index for the rightmost {{math|eadtX}} in the second term in ({{EquationNote|97}}) is denoted {{math|ik + 1}}, but it is not an element of a sequence {{math|s ∈ Sk}}. Now integrate {{math|1=Z = Z(1) = ∫{{sfrac|dZ|dt}}dt}}, using {{math|1=Z(0) = 0}},

:\begin{align}

Z = \sum_{k = 0}^\infty \frac{(-1)^{k}}{k + 1} \sum_{s \in S_k, i_{k+1} \ge 0} &\frac{1}{i_1 + j_1 + \cdots + i_k + j_k + 1}\frac{\left[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}X\right]}{i_1!j_1!\cdots i_k!j_k!}\\

{}+{} &\frac{1}{i_1 + j_1 + \cdots + i_k + j_k + i_{k+1} + 1}\frac{\left[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}X^{(i_{k+1})}Y\right]}{i_1!j_1!\cdots i_k!j_k!i_{k+1}!}, \quad i_r,j_r \ge 0,\quad i_r + j_r > 0,\quad 1 \le r \le k

\end{align}.

Write this as

:\begin{align}

Z = \sum_{k = 0}^\infty \frac{(-1)^{k}}{k + 1} \sum_{s \in S_k, i_{k+1} \ge 0}

&\frac{1}{i_1 + j_1 + \cdots + i_k + j_k + (i_{k + 1} = 1) + (j_{k+1} = 0)}\frac{\left[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}X^{(i_{k + 1} = 1)}Y^{(j_{k+1} = 0)}\right]}{i_1!j_1!\cdots i_k!j_k!(i_{k+1} = 1)!(j_{k+1} = 0)!}\\

{}+{} &\frac{1}{i_1 + j_1 + \cdots + i_k + j_k + i_{k+1} + (j_{k+1} = 1)}\frac{\left[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}X^{(i_{k+1})}Y^{(j_{k+1} = 1)}\right]}{i_1!j_1!\cdots i_k!j_k!i_{k+1}!(j_{k+1} = 1)!}, \\\\

& (i_r,j_r \ge 0,\quad i_r + j_r > 0,\quad 1 \le r \le k).

\end{align}

This amounts to

{{NumBlk|:|Z = \sum_{k = 0}^\infty \frac{(-1)^{k} }{k + 1} \sum_{s \in S_{k+1} } \frac{1}{i_1 + j_1 + \cdots + i_k + j_k + i_{k + 1} + j_{k+1} }\frac{\left[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}X^{(i_{k + 1})}Y^{(j_{k+1})}\right]}{i_1!j_1!\cdots i_k!j_k!i_{k+1}!j_{k+1}!}, |{{EquationRef|100}}}}

where i_r,j_r \ge 0,\quad i_r + j_r > 0,\quad 1 \le r \le k + 1, using the simple observation that {{math|1=[T, T] = 0}} for all {{math|T}}. That is, in ({{EquationNote|100}}), a term vanishes unless {{math|jk + 1}} equals {{math|0}} or {{math|1}}, corresponding to the first and second terms in the equation before it. In case {{math|1=jk + 1 = 0}}, {{math|ik + 1}} must equal {{math|1}}, else the term vanishes for the same reason ({{math|1=ik + 1 = 0}} is not allowed). Finally, shift the index, {{math|k → k − 1}},

{{Equation box 1

|indent =

|equation =Z = \log e^Xe^Y = \sum_{k = 1}^\infty \frac{(-1)^{k-1}}{k} \sum_{s \in S_{k}} \frac{1}{i_1 + j_1 + \cdots + i_k + j_k}\frac{\left[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}\right]}{i_1!j_1!\cdots i_k!j_k!},~ i_r,j_r \ge 0,~ i_r + j_r > 0,~ 1 \le r \le k.

|cellpadding= 6

|border

|border colour = #0073CF

|bgcolor=#F9FFF7

}}

This is Dynkin's formula. The striking similarity with ({{EquationNote|99}}) is not accidental: it reflects the Dynkin–Specht–Wever map, underpinning the original, different derivation of the formula. Namely, if

:X^{i_1}Y^{j_1} \cdots X^{i_k}Y^{j_k}

is expressible as a bracket series, then necessarily{{harvnb|Sternberg|2004}} Chapter 1.12.2.

{{NumBlk|:|X^{i_1}Y^{j_1} \cdots X^{i_k}Y^{j_k} = \frac{\left[X^{(i_1)}Y^{(j_1)}\cdots X^{(i_k)}Y^{(j_k)}\right]}{i_1 + j_1 + \cdots + i_k + j_k}.|{{EquationRef|B}}}}

Putting observation {{EquationNote|(A)}} and theorem ({{EquationNote|B}}) together yields a concise proof of the explicit BCH formula.


==Remarks==

{{reflist|group=nb}}

==Notes==

{{reflist}}

==References==

  • {{citation|last=Dynkin|first=Eugene Borisovich|author-link=Eugene Dynkin|year=1947|language=Russian|title=Вычисление коэффициентов в формуле Campbell–Hausdorff|trans-title=Calculation of the coefficients in the Campbell–Hausdorff formula|journal=Doklady Akademii Nauk SSSR|volume=57|pages=323–326}}; translation available from [https://books.google.com/books?id=D9ZF5O_JH2gC&dq=Dynkin+Yushkevich++Campbell&pg=PA31 Google Books].
  • {{citation|last=Hall|first=Brian C.|title=Lie groups, Lie algebras, and representations: An elementary introduction|edition=2nd|series=Graduate Texts in Mathematics|volume=222|publisher=Springer|year=2015|isbn=978-3319134666}}
  • {{citation|last=Miller|first=Willard|title=Symmetry Groups and their Applications|publisher=Academic Press|year=1972|isbn=0-12-497460-0}}
  • {{citation|last=Poincaré|first=H.|author-link=Henri Poincaré|year=1899|title=Sur les groupes continus|journal=Cambridge Philos. Trans.|volume=18|pages=220–55}}
  • {{citation|last=Rossmann|first=Wulf|title=Lie Groups – An Introduction Through Linear Groups|series=Oxford Graduate Texts in Mathematics|publisher=Oxford Science Publications|year=2002|isbn=0-19-859683-9}}
  • {{citation|last=Schmid|first=Wilfried|year=1982|title=Poincaré and Lie groups|journal=Bull. Amer. Math. Soc.|volume=6|issue=2|pages=175–186}}
  • {{citation|last=Schur|first=F.|author-link=Friedrich Schur|year=1891|title=Zur Theorie der endlichen Transformationsgruppen|journal=Abh. Math. Sem. Univ. Hamburg|volume=4|pages=15–32}}
  • {{citation|last=Sternberg|first=Shlomo|author-link=Shlomo Sternberg|year=2004|title=Lie Algebras|publisher=Harvard University|type=lecture notes}}
  • {{cite journal|last=Suzuki|first=Masuo|year=1985|title=Decomposition formulas of exponential operators and Lie exponentials with some applications to quantum mechanics and statistical physics|journal=Journal of Mathematical Physics|volume=26|issue=4|pages=601–612|doi=10.1063/1.526596|bibcode=1985JMP....26..601S}}
  • {{citation|last=Tuynman|first=G. M.|year=1995|title=The derivation of the exponential map of matrices|journal=Amer. Math. Monthly|volume=102|issue=9|pages=818–819|doi=10.2307/2974511|jstor=2974511}}
  • Veltman, M.; 't Hooft, G. & de Wit, B. (2007), "Lie Groups in Physics", [http://www.staff.science.uu.nl/~hooft101/lectures/lieg07.pdf online lectures].
  • {{cite journal|last=Wilcox|first=R. M.|year=1967|title=Exponential Operators and Parameter Differentiation in Quantum Physics|journal=Journal of Mathematical Physics|volume=8|issue=4|pages=962–982|doi=10.1063/1.1705306|bibcode=1967JMP.....8..962W}}