Directional derivative

From Infogalactic: the planetary knowledge core

In mathematics, the directional derivative of a multivariate differentiable function along a given vector v at a given point x intuitively represents the instantaneous rate of change of the function, moving through x with a velocity specified by v. It therefore generalizes the notion of a partial derivative, in which the rate of change is taken along one of the coordinate curves, all other coordinates being constant.

The directional derivative is a special case of the Gâteaux derivative.

Definition

Figure: A contour plot of f(x, y)=x^2 + y^2, showing the gradient vector in blue, and the unit vector \bold{u} scaled by the directional derivative in the direction of \bold{u} in orange. The gradient vector is longer because the gradient points in the direction of greatest rate of increase of a function.

The directional derivative of a scalar function

f(\bold{x}) = f(x_1, x_2, \ldots, x_n)

along a vector

\bold{v} = (v_1, \ldots, v_n)

is the function defined by the limit[1]

\nabla_{\bold{v}}{f}(\bold{x}) = \lim_{h \rightarrow 0}{\frac{f(\bold{x} + h\bold{v}) - f(\bold{x})}{h}}.

In the context of a function on a Euclidean space, some texts restrict the vector v to being a unit vector. Without the restriction, this definition is valid in a broad range of contexts, for example where the norm of a vector (and hence a unit vector) is undefined.[2]

If the function f is differentiable at x, then the directional derivative exists along any vector v, and one has

\nabla_{\bold{v}}{f}(\bold{x}) = \nabla f(\bold{x}) \cdot \bold{v}

where the \nabla on the right denotes the gradient and \cdot is the dot product.[3] Intuitively, the directional derivative of f at a point x represents the rate of change of f with respect to time when moving past x at velocity v.
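As a minimal numerical sketch of the two formulas above, the following compares a symmetric difference quotient (which approximates the same limit) with the gradient dot product. The function f(x, y) = x^2 + y^2, the point (1, 2), the vector (3, -1) and the step size h are illustrative assumptions, not part of the original text.

```python
def f(x, y):
    # Illustrative function: f(x, y) = x^2 + y^2
    return x**2 + y**2

def grad_f(x, y):
    # Gradient of f computed by hand
    return (2 * x, 2 * y)

def directional_derivative(func, point, v, h=1e-6):
    """Symmetric difference quotient approximating the limit definition."""
    x, y = point
    vx, vy = v
    forward = func(x + h * vx, y + h * vy)
    backward = func(x - h * vx, y - h * vy)
    return (forward - backward) / (2 * h)

point = (1.0, 2.0)
v = (3.0, -1.0)

numeric = directional_derivative(f, point, v)
gx, gy = grad_f(*point)
exact = gx * v[0] + gy * v[1]  # gradient dotted with v

print(numeric, exact)
```

Both numbers should agree up to rounding, since f is differentiable at the chosen point.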

Variation using only direction of vector for Euclidean space

Figure: The angle α between the tangent A and the horizontal will be maximum if the cutting plane contains the direction of the gradient A.

Some authors[4] define the directional derivative to be with respect to an arbitrary nonzero vector v after normalization, thus being independent of its magnitude and depending only on its direction.[5]

This definition gives the rate of increase of f per unit of distance moved in the given direction. In this case, one has

\nabla_{\bold{v}}{f}(\bold{x}) = \lim_{h \rightarrow 0}{\frac{f(\bold{x} + h\bold{v}) - f(\bold{x})}{h|\bold{v}|}},

or in case f is differentiable at x,

\nabla_{\bold{v}}{f}(\bold{x}) = \nabla f(\bold{x}) \cdot \frac{\bold{v}}{|\bold{v}|} .

This definition agrees with the one above only when v is a unit vector.
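A short worked example may make the difference between the two conventions concrete. The function, point and vector below are chosen purely for illustration.

```latex
f(x, y) = x^2 + y^2, \qquad \bold{x} = (1, 2), \qquad \bold{v} = (3, 4), \qquad |\bold{v}| = 5

\nabla f(\bold{x}) = (2, 4)

\text{unnormalized: } \nabla f(\bold{x}) \cdot \bold{v} = 2 \cdot 3 + 4 \cdot 4 = 22

\text{normalized: } \nabla f(\bold{x}) \cdot \frac{\bold{v}}{|\bold{v}|} = \frac{22}{5} = 4.4
```

The two values differ by the factor |v| = 5 and would coincide for a unit vector.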

Notation

The directional derivative can also be denoted by:

\nabla_{\bold{v}}{f}(\bold{x}) \sim \frac{\partial{f(\bold{x})}}{\partial{v}} \sim f'_\mathbf{v}(\bold{x}) \sim D_\bold{v}f(\bold{x}) \sim \partial_\bold{v}f(\bold{x}) \sim \mathbf{v}\cdot{\nabla f(\bold{x})} \sim \bold{v}\cdot \frac{\partial f(\bold{x})}{\partial\bold{x}}

where, in the second form, the scalar v is a parameter along a curve to which the vector \bold{v} is tangent, and the parameterization determines the magnitude of \bold{v}.

Properties

Many of the familiar properties of the ordinary derivative hold for the directional derivative. These include, for any functions f and g defined in a neighborhood of, and differentiable at, p:

  1. sum rule:
    \nabla_{\bold{v}} (f + g) = \nabla_{\bold{v}} f + \nabla_{\bold{v}} g
  2. constant factor rule: For any constant c,
    \nabla_{\bold{v}} (cf) = c\nabla_{\bold{v}} f
  3. product rule (or Leibniz's rule):
    \nabla_{\bold{v}} (fg) = g\nabla_{\bold{v}} f + f\nabla_{\bold{v}} g
  4. chain rule: If g is differentiable at p and h is differentiable at g(p), then
    \nabla_{\bold{v}}(h\circ g)(\bold{p}) = h'(g(\bold{p})) \nabla_{\bold{v}} g (\bold{p})
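The product and chain rules above can be spot-checked numerically. The sketch below uses a symmetric difference quotient; the functions f and g, the outer function exp, the point p and the vector v are hypothetical choices made only for this illustration.

```python
import math

def directional_derivative(func, p, v, h=1e-6):
    """Symmetric difference quotient of func at p along v."""
    plus = func([pi + h * vi for pi, vi in zip(p, v)])
    minus = func([pi - h * vi for pi, vi in zip(p, v)])
    return (plus - minus) / (2 * h)

def f(p):
    # Hypothetical scalar function of two variables
    return p[0] ** 2 + p[1]

def g(p):
    # Second hypothetical scalar function
    return math.sin(p[0]) * p[1]

p = [0.5, 1.5]
v = [2.0, -1.0]

# Product rule: D_v(fg) = g D_v f + f D_v g
product_lhs = directional_derivative(lambda q: f(q) * g(q), p, v)
product_rhs = (g(p) * directional_derivative(f, p, v)
               + f(p) * directional_derivative(g, p, v))

# Chain rule with outer function exp: D_v(exp(g)) = exp(g(p)) D_v g
chain_lhs = directional_derivative(lambda q: math.exp(g(q)), p, v)
chain_rhs = math.exp(g(p)) * directional_derivative(g, p, v)

print(product_lhs, product_rhs)
print(chain_lhs, chain_rhs)
```

Each pair of printed values should agree to within the accuracy of the finite-difference approximation.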

In differential geometry

Let M be a differentiable manifold and p a point of M. Suppose that f is a function defined in a neighborhood of p, and differentiable at p. If v is a tangent vector to M at p, then the directional derivative of f along v, denoted variously as \nabla_{\bold{v}} f(\bold{p}) (see covariant derivative), L_{\bold{v}} f(\bold{p}) (see Lie derivative), or {\bold{v}}_{\bold{p}}(f) (see Tangent space § Definition via derivations), can be defined as follows. Let γ : [−1,1] → M be a differentiable curve with γ(0) = p and γ′(0) = v. Then the directional derivative is defined by

\nabla_{\bold{v}} f(\bold{p}) = \left.\frac{d}{d\tau} f\circ\gamma(\tau)\right|_{\tau=0}

This definition can be proven independent of the choice of γ, provided γ is selected in the prescribed manner so that γ′(0) = v.
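The independence from the choice of curve can be illustrated numerically in the simplest setting, M = R^2. In the sketch below, the function f, the point p, the vector v and the two curves (a straight line and a deliberately bent curve) are assumptions made for illustration; both curves satisfy γ(0) = p and γ′(0) = v.

```python
import math

def f(p):
    # Hypothetical smooth function on the plane
    x, y = p
    return x * y + math.sin(x)

p = (1.0, 2.0)
v = (0.5, -1.5)

def straight(t):
    # Straight line through p with velocity v
    return (p[0] + t * v[0], p[1] + t * v[1])

def bent(t):
    # A different curve with the same position and velocity at t = 0
    return (p[0] + t * v[0] + t ** 2, p[1] + math.sin(t * v[1]))

def derivative_along(curve, h=1e-5):
    """Estimate d/dt f(curve(t)) at t = 0 by a symmetric difference."""
    return (f(curve(h)) - f(curve(-h))) / (2 * h)

d_straight = derivative_along(straight)
d_bent = derivative_along(bent)

# Gradient of f is (y + cos x, x); dotted with v for comparison
exact = (p[1] + math.cos(p[0])) * v[0] + p[0] * v[1]

print(d_straight, d_bent, exact)
```

The three values should coincide up to the discretization error, even though the curves differ away from t = 0.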

The Lie derivative

The Lie derivative of a vector field \scriptstyle W^\mu(x) along a vector field \scriptstyle V^\mu(x) is given by the difference of two directional derivatives (with vanishing torsion):

\mathcal{L}_V W^\mu=(V\cdot\nabla) W^\mu-(W\cdot\nabla) V^\mu

In particular, for a scalar field \scriptstyle \phi(x), the Lie derivative reduces to the standard directional derivative:

\mathcal{L}_V \phi=(V\cdot\nabla) \phi
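As a rough numerical sketch of the vector-field formula above, the following evaluates it in flat R^2 with Cartesian coordinates, where ∇ reduces to ordinary partial derivatives. The fields V = (-y, x) and W = (x, 0) and the evaluation point are hypothetical examples.

```python
def V(p):
    # Hypothetical rotation field V = (-y, x)
    x, y = p
    return (-y, x)

def W(p):
    # Hypothetical field W = (x, 0)
    x, y = p
    return (x, 0.0)

def directional_derivative_of_field(field, p, direction, h=1e-6):
    """Componentwise symmetric difference of a vector field along direction."""
    plus = field((p[0] + h * direction[0], p[1] + h * direction[1]))
    minus = field((p[0] - h * direction[0], p[1] - h * direction[1]))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

def lie_derivative(V, W, p):
    """(V . grad) W - (W . grad) V evaluated at the point p."""
    v_dot_grad_w = directional_derivative_of_field(W, p, V(p))
    w_dot_grad_v = directional_derivative_of_field(V, p, W(p))
    return tuple(a - b for a, b in zip(v_dot_grad_w, w_dot_grad_v))

lie = lie_derivative(V, W, (1.0, 2.0))
print(lie)
```

For these fields a hand calculation gives (-y, -x), so the result at (1, 2) should be close to (-2, -1).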

The Riemann tensor

Directional derivatives are often used in introductory derivations of the Riemann curvature tensor. Consider a curved rectangle with an infinitesimal vector δ along one edge and δ′ along the other. We translate a covector S along δ then δ′ and then subtract the translation along δ′ and then δ. Instead of building the directional derivative using partial derivatives, we use the covariant derivative. The translation operator for δ is thus

1+\sum_\nu \delta^\nu D_\nu=1+\delta\cdot D

and for δ′

1+\sum_\mu \delta'^\mu D_\mu=1+\delta'\cdot D

The difference between the two paths is then

(1+\delta'\cdot D)(1+\delta\cdot D)S_\rho-(1+\delta\cdot D)(1+\delta'\cdot D)S_\rho=\sum_{\mu,\nu}\delta'^\mu \delta^\nu[D_\mu,D_\nu]S_\rho

It can be argued[6] that the noncommutativity of the covariant derivatives measures the curvature of the manifold:

[D_\mu,D_\nu]S_\rho=\pm \sum_\sigma R^\sigma_{\rho\mu\nu}S_\sigma

where R is the Riemann curvature tensor and the sign depends on the author's sign convention.

In group theory

Translations

In the Poincaré algebra, we can define an infinitesimal translation operator P as

\mathbf{P}=i\nabla

(the factor i ensures that P is a self-adjoint operator). For a finite displacement λ, the unitary Hilbert space representation for translations is[7]

U(\boldsymbol{\lambda})=\exp\left(-i\boldsymbol{\lambda}\cdot\mathbf{P}\right)

By using the above definition of the infinitesimal translation operator, we see that the finite translation operator is an exponentiated directional derivative:

U(\boldsymbol{\lambda})=\exp\left(\boldsymbol{\lambda}\cdot\nabla\right)

This is a translation operator in the sense that it acts on multivariable functions f(x) as

U(\boldsymbol{\lambda}) f(\mathbf{x})=\exp\left(\boldsymbol{\lambda}\cdot\nabla\right) f(\mathbf{x})=f(\mathbf{x}+\boldsymbol{\lambda})
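For a polynomial in one variable the exponential series terminates, so the translation property can be checked exactly. The sketch below applies exp(λ d/dx), expanded as a sum of λ^n/n! times the n-th derivative, to a hypothetical polynomial; the coefficient list, λ and x are illustrative assumptions.

```python
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial; coeffs[k] is the coefficient of x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

def translate(coeffs, lam, x):
    """Apply exp(lam d/dx) to the polynomial and evaluate the result at x."""
    total = 0.0
    current = list(coeffs)
    n = 0
    while current:  # the series ends once all derivatives vanish
        total += lam ** n / factorial(n) * evaluate(current, x)
        current = derivative(current)
        n += 1
    return total

coeffs = [1.0, -2.0, 0.0, 3.0]  # hypothetical f(x) = 1 - 2x + 3x^3
lam, x = 1.5, 0.5

shifted = translate(coeffs, lam, x)
direct = evaluate(coeffs, x + lam)
print(shifted, direct)
```

Both values should equal f(2) = 21 for this polynomial, matching f(x + λ).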

Rotations

The rotation operator also contains a directional derivative. The rotation operator for a rotation vector \boldsymbol{\theta}, i.e. a rotation by an angle \theta = |\boldsymbol{\theta}| about an axis parallel to \hat{\theta} = \boldsymbol{\theta}/\theta, is

U(R(\boldsymbol{\theta}))=\exp(-i\boldsymbol{\theta}\cdot\mathbf{L})

Here L is the vector operator that generates SO(3):

\mathbf{L}=\begin{pmatrix}
 0& 0 & 0\\ 
 0& 0 & 1\\ 
 0& -1 & 0
\end{pmatrix}\mathbf{i}+\begin{pmatrix}
0 &0  & -1\\ 
 0& 0 &0 \\ 
1 & 0 & 0
\end{pmatrix}\mathbf{j}+\begin{pmatrix}
 0&1  &0 \\ 
 -1&0  &0 \\ 
0 & 0 & 0
\end{pmatrix}\mathbf{k}

It may be shown geometrically that an infinitesimal right-handed rotation changes the position vector x by

\mathbf{x}\rightarrow \mathbf{x}-\delta\boldsymbol{\theta}\times\mathbf{x}

So we would expect under infinitesimal rotation:

U(R(\delta\boldsymbol{\theta}))f(\mathbf{x})=f(\mathbf{x}-\delta\boldsymbol{\theta}\times\mathbf{x})=f(\mathbf{x})-(\delta\boldsymbol{\theta}\times\mathbf{x})\cdot\nabla f

It follows that

U(R(\delta\boldsymbol{\theta}))=1-(\delta\boldsymbol{\theta}\times\mathbf{x})\cdot\nabla

Following the same exponentiation procedure as above, we arrive at the rotation operator in the position basis, which is an exponentiated directional derivative:[11]

U(R(\boldsymbol{\theta}))=\exp(-(\boldsymbol{\theta}\times\mathbf{x})\cdot\nabla)
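The first-order expansion used in this derivation can be checked numerically. The sketch below compares f(x − δθ × x) with f(x) − (δθ × x) · ∇f for a small rotation about the z axis; the function f, the point x and the size of δθ are hypothetical choices for illustration.

```python
def cross(a, b):
    # Cross product of two 3-vectors
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def f(x):
    # Hypothetical scalar function f = x*y + z^2
    return x[0] * x[1] + x[2] ** 2

def grad_f(x):
    # Gradient of f computed by hand
    return (x[1], x[0], 2 * x[2])

x = (1.0, 2.0, 3.0)
dtheta = (0.0, 0.0, 1e-4)  # small rotation vector along the z axis

shift = cross(dtheta, x)
rotated_value = f(tuple(xi - si for xi, si in zip(x, shift)))
first_order = f(x) - sum(si * gi for si, gi in zip(shift, grad_f(x)))

print(rotated_value, first_order)
```

The two values should differ only at second order in the rotation angle.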

Normal derivative

A normal derivative is a directional derivative taken in the direction normal (that is, orthogonal) to some surface in space, or more generally along a normal vector field orthogonal to some hypersurface. See for example Neumann boundary condition. If the normal direction is denoted by \bold{n}, then the normal derivative of a function f is sometimes denoted as \frac{ \partial f}{\partial n}. In other notations,

\frac{ \partial f}{\partial n} = \nabla f(\bold{x}) \cdot \bold{n} = \nabla_{\bold{n}}{f}(\bold{x}) = \frac{\partial f}{\partial \bold{x}}\cdot\bold{n} = Df(\bold{x})[\bold{n}]
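As a brief worked example (with a function and curve chosen only for illustration), consider the outward unit normal on the unit circle:

```latex
f(x, y) = x^2 + y^2, \qquad \bold{n} = (x, y) \ \text{on} \ x^2 + y^2 = 1

\nabla f(\bold{x}) = (2x, 2y)

\frac{\partial f}{\partial n} = \nabla f(\bold{x}) \cdot \bold{n} = 2x^2 + 2y^2 = 2
```

The normal derivative is constant on the circle here because the level curves of f are themselves circles about the origin.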

In the continuum mechanics of solids

Several important results in continuum mechanics require the derivatives of vectors with respect to vectors and of tensors with respect to vectors and tensors.[12] The directional derivative provides a systematic way of finding these derivatives.

The definitions of directional derivatives for various situations are given below. It is assumed that the functions are sufficiently smooth that derivatives can be taken.

Derivatives of scalar valued functions of vectors

Let f(\mathbf{v}) be a real valued function of the vector \mathbf{v}. Then the derivative of f(\mathbf{v}) with respect to \mathbf{v} (or at \mathbf{v}) in the direction \mathbf{u} is defined as


  \frac{\partial f}{\partial \mathbf{v}}\cdot\mathbf{u} = Df(\mathbf{v})[\mathbf{u}] 
     = \left[\frac{d }{d \alpha}~f(\mathbf{v} + \alpha~\mathbf{u})\right]_{\alpha = 0}

for all vectors \mathbf{u}.

Properties:

  1. If f(\mathbf{v}) = f_1(\mathbf{v}) + f_2(\mathbf{v}) then 
   \frac{\partial f}{\partial \mathbf{v}}\cdot\mathbf{u} =  \left(\frac{\partial f_1}{\partial \mathbf{v}} + \frac{\partial f_2}{\partial \mathbf{v}}\right)\cdot\mathbf{u}
  2. If f(\mathbf{v}) = f_1(\mathbf{v})~ f_2(\mathbf{v}) then 
   \frac{\partial f}{\partial \mathbf{v}}\cdot\mathbf{u} =  \left(\frac{\partial f_1}{\partial \mathbf{v}}\cdot\mathbf{u}\right)~f_2(\mathbf{v}) + f_1(\mathbf{v})~\left(\frac{\partial f_2}{\partial \mathbf{v}}\cdot\mathbf{u} \right)
  3. If f(\mathbf{v}) = f_1(f_2(\mathbf{v})) then 
   \frac{\partial f}{\partial \mathbf{v}}\cdot\mathbf{u} =  \frac{\partial f_1}{\partial f_2}~\frac{\partial f_2}{\partial \mathbf{v}}\cdot\mathbf{u}
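A short worked example of the definition, using the squared length of a vector as an illustrative choice of f:

```latex
f(\mathbf{v}) = \mathbf{v}\cdot\mathbf{v}

\left[\frac{d}{d\alpha}~(\mathbf{v} + \alpha~\mathbf{u})\cdot(\mathbf{v} + \alpha~\mathbf{u})\right]_{\alpha = 0}
   = \left[2~\mathbf{v}\cdot\mathbf{u} + 2\alpha~\mathbf{u}\cdot\mathbf{u}\right]_{\alpha = 0}
   = 2~\mathbf{v}\cdot\mathbf{u}

\frac{\partial f}{\partial \mathbf{v}} = 2~\mathbf{v}
```

The last line follows because the result must hold for all vectors \mathbf{u}.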

Derivatives of vector valued functions of vectors

Let \mathbf{f}(\mathbf{v}) be a vector valued function of the vector \mathbf{v}. Then the derivative of \mathbf{f}(\mathbf{v}) with respect to \mathbf{v} (or at \mathbf{v}) is the second-order tensor whose action on a direction \mathbf{u} is defined as


  \frac{\partial \mathbf{f}}{\partial \mathbf{v}}\cdot\mathbf{u} = D\mathbf{f}(\mathbf{v})[\mathbf{u}] 
     = \left[\frac{d }{d \alpha}~\mathbf{f}(\mathbf{v} + \alpha~\mathbf{u})\right]_{\alpha = 0}

for all vectors \mathbf{u}.

Properties:

  1. If \mathbf{f}(\mathbf{v}) = \mathbf{f}_1(\mathbf{v}) + \mathbf{f}_2(\mathbf{v}) then 
   \frac{\partial \mathbf{f}}{\partial \mathbf{v}}\cdot\mathbf{u} =  \left(\frac{\partial \mathbf{f}_1}{\partial \mathbf{v}} + \frac{\partial \mathbf{f}_2}{\partial \mathbf{v}}\right)\cdot\mathbf{u}
  2. If \mathbf{f}(\mathbf{v}) = \mathbf{f}_1(\mathbf{v})\times\mathbf{f}_2(\mathbf{v}) then 
   \frac{\partial \mathbf{f}}{\partial \mathbf{v}}\cdot\mathbf{u} =  \left(\frac{\partial \mathbf{f}_1}{\partial \mathbf{v}}\cdot\mathbf{u}\right)\times\mathbf{f}_2(\mathbf{v}) + \mathbf{f}_1(\mathbf{v})\times\left(\frac{\partial \mathbf{f}_2}{\partial \mathbf{v}}\cdot\mathbf{u} \right)
  3. If \mathbf{f}(\mathbf{v}) = \mathbf{f}_1(\mathbf{f}_2(\mathbf{v})) then 
   \frac{\partial \mathbf{f}}{\partial \mathbf{v}}\cdot\mathbf{u} =  \frac{\partial \mathbf{f}_1}{\partial \mathbf{f}_2}\cdot\left(\frac{\partial \mathbf{f}_2}{\partial \mathbf{v}}\cdot\mathbf{u} \right)
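As an illustrative worked example, take a vector scaled by its own squared length. The symbols \otimes (tensor product) and \boldsymbol{1} (second-order identity tensor) are notation assumed here and are not defined in the text above.

```latex
\mathbf{f}(\mathbf{v}) = (\mathbf{v}\cdot\mathbf{v})~\mathbf{v}

\left[\frac{d}{d\alpha}~\mathbf{f}(\mathbf{v} + \alpha~\mathbf{u})\right]_{\alpha = 0}
   = 2~(\mathbf{v}\cdot\mathbf{u})~\mathbf{v} + (\mathbf{v}\cdot\mathbf{v})~\mathbf{u}

\frac{\partial \mathbf{f}}{\partial \mathbf{v}}
   = 2~\mathbf{v}\otimes\mathbf{v} + (\mathbf{v}\cdot\mathbf{v})~\boldsymbol{1}
```

Applying the tensor in the last line to \mathbf{u} reproduces the middle line.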

Derivatives of scalar valued functions of second-order tensors

Let f(\boldsymbol{S}) be a real valued function of the second order tensor \boldsymbol{S}. Then the derivative of f(\boldsymbol{S}) with respect to \boldsymbol{S} (or at \boldsymbol{S}) is the second order tensor whose contraction with a direction \boldsymbol{T} is defined as


  \frac{\partial f}{\partial \boldsymbol{S}}:\boldsymbol{T} = Df(\boldsymbol{S})[\boldsymbol{T}] 
     = \left[\frac{d }{d \alpha}~f(\boldsymbol{S} + \alpha\boldsymbol{T})\right]_{\alpha = 0}

for all second order tensors \boldsymbol{T}.

Properties:

  1. If f(\boldsymbol{S}) = f_1(\boldsymbol{S}) + f_2(\boldsymbol{S}) then  \frac{\partial f}{\partial \boldsymbol{S}}:\boldsymbol{T} =  \left(\frac{\partial f_1}{\partial \boldsymbol{S}} + \frac{\partial f_2}{\partial \boldsymbol{S}}\right):\boldsymbol{T}
  2. If f(\boldsymbol{S}) = f_1(\boldsymbol{S})~ f_2(\boldsymbol{S}) then  \frac{\partial f}{\partial \boldsymbol{S}}:\boldsymbol{T} =  \left(\frac{\partial f_1}{\partial \boldsymbol{S}}:\boldsymbol{T}\right)~f_2(\boldsymbol{S}) + f_1(\boldsymbol{S})~\left(\frac{\partial f_2}{\partial \boldsymbol{S}}:\boldsymbol{T} \right)
  3. If f(\boldsymbol{S}) = f_1(f_2(\boldsymbol{S})) then  \frac{\partial f}{\partial \boldsymbol{S}}:\boldsymbol{T} =  \frac{\partial f_1}{\partial f_2}~\left(\frac{\partial f_2}{\partial \boldsymbol{S}}:\boldsymbol{T} \right)
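A worked example may help here. It assumes the common convention that the double contraction is \boldsymbol{A}:\boldsymbol{B} = \sum_{i,j} A_{ij} B_{ij}, which the text above does not state explicitly.

```latex
f(\boldsymbol{S}) = \boldsymbol{S}:\boldsymbol{S}

\left[\frac{d}{d\alpha}~(\boldsymbol{S} + \alpha\boldsymbol{T}):(\boldsymbol{S} + \alpha\boldsymbol{T})\right]_{\alpha = 0}
   = \left[2~\boldsymbol{S}:\boldsymbol{T} + 2\alpha~\boldsymbol{T}:\boldsymbol{T}\right]_{\alpha = 0}
   = 2~\boldsymbol{S}:\boldsymbol{T}

\frac{\partial f}{\partial \boldsymbol{S}} = 2~\boldsymbol{S}
```

The derivative is identified by requiring the result to hold for all second order tensors \boldsymbol{T}.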

Derivatives of tensor valued functions of second-order tensors

Let \boldsymbol{F}(\boldsymbol{S}) be a second order tensor valued function of the second order tensor \boldsymbol{S}. Then the derivative of \boldsymbol{F}(\boldsymbol{S}) with respect to \boldsymbol{S} (or at \boldsymbol{S}) is the fourth order tensor whose contraction with a direction \boldsymbol{T} is defined as


  \frac{\partial \boldsymbol{F}}{\partial \boldsymbol{S}}:\boldsymbol{T} = D\boldsymbol{F}(\boldsymbol{S})[\boldsymbol{T}] 
     = \left[\frac{d }{d \alpha}~\boldsymbol{F}(\boldsymbol{S} + \alpha\boldsymbol{T})\right]_{\alpha = 0}

for all second order tensors \boldsymbol{T}.

Properties:

  1. If \boldsymbol{F}(\boldsymbol{S}) = \boldsymbol{F}_1(\boldsymbol{S}) + \boldsymbol{F}_2(\boldsymbol{S}) then  \frac{\partial \boldsymbol{F}}{\partial \boldsymbol{S}}:\boldsymbol{T} =  \left(\frac{\partial \boldsymbol{F}_1}{\partial \boldsymbol{S}} + \frac{\partial \boldsymbol{F}_2}{\partial \boldsymbol{S}}\right):\boldsymbol{T}
  2. If \boldsymbol{F}(\boldsymbol{S}) = \boldsymbol{F}_1(\boldsymbol{S})\cdot\boldsymbol{F}_2(\boldsymbol{S}) then  \frac{\partial \boldsymbol{F}}{\partial \boldsymbol{S}}:\boldsymbol{T} =  \left(\frac{\partial \boldsymbol{F}_1}{\partial \boldsymbol{S}}:\boldsymbol{T}\right)\cdot\boldsymbol{F}_2(\boldsymbol{S}) + \boldsymbol{F}_1(\boldsymbol{S})\cdot\left(\frac{\partial \boldsymbol{F}_2}{\partial \boldsymbol{S}}:\boldsymbol{T} \right)
  3. If \boldsymbol{F}(\boldsymbol{S}) = \boldsymbol{F}_1(\boldsymbol{F}_2(\boldsymbol{S})) then  \frac{\partial \boldsymbol{F}}{\partial \boldsymbol{S}}:\boldsymbol{T} =  \frac{\partial \boldsymbol{F}_1}{\partial \boldsymbol{F}_2}:\left(\frac{\partial \boldsymbol{F}_2}{\partial \boldsymbol{S}}:\boldsymbol{T} \right)
  4. If f(\boldsymbol{S}) = f_1(\boldsymbol{F}_2(\boldsymbol{S})) then  \frac{\partial f}{\partial \boldsymbol{S}}:\boldsymbol{T} =  \frac{\partial f_1}{\partial \boldsymbol{F}_2}:\left(\frac{\partial \boldsymbol{F}_2}{\partial \boldsymbol{S}}:\boldsymbol{T} \right)

Notes

  1. [Citation text missing from the source.]
  2. The applicability extends to functions over spaces without a metric and to differentiable manifolds, such as in general relativity.
  3. Technically, the gradient ∇f is a covector, and the "dot product" is the action of this covector on the vector v (or equivalently, the duality pairing of the covector and the vector).
  4. Thomas, George B. Jr.; and Finney, Ross L. (1979) Calculus and Analytic Geometry, Addison-Wesley Publ. Co., fifth edition, p. 593.
  5. This typically assumes a Euclidean space – for example, a function of several variables typically has no definition of the magnitude of a vector, and hence of a unit vector.
  6. [Citation text missing from the source.]
  7. [Citation text missing from the source.]
  8. [Citation text missing from the source.]
  9. [Citation text missing from the source.]
  10. [Citation text missing from the source.]
  11. [Citation text missing from the source.]
  12. J. E. Marsden and T. J. R. Hughes, 2000, Mathematical Foundations of Elasticity, Dover.
