Rectifier (neural networks)

Figure: Plot of the rectifier (blue) and softplus (green) functions near x = 0.

In the context of artificial neural networks, the rectifier is an activation function defined as

f(x) = \max(0, x),

where x is the input to a neuron. This is also known as a ramp function and is analogous to half-wave rectification in electrical engineering. This activation function has been argued to be more biologically plausible[1] than the widely used logistic sigmoid (which is inspired by probability theory; see logistic regression) and its more practical[2] counterpart, the hyperbolic tangent. The rectifier is, as of 2015, the most popular activation function for deep neural networks.[3]

A unit employing the rectifier is also called a rectified linear unit (ReLU).[4]
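A minimal NumPy sketch of the rectifier (the function name and sample values are illustrative, not part of the original definition):

  import numpy as np

  def relu(x):
      # Rectifier: element-wise max(0, x)
      return np.maximum(0.0, x)

  print(relu(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))  # [0.  0.  0.  0.5 2. ]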

A smooth approximation to the rectifier is the analytic function

f(x) = \ln(1 + e^x),

which is called the softplus function.[5] The derivative of softplus is f'(x) = e^x / (e^x + 1) = 1 / (1 + e^{-x}), i.e. the logistic function.
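As an illustration of this relationship, the following sketch (helper names are assumptions of this example) evaluates softplus in a numerically stable form and checks that its finite-difference derivative matches the logistic function:

  import numpy as np

  def softplus(x):
      # ln(1 + e^x), computed stably as log(e^0 + e^x)
      return np.logaddexp(0.0, x)

  def logistic(x):
      return 1.0 / (1.0 + np.exp(-x))

  x = np.linspace(-5.0, 5.0, 11)
  h = 1e-5
  numeric = (softplus(x + h) - softplus(x - h)) / (2.0 * h)  # central difference
  print(np.allclose(numeric, logistic(x), atol=1e-6))        # True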

Rectified linear units find applications in computer vision[1] and speech recognition[6][7] using deep neural nets.

Variants

Noisy ReLUs

Rectified linear units can be extended to include Gaussian noise, making them noisy ReLUs, giving[4]

f(x) = \max(0, x + Y), with Y \sim \mathcal{N}(0, \sigma(x))

Noisy ReLUs have been used with some success in restricted Boltzmann machines for computer vision tasks.[4]
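A possible sketch of a noisy ReLU; the article leaves σ(x) unspecified, so the constant variance schedule used below is purely an assumption of this example, and σ(x) is treated as the variance of the noise:

  import numpy as np

  rng = np.random.default_rng(0)

  def noisy_relu(x, sigma):
      # max(0, x + Y) with Y ~ N(0, sigma(x)); sigma(x) treated as a variance
      noise = rng.normal(0.0, np.sqrt(sigma(x)))
      return np.maximum(0.0, x + noise)

  x = np.array([-1.0, 0.0, 1.0, 2.0])
  print(noisy_relu(x, lambda v: np.full_like(v, 0.01)))  # x plus small noise, clipped at 0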

Leaky ReLUs

Leaky ReLUs allow a small, non-zero gradient when the unit is not active.[7]

f(x)  = \begin{cases}
    x & \mbox{if } x > 0 \\
    0.01x & \mbox{otherwise}
\end{cases}
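A brief NumPy sketch of this piecewise definition, using the 0.01 slope from the formula above:

  import numpy as np

  def leaky_relu(x, slope=0.01):
      # x for positive inputs, a small linear "leak" otherwise
      return np.where(x > 0, x, slope * x)

  print(leaky_relu(np.array([-3.0, -0.1, 0.0, 0.1, 3.0])))  # [-0.03 -0.001 0. 0.1 3.]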

Parametric ReLUs take this idea further by making the coefficient of leakage into a parameter that is learned along with the other neural network parameters.[8]

f(x)  = \begin{cases}
    x & \mbox{if } x > 0 \\
    a x & \mbox{otherwise}
\end{cases}

Note that for a \le 1, this is equivalent to

f(x)  = \max(x, ax)

and thus has a relation to "maxout" networks.[8]
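As a sketch (the coefficient value below is arbitrary; in a real network it would be learned), the following code implements the parametric ReLU and numerically confirms the max(x, ax) identity for a ≤ 1:

  import numpy as np

  def prelu(x, a):
      # identity for x > 0, slope a otherwise; a is normally a learned parameter
      return np.where(x > 0, x, a * x)

  a = 0.25                        # illustrative coefficient, a <= 1
  x = np.linspace(-3.0, 3.0, 13)
  print(np.allclose(prelu(x, a), np.maximum(x, a * x)))  # True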

Advantages

  • Biological plausibility: One-sided, compared to the antisymmetry of tanh.
  • Sparse activation: For example, in a randomly initialized network only about 50% of hidden units are activated (have a non-zero output); see the sketch after this list.
  • Efficient gradient propagation: Fewer vanishing-gradient problems than sigmoidal activation functions, which saturate in both directions.
  • Efficient computation: Only comparison, addition and multiplication.
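The sparse-activation point can be checked empirically. In the sketch below, the layer sizes and the zero-mean Gaussian weight initialization are assumptions made for illustration; with any symmetric zero-mean initialization, roughly half of the pre-activations are positive:

  import numpy as np

  rng = np.random.default_rng(0)

  n_in, n_hidden, n_samples = 100, 200, 1000
  W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_hidden))  # random init
  x = rng.normal(size=(n_samples, n_in))

  h = np.maximum(0.0, x @ W)   # ReLU activations of one hidden layer
  print((h > 0).mean())        # close to 0.5: about half the units are active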

In 2011,[1] the use of the rectifier as a non-linearity was shown for the first time to enable training deep supervised neural networks without unsupervised pre-training. Compared to the sigmoid and similar activation functions, rectified linear units allow faster and more effective training of deep neural architectures on large and complex datasets.

Potential problems

  • Non-differentiable at zero: it is, however, differentiable everywhere else, including at points arbitrarily close to (but not equal to) zero. In practice, implementations simply fix a subgradient value for the derivative at zero, as in the sketch below.
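A minimal sketch of the resulting gradient rule, assuming the common convention of taking the derivative at zero to be 0 (any value in [0, 1] is a valid subgradient there):

  import numpy as np

  def relu_grad(x):
      # 1 for x > 0, 0 for x < 0; the value at exactly x = 0 is a convention
      return (x > 0).astype(float)

  print(relu_grad(np.array([-1.0, 0.0, 2.0])))  # [0. 0. 1.]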

See also

References

  1. Xavier Glorot, Antoine Bordes, Yoshua Bengio (2011). Deep Sparse Rectifier Neural Networks. AISTATS 2011.
  2. Yann LeCun, Léon Bottou, Genevieve B. Orr, Klaus-Robert Müller (1998). Efficient BackProp. In: Neural Networks: Tricks of the Trade.
  3. Yann LeCun, Yoshua Bengio, Geoffrey Hinton (2015). Deep Learning. Nature 521, 436–444.
  4. Vinod Nair, Geoffrey E. Hinton (2010). Rectified Linear Units Improve Restricted Boltzmann Machines. ICML 2010.
  5. C. Dugas, Y. Bengio, F. Bélisle, C. Nadeau, R. Garcia (2001). Incorporating Second-Order Functional Knowledge for Better Option Pricing. NIPS 2000.
  6. László Tóth (2013). Phone Recognition with Deep Sparse Rectifier Neural Networks. ICASSP 2013.
  7. Andrew L. Maas, Awni Y. Hannun, Andrew Y. Ng (2013). Rectifier Nonlinearities Improve Neural Network Acoustic Models. Proc. ICML.
  8. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun (2015). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.