Self-supervised learning

Self-supervised learning (SSL) is a machine learning paradigm, and a corresponding family of methods, for processing unlabeled data to obtain useful representations that can help with downstream learning tasks. The most salient feature of SSL methods is that they do not need human-annotated labels: they are designed to take in datasets consisting entirely of unlabeled samples. A typical SSL pipeline generates supervisory signals (automatically generated labels) in a first stage and then uses them for a supervised learning task in the second and later stages. For this reason, SSL can be described as an intermediate form between unsupervised and supervised learning.

The typical SSL method is based on an artificial neural network or another model such as a decision list.[1] The model learns in two steps. First, it is trained on an auxiliary or pretext classification task using pseudo-labels, which helps to initialize the model parameters.[2][3] Second, the actual task is performed with supervised or unsupervised learning.[4][5][6] Other auxiliary tasks involve pattern completion from masked input patterns (for example, silent pauses in speech or image portions masked in black).
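The pattern-completion pretext task can be illustrated with a minimal sketch of how pseudo-labels might be produced from unlabeled text. The names `MASK`, `make_masked_examples`, and `mask_prob` are hypothetical and chosen for illustration only; real systems use their own tokenization and masking schemes.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token (an assumption, not from the source)

def make_masked_examples(tokens, mask_prob=0.3, seed=0):
    """Turn one unlabeled token sequence into (masked_input, targets).

    `targets` maps each masked position to the original token, so the
    pseudo-labels come from the data itself rather than from an annotator.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK
            targets[i] = tok
    return masked, targets

sentence = "the bird sat on the wire".split()
masked, targets = make_masked_examples(sentence)
```

A model trained to predict `targets` from `masked` receives a supervised signal even though no human labeled the sentence.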

Self-supervised learning was referred to as "self-labeling" in 2013. Self-labeling generates labels from the values of the input variables, for example to allow supervised learning methods to be applied to unlabeled time series.[citation needed]
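One way to picture self-labeling on a time series is a sliding window in which each label is the value that follows the window. This is only a sketch under that assumption: the source does not say which labeling rule was used, and `self_label_series` and `window` are hypothetical names.

```python
def self_label_series(series, window=3):
    """Build (features, label) pairs from an unlabeled series.

    Each label is the value that immediately follows the window, so it is
    derived from the input variables themselves (an assumed labeling rule).
    """
    pairs = []
    for start in range(len(series) - window):
        features = series[start:start + window]
        label = series[start + window]
        pairs.append((features, label))
    return pairs

readings = [1.0, 2.0, 3.0, 4.0, 5.0]
pairs = self_label_series(readings, window=3)
# pairs == [([1.0, 2.0, 3.0], 4.0), ([2.0, 3.0, 4.0], 5.0)]
```

The resulting pairs can be passed to any ordinary supervised regressor or classifier.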

Self-supervised learning has produced promising results in recent years and has found practical application in audio processing; Facebook and others use it for speech recognition.[7] The primary appeal of SSL is that training can occur with data of lower quality, rather than that it improves ultimate outcomes. Self-supervised learning more closely imitates the way humans learn to classify objects.[8]

Types

For a binary classification task, training data can be divided into positive examples and negative examples. Positive examples are those that match the target. For example, when learning to identify birds, the positive training examples are the pictures that contain birds. Negative examples are those that do not.[9]

Contrastive self-supervised learning

Contrastive self-supervised learning uses both positive and negative examples. The loss function of contrastive learning minimizes the distance between positive sample pairs while maximizing the distance between negative sample pairs.[9]
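The pull-together, push-apart behaviour can be sketched with a margin-based pairwise loss. This is one common formulation, used here as an assumption; the source does not name a specific loss, and `pair_contrastive_loss` and `margin` are hypothetical names.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pair_contrastive_loss(a, b, is_positive, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    Positive pairs are penalized by their squared distance (pulled together).
    Negative pairs are penalized only while closer than `margin` (pushed apart).
    """
    d = euclidean(a, b)
    if is_positive:
        return d ** 2
    return max(0.0, margin - d) ** 2

anchor = [0.0, 0.0]
loss_pos = pair_contrastive_loss(anchor, [0.1, 0.0], is_positive=True)        # small: views already close
loss_neg_near = pair_contrastive_loss(anchor, [0.2, 0.0], is_positive=False)  # large: negative too close
loss_neg_far = pair_contrastive_loss(anchor, [3.0, 0.0], is_positive=False)   # zero: beyond the margin
```

Minimizing the sum of these terms over many pairs shrinks distances between positives and grows distances between negatives until they exceed the margin.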

Non-contrastive self-supervised learning

Non-contrastive self-supervised learning (NCSSL) uses only positive examples. Counterintuitively, NCSSL converges on a useful local minimum rather than reaching a trivial solution with zero loss. In the binary classification example, the trivial solution would be to classify every example as positive. Effective NCSSL requires an extra predictor on the online side, with no gradients back-propagated through the target side.[9]
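The trivial solution can be made concrete with a small sketch: a loss built only from positive pairs reaches zero for an encoder that ignores its input. The second half shows a target copy that is never updated by gradients, here via an exponential moving average of the online weights as in Bootstrap Your Own Latent. All function names are hypothetical, the encoders are stand-ins rather than trained networks, and the source itself specifies only that the target side receives no back-propagation.

```python
def positive_only_loss(encoder, pairs):
    """Mean squared distance between embeddings of two views of the same sample."""
    total = 0.0
    for view_a, view_b in pairs:
        za, zb = encoder(view_a), encoder(view_b)
        total += sum((x - y) ** 2 for x, y in zip(za, zb))
    return total / len(pairs)

def collapsed_encoder(x):
    """Trivial solution: every input maps to the same embedding."""
    return [0.0, 0.0]

def identity_encoder(x):
    """Stand-in for an encoder that keeps information about its input."""
    return list(x)

def ema_update(target_weights, online_weights, tau=0.99):
    """Move target weights toward online weights without any gradient step."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_weights, online_weights)]

pairs = [([1.0, 0.0], [0.9, 0.1]), ([0.0, 1.0], [0.1, 0.8])]
collapsed_loss = positive_only_loss(collapsed_encoder, pairs)    # zero loss, useless embedding
informative_loss = positive_only_loss(identity_encoder, pairs)   # nonzero loss, useful embedding
new_target = ema_update([0.0], [1.0], tau=0.9)
```

Because the collapsed encoder achieves the lowest possible loss, something beyond the loss itself, such as the predictor and the gradient-free target side, has to keep training away from it.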

Comparison with other forms of machine learning

SSL belongs to supervised learning methods insofar as the goal is to generate a classified output from the input. At the same time, however, it does not require the explicit use of labeled input-output pairs. Instead, correlations, metadata embedded in the data, or domain knowledge present in the input are implicitly and autonomously extracted from the data. These supervisory signals, generated from the data, can then be used for training.[8]

SSL is similar to unsupervised learning in that it does not require labels in the sample data. Unlike unsupervised learning, however, learning is not done using inherent data structures.

Semi-supervised learning combines supervised and unsupervised learning, requiring only a small portion of the training data to be labeled.[3]

In transfer learning, a model designed for one task is reused on a different task.[10]

Training an autoencoder intrinsically constitutes a self-supervised process, because the output pattern needs to become an optimal reconstruction of the input pattern itself. However, in current jargon, the term 'self-supervised' has become associated with classification tasks that are based on a pretext-task training setup. This involves the (human) design of such pretext task(s), unlike the case of fully self-contained autoencoder training.[11]

In reinforcement learning, self-supervised learning from a combination of losses can create abstract representations in which only the most important information about the state is kept, in compressed form.[12]

Examples

Self-supervised learning is particularly suitable for speech recognition. For example, Facebook developed wav2vec, a self-supervised algorithm, to perform speech recognition using two deep convolutional neural networks that build on each other.[7]

Google's Bidirectional Encoder Representations from Transformers (BERT) model is used to better understand the context of search queries.[13]

OpenAI's GPT-3 is an autoregressive language model that can be used for language processing tasks such as translating texts or answering questions, among other things.[14]

Bootstrap Your Own Latent is an NCSSL method that produced excellent results on ImageNet and on transfer and semi-supervised benchmarks.[15]

The Yarowsky algorithm is an example of self-supervised learning in natural language processing. From a small number of labeled examples, it learns to predict which word sense of a polysemous word is being used at a given point in text.

DirectPred is an NCSSL method that sets the predictor weights directly instead of learning them via gradient updates.[9]

References

  1. Lua error in package.lua at line 80: module 'strict' not found.
  2. Lua error in package.lua at line 80: module 'strict' not found.
  3. 3.0 3.1 Lua error in package.lua at line 80: module 'strict' not found.
  4. Lua error in package.lua at line 80: module 'strict' not found.
  5. Lua error in package.lua at line 80: module 'strict' not found.
  6. Lua error in package.lua at line 80: module 'strict' not found.
  7. 7.0 7.1 Lua error in package.lua at line 80: module 'strict' not found.
  8. 8.0 8.1 Lua error in package.lua at line 80: module 'strict' not found.
  9. 9.0 9.1 9.2 9.3 Lua error in package.lua at line 80: module 'strict' not found.
  10. Lua error in package.lua at line 80: module 'strict' not found.
  11. Lua error in package.lua at line 80: module 'strict' not found.
  12. Lua error in package.lua at line 80: module 'strict' not found.
  13. Lua error in package.lua at line 80: module 'strict' not found.
  14. Lua error in package.lua at line 80: module 'strict' not found.
  15. Lua error in package.lua at line 80: module 'strict' not found.
