    From Denis Mosko@2:5064/54.1315 to All on Thu Sep 17 07:22:02 2020

• The Transformer is a deep learning model introduced in 2017, used primarily in the field of NLP.

    Like RNNs, Transformers are designed to handle sequential data, such as natural language, for tasks like translation and text summarization. However, unlike RNNs, Transformers do not require that the sequential data be processed in order. For example, if the input is a natural language sentence, the Transformer does not need to process its beginning before its end. This property allows for much more parallelization than RNNs and therefore shorter training times.
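    A minimal sketch of why this parallelization is possible: in self-attention (the core Transformer operation), every position attends to every other position through a few matrix multiplications, so all positions are processed at once rather than one after another. The weight matrices and dimensions below are illustrative toy values, not from any real model.

    ```python
    import numpy as np

    def self_attention(x, wq, wk, wv):
        """Scaled dot-product self-attention over ALL positions at once --
        no token-by-token loop, unlike an RNN that must go in order."""
        q, k, v = x @ wq, x @ wk, x @ wv               # (seq_len, d) each
        scores = q @ k.T / np.sqrt(k.shape[-1])        # all position pairs compared in parallel
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True) # softmax over key positions
        return weights @ v                             # weighted mix of value vectors

    rng = np.random.default_rng(0)
    seq_len, d = 4, 8                                  # toy "sentence": 4 token embeddings
    x = rng.normal(size=(seq_len, d))
    wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
    out = self_attention(x, wq, wk, wv)
    print(out.shape)                                   # one output per position, computed simultaneously
    ```

    Because the whole computation is matrix algebra with no dependence between time steps, it maps cleanly onto GPUs, which is the source of the training-time advantage over RNNs.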

    Since their introduction, Transformers have become the model of choice for tackling many problems in NLP, replacing older recurrent neural network models such as the long short-term memory (LSTM). Because the Transformer model facilitates more parallelization during training, it has enabled training on larger datasets than was possible before it was introduced. This has led to the development of pretrained systems such as BERT and GPT, which have been trained on huge general-language datasets and can be fine-tuned for specific language tasks.

    Why?
    --- GoldED+/W32-MINGW 1.1.5-b20120519 (Kubik 3.0)
    * Origin: In the beginning was the word. In the end there will be the origin. (2:5064/54.1315)