Beyond The Transformer
Published in 2017 by researchers at Google Brain, 'Attention Is All You Need' [1] introduced the Transformer neural network architecture, which parallelised previously serial computation through its multi-head self-attention mechanism, expressed in the language of linear algebra.
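At the heart of that mechanism is the paper's scaled dot-product attention, softmax(QKᵀ/√d_k)V, which processes every sequence position in a single matrix product rather than one step at a time. A minimal NumPy sketch (the toy sequence length, dimension, and random inputs are illustrative assumptions, not values from the paper):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V,
    computed for all positions at once."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) similarity matrix
    # Row-wise softmax, numerically stabilised by subtracting the row max
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted sum of value rows

# Toy self-attention: 4 positions, dimension 8; Q = K = V = X
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(X, X, X)
print(out.shape)  # (4, 8)
```

Because the attention weights for all positions come from one matrix multiplication, the whole sequence can be computed in parallel, in contrast to the step-by-step recurrence of earlier RNN-based models.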