
Commits

List of commits on branch master:

7990379ba39077b8f57c54d0a5bdc66e1b4a075c (unverified): added references (eyaler, 6 years ago)
4de6677c961a2d1b25866ea970a64a35c0f8cfcf (unverified): add references to readme (eyaler, 6 years ago)
ce9f27335b442cabc75a222a1b6061cb74a41d72 (unverified): first code push (eyaler, 6 years ago)
91518a59c426a8996dbff88459b471e4f6331d9a (unverified): added transformer_tpu.txt (eyaler, 6 years ago)
f5c43a24d8ad32a1ec9b862cc3a0f6da483b4f11 (verified): Update README.md (eyaler, 6 years ago)
8d0a08ef03fed7b6a35528dfe76e39c231155977 (verified): Merge pull request #1 from eyaler/add-license-1 (eyaler, 6 years ago)

README


Autotransformer

A transformer variant for semantic continuous sentence representation and generation

We modify the Transformer to produce a fixed-length, distributed sentence representation. This allows sampling from the latent space to generate sentences, as well as semantic interpolations, sentence analogies, etc. The code is based on the TensorFlow Tensor2Tensor library and trains on TPUs out of the box.
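How the fixed-length representation is produced is defined by the code itself; as a rough sketch of the general idea (not this repository's exact implementation), one common approach pools the encoder's per-token outputs into a single vector, after which interpolation and analogies reduce to vector arithmetic in the latent space. All names below are hypothetical:

```python
import numpy as np

def pool_to_fixed_length(encoder_outputs, mask):
    """Mean-pool per-token encoder states into one fixed-length vector.

    encoder_outputs: float array of shape [seq_len, d_model]
    mask: binary array of shape [seq_len], 1 for real tokens, 0 for padding
    """
    mask = mask[:, None]                       # [seq_len, 1]
    summed = (encoder_outputs * mask).sum(axis=0)
    return summed / np.maximum(mask.sum(), 1)  # [d_model]

def interpolate(z_a, z_b, steps=5):
    """Linear interpolation between two sentence vectors in latent space."""
    return [(1 - t) * z_a + t * z_b for t in np.linspace(0.0, 1.0, steps)]

def analogy(z_a, z_b, z_c):
    """Sentence analogy a : b :: c : ? via vector arithmetic."""
    return z_b - z_a + z_c

# Toy usage with random stand-ins for encoder outputs.
rng = np.random.default_rng(0)
enc = rng.normal(size=(7, 512))         # 7 tokens, d_model = 512
mask = np.array([1, 1, 1, 1, 1, 0, 0])  # last two positions are padding
z = pool_to_fixed_length(enc, mask)
print(z.shape)                          # (512,)
```

Each vector produced by interpolate or analogy would then be decoded back into a sentence.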

Based on:

Vaswani et al., Attention Is All You Need, arxiv.org/abs/1706.03762

Cer et al., Universal Sentence Encoder, arxiv.org/abs/1803.11175

Cifka et al., Eval all, trust a few, do wrong to none: Comparing sentence generation models, arxiv.org/abs/1804.07972

Cifka and Bojar, Are BLEU and Meaning Representation in Opposition?, arxiv.org/abs/1805.06536

Motivation:

Bowman et al., Generating Sentences from a Continuous Space, arxiv.org/abs/1511.06349

Gan et al., Learning Generic Sentence Representations Using Convolutional Neural Networks, arxiv.org/abs/1611.07897

Semeniuta et al., A Hybrid Convolutional Variational Autoencoder for Text Generation, arxiv.org/abs/1702.02390

Zhao et al., Adversarially Regularized Autoencoders, arxiv.org/abs/1706.04223

Bojar, FINMT 2017 slides, https://blogs.helsinki.fi/language-technology/files/2017/09/FINMT2017-Bojar.pdf

A rough guide to training the Transformer on TPU can be found here:

https://github.com/eyaler/autotransformer/blob/master/transformer_tpu.txt
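For orientation, and with the caveat that transformer_tpu.txt is the authoritative reference for this repository, a generic Tensor2Tensor TPU run has roughly the following shape; the problem name, step count, and environment variables below are illustrative placeholders:

```bash
# Generate training data for a built-in problem (placeholder choice).
t2t-datagen \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR \
  --problem=translate_ende_wmt32k

# Train the Transformer on a Cloud TPU with TPU-tuned hyperparameters.
t2t-trainer \
  --model=transformer \
  --hparams_set=transformer_tpu \
  --problem=translate_ende_wmt32k \
  --data_dir=$DATA_DIR \
  --output_dir=$OUT_DIR \
  --train_steps=250000 \
  --use_tpu=True \
  --cloud_tpu_name=$TPU_NAME
```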

Acknowledgements: This work was supported by Deep Learning Camp Jeju 2018, which was organized by the TensorFlow Korea User Group.