Tensorflow Transformers (tf-transformers)

State-of-the-art, faster Natural Language Processing in TensorFlow 2.0.

tf-transformers provides general-purpose architectures (BERT, GPT-2, RoBERTa, T5, Seq2Seq…) for Natural Language Understanding (NLU) and Natural Language Generation (NLG), with more than 32 pretrained models in over 100 languages, in TensorFlow 2.0.

tf-transformers is the fastest library for Transformer-based architectures among comparable implementations in TensorFlow 2.0. It is 80x faster than well-known comparable libraries such as the HuggingFace TensorFlow 2.0 implementations. For more details about benchmarking, see the BENCHMARK section.
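
A speedup figure of this kind is usually obtained by timing the same workload in two implementations and dividing the results. The sketch below is a generic illustration of that arithmetic, not the benchmark script used by tf-transformers: `time_call`, `speedup`, and the two stand-in workloads are hypothetical names, and a real measurement would call actual model inference in their place.

```python
import time
from statistics import median
from typing import Callable, List


def time_call(fn: Callable[[], object], repeats: int = 5, warmup: int = 1) -> float:
    """Return the median wall-clock time in seconds of calling fn()."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (e.g. graph tracing, cache fills)
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


# Hypothetical stand-ins for two implementations of the same task.
def slow_workload() -> int:
    return sum(i * i for i in range(200_000))


def fast_workload() -> int:
    return sum(i * i for i in range(2_000))


if __name__ == "__main__":
    baseline = time_call(slow_workload)
    candidate = time_call(fast_workload)
    print(f"speedup: {speedup(baseline, candidate):.1f}x")
```

Using the median over several repeats, after an untimed warm-up call, keeps one-off costs such as tracing or caching from dominating the ratio.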

This is the documentation of our repository tf-transformers <https://github.com/legacyai/tf-transformers>. You can also follow our documentation <https://legacyai.github.com/tf-transformers>, which teaches how to use this library and covers its other features.

Features

High performance on NLU and NLG tasks
Low barrier to entry for educators and practitioners

State-of-the-art NLP for everyone:

Deep learning researchers
Hands-on practitioners
AI/ML/NLP teachers and educators

Lower compute costs, smaller carbon footprint:

Researchers can share trained models instead of always retraining
Practitioners can reduce compute time and production costs
8 architectures with over 30 pretrained models, some in more than 100 languages

Choose the right framework for every part of a model’s lifetime:

Train state-of-the-art models in 3 lines of code
Complete support for TensorFlow 2.0 models
Seamlessly pick the right framework for training, evaluation, production

Contents

The documentation is organized in seven parts:

GET STARTED contains a quick tour, installation instructions, some useful information about our philosophy, and a glossary.
MODELS contains general documentation on how to use the library.
MODEL USAGE contains quick examples on how to use the models.
ADVANCED TUTORIALS contains more advanced guides that are more specific to training and inference in production.
RESEARCH focuses on tutorials that have less to do with how to use the library and more to do with general research on transformer models, most of them written around fast pre-processing and TPU.
TFLITE contains quick examples on how to use tflite models.
BENCHMARK contains quick examples on how to benchmark models and the results.

The library currently contains TensorFlow implementations, pretrained model weights, usage scripts, tutorials and conversion utilities for the following models.

Supported models

ALBERT (from Google Research and the Toyota Technological Institute at Chicago) released with the paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
BART (from Facebook) released with the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer.
BERT (from Google) released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
BERT For Sequence Generation (from Google) released with the paper Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
CLIP (from OpenAI) released with the paper Learning Transferable Visual Models From Natural Language Supervision by Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever.
GPT-2 (from OpenAI) released with the paper Language Models are Unsupervised Multitask Learners by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
M2M100 (from Facebook) released with the paper Beyond English-Centric Multilingual Machine Translation by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
MarianMT: machine translation models trained using OPUS data by Jörg Tiedemann. The Marian Framework is being developed by the Microsoft Translator Team.
MBart (from Facebook) released with the paper Multilingual Denoising Pre-training for Neural Machine Translation by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
MBart-50 (from Facebook) released with the paper Multilingual Translation with Extensible Multilingual Pretraining and Finetuning by Yuqing Tang, Chau Tran, Xian Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan.
MT5 (from Google AI) released with the paper mT5: A massively multilingual pre-trained text-to-text transformer by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
RoBERTa (from Facebook) released with the paper RoBERTa: A Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
T5 (from Google AI) released with the paper Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
Vision Transformer (ViT) (from Google AI) released with the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.
