Tensorflow Transformers (tf-transformers)
State-of-the-art Faster Natural Language Processing in TensorFlow 2.0.
tf-transformers provides general-purpose architectures (BERT, GPT-2, RoBERTa, T5, Seq2Seq…) for Natural Language Understanding (NLU) and Natural Language Generation (NLG), with 32+ pretrained models in 100+ languages, in TensorFlow 2.0.
tf-transformers is the fastest library for Transformer-based architectures compared to existing similar implementations in TensorFlow 2.0. It is 80x faster than well-known comparable libraries such as the HuggingFace TensorFlow 2.0 implementations. For details about benchmarking, see BENCHMARK.
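A speedup figure like the one above is a ratio of wall-clock times for the same task. The following is a minimal sketch of how such a ratio could be measured using only the Python standard library. The functions `slow_impl` and `fast_impl` are hypothetical placeholders, not tf-transformers or HuggingFace code, and this is not the project's benchmark methodology (see BENCHMARK for that).

```python
import time
from typing import Callable


def median_seconds(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time of `fn` over `repeats` runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    timings.sort()
    return timings[len(timings) // 2]


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


# Hypothetical stand-ins for two implementations of the same task.
def slow_impl() -> int:
    # Sum of squares computed with an explicit list.
    return sum([i * i for i in range(200_000)])


def fast_impl() -> int:
    # Closed form for the same sum: 0^2 + 1^2 + ... + (n-1)^2.
    n = 200_000
    return (n - 1) * n * (2 * n - 1) // 6


if __name__ == "__main__":
    ratio = speedup(median_seconds(slow_impl), median_seconds(fast_impl))
    print(f"fast_impl is about {ratio:.1f}x faster than slow_impl")
```

Using the median over several runs reduces the influence of one-off delays such as warm-up or background load; for TensorFlow models, a warm-up call before timing would typically be needed as well.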
This is the documentation of our repository tf-transformers <https://github.com/legacyai/tf-transformers>. You can also follow our documentation <https://legacyai.github.com/tf-transformers>, which teaches how to use this library and covers its other features.
Features
High performance on NLU and NLG tasks
Low barrier to entry for educators and practitioners
State-of-the-art NLP for everyone:
Deep learning researchers
Hands-on practitioners
AI/ML/NLP teachers and educators
Lower compute costs, smaller carbon footprint:
Researchers can share trained models instead of always retraining
Practitioners can reduce compute time and production costs
8 architectures with over 30 pretrained models, some in more than 100 languages
Choose the right framework for every part of a model’s lifetime:
Train state-of-the-art models in 3 lines of code
Complete support for TensorFlow 2.0 models
Seamlessly pick the right framework for training, evaluation, production
Contents
The documentation is organized in seven parts:
GET STARTED contains a quick tour, the installation instructions and some useful information about our philosophy and a glossary.
MODELS contains general documentation on how to use the library.
MODEL USAGE contains quick examples on how to use the models.
ADVANCED TUTORIALS contains more advanced guides that are more specific to training and inference in production.
RESEARCH focuses on tutorials that are less about how to use the library and more about general research on transformer models, most of them built around fast pre-processing and TPUs.
TFLITE contains quick examples on how to use tflite models.
BENCHMARK contains quick examples on how to benchmark models and the results.
The library currently contains TensorFlow implementations, pretrained model weights, usage scripts, tutorials and conversion utilities for the following models.
Supported models
ALBERT (from Google Research and the Toyota Technological Institute at Chicago) released with the paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
BART (from Facebook) released with the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer.
BERT (from Google) released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
BERT For Sequence Generation (from Google) released with the paper Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
CLIP (from OpenAI) released with the paper Learning Transferable Visual Models From Natural Language Supervision by Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever.
GPT-2 (from OpenAI) released with the paper Language Models are Unsupervised Multitask Learners by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
M2M100 (from Facebook) released with the paper Beyond English-Centric Multilingual Machine Translation by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
MarianMT: machine translation models trained using OPUS data by Jörg Tiedemann. The Marian Framework is being developed by the Microsoft Translator Team.
MBart (from Facebook) released with the paper Multilingual Denoising Pre-training for Neural Machine Translation by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
MBart-50 (from Facebook) released with the paper Multilingual Translation with Extensible Multilingual Pretraining and Finetuning by Yuqing Tang, Chau Tran, Xian Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan.
MT5 (from Google AI) released with the paper mT5: A massively multilingual pre-trained text-to-text transformer by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
RoBERTa (from Facebook) released with the paper RoBERTa: A Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
T5 (from Google AI) released with the paper Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li and Peter J. Liu.
Vision Transformer (ViT) (from Google AI) released with the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.
- Writing and Reading TFRecords
- Classify text (MRPC) with Albert
- Train (Masked Language Model) with tf-transformers in TPU
- Classify Flowers (Image Classification) with ViT using multi-GPU
  - Prepare Data
  - Plot a few examples
  - Load Model, Optimizer, Trainer
  - Prepare Data for Training
  - Prepare Dataset
  - Wandb Configuration
  - Accuracy Callback
  - Train :-)
  - Visualize the Tensorboard
  - Load Trained Model for Testing and Save it as a serialized model
  - Calculate Predictions
  - Plot Confusion Matrix
  - Model Serialization (Production)
  - Advanced Serialization (Include pre-processing with models)
  - Evaluate Accuracy using joint Serialization Model
  - Plot Mistakes of Model
- Create Sentence Embedding Roberta Model + Zeroshot from Scratch
  - Prepare Training TFRecords using Quora
  - Prepare Validation TFRecords using STS-b
  - Prepare Training and Validation Dataset from TFRecords
  - Build Sentence Transformer Model
  - Load Model, Optimizer, Trainer
  - Wandb Configuration
  - Zero-Shot on STS before Training
  - Set Hyperparameters and Configs
  - Train :-)
  - Visualize the Tensorboard
  - Load Trained Model for Testing and Save it as a serialized model
  - Model Serialization (Production)
  - Quora Sentence Embeddings
  - Most Similar Sentences
- Prompt Engineering using CLIP
- GPT2 for QA using Squad V1 (Causal LM)
  - Load Data, Tokenizer
  - Prepare Training TFRecords and Validation TFRecords using Squad (causal and prefix)
  - Prepare Validation TFRecords
  - Wandb Configuration
  - Load Model, Optimizer, Trainer
  - Set Hyperparameters and Configs
  - Train GPT2 Causal :-)
  - Evaluation Script (Squad V1) - Exact match, F1 score
  - Evaluate (exact match and F1 score) on all checkpoints - GPT2 Causal
- Code Java to C# using T5
- Read and Write Images as TFRecords