Transformers serve as the foundation for pre-trained systems such as BERT and GPT. The Transformer is an encoder-decoder architecture that relies entirely on attention, dispensing with recurrence and convolutions.
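To see what "dispensing with recurrence" means in practice, here is a minimal sketch of the scaled dot-product attention at the core of the architecture; the NumPy implementation and the toy sizes are purely illustrative:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to every key in parallel, so no step-by-step
    recurrence over the sequence is needed."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted sum of values

# Toy example: a sequence of 3 tokens with embedding dimension 4.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```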
Given the importance of Transformers in machine learning and AI, we’ve compiled a list of five books to help you better understand the sequence transduction model.
Transformers have become an essential component of many neural network architectures and are used in applications ranging from NLP and speech recognition to time series and computer vision. They have also undergone numerous modifications, spawning newer techniques and methods. The first comprehensive book on transformers is Transformers for Machine Learning: A Deep Dive.
Its theoretical explanations of cutting-edge transformer architectures will appeal to postgraduate students and researchers in academia and industry, providing a single entry point into a rapidly evolving field. Undergraduate students, practitioners, and professionals will benefit from the practical, hands-on case studies and code, which allow quick experimentation and lower the barrier to entry.
Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a wide range of natural language processing tasks.
Transformers have been used to write realistic news stories, improve Google Search queries, and even build chatbots that tell corny jokes. In Natural Language Processing with Transformers, Hugging Face Transformers creators Lewis Tunstall, Leandro von Werra, and Thomas Wolf take a hands-on approach to teaching you how transformers work and how to integrate them into your applications. You will quickly discover the variety of tasks they can help you with.
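That hands-on flavor is easy to preview: the library's pipeline API wraps a pre-trained model behind a one-line task interface. The sketch below uses sentiment analysis with the library's default checkpoint; the input sentence is just an example:

```python
from transformers import pipeline

# Downloads a default pre-trained checkpoint on first use.
classifier = pipeline("sentiment-analysis")

result = classifier("Transformers make state-of-the-art NLP surprisingly accessible.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```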
Transformer-based language models have dominated NLP research and evolved into a new paradigm. Mastering Transformers teaches you how to use the Python Transformers library to build a variety of transformer-based NLP applications.
The book introduces transformers by showing you how to write your first hello-world program. You will then learn how a tokenizer works and how to train your own. As you progress, you will explore the architecture of autoencoding models such as BERT and autoregressive models such as GPT. Next, you will learn how to train and fine-tune models for various NLU and NLG problems, including text classification, token classification, and text representation. The book also shows you how to build efficient models for complex problems, such as long-context NLP tasks, under limited computational capacity. You will work with multilingual and cross-lingual problems, optimize models through performance monitoring, and learn how to deconstruct these models for interpretability and explainability. Finally, you will be able to put your transformer models into production.
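To make the fine-tuning step concrete, here is a hedged sketch of fine-tuning a pre-trained checkpoint for text classification with the Transformers Trainer API; the model name, dataset, and hyperparameters are illustrative choices, not taken from the book:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Illustrative choices: any classification checkpoint and labeled dataset work.
model_name = "distilbert-base-uncased"
dataset = load_dataset("imdb")

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="clf-out", num_train_epochs=1,
                           per_device_train_batch_size=8),
    # Small subsets keep the demo fast; use the full splits for real training.
    train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=dataset["test"].select(range(500)),
)
trainer.train()
```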
Deep learning (DL) is a critical component of the exciting advances taking place today in machine learning and artificial intelligence. Magnus Ekman's Learning Deep Learning is a comprehensive guide to DL. The book is ideal for developers, data scientists, analysts, and others, including those with no prior machine learning or statistics experience, because it illuminates both the core concepts and the hands-on programming techniques needed to succeed.
Magnus Ekman demonstrates how to use the essential building blocks of deep neural networks, such as artificial neurons and fully connected, convolutional, and recurrent layers, to build advanced architectures such as the Transformer. He explains how these concepts are applied to developing modern networks for computer vision and natural language processing (NLP), such as Mask R-CNN, GPT, and BERT. He also describes how a natural language translator and a system that generates natural language descriptions of images work.
Throughout, Ekman uses TensorFlow with Keras to provide concise, well-annotated code examples; corresponding PyTorch models are available online, so the book covers the two Python libraries most widely used for DL in industry and academia. He concludes by introducing neural architecture search (NAS) and discussing critical ethical issues, along with resources for further study.
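As a flavor of those building blocks, here is a minimal Keras sketch stacking convolutional, recurrent, and fully connected layers; the layer sizes and input shape are arbitrary placeholders, not examples from the book:

```python
import tensorflow as tf
from tensorflow import keras

# A tiny sequence classifier built from the basic building blocks:
# a convolutional layer, a recurrent layer, and a fully connected layer.
model = keras.Sequential([
    keras.layers.Input(shape=(100, 8)),                         # 100 time steps, 8 features
    keras.layers.Conv1D(32, kernel_size=3, activation="relu"),  # convolutional
    keras.layers.LSTM(64),                                      # recurrent
    keras.layers.Dense(10, activation="softmax"),               # fully connected
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```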
Transformers for Natural Language Processing, 2nd Edition introduces you to the world of transformers, highlighting the strengths of various models and platforms while teaching you the problem-solving techniques you need to overcome model shortcomings. Using Hugging Face, you will pre-train a RoBERTa model from scratch, creating the dataset, defining the data collator, and training the model. The book also provides step-by-step instructions for fine-tuning pre-trained models such as GPT-3.
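For orientation, here is a hedged sketch of what pre-training a RoBERTa model from scratch with Hugging Face can look like; the corpus, tokenizer reuse, model size, and hyperparameters are placeholder assumptions, not the book's exact recipe:

```python
from datasets import load_dataset
from transformers import (DataCollatorForLanguageModeling, RobertaConfig,
                          RobertaForMaskedLM, RobertaTokenizerFast,
                          Trainer, TrainingArguments)

# Placeholder corpus; the book builds its own dataset instead.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

# For brevity we reuse an existing tokenizer; the book trains its own.
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

# A deliberately small configuration, randomly initialized (trained from scratch).
config = RobertaConfig(vocab_size=tokenizer.vocab_size, num_hidden_layers=6,
                       hidden_size=384, num_attention_heads=6)
model = RobertaForMaskedLM(config)

# The data collator masks 15% of tokens for masked language modeling.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True,
                                           mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="roberta-scratch", num_train_epochs=1,
                           per_device_train_batch_size=16),
    data_collator=collator,
    train_dataset=dataset,
)
trainer.train()
```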
The book examines machine translation, speech-to-text, text-to-speech, question answering, and other NLP tasks. It offers strategies for tackling complex language problems and may even ease anxiety about fake news (see chapter 13 for details). You will see how cutting-edge platforms such as OpenAI have extended transformers beyond language, into computer vision tasks and code generation with Codex. By the end of the book, you will understand how transformers work, how to implement them, and how to solve problems like an AI detective!
Source: indiaai.gov.in