
github.com/appvision-ai/fast-bert @v2.0.25 sqlite


463 symbols 1,456 edges 41 files 142 documented · 31%

README

Fast-Bert

New - Learning Rate Finder for Text Classification Training (borrowed with thanks from https://github.com/davidtvs/pytorch-lr-finder)

Supports LAMB optimizer for faster training. Please refer to https://arxiv.org/abs/1904.00962 for the paper on LAMB optimizer.

Supports BERT and XLNet for both Multi-Class and Multi-Label text classification.

Fast-Bert is a deep learning library that allows developers and data scientists to train and deploy BERT and XLNet based models for natural language processing tasks, beginning with Text Classification.

Fast-Bert builds on the solid foundations of the excellent Hugging Face BERT PyTorch library and is inspired by fast.ai. It strives to make cutting-edge deep learning technologies accessible to the vast community of machine learning practitioners.

With Fast-Bert, you will be able to:

Train (more precisely fine-tune) BERT, RoBERTa and XLNet text classification models on your custom dataset.
Tune model hyper-parameters such as epochs, learning rate, batch size, optimiser schedule and more.
Save and deploy trained model for inference (including on AWS Sagemaker).

Fast-Bert supports both multi-class and multi-label text classification for the following model architectures. In due course, it will also support other NLU tasks such as Named Entity Recognition, Question Answering and custom corpus fine-tuning.

1) BERT (from Google) released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.

2) XLNet (from Google/CMU) released with the paper XLNet: Generalized Autoregressive Pretraining for Language Understanding by Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.

3) RoBERTa (from Facebook), a Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du et al.

4) DistilBERT (from HuggingFace), released together with the blogpost Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT by Victor Sanh, Lysandre Debut and Thomas Wolf.

Installation

This repo is tested on Python 3.6+.

With pip

Fast-Bert can be installed with pip as follows:

pip install fast-bert

From source

Clone the repository and run:

pip install [--editable] .

Alternatively, install directly from GitHub without cloning:

pip install git+https://github.com/kaushaltrivedi/fast-bert.git

You will also need to install NVIDIA Apex.

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
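After installing, a quick sanity check can confirm that the interpreter meets the Python 3.6+ requirement and that the packages are importable. The snippet below is a minimal sketch using only the standard library; the function name check_environment is hypothetical and not part of Fast-Bert.

```python
import importlib.util
import sys

# Minimum Python version stated in this README.
MIN_PYTHON = (3, 6)


def check_environment(packages=("fast_bert", "torch", "apex")):
    """Report whether the interpreter and the given top-level packages look usable."""
    report = {"python_ok": sys.version_info[:2] >= MIN_PYTHON}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing'}")
```

This only checks that the packages can be found. It does not verify that Apex was built with its C++ and CUDA extensions.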

Usage

Text Classification

1. Create a DataBunch object

The databunch object takes training, validation and test CSV files and converts the data into the internal representation for BERT, RoBERTa, DistilBERT or XLNet. It also instantiates the correct data loaders based on the device profile, batch_size_per_gpu and max_seq_length.


from fast_bert.data_cls import BertDataBunch

databunch = BertDataBunch(DATA_PATH, LABEL_PATH,
                          tokenizer='bert-base-uncased',
                          train_file='train.csv',
                          val_file='val.csv',
                          label_file='labels.csv',
                          text_col='text',
                          label_col='label',
                          batch_size_per_gpu=16,
                          max_seq_length=512,
                          multi_gpu=True,
                          multi_label=False,
                          model_type='bert')

File format for train.csv and val.csv

index	text	label
0	Looking through the other comments, I'm amazed that there aren't any warnings to potential viewers of what they have to look forward to when renting this garbage. First off, I rented this thing with the understanding that it was a competently rendered Indiana Jones knock-off.	neg
1	I've watched the first 17 episodes and this series is simply amazing! I haven't been this interested in an anime series since Neon Genesis Evangelion. This series is actually based off an h-game, which I'm not sure if it's been done before or not, I haven't played the game, but from what I've heard it follows it very well	pos
2	his movie is nothing short of a dark, gritty masterpiece. I may be bias, as the Apartheid era is an area I've always felt for.	pos

If your column names differ from the usual text and label, pass them to the databunch through the text_col and label_col parameters.

labels.csv will contain a list of all unique labels. In this case the file will contain:

pos
neg
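The files described above can be produced with the standard csv module. This is a minimal sketch, assuming comma-separated files with an index column as in the sample; the helper name write_classification_files and the directory arguments are hypothetical and not part of Fast-Bert.

```python
import csv
from pathlib import Path


def write_classification_files(rows, data_dir, label_dir, train_name="train.csv"):
    """Write (text, label) pairs to a CSV file and the matching labels.csv."""
    data_dir, label_dir = Path(data_dir), Path(label_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    label_dir.mkdir(parents=True, exist_ok=True)

    with open(data_dir / train_name, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["index", "text", "label"])
        for i, (text, label) in enumerate(rows):
            writer.writerow([i, text, label])

    # labels.csv: one unique label per line, no header, in first-seen order.
    labels = list(dict.fromkeys(label for _, label in rows))
    (label_dir / "labels.csv").write_text("\n".join(labels) + "\n", encoding="utf-8")
    return labels
```

Calling the helper once for the training rows and once with train_name="val.csv" for the validation rows yields the layout that DATA_PATH and LABEL_PATH are expected to point at. Note that each call rewrites labels.csv from the rows it is given, so the last call should include every label.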

For multi-label classification, labels.csv will contain all possible labels:

toxic
severe_toxic
obscene
threat
insult
identity_hate

The file train.csv will then contain one column for each label, with each column value being either 0 or 1. Don't forget to set multi_label=True in BertDataBunch for multi-label classification.

id	text	toxic	severe_toxic	obscene	threat	insult	identity_hate
0	Why the edits made under my username Hardcore Metallica Fan were reverted?	0	0	0	0	0	0
0	I will mess you up	1	0	0	1	0	0

label_col will be a list of label column names. In this case it will be:

['toxic','severe_toxic','obscene','threat','insult','identity_hate']
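If your raw data stores the active labels of each example as a collection, it has to be expanded into one 0/1 column per label. The sketch below shows one way to do that with the standard library; the helper name to_multilabel_csv is hypothetical, and the comma delimiter is an assumption.

```python
import csv
import io

# The same names are written to labels.csv and passed as label_col.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def to_multilabel_csv(examples, labels=LABELS):
    """Render (text, active_labels) pairs as CSV text with one 0/1 column per label."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "text"] + list(labels))
    for i, (text, active) in enumerate(examples):
        unknown = set(active) - set(labels)
        if unknown:
            raise ValueError(f"unknown labels: {sorted(unknown)}")
        # 1 if the label applies to this example, otherwise 0.
        writer.writerow([i, text] + [int(name in active) for name in labels])
    return buffer.getvalue()
```

For example, to_multilabel_csv([("I will mess you up", {"toxic", "threat"})]) produces a header row followed by a row with 1 in the toxic and threat columns and 0 elsewhere.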

Tokenizer

You can either create a tokenizer object and pass it to DataBunch or you can pass the model name as tokenizer and DataBunch will automatically download and instantiate an appropriate tokenizer object.

For example, to use the XLNet base cased model, set the tokenizer parameter to 'xlnet-base-cased'. DataBunch will automatically download and instantiate XLNetTokenizer with the vocabulary for the xlnet-base-cased model.

Model Type

Fast-Bert supports XLNet, RoBERTa and BERT based classification models. Set the model_type parameter to 'bert', 'roberta' or 'xlnet' to initialise the appropriate databunch object.
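Because the tokenizer name and the model type have to agree, it can be convenient to derive one from the other. The helper below is a hypothetical sketch and not part of Fast-Bert. It assumes the model name contains the architecture name, and the 'distilbert' value is an assumption, since this README does not name the string for DistilBERT.

```python
def infer_model_type(model_name):
    """Guess the model_type string from a pretrained model name (hypothetical helper)."""
    name = model_name.lower()
    # Order matters: "distilbert" and "roberta" both contain the substring "bert".
    for model_type in ("distilbert", "roberta", "xlnet", "bert"):
        if model_type in name:
            return model_type
    raise ValueError(f"cannot infer model_type from {model_name!r}")
```

With this, infer_model_type('xlnet-base-cased') gives 'xlnet', which can be passed as model_type alongside tokenizer='xlnet-base-cased'.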

2. Create a Learner Object

BertLearner is the ‘learner’ object that holds everything together. It encapsulates the key logic for the lifecycle of the model such as training, validation and inference.

The learner object takes the databunch created earlier as input, along with other parameters such as the location of one of the pretrained models, FP16 training, and the multi_gpu and multi_label options.

The learner class contains the logic for the training loop, validation loop, optimiser strategies and key metrics calculation. This helps developers focus on their custom use-cases without worrying about these repetitive activities.

At the same time, the learner object can be customised either through its parameters or by creating a subclass of BertLearner and redefining the relevant methods.


from fast_bert.learner_cls import BertLearner
from fast_bert.metrics import accuracy
import logging
import torch  # needed for torch.device below

logger = logging.getLogger()
device_cuda = torch.device("cuda")
metrics = [{'name': 'accuracy', 'function': accuracy}]

learner = BertLearner.from_pretrained_model(
                        databunch,
                        pretrained_path='bert-base-uncased',
                        metrics=metrics,
                        device=device_cuda,
                        logger=logger,
                        output_dir=OUTPUT_DIR,
                        finetuned_wgts_path=None,
                        warmup_steps=500,
                        multi_gpu=True,
                        is_fp16=True,
                        multi_label=False,
                        logging_steps=50)

parameter	description
databunch	Databunch object created earlier
pretrained_path	Directory for the location of the pretrained model files or the name of one of the pretrained models i.e. bert-base-uncased, xlnet-large-cased, etc
metrics	List of metrics functions that you want the model to calculate on the validation set, e.g. accuracy, beta, etc
device	torch.device of type cuda or cpu
logger	logger object
output_dir	Directory for model to save trained artefacts, tokenizer vocabulary and tensorboard files
finetuned_wgts_path	provide the location for fine-tuned language model (experimental feature)
warmup_steps	number of training warmup steps for the scheduler
multi_gpu	multiple GPUs available e.g. if running on AWS p3.8xlarge instance
is_fp16	FP16 training
multi_label	multilabel classification
logging_steps	number of steps between each tensorboard metrics calculation. Set it to 0 to disable tensorboard logging. Setting this value too low slows training, because the model is evaluated each time the metrics are logged
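To judge whether warmup_steps and logging_steps are sensible for a given dataset, it helps to estimate the total number of optimizer steps first. The sketch below assumes the effective batch size is batch_size_per_gpu multiplied by the number of GPUs and that there is no gradient accumulation; both function names and the 25,000-example dataset size are illustrative assumptions.

```python
import math


def total_training_steps(num_examples, batch_size_per_gpu, epochs, n_gpu=1):
    """Estimate optimizer steps, assuming one step per effective batch."""
    effective_batch = batch_size_per_gpu * max(1, n_gpu)
    return math.ceil(num_examples / effective_batch) * epochs


def evaluation_count(total_steps, logging_steps):
    """Number of metric evaluations triggered by logging; 0 disables logging."""
    return total_steps // logging_steps if logging_steps > 0 else 0


# Illustrative numbers: 25,000 training rows on 4 GPUs, 6 epochs.
total = total_training_steps(num_examples=25000, batch_size_per_gpu=16, epochs=6, n_gpu=4)
print("total steps:", total)
print("warmup share:", 500 / total)
print("evaluations during training:", evaluation_count(total, logging_steps=50))
```

If the warmup share comes out as a large part of training, or the evaluation count is very high, lowering warmup_steps or raising logging_steps is worth considering.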

3. Find the optimal learning rate

The learning rate is one of the most important hyperparameters for model training. We have incorporated the learning rate finder that was proposed by Leslie Smith and later built into the fastai library.

learner.lr_find(start_lr=1e-5,optimizer_type='lamb')

The code is heavily borrowed from David Silva's pytorch-lr-finder library.

[Figure: Learning rate range test]
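The idea behind a learning rate range test can be sketched in a few lines: train briefly while increasing the learning rate exponentially, record the loss at each step, and pick a rate from the region where the loss falls fastest. The code below is a simplified illustration of that idea, not the code used by lr_find; the parameter names end_lr and num_iter and the steepest-drop selection rule are assumptions.

```python
def lr_range(start_lr, end_lr, num_iter):
    """Exponentially spaced learning rates from start_lr to end_lr (inclusive)."""
    if num_iter < 2:
        raise ValueError("num_iter must be at least 2")
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_iter - 1)) for i in range(num_iter)]


def suggest_lr(lrs, losses):
    """Return the learning rate at the start of the steepest loss drop."""
    if len(lrs) != len(losses) or len(losses) < 2:
        raise ValueError("need matching lrs and losses with at least 2 points")
    # The rates are evenly spaced on a log scale, so the loss difference between
    # consecutive steps is proportional to the slope with respect to log(lr).
    slopes = [losses[i + 1] - losses[i] for i in range(len(losses) - 1)]
    steepest = min(range(len(slopes)), key=slopes.__getitem__)
    return lrs[steepest]
```

In practice the recorded losses are usually smoothed before the slope is taken, and the run is stopped once the loss diverges.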

4. Train the model

learner.fit(epochs=6,
            lr=6e-5,
            validate=True,  # Evaluate the model after each epoch
            schedule_type="warmup_cosine",
            optimizer_type="lamb")

Fast-Bert now supports the LAMB optimizer. Because it trains faster, LAMB is the default optimizer. You can switch back to AdamW by setting optimizer_type to 'adamw'.
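The schedule_type="warmup_cosine" setting in the fit call above combines with the warmup_steps value given to the learner. As a rough illustration, the sketch below assumes the common shape for such a schedule, a linear ramp during warmup followed by a half-cosine decay to zero; the function name is hypothetical and the exact curve used by Fast-Bert may differ.

```python
import math


def warmup_cosine_factor(step, warmup_steps, total_steps):
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear ramp from 0 to 1 during warmup.
        return step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, clamped to at most 1.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # Half cosine wave: 1 at the end of warmup, 0 at total_steps.
    return 0.5 * (1.0 + math.cos(math.pi * progress))
```

With lr=6e-5 and warmup_steps=500, the effective rate at step 250 would be 6e-5 * warmup_cosine_factor(250, 500, total_steps), which is half of the base rate under these assumptions.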

5. Save trained model artifacts

learner.save_model()


Core symbols most depended-on inside this repo

symbol	called by	files
detach	27	fast_bert/summarisation/modeling_bertabs.py, fast_bert/optimization.py
get_lr	12	fast_bert/learner_cls.py
zero_grad	10	fast_bert/summarisation/modeling_bertabs.py
get_optimizer	8	fast_bert/learner_util.py
to_list	7	fast_bert/learner_qa.py
_create_examples	6	fast_bert/data_cls.py

Shape

Method 262

Function 103

Class 88

Route 10

Languages

Python 100%

Modules by API surface

module	symbols
fast_bert/summarisation/modeling_bertabs.py	64
fast_bert/learner_cls.py	36
fast_bert/learner_cls copy.py	36
fast_bert/data.py	34
fast_bert/data_cls.py	30
fast_bert/utils_squad_evaluate.py	25
fast_bert/optimization.py	23
fast_bert/data_abs.py	19
fast_bert/data_ner.py	17
fast_bert/modeling.py	16
fast_bert/learner_ner.py	15
fast_bert/data_qa.py	14

Dependencies from manifests, versioned

transformers 4.22. · 1×

For agents

$ claude mcp add fast-bert \
  -- python -m otcore.mcp_server <graph>
