This blogpost is my understanding of the Transformers library after participating in and learning from the HuggingFace Course with a fastai bent, generously organized by Weights & Biases, with the heavy lifting done by some great folks: Wayde Gilliam, Sanyam Bhutani, Zach Mueller, Andrea Pessl & Morgan McGuire. Apologies if I have missed anyone, and thank you all for the hard work in bringing this to the masses.
This blogpost is also submitted as part of the blogpost competition held by the team, which was announced in Session 3.
Setting up a Working Environment:
For local installations, use conda/mamba and then utilize the fastchan channel to grab all the necessary fastai-related libraries.
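For example, a typical local install looks like this (a minimal sketch: fastchan is fast.ai's own conda channel, and mamba can be swapped in for conda if you prefer):

# install fastai and its dependencies from fast.ai's conda channel
conda install -c fastchan fastai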
While working through this course, I am using Google Colab Pro, and this needs a few settings before we get started.
The first thing we need is to set the runtime type. To do this, navigate to Colab -> Runtime and click on “Change Runtime Type”.
[Screenshot: the Runtime menu showing the “Change Runtime Type” option]
On the “Change Runtime Type” pop-up, select the options shown below (typically Hardware accelerator: GPU).
[Screenshot: the “Change Runtime Type” dialog with the selected options]
With that, your environment is all set.
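As a quick sanity check (a minimal sketch, assuming the default PyTorch install that ships with Colab), you can confirm that the GPU runtime is active:

import torch

# Confirm the GPU runtime is active before proceeding
print(torch.cuda.is_available())       # should print True on a GPU runtime
print(torch.cuda.get_device_name(0))   # e.g. a Tesla T4 or P100 on Colab Pro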
Before we import anything, we need to install the required libraries. Which installs you need depends on whether you are using fastai, 🤗 Transformers, the blurr API, the AdaptNLP API, fasthugs, etc.
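For example, a minimal install cell on Colab could look like this (a sketch: versions are unpinned, you would only install the libraries you actually plan to use, and ohmeow-blurr is the pip package name for blurr):

!pip install -q fastai transformers
!pip install -q ohmeow-blurr   # blurr: the fastai + Hugging Face integration
!pip install -q adaptnlp       # only if using the AdaptNLP API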
But the blurr API has lots of convenient functions like BLURR.get_hf_objects to get the Hugging Face objects, which are vital in the further steps of building an application, such as constructing DataBlocks.
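For instance, a single call can return the architecture name, config, tokenizer, and model together (a sketch: the checkpoint name is just an example, and the import path for BLURR can vary between blurr versions):

from transformers import AutoModelForSequenceClassification
from blurr.utils import BLURR

# One call fetches the HF architecture name, config, tokenizer, and model
hf_arch, hf_config, hf_tokenizer, hf_model = BLURR.get_hf_objects(
    "distilbert-base-uncased",
    model_cls=AutoModelForSequenceClassification,
)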
In fastai, tokenization can be done using the following code snippet:
from fastai.text.all import WordTokenizer, Tokenizer

spacy = WordTokenizer()   # word tokenizer backed by spaCy
tkn = Tokenizer(spacy)    # fastai Tokenizer that wraps it and applies the default rules
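Applying the tokenizer to a piece of text then yields fastai's tokens, including special tokens like xxbos and xxmaj (the exact output below is indicative):

txt = "Welcome to the HuggingFace course!"
print(tkn(txt))   # e.g. ['xxbos', 'xxmaj', 'welcome', 'to', 'the', ...]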
But when integrating the 🤗 API, we prefer using its config, and that can be done using the AutoConfig class like below:
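A minimal sketch of what that looks like (the checkpoint name here is just an example):

from transformers import AutoConfig

# Load the configuration for a pretrained checkpoint without loading its weights
config = AutoConfig.from_pretrained("bert-base-uncased")
print(config.hidden_size, config.num_attention_heads)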