Checkpoints capture the exact values of all parameters used by a model. Checkpoints do not contain any description of the computation the model defines, so they are typically only useful when source code that can make use of the saved parameter values is available.
Note: Please read about the AutoTokenizer example in this earlier week-1 blogpost: [Hugging Face + FastAI - Session 1 - Ravi Chandra Veeramachaneni](https://ravichandraveeramachaneni.github.io/posts/bp7/)

* To create a model from a pretrained checkpoint we can use the below code snippet:

```
model = AutoModel.from_pretrained(checkpoint)
```

* The AutoModel's output is a feature vector for each input token, called the "hidden states" or "features". These features are simply what the model has learned from the input tokens.
* These features are then passed to a specific part of the model called the "head", and this head differs based on the task, e.g. summarization, text generation, etc.
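As a rough sketch of that flow (the checkpoint name and sentence here are illustrative assumptions, not from the post), the per-token hidden states from AutoModel and the logits from a model with a task-specific head can be compared like this:

```
from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer

# Illustrative checkpoint; any Hugging Face model id works the same way.
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
inputs = tokenizer("I loved this movie!", return_tensors="pt")

# Bare AutoModel: one hidden-state vector per input token.
base_model = AutoModel.from_pretrained(checkpoint)
hidden_states = base_model(**inputs).last_hidden_state
print(hidden_states.shape)  # (batch_size, sequence_length, hidden_size)

# The same checkpoint with a classification head: logits instead of raw features.
clf_model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
logits = clf_model(**inputs).logits
print(logits.shape)  # (batch_size, num_labels)
```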
The softmax function takes the model's predictions (logits) as input and outputs them as probabilities between 0 and 1 that all sum up to one.
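A minimal sketch of that behaviour in PyTorch (the logit values are made up for illustration):

```
import torch

logits = torch.tensor([[2.0, -1.0, 0.5]])  # made-up model predictions
probs = torch.softmax(logits, dim=-1)

print(probs)        # every value lies between 0 and 1
print(probs.sum())  # ~1.0 — the probabilities sum to one
```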
Cross-entropy loss applies the negative log likelihood to probabilities. (Or, in simple terms,) cross-entropy loss is the combination of taking the log of the probabilities from the softmax function and applying the negative log likelihood to them.
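A short sketch of that equivalence in PyTorch (the logits and target are invented for illustration): applying negative log likelihood to the log of the softmax probabilities gives the same value as the built-in cross-entropy loss.

```
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, -1.0, 0.5]])  # made-up predictions
target = torch.tensor([0])                 # made-up correct class

# Negative log likelihood applied to the log of the softmax probabilities...
log_probs = torch.log_softmax(logits, dim=-1)
nll = F.nll_loss(log_probs, target)

# ...is exactly what cross_entropy computes in a single step.
ce = F.cross_entropy(logits, target)

print(nll, ce)  # the two values match
```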
* The tokenizer is loaded with the `AutoTokenizer` class, and it utilizes the `from_pretrained` method to do that.
* The model is loaded with the `AutoModel` class, which "can automatically guess the appropriate model architecture for your checkpoint, and then instantiates a model with this architecture." It likewise has a `from_pretrained` method that is used to take the checkpoint and output the model.
* To save a model we can use the `save_pretrained` method. This method will output two files: a config.json, which has metadata such as the transformers version and checkpoint information, and a pytorch_model.bin file, which contains our model weights (see the sketch after this list).
* We can `truncate` sequences, and we can also specify `max_length` to limit the sequence length.
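As a minimal sketch (the checkpoint name, directory name, and sentence below are illustrative assumptions), saving and reloading with `save_pretrained` looks like this:

```
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")  # illustrative checkpoint

# Writes config.json (metadata) and pytorch_model.bin (weights) into the folder;
# newer transformers versions may write model.safetensors instead of the .bin file.
model.save_pretrained("my-local-model")

# The saved folder can later be passed back to from_pretrained.
reloaded = AutoModel.from_pretrained("my-local-model")
```

And truncation with a maximum sequence length can be requested from the tokenizer like this:

```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative checkpoint

# truncation=True cuts off sequences that would exceed max_length.
encoded = tokenizer(
    "A very long sentence that would otherwise exceed the limit of the model",
    truncation=True,
    max_length=8,
)
print(len(encoded["input_ids"]))  # at most 8 tokens, including the special tokens
```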
The attention layers of the transformer model contextualize each token.
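A rough way to see that contextualization (the checkpoint, sentences, and helper function below are illustrative assumptions): the same word receives different hidden states depending on the tokens around it.

```
import torch
from transformers import AutoModel, AutoTokenizer

checkpoint = "bert-base-uncased"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

def embedding_of(sentence, word):
    # Hidden state of the first occurrence of `word` in `sentence`.
    inputs = tokenizer(sentence, return_tensors="pt")
    idx = inputs["input_ids"][0].tolist().index(tokenizer.convert_tokens_to_ids(word))
    with torch.no_grad():
        hidden_states = model(**inputs).last_hidden_state
    return hidden_states[0, idx]

a = embedding_of("I deposited money at the bank.", "bank")
b = embedding_of("We sat on the bank of the river.", "bank")

# The two vectors differ because attention mixes in the surrounding context.
print(torch.cosine_similarity(a, b, dim=0))
```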