Deep Learning for Coders / Chapter-1 / Week-2

Published: June 16, 2021
The following notes are from Week 2 of the fastai/fastbook reading session hosted by Aman Arora (Weights & Biases).
  • Important links
    • Join the #Fastbook channel for discussions and to ask questions.
    • Check out the forums to explore existing topics or ask about anything you are stuck on.
    • Weights & Biases forums, where weekly information is posted on the Forum tab.
    • FastBook-Week-2 - The link for Week 2 on the wandb forums.
    • All my learnings from the session are also posted in my blog and in more detailed fashion at
  • KeyPoints - Chapter 1 (Your Deep Learning Journey) - Continued
    • The functional form of the model is called its architecture.
    • “Weights” and “parameters” are interchangeable terms.
    • The model's outputs are called “predictions”. They are calculated from the independent variable, which is the data without the labels.
    • The measure of how well the model is performing is called the “loss”.
    • A dataset has two parts: the images and the labels. The labels are the categories or classes, such as cat or dog.
    • The labels are also called “targets” or the “dependent variable”. The loss depends on the labels as well as on the predictions.
      • Some limitations: a model cannot exist without data; it learns only from the input data, and its output is “predictions”.
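The terms above can be sketched with a toy example. This is only an illustration under stated assumptions: the straight-line model, the function names `architecture` and `mse_loss`, and the sample numbers are hypothetical and stand in for a real neural network.

```python
# Toy illustration of architecture, parameters, predictions, and loss.
# The linear form and all numbers here are assumptions for illustration only.

def architecture(x, weights):
    """Functional form of the model: a straight line w * x + b."""
    w, b = weights
    return w * x + b

def mse_loss(predictions, targets):
    """Mean squared error: depends on the predictions AND the labels."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

inputs = [1.0, 2.0, 3.0]    # independent variable (data without labels)
targets = [2.0, 4.0, 6.0]   # dependent variable (labels / targets)
weights = (1.5, 0.0)        # parameters, also called weights

# Predictions are computed from the inputs only; the loss needs the labels too.
predictions = [architecture(x, weights) for x in inputs]
loss = mse_loss(predictions, targets)
print(predictions, loss)
```

Changing `weights` changes the predictions and therefore the loss, while the architecture itself stays the same.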
    • Some of the key functions and points learned from the fastai library:
      • untar_data() - Takes the URL of a dataset, downloads it, extracts it, and returns the path to the extracted data.
      • fastai names its classes by data type: image classes start with Image, like ImageDataLoaders, and text classes start with Text, like TextDataLoaders.
      • fastai has two types of transforms: item transforms (item_tfms) and batch transforms (batch_tfms). An item transform operates on each individual item (input image), for example resizing every image to the same size. A batch transform operates on a whole batch of items at once, on the GPU.
      • An “epoch” is one complete pass through all the images in the training set. The process of the model learning over the epochs is called “model fitting”.
      • First Model as described in chapter 1
        from fastai.vision.all import *
        path = untar_data(URLs.PETS)/'images'

        # Cat image filenames in this dataset start with an uppercase letter.
        def is_cat(x): return x[0].isupper()

        dls = ImageDataLoaders.from_name_func(
            path, get_image_files(path), valid_pct=0.2, seed=42,
            label_func=is_cat, item_tfms=Resize(224))

        learn = cnn_learner(dls, resnet34, metrics=error_rate)
        learn.fine_tune(1)
      • ImageDataLoaders.from_name_func (understanding each parameter)
        • label_func: Takes a function that returns the label for each item, here a yes/no answer.
          • Example: def is_cat(x): return x[0].isupper(). This is specific to this cats vs dogs dataset, where cat image filenames start with an uppercase letter and dog image filenames start with a lowercase letter.
        • item_tfms: Item transforms. Resize(224) resizes each image (item) to 224x224 pixels, whatever its original size.
        • valid_pct: The fraction of the data held out for validation; valid_pct=0.2 holds out 20%, and the rest is used for training. A validation set is critical for testing, on data the model has not seen, what it learned during training.
        All Image Credits - AmanArora - FastAI reading group - week 2
        • Based on the predictions on the validation set (hold-out set), we can measure the performance of the model and detect overfitting.
        • seed: Sets the random seed so that we get the same validation set every time we run the model.
        • cnn_learner function: A convolutional neural network (CNN) is a type of model well suited to vision tasks. The function takes the data and an architecture, for example resnet34 (a 34-layer network), and creates a learner that can be trained to classify images.
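The roles of label_func, valid_pct, and seed described above can be sketched in plain Python. This is a stand-in under stated assumptions, not fastai's own splitting code: the helper `split_train_valid` and the filenames are hypothetical, and only `is_cat` comes from the notes.

```python
import random

def is_cat(filename):
    """Label function: cat image filenames start with an uppercase letter."""
    return filename[0].isupper()

def split_train_valid(items, valid_pct=0.2, seed=42):
    """Hypothetical helper: shuffle with a fixed seed, hold out valid_pct of the items."""
    rng = random.Random(seed)   # local generator, so the split is reproducible
    shuffled = list(items)      # copy, leaving the caller's list untouched
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

# Made-up filenames in the style of the Pets dataset.
files = ["Birman_1.jpg", "beagle_7.jpg", "Bengal_3.jpg", "pug_12.jpg",
         "Sphynx_9.jpg", "boxer_4.jpg", "Ragdoll_2.jpg", "shiba_inu_5.jpg",
         "Persian_8.jpg", "samoyed_6.jpg"]

train, valid = split_train_valid(files, valid_pct=0.2, seed=42)
labels = {name: is_cat(name) for name in files}
print(len(train), len(valid))
```

Calling `split_train_valid` again with the same seed returns the same validation set, which is what makes results comparable between runs.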
      • Overfitting - Occurs when the model fits the training data too exactly. If the model tries to fit every point in the training data too closely, it becomes “overfitted” and is unable to generalize well to new data.
      • Metrics - A metric is a function that measures how good the model's predictions are by comparing them with the actual labels. Accuracy and error rate (used in the model above) are examples.
      • Loss vs Metrics - The loss is used by the training process to improve the model: the weights are updated with methods like SGD, using gradients computed by back-propagation. A metric is a measure for us to judge the performance of the model.
      • Transfer Learning - Using a model pre-trained on one dataset, such as ImageNet, for a different task. ImageNet is a dataset of over a million images widely used for vision tasks.
      • Pre-trained weights - We take the weights from the pre-trained model and use them for our task. Only the last layer is replaced with a new head sized for our categories, like cat and dog. The replaced layer originally predicted 1000 categories; the new head predicts just 2.
      • Fine Tuning - Taking a model trained on a general dataset and then training it further on our own dataset. The pre-trained weights (the stem) are retained for all layers except the last, which is replaced by a new head and trained. It is a transfer learning technique.
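Returning to the loss vs metrics distinction in the list above, a small sketch may help. It assumes a binary cat/dog task where the model outputs the probability of "cat"; the function names and the probabilities are made up for illustration and are not fastai's implementations.

```python
import math

def cross_entropy(probs_of_cat, labels):
    """Loss: average negative log-probability assigned to the true class."""
    total = 0.0
    for p, is_cat in zip(probs_of_cat, labels):
        p_true = p if is_cat else 1.0 - p
        total += -math.log(p_true)
    return total / len(labels)

def error_rate(probs_of_cat, labels, threshold=0.5):
    """Metric: fraction of items whose thresholded prediction is wrong."""
    wrong = sum((p >= threshold) != is_cat
                for p, is_cat in zip(probs_of_cat, labels))
    return wrong / len(labels)

labels = [True, True, False, False]
before = [0.55, 0.60, 0.45, 0.40]   # barely correct predictions
after = [0.90, 0.95, 0.10, 0.05]    # confident, correct predictions

# The metric is identical for both sets of predictions, but the loss is
# lower for the confident ones, so the loss still gives training a signal.
print(error_rate(before, labels), error_rate(after, labels))
print(cross_entropy(before, labels), cross_entropy(after, labels))
```

The loss changes smoothly as predictions improve, which is what gradient-based methods like SGD need, while the metric only changes when a prediction crosses the decision threshold.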
    • Fine tuning in fastai has two passes. In the first pass the pre-trained stem is frozen and only the head (the last layer, specific to our data) is trained. In the second pass the stem and the head trained in the first pass are trained together, but at different speeds: the head is trained faster than the stem.
    • learn.fine_tune() - The argument is the number of epochs. With epochs=1 and the default settings, each of the two passes runs exactly once. More epochs give the model more chances to learn, but too many can lead to overfitting.
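The two passes described above can be sketched on a toy model with one "stem" weight and one "head" weight. This is an illustration under stated assumptions: it uses plain gradient descent with constant learning rates and does not reproduce fastai's actual schedule inside learn.fine_tune(); all names and numbers are hypothetical.

```python
def predict(x, stem, head):
    """Toy two-layer model: the head is applied on top of the stem."""
    return head * (stem * x)

def loss_and_grads(data, stem, head):
    """Mean squared error and its gradients with respect to stem and head."""
    n = len(data)
    loss = g_stem = g_head = 0.0
    for x, y in data:
        err = predict(x, stem, head) - y
        loss += err ** 2 / n
        g_stem += 2 * err * head * x / n
        g_head += 2 * err * stem * x / n
    return loss, g_stem, g_head

def train(data, stem, head, epochs, lr_stem, lr_head):
    """Gradient descent with a separate learning rate per weight."""
    for _ in range(epochs):
        _, g_stem, g_head = loss_and_grads(data, stem, head)
        stem -= lr_stem * g_stem
        head -= lr_head * g_head
    return stem, head

data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]   # target relationship: y = 3x
stem0, head0 = 2.0, 0.5                        # "pre-trained" stem, fresh head
loss0 = loss_and_grads(data, stem0, head0)[0]

# Pass 1: the stem is frozen (learning rate 0), only the head is trained.
stem1, head1 = train(data, stem0, head0, epochs=1, lr_stem=0.0, lr_head=0.02)
loss1 = loss_and_grads(data, stem1, head1)[0]

# Pass 2: both are trained, the stem with a much smaller learning rate.
stem2, head2 = train(data, stem1, head1, epochs=5, lr_stem=0.0002, lr_head=0.02)
loss2 = loss_and_grads(data, stem2, head2)[0]
print(loss0, loss1, loss2)
```

In this sketch the stem is unchanged after the first pass and moves only slightly in the second, while the head does most of the adapting.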
      • Model Learning - Demystifying Model Learning
        • Each layer of the model (for example, each of the 34 layers in resnet34) learns different kinds of input patterns, and by the last layer the model can recognize what the task requires.
        • A good illustration of what a model learns at each layer is given in the paper by Matthew D. Zeiler & Rob Fergus.
        • A key idea in computer vision is to use pre-trained weights: having learned from lots of data, the stem already knows about shapes, sizes, and other visual features, and the last layer, trained on our data, learns to recognize cats vs dogs. This also saves a lot of compute.
        • Image recognition can also handle non-image tasks by converting data such as graphs and charts into images and then identifying patterns in those images.
  • KeyPoints - Chapter 2 (From Model to Production)
    • Deploying a model into production requires data, a trained model, APIs around the model so users can access it, a good UI/UX if the model is served directly in the browser, good infrastructure, sound coding practices, and so on.
    • There are 4 main categories of work in a deep learning project before production.
      1. Data Preparation
      2. Labelling data
      3. Model Training
      4.
    • A better approach is to allocate the same amount of time to each task.