Publish to my blog (weekly)

    • The batch size is a hyperparameter that defines the number of samples to work through before updating the internal model parameters.
    • When the batch size is one sample, the learning algorithm is called stochastic gradient descent.
    • When the batch size is more than one sample and less than the size of the training dataset, the learning algorithm is called mini-batch gradient descent.
      • Batch Gradient Descent. Batch Size = Size of Training Set
      • Stochastic Gradient Descent. Batch Size = 1
      • Mini-Batch Gradient Descent. 1 < Batch Size < Size of Training Set
    • Finally, let’s make this concrete with a small example.


      Assume you have a dataset with 200 samples (rows of data) and you choose a batch size of 5 and 1,000 epochs.


      This means that the dataset will be divided into 40 batches, each with five samples. The model weights will be updated after each batch of five samples.


      This also means that one epoch will involve 40 batches or 40 updates to the model.


      With 1,000 epochs, the model will be exposed to, or pass through, the whole dataset 1,000 times. That is a total of 40,000 batches (weight updates) during the entire training process.
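
      To make that arithmetic checkable, here is a minimal sketch of a mini-batch training loop in Python with NumPy. Only the sample count (200), the batch size (5), and the epoch count (1,000) come from the example above; the linear model, learning rate, and random data are assumptions added purely for illustration.

      import numpy as np

      # Figures from the example above; the model itself is an illustrative assumption.
      n_samples, n_features = 200, 3
      batch_size = 5
      epochs = 1000
      learning_rate = 0.01

      rng = np.random.default_rng(0)
      X = rng.normal(size=(n_samples, n_features))
      y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=n_samples)

      weights = np.zeros(n_features)
      total_updates = 0

      for epoch in range(epochs):
          order = rng.permutation(n_samples)             # reshuffle the samples each epoch
          for start in range(0, n_samples, batch_size):  # 200 / 5 = 40 batches per epoch
              idx = order[start:start + batch_size]
              error = X[idx] @ weights - y[idx]
              grad = X[idx].T @ error / batch_size       # squared-error gradient averaged over the batch (constant factor dropped)
              weights -= learning_rate * grad            # one weight update per batch
              total_updates += 1

      print(total_updates)  # 40 updates per epoch x 1,000 epochs = 40,000

      Setting batch_size to 1 turns the same loop into stochastic gradient descent, and setting it to n_samples (200) turns it into batch gradient descent, with correspondingly more or fewer updates per epoch.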

