What are Stochastic Gradient Descent and Mini-Batch Gradient Descent in ML?
Here we are going to discuss the second and third types of gradient descent, known as Stochastic Gradient Descent and Mini-Batch Gradient Descent.

Stochastic Gradient Descent

The word 'stochastic' refers to a system or process linked with random probability. Hence, in Stochastic Gradient Descent, a few samples are selected randomly for each iteration instead of the whole dataset. In Gradient Descent, the term 'batch' denotes the total number of samples from the dataset used to calculate the gradient in each iteration. In typical gradient descent optimization, like Batch Gradient Descent, the batch is taken to be the whole dataset. Using the whole dataset is useful for reaching the minimum in a less noisy, less random manner, but it becomes a problem when the dataset gets really huge. Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties.
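To make the 'batch' idea concrete, here is a minimal NumPy sketch (the synthetic data, learning rate, and function names are illustrative assumptions, not from any particular library). The same update loop covers all three variants: a batch size of 1 gives Stochastic Gradient Descent, a value between 1 and the dataset size gives Mini-Batch Gradient Descent, and the full dataset size recovers Batch Gradient Descent.

import numpy as np

# Synthetic linear-regression data (sizes and names are illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))            # 1000 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

def gradient(w, X_batch, y_batch):
    """Gradient of mean squared error over the given batch."""
    residual = X_batch @ w - y_batch
    return 2.0 * X_batch.T @ residual / len(y_batch)

def gradient_descent(X, y, lr=0.01, epochs=10, batch_size=1):
    """batch_size=1 -> SGD; 1 < batch_size < len(y) -> mini-batch GD;
    batch_size=len(y) -> batch gradient descent."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(epochs):
        indices = rng.permutation(n)      # shuffle once per epoch
        for start in range(0, n, batch_size):
            batch = indices[start:start + batch_size]
            w -= lr * gradient(w, X[batch], y[batch])
    return w

print(gradient_descent(X, y, batch_size=1))     # stochastic gradient descent
print(gradient_descent(X, y, batch_size=32))    # mini-batch gradient descent
print(gradient_descent(X, y, batch_size=1000))  # batch gradient descent

Note how the only difference between the variants is how many samples contribute to each gradient estimate: smaller batches mean cheaper but noisier updates, which is exactly the trade-off the text describes for huge datasets.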