  1. optimization - Batch gradient descent versus stochastic gradient ...

    Additionally, batch gradient descent, given an annealed learning rate, will eventually find the minimum located in its basin of attraction. Stochastic gradient descent (SGD) computes the gradient using a …
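
    A rough numpy sketch of the contrast described above: batch gradient descent takes one step per pass over the whole data set, while SGD takes one step per example. The toy data, learning rate, and iteration counts are illustrative assumptions, not something from the answer itself.

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 3))                 # toy design matrix
        true_w = np.array([1.0, -2.0, 0.5])
        y = X @ true_w + 0.1 * rng.normal(size=100)

        w_batch, w_sgd, lr = np.zeros(3), np.zeros(3), 0.01

        for epoch in range(50):
            # Batch GD: one update per pass, gradient averaged over the full data set
            grad = X.T @ (X @ w_batch - y) / len(y)
            w_batch -= lr * grad

            # SGD: one update per training example, gradient from that single sample
            for i in rng.permutation(len(y)):
                g_i = X[i] * (X[i] @ w_sgd - y[i])
                w_sgd -= lr * g_i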

  2. Gradient Descent with constraints (lagrange multipliers)

    Since the gradient descent algorithm is designed to find local minima, it fails to converge when you give it a problem with constraints. There are typically three solutions: Use a numerical method which is …
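
    One standard workaround this answer alludes to is projected gradient descent: take an ordinary gradient step, then project the result back onto the feasible set. The quadratic objective and the L2-ball constraint below are made-up examples for illustration, not taken from the original question.

        import numpy as np

        def project_to_ball(w, radius=1.0):
            # Project w onto the constraint set ||w||_2 <= radius
            norm = np.linalg.norm(w)
            return w if norm <= radius else w * (radius / norm)

        def f_grad(w):
            # Gradient of a toy objective f(w) = ||w - c||^2 with c = (2, 2)
            return 2.0 * (w - np.array([2.0, 2.0]))

        w, lr = np.zeros(2), 0.1
        for _ in range(200):
            w = project_to_ball(w - lr * f_grad(w))   # gradient step, then project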

  3. gradient descent using python and numpy - Stack Overflow

    Jul 22, 2013 · Below you can find my implementation of gradient descent for a linear regression problem. First, you calculate the gradient as X.T * (X * w - y) / N and update your current theta with this …
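
    A self-contained sketch of the update described in the snippet, assuming a numpy design matrix X of shape (N, d) and a target vector y; the learning rate and iteration count are arbitrary placeholders.

        import numpy as np

        def gradient_descent(X, y, lr=0.01, n_iters=1000):
            # X: (N, d) design matrix, y: (N,) targets; returns fitted weights and cost history
            N, d = X.shape
            w = np.zeros(d)
            costs = []
            for _ in range(n_iters):
                residual = X @ w - y
                grad = X.T @ residual / N          # the X.T * (X * w - y) / N gradient from the post
                w -= lr * grad
                costs.append(residual @ residual / (2 * N))
            return w, costs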

  4. machine learning - why gradient descent when we can solve linear ...

    Aug 12, 2013 · What is the benefit of using gradient descent in the linear regression setting? It looks like we can solve the problem (finding theta0-n that minimize the cost function) with an analytical method, so …
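
    For reference, the analytical method the question has in mind is the normal equation; a minimal numpy sketch, with toy data standing in for the real problem.

        import numpy as np

        rng = np.random.default_rng(1)
        X = rng.normal(size=(200, 5))
        y = X @ rng.normal(size=5) + 0.05 * rng.normal(size=200)

        # Normal equation: solve (X^T X) theta = X^T y in one shot.
        # Roughly O(N d^2 + d^3) work, so it is attractive for small d but becomes
        # expensive (or numerically awkward) as the number of features grows.
        theta = np.linalg.solve(X.T @ X, X.T @ y)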

  5. machine learning - Why use gradient descent for linear regression, …

    May 11, 2017 · The main reason why gradient descent is used for linear regression is computational complexity: it's computationally cheaper (faster) to find the solution using gradient descent in …

  6. python - How to Implement Full Batch Gradient Descent with Nesterov ...

    Mar 4, 2024 · I'm working on a machine learning project in PyTorch where I need to optimize a model using the full batch gradient descent method. The key requirement is that the optimizer should use all …
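
    A minimal sketch of one way to get this behavior with the standard torch.optim.SGD optimizer: compute the loss over the entire training set on every step and enable Nesterov momentum. The toy data, model, learning rate, and momentum value below are assumptions for illustration, not from the question.

        import torch

        # Placeholder data and model; substitute your own dataset and network.
        X = torch.randn(500, 10)
        y = X @ torch.randn(10, 1) + 0.1 * torch.randn(500, 1)
        model = torch.nn.Linear(10, 1)
        loss_fn = torch.nn.MSELoss()

        # torch.optim.SGD acts as full-batch gradient descent when every step is
        # computed on the whole training set; nesterov=True requires momentum > 0.
        opt = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9, nesterov=True)

        for _ in range(200):
            opt.zero_grad()
            loss = loss_fn(model(X), y)        # loss over the entire data set, no mini-batches
            loss.backward()
            opt.step()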

  7. Why use gradient descent with neural networks?

    Jul 8, 2017 · When training a neural network using the back-propagation algorithm, the gradient descent method is used to determine the weight updates. My question is: Rather than using gradient descent …

  8. Can someone explain to me the difference between a cost function and ...

    So in gradient descent, you follow the negative of the gradient to the point where the cost is a minimum. If someone is talking about gradient descent in a machine learning context, the cost function is …
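
    A tiny sketch of the distinction: the cost function is the quantity being minimized, and gradient descent is the procedure that minimizes it by stepping against the gradient. The one-dimensional quadratic cost here is a made-up example.

        import numpy as np

        def cost(w):
            # The cost function: a number measuring how bad the current parameter is
            return (w - 3.0) ** 2

        def cost_grad(w):
            # Its derivative with respect to the parameter
            return 2.0 * (w - 3.0)

        w, lr = 0.0, 0.1
        for _ in range(100):
            w -= lr * cost_grad(w)     # gradient descent: step against the gradient of the cost
        # w is now close to 3.0, the minimizer of the cost function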

  9. What is 'mini-batch' in deep learning? - Stack Overflow

    Oct 7, 2019 · Both are approaches to gradient descent. But in batch gradient descent you process the entire training set in one iteration, whereas in mini-batch gradient descent you process a small …
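
    A numpy sketch of the mini-batch variant described above: each update uses only a small random slice of the training set rather than all of it. The batch size, learning rate, and toy data are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(2)
        X = rng.normal(size=(1000, 4))
        y = X @ np.array([0.5, -1.0, 2.0, 0.0]) + 0.1 * rng.normal(size=1000)

        w = np.zeros(4)
        lr, batch_size = 0.05, 32

        for epoch in range(20):
            order = rng.permutation(len(y))                   # shuffle once per epoch
            for start in range(0, len(y), batch_size):
                idx = order[start:start + batch_size]         # one small chunk of the data
                Xb, yb = X[idx], y[idx]
                grad = Xb.T @ (Xb @ w - yb) / len(idx)        # gradient on the mini-batch only
                w -= lr * grad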

  10. python 3.x - Adam Optimizer vs Gradient Descent - Stack Overflow

    Aug 25, 2018 · It is an adaptive method, unlike gradient descent, which maintains a single learning rate for all weight updates that does not change during training. Adam has the advantage …
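
    A bare-bones sketch of a single Adam update, to show the per-parameter adaptation the answer refers to. It follows the standard Adam formulas (Kingma & Ba) with the commonly used hyperparameter defaults, but the function name and interface are just for illustration.

        import numpy as np

        def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
            # Running estimates of the gradient's first and second moments, kept per parameter
            m = b1 * m + (1 - b1) * grad
            v = b2 * v + (1 - b2) * grad ** 2
            m_hat = m / (1 - b1 ** t)          # bias correction for the early iterations
            v_hat = v / (1 - b2 ** t)
            # The effective step size lr / (sqrt(v_hat) + eps) differs per weight,
            # unlike plain gradient descent's single fixed learning rate.
            w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
            return w, m, v

        # Example usage on a toy quadratic:
        w, m, v = np.array([5.0, -3.0]), np.zeros(2), np.zeros(2)
        for t in range(1, 1001):
            w, m, v = adam_step(w, 2 * w, m, v, t)     # 2 * w is the gradient of ||w||^2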