Gradient descent, sometimes called the 'downhill' method, is one of the simplest algorithms for optimization problems. It searches for a local minimum of a differentiable function. In its basic form it applies to unconstrained problems; constrained problems require modifications such as projected gradient descent. The method requires computing the first derivative (gradient) of the objective function. Another drawback is convergence speed: as a first-order method, it may need many iterations to reach an accurate solution.
1. Define the objective function f(x).
2. Compute the gradient of f with respect to x, i.e. gradient(f,x) = [df/dx1, df/dx2, . . .], where x = {x1, x2, . . .}.
3. Perform the iterative loop below, given a learning rate s (constant). s must not be too large, or the solution diverges.
for i = 1:maxiter
    fo = f(x)                  % objective before the update
    x = x - gradient(f,x)*s    % update variable
    fn = f(x)                  % objective after the update
    if norm(fn-fo,2) < tol     % stop when f no longer decreases
        break
    end
end
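The loop above can be sketched in Python. The quadratic objective and starting point below are illustrative choices, not from the text; any differentiable f with its gradient could be substituted:

```python
import numpy as np

def gradient_descent(f, grad, x0, s=0.1, tol=1e-8, maxiter=1000):
    """Minimize f from x0 with a constant learning rate s."""
    x = np.asarray(x0, dtype=float)
    fo = f(x)
    for _ in range(maxiter):
        x = x - s * grad(x)      # update variable
        fn = f(x)
        if abs(fn - fo) < tol:   # stop when f no longer decreases
            break
        fo = fn
    return x

# Illustrative quadratic: f(x) = x1^2 + x2^2, gradient 2x, minimum at the origin
f = lambda x: np.sum(x**2)
grad = lambda x: 2 * x
x_min = gradient_descent(f, grad, [3.0, -4.0])
```

With s = 0.1 each step scales the error by 0.8, so the iterate shrinks geometrically toward the origin before the tolerance check stops the loop.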
EXAMPLE 1: Himmelblau's function
A test with Himmelblau's function is shown here. The objective function f and its partial derivatives (f,x and f,y) are given as

f(x,y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2
df/dx = 4x(x^2 + y - 11) + 2(x + y^2 - 7)
df/dy = 2(x^2 + y - 11) + 4y(x + y^2 - 7)

Himmelblau's function is nonconvex and contains four local minima. The solution found depends on the starting point, much as a stone rolled downhill comes to rest in the nearest valley.
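As a sketch, the same loop applied to Himmelblau's function converges to one of the four minima; the starting point and learning rate below are illustrative assumptions (the quartic terms make large steps diverge, so s is kept small):

```python
import numpy as np

def himmelblau(p):
    x, y = p
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

def himmelblau_grad(p):
    x, y = p
    dfdx = 4*x*(x**2 + y - 11) + 2*(x + y**2 - 7)
    dfdy = 2*(x**2 + y - 11) + 4*y*(x + y**2 - 7)
    return np.array([dfdx, dfdy])

p = np.array([1.0, 1.0])   # illustrative starting point
s, tol = 0.01, 1e-12
fo = himmelblau(p)
for _ in range(10000):
    p = p - s * himmelblau_grad(p)   # gradient step
    fn = himmelblau(p)
    if abs(fn - fo) < tol:           # stop when f no longer decreases
        break
    fo = fn
# From (1, 1) the iteration settles near the minimum at (3, 2)
```

Starting from a different corner of the plane would roll the iterate into a different one of the four minima, illustrating the starting-point dependence noted above.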
EXAMPLE 2: Profit target
A company plans to submit a product's price (x) to its customer. The cost (c) and profit target (a) are given. The objective function can be written in the form:
f = 1/2*(x-c-a)^2
where
gradient(f,x) = x - c - a
The gradient descent solution is as follows. Given f and gradient(f,x) = x - c - a:
for i = 1:maxiter
    fo = 1/2*(x-c-a)^2     % objective before the update
    x = x - (x-c-a)*s      % gradient step
    fn = 1/2*(x-c-a)^2     % objective after the update
    if norm(fn-fo,2) < tol
        break
    end
end
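A minimal numerical check of this example (the cost and profit values are made up for illustration): since f = 1/2*(x-c-a)^2 is minimized at x = c + a, the iteration should return the cost plus the profit target.

```python
c, a = 100.0, 20.0           # illustrative cost and profit target
s, tol = 0.1, 1e-12
x = 0.0                      # starting price
fo = 0.5 * (x - c - a)**2
for _ in range(1000):
    x = x - s * (x - c - a)  # gradient step: gradient(f,x) = x - c - a
    fn = 0.5 * (x - c - a)**2
    if abs(fn - fo) < tol:   # stop when f no longer decreases
        break
    fo = fn
# x converges to c + a = 120
```

Because the objective is a simple quadratic in one variable, each step scales the error (x - c - a) by the factor (1 - s), so convergence is geometric for any 0 < s < 2.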
Further reading