Section 19.6.5 noted that the output of the logistic function could be interpreted as a probability p assigned by the model to the proposition that f(x) = 1; the probability that f(x) = 0 is therefore 1 − p. Write down the probability p as a function of x and calculate the derivative of log p with respect to each weight w_i. Repeat the process for log(1 − p). These calculations give a learning rule for minimizing the negative log-likelihood loss function for a probabilistic hypothesis. Comment on any resemblance to other learning rules in the chapter.
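One way to sanity-check a hand derivation is to compare it against finite differences. The sketch below is not a reference solution. It assumes the standard logistic form p = 1 / (1 + exp(−w · x)), for which a hand calculation would be expected to give ∂ log p / ∂w_i = (1 − p) x_i and ∂ log(1 − p) / ∂w_i = −p x_i. All function names (`prob`, `grad_log_p`, `grad_log_one_minus_p`, `numeric_grad`, `nll_update`) and the learning rate `alpha` are hypothetical, chosen only for illustration.

```python
import math


def logistic(z):
    # Standard logistic function; assumes z is moderate in magnitude
    # (math.exp overflows for very large negative z).
    return 1.0 / (1.0 + math.exp(-z))


def prob(w, x):
    # p = Logistic(w . x): model probability that f(x) = 1.
    return logistic(sum(wi * xi for wi, xi in zip(w, x)))


def grad_log_p(w, x):
    # Candidate analytic gradient of log p: (1 - p) * x_i.
    p = prob(w, x)
    return [(1.0 - p) * xi for xi in x]


def grad_log_one_minus_p(w, x):
    # Candidate analytic gradient of log(1 - p): -p * x_i.
    p = prob(w, x)
    return [-p * xi for xi in x]


def numeric_grad(f, w, eps=1e-6):
    # Central finite differences of a scalar function f with respect to w.
    grads = []
    for i in range(len(w)):
        up = list(w)
        down = list(w)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2.0 * eps))
    return grads


def nll_update(w, x, y, alpha=0.1):
    # One gradient step on the negative log-likelihood for a single
    # example with label y in {0, 1}. Both cases combine into
    # w_i <- w_i + alpha * (y - p) * x_i.
    p = prob(w, x)
    return [wi + alpha * (y - p) * xi for wi, xi in zip(w, x)]
```

Under these assumptions the combined update has the same shape as the perceptron rule and as the squared-loss logistic regression rule, except that the extra factor p(1 − p) from the derivative of the logistic function does not appear.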