Difference between Sigmoid and Softmax function in deep learning

The softmax function can be understood as a generalization of the sigmoid function to more than two classes. It is usually used in the output layer of a neural network.


Following are some of the differences between the sigmoid and softmax functions:


1. The sigmoid function is used for the two-class (binary) classification problem, whereas the softmax function is used for the multi-class classification problem.


2. The softmax outputs always sum to 1; with sigmoid this is not necessarily the case. Sigmoid just maps each output independently to a value between 0 and 1. Softmax enforces that the sum of the probabilities of all the output classes equals one, so in order to increase the probability of a particular class, softmax must correspondingly decrease the probability of at least one of the other classes.


When you use softmax, you get a probability for each class (a joint distribution with a multinomial likelihood) whose sum is bound to be one. If you use sigmoid for multi-class classification instead, each output behaves like a marginal distribution with a Bernoulli likelihood.
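
To make this concrete, here is a minimal sketch using SciPy's expit (the sigmoid function) and softmax; the three class scores below are made-up values just for illustration:

```python
import numpy as np
from scipy.special import expit, softmax  # expit is the sigmoid function

scores = np.array([2.0, 1.0, 0.1])  # hypothetical raw scores (logits) for 3 classes

# Sigmoid treats each score independently (marginal, Bernoulli-style):
# each value lands in (0, 1), but the vector need not sum to 1.
print(np.round(expit(scores), 3))    # [0.881 0.731 0.525] -> sum ≈ 2.14

# Softmax couples the scores into one distribution (joint, multinomial-style):
# raising one probability necessarily lowers the others.
print(np.round(softmax(scores), 3))  # [0.659 0.242 0.099] -> sums to 1
```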


3. Formulas for Sigmoid and Softmax


Sigmoid function:

sigmoid(x) = 1 / (1 + e^(-x))


Softmax function:

softmax(x_i) = e^(x_i) / Σ_j e^(x_j),   for each class i = 1, …, K

Let me illustrate point 2 with an example. Let's say we have 6 inputs:


[1,2,3,4,5,6]


If we pass each of these inputs through the sigmoid function, we get the following output:


[0.731, 0.881, 0.953, 0.982, 0.993, 0.998]


The sum of the above outputs is about 5.54, which is greater than 1.


But in the case of softmax, the sum of the outputs is always 1. Let's see how: pass the same inputs to the softmax function, and we get the following output:


[0.004, 0.012, 0.032, 0.086, 0.233, 0.634], which (up to rounding) sums to 1.
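
Here is a minimal NumPy sketch of the two formulas from point 3, applied to the same six inputs, so you can reproduce the sums above:

```python
import numpy as np

def sigmoid(x):
    """Element-wise sigmoid: squashes each value independently into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    """Softmax: exponentiate and normalize so the outputs sum to 1.
    Subtracting the max first is a standard numerical-stability trick."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([1, 2, 3, 4, 5, 6], dtype=float)

print(np.round(sigmoid(x), 3))     # [0.731 0.881 0.953 0.982 0.993 0.998]
print(round(sigmoid(x).sum(), 2))  # 5.54 -> not a probability distribution

print(np.round(softmax(x), 3))     # [0.004 0.012 0.032 0.086 0.233 0.634]
print(round(softmax(x).sum(), 2))  # 1.0  -> a valid probability distribution
```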


4. Sigmoid was traditionally used as an activation function in hidden layers (though ReLU is generally preferred nowadays), while softmax is used in output layers.


A general rule of thumb is to use ReLU as the activation function in the hidden layers and softmax in the output layer of a neural network. For more information on activation functions, please see this post.
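
As a sketch of this rule of thumb, here is a minimal Keras model assuming a hypothetical 10-class classification problem with 20 input features (the layer sizes are arbitrary choices for illustration):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),                      # 20 input features (made up for this sketch)
    tf.keras.layers.Dense(64, activation="relu"),     # hidden layers: ReLU
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # output layer: softmax over 10 classes
])

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```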
