Page Summary
- Logistic regression models output probabilities, which can be used directly or converted to binary categories.
- The sigmoid function ensures the output of logistic regression is always between 0 and 1, representing a probability.
- A logistic regression model uses a linear equation and the sigmoid function to calculate the probability of an event.
- The log-odds (z) represent the log of the ratio of probabilities for the two possible outcomes.
Many problems require a probability estimate as output. Logistic regression is an extremely efficient mechanism for calculating probabilities. Practically speaking, you can use the returned probability in either of the following two ways:
- Applied "as is." For example, if a spam-prediction model takes an email as input and outputs a value of 0.932, this implies a 93.2% probability that the email is spam.
- Converted to a binary category such as True or False, Spam or Not Spam.

This module focuses on using logistic regression model output as-is. In the Classification module, you'll learn how to convert this output into a binary category.
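As a quick illustration of the second use, converting a probability to a binary category is just a threshold comparison. This sketch uses a threshold of 0.5 purely for illustration; choosing a threshold is a topic in its own right:

```python
def to_binary_category(probability, threshold=0.5):
    """Convert a model's probability output to a binary category.

    The 0.5 threshold here is an illustrative default, not a
    recommendation; threshold selection is covered elsewhere.
    """
    return "Spam" if probability >= threshold else "Not Spam"

# The spam example from above: 0.932 exceeds the threshold.
print(to_binary_category(0.932))  # Spam
print(to_binary_category(0.104))  # Not Spam
```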
Sigmoid function
You might be wondering how a logistic regression model can ensure its output represents a probability, always outputting a value between 0 and 1. As it happens, there's a family of functions called logistic functions whose output has those same characteristics. The standard logistic function, also known as the sigmoid function (sigmoid means "s-shaped"), has the formula:
\[f(x) = \frac{1}{1 + e^{-x}}\]
where:
- f(x) is the output of the sigmoid function.
- e is Euler's number: a mathematical constant ≈ 2.71828.
- x is the input to the sigmoid function.
Figure 1 shows the corresponding graph of the sigmoid function.

As the input, x, increases, the output of the sigmoid function approaches but never reaches 1. Similarly, as the input decreases, the sigmoid function's output approaches but never reaches 0.
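The formula above translates directly into code. A minimal sketch, evaluating the sigmoid at a few points to show the asymptotic behavior:

```python
import math

def sigmoid(x):
    """Standard logistic (sigmoid) function: f(x) = 1 / (1 + e^(-x))."""
    return 1 / (1 + math.exp(-x))

# The output approaches, but never reaches, 0 and 1 at the extremes.
print(sigmoid(-10))  # close to 0
print(sigmoid(0))    # exactly 0.5
print(sigmoid(10))   # close to 1
```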
Transforming linear output using the sigmoid function
The following equation represents the linear component of a logistic regression model:
\[z = b + w_1x_1 + w_2x_2 + \ldots + w_Nx_N\]
where:
- z is the output of the linear equation, also called the log-odds.
- b is the bias.
- The w values are the model's learned weights.
- The x values are the feature values for a particular example.
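The linear component is a weighted sum of the features plus the bias. A small sketch, using made-up bias, weight, and feature values chosen only for illustration:

```python
def log_odds(bias, weights, features):
    """Linear component of logistic regression:
    z = b + w1*x1 + w2*x2 + ... + wN*xN
    """
    return bias + sum(w * x for w, x in zip(weights, features))

# Hypothetical values for a two-feature model:
z = log_odds(bias=0.5, weights=[1.5, -2.0], features=[2.0, 1.0])
print(z)  # 0.5 + (1.5)(2.0) + (-2.0)(1.0) = 1.5
```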
To obtain the logistic regression prediction, the z value is then passed to the sigmoid function, yielding a value (a probability) between 0 and 1:
\[y' = \frac{1}{1 + e^{-z}}\]
where:
- y' is the output of the logistic regression model.
- e is Euler's number: a mathematical constant ≈ 2.71828.
- z is the linear output (as calculated in the preceding equation).
Figure 2 illustrates how linear output is transformed to logistic regression output using these calculations.
In Figure 2, a linear equation becomes input to the sigmoid function, which bends the straight line into an s-shape. Notice that the linear equation can output very big or very small values of z, but the output of the sigmoid function, y', is always between 0 and 1, exclusive. For example, the yellow square on the left graph has a z value of –10, but the sigmoid function in the right graph maps that –10 into a y' value of 0.00004.
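Chaining the two steps gives the full prediction pipeline described above. A minimal sketch (the helper names are ours, not from the text):

```python
import math

def sigmoid(z):
    """Standard logistic function."""
    return 1 / (1 + math.exp(-z))

def predict(bias, weights, features):
    """Logistic regression prediction: the sigmoid of the linear output z."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Large-magnitude z values map to probabilities near, but never equal to,
# 0 or 1. For example, z = -10 (the yellow square in Figure 2):
print(sigmoid(-10))  # about 0.00004
```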
Exercise: Check your understanding
A logistic regression model with three features has the following bias and weights:
\[\begin{align} b &= 1 \\ w_1 &= 2 \\ w_2 &= -1 \\ w_3 &= 5 \end{align} \]
Given the following input values:
\[\begin{align} x_1 &= 0 \\ x_2 &= 10 \\ x_3 &= 2 \end{align} \]
Answer the following two questions:

1. What is the log-odds, z, for these input values?
2. What is the logistic regression output, y', for these input values?

For question 1, plugging the values into the linear equation gives the log-odds:

\(z = b + w_1x_1 + w_2x_2 + w_3x_3 = 1 + (2)(0) + (-1)(10) + (5)(2) = 1\)

For question 2, plugging that value for z into the sigmoid function:

\(y' = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{-1}} = \frac{1}{1 + 0.368} = \frac{1}{1.368} = 0.731\)
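The exercise's arithmetic can be checked directly in code:

```python
import math

# Bias, weights, and feature values from the exercise.
b = 1
w = [2, -1, 5]
x = [0, 10, 2]

# Log-odds: z = b + w1*x1 + w2*x2 + w3*x3
z = b + sum(wi * xi for wi, xi in zip(w, x))
print(z)  # 1

# Probability: y' = sigmoid(z)
y = 1 / (1 + math.exp(-z))
print(round(y, 3))  # 0.731
```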