Logistic regression: Calculating a probability with the sigmoid function

  • Logistic regression models output probabilities, which can be used directly or converted to binary categories.

  • The sigmoid function ensures the output of logistic regression is always between 0 and 1, representing a probability.

  • A logistic regression model uses a linear equation and the sigmoid function to calculate the probability of an event.

  • The log-odds (z) represent the log of the ratio of probabilities for the two possible outcomes.

Many problems require a probability estimate as output. Logistic regression is an extremely efficient mechanism for calculating probabilities. Practically speaking, you can use the returned probability in either of the following two ways:

  • Applied "as is." For example, if a spam-prediction model takes an email as input and outputs a value of 0.932, this implies a 93.2% probability that the email is spam.

  • Converted to a binary category such as True or False, Spam or Not Spam.

This module focuses on using logistic regression model output as is. In the Classification module, you'll learn how to convert this output into a binary category.

Sigmoid function

You might be wondering how a logistic regression model can ensure its output represents a probability, always outputting a value between 0 and 1. As it happens, there's a family of functions called logistic functions whose output has those same characteristics. The standard logistic function, also known as the sigmoid function (sigmoid means "s-shaped"), has the formula:

\[f(x) = \frac{1}{1 + e^{-x}}\]

where:

  • f(x) is the output of the sigmoid function.
  • e is Euler's number: a mathematical constant ≈ 2.71828.
  • x is the input to the sigmoid function.

Figure 1 shows the corresponding graph of the sigmoid function.

Figure 1. Graph of the sigmoid function: an s-shaped curve plotted on the Cartesian coordinate plane. The curve approaches 0 as x decreases toward negative infinity, and 1 as x increases toward infinity.

As the input, x, increases, the output of the sigmoid function approaches but never reaches 1. Similarly, as the input decreases, the sigmoid function's output approaches but never reaches 0.
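The behavior described above can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function name `sigmoid` and the sample inputs are illustrative choices, not something the module prescribes.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic (sigmoid) function: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# Evaluate the function at a few inputs on both sides of 0.
for x in (-10, -1, 0, 1, 10):
    print(f"sigmoid({x:>3}) = {sigmoid(x):.5f}")
```

Every printed value lies strictly between 0 and 1, and an input of 0 gives exactly 0.5. Note that this simple form is only a sketch: `math.exp` raises `OverflowError` for very large negative inputs (below roughly -709), so production code typically uses a numerically stable variant.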

Transforming linear output using the sigmoid function

The following equation represents the linear component of a logistic regression model:

\[z = b + w_1x_1 + w_2x_2 + \ldots + w_Nx_N\]

where:

  • z is the output of the linear equation, also called the log-odds.
  • b is the bias.
  • The w values are the model's learned weights.
  • The x values are the feature values for a particular example.

To obtain the logistic regression prediction, the z value is then passed to the sigmoid function, yielding a value (a probability) between 0 and 1:

\[y' = \frac{1}{1 + e^{-z}}\]

where:

  • y' is the output of the logistic regression model.
  • e is Euler's number: a mathematical constant ≈ 2.71828.
  • z is the linear output (as calculated in the preceding equation).
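
One way to see why z is called the log-odds is to solve the preceding equation for z. The steps below are a short derivation that uses only the symbols already defined:

\[\begin{align} y' &= \frac{1}{1 + e^{-z}} \\ 1 - y' &= \frac{e^{-z}}{1 + e^{-z}} \\ \frac{y'}{1 - y'} &= e^{z} \\ z &= \ln\left(\frac{y'}{1 - y'}\right) \end{align} \]

The ratio y' / (1 - y') is the odds: the probability of the event divided by the probability of its complement. z is the natural logarithm of that ratio.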

Figure 2 illustrates how linear output is transformed to logistic regression output using these calculations.

Figure 2. Left: graph of the linear function z = 2x + 5, with the points (-7.5, -10), (-2.5, 0), and (0, 5) highlighted. Right: sigmoid curve with the corresponding transformed points (-10, 0.00004), (0, 0.5), and (5, 0.9933) highlighted.

In Figure 2, the output of a linear equation becomes the input to the sigmoid function, which bends the straight line into an s-shape. Notice that the linear equation can output very large positive or negative values of z, but the output of the sigmoid function, y', is always between 0 and 1, exclusive. For example, the yellow square on the left graph has a z value of -10, but the sigmoid function in the right graph maps that -10 into a y' value of 0.00004.
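
The three highlighted points in Figure 2 can be reproduced with a few lines of Python. This is a sketch under the assumption that the left graph is exactly z = 2x + 5, as the caption states; the helper names `linear` and `sigmoid` are illustrative.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(x: float) -> float:
    """Linear function from Figure 2: z = 2x + 5."""
    return 2.0 * x + 5.0


# x-coordinates of the three highlighted points on the left graph.
points = [-7.5, -2.5, 0.0]

# For each x, compute the linear output z and the transformed output y'.
results = [(x, linear(x), sigmoid(linear(x))) for x in points]

for x, z, y_prime in results:
    print(f"x = {x:5.1f}   z = {z:6.1f}   y' = {y_prime:.5f}")
```

The z values come out as -10, 0, and 5, matching the figure. The sigmoid of -10 is approximately 0.0000454, which the figure labels as 0.00004.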

Exercise: Check your understanding

A logistic regression model with three features has the following bias and weights:

\[\begin{align} b &= 1 \\ w_1 &= 2 \\ w_2 &= -1 \\ w_3 &= 5 \end{align} \]

Given the following input values:

\[\begin{align} x_1 &= 0 \\ x_2 &= 10 \\ x_3 &= 2 \end{align} \]

Answer the following two questions.

1. What is the value of z for these input values?

  • -1
  • 0
  • 0.731
  • 1

Correct answer: 1. The linear equation defined by the weights and bias is \(z = 1 + 2x_1 - x_2 + 5x_3\). Plugging the input values into the equation produces \(z = 1 + (2)(0) - (10) + (5)(2) = 1\).

2. What is the logistic regression prediction for these input values?

  • 0.268
  • 0.5
  • 0.731
  • 1

Correct answer: 0.731. As calculated in question 1, the log-odds for the input values is 1. Plugging that value for z into the sigmoid function:

\(y' = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{-1}} \approx \frac{1}{1 + 0.368} = \frac{1}{1.368} \approx 0.731\)

An answer of 1 cannot be right. Remember, the output of the sigmoid function will always be greater than 0 and less than 1.
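
The arithmetic in this exercise can be verified with a short script. The sketch below uses the bias, weights, and inputs given in the exercise; the function name `predict` and its signature are hypothetical, chosen only for illustration.

```python
import math


def predict(bias, weights, features):
    """Return (z, y') for one example.

    z  = bias + sum of weight * feature (the linear output, or log-odds)
    y' = sigmoid(z)                     (the predicted probability)
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    y_prime = 1.0 / (1.0 + math.exp(-z))
    return z, y_prime


# Values taken from the exercise: b = 1, w = (2, -1, 5), x = (0, 10, 2).
z, y_prime = predict(bias=1, weights=[2, -1, 5], features=[0, 10, 2])
print(f"z = {z}, y' = {y_prime:.3f}")
```

Running it gives z = 1 and a prediction that rounds to 0.731, in agreement with the answers above.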