My background is in mathematics. One thing about math that I find simultaneously fascinating and confusing is that there are an insane number of conceptual connections between different ideas. Fully understanding these connections can take months or years of study.
An example is the logit function. On one hand it’s very simple, but on the other hand it’s connected to dozens of other concepts that at first glance seem unrelated, in particular logistic regression prediction.
The easy part. If p is a number between 0 and 1 then:
logit(p) = ln( p / (1-p) )
Because p / (1-p) is the odds of an event that has probability p, the logit function is also called the log-odds.
For example, suppose p = 0.6. Then logit(p) = ln( 0.6 / 0.4 ) = ln(1.5) = 0.4055, where 1.5 is the odds.
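To make the arithmetic concrete, here is a minimal Python sketch of the logit calculation. The function name and the range check are just my choices for illustration; the check is there because the log-odds is undefined at exactly 0 or 1.

```python
import math

def logit(p):
    """Return the log-odds ln(p / (1 - p)) for a probability 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))

p = 0.6
odds = p / (1.0 - p)        # the odds, 0.6 / 0.4
log_odds = logit(p)         # the natural log of the odds

print(round(odds, 4))       # 1.5
print(round(log_odds, 4))   # 0.4055
```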
Now logistic regression is more complicated. Suppose you want to predict something that can take one of two values, such as ‘democrat’ or ‘republican’, based on features such as age (x1) and income (x2).
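The logistic function is logistic(x1, x2) = 1 / ( 1 + e^-z ) where z = b0 + (b1)(x1) + (b2)(x2). Note the negative sign in the exponent. The result is always a value between 0 and 1, which can be interpreted as the probability of one of the two classes. The constants b0, b1, b2 must be determined using training data that has known input and output values.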
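Here is a small Python sketch of the logistic function in its standard form, 1 / (1 + e^-z), with a negative exponent. The values of b0, b1, b2 below are made up purely for illustration and are not fitted to any data, and I am assuming age is in years and income is in thousands of dollars.

```python
import math

def logistic(z):
    """Standard logistic function: maps any real z to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x1, x2, b0, b1, b2):
    """Compute z = b0 + b1*x1 + b2*x2, then squash it with the logistic function."""
    z = b0 + b1 * x1 + b2 * x2
    return logistic(z)

# Hypothetical coefficients (assumed, not learned from training data).
b0, b1, b2 = -1.0, 0.02, 0.01

# Hypothetical person: age 40, income 50 (thousands).
# z = -1.0 + (0.02)(40) + (0.01)(50) = 0.3
p = predict_probability(40, 50, b0, b1, b2)
print(round(p, 4))  # 0.5744
```

In a real problem the three constants would come from a training algorithm rather than being typed in by hand.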
As it turns out, the logit and logistic functions are mathematical inverses of each other: logit(logistic(z)) = z for any real number z, and logistic(logit(p)) = p for any p strictly between 0 and 1.
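The inverse relationship is easy to see algebraically: if p = 1 / (1 + e^-z), then p / (1-p) = e^z, so ln( p / (1-p) ) = z. It can also be checked numerically. The sketch below is only a spot check on a handful of values I picked, not a proof.

```python
import math

def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def logistic(z):
    """Standard logistic function 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))

# logit undoes logistic: start from z, go to a probability, come back to z.
z_values = [-3.0, -0.5, 0.0, 0.4055, 2.0]
z_round_trips = [logit(logistic(z)) for z in z_values]

# logistic undoes logit: start from p, go to log-odds, come back to p.
p_values = [0.1, 0.5, 0.6, 0.9]
p_round_trips = [logistic(logit(p)) for p in p_values]

for z, back in zip(z_values, z_round_trips):
    assert math.isclose(z, back, abs_tol=1e-9)
for p, back in zip(p_values, p_round_trips):
    assert math.isclose(p, back, abs_tol=1e-9)
print("round trips agree")
```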
This conceptual connection between the logit and logistic functions means that logistic regression prediction can be performed directly, using the logistic function, or indirectly, using the logit function. The R language uses the indirect approach, which, for me, is more difficult to understand.
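One way to picture the two approaches is the Python sketch below. It is my own illustration and does not reproduce how R or any particular library works internally. The direct version converts z to a probability and compares it to 0.5; the indirect version stays on the log-odds scale and compares z to 0, which gives the same answer because logistic(0) = 0.5. The coefficients and sample inputs are hypothetical.

```python
import math

def logistic(z):
    """Standard logistic function 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))

def linear_score(x1, x2, b0, b1, b2):
    """z = b0 + b1*x1 + b2*x2, which is the log-odds of class 1."""
    return b0 + b1 * x1 + b2 * x2

def predict_direct(z):
    """Direct: turn z into a probability, then threshold at 0.5."""
    return 1 if logistic(z) > 0.5 else 0

def predict_indirect(z):
    """Indirect: stay on the log-odds (logit) scale and threshold at 0."""
    return 1 if z > 0.0 else 0

# Hypothetical coefficients and (age, income) inputs, for illustration only.
b0, b1, b2 = -1.0, 0.02, 0.01
samples = [(40, 50), (20, 30), (65, 90)]

direct = []
indirect = []
for x1, x2 in samples:
    z = linear_score(x1, x2, b0, b1, b2)
    direct.append(predict_direct(z))
    indirect.append(predict_indirect(z))

print(direct)    # [1, 0, 1]
print(indirect)  # [1, 0, 1]
```

Which class gets labeled 1 (say, ‘republican’) and which gets labeled 0 is an arbitrary encoding choice.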