[Regressor] Poisson Regression

안암동컴맹 · April 6, 2024

Machine Learning

Poisson Regression

Introduction

Poisson Regression is a statistical approach used to model count data, particularly for outcomes that represent counts or rates following a Poisson distribution. This method is integral in fields where the prediction of discrete events is crucial, such as epidemiology, insurance, and sports analytics. It extends the linear regression framework to accommodate non-negative integer responses, offering insights into how explanatory variables influence the log-rate of a given outcome.

Background and Theory

Poisson Distribution Basics

The Poisson distribution is key to understanding Poisson regression. It models the probability of a given number of events occurring within a fixed interval, assuming these events happen at a constant rate and independently of each other. The distribution's probability mass function for observing k events is defined as:

P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}

where \lambda is the event rate and e is Euler's number.
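As a concrete check, the PMF above can be evaluated directly with the standard library (a minimal sketch; the rate \lambda = 3 is illustrative):

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson distribution with rate lam."""
    return (lam ** k) * exp(-lam) / factorial(k)

# Probabilities over a wide enough range of counts sum to (almost) 1.
probs = [poisson_pmf(k, lam=3.0) for k in range(50)]
```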

Mathematical Formulation of Poisson Regression

Poisson regression predicts the log of the expected count as a linear combination of the input variables. If Y denotes the count variable, then its expected value conditioned on explanatory variables X is related to those variables through the logarithm function:

\log(\mathbb{E}[Y|X]) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_n X_n

Here, \beta_0, \beta_1, \ldots, \beta_n are parameters to be estimated.
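Because of the log link, a prediction is simply the exponential of the linear combination. A minimal sketch (the coefficients and feature values here are made up for illustration):

```python
from math import exp

def predict_count(beta0: float, beta: list[float], x: list[float]) -> float:
    """Expected count E[Y|X] = exp(beta0 + beta . x) under the log link."""
    linear = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return exp(linear)

# Example: two predictors with illustrative coefficients.
rate = predict_count(beta0=0.5, beta=[0.2, -0.1], x=[1.0, 3.0])
```

Note that the exponential guarantees the predicted count is always positive, which is exactly what the log link buys over ordinary linear regression.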

Optimization Process

The coefficients of a Poisson regression model are typically estimated using Maximum Likelihood Estimation (MLE). The likelihood function for a set of parameters given the observed data in Poisson regression is:

L(\beta; y, X) = \prod_{i=1}^{N} \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}

where \lambda_i = e^{\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_n X_{in}} is the expected count for the i-th observation, and y_i is the observed count.
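In practice the logarithm of this likelihood, \sum_i (y_i \log \lambda_i - \lambda_i - \log y_i!), is evaluated instead, since the product of many small probabilities underflows quickly. A sketch of that evaluation (the data and coefficients are illustrative):

```python
from math import exp, lgamma

def log_likelihood(beta: list[float], X: list[list[float]], y: list[int]) -> float:
    """Poisson log-likelihood; beta[0] is the intercept."""
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
        lam = exp(eta)
        # log(y!) computed stably as lgamma(y + 1)
        ll += yi * eta - lam - lgamma(yi + 1)
    return ll

X = [[0.0], [1.0], [2.0]]
y = [1, 2, 4]
ll = log_likelihood([0.1, 0.5], X, y)
```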

Gradient-Based Optimization

To find the parameter estimates that maximize the likelihood, gradient-based optimization techniques are often employed. The gradient of the likelihood function with respect to the parameters \beta can be calculated and used to iteratively adjust the parameters until the maximum likelihood estimates are found.

The gradient of the log-likelihood function (since the logarithm is a monotonic function, maximizing the log-likelihood maximizes the likelihood) with respect to \beta_j is:

\frac{\partial \log L(\beta; y, X)}{\partial \beta_j} = \sum_{i=1}^{N} X_{ij} (y_i - \lambda_i)

This gradient tells us how to adjust \beta_j to increase the likelihood, given the data. Optimization algorithms, such as Newton-Raphson or gradient ascent, use these gradients to update the parameters iteratively.
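The gradient formula above translates directly into a gradient-ascent loop. The following is a minimal sketch in plain Python (no regularization; the hyperparameter defaults are arbitrary):

```python
from math import exp

def fit_poisson(X, y, learning_rate=0.01, max_iter=1000):
    """Gradient ascent on the Poisson log-likelihood.

    X: list of feature rows (without an intercept column); y: list of counts.
    Returns the coefficient vector beta, with beta[0] the intercept.
    """
    n_features = len(X[0])
    beta = [0.0] * (n_features + 1)
    for _ in range(max_iter):
        grad = [0.0] * (n_features + 1)
        for xi, yi in zip(X, y):
            lam = exp(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            resid = yi - lam           # the (y_i - lambda_i) term of the gradient
            grad[0] += resid           # intercept column is X_{i0} = 1
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        # Ascent step: move each coefficient in the direction of its gradient.
        beta = [b + learning_rate * g for b, g in zip(beta, grad)]
    return beta
```

Each pass accumulates \sum_i X_{ij}(y_i - \lambda_i) for every coefficient and steps in that direction; Newton-Raphson would additionally use the Hessian to converge in far fewer iterations.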

Implementation

Parameters

  • learning_rate: float, default = 0.01
    Step size of the gradient descent update
  • max_iter: int, default = 100
    Number of iterations
  • l1_ratio: float, default = 0.5
    Balancing parameter between L1 and L2 penalties in elastic-net regularization
  • alpha: float, default = 0.01
    Regularization strength
  • regularization: Literal['l1', 'l2', 'elastic-net'], default = None
    Regularization type (None applies no penalty)

Applications

Poisson regression has broad applications, including but not limited to:

  • Estimating the number of calls to a call center based on time of day and day of the week.
  • Modeling the count of traffic accidents at different intersections to identify high-risk areas.
  • Predicting the number of goals a soccer team scores in a match based on team and opponent statistics.

Strengths and Limitations

Strengths

  • Provides a natural way to model count data.
  • Coefficients are directly interpretable: a one-unit change in a predictor changes the log of the expected count by the corresponding coefficient, so e^{\beta_j} acts as a multiplicative rate ratio.

Limitations

  • Assumes the mean and variance of the distribution are equal (equidispersion), which may not always hold true.
  • Does not naturally handle zero-inflated data or overdispersion without model modifications.
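The equidispersion assumption is easy to check empirically by comparing the sample variance of the counts to their sample mean; a ratio well above 1 signals overdispersion. A quick sketch (the count data below are made up):

```python
from statistics import mean, pvariance

def dispersion_ratio(counts):
    """Variance-to-mean ratio; values well above 1 suggest overdispersion."""
    return pvariance(counts) / mean(counts)

# Roughly equidispersed counts (ratio near 1) vs. overdispersed counts.
balanced = [1, 4, 2, 0, 3, 5, 2, 3]
inflated = [0, 0, 0, 0, 1, 2, 9, 12]
```

When the ratio is large, Negative Binomial regression is the usual fallback, as noted under Advanced Topics.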

Advanced Topics

  • Gradient Ascent and Newton-Raphson Method: More on the iterative algorithms used for finding the MLE of Poisson regression parameters.
  • Generalized Linear Models (GLMs): A broader class of models that includes Poisson regression as a special case.
  • Handling Overdispersion and Zero Inflation: Techniques like Negative Binomial Regression or Zero-Inflated Poisson Regression to address data that doesn't fit the strict assumptions of Poisson regression.

References

  1. Dobson, Annette J., and Adrian G. Barnett. "An Introduction to Generalized Linear Models." CRC Press, 2008.
  2. Hilbe, Joseph M. "Negative Binomial Regression." Cambridge University Press, 2011.
  3. McCullagh, Peter, and John Nelder. "Generalized Linear Models." Chapman & Hall/CRC, 1989.