[Regressor] Poisson Regression

안암동컴맹 · April 6, 2024

Machine Learning

Poisson Regression

Introduction

Poisson Regression is a statistical approach used to model count data, particularly for outcomes that represent counts or rates following a Poisson distribution. This method is integral in fields where the prediction of discrete events is crucial, such as epidemiology, insurance, and sports analytics. It extends the linear regression framework to accommodate non-negative integer responses, offering insights into how explanatory variables influence the log-rate of a given outcome.

Background and Theory

Poisson Distribution Basics

The Poisson distribution is key to understanding Poisson regression. It models the probability of a given number of events occurring within a fixed interval, assuming these events happen at a constant rate and independently of each other. The distribution's probability mass function for observing k events is defined as:

P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}

where \lambda is the event rate and e is Euler's number.
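As a concrete check, the PMF above can be evaluated directly with the standard library (a minimal sketch; the rate \lambda = 3 is illustrative):

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson distribution with rate lam."""
    return (lam ** k) * exp(-lam) / factorial(k)

# Probabilities over a wide enough range of counts sum to (almost) 1.
probs = [poisson_pmf(k, lam=3.0) for k in range(50)]
```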

Mathematical Formulation of Poisson Regression

Poisson regression predicts the log of the expected count as a linear combination of the input variables. If Y denotes the count variable, then its expected value conditioned on explanatory variables X is related to those variables through the logarithm function:

\log(\mathbb{E}[Y|X]) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_n X_n

Here, \beta_0, \beta_1, \ldots, \beta_n are parameters to be estimated.
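Because of the log link, a prediction is simply the exponential of the linear combination. A minimal sketch (the coefficients and feature values here are made up for illustration):

```python
from math import exp

def predict_count(beta0: float, beta: list[float], x: list[float]) -> float:
    """Expected count E[Y|X] = exp(beta0 + beta . x) under the log link."""
    linear = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return exp(linear)

# Example: two predictors with illustrative coefficients.
rate = predict_count(beta0=0.5, beta=[0.2, -0.1], x=[1.0, 3.0])
```

Note that the exponential guarantees the predicted count is always positive, which is exactly what the log link buys over ordinary linear regression.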

Optimization Process

The coefficients of a Poisson regression model are typically estimated using Maximum Likelihood Estimation (MLE). The likelihood function for a set of parameters given the observed data in Poisson regression is:

L(\beta; y, X) = \prod_{i=1}^{N} \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}

where \lambda_i = e^{\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_n X_{in}} is the expected count for the i-th observation, and y_i is the observed count.
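In practice the logarithm of this likelihood, \sum_i (y_i \log \lambda_i - \lambda_i - \log y_i!), is evaluated instead, since the product of many small probabilities underflows quickly. A sketch of that evaluation (the data and coefficients are illustrative):

```python
from math import exp, lgamma

def log_likelihood(beta: list[float], X: list[list[float]], y: list[int]) -> float:
    """Poisson log-likelihood; beta[0] is the intercept."""
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
        lam = exp(eta)
        # log(y!) computed stably as lgamma(y + 1)
        ll += yi * eta - lam - lgamma(yi + 1)
    return ll

X = [[0.0], [1.0], [2.0]]
y = [1, 2, 4]
ll = log_likelihood([0.1, 0.5], X, y)
```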

Gradient-Based Optimization

To find the parameter estimates that maximize the likelihood, gradient-based optimization techniques are often employed. The gradient of the likelihood function with respect to the parameters \beta can be calculated and used to iteratively adjust the parameters until the maximum likelihood estimates are found.

The gradient of the log-likelihood function (since the logarithm is a monotonic function, maximizing the log-likelihood maximizes the likelihood) with respect to \beta_j is:

\frac{\partial \log L(\beta; y, X)}{\partial \beta_j} = \sum_{i=1}^{N} X_{ij} (y_i - \lambda_i)

This gradient tells us how to adjust \beta_j to increase the likelihood, given the data. Optimization algorithms, such as Newton-Raphson or gradient ascent, use these gradients to update the parameters iteratively.
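The gradient formula above translates directly into a gradient-ascent loop. The following is a minimal sketch in plain Python (no regularization; the hyperparameter defaults are arbitrary):

```python
from math import exp

def fit_poisson(X, y, learning_rate=0.01, max_iter=1000):
    """Gradient ascent on the Poisson log-likelihood.

    X: list of feature rows (without an intercept column); y: list of counts.
    Returns the coefficient vector beta, with beta[0] the intercept.
    """
    n_features = len(X[0])
    beta = [0.0] * (n_features + 1)
    for _ in range(max_iter):
        grad = [0.0] * (n_features + 1)
        for xi, yi in zip(X, y):
            lam = exp(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            resid = yi - lam           # the (y_i - lambda_i) term of the gradient
            grad[0] += resid           # intercept column is X_{i0} = 1
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        # Ascent step: move each coefficient in the direction of its gradient.
        beta = [b + learning_rate * g for b, g in zip(beta, grad)]
    return beta
```

Each pass accumulates \sum_i X_{ij}(y_i - \lambda_i) for every coefficient and steps in that direction; Newton-Raphson would additionally use the Hessian to converge in far fewer iterations.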

Implementation

Parameters

  • learning_rate: float, default = 0.01
    Step size of the gradient descent update
  • max_iter: int, default = 100
    Number of iterations
  • l1_ratio: float, default = 0.5
    Balancing parameter between L1 and L2 penalties in elastic-net regularization
  • alpha: float, default = 0.01
    Regularization strength
  • regularization: Literal['l1', 'l2', 'elastic-net'], default = None
    Regularization type (None applies no penalty)

Applications

Poisson regression has broad applications, including but not limited to:

  • Estimating the number of calls to a call center based on time of day and day of the week.
  • Modeling the count of traffic accidents at different intersections to identify high-risk areas.
  • Predicting the number of goals a soccer team scores in a match based on team and opponent statistics.

Strengths and Limitations

Strengths

  • Provides a natural way to model count data.
  • Coefficients are directly interpretable: a one-unit change in a predictor changes the log of the expected count by the corresponding coefficient, so e^{\beta_j} acts as a multiplicative rate ratio.

Limitations

  • Assumes the mean and variance of the distribution are equal (equidispersion), which may not always hold true.
  • Does not naturally handle zero-inflated data or overdispersion without model modifications.
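The equidispersion assumption is easy to check empirically by comparing the sample variance of the counts to their sample mean; a ratio well above 1 signals overdispersion. A quick sketch (the count data below are made up):

```python
from statistics import mean, pvariance

def dispersion_ratio(counts):
    """Variance-to-mean ratio; values well above 1 suggest overdispersion."""
    return pvariance(counts) / mean(counts)

# Roughly equidispersed counts (ratio near 1) vs. overdispersed counts.
balanced = [1, 4, 2, 0, 3, 5, 2, 3]
inflated = [0, 0, 0, 0, 1, 2, 9, 12]
```

When the ratio is large, Negative Binomial regression is the usual fallback, as noted under Advanced Topics.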

Advanced Topics

  • Gradient Ascent and Newton-Raphson Method: More on the iterative algorithms used for finding the MLE of Poisson regression parameters.
  • Generalized Linear Models (GLMs): A broader class of models that includes Poisson regression as a special case.
  • Handling Overdispersion and Zero Inflation: Techniques like Negative Binomial Regression or Zero-Inflated Poisson Regression to address data that doesn't fit the strict assumptions of Poisson regression.

References

  1. Dobson, Annette J., and Adrian G. Barnett. "An Introduction to Generalized Linear Models." CRC Press, 2008.
  2. Hilbe, Joseph M. "Negative Binomial Regression." Cambridge University Press, 2011.
  3. McCullagh, Peter, and John Nelder. "Generalized Linear Models." Chapman & Hall/CRC, 1989.