Calculating the Area under Chi-Square Distribution using Raw R

The R language is used for scientific and numerical programming. R has a built-in function pchisq(x, df) that returns the area under the chi-square (aka chi-squared with a ‘d’) distribution from x to +infinity. Just for fun I thought I’d code up an equivalent function using raw R.

My version is called mychiprob(x, df). It uses ACM Algorithm 299. The function calls helper function mygauss(z) which uses ACM Algorithm 209 to calculate the area under the Normal distribution. My version returns the area under the chi-square distribution from 0 to x (instead of from x to +infinity). Because the total area under the chi-square distribution is 1.0, the two forms are essentially equivalent.

ChiSquareUsingRawR

There’s really no point in using a custom R function like this. I wrote it just to brush up on my R syntax and for my personal geek entertainment.

# calcchisq.R
# calculate area under chi-square distribution
# using raw R, just for fun

mygauss = function(z) {
  # area under Normal curve from -inf to z
  # ACM Algorithm 209
  y <- 0.0
  p <- 0.0 # result
  w <- 0.0 # scratch variable

  if (z == 0.0) {
    p <- 0.0
  }
  else {
    y = 3.0) {
      p <- 1.0
    }
    else if (y < 1.0) {
      w <- y * y;
      p <- ((((((((0.000124818987 * w -
        0.001075204047) * w +
        0.005198775019) * w -
        0.019198292004) * w +
        0.059054035642) * w -
        0.151968751364) * w +
        0.319152932694) * w -
        0.531923007300) * w +
        0.797884560593) * y * 2.0
    }
    else {
      y <- y - 2.0;
      p  0.0) {
    return((p + 1.0) / 2)
  }
  else {
    return((1.0 - p) / 2)
  }

} # mygauss

mychiprob = function(x, df) {
  # area under chi-square distribution with
  # degrees freedom = df, from 0 to x
  # ACM Algorithm 299
  a <- 0.0  # variables
  y <- 0.0
  s <- 0.0
  z <- 0.0
  e <- 0.0
  c <- 0.0
  even <- FALSE   # is df even?
  if (df %% 2 == 0) {
    even <- TRUE
  }
  a  1) {
    y <- myexp(-a)
  }

  if (even == TRUE) {
    s <- y
  }
  else {
    s  2) {
    x <- 0.5 * (df - 1.0)
    if (even == TRUE) {
      z <- 1.0
    }
    else {
      z  40.0) {
      if (even == TRUE) {
        e <- 0.0
      }
      else {
        e <- 0.5723649429247000870717135
      }
      c <- log(a) # ln()    
      while (z <= x) {
        e <- log(z) + e;
        s <- s + MyExp(c * z - a - e)
        z  40.0
    else {
      if (even == TRUE) {
        e <- 1.0
      }
      else {
        e <- 0.5641895835477562869480795 / sqrt(a)
      }
      c <- 0.0
      while (z <= x) {
        e <- e * (a / z)
        c <- c + e
        z  2
  else {
    return(s)
  }

} # mychiprob

myexp = function(x) {
  if (x < -40.0) {
    return(0.0)
  }
  else {
    return(exp(x))
  }
} # myexp

# =====================================
cat("\nBegin chi-square area demo using R\n")

x <- c(1.0, 1.5, 2.0, 2.5, 3.0)
df <- length(x) - 1

for (i in 1:length(x)) {
  cat("==========\n")
  p1 <- 1.0 - pchisq(x[i], df)
  p2 <- mychiprob(x[i], df)
  cat("x = ", x[i], "\n")
  cat("Area from 0 to x using R  pchisq() = ")
  cat(p1, "\n")
  cat("Area from 0 to x using mychiprob() = ")
  cat(p2, "\n")
  cat("==========\n")
}

cat("\nEnd demo\n")
Advertisements
This entry was posted in Machine Learning. Bookmark the permalink.