
r - Difference Between geom_smooth() and Manual LOESS Fit in Logistic Regression - Stack Overflow


I am working with a binary outcome (chd, 0 or 1) and a continuous predictor (sbp, systolic blood pressure). I want to visualize the relationship between sbp and the probability of chd using LOESS smoothing. However, I noticed a difference between using geom_smooth() in ggplot2 and manually fitting a LOESS model.

set.seed(123)
n <- 100
data_mi <- data.frame(
  sbp = rnorm(n, mean = 130, sd = 15),  # Systolic BP
  chd = rbinom(n, size = 1, prob = 0.3)  # CHD occurrence (0/1)
)
# Log odds of the overall CHD rate; assigned to CHD cases, with its reciprocal used for non-cases
chd_odd_log <- log(sum(data_mi$chd) / (nrow(data_mi) - sum(data_mi$chd)))

data_mi$chd_odd <- ifelse(data_mi$chd == 1, chd_odd_log, 1 / chd_odd_log)

library(ggplot2)
loess_fit <- loess(chd ~ sbp, data = data_mi, degree = 1)
loess_pred <- predict(loess_fit)
ggplot(data_mi, aes(x = sbp, y = chd_odd)) +
  geom_smooth(method = "loess") +
  geom_point(aes(y = log(loess_pred / (1 - loess_pred))))

# Logit transformation
plot(data_mi$sbp, log(loess_pred / (1 - loess_pred)), main = "Log-Odds Transformation")

I think the difference comes from the order of the transformations. How, then, does LOESS handle binary data?
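To make the "order of transformation" point concrete, the two quantities being smoothed can be computed side by side. This is only a sketch, reusing the data_mi data frame and the loess_pred vector from the code above; smooth_of_chd_odd, smooth_then_logit and comparison are names introduced here for illustration.

library(ggplot2)

# geom_smooth(method = "loess") smooths whatever is mapped to y, i.e. chd_odd:
smooth_of_chd_odd <- loess(chd_odd ~ sbp, data = data_mi)

# The manually added points smooth chd itself and logit-transform afterwards;
# the logit is NaN wherever the LOESS prediction falls outside (0, 1).
smooth_then_logit <- log(loess_pred / (1 - loess_pred))

comparison <- data.frame(
  sbp               = data_mi$sbp,
  smooth_of_chd_odd = predict(smooth_of_chd_odd),
  smooth_then_logit = smooth_then_logit
)
head(comparison[order(comparison$sbp), ])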


asked Feb 5 at 2:04 by doraemon, edited Feb 5 at 2:51
  • Any particular reason you used degree = 1 in your manual loess call rather than the default degree = 2? – Ben Bolker Commented Feb 5 at 2:28
  • There is no specific reason I used degree = 1...I just copied the code from the note. – doraemon Commented Feb 5 at 2:47
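The comment thread above asks about degree = 1 versus the loess default of degree = 2 (locally linear versus locally quadratic fits). A quick, purely illustrative comparison on the simulated data_mi from the question; fit_deg1, fit_deg2 and sbp_grid are names made up for this sketch.

fit_deg1 <- loess(chd ~ sbp, data = data_mi, degree = 1)  # local linear fits
fit_deg2 <- loess(chd ~ sbp, data = data_mi, degree = 2)  # local quadratic (the loess default)

sbp_grid <- data.frame(sbp = seq(min(data_mi$sbp), max(data_mi$sbp), length.out = 200))
plot(sbp_grid$sbp, predict(fit_deg2, newdata = sbp_grid), type = "l", col = "red",
     xlab = "sbp", ylab = "smoothed chd", main = "loess: degree = 1 vs degree = 2")
lines(sbp_grid$sbp, predict(fit_deg1, newdata = sbp_grid), col = "blue")
legend("topright", legend = c("degree = 2 (default)", "degree = 1"),
       col = c("red", "blue"), lty = 1)

The degree = 2 curve follows local curvature more closely and tends to be wigglier; degree = 1 gives a flatter, more stable smooth, especially near the boundaries of the sbp range.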

1 Answer


Your two plots aren't doing the same thing: the geom_smooth() layer smooths the chd_odd values you mapped to y, whereas your added points are a logit transform of a LOESS fit to chd itself. Fit and plot both on the same scale (the raw 0/1 outcome) and they line up. Try the following:

set.seed(123)
n <- 100
data_mi <- data.frame(
  sbp = rnorm(n, mean = 130, sd = 15),  # Systolic BP
  chd = rbinom(n, size = 1, prob = 0.3)  # CHD occurrence (0/1)
)

library(ggplot2)
p1 <- ggplot(data_mi, aes(x = sbp, y = chd)) +
  geom_point() +
  geom_smooth(method = "loess")

loess_fit <- loess(chd ~ sbp, data = data_mi)
loess_pred <- predict(loess_fit)

p1 +
  geom_point(aes(y = loess_pred), col = "red")
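On the "how can loess handle the binary data?" part: loess() simply treats the 0/1 chd values as an ordinary numeric response, so its local weighted averages estimate P(chd = 1 | sbp) directly (with no guarantee of staying inside [0, 1]). As a rough sketch, not part of the answer above, that smooth can be compared with a parametric logistic regression, reusing p1, data_mi and loess_pred; glm_fit and glm_prob are names introduced here.

glm_fit  <- glm(chd ~ sbp, data = data_mi, family = binomial)  # logistic regression
glm_prob <- predict(glm_fit, type = "response")                # fitted P(chd = 1 | sbp)

p1 +
  geom_point(aes(y = loess_pred), col = "red") +                   # LOESS smooth of the 0/1 outcome
  geom_line(aes(y = glm_prob), col = "blue", linetype = "dashed")  # logistic regression curve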
