r - Draw quantiles on densities using stat_summary and geom = "segment" - Stack Overflow

I'm interested in simple ways to draw quantiles on density plots in ggplot2. I'm not interested in geom_vline solutions, as I want the lines to stop at the top of the density. I'm curious why the following doesn't work:

library(ggplot2)

dat <- data.frame(x = rnorm(1e5))

ggplot(
  data = dat,
  aes(x = x)
) + 
  geom_density() + 
  stat_summary(
    aes(y = 0,yend = after_stat(density)),
    fun = \(x) quantile(x,0.5),
    geom = "segment",
    orientation = "y"
  )

Yields:

Error in `stat_summary()`:
! Problem while mapping stat to aesthetics.
ℹ Error occurred in the 2nd layer.
Caused by error in `map_statistic()`:
! Aesthetics must be valid computed stats.
✖ The following aesthetics are invalid:
✖ `yend = after_stat(density)`
ℹ Did you map your stat in the wrong layer?
Run `rlang::last_trace()` to see where the error occurred.

I'm aware of methods where you pre-calculate the density outside of ggplot and then manually draw the line segments after extracting the height from the density estimate. I thought there must be a simpler way to do it using stat_summary and after_stat but I can't seem to find the right incantation.

I'm also aware that the ggridges package supports this sort of thing in part, but that comes with having to commit to a ridge plot.

Part of my confusion here is that if I replace yend = after_stat(density) with yend = 1 it draws the line at that fixed height just fine. I assumed that the computed variable density ought to be available, and I'm puzzled why it doesn't seem to be.

Asked Feb 7 at 17:08 by joran

1 Answer

A few points:

  1. after_stat() only has access to statistics computed in its own layer. That is, after_stat(density) would work in geom_density(), but not in stat_summary(), because stat_summary() doesn't compute density.

  2. You'll also run into problems because you're trying to summarize to a single value, whereas stat_summary() expects one value per unique x (or y, depending on orientation), even after summarizing. It looks like maybe you're trying to "filter" to just the median x value using fun = \(x) quantile(x,0.5), but that's not what fun does.
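To see the first point concretely, one option is to inspect what each layer actually computes with layer_data(). The following is a minimal sketch, not part of the original answer: the smaller sample size and the point layer are assumptions chosen only to keep the check quick.

```r
library(ggplot2)
set.seed(13)

# Smaller sample than in the question, purely to keep this check fast
dat <- data.frame(x = rnorm(1000))

p <- ggplot(dat, aes(x = x)) +
  geom_density() +
  # Illustrative second layer: summarise x at the fixed position y = 0
  stat_summary(aes(y = 0), fun = median, geom = "point", orientation = "y")

# Variables available to after_stat() in each layer
density_vars <- names(layer_data(p, 1))
summary_vars <- names(layer_data(p, 2))

"density" %in% density_vars  # computed by the density layer
"density" %in% summary_vars  # not computed by stat_summary()
```

Only the first layer carries a density column, which is why after_stat(density) cannot be resolved inside stat_summary().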

I would use @F.Privé's method for getting a point density. You can use this either to create a new dataframe inside a geom_segment():

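library(ggplot2)
library(tibble)  # for tibble(), which lets d refer to q below
set.seed(13)

# Same data as in the question
dat <- data.frame(x = rnorm(1e5))

# Gaussian kernel density estimate of x, evaluated at each element of value
density_at_val <- function(x, value) {
  # vectorize to accept multiple values
  inner_fx <- function(x, value) mean(dnorm(value, mean = x, sd = bw.nrd0(x)))
  sapply(value, inner_fx, x = x)
}

ggplot(data = dat, aes(x = x)) +
  geom_density() +
  geom_segment(
    data = tibble(q = quantile(dat$x, 0.5), d = density_at_val(dat$x, q)),
    # explicit xend keeps the segment vertical
    aes(x = q, xend = q, y = 0, yend = d)
  )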

Or equivalently, inside annotate():

ggplot(data = dat, aes(x = x)) +
  geom_density() +
  annotate(
    x = quantile(dat$x, 0.5),
    xend = quantile(dat$x, 0.5),  # explicit xend keeps the segment vertical
    y = 0,
    yend = density_at_val(dat$x, quantile(dat$x, 0.5)),
    geom = "segment"
  )
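Because density_at_val() is vectorized over value, the same approach extends to several quantiles at once. The sketch below is an illustration under assumptions rather than part of the original answer: the names probs and q_df are made up for the example, the sample is smaller than in the question, and the bandwidth is computed once instead of per value.

```r
library(ggplot2)
set.seed(13)

# Smaller sample than in the question so the sketch runs quickly
dat <- data.frame(x = rnorm(1e4))

# Gaussian kernel density estimate of x at each value, bandwidth computed once
density_at_val <- function(x, value) {
  bw <- bw.nrd0(x)
  sapply(value, function(v) mean(dnorm(v, mean = x, sd = bw)))
}

# Hypothetical helper objects: the quantiles to mark and their density heights
probs <- c(0.25, 0.5, 0.75)
q_df <- data.frame(q = unname(quantile(dat$x, probs)))
q_df$d <- density_at_val(dat$x, q_df$q)

p <- ggplot(dat, aes(x = x)) +
  geom_density() +
  geom_segment(
    data = q_df,
    aes(x = q, xend = q, y = 0, yend = d),
    inherit.aes = FALSE  # the segment layer uses only the columns of q_df
  )
p
```

Each segment runs from the axis up to the estimated density at its quantile, so the lines stop at the curve instead of spanning the whole panel as geom_vline() would.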
