I'm interested in simple ways to draw quantiles on density plots in ggplot2. I'm not interested in geom_vline solutions, as I want the lines to stop at the top of the density. I'm curious why the following doesn't work:
dat <- data.frame(x = rnorm(1e5))
ggplot(
  data = dat,
  aes(x = x)
) +
  geom_density() +
  stat_summary(
    aes(y = 0, yend = after_stat(density)),
    fun = \(x) quantile(x, 0.5),
    geom = "segment",
    orientation = "y"
  )
Yields:
Error in `stat_summary()`:
! Problem while mapping stat to aesthetics.
ℹ Error occurred in the 2nd layer.
Caused by error in `map_statistic()`:
! Aesthetics must be valid computed stats.
✖ The following aesthetics are invalid:
✖ `yend = after_stat(density)`
ℹ Did you map your stat in the wrong layer?
Run `rlang::last_trace()` to see where the error occurred.
I'm aware of methods where you pre-calculate the density outside of ggplot and then manually draw the line segments after extracting the height from the density estimate. I thought there must be a simpler way to do it using stat_summary and after_stat, but I can't seem to find the right incantation.
I'm also aware that the ggridges package supports this sort of thing in part, but that comes with having to commit to a ridge plot.
Part of my confusion here is that if I replace yend = after_stat(density) with yend = 1, it draws the line at that fixed height just fine. I assumed that the computed variable density ought to be available, and I'm puzzled why it doesn't seem to be.
1 Answer
A few points:
after_stat() only has access to statistics computed in that layer, i.e., after_stat(density) would work in geom_density(), but not in stat_summary(), because density isn't computed in stat_summary().

You'll also run into problems because you're trying to summarize to a single value, whereas stat_summary() expects one value per unique x (or y, depending on orientation), even after summarizing. It looks like maybe you're trying to "filter" to just the median x value using fun = \(x) quantile(x, 0.5), but that's not what fun does.
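To see the first point concretely, one option is to list the columns each layer's stat produces. This is only a sketch: it assumes ggplot2's layer_data() helper, uses a smaller simulated sample than the question, and swaps the segment for a point geom so that the second layer builds without error.

```r
library(ggplot2)

set.seed(13)
dat <- data.frame(x = rnorm(1e3))  # smaller sample than the question, for speed

p <- ggplot(dat, aes(x = x)) +
  geom_density() +
  # orientation = "y": summarize x within each unique y (here the single value 0)
  stat_summary(aes(y = 0), fun = median, geom = "point", orientation = "y")

# Columns available after the stat has run, per layer
density_cols <- names(layer_data(p, 1))  # layer 1: geom_density()
summary_cols <- names(layer_data(p, 2))  # layer 2: stat_summary()

"density" %in% density_cols  # computed by the density stat
"density" %in% summary_cols  # not computed by the summary stat
```

Only columns that appear in a layer's own computed data can be referenced with after_stat() in that layer, which is why the mapping in the question is rejected.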
I would use @F.Privé's method for getting a point density. You can use this either to create a new dataframe inside a geom_segment():
library(ggplot2)
library(tibble)  # provides tibble()

set.seed(13)
dat <- data.frame(x = rnorm(1e5))  # same simulated data as in the question

density_at_val <- function(x, value) {
  # vectorize to accept multiple values
  inner_fx <- function(x, value) mean(dnorm(value, mean = x, sd = bw.nrd0(x)))
  sapply(value, inner_fx, x = x)
}

ggplot(data = dat, aes(x = x)) +
  geom_density() +
  geom_segment(
    # tibble() evaluates columns in order, so d can refer to q
    data = tibble(q = quantile(dat$x, 0.5), d = density_at_val(dat$x, q)),
    aes(x = q, y = 0, yend = d)
  )
Or equivalently, inside annotate():
ggplot(data = dat, aes(x = x)) +
  geom_density() +
  annotate(
    x = quantile(dat$x, 0.5),
    y = 0,
    yend = density_at_val(dat$x, quantile(dat$x, 0.5)),
    geom = "segment"
  )
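Because the helper accepts several values at once, the same idea extends to more than one quantile. The sketch below is illustrative rather than definitive: the names probs, segs, and approx_d are made up for this example, the bandwidth is computed once rather than per value, and xend is set explicitly in case the installed ggplot2 version requires it. It also cross-checks the segment heights against stats::density() with its default settings (Gaussian kernel, bw.nrd0 bandwidth), on the assumption that geom_density() draws its curve from those same defaults.

```r
library(ggplot2)

set.seed(13)
dat <- data.frame(x = rnorm(1e4))  # smaller sample than above, for speed

# Same idea as the helper above: Gaussian kernel density at each `value`
density_at_val <- function(x, value) {
  bw <- bw.nrd0(x)  # compute the bandwidth once
  sapply(value, function(v) mean(dnorm(v, mean = x, sd = bw)))
}

probs <- c(0.25, 0.5, 0.75)  # hypothetical choice of quantiles
q <- unname(quantile(dat$x, probs))
segs <- data.frame(q = q, d = density_at_val(dat$x, q))

# Cross-check: interpolate the default density() estimate at the same points
dens <- density(dat$x)
approx_d <- approx(dens$x, dens$y, xout = q)$y
max(abs(segs$d - approx_d))  # should be small if the two estimates agree

p <- ggplot(dat, aes(x = x)) +
  geom_density() +
  geom_segment(data = segs, aes(x = q, xend = q, y = 0, yend = d))
p
```

If the maximum difference is not close to zero, the helper and the drawn curve are using different bandwidths or kernels, and the segments will not end exactly on the density line.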