最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

r - Report p-value from lme model (random intercept) with gtsummary using tbl_summary() and add_p() - Stack Overflow

programmeradmin1浏览0评论

I am working in R using gtsummary, trying to report the p-value from the random intercept model

lme(Variable1 ~ Study_Group, random = ~1|ID, na.action = na.omit, data = study_dat) 

in a summary/comparison table using

lme_stat_function <- function(data, variable, by, group, ...){
  nlme::lme(variable ~ by, random = ~1|group, na.action = na.omit, data = data) %>% 
    broom.mixed::tidy()    
}

lme_stat_render <- list(
  all_continuous() ~ "lme_stat_function"
)

study_dat %>% 
  select(ID, Study_Group, Variable1) %>% 
  tbl_summary(
    by = Study_Group,
    type = list(
      where(is.numeric) ~ "continuous2"
    )
  ) %>%
  bold_labels() %>%
  italicize_levels() %>%
  add_p(
    test = lme_stat_render 
  ) 

I have been wrestling with this seemingly simple issue for days and I cannot seem to get anything other than an error that effectively reads For variable Variable1 (Study_Group) and "p.value" statistic: variable lengths differ (found for 'group').

I've been reading through the documentation and can only find pseudo-code for similar scenarios, but not exactly this one. The mixed model above does work, but the function to extract that p-value and report it in the table for the respective variable does not. Here is a look at the general structure of my data:

structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 84, 84, 
85, 86, 87, 87, 88, 88, 89, 89, 90, 90, 91, 92, 93, 93, 94, 95, 
95, 96, 97, 97), Variable1 = c(6, 29, 16, 15, 19, 15, 
4, 8, 26, 9, 7, 27, 3, 31, 16, 19, 8, 12, 5, 7, 7, 12, 9, 54, 
5, 14, 4, 3, 11, 17, 7), Study_Group = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L), levels = c("Group A", 
"Group B"), class = "factor")), row.names = c(NA, -31L
), class = c("tbl_df", "tbl", "data.frame"))

I recognize that the repeated measurements here are only repeated for one group, but I'm not debating statistical methodology here. I'm just trying to get this to work as it is. Any pointers?

I am working in R using gtsummary, trying to report the p-value from the random intercept model

lme(Variable1 ~ Study_Group, random = ~1|ID, na.action = na.omit, data = study_dat) 

in a summary/comparison table using

lme_stat_function <- function(data, variable, by, group, ...){
  nlme::lme(variable ~ by, random = ~1|group, na.action = na.omit, data = data) %>% 
    broom.mixed::tidy()    
}

lme_stat_render <- list(
  all_continuous() ~ "lme_stat_function"
)

study_dat %>% 
  select(ID, Study_Group, Variable1) %>% 
  tbl_summary(
    by = Study_Group,
    type = list(
      where(is.numeric) ~ "continuous2"
    )
  ) %>%
  bold_labels() %>%
  italicize_levels() %>%
  add_p(
    test = lme_stat_render 
  ) 

I have been wrestling with this seemingly simple issue for days and I cannot seem to get anything other than an error that effectively reads For variable Variable1 (Study_Group) and "p.value" statistic: variable lengths differ (found for 'group').

I've been reading through the documentation and can only find pseudo-code for similar scenarios, but not exactly this one. The mixed model above does work, but the function to extract that p-value and report it in the table for the respective variable does not. Here is a look at the general structure of my data:

structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 84, 84, 
85, 86, 87, 87, 88, 88, 89, 89, 90, 90, 91, 92, 93, 93, 94, 95, 
95, 96, 97, 97), Variable1 = c(6, 29, 16, 15, 19, 15, 
4, 8, 26, 9, 7, 27, 3, 31, 16, 19, 8, 12, 5, 7, 7, 12, 9, 54, 
5, 14, 4, 3, 11, 17, 7), Study_Group = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L), levels = c("Group A", 
"Group B"), class = "factor")), row.names = c(NA, -31L
), class = c("tbl_df", "tbl", "data.frame"))

I recognize that the repeated measurements here are only repeated for one group, but I'm not debating statistical methodology here. I'm just trying to get this to work as it is. Any pointers?

Share Improve this question edited Jan 30 at 19:14 Carson Keeter asked Jan 30 at 19:06 Carson KeeterCarson Keeter 831 silver badge6 bronze badges
Add a comment  | 

1 Answer 1

Reset to default 3

You're on the right track. It looks like the issue may be in the construction of the model. In the example below, I've used reformulate() which is a great helper for creating formulas from string input.

library(gtsummary)

study_dat <-
  structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 84, 84, 
                        85, 86, 87, 87, 88, 88, 89, 89, 90, 90, 91, 92, 93, 93, 94, 95, 
                        95, 96, 97, 97), 
                 Variable1 = c(6, 29, 16, 15, 19, 15, 
                               4, 8, 26, 9, 7, 27, 3, 31, 16, 19, 8, 12, 5, 7, 7, 12, 9, 54, 
                               5, 14, 4, 3, 11, 17, 7), 
                 Study_Group = structure(c(1L, 1L, 
                                           1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                           2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L), levels = c("Group A", 
                                                                                                           "Group B"), class = "factor")), row.names = c(NA, -31L
                                                                                                           ), 
            class = c("tbl_df", "tbl", "data.frame"))


# you can construct a function that returns a single 
# row data frame with the p-value in it
# details for writing this function: https://www.danieldsjoberg/gtsummary/reference/tests.html#custom-functions
my_lme_test <- function(data, by, variable, group, ...) {
  nlme::lme(
    data = data, 
    fixed = reformulate(response = variable, termlabels = by),
    random = reformulate(glue::glue("1 | {group}"))
  ) |> 
    broom.mixed::tidy() |> 
    dplyr::filter(startsWith(term, by)) # keep the row associated with treatment
}
# test the function
my_lme_test(study_dat, by = 'Study_Group', variable = "Variable1", group = "ID")
#> # A tibble: 1 × 8
#>   effect group term               estimate std.error    df statistic p.value
#>   <chr>  <chr> <chr>                 <dbl>     <dbl> <dbl>     <dbl>   <dbl>
#> 1 fixed  <NA>  Study_GroupGroup B    -2.48      3.94    21    -0.631   0.535


study_dat |> 
  select(ID, Study_Group, Variable1) |> 
  tbl_summary(
    by = Study_Group,
    include = -ID,
    type = where(is.numeric) ~ "continuous2"
  ) |> 
  add_p(test = everything() ~ my_lme_test, group = ID) |> 
  as_kable() # convert to kable to display on SO
Characteristic Group A N = 14 Group B N = 17 p-value
Variable1 0.5
Median (Q1, Q3) 15 (9, 19) 7 (5, 16)

Created on 2025-01-30 with reprex v2.1.1

发布评论

评论列表(0)

  1. 暂无评论