"Duplicate subscripts for columns" error on an R data frame when using `within`

This happens under R version 4.4.2 (2024-10-31) -- "Pile of Leaves", on the latest macOS:

$ R --vanilla
> load(file="tttdf")
> str(ttt)
'data.frame':   3 obs. of  17 variables:
 $ .mn.r      : num  0 0 0
 $ .sd.r      : num  0 0 0
 $ .mn.g      : num  0 0 0
 $ .sd.g      : num  0 0 0
 $ .cor.r.g   : num  1 1 1
 $ sep        : num  -1 -1 -1
 $ beta.g.ldp : num  0 0 0
 $ beta.dp.ldp: num  1 1 1
 $ beta.r.ldp : num  0 0 0
 $ sep        : num  -2 -2 -2
 $ lastdpr    : num  -3 -5 -6
 $ declinedpr : num  0 2 3
 $ sep        : num  -3 -3 -3
 $ beta.r.lr  : num  0 0 0
 $ beta.g.lg  : num  0 0 0
 $ beta.g.lr  : num  0 0 0
 $ beta.r.lg  : num  0 0 0

> ttt <- within(ttt, hello <- 22)

Error in `[<-.data.frame`(`*tmp*`, nl, value = list(hello = 22, .mn.r = c(0,  : 
  duplicate subscripts for columns
> ## make it work
> xxx <- ttt[,1:ncol(ttt)]
> xxx <- within(xxx, hello <- 22)

I have no idea what could be causing this, which is also why I can't shorten the example (e.g., by removing columns).

Asked Jan 18 at 19:27 by ivo Welch; edited Jan 19 at 4:25 by jpsmith.

Comment (G. Grothendieck, Jan 19 at 13:45): Input in reproducible form:

ttt <- structure(list(
  .mn.r = c(0L, 0L, 0L), .sd.r = c(0L, 0L, 0L),
  .mn.g = c(0L, 0L, 0L), .sd.g = c(0L, 0L, 0L),
  .cor.r.g = c(1L, 1L, 1L), sep = c(-1L, -1L, -1L),
  beta.g.ldp = c(0L, 0L, 0L), beta.dp.ldp = c(1L, 1L, 1L),
  beta.r.ldp = c(0L, 0L, 0L), sep = c(-2L, -2L, -2L),
  lastdpr = c(-3L, -5L, -6L), declinedpr = c(0L, 2L, 3L),
  sep = c(-3L, -3L, -3L), beta.r.lr = c(0L, 0L, 0L),
  beta.g.lg = c(0L, 0L, 0L), beta.g.lr = c(0L, 0L, 0L),
  beta.r.lg = c(0L, 0L, 0L)),
  class = "data.frame", row.names = c("V5", "V6", "V7"))

1 Answer

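The column name sep is duplicated: it appears three times in ttt. As the error message shows, within() writes the variables back with `[<-.data.frame` by name, so repeated names end up pointing at the same column. Subsetting the data frame with ttt[, 1:ncol(ttt)] automatically makes the column names unique, which resolves the issue.

The following example builds a data frame with two identical column names. It reproduces your error, and subsetting the columns makes the names unique.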

df <- data.frame(a = 1, a = 2, check.names = FALSE)

within(df, hello <- 22)
# Error in `[<-.data.frame`(`*tmp*`, nl, value = list(hello = 22, a = 1,  : 
#  duplicate subscripts for columns

df[1:ncol(df)]
#   a a.1
# 1 1   2
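Before calling within(), it can help to check whether a data frame carries repeated names at all. The snippet below is a minimal sketch on a small stand-in for the question's data; the columns x and y and the variable dup_names are made up for illustration, and only the repeated sep name mirrors the original.

```r
# Hypothetical stand-in for ttt: the name "sep" occurs three times.
ttt <- data.frame(x = 1:3, sep = -1, y = 4:6, sep = -2, sep = -3,
                  check.names = FALSE)

# anyDuplicated() returns 0 when all names are unique, otherwise the
# index of the first repeated name.
has_dups <- anyDuplicated(names(ttt)) > 0

# duplicated() flags every repeat after the first occurrence.
dup_names <- unique(names(ttt)[duplicated(names(ttt))])

has_dups
dup_names
```

If has_dups is TRUE, dup_names lists the column names that need attention before within() is used.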

Explanation:

This behavior is documented in help(`[.data.frame`): when columns are selected, their names are transformed to be unique if necessary (e.g., if columns are selected more than once, or if more than one column of a given name is selected from a data frame with duplicate column names). The transformation is the one make.unique() performs. Also see help(make.names), which produces syntactically valid names and, with unique = TRUE, unique ones as well.

> make.unique(names(df))
[1] "a"   "a.1"
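Instead of relying on the subsetting side effect, the names can also be repaired explicitly. This is a sketch rather than the only fix; the variable names err and fixed are illustrative, and it assumes that renaming the repeated columns to a, a.1, ... is acceptable for your data.

```r
df <- data.frame(a = 1, a = 2, check.names = FALSE)

# With duplicate names, within() fails; capture the condition instead of stopping.
err <- tryCatch(within(df, hello <- 22), error = function(e) e)

# Repair the names in place, then call within() again.
fixed <- df
names(fixed) <- make.unique(names(fixed))   # "a" "a.1"
fixed <- within(fixed, hello <- 22)

names(fixed)
```

Renaming explicitly makes the intent visible in the code, whereas df[1:ncol(df)] achieves the same result only as a documented side effect of subsetting.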
