Break list into sub-lists at indices in R - Stack Overflow

I have a list lst of objects, say

[1,2,3,4,5,6,7,8,9]

and I want to break the list into a list() of sublists, divided by the appearance of every item in items say [3,6,9] so that I get

list(3:[1,2,3],6:[4,5,6],9:[7,8,9])

In other words, I want a function that is similar to strsplit() but for lists instead of strings.

asked Feb 14 at 23:13 by therickster
  • I am confused, [1,2,3,4,5,6,7,8,9] is not a list in R. Can you give a reproducible example please? – Axeman Commented Feb 14 at 23:18
  • Sorry, I mean array to list. c(1,2,3,4,5,6,7,8,9) --> list with keys c(3,6,9) and values c(1,2,3) c(4,5,6) c(7,8,9) – therickster Commented Feb 14 at 23:18
  • That is also not array notation in R. Can you please edit the question with a reproducible example? Why would e.g. "key" 6 relate to the items with index 4, 5 and 6? – Axeman Commented Feb 14 at 23:19
  • I've edited it. – therickster Commented Feb 14 at 23:20
  • What exactly do you mean by sub-list? This word suggests that you need to create several lists and another list with the element of the list type. Is that what you mean? What have you tried so far? What's the problem? Please see: How to create a Minimal, Reproducible Example. – Sergey A Kryukov Commented Feb 14 at 23:48
3 Answers

Assuming your data are both vectors, as implied in the question:

lst <- 1:9
items <- c(3, 6, 9)

Then you can achieve your desired result using split.

split(lst[seq(max(items))], rep(items, c(items[1], diff(items))))
#> $`3`
#> [1] 1 2 3
#> 
#> $`6`
#> [1] 4 5 6
#> 
#> $`9`
#> [1] 7 8 9
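
To see why the one-liner above works, it may help to build the grouping vector step by step. This is only a sketch of the same call broken apart; the intermediate names sizes, groups and result are introduced here for illustration and are not part of the original answer.

```r
lst <- 1:9
items <- c(3, 6, 9)

# Size of each group: the first break, then the gaps between successive breaks
sizes <- c(items[1], diff(items))

# Repeat each break once per element of its group, giving one label per element
groups <- rep(items, sizes)

# Keep only the elements up to the last break, then split on the labels
result <- split(lst[seq(max(items))], groups)
print(groups)
print(result)
```

Note that this approach treats items as positions in lst, not as values to look for, so elements after the last position are dropped by the lst[seq(max(items))] subset.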

And to show this generalizes, let's use letters in our vector with some different indices:

lst <- letters
items <- c(8, 12, 24)

split(lst[seq(max(items))], rep(items, c(items[1], diff(items))))
#> $`8`
#> [1] "a" "b" "c" "d" "e" "f" "g" "h"
#> 
#> $`12`
#> [1] "i" "j" "k" "l"
#> 
#> $`24`
#>  [1] "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x"

Created on 2025-02-14 with reprex v2.1.0

1) Let ok be a logical vector the same length as x indicating whether the corresponding element of x is in y. Then cumsum(ok) - ok + 1 gives, for each element of x, the index of the element of y that ends its group, so index into y with it and split on the result.

x is sorted in the question, but this code does not impose that restriction. The elements in y should appear in the same order that they appear in x. Also, each element of y should appear at most once in x; otherwise, it does not make sense to call y a "key". See (2) below for a way to relax that.

This also works for x and y being character vectors.

x <- 1:9
y <- c(3, 6, 9)

ok <- x %in% y
split(x, y[ cumsum(ok) - ok + 1 ] )

giving

$`3`
[1] 1 2 3

$`6`
[1] 4 5 6

$`9`
[1] 7 8 9
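
The index arithmetic in (1) can be checked by printing the intermediate vectors. The following is a small sketch on the same data; the name idx is introduced here only for illustration.

```r
x <- 1:9
y <- c(3, 6, 9)

# TRUE wherever an element of x is one of the delimiters in y
ok <- x %in% y

# cumsum(ok) counts delimiters seen so far, including the current one;
# subtracting ok keeps a delimiter in the group it closes, and + 1 makes
# the result a 1-based index into y
idx <- cumsum(ok) - ok + 1

print(ok)
print(idx)
print(y[idx])
```

Here idx comes out as 1 1 1 2 2 2 3 3 3, so y[idx] labels every element of x with the delimiter that ends its group.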

2) If the delimiters in y can be repeated in x, then y is no longer a "key" vector as described in the poster's comments below the question. If that situation is still of interest, replacing y[...] with just ... handles it. It no longer makes sense to use y for the output names, so "1", "2", "3", ... are used instead. This version also works when x does not end in an element of y: the portion after the last separator in x becomes the final element of the output.

x <- c("z", "b", "a", "q", "f", "r", "a", "s", "a", "a")
y <- "a"

ok <- x %in% y
split(x, cumsum(ok) - ok + 1 )

giving

$`1`
[1] "z" "b" "a"

$`2`
[1] "q" "f" "r" "a"

$`3`
[1] "s" "a"

$`4`
[1] "a"
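
Since the question asks for something like strsplit() for vectors, the expression in (2) could be wrapped in a small function. This is a minimal sketch, assuming the delimiters should stay at the end of each piece as in the output above; split_at is a hypothetical name, not a base R function.

```r
# split_at() is a hypothetical helper, not part of base R.
# It splits x after every element that appears in delims.
split_at <- function(x, delims) {
  ok <- x %in% delims
  # Drop the "1", "2", ... names, since they carry no information here
  unname(split(x, cumsum(ok) - ok + 1))
}

res1 <- split_at(c("z", "b", "a", "q", "f", "r", "a", "s", "a", "a"), "a")
# x does not end in a delimiter: the trailing "q" forms its own piece
res2 <- split_at(c("z", "b", "a", "q"), "a")
print(res1)
print(res2)
```

Unlike strsplit(), this sketch keeps each delimiter as the last element of its piece instead of discarding it.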

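You can try findInterval(). Add 1 to each of the desired breaks, because by default findInterval() uses intervals that are closed on the left and open on the right, so a value equal to a break would otherwise land in the next group.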

> split(1:9, findInterval(1:9, c(3, 6, 9) + 1))
$`0`
[1] 1 2 3

$`1`
[1] 4 5 6

$`2`
[1] 7 8 9

> breaks <- c(3, 6, 9)
> split(1:9, findInterval(1:9, breaks + 1)) |> setNames(breaks)
$`3`
[1] 1 2 3

$`6`
[1] 4 5 6

$`9`
[1] 7 8 9

> split(letters, findInterval(seq_along(letters), c(7, 14, 21) + 1))
$`0`
[1] "a" "b" "c" "d" "e" "f" "g"

$`1`
[1] "h" "i" "j" "k" "l" "m" "n"

$`2`
[1] "o" "p" "q" "r" "s" "t" "u"

$`3`
[1] "v" "w" "x" "y" "z"
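
The + 1 adjustment relies on the values being integers. As a possible alternative, findInterval() has a left.open argument that makes each interval open on the left and closed on the right, so the breaks can be passed unchanged. The sketch below is not from the original answer; the names grp, res, x2 and grp2 are used only for illustration.

```r
breaks <- c(3, 6, 9)

# With left.open = TRUE a value equal to a break stays in the group it ends
grp <- findInterval(1:9, breaks, left.open = TRUE)
res <- setNames(split(1:9, grp), breaks)
print(res)

# This also behaves sensibly for non-integer data, where breaks + 1 would
# put 3.5 in the first group
x2 <- c(0.5, 3, 3.5, 6, 8.2)
grp2 <- findInterval(x2, breaks, left.open = TRUE)
print(grp2)
```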