r - How to find the date a categorical variable was last active? - Stack Overflow

I have this data frame and I want to add a column that gives, for each row, the date on which its category was previously active.

DF <- data.frame(
  Date = rep(c("10-12-2024", "10-17-2024", "10-19-2024"), c(4L, 2L, 2L)),
  category = c("Red", "Red", "Blue", "Blue", "Blue", "Blue", "Red", "Blue")
)

Output:

Date category
10-12-2024 Red
10-12-2024 Red
10-12-2024 Blue
10-12-2024 Blue
10-17-2024 Blue
10-17-2024 Blue
10-19-2024 Red
10-19-2024 Blue

So I would like a third column giving the previous active day for that category. I expect the output to look like this:

Date        category  previous active day
10-12-2024  Red
10-12-2024  Red
10-12-2024  Blue
10-12-2024  Blue
10-17-2024  Blue      10-12-2024
10-17-2024  Blue      10-12-2024
10-19-2024  Red       10-12-2024
10-19-2024  Blue      10-17-2024
asked Jan 31 at 13:54 by FPiper
  • Could you edit your question to include the code you attempted and why it didn't work? It's easier to help (and for you to proceed independently) if we build off of your current framework. – jpsmith Commented Jan 31 at 13:59
  • This appears to be a variation of with(DF, ave(Date, category, FUN = \(x) c(rep("", 1), head(x, -1)))), which is a simple lag by group variable. You will find tons of answers to questions like this. I currently have no time to search for dupes, but there will be quite a few. What have you searched? Did you try to adapt one? You might need to wrap it in some sort of consecutive_id() if you do not want entries when the category is repeated. – Friede Commented Jan 31 at 14:00
  • I answered this in a comment in your prior post: stackoverflow/questions/79355614/… – G. Grothendieck Commented Jan 31 at 14:18
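
The plain grouped lag from the comment above returns the date of the previous row in the same category, which for a repeated day is that same day rather than an earlier one. A base R sketch that lags over each category's distinct dates instead might look like this. The helper name prev_date and the column name previous_active_day are made up for illustration, and the sketch assumes the rows are already in date order within each category.

```r
DF <- data.frame(
  Date = rep(c("10-12-2024", "10-17-2024", "10-19-2024"), c(4L, 2L, 2L)),
  category = c("Red", "Red", "Blue", "Blue", "Blue", "Blue", "Red", "Blue")
)

# Plain lag within each category: every row gets the Date of the previous row
# of the same category, even when that row falls on the same day.
simple_lag <- with(DF, ave(Date, category, FUN = \(x) c("", head(x, -1))))

# Hypothetical helper: lag over the distinct dates of one category.
prev_date <- function(x) {
  u <- unique(x)                      # distinct active dates, in order of appearance
  c(NA_character_, u)[match(x, u)]    # the distinct date just before each row's date
}

# Apply the helper per category; ave() keeps the original row order.
DF$previous_active_day <- with(DF, ave(Date, category, FUN = prev_date))
DF
```

Rows on a category's first active day get NA here rather than an empty string; swap NA_character_ for "" if a blank is preferred.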

1 Answer

I would group by category and date and keep the first row of each group, which gives the unique dates on which each category was active. Then lag the date within each category and join the result back onto the original data.

library(dplyr)
DF <- data.frame(
  Date = rep(c("10-12-2024", "10-17-2024", "10-19-2024"), c(4L, 2L, 2L)),
  category = c("Red", "Red", "Blue", "Blue", "Blue", "Blue", "Red", "Blue")
)

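DF %>% 
  # keep one row per (Date, category) pair: the distinct active dates
  group_by(Date, category) %>% 
  slice_head(n=1) %>% 
  # within each category, the previous row holds the previous active date
  group_by(category) %>% 
  mutate(previous_active_date = lag(Date)) %>% 
  # join back so every original row picks up its category's previous date
  right_join(DF)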
#> Joining with `by = join_by(Date, category)`
#> # A tibble: 8 × 3
#> # Groups:   category [2]
#>   Date       category previous_active_date
#>   <chr>      <chr>    <chr>               
#> 1 10-12-2024 Blue     <NA>                
#> 2 10-12-2024 Blue     <NA>                
#> 3 10-12-2024 Red      <NA>                
#> 4 10-12-2024 Red      <NA>                
#> 5 10-17-2024 Blue     10-12-2024          
#> 6 10-17-2024 Blue     10-12-2024          
#> 7 10-19-2024 Blue     10-17-2024          
#> 8 10-19-2024 Red      10-12-2024

Created on 2025-01-31 with reprex v2.1.1
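
One caveat: Date is stored as "mm-dd-yyyy" text, so group_by() orders it alphabetically. That matches calendar order in this example but would not across different years. A variant that parses the dates first, assuming every value follows the %m-%d-%Y format, and uses left_join() so the original row order is kept, could be sketched as follows (DF_dates, lookup and result are illustrative names):

```r
library(dplyr)

DF <- data.frame(
  Date = rep(c("10-12-2024", "10-17-2024", "10-19-2024"), c(4L, 2L, 2L)),
  category = c("Red", "Red", "Blue", "Blue", "Blue", "Blue", "Red", "Blue")
)

# Parse the text dates so that sorting is chronological
# (assumption: all values are month-day-year).
DF_dates <- DF %>%
  mutate(Date = as.Date(Date, format = "%m-%d-%Y"))

# One row per category and active date, with the previous active date attached.
lookup <- DF_dates %>%
  distinct(category, Date) %>%
  arrange(category, Date) %>%
  group_by(category) %>%
  mutate(previous_active_date = lag(Date)) %>%
  ungroup()

# left_join() keeps the rows of DF_dates in their original order.
result <- DF_dates %>%
  left_join(lookup, by = c("category", "Date"))
result
```

The result is ungrouped and its date columns are of class Date; use format(Date, "%m-%d-%Y") if the original text layout is needed for display.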
