I have table which is formatted as such.

Name	Is this a first time client	Is this the last time he comes?
Client_1	Yes	No
Client_2	No	Yes
Client_1	No	No
Client_1	No	Yes
Client_3	Yes	No

I have table which is formatted as such.

Name	Is this a first time client	Is this the last time he comes?
Client_1	Yes	No
Client_2	No	Yes
Client_1	No	No
Client_1	No	Yes
Client_3	Yes	No

I need to keep only clients for which I have the entire history - where I see the first time they came, and the last time they came. For these clients, i need to keep all of their transactions. Here, I would need to keep all the transactions for client 1.

I want to remove the others - I don't have the first transaction of client 2, and the last transaction of client 3, so they do not interest me.

I can us Excel or R for this - but I have to say I have no idea how...

Share Improve this question asked Jan 30 at 8:36 Margaux 214 bronze badges

Any attempts you've made yourself so far? – Excellor Commented Jan 30 at 8:56
I've tried with a Pivot Table. I had a basic Pivot table - It showed client, the count of yes for first time deposit, and the count of yes for last time deposit. If I made the sum of these two, clients whose sum was superior to 2 meant that these were the ones I needed to keep. But I got stuck after this ! – Margaux Commented Jan 30 at 9:02

Add a comment |

4 Answers 4

Sorted by: Reset to default 3

You could use a combination of LET and FILTER to do it:

=LET(
    Table, A2:C8,
    Clients, INDEX(Table, , 1),
    FirstTime, INDEX(Table, , 2),
    LastTime, INDEX(Table, , 3),
    ValidClients, UNIQUE(FILTER(Clients, (COUNTIFS(Clients, Clients, FirstTime, "Yes") > 0) * (COUNTIFS(Clients, Clients, LastTime, "Yes") > 0))),
    FILTER(Table, ISNUMBER(MATCH(Clients, ValidClients, 0)))
)

An base R alternative might be

by(X, ~Name, subset, any(`Is this a first time client?`=='Yes') & 
     any(`Is this the last time he comes?`=='Yes')) |> 
  do.call(what='rbind') |> 
  `rownames<-`(NULL) # cosmetics

      Name Is this a first time client? Is this the last time he comes?
1 Client_1                          Yes                              No
2 Client_1                           No                              No
3 Client_1                           No                             Yes

X = structure(
  list(
    Name = c("Client_1", "Client_2", "Client_1", "Client_1", "Client_3"),
    `Is this a first time client?` = c("Yes", "No", "No", "No", "Yes"),
    `Is this the last time he comes?` = c("No", "Yes", "No", "Yes", "No")
  ),
  class = "data.frame",
  row.names = c(NA, -5L)
)

You can substitute by with tapply if you like.

You can try this in R like below

library(dplyr)
df %>%
  filter(
    any(`Is this a first time client?` == "Yes"),
    any(`Is this the last time he comes?` == "Yes"),
    .by = Name
  )

which gives

      Name Is this a first time client? Is this the last time he comes?
1 Client_1                          Yes                              No
2 Client_1                           No                              No
3 Client_1                           No                             Yes

Another option is using split

> (d <- split(df, ~Name))[sapply(d, \(x) all(colMeans(x[-1] == "Yes") > 0))]
$Client_1
      Name Is this a first time client? Is this the last time he comes?
1 Client_1                          Yes                              No
3 Client_1                           No                              No
4 Client_1                           No                             Yes

Data

df <- structure(list(Name = c("Client_1", "Client_2", "Client_1", "Client_1",
"Client_3"), `Is this a first time client?` = c("Yes", "No",
"No", "No", "Yes"), `Is this the last time he comes?` = c("No",
"Yes", "No", "Yes", "No")), class = "data.frame", row.names = c(NA, 
-5L))

I'm not sure if this is overkill, but here's my attempt:

=LET(_Data,A2:C6,
_Col1, CHOOSECOLS(_Data,1),
_Col2, CHOOSECOLS(_Data,2),
_Col3, CHOOSECOLS(_Data,3),
_Uni,UNIQUE(_Col1),
_Select,BYROW(_Uni,LAMBDA(a,IF(SUM((_Col1=a)*((_Col2="yes")+(_Col3="yes")))=2,a,""))),
_Drop,FILTER(_Select,_Select<>"","EMPTY"),
_Filter,FILTER(_Data,_Col1=_Drop),
_Filter)

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

r - How to select info based on different rowscriterias? - Stack Overflow

4 Answers 4

Data

与本文相关的文章

评论列表(0)