r - Identify different strings at different time frames

I have a large dataset of patients and their medication at successive examinations. Medication1 and Dose1 give the medication and its dosage at baseline, Medication2 and Dose2 at the second examination, and so on. Both the medication and its dosage can change over time for a given patient. I want to find out which patients had a change of medication, and which patients had a change of dosage over time while staying on the same medication.

ID <- c(1, 2, 3)
Medication1 <- c("Ciprofloxacin", "Amoxicillin", "Penicillin")
Dose1 <- c(10, 10, 10)
Medication2 <- c("Ciprofloxacin", "Benzylpenicillin", "Penicillin")
Dose2 <- c(20, 10, 10)
Medication3 <- c("Ciprofloxacin", "Benzylpenicillin", "Penicillin")
Dose3 <- c(10, 15, 10)
ID_MED <- data.frame(ID, Medication1, Dose1, Medication2, Dose2, Medication3, Dose3)

So, for a change of dosage within the same medication, I would expect to get Patient 1 - Ciprofloxacin and Patient 2 - Benzylpenicillin. For a change of medication, I would expect to get only Patient 2.

I guess one solution could be to identify differing strings across several columns per ID, but I do not know how to do that. Could you please help me out?

Asked 6 hours ago by USER12345

1 Answer

You've asked two separate questions whose results do not combine naturally into a single table. One way to answer both is to pivot the data to long form and then compute each answer with summarise().

library(tidyr)
library(dplyr)
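
Before summarising, it may help to look at the long form on its own. The following is a minimal sketch using the example data from the question; the name `ID_MED_long` is only an illustrative choice and does not appear in the original answer. It should give one row per patient and examination (nine rows here), with the columns ID, exam, Medication and Dose.

```r
library(tidyr)
library(dplyr)

# Same example data as in the question.
ID_MED <- data.frame(
  ID = c(1, 2, 3),
  Medication1 = c("Ciprofloxacin", "Amoxicillin", "Penicillin"),
  Dose1 = c(10, 10, 10),
  Medication2 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose2 = c(20, 10, 10),
  Medication3 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose3 = c(10, 15, 10)
)

# names_pattern splits each column name into a non-digit part and a digit part:
# "Medication1" becomes the output column "Medication" (via ".value")
# and the examination number "1" (stored as text in "exam").
ID_MED_long <- pivot_longer(
  ID_MED, -ID,
  names_to = c(".value", "exam"),
  names_pattern = "(\\D+)(\\d+)"
)

ID_MED_long
```

Once the data are in this shape, both questions reduce to a grouped comparison of rows rather than a comparison across columns.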

Change of medication:

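# Note: .by needs dplyr >= 1.1.0 and the native pipe |> needs R >= 4.1.
# names_pattern splits e.g. "Medication1" into the column name and the exam number.
pivot_longer(ID_MED, -ID, names_to=c(".value", "exam"),
             names_pattern="(\\D+)(\\d+)") |>
  # TRUE if the medication differs from the previous exam at least once
  summarise(med.change=any(Medication!=lag(Medication), na.rm=TRUE), .by=ID)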

# A tibble: 3 × 2
     ID med.change
  <dbl> <lgl>     
1     1 FALSE     
2     2 TRUE      
3     3 FALSE

Change of dose within the same medication:

pivot_longer(ID_MED, -ID, names_to=c(".value", "exam"),
             names_pattern="(\\D+)(\\d+)") |>
  # per patient and medication: more than one exam and a non-constant dose
  summarise(dose.change=n()>1 & var(Dose)>0, .by=c(ID, Medication))

# A tibble: 4 × 3
     ID Medication       dose.change
  <dbl> <chr>            <lgl>      
1     1 Ciprofloxacin    TRUE       
2     2 Amoxicillin      FALSE         
3     2 Benzylpenicillin TRUE       
4     3 Penicillin       FALSE
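
If only the affected patients are wanted, as in the expected output described in the question, the summaries can be filtered on their logical column. This is a sketch rather than part of the original answer: the names `long`, `med_changed` and `dose_changed` are hypothetical, and the dose check uses `n_distinct(Dose) > 1` in place of the variance test, which assumes there are no missing doses.

```r
library(tidyr)
library(dplyr)

# Same example data as in the question.
ID_MED <- data.frame(
  ID = c(1, 2, 3),
  Medication1 = c("Ciprofloxacin", "Amoxicillin", "Penicillin"),
  Dose1 = c(10, 10, 10),
  Medication2 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose2 = c(20, 10, 10),
  Medication3 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose3 = c(10, 15, 10)
)

# Pivot once and reuse the long form for both questions.
long <- pivot_longer(
  ID_MED, -ID,
  names_to = c(".value", "exam"),
  names_pattern = "(\\D+)(\\d+)"
)

# Patients whose medication differs from the previous exam at least once.
med_changed <- long |>
  summarise(med.change = any(Medication != lag(Medication), na.rm = TRUE),
            .by = ID) |>
  filter(med.change)

# Patient/medication pairs with more than one distinct dose
# (assumption: Dose contains no NA values).
dose_changed <- long |>
  summarise(dose.change = n_distinct(Dose) > 1, .by = c(ID, Medication)) |>
  filter(dose.change)

med_changed
dose_changed
```

With the example data, `med_changed` keeps only patient 2, and `dose_changed` keeps patient 1 with Ciprofloxacin and patient 2 with Benzylpenicillin.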

Data:

  ID   Medication1 Dose1      Medication2 Dose2      Medication3 Dose3
1  1 Ciprofloxacin    10    Ciprofloxacin    20    Ciprofloxacin    10
2  2   Amoxicillin    10 Benzylpenicillin    10 Benzylpenicillin    15
3  3    Penicillin    10       Penicillin    10       Penicillin    10

#dput()
ID_MED <- structure(list(ID = c(1, 2, 3), Medication1 = c("Ciprofloxacin", 
"Amoxicillin", "Penicillin"), Dose1 = c(10, 10, 10), Medication2 = c("Ciprofloxacin", 
"Benzylpenicillin", "Penicillin"), Dose2 = c(20, 10, 10), Medication3 = c("Ciprofloxacin", 
"Benzylpenicillin", "Penicillin"), Dose3 = c(10, 15, 10)), class = "data.frame", row.names = c(NA, 
-3L))
