I have a large dataset of patients and their medication at successive examinations. Medication1 and Dose1 give the medication and its dosage at baseline, Medication2 and Dose2 at the second examination, and so on. Both the medication and the dosage of a patient can change over time. I want to identify which patients had a change of medication and which patients had a change of dosage over time for the same medication.
ID <- c(1, 2, 3)
Medication1 <- c("Ciprofloxacin", "Amoxicillin", "Penicillin")
Dose1 <- c(10, 10, 10)
Medication2 <- c("Ciprofloxacin", "Benzylpenicillin", "Penicillin")
Dose2 <- c(20, 10, 10)
Medication3 <- c("Ciprofloxacin", "Benzylpenicillin", "Penicillin")
Dose3 <- c(10, 15, 10)
ID_MED <- data.frame(ID, Medication1, Dose1, Medication2, Dose2, Medication3, Dose3)
So, for different dosages of the same medication, I would get Patient 1 - Ciprofloxacin and Patient 2 - Benzylpenicillin. For changes of medication, I would only get patient 2.
I guess one solution could be to identify differing strings across several columns per ID, but I do not know how to do that. Could you please help me out?
1 Answer
You've asked two separate questions whose results do not combine neatly into a single table. One way to answer both is to pivot the data to long form and compute each answer with summarise().
library(tidyr)
library(dplyr)
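To see what the pivot does before anything is summarised, here is a minimal sketch of the intermediate long format. The name long is only for illustration, and the example data is rebuilt inline so the snippet runs on its own.

```r
library(tidyr)

# Example data mirroring the question (rebuilt here so the snippet is self-contained)
ID_MED <- data.frame(
  ID = c(1, 2, 3),
  Medication1 = c("Ciprofloxacin", "Amoxicillin", "Penicillin"),
  Dose1 = c(10, 10, 10),
  Medication2 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose2 = c(20, 10, 10),
  Medication3 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose3 = c(10, 15, 10)
)

# Split each column name into a non-digit part and a digit part:
# the non-digit part (".value") becomes the output column name (Medication, Dose),
# the digit part is stored in a new column called "exam".
long <- pivot_longer(
  ID_MED, -ID,
  names_to = c(".value", "exam"),
  names_pattern = "(\\D+)(\\d+)"
)

long
```

Each patient now has one row per examination, with the columns ID, exam, Medication and Dose, so comparisons over time become grouped operations on Medication and Dose.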
Change of medication:
pivot_longer(ID_MED, -ID, names_to = c(".value", "exam"),
             names_pattern = "(\\D+)(\\d+)") |>
  summarise(med.change = any(Medication != lag(Medication), na.rm = TRUE), .by = ID)
# A tibble: 3 × 2
ID med.change
<dbl> <lgl>
1 1 FALSE
2 2 TRUE
3 3 FALSE
Change of dose within the same medication:
pivot_longer(ID_MED, -ID, names_to = c(".value", "exam"),
             names_pattern = "(\\D+)(\\d+)") |>
  summarise(dose.change = n() > 1 & var(Dose) > 0, .by = c(ID, Medication))
# A tibble: 4 × 3
ID Medication dose.change
<dbl> <chr> <lgl>
1 1 Ciprofloxacin TRUE
2 2 Amoxicillin FALSE
3 2 Benzylpenicillin TRUE
4 3 Penicillin FALSE
Data:
ID Medication1 Dose1 Medication2 Dose2 Medication3 Dose3
1 1 Ciprofloxacin 10 Ciprofloxacin 20 Ciprofloxacin 10
2 2 Amoxicillin 10 Benzylpenicillin 10 Benzylpenicillin 15
3 3 Penicillin 10 Penicillin 10 Penicillin 10
#dput()
ID_MED <- structure(list(ID = c(1, 2, 3), Medication1 = c("Ciprofloxacin",
"Amoxicillin", "Penicillin"), Dose1 = c(10, 10, 10), Medication2 = c("Ciprofloxacin",
"Benzylpenicillin", "Penicillin"), Dose2 = c(20, 10, 10), Medication3 = c("Ciprofloxacin",
"Benzylpenicillin", "Penicillin"), Dose3 = c(10, 15, 10)), class = "data.frame", row.names = c(NA,
-3L))
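As a rough alternative that follows the idea in the question (comparing strings across columns per ID), the medication check can also be sketched in base R without pivoting. The names med_cols, n_distinct_meds and med_change_ids are made up for illustration, and the sketch assumes every medication column is named Medication followed by a number. The data is rebuilt inline so the snippet runs on its own.

```r
# Example data mirroring the question (rebuilt here so the snippet is self-contained)
ID_MED <- data.frame(
  ID = c(1, 2, 3),
  Medication1 = c("Ciprofloxacin", "Amoxicillin", "Penicillin"),
  Dose1 = c(10, 10, 10),
  Medication2 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose2 = c(20, 10, 10),
  Medication3 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose3 = c(10, 15, 10)
)

# Pick out the medication columns by name (assumed naming scheme: Medication<number>)
med_cols <- grep("^Medication\\d+$", names(ID_MED), value = TRUE)

# Count the distinct non-missing medication names in each row
n_distinct_meds <- apply(
  ID_MED[med_cols], 1,
  function(x) length(unique(x[!is.na(x)]))
)

# Patients with more than one distinct medication had a change at some point
med_change_ids <- ID_MED$ID[n_distinct_meds > 1]
med_change_ids
```

Note that this counts distinct values rather than comparing consecutive examinations, so with missing entries it may flag patients that the lag()-based version does not.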