r - Identify different strings at different time frames - Stack Overflow

I have a large dataset of patients and their medication at successive examinations. Medication1 and Dose1 give the medication and its dosage at baseline, Medication2 and Dose2 at the second examination, and so on. Both the medication and the dosage of a patient can change over time. I want to find out which patients had a change of medication, and which patients had a change of dosage over time for the same medication.

ID <- c(1, 2, 3)
# Medication names are character strings, so they must be quoted
Medication1 <- c("Ciprofloxacin", "Amoxicillin", "Penicillin")
Dose1 <- c(10, 10, 10)
Medication2 <- c("Ciprofloxacin", "Benzylpenicillin", "Penicillin")
Dose2 <- c(20, 10, 10)
Medication3 <- c("Ciprofloxacin", "Benzylpenicillin", "Penicillin")
Dose3 <- c(10, 15, 10)
ID_MED <- data.frame(ID, Medication1, Dose1, Medication2, Dose2, Medication3, Dose3)

So, for different dosages of the same medication, I would expect Patient 1 - Ciprofloxacin and Patient 2 - Benzylpenicillin. For changes of medication, I would expect only Patient 2.

I guess one solution would be to find strings that differ between several columns per ID, but I do not know how to compare strings across several columns. Could you please help me out?


1 Answer

You've asked two separate questions whose results do not combine naturally into a single table. One way to answer both is to pivot the data to long form (one row per patient and examination) and then compute each answer with summarise().

library(tidyr)
library(dplyr)

Change of medication:

pivot_longer(ID_MED, -ID, names_to=c(".value", "exam"),
             names_pattern=c("(\\D+)(\\d+)")) |>
  summarise(med.change=any(Medication!=lag(Medication), na.rm=TRUE), .by=ID)

# A tibble: 3 × 2
     ID med.change
  <dbl> <lgl>     
1     1 FALSE     
2     2 TRUE      
3     3 FALSE
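
For comparison, the idea from the question (comparing the strings across several columns per ID) can also be sketched in base R without reshaping. This is only an illustrative alternative, not part of the answer above; the names med_cols, med_change and changed_ids are made up for the example, and it assumes every medication column starts with "Medication" and contains no missing values.

```r
# Example data from the question, rebuilt here so the sketch is self-contained
ID_MED <- data.frame(
  ID          = c(1, 2, 3),
  Medication1 = c("Ciprofloxacin", "Amoxicillin", "Penicillin"),
  Dose1       = c(10, 10, 10),
  Medication2 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose2       = c(20, 10, 10),
  Medication3 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose3       = c(10, 15, 10)
)

# Pick the medication columns by name (assumption: they share this prefix)
med_cols <- grep("^Medication", names(ID_MED), value = TRUE)

# A patient changed medication if the row holds more than one distinct name
med_change <- apply(ID_MED[med_cols], 1, function(x) length(unique(x)) > 1)

# IDs of the patients whose medication changed
changed_ids <- ID_MED$ID[med_change]
changed_ids
```

The row-wise check only says whether any change occurred; the long-form approach is easier to extend when you also need to know at which examination the change happened.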

Change of dose within the same medication:

pivot_longer(ID_MED, -ID, names_to=c(".value", "exam"),
             names_pattern=c("(\\D+)(\\d+)")) |>
  summarise(dose.change=n()>1 & var(Dose)>0, .by=c(ID, Medication))

# A tibble: 4 × 3
     ID Medication       dose.change
  <dbl> <chr>            <lgl>      
1     1 Ciprofloxacin    TRUE       
2     2 Amoxicillin      FALSE         
3     2 Benzylpenicillin TRUE       
4     3 Penicillin       FALSE
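
To get only the patient and medication pairs named in the question, the summary can be filtered down to the rows where the dose changed. The sketch below is a variation on the code above rather than the original answer: it swaps in n_distinct(Dose) > 1 for the n()/var() test (equivalent as long as no doses are missing), and the names long and dose_changed are chosen for the example. It assumes R >= 4.1 for the native pipe and dplyr >= 1.1.0 for the .by argument.

```r
library(tidyr)
library(dplyr)

# Example data from the question, rebuilt here so the sketch is self-contained
ID_MED <- data.frame(
  ID          = c(1, 2, 3),
  Medication1 = c("Ciprofloxacin", "Amoxicillin", "Penicillin"),
  Dose1       = c(10, 10, 10),
  Medication2 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose2       = c(20, 10, 10),
  Medication3 = c("Ciprofloxacin", "Benzylpenicillin", "Penicillin"),
  Dose3       = c(10, 15, 10)
)

# Long form: one row per patient and examination,
# with columns ID, exam, Medication and Dose
long <- pivot_longer(ID_MED, -ID, names_to = c(".value", "exam"),
                     names_pattern = "(\\D+)(\\d+)")

# Keep only the ID/medication pairs with more than one distinct dose
dose_changed <- long |>
  summarise(dose.change = n_distinct(Dose) > 1, .by = c(ID, Medication)) |>
  filter(dose.change)

dose_changed
```

With the example data this leaves two rows, patient 1 with Ciprofloxacin and patient 2 with Benzylpenicillin, which matches the result expected in the question.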

Data:

  ID   Medication1 Dose1      Medication2 Dose2      Medication3 Dose3
1  1 Ciprofloxacin    10    Ciprofloxacin    20    Ciprofloxacin    10
2  2   Amoxicillin    10 Benzylpenicillin    10 Benzylpenicillin    15
3  3    Penicillin    10       Penicillin    10       Penicillin    10

#dput()
ID_MED <- structure(list(ID = c(1, 2, 3), Medication1 = c("Ciprofloxacin", 
"Amoxicillin", "Penicillin"), Dose1 = c(10, 10, 10), Medication2 = c("Ciprofloxacin", 
"Benzylpenicillin", "Penicillin"), Dose2 = c(20, 10, 10), Medication3 = c("Ciprofloxacin", 
"Benzylpenicillin", "Penicillin"), Dose3 = c(10, 15, 10)), class = "data.frame", row.names = c(NA, 
-3L))