
Detect word in character variable text string and create variable based upon presence of that word SAS - Stack Overflow


Hello and sorry for the long title! I am working with data that has a long text string variable (some observations have up to ~2000 characters). A word (AB/CD) may appear anywhere within these strings. I am trying to detect AB/CD within the text string and create a binary variable (ABCD_present) that indicates whether the word appears in the text.

Below is some example data

data test;
length status $175;
infile datalines dsd dlm="|" truncover;
input ID Status$;

datalines;
1|This is example text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data AB/CD
2|This is example AB/CD text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data
3|This is example text I am using instead of real data. I AB/CD am making the length of this text longer to mimic the long text strings of my data
4|This is example text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data
5|This is example text I am using instead of real data. I am making the length of this text longer to mimic the long text strings of my data
6|This is example text I am using instead of real data. I am making the length of this text longer to AB/CD mimic the long text strings of my data
;
run;

Any guidance on this would be lovely! I do not have a ton of experience using long text strings.

Thank you in advance


Asked Nov 20, 2024 at 21:17 by Ryan

2 Answers

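You can use the FIND function. It returns the position of the first occurrence of the substring, or 0 if the substring is not found, so comparing the result with 0 gives a 1/0 flag.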

data want;
    set test;
    flag_abcd = (find(status, 'AB/CD') > 0);
run;

Result (Status abbreviated as ...):

Status  ID  flag_abcd
...     1   1
...     2   1
...     3   1
...     4   0
...     5   0
...     6   1
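
FIND is case-sensitive by default, so 'ab/cd' in lower case would not be flagged by the step above. If that matters for your data, the optional modifier argument can help. The following is a minimal sketch, not part of the original answer: the dataset names sample and want_ci and the three sample rows are made up for illustration, and the flag is named ABCD_present as in the question.

```sas
/* Hypothetical sample rows; the lower-case row is an assumption for illustration */
data sample;
    length status $200;
    infile datalines truncover;
    input status $200.;
    datalines;
notes mention AB/CD here
notes mention ab/cd in lower case
nothing relevant here
;
run;

data want_ci;
    set sample;
    /* The 'i' modifier makes FIND ignore case */
    ABCD_present = (find(status, 'AB/CD', 'i') > 0);
run;
```

With these rows, ABCD_present should be 1 for the first two observations and 0 for the third. Without the 'i' modifier, the second row would be 0.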

Two other functions that detect the presence of a substring are INDEX and PRXMATCH. Both return the position of the first match, or 0 if there is none. Inside a DATA step:

flag = index(status, 'AB/CD') > 0;
flag = prxmatch('m/AB\/CD/', status) > 0;
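
Since the question asks about a word, it may be worth noting that INDEX and PRXMATCH as written also match AB/CD when it is embedded in a longer token. The sketch below puts both statements into a complete DATA step and, as an addition not in the original answer, compares them with INDEXW, which only matches blank-delimited words by default. The dataset names sample2 and want2, the flag names, and the sample rows are hypothetical.

```sas
/* Hypothetical sample rows chosen to show the difference between the functions */
data sample2;
    length status $200;
    infile datalines truncover;
    input status $200.;
    datalines;
text with AB/CD as its own word
text with XAB/CDY embedded in a longer token
text without the term
;
run;

data want2;
    set sample2;
    /* Substring match anywhere in the string */
    flag_index = (index(status, 'AB/CD') > 0);
    /* Same substring match written as a regular expression */
    flag_prx = (prxmatch('/AB\/CD/', status) > 0);
    /* Match only when AB/CD stands alone as a blank-delimited word */
    flag_word = (indexw(status, 'AB/CD') > 0);
run;
```

With these rows, flag_index and flag_prx should be 1 for the first two observations, while flag_word should be 1 only for the first. All three flags are 0 for the last row. Which behavior you want depends on whether tokens such as XAB/CDY can occur in your text.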
