I am using R from SAS via proc iml. I want to import a dataset from SAS (in the example, sashelp.class) and print the first 5 rows.
options set=R_HOME="C:/PROGRA~1/R/R-44~1.2"; * Position of R_HOME found via command R.home() in R;
proc iml;
call ExportDataSetToR("sashelp.class", "dt");
submit / R;
print(dt[1:5,])
endsubmit;
quit;
This is the output I get:
The procedure works correctly, but the output is surprising (to me): for some reason, every value is preceded by "®ÿþ" and followed by "ÿþ". Inside these strange expressions, the correct values are stored, as can be verified with
proc print data = sashelp.class(obs = 5); run;
Upon further inspection, I have seen that this also happens with non-imported character data, while numeric data seems to work fine:
proc iml;
submit / R;
seq(0,10,1) # numeric variables work fine
c("Alice", "Bob", "Charles") # character do not
endsubmit;
quit;
In general, this problem is not too disruptive, but I would like to understand what could cause this behaviour. I thought it might be an encoding mismatch between SAS and R, but I could not find a way to correct it.
R session info:
- R version 4.4.2 (2024-10-31 ucrt)
- Platform: x86_64-w64-mingw32/x64
- Running under: Windows 11 x64 (build 26100)
SAS info:
- SAS (r) Proprietary Software 9.4 (TS1M7)
- SAS/IML 15.2
- What is the ENCODING setting for your SAS session? – Tom Commented Mar 3 at 13:49
- Hi! the encoding for the sas session is WLATIN1 – Luigi Commented Mar 3 at 14:16
- Try running the same test in SAS session started with encoding=UTF-8. If that doesn't work you probably need to check what encoding R is using. – Tom Commented Mar 3 at 14:22
- Mmh, interestingly enough, this did not solve it, although it seems that both SAS and R were using UTF-8 encoding – Luigi Commented Mar 3 at 15:03
- Make sure R on that machine is working as expected. Try running the same R code on that same server using that same R instance. – Tom Commented Mar 3 at 16:10
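For checking which encoding R is using, as suggested in the comments, a minimal diagnostic sketch follows. It assumes R_HOME is already set as in the question; the variable x is only an illustrative test vector. For what it is worth, "ÿþ" is how the bytes FF FE (the UTF-16 little-endian byte order mark) appear when displayed as Latin-1, which would be consistent with an encoding issue, although nothing in this thread confirms the cause.

```sas
proc iml;
submit / R;
# Locale and native encoding reported by the R session that SAS starts
print(Sys.getlocale("LC_CTYPE"))
print(l10n_info())

# Declared encoding and raw bytes of a small test vector (illustrative only)
x <- c("Alice", "Bob")
print(Encoding(x))
print(charToRaw(x[1]))
endsubmit;
quit;
```

If the raw bytes look like plain ASCII inside R, the extra characters are probably being added when the output is passed back to SAS rather than inside R itself.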
1 Answer
This might be because proc iml deals with matrices of one type only: numeric or character. The sashelp.class data set contains both numeric and character data types.
This blog post shows how to write multiple matrices to a single data set, which makes it possible to write both a numeric and a character matrix to one data set. I guess you would have to split sashelp.class into two data sets first to separate the character and numeric data.
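A rough sketch of that split, done directly in proc iml rather than through two intermediate data sets, might look like the following. It assumes R_HOME is set as in the question; the matrix names numMat and chrMat and the R names num_mat and chr_mat are made up for illustration. Whether this removes the stray characters is untested here, and the question reports the same markers on character vectors created directly in R, so it may well not.

```sas
proc iml;
/* Read numeric and character columns into separate matrices */
use sashelp.class;
read all var _NUM_  into numMat[colname=numNames];
read all var _CHAR_ into chrMat[colname=chrNames];
close sashelp.class;

/* Send each single-type matrix to R (the R-side names are hypothetical) */
call ExportMatrixToR(numMat, "num_mat");
call ExportMatrixToR(chrMat, "chr_mat");

submit / R;
print(num_mat[1:5, ])
print(chr_mat[1:5, ])
endsubmit;
quit;
```

Comparing the printed num_mat and chr_mat would at least show whether only the character matrix is affected.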