I'm an R beginner and keep banging my head over this. Is there any way to get the below problem solved?
I have a ts object (called mg.ts) with values for all of my x variables (gdp for 10 countries) and y variables (consumption for 10 countries), and there are no NAs in this dataset. I'm trying to do a for loop to do multiple regressions to loop through each country's gdp - consumption pair.
I made 2 vectors each with the names of all my x variables and y variables and tried to run a nested for loop to run the regression and grab out the corresponding coefficients, but keep getting an error.
I also need to eventually store these coefficients in a dataframe to analyze later, so if anyone knows how to do that easily that would be helpful too.
Sample data (mg.ts):
X pakgdp belgdp pakcons belcons
2011 1 100 100 100 100
2012 2 102 102 103 103
2013 3 105 102 103 106
2014 4 110 102 106 108
2015 5 115 104 108 107
2016 6 120 105 111 108
2017 7 128 107 119 109
2018 8 133 109 125 111
My code:
x <- c("pakgdp", "belgdp", "usagdp",...)
y <- c("pakcons", "belcons", "usagdp",...)
for (i in x) {
  for (j in y) {
    as.formula(paste(i, "~", j))
    dynlm( j ~ i , data=mg.ts)
    summary(dynlm( j ~ i , data=mg.ts))
  }
}
This is the error I'm getting:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
In addition:
Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
asked Mar 3 at 15:12 by Rbeginner12345
1 Answer
Your main problem is essentially a typo: you're creating a formula with as.formula(), but then you're not using it in your dynlm call. Instead, you're writing j ~ i directly, so R takes j and i literally. They are the loop variables, which hold character strings rather than the data columns, so R doesn't know what to do with them.
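A minimal sketch of that fix, keeping the nested loop from the question. It assumes the response is consumption and the predictor is GDP (the question is ambiguous about the direction), uses only the two countries in the sample data, and the names rows and coefs are made up for illustration:

```r
library(dynlm)

# Sample data from the question, rebuilt as a multivariate ts object
dd <- read.table(header = TRUE, text = "
X pakgdp belgdp pakcons belcons
2011 1 100 100 100 100
2012 2 102 102 103 103
2013 3 105 102 103 106
2014 4 110 102 106 108
2015 5 115 104 108 107
2016 6 120 105 111 108
2017 7 128 107 119 109
2018 8 133 109 125 111
")
mg.ts <- ts(dd[, -1], start = 2011)

x <- c("pakgdp", "belgdp")    # predictors (GDP)
y <- c("pakcons", "belcons")  # responses (consumption); direction is an assumption

rows <- list()  # hypothetical container, one data frame per fit
for (i in x) {
  for (j in y) {
    # Build the formula from the column names and actually pass it to dynlm()
    fit <- dynlm(as.formula(paste(j, "~", i)), data = mg.ts)
    rows[[length(rows) + 1]] <- data.frame(
      response  = j,
      predictor = i,
      intercept = unname(coef(fit)[1]),
      slope     = unname(coef(fit)[2])
    )
  }
}

# One row per regression, ready for later analysis
coefs <- do.call(rbind, rows)
print(coefs)
```

Collecting one small data frame per fit and binding them at the end avoids growing a data frame inside the loop.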
However, I've written some code below that handles the two-way combinations of variables slightly more automatically and combines the results into a data frame.
library(dynlm)
library(broom)
library(purrr)
library(dplyr)
## create sample data
dd <- read.table(header = TRUE, text = "
X pakgdp belgdp pakcons belcons
2011 1 100 100 100 100
2012 2 102 102 103 103
2013 3 105 102 103 106
2014 4 110 102 106 108
2015 5 115 104 108 107
2016 6 120 105 111 108
2017 7 128 107 119 109
2018 8 133 109 125 111
")
tsdat <- ts(dd[,-1], start = 2011)
## fitting function
lmfun <- function(nm) {
  res <- dynlm(reformulate(nm[2], response = nm[1]), data = tsdat)
  ## force variable name evaluation
  res$call$formula <- eval(res$call$formula)
  res
}
## fit all combinations
lmfitlist <- combn(colnames(tsdat), m = 2, lmfun, simplify = FALSE)
## process fits/store as tibble
## set up data frame of response/predictor names
vars <- combn(colnames(tsdat), 2, rbind, simplify = FALSE) |>
  do.call(what = "rbind") |>
  as.data.frame() |>
  setNames(c("response", "predictor")) |>
  mutate(run = as.character(seq(n())), .before = 1)
purrr::map_dfr(lmfitlist, broom::tidy, .id = "run") |>
  dplyr::full_join(x = vars, by = "run") |>
  as_tibble()
# A tibble: 12 × 8
run response predictor term estimate std.error statistic p.value
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 pakgdp belgdp (Intercept) -297. 35.4 -8.40 0.000155
2 1 pakgdp belgdp belgdp 3.96 0.341 11.6 0.0000243
3 2 pakgdp pakcons (Intercept) -36.5 12.6 -2.90 0.0275
4 2 pakgdp pakcons pakcons 1.38 0.115 12.0 0.0000204
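The question asks for each country's own GDP/consumption pair rather than every two-way combination of columns. A sketch of that variant, assuming the two name vectors are listed in the same country order and that consumption is the response; gdp, cons, fit_pair and pair_coefs are names invented here, not part of the code above:

```r
library(dynlm)

# Same sample data as above
dd <- read.table(header = TRUE, text = "
X pakgdp belgdp pakcons belcons
2011 1 100 100 100 100
2012 2 102 102 103 103
2013 3 105 102 103 106
2014 4 110 102 106 108
2015 5 115 104 108 107
2016 6 120 105 111 108
2017 7 128 107 119 109
2018 8 133 109 125 111
")
tsdat <- ts(dd[, -1], start = 2011)

# Assumption: element k of each vector belongs to the same country
gdp  <- c("pakgdp", "belgdp")
cons <- c("pakcons", "belcons")

# Fit response ~ predictor and return the coefficients in long form
fit_pair <- function(resp, pred) {
  fit <- dynlm(reformulate(pred, response = resp), data = tsdat)
  data.frame(
    response  = resp,
    predictor = pred,
    term      = names(coef(fit)),
    estimate  = unname(coef(fit))
  )
}

# Map() walks the two vectors in parallel, so only matching pairs are fitted
pair_coefs <- do.call(rbind, Map(fit_pair, cons, gdp))
rownames(pair_coefs) <- NULL
print(pair_coefs)
```

With 10 countries this gives 10 regressions instead of every pairing of the 20 columns.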
Include the library() calls for the packages you use. Avoid using ... since that means we can't copy/paste the code for testing. Use dput() to share sample data. Use the formula you create inside your function: dynlm( as.formula(paste(i, "~", j)) , data=mg.ts). Do you want i~j or j~i? – MrFlick Commented Mar 3 at 15:49
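On the dput() suggestion: a small sketch of how sample data can be shared so that others can paste it straight into R. The three-row data frame here is a shortened stand-in for the real data, and dd_small, txt and rebuilt are hypothetical names:

```r
# A shortened stand-in for the real data (assumption: first three years only)
dd_small <- data.frame(
  pakgdp  = c(100, 102, 105),
  pakcons = c(100, 103, 103)
)

# dput() prints R code that recreates the object; paste that text into the question
dput(dd_small)

# Round trip: evaluating the printed text gives back the same data frame
txt <- capture.output(dput(dd_small))
rebuilt <- eval(parse(text = txt))
print(identical(rebuilt, dd_small))
```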