r - Error in str2lang(x) trying to use nls() within a function

I am trying to fit an exponential decay curve to some biological data using nls() within a function (I want to do this multiple times).

I receive an error with the way I am specifying nls() which I can't work out:

Error in str2lang(x) : <text>:2:0: unexpected end of input
1: ~ 
   ^

A small dummy dataset and the code I have written is below - any help would be greatly appreciated.

dat_small <-  structure(list(actual_time = c(5, 9, 30, 59, 119, 171, 216), 
                             activity = c(7158, 7386, 5496, 3884, 1502, 819, 409)), row.names = c(NA, 
                                                                                                  -7L), class = c("tbl_df", "tbl", "data.frame"))
curve_fit <-  function(x, y, df) {
  #browser()
  # Mono-exponential model for activity using actual sampling time
  mod <-  lm(log(df[[y]]) ~ df[[x]], data = df) # get starting values from log-linear model
  C0 <-  as.numeric(exp(mod$coef[1])) # exponential of log-linear intercept
  lambda1 <-  as.numeric(abs(mod$coef[2])) # absolute value of slope of log-linear model
  nls_mod_mono <-  nls(df[[y]] ~ C0*exp(-lambda1*df[[x]]), start = c(C0 = C0, lambda1 = lambda1), data = df)
  summary(nls_mod_mono) # estimate model
  xNew <- seq(0, 240, length.out = 100) # new grid of times
  yNew <- predict(nls_mod_mono, list(x = xNew)) # predict activity
  dfNew <-  data.frame(x = xNew, y = yNew)
  p_mono <-  ggplot(dfNew, aes(x = x, y = y)) +
    geom_line() +
    geom_point(data = df, aes(x = x, y = y), size = 3) +
    xlab("Actual Sampling Time (mins)") + ylab("Activity (Bq)") +
    theme_bw(base_size = 20)
  mono_half_life <-  0.693/as.numeric(coef(nls_mod_mono)[2]) # half-life
  mono_auc <-  as.numeric(coef(nls_mod_mono)[1])/as.numeric(coef(nls_mod_mono)[2]) # AUC to infinity (can be estimated in mono case by simply dividing the intercept by lambda)
  results <-  c(p_mono, mono_half_life, mono_auc)
}
curve_fit("actual_time", "activity", dat_small)

I am trying to fit an exponential decay curve to some biological data using nls() within a function (I want to do this multiple times).

I receive an error with the way I am specifying nls() which I can't work out:

Error in str2lang(x) : <text>:2:0: unexpected end of input
1: ~ 
   ^

A small dummy dataset and the code I have written is below - any help would be greatly appreciated.

dat_small <-  structure(list(actual_time = c(5, 9, 30, 59, 119, 171, 216), 
                             activity = c(7158, 7386, 5496, 3884, 1502, 819, 409)), row.names = c(NA, 
                                                                                                  -7L), class = c("tbl_df", "tbl", "data.frame"))
curve_fit <-  function(x, y, df) {
  #browser()
  # Mono-exponential model for activity using actual sampling time
  mod <-  lm(log(df[[y]]) ~ df[[x]], data = df) # get starting values from log-linear model
  C0 <-  as.numeric(exp(mod$coef[1])) # exponential of log-linear intercept
  lambda1 <-  as.numeric(abs(mod$coef[2])) # absolute value of slope of log-linear model
  nls_mod_mono <-  nls(df[[y]] ~ C0*exp(-lambda1*df[[x]]), start = c(C0 = C0, lambda1 = lambda1), data = df)
  summary(nls_mod_mono) # estimate model
  xNew <- seq(0, 240, length.out = 100) # new grid of times
  yNew <- predict(nls_mod_mono, list(x = xNew)) # predict activity
  dfNew <-  data.frame(x = xNew, y = yNew)
  p_mono <-  ggplot(dfNew, aes(x = x, y = y)) +
    geom_line() +
    geom_point(data = df, aes(x = x, y = y), size = 3) +
    xlab("Actual Sampling Time (mins)") + ylab("Activity (Bq)") +
    theme_bw(base_size = 20)
  mono_half_life <-  0.693/as.numeric(coef(nls_mod_mono)[2]) # half-life
  mono_auc <-  as.numeric(coef(nls_mod_mono)[1])/as.numeric(coef(nls_mod_mono)[2]) # AUC to infinity (can be estimated in mono case by simply dividing the intercept by lambda)
  results <-  c(p_mono, mono_half_life, mono_auc)
}
curve_fit("actual_time", "activity", dat_small)

Share Improve this question edited Jan 18 at 12:07 jay.sf 73.8k8 gold badges63 silver badges125 bronze badges asked Jan 18 at 9:54 LucaS 1,3151 gold badge13 silver badges25 bronze badges

Add a comment |

2 Answers 2

Sorted by: Reset to default 4

Questions to SO should have minimal code focused on the problem at hand. Here the question is how to avoid the error in the nls call so everything after that is not relevant and we omit it. (We do point out, however, that aes in ggplot2 accepts unquoted variables and since x and y are character write it like this aes(.data[[x]], .data[[y]]) .)

1) An easy fix is probably just to avoid the problem by defining a data frame with fixed x and y column names and then use that.

curve_fit1 <- function(x, y, df) {
  data <- data.frame(x = df[[x]], y = df[[y]])
  co <- coef(lm(log(y) ~ x, data))
  st <- c(C0 = exp(co[[1]]), lambda1 = -co[[2]])
  nls(y ~ C0 * exp(-lambda1 * x), data = data, start = st) 
}

curve_fit1("actual_time", "activity", dat_small)

giving

Nonlinear regression model
  model: y ~ C0 * exp(-lambda1 * x)
   data: data
       C0   lambda1 
8.003e+03 1.301e-02 
 residual sum-of-squares: 270020

Number of iterations to convergence: 4 
Achieved convergence tolerance: 4.228e-06

2) Even easier is to use the NLS.expoDecay self starting function in the statforbiology package in which case no starting value is needed. The output is as in (1).

library(statforbiology)

curve_fit2 <- function(x, y, df) {
  data <- data.frame(x = df[[x]], y = df[[y]])
  nls(y ~ NLS.expoDecay(x, C0, lambda1), data)
}

curve_fit2("actual_time", "activity", dat_small)

3) We could alternately use substitute to substitute the variables into the formula. This can apply to either (1) or (2). We will show it as applied to (2).

library(statforbiology)

curve_fit3 <- function(x, y, df) {
  fo <- substitute(y ~ NLS.expoDecay(x, C0, lambda1), 
    list(x = as.name(x), y = as.name(y)))
  nls(fo, df)
}

curve_fit3("actual_time", "activity", dat_small)

giving

Nonlinear regression model
  model: activity ~ NLS.expoDecay(actual_time, C0, lambda1)
   data: df
       C0   lambda1 
8.003e+03 1.301e-02 
 residual sum-of-squares: 270020

Number of iterations to convergence: 4 
Achieved convergence tolerance: 4.228e-06

4) Yet another altertnative is to use the drc package together with DRC.exponDecay from statforbiology. In this case the simple form of formula accepted allows us to use reformulate to generate it and furthermore the drc class object produced from drm has coef, plot and other methods -- try methods(class = "drc") .

library(drc)
library(statforbiology)

curve_fit4 <- function(x, y, df) {
  drm(reformulate(x, y), data = df, fct = DRC.expoDecay())
}

res <- curve_fit4("actual_time", "activity", dat_small)
plot(res)

Try a formula for the nls, in this case activity ~ C0 * exp(-lambda1 * actual_time).

You can use as.formula(paste(y, "~ C0*exp(-lambda1*", x, ")", sep = "")), or sprintf as shown below.

To name the variable in list in the newdata= argument of predict with the name you specified in x, use setNames.

Note, that using df[[x]] is possible, as ist's still used in lm. But then the data= argument is redundant.

The as.numeric can be confusing, as it is actually already "numeric". Use unname instead.

curve_fit <- function(x, y, df, plot=FALSE) {
 ## Mono-exponential model for activity using actual sampling time
 mod <- lm(log(df[[y]]) ~ df[[x]], data=df)  ## get starting values from log-linear model
 C0 <- unname(exp(mod$coef[1]))  ## exponential of log-linear intercept
 lambda1 <- unname(abs(mod$coef[2]))  ## absolute value of slope of log-linear model
 fo <- as.formula(sprintf('%s ~ C0*exp(-lambda1*%s)', y, x))
 nls_mod_mono <- nls(fo, start=c(C0=C0, lambda1=lambda1), data=df)
 xNew <- seq(0, 240, length.out=100)  ## new grid of times
 yNew <- predict(nls_mod_mono, newdata= list(xNew) |> setNames(x))  ## predict activity
 fit_df <- data.frame(x=xNew, y=yNew)
 mono_half_life <- 0.693/coef(nls_mod_mono)[2]  ## half-life
 mono_auc <- coef(nls_mod_mono)[1]/coef(nls_mod_mono)[2]  ## AUC to infinity (can be estimated in mono case by simply dividing the intercept by lambda)
 list(fit_df=fit_df, 
      mono_half_life=unname(mono_half_life), 
      mono_auc=unname(mono_auc))
}

> res <- curve_fit("actual_time", "activity", dat_small)
> res$mono_half_life
[1] 53.25588
> res$mono_auc
[1] 615006.6

I recommend plotting the curve outside the function to avoid feature-creeping.

par(mar=c(5, 4, 2, 2))
plot(y ~ x, res$fit_df, type='l', xlab='Actual Sampling Time (mins)',
     ylab="Activity (Bq)", las=1)

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

r - Error in str2lang(x) trying to use nls() within a function - Stack Overflow

2 Answers 2

与本文相关的文章

评论列表(0)