r - How to find optimal split of train and test to return the minimum RMSE for Boston housing data set without looping - Stack O

I'm trying to minimize the RMSE of a linear model on the Boston housing data set. This is a very basic baseline:

library(Metrics)
df <- MASS::Boston
# First 400 rows for training, remaining 106 rows for testing
train <- df[1:400, ]
test <- df[401:506, ]
# Regress medv on all other columns
Boston_lm <- lm(medv ~ ., data = train)
Boston_lm_RMSE <- Metrics::rmse(
  actual = test$medv,
  predicted = predict(object = Boston_lm, newdata = test)
)
# 6.155792

However, if the sizes of the train and test sets are changed, the RMSE is very different:

df <- MASS::Boston
# First 300 rows for training, remaining 206 rows for testing
train <- df[1:300, ]
test <- df[301:506, ]
Boston_lm <- lm(medv ~ ., data = train)
Boston_lm_RMSE <- Metrics::rmse(
  actual = test$medv,
  predicted = predict(object = Boston_lm, newdata = test)
)
# 19.13284

Is there a way to determine the train and test set sizes that return the lowest RMSE on the test data set, without looping through a range of possible values?
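
For comparison, here is a minimal sketch of the same model evaluated on a shuffled split and with k-fold cross-validation. It does not find an "optimal" split size. It only illustrates that both splits above take contiguous row ranges, so part of the gap between the two RMSE values may come from the row order of the data rather than from the split size. The seed, the choice of 10 folds, and the local `rmse` helper (assumed here to match `Metrics::rmse`) are illustrative assumptions, not part of the original code.

```r
# Sketch only: shuffled hold-out split and 10-fold cross-validation.
df <- MASS::Boston
set.seed(42)  # arbitrary seed, used only for reproducibility

# Local helper (assumption: same definition as Metrics::rmse)
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# 1. Shuffled 400/106 split instead of taking the first 400 rows
idx <- sample(nrow(df), 400)
fit <- lm(medv ~ ., data = df[idx, ])
shuffled_rmse <- rmse(df$medv[-idx], predict(fit, newdata = df[-idx, ]))

# 2. 10-fold cross-validation: every row is held out exactly once
k <- 10
fold <- sample(rep(seq_len(k), length.out = nrow(df)))
cv_rmse <- sapply(seq_len(k), function(i) {
  fit_i <- lm(medv ~ ., data = df[fold != i, ])
  rmse(df$medv[fold == i], predict(fit_i, newdata = df[fold == i, ]))
})

shuffled_rmse   # RMSE on the shuffled hold-out set
mean(cv_rmse)   # average RMSE across the 10 folds
```

The cross-validated average is usually a steadier estimate of out-of-sample error than any single split, because it does not depend on which rows happen to land in the test set.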
