Rolling stepwise regression in R

The error you are getting is caused by missing values (NA) in the rolling data subsets.

Take data(swiss) as an example:

dim(swiss)

# [1] 47 6

split_swiss

length(split_swiss)

# [1] 47 ## the rolling split produces 47 data.frames

lapply(tail(split_swiss), head) # show the first 6 rows of the last 6 data.frames

[[1]]
             Fertility Agriculture Examination Education Catholic Infant.Mortality
Neuchatel         64.4        17.6          35        32    16.92             23.0
Val de Ruz        77.6        37.6          15         7     4.97             20.0
ValdeTravers      67.6        18.7          25         7     8.65             19.5
V. De Geneve      35.0         1.2          37        53    42.34             18.0
Rive Droite       44.7        46.6          16        29    50.43             18.2
Rive Gauche       42.8        27.7          22        29    58.33             19.3

[[2]]
             Fertility Agriculture Examination Education Catholic Infant.Mortality
Val de Ruz        77.6        37.6          15         7     4.97             20.0
ValdeTravers      67.6        18.7          25         7     8.65             19.5
V. De Geneve      35.0         1.2          37        53    42.34             18.0
Rive Droite       44.7        46.6          16        29    50.43             18.2
Rive Gauche       42.8        27.7          22        29    58.33             19.3
NA                  NA          NA          NA        NA       NA               NA

[[3]]
             Fertility Agriculture Examination Education Catholic Infant.Mortality
ValdeTravers      67.6        18.7          25         7     8.65             19.5
V. De Geneve      35.0         1.2          37        53    42.34             18.0
Rive Droite       44.7        46.6          16        29    50.43             18.2
Rive Gauche       42.8        27.7          22        29    58.33             19.3
NA                  NA          NA          NA        NA       NA               NA
NA.1                NA          NA          NA        NA       NA               NA

[[4]]
             Fertility Agriculture Examination Education Catholic Infant.Mortality
V. De Geneve      35.0         1.2          37        53    42.34             18.0
Rive Droite       44.7        46.6          16        29    50.43             18.2
Rive Gauche       42.8        27.7          22        29    58.33             19.3
NA                  NA          NA          NA        NA       NA               NA
NA.1                NA          NA          NA        NA       NA               NA
NA.2                NA          NA          NA        NA       NA               NA

[[5]]
             Fertility Agriculture Examination Education Catholic Infant.Mortality
Rive Droite       44.7        46.6          16        29    50.43             18.2
Rive Gauche       42.8        27.7          22        29    58.33             19.3
NA                  NA          NA          NA        NA       NA               NA
NA.1                NA          NA          NA        NA       NA               NA
NA.2                NA          NA          NA        NA       NA               NA
NA.3                NA          NA          NA        NA       NA               NA

[[6]]
             Fertility Agriculture Examination Education Catholic Infant.Mortality
Rive Gauche       42.8        27.7          22        29    58.33             19.3
NA                  NA          NA          NA        NA       NA               NA
NA.1                NA          NA          NA        NA       NA               NA
NA.2                NA          NA          NA        NA       NA               NA
NA.3                NA          NA          NA        NA       NA               NA
NA.4                NA          NA          NA        NA       NA               NA
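The assignment that built split_swiss is not shown above. The printed output is consistent with a 12-row window starting at every row of swiss, so the sketch below assumes that construction. The names width, split_swiss_sketch and n_complete are illustrative only. It counts how many NA-free rows each window holds:

```r
# Assumed reconstruction: one 12-row window starting at every row of swiss.
width <- 12
split_swiss_sketch <- lapply(seq_len(nrow(swiss)), function(i) {
  swiss[i:(i + width - 1), ]   # row indices past row 47 yield all-NA rows
})

# Number of complete (NA-free) rows in each window.
n_complete <- vapply(split_swiss_sketch,
                     function(d) sum(complete.cases(d)),
                     integer(1))

length(split_swiss_sketch)   # 47 windows
tail(n_complete)             # 6 5 4 3 2 1
sum(n_complete == width)     # 36 windows are fully populated
```

Under this assumption only the first 36 windows contain 12 real observations. The last 11 are padded with NA rows because the index runs past the end of the data.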

Running regsubsets on these data.frames raises an error, because the later subsets contain more predictors than complete cases.

lapply(split_swiss, function(x) regsubsets(Fertility ~., data=x, nvmax=10, method="forward"))

Error in leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, force.in = force.in, :

y and x different lengths

In addition: Warning messages:

1: In leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, force.in = force.in, :

1 linear dependencies found

......

Instead, I can keep only the subsets that have all 12 rows and run the regression on those, as follows:

split_swiss_2

lapply(split_swiss_2, function(x) regsubsets(Fertility ~., data=x, nvmax=10, method="forward"))
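The definition of split_swiss_2 is not shown, so here is one possible way to get the same effect: a minimal sketch that assumes 12-row windows and simply drops every window containing an NA row. The names windows, full_windows and fits are hypothetical. nvmax is set to 5 here because swiss has only five candidate predictors besides Fertility.

```r
library(leaps)   # provides regsubsets()

width <- 12
# Assumed window construction (the original assignment is not shown).
windows <- lapply(seq_len(nrow(swiss)), function(i) swiss[i:(i + width - 1), ])

# Keep only the windows in which every row is free of NA.
full_windows <- Filter(function(d) all(complete.cases(d)), windows)
length(full_windows)

# Forward selection on each fully populated window.
fits <- lapply(full_windows, function(d) {
  regsubsets(Fertility ~ ., data = d, nvmax = 5, method = "forward")
})

# Which terms enter the best one-variable model for the first window.
summary(fits[[1]])$which[1, ]
```

Filtering on complete.cases rather than hard-coding an index range keeps the sketch valid if the window width changes.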


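Copyright notice: This is an original article by weixin_42347778, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.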