ExtraTreeRegressor in scikit-learn (Python)
I have two questions about ExtraTreeRegressor in scikit-learn (Python).
1) Why is it not possible to increase the number of features above the dimension of the input space? The algorithm in [1] does not restrict the maximum number of features. In some cases, selecting a higher max_features can improve the results.
2) I want to use ExtraTreeRegressor in an implementation of fitted Q-iteration, so I run ExtraTreeRegressor within a loop (96 timesteps). First, I set max_features to 1 and plotted the MSE after every iteration (upper graph). Then I increased max_features to the dimension of the input space ('auto') and plotted the MSE again (lower graph). Why does the MSE increase in the latter case?
We would expect the MSE to be smaller for a larger value of max_features...
![The upper graph shows the MSE within the loop with max_features set to 1; the lower graph shows the MSE within the loop with max_features set to 'auto'][1]
Figure: http://imgur.com/aqgcveu
[1] P. Geurts, D. Ernst, and L. Wehenkel, "Extremely randomized trees", Machine Learning, 63(1), 3-42, 2006.
I believe the max_features parameter refers to the maximum number of features that can be considered when choosing a split. Setting it to n_features means each tree in the forest can pick from all n_features at every split, which can result in overfitting, since every tree is seeing the same features (the opposite of what you want in bagged tree algorithms). To improve your diagnostics, plot the training and testing error over a range of max_features values. You should see a sweet spot where the model complexity captures both training and testing error without overfitting.
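A minimal sketch of that diagnostic, assuming synthetic data from `make_regression` in place of your fitted Q-iteration samples. The dataset, `n_estimators=50`, and `min_samples_leaf=3` are illustrative choices of mine, not values from the question:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 6 input features, 4 of them informative.
X, y = make_regression(
    n_samples=300, n_features=6, n_informative=4, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Sweep max_features from 1 up to the input dimension and record both errors.
results = []
for k in range(1, X.shape[1] + 1):
    model = ExtraTreesRegressor(
        n_estimators=50, max_features=k, min_samples_leaf=3, random_state=0
    )
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results.append((k, train_mse, test_mse))

for k, train_mse, test_mse in results:
    print(f"max_features={k}: train MSE={train_mse:.1f}, test MSE={test_mse:.1f}")
```

Plotting the second and third columns of `results` against the first gives the two curves to compare. On your own data the shape may differ, so treat the numbers here only as a template.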
To get more features than the number of features in the data, you could build a pipeline that applies a random projection into a higher-dimensional space and then fits the model in the new space. I don't believe ExtraTreesRegressor has this functionality by default, since sklearn has Pipeline objects that can do this.
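A rough sketch of such a pipeline, assuming `GaussianRandomProjection` as the projection step (my choice; any transformer that expands the feature space would fit the same slot). The data, the step names `"project"` and `"trees"`, and the values 20 and 10 are made up for illustration:

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.pipeline import Pipeline
from sklearn.random_projection import GaussianRandomProjection

# Synthetic stand-in data (assumption): only 4 input features.
X, y = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)

pipeline = Pipeline(
    [
        # Project the 4 inputs onto 20 random linear combinations.
        ("project", GaussianRandomProjection(n_components=20, random_state=0)),
        # max_features=10 exceeds the original input dimension (4),
        # but is valid in the 20-dimensional projected space.
        ("trees", ExtraTreesRegressor(n_estimators=50, max_features=10, random_state=0)),
    ]
)

with warnings.catch_warnings():
    # scikit-learn warns when n_components exceeds n_features, because the
    # projection no longer reduces dimensionality; that is intended here.
    warnings.simplefilter("ignore")
    pipeline.fit(X, y)
    predictions = pipeline.predict(X)

print(predictions.shape)
print(pipeline.named_steps["trees"].n_features_in_)
```

Note that a random projection is linear, so the new columns carry no information that was not already in the original features. Whether splitting on them helps is something to check with a held-out set rather than assume.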
 python scikit-learn regression 
 