python - Adding words to scikit-learn's CountVectorizer's stop list


scikit-learn's CountVectorizer class lets you pass the string 'english' to the stop_words argument. I want to add some words of my own to this predefined list. Can anyone tell me how to do this?

According to the source code for sklearn.feature_extraction.text, the full list of ENGLISH_STOP_WORDS (actually a frozenset, defined in stop_words) is exposed through __all__. Hence, if you want to use that list plus some more items, you could do something like:

    from sklearn.feature_extraction import text
    stop_words = text.ENGLISH_STOP_WORDS.union(my_additional_stop_words)

(where my_additional_stop_words is any sequence of strings) and use the result as the stop_words argument. This input to CountVectorizer.__init__ is parsed by _check_stop_list, which passes the new frozenset straight through.
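For completeness, here is a minimal end-to-end sketch of the approach above. The extra words and the sample documents are made up for illustration. Converting the combined set to a list is an assumption on my part: as far as I know, recent scikit-learn releases validate that stop_words is either 'english', a list, or None, so a list should work on both older and newer versions.

```python
from sklearn.feature_extraction import text
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical extra stop words, chosen only for illustration.
my_additional_stop_words = ["foo", "bar"]

# ENGLISH_STOP_WORDS is a frozenset, so union() returns a new frozenset
# and leaves the built-in set untouched.
stop_words = text.ENGLISH_STOP_WORDS.union(my_additional_stop_words)

# Assumption: pass a list rather than the frozenset itself, since newer
# scikit-learn versions appear to reject non-list collections here.
vectorizer = CountVectorizer(stop_words=list(stop_words))

# Made-up sample documents.
docs = ["the foo jumped over the bar", "a quick brown fox"]
vectorizer.fit(docs)

# The learned vocabulary should contain neither the built-in
# nor the custom stop words.
print(sorted(vectorizer.vocabulary_))
```

Because union() builds a new set, the built-in English list is not modified, and other vectorizers that use stop_words='english' are unaffected.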

python scikit-learn stop-words
