Which vectorizer should I use?
I am currently using TfidfVectorizer, which is supposed to lower the weight of common words. Should I switch to a different vectorizer and use nltk to remove common words instead?
- https://stackoverflow.com/questions/9953619/technique-to-remove-common-wordsand-their-plural-versions-from-a-string
- https://www.geeksforgeeks.org/removing-stop-words-nltk-python/
- https://towardsdatascience.com/very-simple-python-script-for-extracting-most-common-words-from-a-story-1e3570d0b9d0
- https://machinelearningmastery.com/clean-text-machine-learning-python/
Edited by Marek Lovčí