A Comparative Study of TF-IDF and Count Vectorizer under Random State Changes in a Random Forest Classifier for Emotion Detection
Issued Date
2026-01-01
ISSN
22414487
eISSN
17928036
Scopus ID
2-s2.0-105037642873
Journal Title
Engineering Technology and Applied Science Research
Volume
16
Issue
2
Start Page
33247
End Page
33252
Rights Holder(s)
SCOPUS
Bibliographic Citation
Engineering Technology and Applied Science Research Vol.16 No.2 (2026), 33247-33252
Suggested Citation
Kooptiwoot S., Kooptiwoot S. A Comparative Study of TF-IDF and Count Vectorizer under Random State Changes in a Random Forest Classifier for Emotion Detection. Engineering Technology and Applied Science Research Vol.16 No.2 (2026), 33247-33252. doi:10.48084/etasr.16158 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/116645
Title
A Comparative Study of TF-IDF and Count Vectorizer under Random State Changes in a Random Forest Classifier for Emotion Detection
Abstract
In machine learning, parameter settings affect model accuracy. Text-based emotion detection requires stable and accurate models, which makes parameter choices such as the random state increasingly important. Previous studies commonly set the random state to 42, on the claim that this value gives the best accuracy. This study examined random state values from 1 to 720 and observed the resulting accuracy. An emotion detection dataset was classified with a Random Forest (RF) classifier using two vectorizers, TF-IDF and Count. The results show that different random state settings affect model accuracy. On the training subset, the TF-IDF vectorizer offered higher and more stable accuracy than the Count vectorizer. However, the Count vectorizer achieved higher accuracy on both the validation and test sets.
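
The comparison described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' code. The toy corpus, the helper name `accuracy_by_seed`, the number of trees, the fixed train/test split, and the choice to vary only the classifier's `random_state` are all assumptions made for illustration. The record does not say which component's random state was varied, and the study also used a validation set that is omitted here.

```python
from statistics import mean, pstdev

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy placeholder corpus (hypothetical; NOT the dataset used in the study).
texts = [
    "i am so happy today", "what a joyful and happy day",
    "this makes me smile with joy", "happy and glad to be here",
    "i feel sad and lonely", "such a sad gloomy evening",
    "tears and sorrow fill me", "lonely and sad again tonight",
    "i am angry at this mess", "this makes me furious and angry",
    "so much rage and anger", "angry about the rude reply",
]
labels = ["joy"] * 4 + ["sadness"] * 4 + ["anger"] * 4


def accuracy_by_seed(vectorizer_cls, seeds, texts, labels):
    """Return {seed: test accuracy} for one vectorizer class.

    Assumption: the train/test split is held fixed (random_state=0) and
    only the Random Forest's random_state changes between runs.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, random_state=0, stratify=labels
    )
    scores = {}
    for seed in seeds:
        # Fit the vectorizer on training text only, then the forest.
        model = make_pipeline(
            vectorizer_cls(),
            RandomForestClassifier(n_estimators=25, random_state=seed),
        )
        model.fit(X_train, y_train)
        scores[seed] = model.score(X_test, y_test)
    return scores


# The study scanned seeds 1..720; a short range keeps the sketch fast.
for name, cls in [("TF-IDF", TfidfVectorizer), ("Count", CountVectorizer)]:
    scores = accuracy_by_seed(cls, range(1, 11), texts, labels)
    values = list(scores.values())
    print(f"{name}: mean={mean(values):.3f} std={pstdev(values):.3f}")
```

The mean summarizes accuracy across seeds and the standard deviation gives a rough measure of stability. On a corpus this small the numbers carry no meaning, so the sketch shows only the shape of the experiment.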
