Nutri-score classification of snack products using word embedding and random forest
DOI: https://doi.org/10.59190/stc.v6i3.393
Keywords: Healthy Food, Word2Vec, GloVe, FastText, Random Forest
Abstract
The increasing consumption of packaged snack products has raised concerns about their nutritional quality and potential health impacts. Although nutritional information is commonly printed on food packaging, many consumers have difficulty interpreting ingredient descriptions and nutrition labels, which makes it hard to tell whether a product is healthy or unhealthy. An automated classification system can help consumers understand this information more effectively. This study proposes a text-based framework that classifies snack products as healthy or unhealthy using Natural Language Processing (NLP), word embedding techniques, and the Random Forest algorithm. The dataset was obtained from the Open Food Facts database and filtered to include snack products only. After preprocessing and class balancing, 465 samples were used for model development and evaluation. Preprocessing consisted of case folding, tokenization, stopword removal, and stemming. Three word embedding techniques (Word2Vec, GloVe, and FastText) were used to transform textual ingredient descriptions into numerical feature representations. Random Forest was then used as the classifier, and its performance was evaluated using accuracy, balanced accuracy, precision, recall, F1-score, and macro F1-score. GloVe achieved the best performance among the evaluated embedding methods, with an accuracy of 86.02%, balanced accuracy of 84.72%, precision of 85.98%, recall of 86.02%, F1-score of 85.91%, and macro F1-score of 85.19%. These results indicate that, on this dataset, GloVe provides a more effective semantic representation of food-related text than Word2Vec or FastText. Overall, the proposed framework demonstrates the potential of NLP-based approaches for automated nutritional assessment and healthy food classification.
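The steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the stopword list, the suffix-stripping rules, the toy vectors, and the function names `preprocess`, `mean_pool`, and `evaluate` are all hypothetical, and averaging token vectors into one feature vector per product is an assumed pooling choice that the abstract does not specify. The classifier itself is omitted. The `evaluate` function also shows that support-weighted recall always equals accuracy, which would be consistent with the identical 86.02% figures reported for both, although the abstract does not state which averaging scheme was used.

```python
import re
from collections import Counter

# Hypothetical stopword list and suffix rules, for illustration only.
STOPWORDS = {"and", "of", "the", "with", "in", "a"}
SUFFIXES = ("es", "s")


def preprocess(text):
    """Case folding, tokenization, stopword removal, crude suffix stemming."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in STOPWORDS]
    stemmed = []
    for t in tokens:
        for suf in SUFFIXES:
            # Strip at most one suffix, keeping a stem of at least 3 letters.
            if t.endswith(suf) and len(t) - len(suf) >= 3:
                t = t[: -len(suf)]
                break
        stemmed.append(t)
    return stemmed


def mean_pool(tokens, vectors, dim):
    """Average the vectors of known tokens (assumed pooling strategy)."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def evaluate(y_true, y_pred):
    """Accuracy, balanced accuracy, support-weighted recall, macro F1."""
    labels = sorted(set(y_true))
    n = len(y_true)
    support = Counter(y_true)
    recalls, f1s = {}, {}
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = support[c] - tp
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        recalls[c] = rec
        f1s[c] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        "balanced_accuracy": sum(recalls.values()) / len(labels),
        "weighted_recall": sum(recalls[c] * support[c] for c in labels) / n,
        "macro_f1": sum(f1s.values()) / len(labels),
    }


# Toy 2-dimensional vectors standing in for pretrained embeddings.
toy_vectors = {"sugar": [1.0, 0.0], "oil": [0.0, 1.0]}
tokens = preprocess("Sugar, Palm Oils and Potatoes")
features = mean_pool(tokens, toy_vectors, dim=2)
scores = evaluate([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
```

In a fuller pipeline, the pooled feature vectors would be passed to a Random Forest classifier, and the toy vectors would be replaced by a trained or pretrained Word2Vec, GloVe, or FastText model.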
License
Copyright (c) 2026 Onky Wanda Darmawan, Junadhi Junadhi, Lusiana Efrizoni, Nurjayadi Nurjayadi

This work is licensed under a Creative Commons Attribution 4.0 International License.