Fine-Tuned Transformer Models for Keyword Extraction in Skincare Recommendation Systems
DOI:
https://doi.org/10.30871/jaic.v9i3.9687
Keywords:
Keyword extraction, Recommendation Systems, skincare, Weighted Jaccard, nDCG
Abstract
The skincare industry in Indonesia is growing rapidly, with revenues projected to reach nearly 40 billion rupiah by 2024 and to keep rising. The large number of products on the market makes it difficult for consumers to find ones that suit their needs. A text-based recommendation system built on advances in Natural Language Processing (NLP) is a promising solution. This research develops a skincare product recommendation system driven by user needs. It applies a DistilBERT model, fine-tuned on text from the skincare recommendation domain, to perform keyword extraction. The extracted keywords then serve as parameters for recommendation, which uses co-occurrence together with a modified Jaccard similarity to assess how well the content and benefits of a product match user preferences. The best extraction model achieved a micro F1-score of 0.96 at the token level and an exact match rate of 74.25% at the entity level. In evaluation, the recommendation system achieved an nDCG of 0.96 and a user satisfaction rate (CSAT) of 91.9%.
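The two quantities named in the abstract, a weighted Jaccard score and nDCG, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how Jaccard similarity was modified, so the code assumes the common min/max (Ruzicka) form of weighted Jaccard and the standard log2-discounted nDCG. The function names and the example keyword weights are hypothetical.

```python
import math


def weighted_jaccard(user_weights, product_weights):
    """Weighted Jaccard (min/max form) between two keyword -> weight maps.

    Assumption: this is the standard Ruzicka similarity, not necessarily
    the modification used in the paper.
    """
    keys = set(user_weights) | set(product_weights)
    # Sum of element-wise minima over sum of element-wise maxima.
    overlap = sum(
        min(user_weights.get(k, 0.0), product_weights.get(k, 0.0)) for k in keys
    )
    union = sum(
        max(user_weights.get(k, 0.0), product_weights.get(k, 0.0)) for k in keys
    )
    return overlap / union if union else 0.0


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum(
        rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0


# Hypothetical keywords extracted from a user query and a product description.
user = {"acne": 1.0, "hydrating": 0.5}
product = {"acne": 1.0, "brightening": 0.5}
score = weighted_jaccard(user, product)

# Hypothetical relevance grades for a ranked recommendation list.
ranking_quality = ndcg([3, 2, 3, 0, 1])
```

In a pipeline of this kind, a score such as `score` would be computed for every candidate product and used to rank them, while `ndcg` would compare that ranking against relevance judgments; a value of 1.0 means the list is already in ideal order.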
License
Copyright (c) 2025 Ni Putu Adnya Puspita Dewi, Desy Purnami Singgih Putri, I Nyoman Prayana Trisna

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.