FROM TF-IDF TO TRANSFORMER-BASED MODELS: CLUSTERING METHODS FOR E-COMMERCE PRODUCT ORGANIZATION
DOI: https://doi.org/10.58902/nckhpt.e-v2i1.366