Analysis and Evaluation of Similarity Metrics in Collaborative Filtering Recommender System
Content Provider | Semantic Scholar |
---|---|
Author | Guo, Shuhang |
Copyright Year | 2014 |
Abstract | Kemi-Tornio University of Applied Sciences. Degree programme: Business Information Technology. Writer: Guo, Shuhang. Thesis title: Analysis and evaluation of similarity metrics in collaborative filtering recommender system. Pages (of which appendix): 62 (1). Date: May 15, 2014. Thesis instructor: Ryabov, Vladimir. This research is focused on the field of recommender systems. The general aims of this thesis are to summarize the state of the art in recommendation systems, evaluate the efficiency of the traditional similarity metrics with a variety of data sets, and propose an approach for modeling new similarity metrics. The literature on recommender systems was studied to summarize current developments in this field. The recommendation and evaluation were implemented with Apache Mahout, which provides an open-source platform for building recommender engines. By importing data into the project, a customized recommender engine was built. Since the results of a collaborative filtering recommender depend strongly on the choice of similarity metric and the type of data, several traditional similarity metrics provided in Apache Mahout were examined using the evaluator offered in the project, with five data sets collected by academic groups. From the evaluation, I found that the best performance of each similarity metric was achieved by optimizing its adjustable parameters. The features of each similarity metric were obtained and analyzed with practical data sets. In addition, an approach that combines two traditional metrics was proposed in the thesis, and it was shown to be applicable and efficient using the combination of Pearson correlation and Euclidean distance. Observing and evaluating traditional similarity metrics with practical data helps in understanding their features and suitability, from which new models can be created. Moreover, the approach proposed for modeling new similarity metrics can be useful both theoretically and practically. |
File Format | PDF HTM / HTML |
Alternate Webpage(s) | https://www.theseus.fi/bitstream/handle/10024/80193/Shuhang%20Guo_BIT10_K0951349_FinalThesis.pdf?sequence=1 |
Language | English |
Access Restriction | Open |
Content Type | Text |
Resource Type | Article |
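
The abstract mentions combining Pearson correlation and Euclidean distance into a new similarity metric but does not say how the two are combined. The following is a minimal Python sketch of one possible combination, not the thesis's method and not Apache Mahout's API. The function names, the `1 / (1 + d)` distance-to-similarity mapping, the rescaling of Pearson to the range 0 to 1, and the `weight` parameter are all assumptions made for illustration.

```python
import math


def pearson(u, v):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = [k for k in u if k in v]
    n = len(common)
    if n < 2:
        return 0.0
    mean_u = sum(u[k] for k in common) / n
    mean_v = sum(v[k] for k in common) / n
    cov = sum((u[k] - mean_u) * (v[k] - mean_v) for k in common)
    var_u = sum((u[k] - mean_u) ** 2 for k in common)
    var_v = sum((v[k] - mean_v) ** 2 for k in common)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)


def euclidean_similarity(u, v):
    """Map Euclidean distance over co-rated items to (0, 1] (assumed mapping)."""
    common = [k for k in u if k in v]
    if not common:
        return 0.0
    distance = math.sqrt(sum((u[k] - v[k]) ** 2 for k in common))
    return 1.0 / (1.0 + distance)


def combined_similarity(u, v, weight=0.5):
    """Hypothetical combination: weighted average of the two similarities."""
    pearson_01 = (pearson(u, v) + 1.0) / 2.0  # rescale [-1, 1] to [0, 1]
    return weight * pearson_01 + (1.0 - weight) * euclidean_similarity(u, v)


# Illustrative ratings (made up): user -> {item: rating}
alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "c": 3}
print(combined_similarity(alice, bob))
```

In this sketch the two users have perfectly correlated ratings that differ by a constant offset, so Pearson correlation treats them as identical in taste while the Euclidean term penalizes the offset. A weighted average is only one way to balance those two views; the `weight` value would need tuning against an evaluator, in the spirit of the parameter optimization the abstract describes.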