Big Data Information Retrieval Using K-Means Clustering
Abstract
Today, human life is inseparable from data, which is created and transmitted every second around the world. As a result, the volume of data on networks is growing massively, and so is the need for data management. One important part of data management is finding the information a user wants, commonly referred to as information retrieval. The main purpose of information retrieval is to find documents containing information relevant to the query supplied by the user. Many information retrieval methods have been proposed, but these techniques still have problems with search speed and accuracy. In this thesis, the authors propose an information retrieval method suited to searching big data. Problems arise during the search process because big data is dominated by unstructured data, which is difficult to organize, so a special technique is needed to overcome them. One solution is to add clustering to the indexing of information. This study uses k-means clustering as the clustering method. In the experiments, k-means clustering achieved an average precision of 0.8, an average recall of 0.741, and an average computing time of 0.579 seconds.
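As an illustration of the idea summarized above, the following is a minimal sketch of cluster-based retrieval and of the precision and recall measures. It is not the implementation evaluated in this thesis. The term-frequency vectors, the choice of the first k documents as initial centroids, the toy documents, and all function names (`tf_vector`, `kmeans`, `search`, `precision_recall`) are assumptions made for the example.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Term-frequency vector of `text` over a fixed vocabulary (assumed representation)."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocab]


def sq_dist(a, b):
    """Squared Euclidean distance, the measure k-means minimizes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity, used here only to rank documents inside a cluster."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(vector, centroids):
    """Index of the centroid closest to `vector`."""
    return min(range(len(centroids)), key=lambda c: sq_dist(vector, centroids[c]))


def kmeans(vectors, k, max_iters=20):
    """Plain k-means. Assumption: the first k vectors serve as initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iters):
        new_labels = [nearest(v, centroids) for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def search(query_vec, vectors, centroids, labels):
    """Route the query to its nearest cluster and rank only that cluster's documents."""
    cluster = nearest(query_vec, centroids)
    candidates = [i for i, lab in enumerate(labels) if lab == cluster]
    return sorted(candidates, key=lambda i: cosine(query_vec, vectors[i]), reverse=True)


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy corpus (hypothetical data, for illustration only).
docs = [
    "k-means clustering groups similar documents",
    "football match ends in a draw",
    "clustering documents speeds up retrieval",
    "the football team won the match",
]
vocab = sorted({term for d in docs for term in d.lower().split()})
vectors = [tf_vector(d, vocab) for d in docs]
centroids, labels = kmeans(vectors, k=2)

results = search(tf_vector("clustering documents", vocab), vectors, centroids, labels)
print(results, precision_recall(results, relevant=[0, 2]))
```

Because the query is compared only with the documents in its nearest cluster rather than with the whole collection, the number of similarity computations per query drops, which is the motivation for adding clustering to the index. The trade-off is that relevant documents assigned to other clusters are missed, which lowers recall.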