Big Data and Hadoop: Improving MapReduce Performance using Clustering Algorithm
| Content Provider | Semantic Scholar |
|---|---|
| Copyright Year | 2015 |
| Abstract | “Big Data” is a popular term for the exponential growth and availability of structured, semi-structured, and unstructured data that has the potential to be mined for information. Data mining involves knowledge discovery from these large data sets. Hadoop is the core platform for storing large volumes of data in the Hadoop Distributed File System (HDFS), and that data is processed in parallel by the MapReduce model. Hadoop is designed to scale up from a single server to thousands of machines with a very high degree of fault tolerance. In this paper, we implement the k-means algorithm on single-node and multi-node Hadoop clusters. We measure performance by the time taken to execute the MapReduce job. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www.ijsrd.com/articles/IJSRDV3I30951.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |