Distributed Computing and Hadoop in Statistics
| Content Provider | CiteSeerX |
|---|---|
| Author | Lu, Xiaoling; Zheng, Bing |
| Abstract | Big data is ubiquitous today. Big data challenges current numerical, statistical, and machine learning methods, visualization methods, computational methods, and computing environments. Distributed computing and Hadoop help address these challenges. Hadoop is an open-source framework for writing and running distributed applications. It consists of the MapReduce distributed compute engine and the Hadoop Distributed File System (HDFS). Mahout provides machine-learning algorithms on the Hadoop platform. RHadoop, RHive, and RHIPE integrate R with Hadoop. Finally, we show two examples of how enterprises apply Hadoop to their big data processing problems. (A minimal MapReduce word-count sketch follows the table below.) |
| File Format | |
| Access Restriction | Open |
| Subject Keyword | Computational Environment; Ubiquitous Today; Big Data; Machine-learning Algorithm; Machine Learning Method; Visualization Method; Computational Method; Open Source Framework; Compute Engine; Hadoop Distributed File System; Big Data Processing Problem; Hadoop Platform |
| Content Type | Text |
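
The abstract describes Hadoop as the pairing of the MapReduce compute engine with HDFS. As an illustration of the MapReduce model (not code from the paper itself), here is a minimal word-count sketch written for Hadoop Streaming in Python; the script name `wordcount.py` and any file paths are assumptions chosen for the example.

```python
#!/usr/bin/env python3
"""Minimal word-count sketch in the MapReduce style, via Hadoop Streaming.

Run as the mapper with `wordcount.py map` and as the reducer with
`wordcount.py reduce`. Both phases read stdin and write tab-separated
key/value lines, which is the contract Hadoop Streaming expects.
"""
import sys


def mapper(stream):
    # Map phase: emit one ("word", 1) pair per token in the input.
    for line in stream:
        for word in line.split():
            print(f"{word}\t1")


def reducer(stream):
    # Reduce phase: sum the counts for each word. Hadoop sorts the map
    # output by key before it reaches the reducer, so identical words
    # arrive on consecutive lines.
    current_word, current_count = None, 0
    for line in stream:
        word, _, count = line.rstrip("\n").partition("\t")
        if word != current_word:
            if current_word is not None:
                print(f"{current_word}\t{current_count}")
            current_word, current_count = word, 0
        current_count += int(count or 0)
    if current_word is not None:
        print(f"{current_word}\t{current_count}")


if __name__ == "__main__":
    phase = sys.argv[1] if len(sys.argv) > 1 else "map"
    (mapper if phase == "map" else reducer)(sys.stdin)
```

A job like this would typically be submitted with the Hadoop Streaming jar, roughly `hadoop jar hadoop-streaming-*.jar -files wordcount.py -mapper "wordcount.py map" -reducer "wordcount.py reduce" -input <hdfs input> -output <hdfs output>`, where the jar location and the HDFS paths depend on the installation.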