Parallel Classification on SMP Systems (1998)
Content Provider | CiteSeerX |
---|---|
Author | Zaki, Mohammed |
Description | This paper presents fast scalable decision-tree-based classification algorithms targeting shared-memory systems. The algorithms are based on the sequential SPRINT classifier and span the gamut of data and task parallelism. The data parallelism is based on attribute scheduling among processors. This is extended with task pipelining and dynamic load balancing to yield more efficient schemes. The task parallel approach uses dynamic subtree partitioning among processors. These schemes are disk based and achieve excellent speedup, making them ideally suited for data mining in very large databases. 1 Introduction An important task of data mining is to assign objects to predefined categories or classes -- a process called Classification. The input to the classification system consists of a set of example records, called a training set, over several fields or attributes. Attributes are either continuous, coming from an ordered domain, or categorical, coming from an unordered domain. One of the... |
File Format | |
Language | English |
Publisher Date | 1998-01-01 |
Publisher Institution | In The 1st Workshop on High Performance Data Mining (in conjunction with IPPS'98) |
Access Restriction | Open |
Subject Keyword | Task Pipelining, SMP System, Data Mining, Parallel Classification, Dynamic Load, Unordered Domain, Efficient Scheme, Sequential SPRINT Classifier, Task Parallelism, Several Field, Dynamic Subtree Partitioning, Ordered Domain, Shared-memory System, Data Parallelism, Large Database, Fast Scalable Decision-tree-based Classification, Example Record, Classification System, Task Parallel Approach, Achieve Excellent Speedup, Training Set, Attribute Scheduling, Important Task |
Content Type | Text |
Resource Type | Article |