Myriad: parallel data generation on shared-nothing architectures
| Content Provider | ACM Digital Library |
|---|---|
| Author | Alexandrov, Alexander; Schiefer, Berni; Ewen, Stephan; Bodner, Thomas O.; Poelman, John; Markl, Volker |
| Abstract | The need for efficient data generation for the purposes of testing and benchmarking newly developed massively parallel data processing systems has increased with the emergence of Big Data problems. As synthetic data model specifications evolve over time, the data generator programs implementing these models have to be adapted continuously, a task that often becomes more tedious as the set of model constraints grows. In this paper we present Myriad, a new parallel data generation toolkit. Data generators created with the toolkit can quickly produce very large datasets in a shared-nothing parallel execution environment while preserving cross-partition dependencies, correlations, and distributions in the generated data. In addition, we report on our efforts towards a benchmark suite for large-scale parallel analysis systems that uses Myriad for the generation of OLAP-style relational datasets. |
| Starting Page | 30 |
| Ending Page | 33 |
| Page Count | 4 |
| File Format | |
| ISBN | 9781450314398 |
| DOI | 10.1145/2377978.2377983 |
| Language | English |
| Publisher | Association for Computing Machinery (ACM) |
| Publisher Date | 2011-10-10 |
| Publisher Place | New York |
| Access Restriction | Subscribed |
| Subject Keyword | Scalable data generation; myriad parallel data generator toolkit; Testing tools; Testing and debugging; Software engineering |
| Content Type | Text |
| Resource Type | Article |