Parallel 3D Fast Wavelet Transform comparison on CPUs and GPUs
| Content Provider | Semantic Scholar |
|---|---|
| Author | Bernabé, Gregorio |
| Copyright Year | 2015 |
| Correspondence | gbernabe@ditec.um.es, Computing Engineering, University of Murcia, Campus de Espinardo, 30071 Murcia, Spain |
| Abstract | We present in this paper several implementations of the 3D Fast Wavelet Transform (3DFWT) on multicore CPUs and manycore GPUs. On the GPU side, we focus on CUDA and OpenCL programming to develop methods for an efficient mapping on manycores. On multicore CPUs, OpenMP and Pthreads are used as counterparts to maximize parallelism, and renowned techniques like tiling and blocking are exploited to optimize the use of memory. We evaluate these proposals and make a comparison between a new Fermi Tesla C2050 and an Intel Core 2 Quad Q6700. Speedups of the CUDA version are the best results, improving the execution times on CPU, ranging from 5.3x to 7.4x for different image sizes, and up to 81 times faster when communications are neglected. Meanwhile, OpenCL obtains solid gains which range from 2x factors on small frame sizes to 3x factors on larger ones. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://revistaseug.ugr.es/index.php/amgp/article/download/2470/pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |