NDLI: Big Data Data Management Systems performance analysis using Aloja and BigBench

Content Provider	Semantic Scholar
Author	Rivero, Alejandro Montero; Pérez, David Carrera; Poggi, Nicolás
Copyright Year	2018
Abstract	Traditional RDBMSs cannot accommodate the need to analyze large volumes of data that may contain unstructured information while also supporting interactive applications that pass over the data more than once. SQL-like Big Data infrastructure offers the benefits of Big Data architectures with the ease of the SQL language. In this thesis project, the ALOJA benchmarking platform is extended with the first standardized Big Data benchmark, BigBench. With ALOJA and BigBench, it is possible to test systems under test (SUTs) and engines in discrete scenarios and to discover possible bottlenecks. A proposed BigBench extension makes it possible to test engine elasticity and how engines react to workloads of varying complexity. We demonstrate the capabilities of ALOJA and BigBench by comparing the de facto SQL Big Data engine, Hive, against the growing Spark-SQL. Spark leads in CPU-intensive applications but performs poorly in disk access on rotational disks, even throttling network and CPU at high levels of concurrency. Hive requires more memory per CPU core than Spark and becomes unreliable as workload complexity grows. The expanded BigBench architecture made it possible to detect a difference in task management between the engines. Hive parallelizes independent tasks, assigning resources according to their complexity. Spark, on the other hand, eliminates concurrency by executing tasks sequentially and assigning each of them the complete cluster resources.
File Format	PDF HTM / HTML
Alternate Webpage(s)	https://upcommons.upc.edu/bitstream/handle/2117/117985/131431.pdf?isAllowed=y&sequence=1
Language	English
Access Restriction	Open
Content Type	Text
Resource Type	Article
