An Energy-Aware Resource Management Strategy Based on Spark and YARN in Heterogeneous Environments

dc.authorid Shabestari, Fatemeh/0000-0003-1926-4674
dc.authorscopusid 57204862467
dc.authorscopusid 55897274300
dc.contributor.author Jafari Navimipour, Nima
dc.contributor.author Navimipour, Nima Jafari
dc.contributor.other Computer Engineering
dc.date.accessioned 2024-06-23T21:38:08Z
dc.date.available 2024-06-23T21:38:08Z
dc.date.issued 2024
dc.department Kadir Has University en_US
dc.department-temp [Shabestari, Fatemeh] Islamic Azad Univ, Dept Comp Engn, Sofian Branch, Sofian, Iran; [Navimipour, Nima Jafari] Kadir Has Univ, Dept Comp Engn, TR-34083 Istanbul, Turkiye; [Navimipour, Nima Jafari] Natl Yunlin Univ Sci & Technol, Future Technol Res Ctr, Touliu 64002, Taiwan en_US
dc.description Shabestari, Fatemeh/0000-0003-1926-4674 en_US
dc.description.abstract Apache Spark is a popular framework for processing big data. Running Spark on Hadoop YARN allows Spark workloads to be scheduled alongside other data-processing frameworks on Hadoop. When an application is deployed in a YARN cluster, its resources are allocated without considering energy efficiency. Furthermore, there is no way to enforce user-specified deadline constraints. To address these issues, we propose a new deadline-aware resource management system and a scheduling algorithm to minimize the total energy consumption in Spark on YARN for heterogeneous clusters. First, a deadline-aware energy-efficient model for the considered problem is proposed. Then, using a locality-aware method, executors are assigned to applications. This algorithm sorts the nodes based on the performance per watt (PPW) metric, the number of application data blocks on nodes, and the rack locality. It also offers three ways to choose executors from different machines: greedy, random, and Pareto-based. Finally, the proposed heuristic task scheduler schedules tasks on executors to minimize total energy and tardiness. We evaluated the performance of the suggested algorithm in terms of energy efficiency and Service Level Agreement (SLA) satisfaction. The results showed that the method outperforms popular algorithms in energy consumption and in meeting deadlines. en_US
dc.identifier.citationcount 0
dc.identifier.doi 10.1109/TGCN.2023.3347276
dc.identifier.endpage 644 en_US
dc.identifier.issn 2473-2400
dc.identifier.issue 2 en_US
dc.identifier.scopus 2-s2.0-85181573774
dc.identifier.scopusquality Q1
dc.identifier.startpage 635 en_US
dc.identifier.uri https://doi.org/10.1109/TGCN.2023.3347276
dc.identifier.uri https://hdl.handle.net/20.500.12469/5749
dc.identifier.volume 8 en_US
dc.identifier.wos WOS:001230177900019
dc.identifier.wosquality N/A
dc.language.iso en en_US
dc.publisher IEEE-Inst Electrical Electronics Engineers Inc en_US
dc.relation.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.scopus.citedbyCount 3
dc.subject Sparks en_US
dc.subject Yarn en_US
dc.subject Task analysis en_US
dc.subject Resource management en_US
dc.subject Energy efficiency en_US
dc.subject Energy consumption en_US
dc.subject Clustering algorithms en_US
dc.subject Distributed computing en_US
dc.subject energy management en_US
dc.subject resource management en_US
dc.subject scheduling en_US
dc.title An Energy-Aware Resource Management Strategy Based on Spark and YARN in Heterogeneous Environments en_US
dc.type Article en_US
dc.wos.citedbyCount 2
dspace.entity.type Publication
relation.isAuthorOfPublication 0fb3c7a0-c005-4e5f-a9ae-bb163df2df8e
relation.isAuthorOfPublication.latestForDiscovery 0fb3c7a0-c005-4e5f-a9ae-bb163df2df8e
relation.isOrgUnitOfPublication fd8e65fe-c3b3-4435-9682-6cccb638779c
relation.isOrgUnitOfPublication.latestForDiscovery fd8e65fe-c3b3-4435-9682-6cccb638779c
