TY - JOUR
T1 - Big Data Resource Management & Networks
T2 - Taxonomy, Survey, and Future Directions
AU - Awaysheh, Feras M.
AU - Alazab, Mamoun
AU - Garg, Sahil
AU - Niyato, Dusit
AU - Verikoukis, Christos
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2021/10
Y1 - 2021/10
N2 - Big Data (BD) platforms have a long tradition of leveraging trends and technologies from the broader computer network and communication community. For several years, dedicated servers in homogeneous clusters were the dominant paradigm in BD networks. In recent years, the BD landscape has changed, adopting different deployment architectures with various network models. This trend has produced opportunities and challenges that push BD practitioners toward the next-generation BD vision, in particular addressing BD velocity with batch and micro-batch processing. Nevertheless, the literature lacks an extensive study of the impacts of adopting these new deployment architectures, even though the topic holds significant research interest. This study addresses that gap, offering a comprehensive review of the architectural elements of BD batch query deployment models and environments. A novel taxonomy is proposed to classify these models based on their underlying communication systems. We first discuss the batch query processing requirements as comparison criteria for BD communication models and compare their salient features. The benefits and challenges of these environments, relative to traditional on-premise dedicated BD clusters, are presented. Thereafter, we provide an extensive survey of modern BD deployment architectures, categorizing them by their underlying infrastructure. Finally, we outline several directions for future research on improving the state of the art of the BD landscape and provide recommendations for BD practitioners on emerging environments supporting BD applications and general large-scale data analytics.
AB - Big Data (BD) platforms have a long tradition of leveraging trends and technologies from the broader computer network and communication community. For several years, dedicated servers in homogeneous clusters were the dominant paradigm in BD networks. In recent years, the BD landscape has changed, adopting different deployment architectures with various network models. This trend has produced opportunities and challenges that push BD practitioners toward the next-generation BD vision, in particular addressing BD velocity with batch and micro-batch processing. Nevertheless, the literature lacks an extensive study of the impacts of adopting these new deployment architectures, even though the topic holds significant research interest. This study addresses that gap, offering a comprehensive review of the architectural elements of BD batch query deployment models and environments. A novel taxonomy is proposed to classify these models based on their underlying communication systems. We first discuss the batch query processing requirements as comparison criteria for BD communication models and compare their salient features. The benefits and challenges of these environments, relative to traditional on-premise dedicated BD clusters, are presented. Thereafter, we provide an extensive survey of modern BD deployment architectures, categorizing them by their underlying infrastructure. Finally, we outline several directions for future research on improving the state of the art of the BD landscape and provide recommendations for BD practitioners on emerging environments supporting BD applications and general large-scale data analytics.
KW - batch query systems
KW - Big data
KW - cloud computing
KW - computer network comparison
KW - decentralized computing
KW - grid computing
KW - HPC
KW - hybrid computing
KW - resource management and communication
UR - http://www.scopus.com/inward/record.url?scp=85114374838&partnerID=8YFLogxK
U2 - 10.1109/COMST.2021.3094993
DO - 10.1109/COMST.2021.3094993
M3 - Article
AN - SCOPUS:85114374838
SN - 1553-877X
VL - 23
SP - 2098
EP - 2130
JO - IEEE Communications Surveys and Tutorials
JF - IEEE Communications Surveys and Tutorials
IS - 4
ER -