Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage

Nan Noon Noon; Janusz Roman Getta; Tianbing Xia

doi:10.11648/j.ajist.20240803.14

Research Article | Peer-Reviewed

Published in American Journal of Information Science and Technology (Volume 8, Issue 3)

Received: 1 May 2024 Accepted: 27 May 2024 Published: 29 September 2024

Abstract

A logical model of multi-tiered persistent storage provides a view of data in which all available storage resources are distributed over a number of levels according to their data transfer parameters and capacities. Efficient parallelization of data transfers in multi-tiered persistent storage is a significant challenge for a pipelined data processing model. This work examines a category of database applications implemented as sequences of operations that transfer data between the levels of multi-tiered persistent storage. Extended Petri Nets (EPN) are used to represent how such database applications can be processed in parallel. A proposed transformation converts an EPN into sequences of parallel data transfers. A method is then demonstrated for partitioning these sequences, with the goal of reducing the total number of conflicts that arise when data transfers occur between the levels of multi-tiered persistent storage. The paper proposes new rule-based algorithms for scheduling parallel data transfers that minimize total data transfer time. The objectives of the new algorithms are to distribute the workload evenly among the data transfer processes and to reduce their idle time. Several experiments have confirmed the effectiveness of the new algorithms in generating parallel data transfer plans.
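
The stated objectives, an even workload across data transfer processes and low idle time, can be illustrated with a minimal sketch. It is not the rule-based algorithms proposed in the paper: it assumes a plain greedy longest-first heuristic, and it ignores conflicts between storage levels and any ordering constraints between transfers. The function name `balance_transfers`, the transfer names, and the durations are all hypothetical.

```python
import heapq


def balance_transfers(durations, num_processes):
    """Assign transfers to processes with a greedy longest-first heuristic.

    This is an illustrative assumption, not the paper's scheduling rules.
    `durations` maps a transfer name to its estimated transfer time.
    """
    # Min-heap of (accumulated load, process index): the least-loaded
    # process is always at the top.
    heap = [(0.0, p) for p in range(num_processes)]
    heapq.heapify(heap)
    plan = [[] for _ in range(num_processes)]

    # Place the longest transfers first, each on the least-loaded process.
    ordered = sorted(durations.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, cost in ordered:
        load, p = heapq.heappop(heap)
        plan[p].append(name)
        heapq.heappush(heap, (load + cost, p))

    # Collect the final load of every process.
    loads = [0.0] * num_processes
    for load, p in heap:
        loads[p] = load
    return plan, loads


# Hypothetical transfer times (arbitrary units).
transfers = {"t1": 8.0, "t2": 7.0, "t3": 6.0, "t4": 5.0, "t5": 4.0}
plan, loads = balance_transfers(transfers, 2)

# Total transfer time is bounded by the busiest process; the other
# processes sit idle for the difference.
makespan = max(loads)
idle = [makespan - load for load in loads]
print(plan, loads, makespan, idle)
```

In this sketch the total transfer time equals the load of the busiest process, so reducing the gap between the heaviest and lightest process reduces both the total time and the idle time of the remaining processes.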

Published in	American Journal of Information Science and Technology (Volume 8, Issue 3)
DOI	10.11648/j.ajist.20240803.14
Page(s)	84-97
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Multi-tiered Persistent Storage, Scheduling, Parallel Data Processing, Performance Tuning, Database Management Systems

Cite This Article

APA Style

Noon, N. N., Getta, J. R., & Xia, T. (2024). Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage. American Journal of Information Science and Technology, 8(3), 84-97. https://doi.org/10.11648/j.ajist.20240803.14

ACS Style

Noon, N. N.; Getta, J. R.; Xia, T. Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage. Am. J. Inf. Sci. Technol. 2024, 8(3), 84-97. doi: 10.11648/j.ajist.20240803.14

AMA Style

Noon NN, Getta JR, Xia T. Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage. Am J Inf Sci Technol. 2024;8(3):84-97. doi: 10.11648/j.ajist.20240803.14

BibTeX

@article{10.11648/j.ajist.20240803.14,
  author = {Nan Noon Noon and Janusz Roman Getta and Tianbing Xia},
  title = {Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage},
  journal = {American Journal of Information Science and Technology},
  volume = {8},
  number = {3},
  pages = {84--97},
  doi = {10.11648/j.ajist.20240803.14},
  url = {https://doi.org/10.11648/j.ajist.20240803.14},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajist.20240803.14},
  abstract = {A logical model of multi-tiered persistent storage provides a view of data where all available storage resources are distributed over a number of levels depending on the data transfer parameters and capacities. The efficient parallelization of data transfers in multi-tiered persistent storage is a significant challenge for a pipelined data processing model. This work examines a category of database applications implemented as sequences of operations that transfer data between the levels of multi-tiered persistent storage. The concept of EPN: Extended Petri Nets represents how database applications can be processed in parallel. A proposed transformation involves converting EPN into sequences of parallel data transfers. Additionally, a method is demonstrated for partitioning these sequences of data transfers, with the goal of reducing the total number of conflicts when data transfers occur between the levels of multi-tiered persistent storage. The paper proposes new rule-based algorithms for scheduling parallel data transfers that minimize total data transfer time. The objectives of the new algorithms are to evenly distribute the workload among the data transfer processes and reduce their idle time. Several experiments have confirmed the effectiveness of the new algorithms in generating parallel data transfer plans.},
  year = {2024}
}

RIS

TY  - JOUR
T1  - Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage
AU  - Nan Noon Noon
AU  - Janusz Roman Getta
AU  - Tianbing Xia
Y1  - 2024/09/29
PY  - 2024
N1  - https://doi.org/10.11648/j.ajist.20240803.14
DO  - 10.11648/j.ajist.20240803.14
T2  - American Journal of Information Science and Technology
JF  - American Journal of Information Science and Technology
JO  - American Journal of Information Science and Technology
SP  - 84
EP  - 97
PB  - Science Publishing Group
SN  - 2640-0588
UR  - https://doi.org/10.11648/j.ajist.20240803.14
AB  - A logical model of multi-tiered persistent storage provides a view of data where all available storage resources are distributed over a number of levels depending on the data transfer parameters and capacities. The efficient parallelization of data transfers in multi-tiered persistent storage is a significant challenge for a pipelined data processing model. This work examines a category of database applications implemented as sequences of operations that transfer data between the levels of multi-tiered persistent storage. The concept of EPN: Extended Petri Nets represents how database applications can be processed in parallel. A proposed transformation involves converting EPN into sequences of parallel data transfers. Additionally, a method is demonstrated for partitioning these sequences of data transfers, with the goal of reducing the total number of conflicts when data transfers occur between the levels of multi-tiered persistent storage. The paper proposes new rule-based algorithms for scheduling parallel data transfers that minimize total data transfer time. The objectives of the new algorithms are to evenly distribute the workload among the data transfer processes and reduce their idle time. Several experiments have confirmed the effectiveness of the new algorithms in generating parallel data transfer plans.
VL  - 8
IS  - 3
ER  -

Author Information

Nan Noon Noon

School of Computing and Information Technology, University of Wollongong, Wollongong, Australia

http://orcid.org/0000-0003-3985-5455

Janusz Roman Getta

School of Computing and Information Technology, University of Wollongong, Wollongong, Australia

http://orcid.org/0000-0001-6492-5641

Tianbing Xia

School of Computing and Information Technology, University of Wollongong, Wollongong, Australia

http://orcid.org/0000-0002-4520-5021
