Research Article | Peer-Reviewed

Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage

Received: 1 May 2024     Accepted: 27 May 2024     Published: 29 September 2024
Abstract

A logical model of multi-tiered persistent storage provides a view of data in which all available storage resources are distributed over a number of levels according to their data transfer parameters and capacities. Efficiently parallelizing data transfers in multi-tiered persistent storage is a significant challenge for a pipelined data processing model. This work examines a category of database applications implemented as sequences of operations that transfer data between the levels of multi-tiered persistent storage. Extended Petri Nets (EPN) are used to represent how database applications can be processed in parallel. A proposed transformation converts EPN into sequences of parallel data transfers. A method is also demonstrated for partitioning these sequences of data transfers, with the goal of reducing the total number of conflicts when data transfers occur between the levels of multi-tiered persistent storage. The paper proposes new rule-based algorithms for scheduling parallel data transfers that minimize total data transfer time. The objectives of the new algorithms are to distribute the workload evenly among the data transfer processes and to reduce their idle time. Several experiments have confirmed the effectiveness of the new algorithms in generating parallel data transfer plans.
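
The two stated objectives, an even workload across data transfer processes and little idle time, can be illustrated with a generic greedy rule. The sketch below is not the algorithm proposed in the paper: it is a standard longest-first heuristic, and the names `schedule_transfers`, `durations` and `num_processes`, along with the sample transfer times, are assumptions made for illustration. It also ignores the conflicts between storage levels and the ordering constraints that an EPN would impose.

```python
import heapq


def schedule_transfers(durations, num_processes):
    """Assign transfers to processes with a greedy longest-first rule.

    durations: dict mapping a (hypothetical) transfer name to its estimated time.
    Returns (plan, makespan), where plan[i] lists the transfers of process i.
    """
    # Min-heap of (current load, process index); the least-loaded process is on top.
    heap = [(0, i) for i in range(num_processes)]
    heapq.heapify(heap)
    plan = [[] for _ in range(num_processes)]

    # Rule: give the longest remaining transfer to the least-loaded process.
    # Ties on duration are broken by name so the result is deterministic.
    for name, time in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, i = heapq.heappop(heap)
        plan[i].append(name)
        heapq.heappush(heap, (load + time, i))

    # The total transfer time is the load of the busiest process.
    makespan = max(load for load, _ in heap)
    return plan, makespan


# Illustrative data only: five transfers, two parallel transfer processes.
transfers = {"t1": 7, "t2": 5, "t3": 4, "t4": 3, "t5": 1}
plan, makespan = schedule_transfers(transfers, 2)
print(plan, makespan)
```

With these sample values both processes finish at the same time, so neither is idle while the other works. A scheduler for real multi-tiered storage would additionally have to respect which storage levels each transfer reads from and writes to.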

Published in American Journal of Information Science and Technology (Volume 8, Issue 3)
DOI 10.11648/j.ajist.20240803.14
Page(s) 84-97
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Multi-tiered Persistent Storage, Scheduling, Parallel Data Processing, Performance Tuning, Database Management Systems

Cite This Article
    APA Style

    Noon, N. N., Getta, J. R., & Xia, T. (2024). Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage. American Journal of Information Science and Technology, 8(3), 84-97. https://doi.org/10.11648/j.ajist.20240803.14

    ACS Style

    Noon, N. N.; Getta, J. R.; Xia, T. Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage. Am. J. Inf. Sci. Technol. 2024, 8(3), 84-97. doi: 10.11648/j.ajist.20240803.14

    AMA Style

    Noon NN, Getta JR, Xia T. Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage. Am J Inf Sci Technol. 2024;8(3):84-97. doi: 10.11648/j.ajist.20240803.14

    BibTeX

    @article{10.11648/j.ajist.20240803.14,
      author = {Nan Noon Noon and Janusz Roman Getta and Tianbing Xia},
      title = {Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage},
      journal = {American Journal of Information Science and Technology},
      volume = {8},
      number = {3},
      pages = {84-97},
      doi = {10.11648/j.ajist.20240803.14},
      url = {https://doi.org/10.11648/j.ajist.20240803.14},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajist.20240803.14},
      year = {2024}
    }
    

    RIS

    TY  - JOUR
    T1  - Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage
    AU  - Nan Noon Noon
    AU  - Janusz Roman Getta
    AU  - Tianbing Xia
    Y1  - 2024/09/29
    PY  - 2024
    N1  - https://doi.org/10.11648/j.ajist.20240803.14
    DO  - 10.11648/j.ajist.20240803.14
    T2  - American Journal of Information Science and Technology
    JF  - American Journal of Information Science and Technology
    JO  - American Journal of Information Science and Technology
    SP  - 84
    EP  - 97
    PB  - Science Publishing Group
    SN  - 2640-0588
    UR  - https://doi.org/10.11648/j.ajist.20240803.14
    VL  - 8
    IS  - 3
    ER  - 

Author Information