An Adaptive Workflow Scheduling Scheme Based on an Estimated Data Processing Rate for Next Generation Sequencing in Cloud Computing

  • Received : 2012.01.25
  • Reviewed : 2012.09.06
  • Published : 2012.12.31

Abstract

Cloud environments make it possible to analyze large data sets on a scalable computing infrastructure. Bioinformatics applications are composed of complex workflow tasks that require large amounts of data storage as well as compute-intensive parallel workloads. Many distributed solutions have been introduced, but they focus on static resource provisioning with a batch-processing scheme on a local computing farm and local data storage. For a large-scale workflow system, outsourcing all or part of its tasks to public clouds is both inevitable and valuable as a way to reduce resource costs. Doing so, however, raises two problems: the transfer time for huge datasets, and unbalanced completion times across problems of different sizes. In this paper, we propose an adaptive resource-provisioning scheme that includes run-time data distribution and collection services to hide the data transfer time. The scheme optimizes the ratio in which computing elements are allocated to the different datasets so as to minimize the total makespan under resource constraints. We conducted experiments with a well-known sequence alignment algorithm, and the results show that the proposed scheme is efficient in the cloud environment.
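The allocation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a simple greedy heuristic written under stated assumptions, namely that each dataset's single-worker processing time (its size divided by an estimated data processing rate) is known, that speedup is linear in the number of workers, and that data transfer time is ignored. The function name `allocate_workers` and the input values are hypothetical.

```python
import heapq


def allocate_workers(work_hours, total_workers):
    """Split a fixed pool of workers across datasets to reduce the makespan.

    work_hours[i] is the assumed single-worker time for dataset i
    (size_i / estimated_rate_i). Linear speedup is an assumption of this
    sketch, not a claim from the paper.
    """
    if total_workers < len(work_hours):
        raise ValueError("need at least one worker per dataset")

    # Every dataset starts with one worker.
    alloc = [1] * len(work_hours)

    # Max-heap (via negated keys) on each dataset's current completion time.
    heap = [(-w, i) for i, w in enumerate(work_hours)]
    heapq.heapify(heap)

    # Hand each remaining worker to the dataset that currently finishes last.
    for _ in range(total_workers - len(work_hours)):
        _, i = heapq.heappop(heap)
        alloc[i] += 1
        heapq.heappush(heap, (-work_hours[i] / alloc[i], i))

    makespan = max(w / n for w, n in zip(work_hours, alloc))
    return alloc, makespan


if __name__ == "__main__":
    # Hypothetical workloads: 12, 6, and 2 single-worker hours, 10 workers.
    allocation, makespan = allocate_workers([12.0, 6.0, 2.0], 10)
    print(allocation, makespan)
```

With these hypothetical inputs, the larger datasets receive proportionally more workers, so all three finish at about the same time instead of the largest one dominating the makespan.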
