• Title/Summary/Keyword: data-intensive tasks

Search Result 49, Processing Time 0.019 seconds

A Data Placement Scheme for the Characteristics of Data Intensive Scientific Workflow Applications (데이터 집약 과학 워크플로우 응용의 특성을 고려한 데이터 배치 기법)

  • Ahn, Julim;Kim, Yoonhee
    • KNOM Review / v.21 no.2 / pp.46-52 / 2018
  • In data-intensive scientific workflow experiments that run in a cloud computing environment, large amounts of data may be distributed across multiple data centers, and the intermediate data generated during execution may also be transferred between data centers. Because the application consumes the intermediate data it generates, the execution result varies with the location of that data. However, existing data placement strategies do not consider the characteristics of scientific applications. In this paper, we define data-intensive tasks and propose runtime data placement during the intervals in which such tasks run. Using the proposed data placement scheme, we analyze scenarios based on the number of occurrences of the data-intensive tasks defined in this study and derive the results. In addition, we compare performance by analyzing runtime data placement times and runtime data placement overhead.
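
Illustrative sketch (not taken from the paper): the abstract does not describe the placement rule itself, so the following only shows one simple way a runtime placement decision could be made, namely by storing an intermediate dataset in the data center that minimizes the total estimated transfer time to the tasks that read it. The function name `place_dataset`, its parameters, and the cost model are all assumptions made for illustration.

```python
def place_dataset(size_gb, consumers, bandwidth_gb_per_s):
    """Pick a data center for one intermediate dataset (hypothetical cost model).

    size_gb: size of the dataset in GB.
    consumers: dict mapping data center name -> number of tasks there that read the dataset.
    bandwidth_gb_per_s: dict mapping (source, destination) -> bandwidth in GB/s.
    Reads inside the same data center are assumed to cost nothing.
    """
    def transfer_cost(candidate):
        # Sum the estimated transfer time to every remote consuming task.
        total = 0.0
        for center, n_tasks in consumers.items():
            if center != candidate:
                total += n_tasks * size_gb / bandwidth_gb_per_s[(candidate, center)]
        return total

    # Sorting first makes ties resolve deterministically by name.
    return min(sorted(consumers), key=transfer_cost)


consumers = {"dc1": 3, "dc2": 1}
bandwidth = {("dc1", "dc2"): 1.0, ("dc2", "dc1"): 1.0}
print(place_dataset(10, consumers, bandwidth))
```

In this example most of the consuming tasks sit in `dc1`, so placing the data there leaves only one remote read to pay for.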

Range Segmentation of Dynamic Offloading (RSDO) Algorithm by Correlation for Edge Computing

  • Kang, Jieun;Kim, Svetlana;Kim, Jae-Ho;Sung, Nak-Myoung;Yoon, Yong-Ik
    • Journal of Information Processing Systems / v.17 no.5 / pp.905-917 / 2021
  • In recent years, edge computing, which consists of multiple Internet of Things (IoT) devices with embedded sensors, has improved significantly for monitoring, detection, and management in environments where big data is commercialized. Because applications are increasingly data- and task-intensive, edge computing focuses mainly on data optimization and task offloading. However, existing offloading approaches do not consider correlations and associations between the data and tasks involved in edge computing. Segmenting the extent of collaborative offloading without considering the interaction between data and tasks can lead to data loss and delays when moving from edge to edge. This article proposes a range segmentation of dynamic offloading (RSDO) algorithm that isolates the offload range and the collaborative edge nodes around the edge node function to address the offloading issue. The RSDO algorithm groups highly correlated data and tasks according to the cause of the overload and dynamically distributes offloading ranges according to the state of cooperating nodes. The segmentation improves the overall performance of edge nodes, balances edge computing, and addresses data loss and average latency.
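
Illustrative sketch (not the RSDO algorithm): the abstract states only that highly correlated data and tasks are grouped. One generic way to form such groups, shown here under the assumption that each task has a numeric load series, is to link any two tasks whose Pearson correlation reaches a threshold and take the connected groups. The names `pearson` and `group_by_correlation`, the threshold value, and the sample series are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def group_by_correlation(series, threshold=0.8):
    """Group names whose series are linked by correlation >= threshold (union-find)."""
    names = sorted(series)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, shortening the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(series[a], series[b]) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


loads = {"t1": [1, 2, 3, 4], "t2": [2, 4, 6, 8], "t3": [4, 1, 3, 2]}
print(group_by_correlation(loads))
```

Here `t1` and `t2` move together and end up in one group, while `t3` is negatively correlated with both and stays alone. A grouping like this could then be offloaded as a unit, which is the kind of decision the abstract attributes to correlation awareness.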

A Light-weight, Adaptive, Reliable Processing Integrity Audit for e-Science Grid (e-Science 그리드를 위한 가볍고, 적응성있고, 신뢰성있는 처리 무결성 감사)

  • Jung, Im-Young;Jung, Eun-Jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.18 no.5 / pp.181-188 / 2008
  • E-Science Grid is designed to cope with computation-intensive tasks and to manage a huge volume of scientific data efficiently. However, certain tasks may require more computation capability than a single grid can offer, or may impose a long wait time on other tasks. Resource sharing among grids can solve this problem, provided that processing integrity is properly checked via audit. Because e-Science tasks are computing-intensive, their processing time tends to be long. The potentially long wait before an audit failure is discovered motivates an audit mechanism that runs earlier, during execution, both to prevent resource waste and to detect problems quickly. In this paper, we propose a Light-weight, Adaptive and Reliable Audit (LARA) of processing integrity for e-Science applications. With the LARA scheme, researchers can verify their processing earlier and faster.

Trends in Compute Express Link(CXL) Technology (CXL 인터커넥트 기술 연구개발 동향)

  • S.Y. Kim;H.Y. Ahn;Y.M. Park;W.J. Han
    • Electronics and Telecommunications Trends / v.38 no.5 / pp.23-33 / 2023
  • With the widespread demand from data-intensive tasks such as machine learning and large-scale databases, the amount of data processed in modern computing systems is increasing exponentially. Such data-intensive tasks require large amounts of memory to rapidly process and analyze massive data. However, existing computing system architectures face challenges when building large-scale memory owing to various structural issues such as CPU specifications. Moreover, large-scale memory may cause problems including memory overprovisioning. The Compute Express Link (CXL) allows computing nodes to use large amounts of memory while mitigating related problems. Hence, CXL is attracting great attention in industry and academia. We describe the overarching concepts underlying CXL and explore recent research trends in this technology.

Building an Annotated English-Vietnamese Parallel Corpus for Training Vietnamese-related NLPs

  • Dien Dinh;Kiem Hoang
    • Proceedings of the IEEK Conference / summer / pp.103-109 / 2004
  • In NLP (Natural Language Processing) tasks, the greatest difficulty computers face is the inherent ambiguity of natural languages. Formerly, disambiguation relied on human-devised rules. Building such a complete rule set is a time-consuming and labor-intensive task, yet it does not cover all cases. Moreover, as the scale of the system increases, the rule set becomes very difficult to control. Recently, therefore, many NLP tasks have shifted from rule-based approaches to corpus-based approaches with large annotated corpora. Corpus-based NLP tasks for popular languages such as English and French have been well studied with satisfactory achievements. In contrast, corpus-based NLP tasks for Vietnamese are at a deadlock due to the absence of annotated training data. Furthermore, hand annotation of even reasonably well-determined features such as part-of-speech (POS) tags has proved to be labor-intensive and costly. In this paper, we present the construction of an annotated, aligned English-Vietnamese parallel corpus named EVC for training Vietnamese-related NLP tasks such as word segmentation, POS tagging, word order transfer, word sense disambiguation, and English-to-Vietnamese machine translation.

A Task Scheduling Method after Clustering for Data Intensive Jobs in Heterogeneous Distributed Systems

  • Hajikano, Kazuo;Kanemitsu, Hidehiro;Kim, Moo Wan;Kim, Hee-Dong
    • Journal of Computing Science and Engineering / v.10 no.1 / pp.9-20 / 2016
  • Several task clustering heuristics have been proposed for allocating tasks in heterogeneous systems to achieve a good response time for data-intensive jobs. However, one challenging problem is how to schedule tasks after they have been allocated by task clustering. We propose a task scheduling method applied after task clustering that uses the worst schedule length (WSL) as an upper bound on the schedule length. In our proposed method, a task in a WSL sequence is scheduled preferentially to make the WSL smaller. Simulation results show that the response time is improved for several task clustering heuristics. In particular, our proposed scheduling method combined with task clustering outperforms conventional list-based task scheduling methods.

A Study of Handwashing by Intensive Care Unit Nurses according to the Content of Nursing Faculty Practice (중환자실 간호사의 간호업무내용에 따른 손씻기에 관한 연구)

  • Kim Hyun-Ju;Kim Nam-Cho
    • Journal of Korean Academy of Fundamentals of Nursing / v.12 no.1 / pp.121-130 / 2005
  • Purpose: This study was done to determine the handwashing rate of intensive care unit nurses according to the content of nursing tasks, to investigate the relationship between handwashing practice as self-evaluated by nurses and their actual observed practice, and finally to provide basic materials for a handwashing education strategy. Method: Data were collected from 27 nurses working in intensive care units of a hospital in Uijeongbu, Gyeonggi-do, through observation and a structured self-assessment tool. Collected data were analyzed with SPSS and SAS. Results: The handwashing rate for the nurses was 4.3%. The handwashing rate increased in proportion to the risk of cross infection. In addition, the handwashing rate was highest in nurses working in the neurosurgery intensive care unit. The average score for self-assessment of handwashing was 49.42 ± 3.78 points, which was higher than their actual practice of handwashing. Conclusion: In order to improve handwashing by nurses, it is necessary to educate them on the importance of handwashing. In addition, there should be strategies for standardizing knowledge and attitudes toward handwashing and for inducing nurses to practice handwashing in compliance with the policies and working conditions of the institution.

Job Analysis of Maternal Fetal Intensive Care Unit Nurses Using DACUM Technique (DACUM을 이용한 고위험 산모·신생아 통합센터 간호사의 직무분석)

  • Kim, Hee Jeong;Kim, Jeung-Im;Ahn, Sukhee;Kim, Myoung-Hee;Kim, Yunmi;Cho, Kyung Sook;Hwang, Namsuk;Choi, Jung Sun;Park, Soo Hye;Lee, Eun Hee
    • Journal of Korean Clinical Nursing Research / v.24 no.1 / pp.10-22 / 2018
  • Purpose: This study was performed to establish the role and analyze the job of MFICU (Maternal Fetal Intensive Care Unit) nurses using DACUM (Developing a Curriculum). Methods: A DACUM workshop was held to define the MFICU nurses' role and identify their duties and tasks. A DACUM committee consisting of 7 nurses, 2 nursing professors, and 1 medical doctor was formed, and as a result a survey covering the duties and tasks of MFICU nurses was developed. After a pre-test for validity, data were collected from 97 nurses who worked at 7 MFICUs and 10 delivery rooms. Results: A total of 60 duties, 115 tasks, and 822 task elements were defined on the DACUM chart and survey. The importance, frequency, and difficulty of the tasks were expressed as a determinant coefficient (DC); the duty with the highest DC was 'Manage maternal ventilator' (15.09) and the one with the lowest DC was 'Provide nursing care for leisure to gestation extension mother' (6.52). Twenty-eight tasks differed significantly between MFICU and delivery room nurses. The task that MFICU nurses perceived as most important, frequent, and difficult was 'Check fetal heartbeat with electronic fetal heart monitor'. Conclusion: An organized educational program and policy need to be developed for MFICU nurses.

Scalable Data Provisioning Scheme on Large-Scale Distributed Computing Environment (대규모 분산 컴퓨팅 환경에서 확장성을 고려한 실시간 데이터 공급 기법)

  • Kim, Byungs-Sang;Youn, Chan-Hyun
    • The KIPS Transactions:PartA / v.18A no.4 / pp.123-128 / 2011
  • As the global grid has grown in size, large-scale distributed data analysis schemes have gained momentum. Over the last few years, a number of methods have been introduced for allocating data-intensive tasks across distributed and heterogeneous computing platforms. However, these approaches have limited potential for scaling up the number of computing nodes so that they can serve more tasks simultaneously. This paper tackles the scalability and communication delay of computing nodes. We propose a distributed data node for storing and allocating the data. This paper also provides a data provisioning method based on steady states for minimizing the communication delay between the data source and the computing nodes. The experimental results show that our system achieves its scalability and communication-delay goals.

An Adaptive Workflow Scheduling Scheme Based on an Estimated Data Processing Rate for Next Generation Sequencing in Cloud Computing

  • Kim, Byungsang;Youn, Chan-Hyun;Park, Yong-Sung;Lee, Yonggyu;Choi, Wan
    • Journal of Information Processing Systems / v.8 no.4 / pp.555-566 / 2012
  • The cloud environment makes it possible to analyze large data sets on a scalable computing infrastructure. In the bioinformatics field, applications are composed of complex workflow tasks, which require huge data storage as well as a computing-intensive parallel workload. Many distributed solutions have been introduced. However, they focus on static resource provisioning with a batch-processing scheme in a local computing farm and data storage. In a large-scale workflow system, outsourcing all or part of the tasks to public clouds to reduce resource costs is both inevitable and valuable. Problems arise, however, from the transfer time for huge datasets and from the unbalanced completion times of different problem sizes. In this paper, we propose an adaptive resource-provisioning scheme that includes run-time data distribution and collection services for hiding the data transfer time. The proposed scheme optimizes the allocation ratio of computing elements to the different datasets in order to minimize the total makespan under resource constraints. We conducted experiments with a well-known sequence alignment algorithm, and the results showed that the proposed scheme is efficient for the cloud environment.
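
Illustrative sketch (not the paper's scheme): the abstract describes choosing an allocation ratio of computing elements across datasets to minimize the makespan, but gives no formula. Under the simplifying assumptions that every computing element processes data at the same estimated rate and that a dataset's completion time is its size divided by the combined rate of its elements, a greedy allocation can balance completion times. The function `allocate_elements` and all of its parameters are hypothetical.

```python
import heapq


def allocate_elements(dataset_sizes, rate_per_element, total_elements):
    """Greedily split computing elements across datasets to shrink the makespan.

    dataset_sizes: dict mapping dataset name -> size (any consistent unit).
    rate_per_element: estimated data processing rate of one element (size units per second).
    total_elements: number of computing elements available (the resource constraint).
    Returns (counts, makespan) where counts maps dataset name -> elements assigned.
    """
    if not dataset_sizes:
        return {}, 0.0
    if total_elements < len(dataset_sizes):
        raise ValueError("need at least one computing element per dataset")

    # Start with one element per dataset; track estimated completion times in a max-heap.
    counts = {name: 1 for name in dataset_sizes}
    heap = [(-size / rate_per_element, name) for name, size in dataset_sizes.items()]
    heapq.heapify(heap)

    # Give each remaining element to the dataset that currently finishes last.
    for _ in range(total_elements - len(dataset_sizes)):
        _, name = heapq.heappop(heap)
        counts[name] += 1
        estimate = dataset_sizes[name] / (rate_per_element * counts[name])
        heapq.heappush(heap, (-estimate, name))

    makespan = max(
        dataset_sizes[name] / (rate_per_element * counts[name]) for name in counts
    )
    return counts, makespan


counts, makespan = allocate_elements({"a": 100, "b": 300}, rate_per_element=10, total_elements=4)
print(counts, makespan)
```

With four elements, the larger dataset `b` receives three and `a` receives one, so both are estimated to finish at the same time. An adaptive scheme would presumably re-estimate the rate at run time and repeat the allocation, which this sketch does not attempt.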