• Title/Summary/Keyword: Parallel/Distributed System

Call-Site Tracing-based Shared Memory Allocator for False Sharing Reduction in DSM Systems (분산 공유 메모리 시스템에서 거짓 공유를 줄이는 호출지 추적 기반 공유 메모리 할당 기법)

  • Lee, Jong-Woo
    • Journal of KIISE:Computer Systems and Theory / v.32 no.7 / pp.349-358 / 2005
  • False sharing results from the co-location of unrelated data in the same unit of memory coherency, and is a source of unnecessary overhead that does nothing to help maintain memory coherency in multiprocessor systems. Moreover, the damage caused by false sharing grows in proportion to the granularity of memory coherency. To reduce false sharing in a page-based DSM system, unrelated data objects with different access patterns must be allocated to separate shared pages. In this paper we propose a call-site tracing-based shared memory allocator, CSTallocator for short. CSTallocator expects that data objects requested from different call sites will have different access patterns in the future. It therefore places data objects requested from different call sites into separate shared pages, so that data objects with the same call site are likely to be gathered in the same shared pages. We use execution-driven simulation of real parallel applications to evaluate the effectiveness of CSTallocator. Our observations show that CSTallocator eliminates a considerable number of additional false sharing misses compared with existing techniques.
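
The placement policy described in this abstract can be illustrated with a toy model. The sketch below is not the paper's CSTallocator: the class name, the 4096-byte page size, and the idea of passing the call site in as a string are all assumptions made for illustration (a real allocator would presumably derive the call site from the caller's return address). It only shows the core idea that each call site fills its own pages.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical)


class CallSiteAllocator:
    """Toy model: every call site fills its own private sequence of pages."""

    def __init__(self, page_size=PAGE_SIZE):
        self.page_size = page_size
        self.next_page = 0   # number of the next unused shared page
        self.current = {}    # call_site -> (page number, bytes used in that page)

    def alloc(self, call_site, size):
        """Return (page, offset) for an object of `size` bytes."""
        if size > self.page_size:
            raise ValueError("objects larger than one page are not modelled")
        page, used = self.current.get(call_site, (None, 0))
        if page is None or used + size > self.page_size:
            # Open a fresh page that only this call site will use.
            page, used = self.next_page, 0
            self.next_page += 1
        self.current[call_site] = (page, used + size)
        return page, used


allocator = CallSiteAllocator()
a = allocator.alloc("init_grid:12", 64)    # first object from call site 1
b = allocator.alloc("init_locks:40", 64)   # different call site, different page
c = allocator.alloc("init_grid:12", 64)    # same call site, same page as `a`
d = allocator.alloc("init_grid:12", 4096)  # does not fit, so a new page is opened
```

Objects `a` and `c` share a page because they come from the same call site, while `b` never lands on that page, which is the separation the abstract relies on to reduce false sharing.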

Analysis of Factors for Korean Women's Cancer Screening through Hadoop-Based Public Medical Information Big Data Analysis (Hadoop기반의 공개의료정보 빅 데이터 분석을 통한 한국여성암 검진 요인분석 서비스)

  • Park, Min-hee;Cho, Young-bok;Kim, So Young;Park, Jong-bae;Park, Jong-hyock
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.10 / pp.1277-1286 / 2018
  • In this paper, we provide an Apache Hadoop-based cloud environment with flexibly scalable computing resources for analyzing public medical information big data. The environment can quickly and flexibly extend storage, memory, and other resources as log data accumulates or grows over time. In addition, when real-time analysis of accumulated unstructured log data is required, the system adopts a Hadoop-based analysis module to overcome the processing limits of existing analysis tools. It therefore provides parallel distributed processing of large amounts of log data quickly and reliably. For the big data analysis, frequency analysis and chi-square tests were performed. In addition, multivariate logistic regression analysis at a significance level of 0.05 and multivariate logistic regression analysis of the significant variables (p<0.05) were performed. Multivariate logistic regression analysis was performed for each of the 3 models.
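
As a reminder of what the chi-square test mentioned in this abstract computes, here is a minimal sketch for a 2x2 contingency table. The counts are invented for illustration and are not taken from the paper; the function name is hypothetical, and no continuity correction is applied. For one degree of freedom the p-value can be obtained from the complementary error function.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for a 2x2 table of observed counts, without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows are two groups, columns are screened / not screened.
observed = [[30, 70],
            [50, 50]]
chi2, p_value = chi_square_2x2(observed)
significant = p_value < 0.05  # significance level used in the abstract
```

Variables that pass such a test at the 0.05 level would be the kind of "significant variables" that the abstract then carries into the logistic regression.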

Construction of a Sub-catchment Connected Nakdong-gang Flood Analysis System Using Distributed Model (분포형 모형을 이용한 소유역 연계 낙동강 홍수해석시스템 구축)

  • Choi, Yun-Seok;Won, Young-Jin;Kim, Kyung-Tak
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.202-202 / 2018
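  • In this paper, we built a large-basin flood analysis system for the Nakdong River basin using GRM (Grid based Rainfall-runoff Model) (Choi Yun-Seok and Kim Kyung-Tak, 2017), a distributed rainfall-runoff model, and evaluated the execution time required for runoff analysis. The runoff model was built by dividing the major tributaries and the main stream of the Nakdong River into sub-catchments, and the runoff analysis results of the sub-catchments were linked in real time to form a runoff model of the entire Nakdong River basin. Dividing one large basin into many sub-catchment systems in this way has the drawback of a more complex runoff analysis system, but it allows each sub-catchment to be analyzed with different data at a different resolution, so that a runoff model suited to the characteristics of each sub-catchment can be built. In addition, because each sub-catchment system is computed in a separate process, the computation time can be reduced even when a large basin is analyzed at high resolution. In this study, the Nakdong River basin was divided into 20 sub-catchments (3 main stream sections, 13 first-order tributaries, and 4 dam-upstream basins) to examine the computation time, and the final runoff analysis system was built with 21 sub-catchments (3 main stream sections, 13 first-order tributaries, and 5 dam-upstream basins). Dam-upstream basins are simulated independently, with no flow transfer to the area downstream of the dam, and downstream basins connected to a dam use the observed dam release as the boundary condition at the upstream end of the channel. Tributary basins are connected to main stream sections, and the computed tributary flow is input in real time as a flow condition at the junction with the main stream. The flow linkage between the main stream and the tributaries is mediated by a database. Azure, the Microsoft cloud service, was used to evaluate the performance of the runoff analysis system. With the Nakdong River basin configured as 20 sub-catchments, the runoff analysis took 1.5 minutes on an Azure virtual machine instance DS15 v2 (OS: Windows Server 2012 R2, CPU: 2.4 GHz Intel Xeon(R) E5-2673 v3, 20 cores). For the timing evaluation, GRM was run with the 'IsParallel=false' option, and the simulation period was 24 hours. The results show that a large-basin runoff analysis system can be built with a distributed model and that the computation time can be reduced sufficiently. Applying additional CPUs and parallel computation could shorten the computation time further, and these techniques are expected to be useful when building large-basin runoff analysis systems with distributed models.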
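
The linkage scheme in this abstract (tributary outflow passed to the main stream through a database, dam-upstream basins run independently, observed dam releases used as upstream boundary conditions) can be sketched for a single time step. This is a single-process toy, not GRM or the authors' system: the basin names, flow values, and the use of a plain dictionary in place of the database are all hypothetical.

```python
# Hypothetical sub-catchments: name -> (local runoff in m^3/s, downstream basin or None).
catchments = {
    "trib_A": (120.0, "main_1"),
    "trib_B": (80.0, "main_1"),
    "main_1": (50.0, "main_2"),
    "trib_C": (60.0, "main_2"),
    "main_2": (40.0, None),
    "dam_up": (200.0, None),  # dam-upstream basin: no flow transfer downstream
}
# Observed dam release applied as the upstream boundary condition of a basin.
observed_release = {"main_1": 150.0}


def run_time_step(catchments, observed_release):
    """Compute each basin's outflow once all of its upstream basins are done."""
    flow_db = {}  # stands in for the database that mediates the linkage
    pending = set(catchments)
    while pending:
        progressed = False
        for name in sorted(pending):
            local, _ = catchments[name]
            upstream = [u for u, (_, down) in catchments.items() if down == name]
            if any(u not in flow_db for u in upstream):
                continue  # upstream results are not available yet
            inflow = sum(flow_db[u] for u in upstream)
            inflow += observed_release.get(name, 0.0)
            flow_db[name] = local + inflow
            pending.discard(name)
            progressed = True
        if not progressed:
            raise ValueError("cyclic linkage between sub-catchments")
    return flow_db


flows = run_time_step(catchments, observed_release)
```

In the system described above each sub-catchment runs in its own process, so basins with no unmet upstream dependency (the tributaries and the dam-upstream basin here) could be computed concurrently.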

Water Quality and Epilithic Diatom Community in the Lower Stream near the South Harbor System of Korean Peninsula (한반도 서남부 하천 하구역의 수질 및 부착돌말 군집 특성)

  • Kim, Ha-Kyung;Lee, Min-Hyuk;Kim, Yong-Jae;Won, Du-Hee;Hwang, Soon-Jin;Hwang, Su-Ok;Kim, Sang-Hoon;Kim, Baik-Ho
    • Korean Journal of Ecology and Environment / v.46 no.4 / pp.551-560 / 2013
  • Environmental factors and epilithic diatom communities in the lower streams near the harbor region of the southern Korean peninsula were examined during the non-monsoon period in May 2013. Water and epilithic diatoms were sampled from two types of streams: 19 regulated streams (RS), which have one or more dams constructed in the river system, and 19 unregulated streams (US), which have no dams within the river. A cluster analysis based on the number of species and the abundance of epilithic diatoms across the stations divided them into three groups: Group I (mainly US), Group II (a mix of US and RS), and Group III (mainly RS). Group I showed good water quality and high diatom diversity, while Groups II and III showed relatively poor water quality but did not differ from Group I in diatom biomass. In addition, Group II, which had high conductivity, nitrogen, and phosphorus, had the lowest diatom diversity of the three. Dominant species were Nitzschia palea (17%) and Navicula seminuloides (11%) in Group I, Nitzschia inconspicua (19%) and Navicula perminuta (9%) in Group II, and Nitzschia inconspicua (15%) and Nitzschia palea (14%) in Group III. These taxa are widely distributed in brackish water and are not closely related to specific water quality conditions such as eutrophic water. However, Groups II and III, which belonged to RS, had not only little biomass but also poor water quality, such as high concentrations of nutrients and chlorophyll-a. Therefore, to determine the effect of dam construction on the downstream aquatic ecosystem, a parallel investigation of planktonic algae, which can cause algal blooms in the estuary, should also be considered.

Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis (FCA 기반 계층적 구조를 이용한 문서 통합 기법)

  • Kim, Tae-Hwan;Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.63-77 / 2011
  • The World Wide Web is a very large distributed digital information space. From its origins in 1991, the web has grown to encompass diverse information resources such as personal home pages, online digital libraries, and virtual museums. Some estimates suggest that the web currently includes over 500 billion pages in the deep web. The ability to search and retrieve information from the web efficiently and effectively is an enabling technology for realizing its full potential. With powerful workstations and parallel processing technology, efficiency is not a bottleneck. In fact, some existing search tools sift through gigabyte-size precompiled web indexes in a fraction of a second. But retrieval effectiveness is a different matter. Current search tools retrieve too many documents, of which only a small fraction are relevant to the user query. Furthermore, the most relevant documents do not necessarily appear at the top of the query output order. Also, current search tools cannot retrieve the documents related to a retrieved document from a gigantic collection of documents. The most important problem for many current search systems is to increase the quality of search, that is, to provide related documents or to reduce the number of unrelated documents in the search results as far as possible. For this problem, CiteSeer proposed ACI (Autonomous Citation Indexing) of the articles on the World Wide Web. A "citation index" indexes the links between articles that researchers make when they cite other articles. Citation indexes are very useful for a number of purposes, including literature search and analysis of the academic literature. References contained in academic articles are used to give credit to previous work in the literature and provide a link between the "citing" and "cited" articles. A citation index indexes the citations that an article makes, linking the articles with the cited works. Citation indexes were originally designed mainly for information retrieval. The citation links allow navigating the literature in unique ways. Papers can be located independent of language and of the words in the title, keywords, or document. A citation index allows navigation backward in time (the list of cited articles) and forward in time (which subsequent articles cite the current article?). But CiteSeer cannot index links between articles that researchers do not make, because it indexes only the links that researchers create when they cite other articles. For the same reason, CiteSeer does not scale easily. All these problems motivate us to design a more effective search system. This paper presents a method that extracts the subject and predicate of each sentence in a document. A document is converted into a tabular form in which each extracted predicate is checked against the possible subjects and objects. We build a hierarchical graph of a document from the table and then integrate the graphs of the documents. From the graph of the entire document set, we calculate the area of each document relative to the integrated documents, and we mark the relations among the documents by comparing their areas. We also propose a method for structural integration of documents that retrieves documents from the graph, making it easier for the user to find information. We compared the performance of the proposed approaches with the Lucene search engine using the formulas for ranking. As a result, the F-measure is about 60%, which is about 15% better.
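
The hierarchical structure named in the title comes from Formal Concept Analysis, which turns a binary table into a lattice of (extent, intent) pairs. The sketch below is a generic FCA illustration, not the authors' subject/predicate extraction or graph integration: the function name and the tiny document-to-term table are assumptions. It enumerates every formal concept by closing the object attribute sets under intersection.

```python
def formal_concepts(context):
    """Enumerate formal concepts of a binary context.

    context: dict mapping each object to its set of attributes.
    Returns a list of (extent, intent) pairs as frozensets, largest extent first.
    """
    all_attrs = frozenset().union(*context.values())
    # Every intent is an intersection of object attribute sets (or all attributes).
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {intent & attrs for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is every object that has all attributes in the intent.
        extent = frozenset(o for o, a in context.items() if intent <= a)
        concepts.append((extent, intent))
    return sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1])))


# Hypothetical table: documents (objects) and the terms they contain (attributes).
table = {
    "doc1": {"web", "search"},
    "doc2": {"web", "index"},
    "doc3": {"web", "search", "index"},
}
concepts = formal_concepts(table)
```

Ordering the concepts by extent size gives the levels of the hierarchy: the concept whose intent is only `web` covers all three documents and sits at the top, while the concept with all three terms covers only `doc3`.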