• Title/Summary/Keyword: Large-scale Scientific Data

Analysis on NDN Testbeds for Large-scale Scientific Data: Status, Applications, Features, and Issues (과학 빅데이터를 위한 엔디엔 테스트베드 분석: 현황, 응용, 특징, 그리고 이슈)

  • Lim, Huhnkuk;Sin, Gwangcheon
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.7 / pp.904-913 / 2020
  • As data volumes and complexity rapidly increase, data-intensive science handling large-scale scientific data needs to investigate new techniques for intelligent storage and data distribution over networks. Recently, the Named Data Networking (NDN) and data-intensive science communities have inspired innovative changes in the distribution and management of large-scale experimental data. In this article, an analysis of NDN testbeds for large-scale scientific data, such as climate science data and High Energy Physics (HEP) data, is presented. This article is the first attempt to analyze existing NDN testbeds for large-scale scientific data. Three testbeds are described and discussed in terms of status, NDN-based applications, and features: an NDN testbed instance for climate science, an NDN testbed instance for both climate science and HEP, and the NDN testbed in the SANDIE project. Finally, various issues that must be addressed to avoid pitfalls in establishing NDN testbeds for large-scale scientific data are analyzed and discussed, drawing on the descriptions and features of these testbeds.

Privacy Enhanced Data Security Mechanism in a Large-Scale Distributed Computing System for HTC and MTC

  • Rho, Seungwoo;Park, Sangbae;Hwang, Soonwook
    • International Journal of Contents / v.12 no.2 / pp.6-11 / 2016
  • We developed a pilot-job based large-scale distributed computing system to support HTC and MTC, called HTCaaS (High-Throughput Computing as a Service), which helps scientists solve large-scale scientific problems in areas such as pharmaceuticals, high-energy physics, nuclear physics, and bioscience. Since most of these problems involve critical data that affect the national economy and activate basic industries, data privacy is a very important issue. In this paper, we implement a privacy-enhanced data security mechanism to support HTC and MTC in a large-scale distributed computing system and show how this technique affects performance in our system. With this mechanism, users can securely store data in our system.

A High-rate GPS Data Processing for Large-scale Structure Monitoring (대형구조물 모니터링을 위한 high-rate GPS 자료처리)

  • Bae, Tea-Suk
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference / 2010.04a / pp.181-182 / 2010
  • For real-time displacement monitoring of large-scale structures, high-rate (>1 Hz) GPS data processing is necessary, which is not possible even with scientific GPS data processing software. Since the baseline is generally very short in this case, most of the atmospheric effects are removed, leaving position and integer ambiguity as the unknowns. The number of unknowns in real-time kinematic GPS positioning makes positioning impossible with the usual approach, so a two-step approach is tested in this study.

A Multi-Application Controller for SAGE-enabled Tiled Display Wall in Wide-area Distributed Computing Environments

  • Fujiwara, Yuki;Date, Susumu;Ichikawa, Kohei;Takemura, Haruo
    • Journal of Information Processing Systems / v.7 no.4 / pp.581-594 / 2011
  • Due to recent advances in networking and high-performance computing technologies, scientists can easily access large-scale data captured by scientific measurement devices through a network, and use the huge computational power harnessed on the Internet to analyze scientific data. However, visualization technology, which plays an important role in helping scientists intuitively understand the analysis results of such scientific data, does not yet benefit seamlessly from recent high-performance computing and networking technologies. One such visualization technology is SAGE (Scalable Adaptive Graphics Environment), which allows people to build an arbitrarily sized tiled display wall and is expected to be applied to scientific research. In this paper, we present a multi-application controller for SAGE, which we have developed in the hope that it will help scientists efficiently perform scientific research requiring high-performance computing and visualization. The evaluation in this paper indicates that our system increases the efficiency of completing a comparison task among multiple data sets.

Analyzing Characteristic of Business District in Urban Area Using GIS Methods - Focused on Large-Scale Store and Traditional Market - (GIS 기법을 활용한 도시지역 상권 특성 분석 - 대형할인점과 전통시장을 중심으로 -)

  • SONG, Bong-Geun;PARK, Kyung-Hun
    • Journal of the Korean Association of Geographic Information Studies / v.20 no.2 / pp.89-101 / 2017
  • The study used GIS methods to analyze a business district consisting of traditional markets and large-scale stores, to determine the level of support needed for small enterprises in an urban area of Changwon-si, Gyeongsangnam-do. Data gathered on the area were analyzed using GIS tools such as kernel density, network analysis, and Huff modeling. Traditional markets are concentrated in areas where large-scale stores are located, and the data analyses show that consumer use of large-scale stores (157,071) was nearly three times that of traditional markets (59,953). One explanation for these results is that the large-scale stores are located either in densely populated areas or adjacent to the traditional markets. Therefore, standards and regulations are needed to support small enterprise business districts. In the future, the results of this study can be used as a reference for planning and supporting traditional market business districts.

Frequency Analysis of Scientific Texts on the Hypoxia Using Bibliographic Data (논문 서지정보를 이용한 빈산소수괴 연구 분야의 연구용어 빈도분석)

  • Lee, GiSeop;Lee, JiYoung;Cho, HongYeon
    • Ocean and Polar Research / v.41 no.2 / pp.107-120 / 2019
  • The frequency analysis of scientific terms using bibliographic information is a simple concept, but as relevant data become more widespread, manual analysis of all data is practically impossible or only possible to a very limited extent. In addition, as the scale of oceanographic research has expanded to become much more comprehensive and widespread, the allocation of research resources across various topics has become an important issue. In this study, the frequency analysis of scientific terms was performed using text mining. The data used in the analysis come from a general-purpose scholarly database and total 2,878 articles. Hypoxia, which is an important issue in the marine environment, was selected as the research field, and the frequencies of related words were analyzed. The most frequently used words were 'Organic matter', 'Bottom water', and 'Dead zone', and specific areas showed high frequency. The results of this research can be used as a basis for allocating research resources according to the frequency of use of related terms in specific fields when planning a large research project represented by a single word.

DATA MINING AND PREDICTION OF SAI TYPE MATRIX PRECONDITIONER

  • Kim, Sang-Bae;Xu, Shuting;Zhang, Jun
    • Journal of applied mathematics & informatics / v.28 no.1_2 / pp.351-361 / 2010
  • The solution of large sparse linear systems is one of the most important problems in large-scale scientific computing. Among the many methods developed, the preconditioned Krylov subspace methods are considered the preferred methods. Selecting a suitable preconditioner with appropriate parameters for a specific sparse linear system is a challenging task for many application scientists and engineers who have little knowledge of preconditioned iterative methods. The prediction of ILU type preconditioners was considered in [27], where the support vector machine (SVM), as a data mining technique, is used to classify large sparse linear systems and predict the best preconditioners. In this paper, we apply the data mining approach to the sparse approximate inverse (SAI) type preconditioners to find parameters with which the preconditioned Krylov subspace method shows the best performance on the linear systems.

Mobile Monitoring System for Large Scale Scientific Computing Center (대규모 과학계산 컴퓨팅센터를 위한 모바일 모니터링 시스템)

  • Choi, Min
    • Journal of Convergence Society for SMB / v.2 no.1 / pp.41-50 / 2012
  • In this research, we developed a scalable resource monitoring system for large-scale scientific computing data centers. Keeping track of every computing node usually involves limitations and overhead because of the huge number of nodes. This research therefore proposes a layered summarization technique applied while collecting all system resource information. The technique improves scalability by reducing the amount of information at higher layers. Our prototype system, which is implemented as a web service, can be used with HTML5 mobile web technology on smart devices.

Application of access control policy in ScienceDMZ-based network configuration (ScienceDMZ 기반의 네트워크 구성에서 접근제어정책 적용)

  • Kwon, Woo Chang;Lee, Jae Kwang;Kim, Ki Hyeon
    • Convergence Security Journal / v.21 no.2 / pp.3-10 / 2021
  • Data-driven scientific research is now a trend, and the transmission of large amounts of data has a great influence on research productivity. To address this, a separate network structure for transmitting large-scale scientific big data is required. ScienceDMZ is a network structure designed to transmit such scientific big data. In such a network configuration, it is essential to establish an access control list (ACL) for users and resources. In this paper, we describe the R&E Together project and the network structure implemented in the actual ScienceDMZ network, and define the users and services to which access control policies are applied for safe data transmission and service provision. In addition, we present a method by which the network administrator can apply the access control policy to all network resources and users collectively, which made it possible to automate the application of the access control policy.

VotingRank: A Case Study of e-Commerce Recommender Application Using MapReduce

  • Ren, Jian-Ji;Lee, Jae-Kee
    • Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.834-837 / 2009
  • There is a growing need for ad-hoc analysis of extremely large data sets, especially at e-Commerce companies that depend on recommender applications. As the number of e-Commerce web pages grows to tremendous proportions, vertical recommender services can help customers find what they need. Recommender applications are one of the reasons for e-Commerce success in today's world. Compared with a general e-Commerce recommender application, the processing scope of a vertical recommender application is greatly narrowed down. MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. The objective of this paper is to explore the MapReduce framework for e-Commerce recommender applications, based on major general and dedicated link analysis; as a result, the response time decreased and the recommender application's accuracy improved.