• Title/Summary/Keyword: Data Scientists

A Data Dependency Elimination Algorithm for Extracting Maximum Parallelism (최대 병렬성 추출을 위한 자료 종속성 제거 알고리즘)

  • 송월봉;박두순
    • Journal of KIISE:Software and Applications / v.26 no.1 / pp.139-139 / 1999
  • In most application programs, loops account for most of the computation and are the most important source of parallelism. When the data dependency relation is uniform in terms of distance, several compile-time parallelization methods have been introduced. On the other hand, when the data dependency relation is non-uniform in distance, compile-time extraction of parallelism is much more complicated. In this paper, a general method for extracting parallelism in nested loops is presented. The algorithm is applicable whether the dependency relation is uniform or non-uniform in distance. An algorithm that effectively removes the data dependencies that arise as the statements in nested loops execute repeatedly is developed in order to achieve total parallelization of nested loops.
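
The distinction between uniform and non-uniform dependence distance in the abstract above can be illustrated with a small brute-force check. This is only a sketch for intuition, not the paper's algorithm: it uses a single loop rather than nested loops, and the function name `dependence_distances` and the two example loops are hypothetical.

```python
from collections import defaultdict

def dependence_distances(n, write_index, read_index):
    """Brute-force the flow-dependence distances of a single loop.

    Iteration i writes a[write_index(i)] and reads a[read_index(i)].
    A flow dependence runs from iteration i to a later iteration j
    when j reads the element that i wrote. Returns the set of j - i.
    """
    writers = defaultdict(list)
    for i in range(n):
        writers[write_index(i)].append(i)

    distances = set()
    for j in range(n):
        for i in writers.get(read_index(j), []):
            if i < j:
                distances.add(j - i)
    return distances

# for i: a[i] = a[i-2] + 1   -> every dependence has the same distance
uniform = dependence_distances(16, lambda i: i, lambda i: i - 2)

# for i: a[2*i] = a[i] + 1   -> the distance grows with the iteration
non_uniform = dependence_distances(16, lambda i: 2 * i, lambda i: i)
```

In the first loop a single constant distance describes every dependence, which is the case classical compile-time methods handle. In the second loop the set of distances keeps growing with the iteration space, which is why a single distance vector no longer suffices.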

An Indexing Method to Prevent Attacks based on Frequency in Database as a Service (서비스로의 데이터베이스에서 빈도수 기반의 추론공격 방지를 위한 인덱싱 기법)

  • Jung, Kang-Soo;Park, Seog
    • Journal of KIISE:Computing Practices and Letters / v.16 no.8 / pp.878-882 / 2010
  • The DaaS model, in which data owners entrust their data to a service provider, has a problem of privacy leakage by the service provider. In this paper, we analyze inference attacks that can occur through the index on encrypted data consisting of multiple columns, and we propose b-anonymity to protect the data against such attacks. We use the R+-tree technique to minimize the false positives that can arise when an index is used for efficient data processing.
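
To make the frequency-based inference attack concrete, the following sketch shows how a deterministic per-value index token leaks the value distribution, and how grouping values into shared buckets hides it at the cost of false positives. Everything here is an illustrative assumption: the `token` helper, the sample column, and the simple "b values per bucket" grouping are hypothetical and are not the paper's b-anonymity definition or its R+-tree construction.

```python
from collections import Counter
import hashlib

def token(value, key="demo-key"):
    # Deterministic stand-in for an index token (hypothetical, not real encryption).
    return hashlib.sha256((key + value).encode()).hexdigest()

def rank_match(tokens, known_freq):
    """Guess token -> plaintext by aligning frequency ranks."""
    by_count = [t for t, _ in Counter(tokens).most_common()]
    by_known = sorted(known_freq, key=known_freq.get, reverse=True)
    return dict(zip(by_count, by_known))

column = ["flu"] * 6 + ["cold"] * 4 + ["gout"] * 2 + ["acne"] * 1
known_freq = Counter(column)  # assumption: the attacker knows the distribution

# One token per distinct value: frequencies survive, so rank matching works.
per_value = [token(v) for v in column]
guess = rank_match(per_value, known_freq)
recovered = sum(guess[token(v)] == v for v in set(column))

# Share one token among b values: the server sees only bucket frequencies.
b = 2
bucket_of = {v: i // b for i, v in enumerate(sorted(set(column)))}
per_bucket = [token(f"bucket-{bucket_of[v]}") for v in column]
distinct_tokens = len(set(per_bucket))

# A query for "flu" now also returns the other rows in its bucket.
false_positives = sum(
    bucket_of[v] == bucket_of["flu"] and v != "flu" for v in column
)
```

The last count shows the trade-off the abstract refers to: coarser index values resist frequency matching but return extra rows that the client must filter out.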

The Enhancement of a Power Control MAC Protocol for Ad Hoc Networks (에드혹 네트웍에서의 전력제어 MAC 프로토콜 향상)

  • 심은숙;김동균
    • Proceedings of the Korean Information Science Society Conference / 2004.10c / pp.220-222 / 2004
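  • Because the nodes that make up a mobile ad hoc network generally run on battery power, research on reducing their energy consumption has been carried out at every layer. Several studies have proposed power control schemes to reduce the power consumption of IEEE 802.11 DCF, which is widely used as a medium access control protocol. The BASIC Power Control Protocol applies different power levels to RTS-CTS and DATA-ACK: RTS-CTS is transmitted at maximum power, while DATA-ACK is transmitted at the minimum necessary power to avoid wasting energy. As a result, the transmission range and carrier sensing range of DATA-ACK are smaller than those of RTS-CTS. Nodes that sense RTS-CTS within the carrier sensing range cannot decode the signal correctly, so they set their NAV to EIFS. Because this EIFS interval is too short to prevent collisions, these nodes regard the channel as idle once the EIFS interval has passed and attempt to transmit. Consequently, the collision rate during DATA-ACK reception increases and the overall network throughput decreases. This paper proposes a new power control MAC protocol that resolves the problems of the BASIC power control scheme and improves overall network throughput.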

Design and Implementation of a Distributed Data Mining Framework (분산된 데이터마이닝을 위한 프레임워크의 설계 및 구현)

  • Kadel, Prakash;Choi, Ho-Jin
    • Proceedings of the Korean Information Science Society Conference / 2007.06c / pp.336-340 / 2007
  • We envisage that grid computing environments allow us to implement distributed data mining services, that is, those applications which analyze large sets of geographically distributed databases and information using the computational power and resources of a grid environment. This paper describes an experimental framework towards such a distributed data mining approach, including design considerations and a prototype implementation. Based on the "Knowledge Grid" architecture suggested by Cannataro et al., we identify four major components - user node, broker node, data node, and computation node - and define their individual roles. For implementing the prototype, we have investigated methods for utilizing distributed resources within a grid computing environment, e.g., communication and coordination among the various resources available.

On the Bayesian Statistical Inference (베이지안 통계 추론)

  • Lee, Ho-Suk
    • Proceedings of the Korean Information Science Society Conference / 2007.06c / pp.263-266 / 2007
  • This paper discusses Bayesian statistical inference, covering Bayesian inference, MCMC (Markov Chain Monte Carlo) integration, the MCMC method, the Metropolis-Hastings algorithm, Gibbs sampling, maximum likelihood estimation, the Expectation Maximization algorithm, missing data processing, and BMA (Bayesian Model Averaging). Bayesian statistical inference is used to process large amounts of data in biology, medicine, bioengineering, science and engineering, and general data analysis and processing, and provides an important method for drawing optimal inference results. Lastly, the paper discusses principal component analysis (PCA), which is also used for data analysis and inference.
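
As a small illustration of one technique named in the abstract above, here is a minimal random-walk Metropolis-Hastings sampler. It is a generic textbook sketch, not code from the paper: the target (a standard normal), the step size, the seed, and the burn-in length are all assumptions chosen for the example.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a Gaussian proposal."""
    rng = random.Random(seed)
    x, logp = x0, log_target(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_target(proposal)
        # The proposal is symmetric, so the acceptance probability
        # reduces to min(1, target(proposal) / target(x)).
        accept_prob = math.exp(min(0.0, logp_new - logp))
        if rng.random() < accept_prob:
            x, logp = proposal, logp_new
        samples.append(x)  # a rejected proposal repeats the current state
    return samples

def log_std_normal(x):
    # Log-density of N(0, 1) up to an additive constant.
    return -0.5 * x * x

# Discard the first 1000 draws as burn-in (an arbitrary choice here).
draws = metropolis_hastings(log_std_normal, x0=0.0, n_samples=20000)[1000:]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

Because only the ratio of target densities enters the acceptance step, the normalizing constant of the posterior never needs to be computed, which is what makes the method useful for Bayesian inference.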

Efficient Rendering Engine of Large Scale Terrain Data for Streaming Services (대용량 위성영상 지형 데이타의 스트리밍 서비스를 위한 효율적인 렌더링 모듈)

  • Park, Tae-Joo;Lee, Sang-Jun
    • Journal of KIISE:Computing Practices and Letters / v.14 no.7 / pp.748-752 / 2008
  • Various services have been developed as a result of advances in satellite imagery methodologies and the expansion of internet infrastructure. However, most of these services still rely on low-resolution satellite images combined with DEM models. In this paper, we implement raw data processing modules and other modules that transfer and render high-spatial-resolution satellite images for efficient streaming services in web environments. Using the Bukhan-mountain data as a pilot study, the paper proposes an efficient approach to solving graphical problems in the real-time processing of a large geographical area.

The Situation Lens: A Metaphor for Personal Task Management on Mobile Devices

  • Celentano, Augusto;Faralli, Stefano;Pittarello, Fabio
    • Journal of Computing Science and Engineering / v.3 no.4 / pp.238-259 / 2009
  • In this paper we discuss personal data management with mobile devices, an activity requiring the composition of services offered by standard suites of applications. We propose a data model and an interface model that allow users to define activities, tasks and services, and to navigate among them according to the evolution of the personal situation as perceived and interpreted by the users themselves. The interface model acts as a lens exploring the situation, zooming into the details, covering different areas of the personal data, and supporting the user in the role of a composer of personal services.

Virtual Data Grouping for Performance Enhancement of Multi-User Games (다중 사용자 게임 성능 향상을 위한 데이터 가상 그룹핑 방법)

  • 이철민;박홍성
    • Journal of KIISE:Software and Applications / v.30 no.3_4 / pp.231-238 / 2003
  • This paper presents a virtual grouping method for multi-user network games that reduces response time and the loss of response data. The proposed method divides the overall game map into several fixed regions and groups them, then divides each group into virtual groups and transmits data within them. The paper also derives the optimal number of groups that minimizes a given cost function. The proposed method is shown to be useful by comparison with a general grouping method.

Online Clustering Algorithms for Semantic-Rich Network Trajectories

  • Roh, Gook-Pil;Hwang, Seung-Won
    • Journal of Computing Science and Engineering / v.5 no.4 / pp.346-353 / 2011
  • With the advent of ubiquitous computing, a massive amount of trajectory data has been published and shared in many websites. This type of computing also provides motivation for online mining of trajectory data, to fit user-specific preferences or context (e.g., time of the day). While many trajectory clustering algorithms have been proposed, they have typically focused on offline mining and do not consider the restrictions of the underlying road network and selection conditions representing user contexts. In clear contrast, we study an efficient clustering algorithm for Boolean + Clustering queries using a pre-materialized and summarized data structure. Our experimental results demonstrate the efficiency and effectiveness of our proposed method using real-life trajectory data.

Challenges and New Approaches in Genomics and Bioinformatics

  • Park, Jong Hwa;Han, Kyung Sook
    • Genomics & Informatics / v.1 no.1 / pp.1-6 / 2003
  • In conclusion, the seemingly fuzzy and disorganized data of biology, with thousands of different layers ranging from molecule to the Internet, have so far resisted being mapped precisely and predicted successfully by mathematicians, physicists or computer scientists. Genomics and bioinformatics are the fields that process such complex data. Insights into the nature of biological entities as complex interaction networks are opening a door toward a generalization of the representation of biological entities. The main challenge of genomics and bioinformatics now lies in 1) how to data mine the networks of the domains of bioinformatics, namely, the literature, metabolic pathways, and proteome and structures, in terms of interaction; and 2) how to generalize the networks in order to integrate the information into computable genomic data regardless of the level of layer. Once bioinformaticians succeed in finding a general principle for the way components interact with each other to form any organic interaction network at genomic scale, true simulation and prediction of life in silico will be possible.