• Title/Summary/Keyword: 과학기술 데이터 (science and technology data)

An Integrated Hierarchical Temporal Memory Network for Multi-interval Prediction of Data Streams (데이터 스트림의 다중-간격 예측을 위한 통합된 계층형 시간적 메모리 네트워크)

  • Diao, Jian-Hua;Bae, Sun-Gap;Sim, Myung-Sun;Bae, Jong-Min;Kang, Hyun-Syug
    • Journal of KIISE:Software and Applications / v.37 no.7 / pp.558-567 / 2010
  • There is a large body of ongoing research on efficient prediction methods for data streams. These methods provide a single prediction at a fixed time interval. A method for multi-interval prediction (MIP) is needed because, in many cases, different intervals yield different prediction results. In this paper, we propose a solution for MIP based on the Hierarchical Temporal Memory (HTM) model. To solve the MIP problem with HTM, we present an Integrated Hierarchical Temporal Memory (IHTM) network, which introduces a new node type, Zeta1LastNode, to the original HTM network. Using the hierarchical characteristic of the IHTM network, different levels in the network learn and model the features of a data stream at different intervals and generate prediction results for those intervals. Performance evaluation shows that the IHTM network is more efficient in memory and time consumption than the original HTM network for MIP.

A Study on the Prediction of Residual Probability of Fine Dust in Complex Urban Area (복잡한 도심에서의 유입된 미세먼지 잔류 가능성 예보 연구)

  • Park, Sung Ju;Seo, You Jin;Kim, Dong Wook;Choi, Hyun Jeong
    • Journal of the Korean earth science society / v.41 no.2 / pp.111-128 / 2020
  • This study examines the possibility that complex urban structure intensifies fine dust mass concentration, using a data mining technique and clustering analysis. The data mining technique showed no significant correlation between fine dust concentration and regional-use public urban data over Seoul. However, clustering analysis based on nationwide-use public data showed that building height (number of floors) correlates strongly with PM10 in particular. Modeling analyses using the single canopy model and the micro-atmospheric modeling program (ENVI-met 4) indicated that controlled atmospheric convection in urban areas led to a congested flow pattern that depends on the distribution and height of buildings. The complex structure of urban buildings controls convective activity, resulting in stagnant conditions and increased fine dust near the surface. Consequently, the residual effect of changes in the thermal environment caused by the shape and structure of urban buildings must be considered in the fine dust distribution. Notably, atmospheric congestion may have important implications for providing information about the residual probability of fine dust mass concentration in complex urban areas.

Qualitative and Quantitative Analysis for Microbiome Data Matching between Objects (마이크로바이옴 데이터 일치를 위한 물체들 사이의 정량 및 정성적 분석)

  • You, Hee Sang;Ok, Yeon Jeong;Lee, Song Hee;Lee, So Lip;Lee, Young Ju;Lee, Min Ho;Hyun, Sung Hee
    • Korean Journal of Clinical Laboratory Science / v.52 no.3 / pp.202-213 / 2020
  • Although technological advances have allowed the efficient collection of large amounts of microbiome data for microbiological studies, proper analysis tools for such big data are still lacking. Additionally, analyses of microbial communities that rely on poor databases can produce misleading results. Hence, this study aimed to design an appropriate method for the analysis of big microbial databases. Bacteria were collected from the fingertips and personal belongings (mobile phones and laptop keyboards) of individuals. Genomic DNA was extracted from these bacteria and subjected to next-generation sequencing targeting the 16S rRNA gene. The accuracy of the bacterial matching percentage between the fingertips and personal belongings was verified using a formula together with environment-related and human-related databases. To design an appropriate analysis, the bacterial matching accuracy was calculated for the following three comparisons: qualitative versus quantitative analysis, same-gender participants versus all participants regardless of gender, and a human-related bacterial database (hDB) versus an environment-related bacterial database (eDB). The results showed that qualitative analysis, comparisons within same-gender participants, and the use of hDB provided relatively accurate results. This study provides an analytical method for obtaining accurate results in studies of big microbiological data that involve human-derived microorganisms.
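
The abstract above contrasts qualitative and quantitative matching but does not give the formula. As a rough illustration only, the sketch below assumes that "qualitative" means presence/absence overlap of taxa and "quantitative" means overlap of relative abundances. The function names and the sample counts are hypothetical and are not taken from the paper.

```python
def qualitative_match(source, target):
    """Percent of taxa detected in `source` that are also detected in `target`.

    Assumption: a taxon counts as detected when its read count is above zero,
    and `source` contains at least one detected taxon.
    """
    present = {taxon for taxon, count in source.items() if count > 0}
    shared = {taxon for taxon in present if target.get(taxon, 0) > 0}
    return 100.0 * len(shared) / len(present)


def quantitative_match(source, target):
    """Percent overlap of relative abundances (sum of per-taxon minima).

    Assumption: both samples have a non-zero total read count.
    """
    source_total = sum(source.values())
    target_total = sum(target.values())
    taxa = set(source) | set(target)
    overlap = sum(
        min(source.get(t, 0) / source_total, target.get(t, 0) / target_total)
        for t in taxa
    )
    return 100.0 * overlap


# Made-up read counts per genus, for illustration only.
fingertip = {"Staphylococcus": 80, "Corynebacterium": 20}
phone = {"Staphylococcus": 30, "Bacillus": 70}

print(qualitative_match(fingertip, phone))
print(quantitative_match(fingertip, phone))
```

With these invented counts the two measures disagree, which is the kind of difference the three comparisons in the abstract are meant to expose.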

Adaptive Service Mode Conversion to Minimize Buffer Space Requirement in VOD Server (주문형 비디오 서버의 버퍼 최소화를 위한 가변적 서비스 모드 변환)

  • Won, Yu-Jip
    • Journal of KIISE:Computer Systems and Theory / v.28 no.5 / pp.213-217 / 2001
  • Excessive memory buffer requirements in continuous media playback are a serious impediment to the widespread use of online multimedia services. The skewed access frequency of available video files provides an opportunity to reuse data blocks that have been loaded by one session for later use. We present a novel algorithm that minimizes the buffer requirement across multiple multimedia playback sessions. In continuous media playback originating from disk, a certain amount of memory buffer is required to synchronize the asynchronous disk read operation with the synchronous playback operation. As aggregate playback bandwidth increases, a larger buffer needs to be allocated for this synchronization. The focus of this work is to study the asymptotic behavior of the synchronization buffer requirement and to develop an algorithm that copes with this excessive buffer requirement under bandwidth congestion. We argue that in a large-scale continuous media server, it may not be necessary to read the blocks for each session directly from the disk. The strength of our work is that it dynamically adapts to the server's disk utilization and finds the optimal way of servicing individual sessions while minimizing the overall buffer space requirement. The optimality of the proposed algorithm is proved. The effectiveness and performance of the proposed scheme are examined via simulation.

CNVDAT: A Copy Number Variation Detection and Analysis Tool for Next-generation Sequencing Data (CNVDAT : 차세대 시퀀싱 데이터를 위한 유전체 단위 반복 변이 검출 및 분석 도구)

  • Kang, Inho;Kong, Jinhwa;Shin, JaeMoon;Lee, UnJoo;Yoon, Jeehee
    • Journal of KIISE:Databases / v.41 no.4 / pp.249-255 / 2014
  • Copy number variations (CNVs) are a recently recognized class of human structural variations and are associated with a variety of human diseases, including cancer. To find important cancer genes, researchers identify novel CNVs in patients with a particular cancer and analyze large amounts of genomic and clinical data. We present a tool called CNVDAT, which detects CNVs from NGS data and systematically analyzes the genomic and clinical data associated with the variations. CNVDAT consists of two modules, CNV Detection Engine and Sequence Analyser. CNV Detection Engine extracts CNVs using a multi-resolution system of scale-space filtering, which enables it to detect the types and exact locations of CNVs of all sizes even when the coverage level of the read data is low. Sequence Analyser is a user-friendly program for viewing and comparing variation regions between tumor and matched normal samples. It also provides a complete analysis function for refGene and OMIM data and makes it possible to discover CNV-gene-phenotype relationships. CNVDAT source code is freely available from http://dblab.hallym.ac.kr/CNVDAT/.
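
The abstract above mentions scale-space filtering without detail. The sketch below is not CNVDAT's implementation. It only illustrates the general idea of smoothing a read-depth ratio with Gaussian kernels of several widths and keeping the positions that look like a gain at every scale. The function names, the set of scales, and the 1.5 threshold are all assumptions chosen for the example.

```python
import math


def gaussian_smooth(signal, sigma):
    """Smooth `signal` with a Gaussian kernel truncated at 3 sigma.

    Near the edges the kernel is renormalized over the samples that exist.
    """
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    smoothed = []
    for i in range(len(signal)):
        num = den = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(signal):
                num += w * signal[j]
                den += w
        smoothed.append(num / den)
    return smoothed


def persistent_gains(depth_ratio, sigmas=(1.0, 2.0, 4.0), threshold=1.5):
    """Return positions whose smoothed depth ratio exceeds `threshold` at every scale."""
    layers = [gaussian_smooth(depth_ratio, s) for s in sigmas]
    return [
        i for i in range(len(depth_ratio))
        if all(layer[i] > threshold for layer in layers)
    ]


# Synthetic tumor/normal depth ratio: a duplicated block in positions 20..39.
ratio = [1.0] * 20 + [2.0] * 20 + [1.0] * 20
gains = persistent_gains(ratio)
print(gains[0], gains[-1])
```

Requiring agreement across scales is one simple way to suppress narrow noise spikes while keeping broad events. A real detector would also need to handle losses and noisy, low-coverage data.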

An Efficient Clustering Algorithm based on Heuristic Evolution (휴리스틱 진화에 기반한 효율적 클러스터링 알고리즘)

  • Ryu, Joung-Woo;Kang, Myung-Ku;Kim, Myung-Won
    • Journal of KIISE:Software and Applications / v.29 no.1_2 / pp.80-90 / 2002
  • Clustering is a useful technique for grouping data points so that points within a single group (cluster) have similar characteristics. Many clustering algorithms have been developed and used in engineering applications, including pattern recognition and image processing. Recently, clustering has drawn increasing attention as one of the important techniques in data mining. However, clustering algorithms such as K-means and Fuzzy C-means suffer from two difficulties: the number of clusters must be determined a priori, and the clustering result depends on the initial set of clusters, which can yield undesirable results. In this paper, we propose a new clustering algorithm that solves these problems. Our method uses an evolutionary algorithm to solve the local optima problem, in which clustering converges to an undesirable state when it starts from an inappropriate set of clusters. We also adopt a new measure that represents how well the data are clustered. The measure is determined in terms of both intra-cluster dispersion and inter-cluster separability. Using this measure, our method determines the number of clusters automatically as the result of the optimization process. We also combine heuristics, that is, problem-specific knowledge, with the evolutionary algorithm to speed up its search. We tested our algorithm on several sets of multi-dimensional data, and the results show that it outperforms the existing algorithms.
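
The abstract above does not define its clustering measure. As a hedged illustration of how intra-cluster dispersion and inter-cluster separability can be combined into one score, the sketch below divides the mean distance of points to their own centroid by the smallest distance between centroids. This ratio is an assumption made for the example, not the measure proposed in the paper.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def validity(clusters):
    """Lower is better: intra-cluster dispersion divided by inter-cluster separation.

    Assumption: at least two clusters, each non-empty, with distinct centroids.
    """
    centers = [centroid(c) for c in clusters]
    total = sum(len(c) for c in clusters)
    dispersion = sum(
        math.dist(p, center)
        for cluster, center in zip(clusters, centers)
        for p in cluster
    ) / total
    separation = min(
        math.dist(a, b)
        for i, a in enumerate(centers)
        for b in centers[i + 1:]
    )
    return dispersion / separation


# Two candidate partitions of the same four points.
tight = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
loose = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]

print(validity(tight) < validity(loose))  # True
```

A score of this kind can serve as the fitness function of an evolutionary search, because it can compare partitions that have different numbers of clusters.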

Application of Symbolic Representation Method for Fault Detection and Clustering in Semiconductor Fabrication Processes (반도체공정 이상탐지 및 클러스터링을 위한 심볼릭 표현법의 적용)

  • Loh, Woong-Kee;Hong, Sang-Jeen
    • Journal of KIISE:Computing Practices and Letters / v.15 no.11 / pp.806-818 / 2009
  • Since the invention of the integrated circuit (IC) in the 1950s, semiconductor technology has developed dramatically up to the present day. A complete semiconductor is manufactured through a variety of processes. To improve semiconductor productivity, fault detection and classification (FDC) has been studied rigorously as a way to find faults even before the processes are completed. For FDC, various kinds of sensors are attached to many semiconductor manufacturing devices, and sensor values are collected periodically. The collected sensor values consist of sequences of real numbers and are therefore regarded as a kind of time-series data. In this paper, we propose an algorithm for detecting and clustering faults in semiconductor processes. The proposed algorithm is a modification of an existing anomaly detection algorithm for symbolically represented time-series. The contributions of this paper are: (1) showing that a modification of the existing anomaly detection algorithm for general time-series can be used for semiconductor process data and (2) presenting experimental results that show improved correctness of fault detection and clustering. In our experiment, the proposed algorithm produced neither false positives nor false negatives.
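
The abstract above does not name the symbolic representation it builds on. The sketch below assumes a SAX-style encoding (z-normalization, piecewise aggregate approximation, then discretization with equiprobable Gaussian breakpoints) and compares the resulting words by counting mismatched symbols. The function names and the toy sensor traces are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Encode a numeric sequence as a symbolic word (SAX-style sketch).

    Assumption: len(series) >= n_segments.
    """
    mu, sigma = mean(series), pstdev(series)
    # z-normalize; a constant series maps to all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise aggregate approximation: mean of each roughly equal segment
    n = len(z)
    paa = [
        mean(z[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]
    # Breakpoints that split the standard normal into equiprobable regions
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


def mismatches(word_a, word_b):
    """Number of positions at which two equal-length words differ."""
    return sum(a != b for a, b in zip(word_a, word_b))


# Toy sensor traces: a step up, and the same step reversed.
normal_run = [0, 0, 0, 0, 10, 10, 10, 10]
suspect_run = [10, 10, 10, 10, 0, 0, 0, 0]

print(sax(normal_run, 2, "ab"))  # ab
print(mismatches(sax(normal_run, 2, "ab"), sax(suspect_run, 2, "ab")))  # 2
```

Working on short words instead of raw sensor values makes it cheap to compare a new run against reference runs. How mismatches are turned into a fault decision is left open here.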

Analytical Methods for the Analysis of Structural Connectivity in the Mouse Brain (마우스 뇌의 구조적 연결성 분석을 위한 분석 방법)

  • Im, Sang-Jin;Baek, Hyeon-Man
    • Journal of the Korean Society of Radiology / v.15 no.4 / pp.507-518 / 2021
  • Magnetic resonance imaging (MRI) is a key technology that is increasingly used to study the structural and functional inner workings of the brain. Analyzing the variability of the brain connectome through tractography has been used to improve our understanding of disease pathology in humans. However, analysis methods for small animals such as mice lack standardization, and there is no scientific consensus on accurate preprocessing strategies or atlas-based neuroinformatics for images. In addition, it is difficult to acquire high-resolution images of mice because the mouse brain is far smaller than the human brain. In this study, we present an Allen Mouse Brain Atlas-based image data analysis pipeline for structural connectivity analysis, which involves structural region segmentation using mouse brain structural images and diffusion tensor images. Each analysis method enabled the analysis of mouse brain image data using reliable software that has already been verified with human and mouse image data. In addition, the pipeline presented in this study is optimized so that users can process data efficiently: it organizes the functions necessary for mouse tractography out of the many complex analysis processes and functions available.

Appropriate App Services and Acceptance for Contact Tracing: Survey Focusing on High-Risk Areas of COVID-19 in South Korea (코로나 19 동선 관리를 위한 적정 앱 서비스와 도입: 고위험 지역 설문 연구)

  • Rho, Mi Jung
    • Korea Journal of Hospital Management / v.27 no.2 / pp.16-33 / 2022
  • Purposes: Prompt evaluation of routes and contact tracing are very important for epidemiological investigations of coronavirus disease 2019 (COVID-19). To ensure better adoption of contact tracing apps, it is necessary to understand users' expectations, preferences, and concerns. This study aimed to identify the main reasons why people use the apps, the appropriate services, and a basis for voluntary app services that can improve app participation rates and data sharing. Methodology/Approach: This study conducted an online survey from November 11 to December 6, 2020, and received a total of 1,048 survey responses. It analyzed the questionnaire responses of 883 respondents in areas with many confirmed cases of COVID-19, using multiple regression analysis. Findings: Respondents who had experience using related apps showed a high intention to use contact tracing apps. Participants wished for contact tracing apps to be provided by the government or public health centers (74%) and preferred free apps (93.88%). The factors affecting the participants' intention to use these apps were their preventive value, performance expectancy, perceived risk, facilitative ability, and effort expectancy. The results highlighted the need to ensure voluntary participation in order to address participants' concerns regarding privacy protection and personal information exposure. Practical Implications: The results can be used to accurately identify user needs and appropriate services and thereby improve the development of contact tracing apps. The findings provide a basis for voluntary apps that can enhance app participation rates and data sharing. The results will also serve as a basis for developing trusted apps that can facilitate epidemiological investigations.

A Study on the Prediction Model for Bioactive Components of Cnidium officinale Makino according to Climate Change using Machine Learning (머신러닝을 이용한 기후변화에 따른 천궁 생리 활성 성분 예측 모델 연구)

  • Hyunjo Lee;Hyun Jung Koo;Kyeong Cheol Lee;Won-Kyun Joo;Cheol-Joo Chae
    • Smart Media Journal / v.12 no.10 / pp.93-101 / 2023
  • Climate change has emerged as a global problem, with frequent temperature increases, droughts, and floods, and it is predicted to have a great impact on the characteristics and productivity of crops. Cnidium officinale is used not only as a traditional herbal medicine but also as a raw material for various industrial products such as health functional foods, natural medicines, and living materials, but its productivity is decreasing due to threats such as continuous crop damage and climate change. Therefore, this paper proposes a model that predicts the physiologically active ingredient index of Cnidium officinale, a representative medicinal crop vulnerable to climate change, under climate change scenarios. The data were first augmented using the CTGAN algorithm to address data imbalance in the collected environmental information, physiological responses, and physiologically active ingredient information. Column Shape and Column Pair Trends were used to measure the quality of the augmented data, and an overall quality of 88% was achieved on average. Five models (RF, SVR, XGBoost, AdaBoost, and LightGBM) were then trained on the augmented data to predict phenol and flavonoid content separately for the above-ground and underground parts. In the model evaluation, the XGBoost model showed the best performance in predicting the physiologically active ingredients of Cnidium officinale, and it was confirmed to be about twice as accurate as the SVR model.