• Title/Summary/Keyword: set-based analysis (집합 기반 분석)


Optimization-Based Pattern Generation for LAD (최적화에 기반을 둔 LAD의 패턴 생성 기법)

  • Jang, In-Yong;Ryoo, Hong-Seo
    • Journal of the Korea Society of Computer and Information / v.11 no.1 s.39 / pp.11-18 / 2006
  • The logical analysis of data (LAD) is a Boolean-logic based data mining tool. A critical step in analyzing data by LAD is the pattern generation stage, where useful knowledge and hidden structural information in data are discovered in the form of patterns. The conventional method for pattern generation in LAD is based on term enumeration, which renders the generation of higher-degree patterns practically impossible. In this paper, we present a novel optimization-based pattern generation methodology and propose two mathematical programming models: a mixed 0-1 integer and linear programming (MILP) formulation for the generation of optimal patterns, and a well-studied set covering problem (SCP) formulation for the generation of heuristic patterns. With benchmark datasets, we demonstrate the effectiveness of our models by easily and automatically generating patterns of high complexity that cannot be generated with the conventional approach.

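The abstract above names a set covering problem (SCP) formulation for heuristic pattern generation. As a minimal sketch of what solving an SCP instance heuristically can look like, the following uses the standard greedy heuristic. This is an assumption for illustration, not necessarily the solver used by the authors; the function name and toy data are hypothetical.

```python
def greedy_set_cover(universe, subsets):
    """Greedy heuristic for set covering: repeatedly pick the subset
    that covers the most still-uncovered elements.
    Returns the indices of the chosen subsets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Index of the subset covering the most uncovered elements.
        best = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
        if not uncovered & subsets[best]:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Toy instance (illustrative data only, not from the paper).
universe = {1, 2, 3, 4, 5}
subsets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
cover = greedy_set_cover(universe, subsets)
```

In an LAD setting, the elements would plausibly correspond to observations to be separated and the subsets to candidate literals, but that mapping is left open here because the abstract does not spell it out.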

Design and Implementation of Web-Based Blood-Cell Analysis System for Pathology Diagnosis (병리진단을 위한 웹기반 혈액영상 분석시스템의 설계 및 구현)

  • 김경수;이영신;김용국;이윤배;김판구
    • Proceedings of the Korea Multimedia Society Conference / 1998.10a / pp.333-337 / 1998
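  • In medicine, the use of computers is evolving beyond simply automating data processing toward assisting physicians' diagnoses by automatically processing various medical images. To automate the blood tests frequently performed in hospital clinical pathology departments, this paper builds a web-based analysis system that analyzes blood automatically. To this end, it describes the steps for extracting features from blood images and presents an implementation that uses a multilayer neural network for cell classification. In addition, applying rough set theory to the training data as a preprocessing step to improve the network's learning efficiency effectively reduced the dimensionality of the training data.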

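The abstract above reports using rough set theory to reduce the dimensionality of the training data. A minimal sketch of that idea, under the simplifying assumption that the decision table is consistent to begin with: an attribute is dropped whenever the remaining attributes still determine the class label. The function names and the toy table are hypothetical and not taken from the paper.

```python
def is_consistent(rows, attrs):
    """True if rows that agree on all `attrs` also agree on the class label."""
    seen = {}
    for features, label in rows:
        key = tuple(features[a] for a in attrs)
        if seen.setdefault(key, label) != label:
            return False
    return True


def reduce_attributes(rows, attrs):
    """Drop attributes one at a time while the decision table stays consistent.
    The result depends on the order in which attributes are tried."""
    kept = list(attrs)
    for a in attrs:
        trial = [b for b in kept if b != a]
        if is_consistent(rows, trial):
            kept = trial
    return kept


# Toy decision table (illustrative values only): (features, class label).
rows = [
    ({"size": "small", "shape": "round", "color": "red"}, "A"),
    ({"size": "small", "shape": "oval", "color": "red"}, "B"),
    ({"size": "large", "shape": "round", "color": "red"}, "A"),
    ({"size": "large", "shape": "oval", "color": "blue"}, "B"),
]
reduced = reduce_attributes(rows, ["size", "shape", "color"])
```

In this toy table only `shape` is needed to separate the two classes, so the other two attributes are removed before any classifier would be trained.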

Production of Low-illuminated Image Sets based on Spectral Data for Color Constancy Research (색 항등성을 위한 분광 데이터 기반의 저조도 영상 집합 생성)

  • Kim, Dal-Hyoun;Lee, Woo-Ram;Hwang, Dong-Guk;Jun, Byoung-Min
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.7 / pp.3207-3213 / 2011
  • Most color constancy methods, which aim to determine an object's color regardless of the scene illuminant, have failed to meet performance expectations, especially for low-illuminated scenes. Higher-performing methods need to be developed, but first we must obtain experimental images for analyzing the required conditions and evaluating the methods. This paper therefore produces new image sets for use in developing color constancy methods suited to low-illuminated scenes. The sets consist of two parts: images synthesized from the spectral power distributions (SPD) of illuminants, the spectral reflectance curves of surfaces, and the sensor response functions of a camera; and images whose intensity is adjusted at a uniform rate. In experiments, an advantage of these sets is that the resulting images can be analyzed and evaluated quantitatively, because their ground truth data are known in advance.
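The two parts of the image sets described above can be sketched as follows, assuming the common discrete image formation model in which each channel response is the sum over wavelength samples of illuminant SPD times surface reflectance times sensor sensitivity. The function names and the three-sample spectra are hypothetical; real spectral data use many more wavelength samples.

```python
def sensor_response(spd, reflectance, sensitivity):
    """Discrete approximation of one sensor channel's response:
    sum over wavelength samples of illuminant * reflectance * sensitivity."""
    return sum(e * s * r for e, s, r in zip(spd, reflectance, sensitivity))


def synthesize_pixel(spd, reflectance, sensitivities):
    """Compute an (R, G, B) triple from spectral data."""
    return tuple(sensor_response(spd, reflectance, sensitivities[c]) for c in "RGB")


def dim(pixel, rate):
    """Scale every channel by the same rate to simulate a darker scene."""
    return tuple(rate * value for value in pixel)


# Toy spectra with three wavelength samples (illustrative values only).
spd = [1.0, 1.0, 1.0]            # flat illuminant
reflectance = [0.2, 0.5, 0.8]    # surface reflects more at long wavelengths
sensitivities = {"R": [0, 0, 1], "G": [0, 1, 0], "B": [1, 0, 0]}

pixel = synthesize_pixel(spd, reflectance, sensitivities)
dark_pixel = dim(pixel, 0.5)
```

Because the illuminant and reflectance are chosen by the experimenter, the ground truth for each synthesized pixel is known, which is what makes quantitative evaluation possible.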

Multiple Cause Model-based Topic Extraction and Semantic Kernel Construction from Text Documents (다중요인모델에 기반한 텍스트 문서에서의 토픽 추출 및 의미 커널 구축)

  • 장정호;장병탁
    • Journal of KIISE:Software and Applications / v.31 no.5 / pp.595-604 / 2004
  • Automatic analysis of concepts or semantic relations from text documents enables not only efficient acquisition of relevant information, but also comparison of documents at the concept level. We present a multiple cause model-based approach to text analysis, in which latent topics are automatically extracted from document sets and similarity between documents is measured by semantic kernels constructed from the extracted topics. In our approach, a document is assumed to be generated by various combinations of underlying topics. A topic is defined by a set of words that are related to the same topic or co-occur frequently within a document. In a network representing a multiple-cause model, each topic is identified by a group of words having high connection weights from a latent node. To facilitate learning and inference in multiple-cause models, some approximation methods are required, and we utilize an approximation by Helmholtz machines. In an experiment on the TDT-2 data set, we extract sets of meaningful words where each set contains some theme-specific terms. Using semantic kernels constructed from latent topics extracted by multiple cause models, we also achieve significant improvements over the basic vector space model in terms of retrieval effectiveness.
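To illustrate why a topic-based kernel can outperform the basic vector space model, here is a minimal sketch that assumes one simple kernel form: project each bag-of-words vector onto topic weight vectors, then take the inner product in topic space. The abstract does not give the exact kernel used by the authors, so this form, the function names, and the toy vocabulary are all assumptions.

```python
def topic_projection(doc_vector, topic_weights):
    """Map a bag-of-words vector to topic space (one value per topic)."""
    return [sum(w * x for w, x in zip(topic, doc_vector)) for topic in topic_weights]


def semantic_kernel(doc_a, doc_b, topic_weights):
    """Inner product of two documents after projection onto topics."""
    pa = topic_projection(doc_a, topic_weights)
    pb = topic_projection(doc_b, topic_weights)
    return sum(x * y for x, y in zip(pa, pb))


# Toy vocabulary: ["bank", "loan", "river", "water"] (illustrative only).
topic_weights = [
    [1, 1, 0, 0],  # hypothetical "finance" topic
    [0, 0, 1, 1],  # hypothetical "nature" topic
]
doc_bank = [1, 0, 0, 0]
doc_loan = [0, 1, 0, 0]
doc_water = [0, 0, 0, 1]

plain_similarity = sum(x * y for x, y in zip(doc_bank, doc_loan))
topic_similarity = semantic_kernel(doc_bank, doc_loan, topic_weights)
unrelated_similarity = semantic_kernel(doc_bank, doc_water, topic_weights)
```

The two finance documents share no words, so the plain dot product is zero, while the topic-space kernel still registers them as similar.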

A Secure Frequency Computation Method over Multisets (안전한 다중집합 빈도 계산 기법)

  • Kim, Myungsun;Park, Jaesung
    • The Journal of Korean Institute of Communications and Information Sciences / v.39B no.6 / pp.370-378 / 2014
  • It is well known that data mining, which extracts knowledge from large volumes of data, plays a crucial role in a variety of real-world applications. Among the functionalities provided by data mining, frequency mining over given multisets is a basic and essential one. However, most users would like to obtain frequencies over their multisets without revealing the multisets themselves. In this work, we propose a novel way to achieve this goal and prove its security rigorously. Our scheme has several advantages over existing work. First, it has the most efficient computational complexity in the cardinality of the multisets. Second, our security proof is rigorous and is given in the simulation paradigm. Lastly, our system assumptions are general.

Using rough set to support arbitrage box spread strategies in KOSPI 200 option markets (러프 집합을 이용한 코스피 200 주가지수옵션 시장에서의 박스스프레드 전략 실증분석 및 거래 전략)

  • Kim, Min-Sik;Oh, Kyong-Joo
    • Journal of the Korean Data and Information Science Society / v.22 no.1 / pp.37-47 / 2011
  • Various investment strategies have been developed for the stock price index option market. In particular, arbitrage strategies are important for keeping the option market efficient. The purpose of this study is to improve profit by combining rough sets with the box spread strategy, using past option trading data. The option trading data are actual exchange tick data from 2001 to 2006. For validation, the tick data were converted into one-minute intervals. Box spread arbitrage strategies are low risk but low profit. The improvement is pursued by back-testing the existing strategy on past data and by using rough sets to limit the trading time window. The study suggests that more stable profits with lower risk can be made by controlling the strategy so that it yields higher profit at the same level of risk.
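For readers unfamiliar with the box spread mentioned above, the following sketch shows the textbook construction: long call and short put at strike k1, plus long put and short call at strike k2 (k1 < k2), which pays k2 - k1 at expiry whatever the index level. The function names and prices are hypothetical, transaction costs are ignored, and the rough set filtering from the paper is not modeled.

```python
import math


def box_spread_payoff(s, k1, k2):
    """Payoff at expiry of a box spread for final index level s."""
    long_call_k1 = max(s - k1, 0.0)
    short_put_k1 = -max(k1 - s, 0.0)
    long_put_k2 = max(k2 - s, 0.0)
    short_call_k2 = -max(s - k2, 0.0)
    return long_call_k1 + short_put_k1 + long_put_k2 + short_call_k2


def box_spread_edge(c1, p1, c2, p2, k1, k2, rate, years):
    """Present value of the fixed payoff minus the net premium paid.
    A positive value indicates a theoretical arbitrage before costs."""
    cost = c1 - p1 - c2 + p2
    return (k2 - k1) * math.exp(-rate * years) - cost


# The payoff is the same at every final index level (illustrative strikes).
payoffs = [box_spread_payoff(s, 100, 110) for s in (80, 100, 105, 110, 130)]

# Illustrative premiums with a zero interest rate for easy checking.
edge = box_spread_edge(c1=7.0, p1=2.0, c2=3.0, p2=7.5, k1=100, k2=110, rate=0.0, years=0.25)
```

The fixed payoff is what makes the strategy low risk, and the small gap between that payoff and the net premium is why it is also low profit.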

Determination of Intrusion Log Ranking using Inductive Inference (귀납 추리를 이용한 침입 흔적 로그 순위 결정)

  • Ko, Sujeong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.1 / pp.1-8 / 2019
  • Among the methods for extracting the most appropriate information from a large amount of log data, there is a method using inductive inference. In this paper, we use SVM (Support Vector Machine), which is an excellent classification method for inductive inference, in order to determine the ranking of intrusion logs in digital forensic analysis. For this purpose, the logs of the training log set are classified into intrusion logs and normal logs. The associated words are extracted from each classified set to generate a related word dictionary, and each log is expressed as a vector based on the generated dictionary. Next, the logs are learned using the SVM. We classify test logs into normal logs and intrusion logs by using the log set extracted through learning. Finally, the recommendation orders of intrusion logs are determined to recommend intrusion logs to the forensic analyst.
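The step above in which each log is expressed as a vector based on a generated dictionary can be sketched as a simple word-count encoding. This is a simplified assumption: the paper extracts associated words, whereas this sketch just uses every word in the given logs, and the SVM training itself is omitted. The function names and log lines are hypothetical.

```python
from collections import Counter


def build_dictionary(logs):
    """Collect the distinct lowercase words of the given logs, sorted."""
    return sorted({word for log in logs for word in log.lower().split()})


def vectorize(log, dictionary):
    """Count how often each dictionary word occurs in the log."""
    counts = Counter(log.lower().split())
    return [counts[word] for word in dictionary]


# Toy intrusion logs (illustrative text only).
intrusion_logs = ["failed password root", "failed login admin"]
dictionary = build_dictionary(intrusion_logs)
vector = vectorize("Failed failed root", dictionary)
```

Vectors of this kind are what a classifier such as an SVM would take as input; words outside the dictionary are simply ignored.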

A Study on Recognition of Citation Metadata using Bidirectional GRU-CRF Model based on Pre-trained Language Model (사전학습 된 언어 모델 기반의 양방향 게이트 순환 유닛 모델과 조건부 랜덤 필드 모델을 이용한 참고문헌 메타데이터 인식 연구)

  • Ji, Seon-yeong;Choi, Sung-pil
    • Journal of the Korean Society for Information Management / v.38 no.1 / pp.221-242 / 2021
  • This study applied a bidirectional GRU-CRF model based on a pre-trained language model to the recognition of reference metadata. To construct the experiment set, 161,315 references were extracted with rule-based methods from 53,562 academic documents in PDF format, collected from 40 journals published in 2018. The study aims to automatically extract references from academic literature in PDF format. The language model with the highest performance was identified, and additional experiments were conducted on that model to compare recognition performance according to the size of the training set. Finally, the recognition performance for each metadata field was confirmed.

A New Policy-Based Attack Detection Method (정책기반의 새로운 공격 탐지 방법)

  • 김형훈
    • Review of KIISC / v.13 no.1 / pp.64-67 / 2003
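  • For computing environments to be used reliably and in practice, security is an essential requirement. Intrusion detection that relies on the patterns of known attacks is easily defeated by attackers' modified or novel attack methods. Moreover, many attack methods that subtly evade individual security policies are continually developed and attempted. Many successful intrusions can therefore be seen as exploiting the gaps between existing attack patterns and security policies. To detect new attacks, the method proposed in this paper obtains the necessary feature values through a rule set. The rule set is written so that it can detect attack characteristics, based on an analysis of known attacks, security policies, and the administrator's empirical knowledge. In the training phase, the feature values obtained through this rule set are turned into statistical features of attacks by a Naive Bayes classifier. Using these statistical features from the training phase, the proposed method can detect modified or new attacks.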
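The combination described above, rule-derived feature values fed to a Naive Bayes classifier, can be sketched as follows. The sketch assumes binary features (1 if a rule fired, 0 otherwise) and Laplace smoothing; neither detail is stated in the abstract, and the function names and training samples are hypothetical.

```python
import math


def train_naive_bayes(samples):
    """samples: list of (binary feature tuple, label).
    Returns per-class sample counts and per-feature counts of ones."""
    model = {}
    for features, label in samples:
        entry = model.setdefault(label, {"n": 0, "ones": [0] * len(features)})
        entry["n"] += 1
        for i, f in enumerate(features):
            entry["ones"][i] += f
    return model


def classify(model, features):
    """Return the label with the highest log posterior score."""
    total = sum(entry["n"] for entry in model.values())
    best_label, best_score = None, -math.inf
    for label, entry in model.items():
        score = math.log(entry["n"] / total)  # class prior
        for i, f in enumerate(features):
            # Laplace-smoothed probability that rule i fires for this class.
            p_one = (entry["ones"][i] + 1) / (entry["n"] + 2)
            score += math.log(p_one if f else 1 - p_one)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data: each tuple records which of three rules fired.
samples = [
    ((1, 1, 0), "attack"),
    ((1, 0, 1), "attack"),
    ((0, 0, 0), "normal"),
    ((0, 1, 0), "normal"),
]
model = train_naive_bayes(samples)
unseen = classify(model, (1, 1, 1))
quiet = classify(model, (0, 0, 0))
```

The pattern `(1, 1, 1)` never appears in the training data, yet it is still scored as an attack, which mirrors the stated goal of catching modified or new attacks from learned statistics.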

Design of GAS Identification model based on Rough Sets (러프집합 기반 GAS 식별 모델 설계)

  • Bang, Young-Keun;Zhao, Haibo;Kim, Nam-Suk;Lee, Chul-Heui
    • Proceedings of the KIEE Conference / 2011.07a / pp.1776-1777 / 2011
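  • Gas sensors, the counterpart of the human sense of smell, are currently the subject of considerable research. In this paper, the values measured by 32 gas sensors and a genetic algorithm (GA) are used to determine 8 sensor groups of 4 sensors each, and first-stage identification rules are then generated from the measurement patterns in each group using rough set theory. Next, the identification patterns of the 8 gas groups are analyzed and second-stage identification rules are generated, again with rough sets. The paper thus addresses how to design an identification model that is more efficient and more accurate in its decisions.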
