• Title/Summary/Keyword: Redundant Feature


Constructing Negative Links from Multi-facet of Social Media

  • Li, Lin;Yan, YunYi;Jia, LiBin;Ma, Jun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.5 / pp.2484-2498 / 2017
  • Social media let people share personal experiences in different ways. On some social networking sites, some users post reviews, others support those reviews with comments, and others simply rate the reviews to indicate support or not. Unfortunately, explicit negative comments on other users' reviews are rare, so any observed link between two users is effectively a positive link, and negative links are invisible in these social networks. In other words, negative links are redundant to positive links. In this work, we first discuss feature extraction from social media data and propose a new method to compute the distance between each pair of comments or reviews. We then investigate whether negative links can be predicted via regression analysis when only positive links are manifested in social media data. In particular, we provide a principled way to mathematically incorporate multi-facet data in a novel framework, Constructing Negative Links (CsNL), which predicts negative links to discover hidden information. Additionally, we investigate solutions to general negative link prediction problems with CsNL and its extension. Experiments on real-world data show that negative links are predictable from multi-facet social media data with the proposed CsNL framework. Essentially, the high prediction accuracy suggests that negative links are redundant to positive links. Further experiments evaluate the coefficients on different kernels; the results show that user-generated content dominates the prediction performance of CsNL.

Classifying Cancer Using Partially Correlated Genes Selected by Forward Selection Method (전진선택법에 의해 선택된 부분 상관관계의 유전자들을 이용한 암 분류)

  • 유시호;조성배
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.3 / pp.83-92 / 2004
  • A gene expression profile is numerical data on gene expression levels in an organism, measured on a microarray. In general, each specific tissue shows different expression levels in related genes, so cancer can be classified from gene expression profiles. Because not all genes are relevant to classification, the relevant genes must be selected, a step called feature selection. This paper proposes a new gene selection method based on the forward selection method from regression analysis. The method reduces redundant information among the selected genes, which makes classification more efficient. We used a k-nearest neighbor classifier and tested the method on a colon cancer dataset. Compared with methods based on Pearson's and Spearman's correlation coefficients, the proposed method performed better, reaching 90.3% classification accuracy. The method was also applied successfully to a lymphoma dataset.
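
To illustrate how forward selection in regression can suppress redundant features, here is a minimal sketch. It is not the authors' implementation: the function name `forward_select`, its parameters, and the tolerance are assumptions made for illustration. At each step it adds the feature that most reduces the residual sum of squares after removing what the already selected features explain, so a gene that merely duplicates a selected gene contributes nothing and is skipped.

```python
import math


def _center(values):
    """Subtract the mean so that an intercept term is not needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def forward_select(columns, target, k, tol=1e-9):
    """Greedy forward selection for least-squares regression (illustrative).

    columns: list of feature columns, one list of floats per feature.
    target:  list of numeric responses (for example, class labels).
    Returns up to k column indices in the order they were selected.
    """
    residual = _center(target)
    cols = [_center(c) for c in columns]
    basis = []      # orthonormal basis spanning the selected features
    selected = []
    for _ in range(k):
        best, best_gain, best_perp = None, tol, None
        for j, col in enumerate(cols):
            if j in selected:
                continue
            # Remove the part of this feature already explained by the
            # selected features (Gram-Schmidt orthogonalization).
            perp = col[:]
            for q in basis:
                proj = _dot(perp, q)
                perp = [p - proj * qi for p, qi in zip(perp, q)]
            norm2 = _dot(perp, perp)
            if norm2 <= tol:
                continue  # fully redundant with the selected features
            # Reduction in residual sum of squares if this feature is added.
            gain = _dot(residual, perp) ** 2 / norm2
            if gain > best_gain:
                best, best_gain, best_perp = j, gain, perp
        if best is None:
            break  # no remaining feature adds information
        norm = math.sqrt(_dot(best_perp, best_perp))
        q = [p / norm for p in best_perp]
        coef = _dot(residual, q)
        residual = [r - coef * qi for r, qi in zip(residual, q)]
        basis.append(q)
        selected.append(best)
    return selected
```

In this sketch, a feature that is an exact multiple of an already selected feature has a zero orthogonal component and is never chosen, which is one simple way to read "reducing redundant information in the selected genes".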

Task Reconstruction Method for Real-Time Singularity Avoidance for Robotic Manipulators : Dynamic Task Priority Based Analysis (로봇 매니플레이터의 실시간 특이점 회피를 위한 작업 재구성법: 동적 작업 우선도에 기초한 해석)

  • 김진현;최영진
    • Journal of Institute of Control, Robotics and Systems / v.10 no.10 / pp.855-868 / 2004
  • Several types of singularities arise in controlling robotic manipulators: kinematic, algorithmic, semi-kinematic, semi-algorithmic, and representation singularities. The kinematic and algorithmic singularities have been investigated intensively because they are unpredictable or difficult to avoid. The problems with these singularities are an unnecessary performance reduction in the non-singular region and difficulty in performance tuning. In this paper, we propose a method of avoiding kinematic and algorithmic singularities by applying a task reconstruction approach while maximizing task performance through the calculation of singularity measures. The proposed method is implemented by removing, in real time, the task component that approaches the singularity, as identified by the singularity measure. The outstanding feature of the proposed task reconstruction method (TR-method) is that it is based on local task reconstruction, as opposed to the local joint reconstruction of many other approaches. The method also has a dynamic task priority assignment feature, which ensures system stability in singular regions by changing the task priority. The TR-method allows the task controller gain to be increased to improve task performance, whereas in real experiments such an increase can destabilize the system under conventional algorithms. In addition, the physical meaning of the tuning parameters is straightforward. Hence, task performance can be maximized even near the singular region while singularity-free motion is obtained. The advantage of the proposed method is tested experimentally on a 7-DOF spatial manipulator, and the results show that the new method improves performance several times over existing algorithms.
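
As a small illustration of what a singularity measure can look like, the sketch below computes the manipulability measure sqrt(det(J Jᵀ)) for a two-link planar arm. This is an assumption chosen for illustration; the abstract does not say which measure the TR-method uses, and the arm model, function names, and threshold are hypothetical.

```python
import math


def planar_2r_jacobian(q1, q2, l1=1.0, l2=1.0):
    """Position Jacobian of a two-link planar arm (hypothetical example arm)."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ]


def manipulability(jac):
    """Manipulability of a square 2x2 Jacobian.

    For a square J, sqrt(det(J J^T)) equals |det J|, which for this arm
    is l1 * l2 * |sin(q2)| and vanishes when the arm is fully stretched
    or folded.
    """
    det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
    return abs(det)


def near_singularity(jac, threshold=0.05):
    """Flag configurations whose measure drops below an assumed threshold."""
    return manipulability(jac) < threshold
```

A controller could monitor such a measure in real time and act on the task components that drive it toward zero; how the TR-method does this is described in the paper itself, not in this sketch.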

Authorship Attribution Framework Using Survival Network Concept : Semantic Features and Tolerances (서바이벌 네트워크 개념을 이용한 저자 식별 프레임워크: 의미론적 특징과 특징 허용 범위)

  • Hwang, Cheol-Hun;Shin, Gun-Yoon;Kim, Dong-Wook;Han, Myung-Mook
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.6 / pp.1013-1021 / 2020
  • Malware authorship attribution is a research field that identifies malware by comparing the author characteristics of unknown malware with the characteristics of known malware authors. Authorship attribution from binaries has the advantage that targeted malicious code is easy to collect and analyze, but the range of usable features is limited compared with methods that use source code. As a result, accuracy decreases when the number of authors is large. To address these limitations in identifying binary authors, this study proposes defining semantic features from binaries and defining allowable ranges for redundant features using the concept of a survival network. The proposed method defines opcode-based graph features from binary information and uses the survival network concept to define the allowable range for selecting features unique to each author. This made it possible to combine feature definition and feature selection for each author into a single technique. Experiments confirmed a 5.0% improvement in accuracy over the previous study, reaching the same level of accuracy as source code-based analysis.

VRIFA: A Prediction and Nonlinear SVM Visualization Tool using LRBF kernel and Nomogram (VRIFA: LRBF 커널과 Nomogram을 이용한 예측 및 비선형 SVM 시각화도구)

  • Kim, Sung-Chul;Yu, Hwan-Jo
    • Journal of Korea Multimedia Society / v.13 no.5 / pp.722-729 / 2010
  • Prediction problems are common in medical domains. For example, computer-aided diagnosis or prognosis is a key component of a clinical decision support system (CDSS). SVMs with nonlinear kernels, such as the RBF kernel, have shown superior accuracy in prediction problems. However, physicians do not prefer them for medical prediction because nonlinear SVMs are difficult to visualize, which makes it hard to give physicians an intuitive interpretation of prediction results. The nomogram was proposed to visualize SVM classification models, but it cannot visualize nonlinear SVM models. The Localized Radial Basis Function (LRBF) kernel was proposed as a kernel whose accuracy is comparable to that of the RBF kernel but which is easier to interpret because it can be linearly decomposed. This paper presents a new tool named VRIFA, which integrates the nomogram and the LRBF kernel to provide users with an interactive visualization of nonlinear SVM models. VRIFA visualizes the internal structure of nonlinear SVM models, showing the effect of each feature, the magnitude of the effect, and the change in the prediction output. VRIFA also performs nomogram-based feature selection while training a model in order to remove noisy or redundant features and improve prediction accuracy. The area under the ROC curve (AUC) can be used to evaluate the prediction result when the data set is highly imbalanced. The tool can be used by biomedical researchers for computer-aided diagnosis and risk factor analysis for diseases.
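
Because the abstract mentions AUC as the evaluation measure for imbalanced data, a short sketch of how AUC can be computed may help. It uses the standard pairwise definition (the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as one half). The function name `auc` is an assumption and this is not code from VRIFA.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (illustrative).

    labels: iterable of 0/1 class labels (1 = positive).
    scores: iterable of prediction scores, higher meaning more positive.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))
```

The measure depends only on how positives rank against negatives, not on the proportion of each class, which is why it remains informative when one class is much rarer than the other.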

A Study on Data Pre-filtering Methods for Fault Diagnosis (시스템 결함원인분석을 위한 데이터 로그 전처리 기법 연구)

  • Lee, Yang-Ji;Kim, Duck-Young;Hwang, Min-Soon;Cheong, Young-Soo
    • Korean Journal of Computational Design and Engineering / v.17 no.2 / pp.97-110 / 2012
  • High-performance sensors and modern data logging technology with real-time telemetry make very precise system fault diagnosis possible. Fault detection, isolation, and identification are the typical steps a fault diagnosis system uses to analyze the root cause of failures. This systematic failure analysis provides not only useful clues for rectifying the abnormal behavior of a system but also key information for redesigning the current system for retrofit. The main barriers to effective failure analysis are that (i) the gathered data (event) logs are generally too large, and (ii) they usually contain noise and redundant data that make precise analysis difficult. This paper therefore applies suitable pre-processing techniques for data reduction and feature extraction, and then converts the reduced data log into a new format of event sequence information. Finally, the event sequence information is decoded to investigate the correlation between specific event patterns and various system faults. The efficiency of the developed pre-filtering procedure is examined with a terminal box data log of a marine diesel engine.

Reliable Gossip Zone for Real-Time Communications in Wireless Sensor Networks

  • Li, Bijun;Kim, Ki-Il
    • Journal of information and communication convergence engineering / v.9 no.2 / pp.244-250 / 2011
  • Gossip is a well-known protocol proposed to implement a highly reliable broadcast service in an arbitrarily connected network of sensor nodes. The probabilistic techniques employed in gossip have been used to address many challenges caused by flooding in wireless sensor networks (WSNs). However, very little work has been done on real-time wireless sensor networks, which require not only highly reliable packet reception but also a strict time constraint on each packet. Moreover, the unique energy constraints of sensors make existing solutions unsuitable. Combined with unreliable links, the overhead of redundant messages in real-time wireless sensor networks is a new challenge. In this paper, we introduce the Reliable Gossip Zone, a novel, finely tailored mechanism for real-time wireless sensor networks that copes with unreliable wireless links and keeps packet redundancy low. The key idea is the proposed forwarding probability algorithm, which makes forwarding decisions after the real-time flooding zone is set. Evaluation shows that, as an oracle broadcast service design, our mechanism achieves significantly less message overhead than traditional flooding and gossip protocols.
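
The trade-off between flooding and gossip described above can be sketched with a generic probabilistic broadcast. This is plain gossip under assumed names (`gossip_broadcast`, `forward_prob`), not the Reliable Gossip Zone forwarding probability algorithm, whose details the abstract does not give. Each node that receives the message rebroadcasts it with a fixed probability; a probability of 1.0 reduces to flooding.

```python
import random
from collections import deque


def gossip_broadcast(adjacency, source, forward_prob, seed=0):
    """Simulate one probabilistic broadcast over an undirected graph.

    adjacency:    dict mapping each node to a list of its neighbors.
    forward_prob: probability that a non-source node rebroadcasts.
    Returns (set of nodes that received the message, number of transmissions).
    """
    rng = random.Random(seed)
    received = {source}
    transmissions = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        # The source always transmits; every other node decides once.
        if node != source and rng.random() >= forward_prob:
            continue
        transmissions += 1
        for neighbor in adjacency[node]:
            if neighbor not in received:
                received.add(neighbor)
                queue.append(neighbor)
    return received, transmissions


if __name__ == "__main__":
    line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(gossip_broadcast(line, 0, 1.0))   # flooding: every node transmits
    print(gossip_broadcast(line, 0, 0.5))   # gossip: fewer transmissions, less coverage
```

Lowering `forward_prob` cuts redundant transmissions at the cost of delivery reliability, which is the tension a zone-based forwarding rule tries to manage.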

Application of Atomic Layer Deposition to Solid Oxide Fuel Cells

  • Kim, Eui-Hyun;Ko, Myeong-Hee;Hwang, Hee-Soo;Hwang, Jin-ha
    • Proceedings of the Korean Vacuum Society Conference / 2014.02a / pp.478.2-478.2 / 2014
  • Atomic layer deposition (ALD) provides self-limiting processes based on chemisorption reactions. These unique features allow superior step coverage, atomic-scale thickness control, and surface-dependent reaction control. Furthermore, the surface-limited deposition enables the artificial deposition of oxide and/or metallic materials onto porous systems, as long as enough time is allowed for supplying the reactant species and for removing the byproducts and redundant reactants. This unique feature of atomic layer deposition is applied to solid oxide fuel cells (SOFCs), which incorporate two porous compartments, the cathode and the anode, in addition to the ionic electrolyte. Specific materials are deposited onto the surface sites of the porous electrodes with the aim of controlling the triple phase boundaries, which are crucial for optimized SOFC performance. The effect of ALD on SOFC performance is characterized using current-voltage characteristics and frequency-dependent impedance spectroscopy. The pros and cons of ALD-controlled SOFCs are discussed with a view toward high-performance SOFC systems.


The Generation of Control Rules for Data Mining (데이터 마이닝을 위한 제어규칙의 생성)

  • Park, In-Kyoo
    • Journal of Digital Convergence / v.11 no.11 / pp.343-349 / 2013
  • Rough set theory derives optimal rules in data mining by effectively selecting features from highly redundant information, using the concepts of equivalence relations and approximation spaces. Attribute reduction is one of the most important parts of rough set applications. This paper defines an information-theoretic measure, based on rough entropy, for determining the most important attribute among associated attributes. The proposed method generates an effective reduct set and formulates the core of the attribute set by eliminating redundant attributes. The control rules are then generated from a subset of features that retains the accuracy of the original features after the reduction.
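
The idea of eliminating redundant attributes while keeping the accuracy of the original attribute set can be sketched as follows. This sketch uses the classical positive-region criterion from rough set theory rather than the rough entropy measure the paper proposes, and the function names are assumptions. An attribute is dropped whenever the remaining attributes still classify the same number of objects unambiguously.

```python
from collections import defaultdict


def positive_region_size(rows, decisions, attrs):
    """Count objects whose equivalence class (on attrs) has a single decision."""
    groups = defaultdict(list)
    for row, decision in zip(rows, decisions):
        key = tuple(row[a] for a in attrs)   # indiscernibility class
        groups[key].append(decision)
    return sum(len(g) for g in groups.values() if len(set(g)) == 1)


def greedy_reduct(rows, decisions):
    """Drop attributes one by one while the positive region is preserved.

    rows:      list of tuples of condition attribute values.
    decisions: list of decision values, one per row.
    Returns the indices of the attributes that were kept.
    """
    attrs = list(range(len(rows[0])))
    full = positive_region_size(rows, decisions, attrs)
    for a in list(attrs):
        trial = [b for b in attrs if b != a]
        if positive_region_size(rows, decisions, trial) == full:
            attrs = trial   # attribute a is redundant given the others
    return attrs
```

The result depends on the order in which attributes are tried, so this returns one reduct rather than all of them; decision rules can then be read off the reduced table.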

Analysis of Expressed Sequence Tags Generated from the Posterior Silkgland cDNA Clones of Antheraea yamamai (천잠 후부 견사선 유래 발현 유전자 꼬리표 작성 및 분석)

  • 윤은영;구태원;강석우;이혜원;황재삼;김호락
    • Journal of Life Science / v.10 no.2 / pp.188-195 / 2000
  • To understand the molecular events during silk synthesis and to provide genetic resources for molecular breeding, we analyzed a cDNA library constructed from the posterior silkgland of Antheraea yamamai and partially sequenced 276 randomly selected genes from the library. Database comparisons of the expressed sequence tags (ESTs) revealed that 26 non-redundant clones showed high similarity to previously identified genes. Among them, 17 clones exhibited homology with previously identified insect genes, and 9 clones were identical to genes previously identified in other organisms. A functional categorization showed that silk synthesis-, defense-, or stress-related genes are represented, as well as genes involved in metabolic pathways and in the transcriptional or translational apparatus. In this report, the clone AY479, which showed high similarity to fibroin from A. pernyi, was analyzed in detail. The AY479 clone corresponds to the carboxyl-terminal region of fibroin. The 472 bp cDNA encodes 123 amino acids that share 85% homology with the fibroin from A. pernyi, and the deduced peptide has a unique feature: sites rich in alanine residues.
