• Title/Summary/Keyword: Term weighting


Automatic Document Classification by Term-Weighting Method (범주 대표어의 가중치 계산 방식에 의한 자동 문서 분류 시스템)

  • 이경찬;강승식
    • Proceedings of the Korean Information Science Society Conference / 2002.04b / pp.475-477 / 2002
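  • Automatic document classification selects the most similar category by comparing the similarity between each category's feature vector and the input document vector. To implement the document classification system, we built the feature vector of each category in the form of an inverted file, as used in information retrieval systems, and measured the accuracy of the system while varying the method used to compute term weights. The test documents were a set of articles sampled at random from daily newspapers, and the TF-IDF scheme commonly used in information retrieval models performed better than the modified schemes.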
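The classification scheme described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' system: the smoothed IDF formula, the cosine similarity measure, the function names, and the toy categories are all assumptions made for the example, since the abstract does not specify them.

```python
import math
from collections import Counter


def build_category_vectors(category_tokens):
    """Build one TF-IDF vector per category from its training tokens."""
    n = len(category_tokens)
    df = Counter()
    for tokens in category_tokens.values():
        df.update(set(tokens))  # count each term once per category
    # Smoothed IDF (an assumption) so that terms found in every category
    # keep a small positive weight instead of dropping to zero.
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = {
        name: {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for name, tokens in category_tokens.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(tokens, vectors, idf):
    """Return the category whose vector is most similar to the document."""
    # Terms never seen in training carry no weight and are skipped.
    doc_vec = {t: tf * idf[t] for t, tf in Counter(tokens).items() if t in idf}
    return max(vectors, key=lambda name: cosine(doc_vec, vectors[name]))


# Hypothetical toy data, not taken from the paper.
categories = {
    "sports": ["goal", "team", "match", "team"],
    "economy": ["market", "stock", "bank", "market"],
}
vectors, idf = build_category_vectors(categories)
label = classify(["team", "goal", "bank"], vectors, idf)
```

Storing each vector as a term-to-weight dictionary loosely mirrors the inverted-file idea: only terms that actually occur are kept, so the similarity computation touches just the terms shared with the input document.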


A Study of Efficiency Information Filtering System using One-Hot Long Short-Term Memory

  • Kim, Hee sook;Lee, Min Hi
    • International Journal of Advanced Culture Technology / v.5 no.1 / pp.83-89 / 2017
  • In this paper, we propose an extended one-hot Long Short-Term Memory (LSTM) method and evaluate its performance on a spam filtering task. Most traditional spam filtering methods represent spam and non-spam messages by word occurrences and ignore all syntactic and semantic information. A major issue arises when spam and non-spam messages share many common words and noise words, which makes it challenging for the system to assign the correct label. Unlike previous studies on information filtering, which use only word occurrence and word context as in probabilistic models, we apply a neural network-based approach to train the filter for better performance. In addition to the one-hot representation, using term weights with an attention mechanism allows the classifier to focus on the words most likely to appear in the spam and non-spam collections. As a result, we obtained some improvement over the performance of previous methods. We find that using region embedding and pooling features on top of the LSTM, along with the attention mechanism, allows the system to learn a better document representation for filtering tasks in general.

Multilingual Story Link Detection based on Properties of Event Terms (사건 어휘의 특성을 반영한 다국어 사건 연결 탐색)

  • Lee Kyung-Soon
    • The KIPS Transactions:PartB / v.12B no.1 s.97 / pp.81-90 / 2005
  • In this paper, we propose a novel approach that models multilingual story link detection by using features such as timelines and multilingual spaces as weighting components, giving distinctive weights to terms related to events. On timelines, term significance is calculated by comparing the term distribution of the documents reported on a given day with that of the total document collection, and is used to represent the document vectors for that day. Since two languages can provide more information than one, term significance is measured in each language space and used as a bridge to refer to the other language space. Evaluated on Korean and Japanese news articles, our method achieved improvements of 14.3% for mono- and multilingual story pairs and 16.7% for multilingual story pairs. Measurements of space density verify the proposed weighting components, showing a high density for intra-event stories and a low density for inter-event stories. This result indicates that the proposed method is helpful for multilingual story link detection.

Design Optimization of Dimple Shape to Enhance Turbulent Heat Transfer (난류열전달 증진을 위한 딤플형상의 최적설계)

  • Choi Ji-Yong;Kim Kwang-Yong
    • Transactions of the Korean Society of Mechanical Engineers B / v.30 no.7 s.250 / pp.700-706 / 2006
  • This study presents a numerical procedure to optimize the shape of a dimpled surface to enhance turbulent heat transfer in a rectangular channel. A response surface based optimization method is used together with Reynolds-averaged Navier-Stokes analysis of fluid flow and heat transfer using the shear stress transport (SST) turbulence model. The dimple depth-to-dimple print diameter ratio, channel height-to-dimple print diameter ratio, and dimple print diameter-to-pitch ratio are chosen as design variables. The objective function is defined as a linear combination of a heat transfer related term and a friction loss related term with a weighting factor. The full factorial method is used as the design of experiments to determine the training points. The optimum shape shows remarkable performance in comparison with a reference shape.

Design Optimization of a Staggered Dimpled Channel Using Neural Network Techniques (신경회로망기법을 사용한 엇갈린 딤플 유로의 최적설계)

  • Shin, Dong-Yoon;Kim, Kwang-Yong
    • The KSFM Journal of Fluid Machinery / v.10 no.3 s.42 / pp.39-46 / 2007
  • This study presents a numerical procedure to optimize the shape of a staggered dimpled surface to enhance turbulent heat transfer in a rectangular channel. The RBNN method is used as the optimization technique together with Reynolds-averaged Navier-Stokes analysis of fluid flow and heat transfer using the shear stress transport (SST) turbulence model. The dimple depth-to-dimple print diameter ratio (d/D), channel height-to-dimple print diameter ratio (H/D), and dimple print diameter-to-pitch ratio (D/S) are chosen as design variables. The objective function is defined as a linear combination of a heat transfer related term and a friction loss related term with a weighting factor. Latin Hypercube Sampling (LHS) is used as the design of experiments to determine the training points. The optimum shape shows remarkable performance in comparison with a reference shape.

An Unified Bayesian Total Variation Regularization Method and Application to Image Restoration (통합 베이즈 총변이 정규화 방법과 영상복원에 대한 응용)

  • Yoo, Jae-Hung
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.1 / pp.41-48 / 2022
  • This paper presents the unified Bayesian Tikhonov regularization method as a solution to total variation regularization. The integrated method gives a formula for the regularization parameter by transforming the total variation term into a weighted Tikhonov regularization term. The procedure iterates, computing the regularization parameter and a new weighting factor based on it, until the reconstructed image converges. The experimental results show the effectiveness of the proposed method for the image restoration problem.

HF-IFF: Applying TF-IDF to Measure Symptom-Medicinal Herb Relevancy and Visualize Medicinal Herb Characteristics - Studying Formulations in Cheongkangeuigam - (HF-IFF: TF-IDF를 응용한 병증-본초 연관성(relevancy) 측정과 본초 특성의 시각화 -청강의감 방제를 대상으로-)

  • Oh, Junho
    • The Korea Journal of Herbology / v.30 no.3 / pp.63-68 / 2015
  • Objectives : We applied the term weighting method used in the field of data search to quantify the relevancy between symptoms and medicinal herbs and, based on this, we aim to introduce a method of visualizing the characteristics of medicinal herbs. Methods : We proposed HF-IFF, an adaptation of TF-IDF, which is a term weighting method used in the field of data search. Using this method, we deduced the relevancy between symptoms and medicinal herbs in Cheongkangeuigam, which was published in 1984 and organizes the medical theory of Cheongkang, Kim Younghoon, and visualized the result as a graph in order to compare the characteristics of the medicinal herbs used for different symptoms. Results : HF-IFF is the product of HF and IFF, where HF is the frequency of the relevant medicinal herb for a set of symptoms, and IFF is the inverse of the number of formulations (FF) containing that herb. A total of 251 types of medicinal herb are used in Cheongkangeuigam, and 1538 formulations are classified according to 67 types of symptom. The overall mean HF-IFF was 0.491, with a maximum of 4.566 and a minimum of 0.013. Conclusions : In spite of several limitations, we were able to use HF-IFF to measure the relevancy between symptoms and medicinal herbs, with formulations as an intermediate. We were able to use the quantified results to express visually, as bubble charts and word clouds, the characteristics of the herbs used for each symptom.
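The HF-IFF definition in this abstract can be illustrated with a short sketch. It assumes that IFF is log-scaled in the same way as the usual IDF, that is, log(N / FF) for N formulations; the abstract only says "the inverse of the number of formulations", so the exact scaling used in the paper may differ. The function name and the sample herbs and symptoms are hypothetical.

```python
import math
from collections import Counter, defaultdict


def hf_iff(formulations):
    """Score herbs per symptom from (symptom, herbs) formulation records.

    HF  = number of formulations for the symptom that contain the herb.
    IFF = log(N / FF), where FF is the number of formulations (over all
          symptoms) containing the herb. The log scaling is an assumption.
    """
    n = len(formulations)
    ff = Counter()              # formulation frequency per herb
    hf = defaultdict(Counter)   # herb frequency per symptom
    for symptom, herbs in formulations:
        unique = set(herbs)     # count a herb once per formulation
        ff.update(unique)
        hf[symptom].update(unique)
    return {
        symptom: {herb: count * math.log(n / ff[herb])
                  for herb, count in counts.items()}
        for symptom, counts in hf.items()
    }


# Hypothetical records, not data from Cheongkangeuigam.
records = [
    ("cough", ["licorice", "ginger"]),
    ("cough", ["licorice", "apricot kernel"]),
    ("headache", ["licorice", "angelica"]),
    ("headache", ["ginger", "angelica"]),
]
scores = hf_iff(records)
```

In this toy data, licorice appears in three of the four formulations, so its score for "cough" ends up below that of apricot kernel, which appears only once but exclusively under that symptom.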

Solutions for the Effective 3D Character Skin Weight by converting Lattice Weight (래티스 웨이트 변환을 통한 효과적인 3D 캐릭터 스킨 웨이트 솔루션 제안)

  • Song, Bal-gum;Lee, Hyun-seok
    • Cartoon and Animation Studies / s.44 / pp.33-56 / 2016
  • With the rapid expansion of the game and film industries, studies on creating natural movement for 3D characters are increasing. Rigging a character with joints is essential to creating realistic movement. With the rapid development of the CG industry, rigging technologies and workflows are becoming more sophisticated. Despite this progress and the growth of rigging work, the repetitive nature of the tasks involved remains a limitation. This study analyzes the issues and inefficiencies of the conventional skin weighting method and proposes a better approach. First, we review the general process of an animation pipeline and the technical terminology of skin weights. Second, we compare the traditional ways of skinning a character and of applying other deformers. Third, we test a new way of weighting a character by applying deformers such as lattices and then converting the lattice weights back to skin weights. Fourth, we verify the effectiveness of the new method by comparing it with the traditional skin weighting process. The study shows that the new method reduces working hours and produces a better final weighting result. We expect this study to improve skin weighting methods and to make this new skinning technique usable in practice.

Evaluation of Optimum Genetic Contribution Theory to Control Inbreeding While Maximizing Genetic Response

  • Oh, S.H.
    • Asian-Australasian Journal of Animal Sciences / v.25 no.3 / pp.299-303 / 2012
  • Inbreeding is the mating of relatives, which produces progeny with more homozygous alleles than non-inbred animals. Inbreeding increases the number of recessive alleles, which is often associated with decreased performance, known as inbreeding depression. The magnitude of inbreeding depression depends on the level of inbreeding in the animal, which is expressed by the inbreeding coefficient. One breeding goal in livestock is uniform productivity while maintaining acceptable inbreeding levels, especially keeping inbreeding below 20%. However, in closed herds without the introduction of new genetic sources, high levels of inbreeding over time are unavoidable. One method that increases selection response and minimizes inbreeding is to select individuals by weighting estimated breeding values (EBV) with the average relationships among individuals. Optimum genetic contribution theory (OGC) uses relationships among individuals as weighting factors. The algorithm is as follows: i) identify the individual having the best EBV; ii) calculate the average relationships ($\bar{r}_j$) between the selected individuals and the candidates; iii) select the individual having the best EBV adjusted for average relationships using the weighting factor $k$, $EBV^{*} = EBV_j(1 - k\bar{r}_j)$. The process is repeated until the number of individuals selected equals the number required. The objective of this study was to compare simulated results based on OGC selection under different conditions over 30 generations. Individuals (n = 110) were generated for the base population with pseudo-random numbers distributed as N(0, 3); ten were assumed male, and the remainder female. Each male was mated to ten females, and every female was assumed to have 5 progeny, resulting in 500 individuals in the following generation. Results showed that the OGC algorithm effectively controlled inbreeding and maintained consistent increases in selection response. The difference in breeding values between selection with the OGC algorithm and selection by EBV only was 8%; however, the rate of inbreeding was controlled by 47% after 20 generations. These results indicate that the OGC algorithm can be used effectively in long-term selection programs.
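The three-step selection loop in this abstract can be sketched roughly as below. It is an illustration under stated assumptions, not the paper's simulation code: "best" is taken to mean the highest EBV, the average relationship of a candidate is taken as the mean of its relationships to the individuals already selected, EBVs are assumed positive so that the adjustment acts as a penalty, and the function name and sample values are hypothetical.

```python
def select_ogc(ebv, rel, n_select, k):
    """Greedy selection with EBVs penalized by average relationship.

    ebv      : list of estimated breeding values, one per candidate
    rel      : symmetric matrix, rel[i][j] = relationship between i and j
    n_select : number of individuals required
    k        : weighting factor applied to the average relationship
    Returns the indices of the selected individuals in selection order.
    """
    candidates = set(range(len(ebv)))
    # Step i: start with the individual having the best (highest) EBV.
    first = max(candidates, key=lambda j: ebv[j])
    selected = [first]
    candidates.remove(first)

    while len(selected) < n_select and candidates:
        def adjusted(j):
            # Step ii: average relationship of candidate j to the selected set.
            r_bar = sum(rel[j][s] for s in selected) / len(selected)
            # Step iii: EBV* = EBV_j * (1 - k * r_bar_j)
            return ebv[j] * (1.0 - k * r_bar)

        best = max(candidates, key=adjusted)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical example: individual 1 is closely related to individual 0.
ebv = [10.0, 9.5, 9.0, 8.0]
rel = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
by_ebv_only = select_ogc(ebv, rel, n_select=2, k=0.0)
with_penalty = select_ogc(ebv, rel, n_select=2, k=0.5)
```

With `k = 0` the loop reduces to plain truncation selection on EBV, while a positive `k` passes over the close relative in favour of an unrelated candidate with a slightly lower EBV.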

Cognitive Virtual Network Embedding Algorithm Based on Weighted Relative Entropy

  • Su, Yuze;Meng, Xiangru;Zhao, Zhiyuan;Li, Zhentao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.1845-1865 / 2019
  • The current Internet is run by many service providers with different objectives and policies, which makes the direct deployment of radically new architectures and protocols nearly impossible without consensus among almost all of them. Network virtualization has been proposed to fend off this ossification of the Internet architecture and add diversity to the future Internet. As an important part of network virtualization, the virtual network embedding (VNE) problem has received increasing attention. To address the large embedding cost, low acceptance ratio (AR), and poor environmental adaptability of VNE algorithms, this paper introduces a cognitive method to improve adaptability to a changing environment and proposes a cognitive virtual network embedding algorithm based on weighted relative entropy (WRE-CVNE). First, the weighted relative entropy (WRE) method is proposed to select suitable substrate nodes and paths in VNE. In the WRE method, ranking indicators and their weighting coefficients are selected to calculate node importance and path importance; this is the basis of WRE-CVNE. In the virtual node embedding stage, both the WRE method and the breadth-first search (BFS) algorithm are used, and node proximity is introduced into substrate node ranking to achieve joint topology awareness. Finally, in the virtual link embedding stage, the CPU resource balance degree, bandwidth resource balance degree, and path hop count are taken into account. Path importance is calculated with the WRE method, and a suitable substrate path is selected to reduce resource fragmentation. Simulation results show that the proposed algorithm can significantly improve AR and the long-term average revenue to cost ratio (LTAR/CR) by adjusting the weighting coefficients in the VNE stage according to the network environment. We also analyze the impact of the weighting coefficients on the performance of WRE-CVNE. In addition, the adaptability of WRE-CVNE is studied in three different scenarios, and its effectiveness and efficiency are demonstrated.