• Title/Summary/Keyword: Paper Similarity Test

Improving Hypertext Classification Systems through WordNet-based Feature Abstraction (워드넷 기반 특징 추상화를 통한 웹문서 자동분류시스템의 성능향상)

  • Roh, Jun-Ho;Kim, Han-Joon;Chang, Jae-Young
    • The Journal of Society for e-Business Studies / v.18 no.2 / pp.95-110 / 2013
  • This paper presents a novel feature engineering technique that improves conventional machine learning-based text classification systems. To categorize hypertext web documents effectively, the proposed method extends the initial feature set by using hyperlink relationships. Web documents are connected to each other through hyperlinks, and in many cases hyperlinks link highly related documents. Such hyperlink relationships can be used to enhance the quality of the features that make up classification models. The basic idea of the proposed method is to generate a kind of abstracted concept feature that consists of a few raw feature words; to do so, the method computes the semantic similarity between a target document and its neighbor documents by utilizing hierarchical relationships in the WordNet ontology. When classification models are built, the abstracted concept features are treated the same as other raw features, and they can contribute substantially to more accurate models. Through extensive experiments with the Web-KB test collection, we show that the proposed methods outperform conventional ones.
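
The abstract does not say which WordNet similarity measure is used. As a minimal sketch, assume a Wu-Palmer-style score over a tiny hand-made is-a hierarchy that stands in for WordNet. The hierarchy, the word list, and the function names `wu_palmer` and `common_concept` are all hypothetical and are not taken from the paper.

```python
# Toy is-a hierarchy standing in for WordNet (hypothetical data, child -> parent).
PARENT = {
    "entity": None,
    "person": "entity",
    "student": "person",
    "professor": "person",
    "artifact": "entity",
    "course": "artifact",
}


def path_to_root(word):
    """Return the chain [word, parent, ..., root]."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path


def depth(word):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(word))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b))
    # The first ancestor of `a` that is also an ancestor of `b` is the
    # lowest common subsumer (lcs) in a tree-shaped hierarchy.
    lcs = next(node for node in path_to_root(a) if node in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def common_concept(words):
    """Deepest node that is an ancestor of (or equal to) every word.

    Assumption: one plausible way to replace a few raw feature words
    with a single, more abstract concept feature.
    """
    common = set(path_to_root(words[0]))
    for word in words[1:]:
        common &= set(path_to_root(word))
    return max(common, key=depth)


if __name__ == "__main__":
    print(wu_palmer("student", "professor"))         # share the parent "person"
    print(wu_palmer("student", "course"))            # share only the root
    print(common_concept(["student", "professor"]))
```

In this toy hierarchy, "student" and "professor" score higher than "student" and "course" because their lowest common subsumer lies deeper in the tree. With a real WordNet interface, the hand-made dictionary would be replaced by the library's hypernym lookups.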

Isolation of Aeromonas sobria Containing Hemolysin Gene from Arowana (Scleropages formosus) (Arowana(Scleropages formosus)에서 Hemolysin Gene을 지닌 Aeromonas sobria 분리 및 특성)

  • Jun, Jin-Woo;Kim, Ji-Hyung;Casiano, Choresca Jr.;Dennis, K. Gomez;Shin, Sang-Phil;Han, Jee-Eun;Park, Se-Chang
    • Journal of Veterinary Clinics / v.27 no.1 / pp.62-65 / 2010
  • Arowana (Scleropages formosus) is one of the most valuable groups of ornamental fishes; it is in high demand in the ornamental fish trade and commands prices ranging from hundreds to thousands of dollars per fish. In this paper, we describe a case of arowana mortality in a private aquarium in Korea. A bacterial pathogen from fish organs (brain, kidney, liver) was cultured, identified, and confirmed using Vitek System 2, the API 20E test, multiplex PCR, and 16S rRNA gene sequencing. The morphological and biochemical properties of the bacterium isolated from the brain, kidney, and liver of the fish were similar to those of Aeromonas sobria. The multiplex PCR assay for detection of A. sobria yielded positive amplification products from these organs. The 16S rRNA gene sequences of the fish isolates were identical and showed 100% similarity to the A. sobria strain (AY987762.1) available in GenBank. This bacterium contained the hemolysin gene, a virulence factor that plays an important role in disease outbreaks and is pathogenic to humans as well as fish. Although this opportunistic bacterium was isolated from a fish without any external symptoms, such a carrier fish may act as a reservoir for the pathogen and increase the chance of zoonotic transmission to humans, for example during handling.

Sleep/Wake Dynamic Classifier based on Wearable Accelerometer Device Measurement (웨어러블 가속도 기기 측정에 의한 수면/비수면 동적 분류)

  • Park, Jaihyun;Kim, Daehun;Ku, Bonhwa;Ko, Hanseok
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.6 / pp.126-134 / 2015
  • Sleep disorders are increasingly recognized as a major health issue related to high levels of stress. At the same time, interest in sleep quality is rapidly increasing. However, diagnosing a sleep disorder is not simple, because patients must undergo a polysomnography test, which is time-consuming and costly. To address this problem, a wrist-worn device with an embedded accelerometer is being considered as a simple, low-cost solution. Conventional methods, however, classify the user's state as "sleep" or "wake" according to whether the accelerometer values in each individual section exceed a certain threshold. As a result, a high misclassification rate is observed because of the user's intermittent movements while sleeping and tiny movements while awake. In this paper, we propose a novel method that resolves these problems by employing a dynamic classifier that evaluates the similarity between neighboring data scores obtained from an SVM classifier. The performance of the proposed method is evaluated using 50 data sets, and its superiority is verified by achieving 88.9% accuracy, 88.9% sensitivity, and 88.5% specificity.
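
The contrast between per-section thresholding and a classifier that looks at neighboring scores can be illustrated with a short sketch. This is not the authors' dynamic classifier, whose details the abstract does not give. It assumes each section already has a signed score (for example an SVM decision value, positive meaning "wake") and swaps in a plain moving average over neighboring sections. The function names, the window size, and the sample scores are hypothetical.

```python
def threshold_labels(scores, threshold=0.0):
    """Baseline: label each section independently from its own score."""
    return ["wake" if s > threshold else "sleep" for s in scores]


def neighbor_smoothed_labels(scores, half_window=1, threshold=0.0):
    """Label each section from the mean score of its neighborhood.

    Assumption: a simple moving average stands in for the paper's
    similarity evaluation between neighboring scores.
    """
    labels = []
    for i in range(len(scores)):
        lo = max(0, i - half_window)
        hi = min(len(scores), i + half_window + 1)
        window = scores[lo:hi]
        mean = sum(window) / len(window)
        labels.append("wake" if mean > threshold else "sleep")
    return labels


if __name__ == "__main__":
    # Hypothetical scores: one brief movement (0.3) during sleep.
    scores = [-1.0, -0.8, 0.3, -0.9, -1.1]
    print(threshold_labels(scores))
    print(neighbor_smoothed_labels(scores))
```

With these sample scores the baseline marks the isolated movement as "wake", while the neighborhood average keeps the whole run labeled "sleep". This is the kind of intermittent-movement error the abstract describes.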

Development of Block-based Code Generation and Recommendation Model Using Natural Language Processing Model (자연어 처리 모델을 활용한 블록 코드 생성 및 추천 모델 개발)

  • Jeon, In-seong;Song, Ki-Sang
    • Journal of The Korean Association of Information Education / v.26 no.3 / pp.197-207 / 2022
  • In this paper, we develop a machine learning-based block code generation and recommendation model to reduce the cognitive load of learners during coding education. Using a natural language processing model and fine-tuning, the model learns the blocks that a learner has assembled in a block programming environment and then generates and recommends selectable blocks for the next step. To develop the model, the training dataset was produced by pre-processing 50 block codes from the popular block programming language website 'Entry'. After dividing the pre-processed blocks into training, validation, and test datasets, we developed models that generate block codes based on LSTM, Seq2Seq, and GPT-2. In the performance evaluation, GPT-2 outperformed the LSTM and Seq2Seq models on the BLEU and ROUGE scores, which measure sentence similarity. The results generated by the GPT-2 model show that BLEU and ROUGE performance was relatively similar except when the number of blocks was 1 or 17.
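
BLEU builds on clipped n-gram precision between a generated sequence and a reference. The sketch below computes only that core quantity for block sequences. It omits the brevity penalty and the geometric mean over n-gram orders that full BLEU uses, so it is a simplification rather than the paper's evaluation code. The block names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference.

    Each candidate n-gram is credited at most as many times as it
    appears in the reference (the clipping used in BLEU).
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        # The candidate is shorter than n tokens, so it has no n-grams.
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


if __name__ == "__main__":
    # Hypothetical block sequences.
    reference = ["when_clicked", "repeat", "move", "turn"]
    candidate = ["when_clicked", "repeat", "move", "say"]
    print(clipped_precision(candidate, reference, 1))
    print(clipped_precision(candidate, reference, 2))
```

In this sketch a one-block candidate has no bigrams, so its higher-order precision falls back to 0.0. Very short sequences therefore score differently from longer ones under n-gram metrics.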

Durability Evaluation of Cement Concrete Using Ferrosilicon Industrial Byproduct (페로실리콘 산업부산물 활용 시멘트 콘크리트의 내구성능 평가)

  • Chang-Young Kim;Ki Yong Ann
    • Journal of the Korean Recycled Construction Resources Institute / v.11 no.1 / pp.89-96 / 2023
  • In this paper, a ferrosilicon by-product was evaluated to confirm the feasibility of recycling it as a supplementary cementitious material for ordinary Portland cement (OPC) in concrete. Three replacement ratios (10 %, 20 %, and 30 % of total binder) were applied to find which is most beneficial as a binder. Ferrosilicon concrete was first assessed for setting time and compressive strength. Durability was evaluated by the resistance to chloride penetration test (RCPT) and the alkali-silica reaction (ASR), in comparison with silica fume concrete because of their similar chemical composition. Porosimetry and X-ray diffraction analysis, along with energy dispersive X-ray spectroscopy, provided information on the microstructural characteristics of the ferrosilicon concrete. The 10 % ferrosilicon concrete had higher strength than OPC concrete, whereas the 20 % and 30 % mixes had lower strength. However, resistance to chloride attack increased with the replacement ratio. Compared with silica fume, ferrosilicon may be less effective for durability, but it is clearly more beneficial than OPC. The high SiO2 content of ferrosilicon produces more C-S-H gel, which could make the pore structure denser. In length-change tests assessing the risk of alkali-silica reaction in silicate binders, most results were below 0.2 %, and mortars using ferrosilicon and silica fume both showed better resistance to the alkali-silica reaction as the substitution rate increased. Reusing industrial waste rather than producing highly refined additives might reduce the environmental load of manufacture and save costs.

A Study on Music Summarization (음악요약 생성에 관한 연구)

  • Kim Sung-Tak;Kim Sang-Ho;Kim Hoi-Rin;Choi Ji-Hoon;Lee Han-Kyu;Hong Jin-Woo
    • Journal of Broadcast Engineering / v.11 no.1 s.30 / pp.3-14 / 2006
  • Music summarization is a technique that automatically generates the most important and representative part or parts of music content. Music summarization techniques have been studied in two categories according to summary characteristics. The first provides a repeated part as the music summary, and the second provides combined segments, made up of segments with different characteristics in the music content, as the music summary. In this paper, we propose and evaluate two kinds of music summarization techniques. The algorithm using multi-level vector quantization, which provides a repeated part as the music summary, produces a fixed-length summary that is evaluated by the overlapping ratio between hand-made repeated parts and the automatically generated summary. The overlapping ratios of the conventional methods are 42.2% and 47.4%, whereas that of the proposed method with a fixed-length summary is 67.1%. The optimal-length music summary is evaluated by the portion of overlap between the summary and the repeated part, whose length differs according to the music content, and the result shows that the automatically generated optimal-length summary captures a more effective part than the fixed-length summary. The cluster-based algorithm, using a 2-D similarity matrix and the k-means algorithm, provides combined segments as the music summary. To evaluate this algorithm, we use an MOS test consisting of two questions (How many similar segments are in the summarized music? How many segments are included in the same structure?), and the results show good performance.
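
The abstract reports overlapping ratios but does not define them. As a minimal sketch, assume the ratio is the fraction of the generated summary's duration that falls inside a hand-labeled repeated part, with both given as (start, end) intervals in seconds. The function name `overlap_ratio` and the sample intervals are hypothetical.

```python
def overlap_ratio(summary, reference):
    """Fraction of the summary interval that lies inside the reference.

    Both arguments are (start, end) tuples in seconds. Assumption: the
    ratio is normalized by the summary length; the paper may normalize
    differently.
    """
    s_start, s_end = summary
    r_start, r_end = reference
    length = s_end - s_start
    if length <= 0:
        raise ValueError("summary interval must have positive length")
    overlap = max(0.0, min(s_end, r_end) - max(s_start, r_start))
    return overlap / length


if __name__ == "__main__":
    # Hypothetical 20-second summary against a labeled chorus at 40-70 s.
    print(overlap_ratio((30.0, 50.0), (40.0, 70.0)))
```

Here half of the 20-second summary falls inside the labeled part. Under this definition, a summary that lies entirely within the repeated part scores 1.0 and a disjoint one scores 0.0.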