• Title/Summary/Keyword: 유사 소프트웨어 필터링 (similar software filtering)


Taboo Word Matching System Using a Common Multilingual Phoneme System (다국어 공통 음소 체계를 이용한 금기어 매칭 시스템)

  • Kim, Da-Hee;Shin, Sa-Im;Jang, Dal-Won;Lee, Jong-Seol;Jang, Sei-Jin
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2015.07a / pp.155-158 / 2015
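  • Word similarity measurement algorithms are used in a variety of fields, including DB indexing, filtering, source-code analysis software, and speech recognition. However, existing systems that compare only the spelling of words cannot measure similar words that sound alike or that contain typos. Measuring similarity between languages should consider not only the alphabet but also the phonetic characteristics of pronunciation. In this paper, we propose a system that measures word similarity against each country's taboo words while also taking phonetic characteristics into account, which can help multinational companies export products and culture to the global market. A total of 21,487 taboo words in 4 languages from 11 countries were used as the taboo-word data. To evaluate the proposed method, we compared its performance with other algorithms and conducted a user evaluation with users of various languages from several countries, and confirmed that the proposed method outperforms algorithms that do not measure pronunciation similarity.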
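
The idea in the abstract above, comparing words by sound rather than by spelling alone, can be illustrated with a minimal sketch. The grapheme tables `DIGRAPHS` and `SINGLES` are hypothetical toy mappings invented for this example; the paper's actual multilingual phoneme system is not given in the abstract, and edit distance is an assumed choice of comparison.

```python
# Hypothetical toy mappings from spellings to phoneme classes.
# These are illustration-only assumptions, not the paper's phoneme system.
DIGRAPHS = {"ph": "f", "ck": "k"}
SINGLES = {"c": "k", "q": "k", "z": "s"}


def to_phoneme_key(word):
    """Rewrite a word into a rough phoneme-class key."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            out.append(SINGLES.get(word[i], word[i]))
            i += 1
    return "".join(out)


def levenshtein(a, b):
    """Classic edit distance computed with two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(a, b, phonetic=True):
    """Similarity in [0, 1]; optionally compare phoneme keys instead of spelling."""
    if phonetic:
        a, b = to_phoneme_key(a), to_phoneme_key(b)
    else:
        a, b = a.lower(), b.lower()
    longest = max(len(a), len(b)) or 1
    return 1 - levenshtein(a, b) / longest


print(similarity("phone", "fone"))                  # phonetic comparison
print(similarity("phone", "fone", phonetic=False))  # spelling-only comparison
```

With these toy tables, "phone" and "fone" map to the same key, so the phonetic score is higher than the spelling-only score.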


Probabilistic Reinterpretation of Collaborative Filtering Approaches Considering Cluster Information of Item Contents (항목 내용물의 클러스터 정보를 고려한 협력필터링 방법의 확률적 재해석)

  • Kim, Byeong-Man;Li, Qing;Oh, Sang-Yeop
    • Journal of KIISE:Software and Applications / v.32 no.9 / pp.901-911 / 2005
  • With the development of e-commerce and the proliferation of easily accessible information, information filtering has become a popular technique for pruning large information spaces so that users are directed toward the items that best meet their needs and preferences. While many collaborative filtering systems have succeeded in capturing similarities among users or items based on ratings to provide good recommendations, they still face challenges, especially the user bias problem, the non-transitive association problem, and the cold start problem. These three problems prevent us from capturing more accurate similarities among users or items. In this paper, we provide probabilistic model approaches for UCHM and ICHM, which were suggested to solve these problems, in hopes of achieving better performance. In this probabilistic model, objects (users or items) are classified into groups and predictions are made for users considering the Gaussian distribution of user ratings. Experiments on a real-world data set illustrate that our proposed approach is comparable with others.

Content Recommendation Techniques for Personalized Software Education (개인화된 소프트웨어 교육을 위한 콘텐츠 추천 기법)

  • Kim, Wan-Seop
    • Journal of Digital Convergence / v.17 no.8 / pp.95-104 / 2019
  • Recently, software education has been emphasized as a key element of the fourth industrial revolution, and many universities are strengthening software education for all students accordingly. Online content is an effective way to introduce SW education to all students. However, providing uniform online content is limited in that it does not consider students' individual characteristics (major, SW interest, comprehension, interests, etc.). In this study, we propose a recommendation method that utilizes the directional similarity between contents in a Boolean view-history data environment. We propose a new item-based recommendation formula that uses the confidence value of association rule analysis as the similarity level and apply it to data from a domestic paid-content site. Experimental results show that recommendation accuracy is higher than with traditional collaborative recommendation using cosine or Jaccard similarity.
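
As a minimal sketch of the directional similarity mentioned in the abstract above, the confidence of the association rule A → B can be computed from Boolean view histories as count(A and B) / count(A). The abstract does not give the paper's recommendation formula, so the scoring in `recommend` (summing confidences from already-viewed items) is an assumption, as are all function names and the sample data.

```python
from collections import defaultdict
from itertools import permutations


def confidence_similarity(histories):
    """Return {(a, b): confidence(a -> b)} from user -> set-of-viewed-items data."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in histories.values():
        for a in items:
            item_count[a] += 1
        # Ordered pairs, because confidence(a -> b) != confidence(b -> a) in general.
        for a, b in permutations(items, 2):
            pair_count[(a, b)] += 1
    return {(a, b): n / item_count[a] for (a, b), n in pair_count.items()}


def recommend(histories, user, top_n=3):
    """Rank unseen items by summed confidence from the user's viewed items (assumed scoring)."""
    sim = confidence_similarity(histories)
    seen = histories[user]
    scores = defaultdict(float)
    for (a, b), conf in sim.items():
        if a in seen and b not in seen:
            scores[b] += conf
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical view histories: each user maps to the set of contents viewed.
histories = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A"},
    "u4": {"B", "C"},
}
sim = confidence_similarity(histories)
print(sim[("C", "B")], sim[("B", "C")])  # the two directions differ
print(recommend(histories, "u3"))
```

Unlike cosine or Jaccard similarity, this measure is asymmetric: here every viewer of C also viewed B, but only two of the three viewers of B viewed C.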

SMS Text Messages Filtering using Word Embedding and Deep Learning Techniques (워드 임베딩과 딥러닝 기법을 이용한 SMS 문자 메시지 필터링)

  • Lee, Hyun Young;Kang, Seung Shik
    • Smart Media Journal / v.7 no.4 / pp.24-29 / 2018
  • Text analysis techniques for natural language processing in deep learning represent words in vector form through word embedding. In this paper, we propose a method of constructing a document vector and classifying it as a spam or normal text message, using word embedding and deep learning. Automatic spacing applied in the preprocessing step ensures that words with similar context are represented adjacently in vector space. Additionally, intentional word-formation errors using non-alphabetic or unusual characters are designed to evade spam message filters. Two embedding algorithms, CBOW and skip-gram, are used to produce the sentence vector, and the performance and accuracy of the deep-learning-based spam filter model are measured by comparison with those of SVM Light.

An Item-based Collaborative Filtering Technique by Associative Relation Clustering in Personalized Recommender Systems (개인화 추천 시스템에서 연관 관계 군집에 의한 아이템 기반의 협력적 필터링 기술)

  • 정경용;김진현;정헌만;이정현
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.467-477 / 2004
  • While recommender systems were once used by only a few E-commerce sites, they are now becoming serious business tools that are reshaping the world of E-commerce, and collaborative filtering has been a very successful recommendation technique in both research and practice. However, personalized recommender systems have two problems: the First-Rating problem and the Sparsity problem. In this paper, we solve these problems using associative relation clustering and the “Lift” of association rules. We compute the “Lift” between items from users' rating data and apply a threshold by -cut to the associations between items. To make the associative relation clusters more efficient, we use not only the existing Hypergraph Clique Clustering algorithm but also the suggested Split Cluster method. Once the clusters are complete, we calculate item similarity within each cluster and save the resulting index in the database for fast access. We apply this index to predict preferences for new items. To evaluate performance, the suggested method is compared with existing collaborative filtering techniques. The results show that the proposed method improves prediction accuracy by solving the problems of existing collaborative filtering techniques.
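
The “Lift” of an association rule between two items is the standard ratio P(A, B) / (P(A) P(B)); values above 1 indicate that the items co-occur more often than independence would predict. The sketch below computes it from per-user item sets and then keeps only pairs above a threshold. How the paper derives item sets from ratings and what threshold it uses are not stated in the abstract, so the input format, the function names, and the threshold of 1.0 are assumptions.

```python
from collections import Counter
from itertools import combinations


def item_lift(transactions):
    """Return {(a, b): lift} for every co-occurring item pair (a < b)."""
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for items in transactions:
        items = set(items)
        item_count.update(items)
        pair_count.update(combinations(sorted(items), 2))
    # lift = P(a, b) / (P(a) * P(b)) = n * count(a, b) / (count(a) * count(b))
    return {
        (a, b): n * c / (item_count[a] * item_count[b])
        for (a, b), c in pair_count.items()
    }


def associated_pairs(lift, threshold=1.0):
    """Keep only the pairs whose lift exceeds the (assumed) threshold."""
    return {pair for pair, value in lift.items() if value > threshold}


# Hypothetical data: the set of items each user rated.
transactions = [{"A", "B"}, {"A", "B"}, {"C"}, {"A", "C"}]
lift = item_lift(transactions)
print(lift)
print(associated_pairs(lift))
```

The surviving pairs could then serve as edges for a clustering step such as the one the abstract describes; that step is not sketched here.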

Course recommendation system using deep learning (딥러닝을 이용한 강좌 추천시스템)

  • Min-Ah Lim;Seung-Yeon Hwang;Dong-Jin Shin;Jae-Kon Oh;Jeong-Joon Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.3 / pp.193-198 / 2023
  • We study a learner-customized lecture recommendation project using deep learning. Recommendation systems are easy to find on the web and in apps; examples include recommending videos based on what users click and advertising items in users' areas of interest on SNS. In this study, Word2Vec sentence similarity is used to filter twice, and courses are then recommended through the Surprise library. The system conveniently provides users with the course data classification they want. Surprise is a Python scikit library that is convenient for building recommendation systems. By analyzing the data, the system is implemented to run at high speed, and deep learning is used to produce more precise results through the course steps. When a user enters a keyword of interest, the similarity between the keyword and the course titles is computed, followed by the similarity with the extracted video data and voice text, and the highest-ranking video data is recommended through the Surprise library.

Design and Implementation of Margin Push Multi-agent System using Margin Generation Algorithm (마진 생성 알고리즘을 이용한 마진 푸쉬 멀티 에이전트 시스템 설계 및 구현)

  • Kim, Jung-Jae;Hu, Jae-Hyung;Lee, Jong-Hee;Oh, Hae-Seok
    • Proceedings of the Korea Information Processing Society Conference / 2001.04a / pp.465-468 / 2001
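  • Research and development of auction systems that use intelligent software agents to make today's underused e-commerce auction systems more efficient and effective for users has become a major issue. Accordingly, the goal of this paper is to study an agent system that introduces an artificial-intelligence agent into a simple bulletin-board-style Internet auction system and calculates and predicts the appropriate auction timing and starting price for a given auction item, so that the seller can obtain the maximum margin. When a seller who wants to list an item sends information about it by e-mail to the agent of the Internet auction system, the agent filters for items similar to that item and uses the already-learned information about similar items, namely the bid history, auction time, auction method, and winning-bid price of auction items stored in the database, to compute when and at what starting price the seller should open the auction to obtain the maximum margin. We design and implement a system that pushes this information to the seller by e-mail.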


A Dynamic Recommendation System Using User Log Analysis and Document Similarity in Clusters (사용자 로그 분석과 클러스터 내의 문서 유사도를 이용한 동적 추천 시스템)

  • 김진수;김태용;최준혁;임기욱;이정현
    • Journal of KIISE:Software and Applications / v.31 no.5 / pp.586-594 / 2004
  • Because web documents are created and disappear rapidly, users need a recommendation system that lets them browse web documents conveniently and accurately. One largely untapped source of knowledge about large data collections is contained in the cumulative experiences of individuals finding useful information in the collection. Recommendation systems attempt to extract such useful information by capturing and mining one or more measures of the usefulness of the data. Existing Information Filtering systems have the shortcoming that they require a user profile. Collaborative Filtering systems have the shortcoming that users must first rate each web document, and in high-quantity, low-quality environments users may cover only a tiny percentage of the available documents. Dynamic recommendation systems that use the user's browsing pattern also provide users with unrelated web documents. This paper classifies web documents using the similarity between documents of the same web document type and extracts a DB of users' sequential browsing patterns from session information in the web server log file. When a user accesses a web document, the proposed dynamic recommendation system recommends the Top-N set of associated web documents that are highly similar to the current document, together with a set that has sequential specificity, using the extracted information and the users' session information.

Distributed Processing System Design and Implementation for Feature Extraction from Large-Scale Malicious Code (대용량 악성코드의 특징 추출 가속화를 위한 분산 처리 시스템 설계 및 구현)

  • Lee, Hyunjong;Euh, Seongyul;Hwang, Doosung
    • KIPS Transactions on Computer and Communication Systems / v.8 no.2 / pp.35-40 / 2019
  • Traditional malware detection struggles to detect malware that has been modified by polymorphism or obfuscation techniques. By learning patterns embedded in malware code, machine learning algorithms can detect similar behaviors and replace current detection methods. Data must be collected continuously in order to learn malicious code patterns that change over time. However, storing and processing a large number of malware files incurs high space and time complexity. In this paper, an HDFS-based distributed processing system is designed to reduce space complexity and accelerate feature extraction. Using the distributed processing system, we extract two filtering-based API features, the 2-gram feature and the APICFG feature, and compare the generalization performance of ensemble learning models. In experiments, feature extraction was about 3.75 times faster than on a single computer, and space usage was about 5 times more efficient. The 2-gram feature gave the best classification performance among the features, but its learning time was long due to high dimensionality.
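
To make the 2-gram feature in the abstract above concrete, here is a minimal sketch that counts adjacent pairs in a sequence of API calls and turns the counts into a fixed-length vector. The abstract does not describe how the paper builds its 2-grams or applies filtering, so treating a 2-gram as a pair of consecutive API calls is an assumption, and the call trace is hypothetical.

```python
from collections import Counter


def api_2grams(calls):
    """Count each adjacent pair of API calls in the sequence."""
    return Counter(zip(calls, calls[1:]))


def to_vector(counts, vocabulary):
    """Project 2-gram counts onto a fixed, ordered vocabulary of 2-grams."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical API call trace extracted from one sample.
calls = ["CreateFileW", "WriteFile", "CloseHandle", "CreateFileW", "WriteFile"]
counts = api_2grams(calls)

# In practice the vocabulary would be built over the whole corpus.
vocabulary = sorted(counts)
print(vocabulary)
print(to_vector(counts, vocabulary))
```

With V distinct API names there are up to V² possible 2-grams, which is consistent with the high dimensionality the abstract reports for this feature.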

A Robust Pattern Watermarking Method by Invisibility and Similarity Improvement (비가시성과 유사도 증가를 통한 강인한 패턴 워터마킹 방법)

  • 이경훈;김용훈;이태홍
    • Journal of KIISE:Software and Applications / v.30 no.10 / pp.938-943 / 2003
  • In this paper, we propose a method using the Tikhonov-Miller process to improve the robustness of watermarking under various attacks. A visually recognizable pattern watermark is embedded in the LH2, HL2, and HH2 subbands of the wavelet-transformed domain using a threshold, and the watermark is additionally embedded by utilizing HVS (Human Visual System) features. The pattern watermark is interlaced after random permutation for security and a better extraction rate. To demonstrate the improved robustness and similarity of the proposed method, we applied basic image processing operations such as scaling, filtering, cropping, histogram equalization, and lossy compression (JPEG, GIF). In the experiments, the proposed method embedded a robust watermark invisibly and extracted it with an excellent normalized correlation under various attacks.
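
The normalized correlation mentioned in the abstract above measures how closely an extracted watermark matches the embedded one. Several definitions are in use and the abstract does not say which one the paper adopts; the sketch below assumes the cosine-style form, sum(w·w') / sqrt(sum(w²)·sum(w'²)), on a hypothetical bipolar (+1/-1) watermark.

```python
import math


def normalized_correlation(original, extracted):
    """Cosine-style normalized correlation between two equal-length sequences."""
    if len(original) != len(extracted):
        raise ValueError("watermarks must have the same length")
    dot = sum(a * b for a, b in zip(original, extracted))
    norm = math.sqrt(sum(a * a for a in original) * sum(b * b for b in extracted))
    return dot / norm if norm else 0.0


# Hypothetical bipolar watermark and an extraction with one flipped element.
original = [1, -1, 1, 1]
damaged = [1, -1, -1, 1]
print(normalized_correlation(original, original))
print(normalized_correlation(original, damaged))
```

A value of 1 means the extracted pattern matches the original exactly, and each corrupted element lowers the score.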