• Title/Summary/Keyword: keyword filtering (키워드 필터링)

Search Result 89, Processing Time 0.028 seconds

Associate Object Extraction Using Personalized User Learning (개인화된 사용자 학습을 위한 연관 객체 추출 설계 및 구현)

  • 유수경;김교정
    • Proceedings of the Korea Multimedia Society Conference / 2004.05a / pp.636-639 / 2004
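  • This paper proposes PMPL (Personalized Multi-Strategy Pattern Learning), a system for extracting associated objects from web documents in order to find information that is meaningful to the user. The PMPL module filters information from the Internet and extracts associated objects centered on the user's personalized keywords. For the association extraction, we applied the FP-Tree and FP-Growth algorithms, association search techniques that are efficient in both time and space on large data sets, and we applied a weighting technique based on the law of universal gravitation to complement the association rule search. Running the PMPL system showed that, starting from the personalized user keywords, it extracted more meaningful associated knowledge than existing single-strategy learning techniques.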


Document Content Similarity Detection Algorithm Using Word Cooccurrence Statistical Information Based Keyword Extraction (단어 공기 통계 정보 기반 색인어 추출을 활용한 문서 유사도 검사 알고리즘)

  • Kim, Jinkyu;Yi, Seungchul;Park, Kibong;Haing, Huhduck
    • Proceedings of the Korean Society of Computer Information Conference / 2016.01a / pp.111-113 / 2016
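  • Plagiarism checking of the publications and papers that are now appearing at a rapid pace is carried out either by plagiarism detection algorithms that check for direct copying, patchwork writing, and paraphrasing, or by a reviewer who manually searches for the keywords of the document in question. However, checking the ever-growing volume of documents calls for a more sophisticated review methodology. To support this, we propose a method that goes beyond direct word or copy comparison and compares the content of documents in order to filter and detect documents with similar content. To compare document content, a keyword extraction algorithm is run first; this provides a basis for comparing the core content of documents and is intended to improve the accuracy and speed of the plagiarism reviewer's work.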


Opinion Mining based Broadcasting-Consumption Impact Modeling (오피니언 마이닝기반 방송-소비 영향 모델링)

  • Kim, Jinah;Shin, Yoonmi;Moon, Nammee
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.592-595 / 2018
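  • Predicting consumer behavior requires reflecting not only past consumption behavior but also the influence of broadcast media, one of the external environmental factors, and the analysis must suit the 'snack culture' era. In this paper, we modeled the influence of broadcasting on consumption using domestic broadcast video content on Naver TV. For the broadcasts with high monthly preference, we extracted the main keywords through text mining of the titles, descriptions, tags, and comments of the broadcast video content. Based on these, we filtered consumption-propensity keywords through SO-PMI-based opinion mining and calculated a consumption sentiment index. For this step, we newly built and used a consumption sentiment dictionary that can identify consumption preferences. Finally, classifying consumers by age and gender, we designed and implemented a broadcasting-consumption impact model based on the consumption sentiment index and the broadcast preference rate, which reflects the view counts and like counts of the broadcast content.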
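
As a rough illustration of the SO-PMI step named in the abstract above, the sketch below scores a keyword by its pointwise mutual information with positive seed words minus its PMI with negative seed words. Everything concrete here is an assumption made for illustration and does not come from the paper: the toy documents, the seed words, document-level co-occurrence, and the smoothing constant `eps`.

```python
import math

def so_pmi(word, docs, pos_seeds, neg_seeds, eps=0.01):
    """Semantic orientation of `word` from PMI with seed words.

    docs is a list of token sets; two words co-occur when they appear in
    the same document. eps is a smoothing constant (an assumption) that
    keeps the logarithm defined when a pair never co-occurs.
    """
    n = len(docs)

    def count(*terms):
        # Number of documents containing every given term.
        return sum(1 for d in docs if all(t in d for t in terms))

    def pmi(a, b):
        p_ab = (count(a, b) + eps) / n
        p_a = (count(a) + eps) / n
        p_b = (count(b) + eps) / n
        return math.log2(p_ab / (p_a * p_b))

    positive = sum(pmi(word, p) for p in pos_seeds)
    negative = sum(pmi(word, q) for q in neg_seeds)
    return positive - negative

# Hypothetical toy data: each set stands for the keywords of one comment.
docs = [
    {"snack", "buy", "tasty"},
    {"snack", "recommend", "tasty"},
    {"delivery", "late", "refund"},
    {"delivery", "refund", "waste"},
]
pos_seeds = ["buy", "recommend"]
neg_seeds = ["refund", "waste"]

snack_score = so_pmi("snack", docs, pos_seeds, neg_seeds)
delivery_score = so_pmi("delivery", docs, pos_seeds, neg_seeds)
```

With this toy data, `snack_score` comes out positive and `delivery_score` negative, so a filter could keep keywords whose score clears a chosen threshold. How the paper turns such scores into its consumption sentiment index is not stated in the abstract.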

Research Trends on Defects of Apartment Building by Keyword Network Analysis (키워드 네트워크 분석을 이용한 공동주택 하자 연구 동향 분석)

  • Jang, Ho-myun
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.9 / pp.403-410 / 2017
  • Apartment housing has rapidly increased since the housing supply policy implemented in the late 1980s. However, various defects have occurred because the policy focused only on the quantity of supply while neglecting quality control. In addition, disputes related to various defects are increasing; accordingly, studies on defects of apartment houses have been continuously conducted to solve various problems. In this study, I analyzed the research trends regarding long-term accumulated defects of apartment buildings by keyword network analysis, and suggest implications. As an analysis method, I collected journal articles using the portal of the Korea Educational and Scientific Information Agency and constructed the analysis data by filtering the collected academic papers and refining their keywords. I also performed visualization modeling for keyword network relationships, connection degree centrality analysis, and mediation centrality analysis. The results revealed that Mortgage, Dispute, Repair, Case, Response, Condensation, Cost, Institution, Standard, and Valuation are the main keywords that characterize apartment housing defects.

Web Site Keyword Selection Method by Considering Semantic Similarity Based on Word2Vec (Word2Vec 기반의 의미적 유사도를 고려한 웹사이트 키워드 선택 기법)

  • Lee, Donghun;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.23 no.2 / pp.83-96 / 2018
  • Extracting keywords that represent documents is very important because it can be used for automated services such as document search, classification, and recommendation systems, as well as for quickly conveying document information. However, keyword extraction based on the frequency of words appearing in a web site's documents, or on graph algorithms based on word co-occurrence, suffers from two problems: the web page structure may contain various words that are unrelated to the topic, and semantic keywords are difficult to extract because of the limited performance of Korean tokenizers. In this paper, we propose a method that selects candidate keywords based on semantic similarity, addressing both the failure to extract semantic keywords and the poor accuracy of Korean tokenizer analysis. Finally, a filtering process removes inconsistent keywords to produce the final semantic keywords. Experiments on real web pages of small businesses show that the proposed method improves performance by 34.52% over a keyword selection technique based on statistical similarity. This confirms that considering semantic similarity between words and removing inconsistent keywords improves keyword extraction from documents.

Video Evaluation System Using Scene Change Detection and User Profile (장면전환검출과 사용자 프로파일을 이용한 비디오 학습 평가 시스템)

  • Shin, Seong-Yoon
    • The KIPS Transactions:PartD / v.11D no.1 / pp.95-104 / 2004
  • This paper proposes an efficient remote video evaluation system that is well matched to the individual characteristics of students, using information filtering based on user profiles. To pose questions in video form, a key frame extraction method based on coordinate, size, and color information is proposed, and question-making intervals are extracted using gray-level histogram differences and a time window. In addition, a question-making method that combines a category-based system with a keyword-based system is used for efficient evaluation. As a result, students can improve their academic achievement by both strengthening their weak areas and maintaining their areas of interest.
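
The gray-level histogram difference with a time window mentioned in the abstract above can be sketched roughly as follows. This is only one possible reading: frames are assumed to be flat lists of 0-255 gray values, the window is interpreted as a minimum spacing between detected changes, and the bin count, threshold, and window length are illustrative values that do not come from the paper.

```python
def gray_histogram(frame, bins=16):
    """Normalized histogram of the 0-255 gray values in a flat pixel list."""
    hist = [0] * bins
    for value in frame:
        hist[value * bins // 256] += 1
    total = len(frame)
    return [h / total for h in hist]

def detect_scene_changes(frames, threshold=0.5, window=2, bins=16):
    """Return the indices i at which frame i starts a new scene.

    A change is flagged when the L1 distance between the histograms of
    consecutive frames exceeds `threshold` and at least `window` frames
    have passed since the previously flagged change.
    """
    if not frames:
        return []
    changes = []
    prev = gray_histogram(frames[0], bins)
    for i in range(1, len(frames)):
        cur = gray_histogram(frames[i], bins)
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold and (not changes or i - changes[-1] >= window):
            changes.append(i)
        prev = cur
    return changes

# Hypothetical toy clip: uniformly dark and bright 8x8 frames.
dark = [10] * 64
bright = [240] * 64
frames = [dark, dark, dark, bright, bright, dark]
cuts = detect_scene_changes(frames)  # [3, 5]
```

Widening the window to 3 suppresses the second change because it falls too close to the first, which is the kind of rapid-cut filtering a time window is typically meant to provide.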

Spam-mail Filtering based on Lexical Information and Thesaurus (어휘정보와 시소러스에 기반한 스팸메일 필터링)

  • Kang Shin-Jae;Kim Jong-Wan
    • Journal of Korea Society of Industrial Information Systems / v.11 no.1 / pp.13-20 / 2006
  • In this paper, we constructed a spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail. The definite information consists of the mail sender's information, URLs, and a list of certain spam keywords; the less definite information consists of the word lists and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. We used the lexical information and concept codes contained in the email body for SVM learning. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when the concept codes were also included as features.


A Design of Similar Video Recommendation System using Extracted Words in Big Data Cluster (빅데이터 클러스터에서의 추출된 형태소를 이용한 유사 동영상 추천 시스템 설계)

  • Lee, Hyun-Sup;Kim, Jindeog
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.172-178 / 2020
  • In order to recommend content, companies generally use collaborative filtering that takes into account both user preferences and video (item) similarities. Such services are primarily intended to improve user convenience by leveraging personal preferences such as user search keywords and viewing time. Results are also ranked around the keywords specified for each video. However, there is a limit to analyzing video similarities using a limited set of keywords, and the problem becomes serious if the specified keywords do not properly reflect the item. In this paper, I propose a system that identifies the characteristics of a video as they are, without human intervention, and analyzes and recommends similar videos. The proposed system analyzes similarities by taking into account all words (keywords) with distinct meanings extracted from training videos; because of the large scale of the data and operations, processing methods for big data clusters are applied.

Exploring the possibility of using ChatGPT and Stable Diffusion as a tool to recommend picture materials for teaching and learning

  • Soo-Hwan Lee;Ki-Sang Song
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.209-216 / 2023
  • In this paper, the artificial intelligence agents ChatGPT and Stable Diffusion were used to explore the possibility of educational use by implementing a program that recommends picture materials for teaching and learning according to the class topic entered by teachers. The average time spent recommending all picture materials was about 6 minutes. In general, pictures related to the keywords were recommended; text appearing in the recommended pictures was recognizable only as an attempt to depict letters, and the letters themselves were illegible and conveyed no meaning. However, further research seems to be needed, because the type and content of the recommended pictures depend entirely on ChatGPT's response and pictures cannot be recommended accurately for all keywords. In addition, it was concluded that although the recommended pictures are related to the keywords, whether they have educational value is a matter that should be left to the judgment of human teachers.

Implementation of Web Based Video Learning Evaluation System Using User Profiles (사용자 프로파일을 이용한 웹 기반 비디오 학습 평가 시스템의 구현)

  • Shin Seong-Yoon;Kang Il-Ko;Lee Yang-Won
    • Journal of the Korea Society of Computer and Information / v.10 no.6 s.38 / pp.137-152 / 2005
  • In this paper, we propose an efficient web-based video learning evaluation system that is tailored to individual students' characteristics through the use of user profile-based information filtering. As a means of giving video-based questions, keyframes are extracted based on location, size, and color information, and question-making intervals are extracted by means of differences in gray-level histograms as well as time windows. In addition, through a combination of the category-based system and the keyword-based system, examination questions are given in a way that ensures efficient evaluation. As a result, students can improve their school achievement by making up for weak areas while continuing to identify their areas of interest.
