• Title/Summary/Keyword: keyword extraction system

GPT-enabled SNS Sentence writing support system Based on Image Object and Meta Information (이미지 객체 및 메타정보 기반 GPT 활용 SNS 문장 작성 보조 시스템)

  • Dong-Hee Lee;Mikyeong Moon;Bong-Jun, Choi
    • Journal of the Institute of Convergence Signal Processing / v.24 no.3 / pp.160-165 / 2023
  • In this study, we propose an SNS sentence-writing assistance system that uses YOLO and GPT to help users write text that accompanies images, such as SNS posts. We use the YOLO model to extract objects from images inserted during writing, extract meta-information such as GPS coordinates and creation time, and use both as prompt values for GPT. The YOLO model was trained on form image data, and its mAP score is about 0.25 on average. GPT was trained on 1,000 blog texts on the topic of 'restaurant reviews', and the resulting model was used to generate sentences from the two types of keywords extracted from the images. To evaluate the practicality of the generated sentences, we conducted a closed-ended survey so that the results could be analyzed clearly. The questionnaire presented the inserted image and the keyword-based sentences and comprised three evaluation items. The results showed that the keywords in the images generated meaningful sentences. Through this study, we found that the accuracy of image-based sentence generation depends on the relationship between the image keywords and the content GPT was trained on.
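
The prompt-construction step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_prompt`, the prompt wording, and the sample labels and coordinates are all hypothetical, and the object labels are assumed to have already been produced by an object detector such as YOLO.

```python
from datetime import datetime


def build_prompt(object_labels, gps=None, created_at=None):
    """Assemble a text prompt from detected object labels and image metadata.

    All names and the prompt wording here are hypothetical illustrations.
    """
    # Deduplicate labels while preserving detection order.
    keywords = list(dict.fromkeys(object_labels))
    parts = ["Objects: " + ", ".join(keywords)]
    if gps is not None:
        lat, lon = gps
        parts.append(f"Location: {lat:.4f}, {lon:.4f}")
    if created_at is not None:
        parts.append("Taken: " + created_at.strftime("%Y-%m-%d %H:%M"))
    return "Write a short SNS post about a photo. " + " | ".join(parts)


# Hypothetical detector output and metadata for one image.
prompt = build_prompt(
    ["pizza", "cup", "pizza"],
    gps=(35.1796, 129.0756),
    created_at=datetime(2023, 5, 1, 12, 30),
)
print(prompt)
```

The resulting string would then be passed to the language model; how the two keyword types are weighted in the actual system is not stated in the abstract.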

Toward Preventing Cold-start Problem: Basis Recommendation System (콜드스타트 문제 완화를 위한 기저속성 추출 기반 추천시스템 제안)

  • Jungseob Lee;Hyeonseok Moon;Chanjun Park;Myunghoon Kang;Seungjun Lee;Sungmin Ahn;Jeongbae Park;Heuiseok Lim
    • Annual Conference on Human and Language Technology / 2022.10a / pp.427-430 / 2022
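  • Various studies are under way to solve the cold-start problem in recommender systems. However, most of them still require user-based history datasets and therefore do not fully solve the cold-start problem. This paper proposes a basis-attribute-based recommender system that can mitigate the cold-start problem. To validate the proposed method, we evaluated its performance on a Korean movie review dataset that we collected ourselves. The evaluation confirmed that the proposed method is a recommender system that effectively reflects keywords and users' review scores, and that, by mitigating data sparsity and the cold-start problem, it far outperforms existing text-based ranking systems. Furthermore, the proposed basis-attribute recommender system requires no GPU computing resources at inference time, which offers considerable advantages from a service standpoint.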

Event Detection System Using Twitter Data (트위터를 이용한 이벤트 감지 시스템)

  • Park, Tae Soo;Jeong, Ok-Ran
    • Journal of Internet Computing and Services / v.17 no.6 / pp.153-158 / 2016
  • As the number of social network users increases, information on events that draw attention in each region, such as social issues and disasters, is posted in large volumes on social media in real time, and its social ripple effect is huge. This study proposes a method for detecting events that draw attention from users in a specific region at a specific time by using Twitter data with regional information. We collect the Twitter data with the Twitter Streaming API. We then implement an event detection system that analyzes the frequency of the keywords contained in tweets within a particular time window and clusters the keywords that describe the same event by exploiting a keyword co-occurrence graph. Finally, we evaluate the validity of our method through experiments.
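
The two steps named in this abstract, frequency analysis within a time window and clustering over a co-occurrence graph, can be sketched as below. This is an illustrative sketch only: the burst rule (a count at least `ratio` times a smoothed baseline count), the use of connected components as the clustering step, and the sample tweets are assumptions, since the abstract does not specify them. Tweets are assumed to be pre-tokenized lists of words.

```python
from collections import Counter, defaultdict
from itertools import combinations


def bursty_keywords(window_tweets, baseline_tweets, min_count=2, ratio=2.0):
    """Keywords whose tweet count in the current window exceeds the baseline."""
    now = Counter(w for t in window_tweets for w in set(t))
    base = Counter(w for t in baseline_tweets for w in set(t))
    # The +1 smooths keywords that never appeared in the baseline window.
    return {w for w, c in now.items()
            if c >= min_count and c >= ratio * (base[w] + 1)}


def cluster_by_cooccurrence(tweets, keywords):
    """Group keywords into events: connected components of the co-occurrence graph."""
    graph = defaultdict(set)
    for tweet in tweets:
        present = sorted(set(tweet) & keywords)
        for a, b in combinations(present, 2):
            graph[a].add(b)
            graph[b].add(a)
    clusters, seen = [], set()
    for start in sorted(keywords):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical tokenized tweets from one region.
window = [["fire", "gangnam", "smoke"], ["fire", "gangnam"],
          ["fire", "smoke", "traffic"], ["concert", "ticket"],
          ["concert", "ticket", "lunch"], ["lunch"]]
baseline = [["lunch"], ["lunch", "traffic"], ["lunch"]]

hot = bursty_keywords(window, baseline)
events = cluster_by_cooccurrence(window, hot)
print(events)
```

In this toy data, "lunch" is frequent but not bursty because it is just as common in the baseline, so it is excluded before clustering.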

Content-based Extended CAN to Support Keyword Search (키워드 검색 지원을 위한 컨텐츠 기반의 확장 CAN)

  • Park, Jung-Soo;Lee, Hyuk-ro;U, Uk-dong;Jo, In-june
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.103-109 / 2005
  • Research on P2P systems has recently attracted much attention as the field moves from early centralized P2P to decentralized P2P. In particular, DHT-based structured P2P systems have drawn attention for their scalability, systematic search, and high search efficiency through routing. However, DHT-based structured P2P systems have a problem: files can be located only by their unique File IDs, even though a user may wish to search for files with a set of descriptive keywords or may not know the exact File ID. This paper proposes an extended-CAN mechanism that creates content-based File IDs and uses KID and CKD for common keyword processing, to support keyword search in DHT-based P2P systems.
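
For context, the general idea of keyword lookup in a CAN-style coordinate space can be sketched as below. This is a generic illustration and not the paper's extended-CAN mechanism: the abstract does not define KID or CKD, so they are not modeled here. The fixed uniform grid of zones, the in-memory `index`, and the function names are all hypothetical simplifications of a real CAN, where zones are owned by peers and split dynamically.

```python
import hashlib


def keyword_to_point(keyword, dims=2):
    """Hash a keyword to a point in the unit hypercube [0, 1)^dims."""
    digest = hashlib.sha256(keyword.lower().encode("utf-8")).digest()
    # Use 4 bytes of the digest per coordinate (supports up to 8 dimensions).
    return tuple(
        int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2**32
        for i in range(dims)
    )


def zone_of(point, splits=4):
    """Return the grid cell (zone) containing a point, for a uniform grid."""
    return tuple(min(int(c * splits), splits - 1) for c in point)


# Hypothetical stand-in for per-node storage: zone -> keyword -> file names.
index = {}


def publish(file_name, keywords):
    """Store the file under the zone that each of its keywords hashes to."""
    for kw in keywords:
        zone = zone_of(keyword_to_point(kw))
        index.setdefault(zone, {}).setdefault(kw.lower(), set()).add(file_name)


def search(keyword):
    """Look up files by keyword instead of by exact File ID."""
    zone = zone_of(keyword_to_point(keyword))
    return index.get(zone, {}).get(keyword.lower(), set())


publish("song.mp3", ["jazz", "piano"])
publish("live.mp3", ["jazz"])
print(search("jazz"))
```

Because the keyword itself determines the coordinate, a searcher who knows only a descriptive keyword can route to the responsible zone without knowing any File ID.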

Analysis of Research Trends in Data Curation Using Text Mining Techniques (텍스트 마이닝을 활용한 국외 데이터 큐레이션 연구 동향 분석)

  • Jaeeun Choi
    • Journal of the Korean Society for Information Management / v.41 no.3 / pp.85-107 / 2024
  • This study analyzes trends in data curation research. A total of 1,849 scholarly records were extracted from Scopus and WoS, with 1,797 papers selected after removing duplicates. Titles, keywords, and abstracts were analyzed through keyword frequency analysis, LDA topic modeling, and network analysis. Frequent keywords like 'research' and 'information' suggest that data curation is widely applied in medical research, biomedical research, data management, and infrastructure. LDA modeling identified five main topics: improving medical data quality, enhancing big data management, managing scientific data and repositories, annotating and modeling medical data, and gene/protein database research. Network analysis showed that 'analysis' was central in global discussions, while 'gene' and 'system' were locally central. These findings highlight the importance of data curation in various research areas.

Identifying Topics of LIS Curricula by Keyword Analysis - Focused on Information Technology Classes of US and Korea (교과 키워드 분석을 통한 문헌정보학과 교육 주제 연구 - 한국·미국 정보기술관련 교과 중심으로 -)

  • Choi, Sanghee
    • Journal of Korean Library and Information Science Society / v.50 no.2 / pp.43-60 / 2019
  • Since information technology such as database and network technology was brought into the library and information science field, the functions and services of libraries have changed drastically. To cope with these changes, library schools have been improving their curricula. This study collected library and information science curricula in the US and Korea and selected classes related to information technology. It then statistically examined the keywords in class titles and class descriptions. As a result, 'system', 'database', 'network', 'programming', and 'web' are major topic keywords for both countries, but 'library' shows high frequency only in Korea.

Natural language based Information Retrieval System considering the focus of the question (의문의 초점을 고려한 자연어 기반의 정보검색 시스템)

  • Park, Hong-Won
    • Annual Conference on Human and Language Technology / 1997.10a / pp.37-43 / 1997
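  • This paper points out the inconvenience and inefficiency of existing keyword search systems and, to overcome them, proposes a natural-language-based information retrieval system that takes Korean interrogative sentences themselves as queries. The system is designed to analyze, at the query analysis stage, not only the nominative topic word and the predicative topic word but also the focus of the question and focus-related phrases, so that it can retrieve answer sentences that meet the searcher's needs. To suit an interrogative query system, the paper classifies interrogative words into five types and formulates rules for handling each of them in real Korean sentences, attempting a systematic analysis of queries. The nominative and predicative topic words, which serve as index terms for retrieving candidate sentences, are extracted by fixed rules so that query analysis is systematic and accurate. In addition, the focus of the question and focus-related phrases are also analyzed and extracted by fixed rules, which improves the accuracy of answer-sentence retrieval.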

Sentiment Analysis of Foot-and-Mouth Disease Using Tweet Text-Mining Technique (트윗 텍스트 마이닝 기법을 이용한 구제역의 감성분석)

  • Chae, Heechan;Lee, Jonguk;Choi, Yoona;Park, Daihee;Chung, Yongwha
    • KIPS Transactions on Software and Data Engineering / v.7 no.11 / pp.419-426 / 2018
  • Foot-and-mouth disease (FMD) causes enormous damage to domestic animal husbandry and related industries every year. Although various academic studies related to FMD are ongoing, engineering studies on the social effects of FMD are very limited. In this study, we propose a systematic methodology for analyzing the emotional responses of ordinary citizens to FMD using text mining techniques. The proposed system first collects FMD-related data from tweets posted on Twitter and then performs polarity classification using a deep-learning technique. Second, keywords are extracted from the tweets using LDA, a typical topic modeling technique, and a keyword network is constructed from the extracted keywords. Finally, we analyze the various social effects of FMD on ordinary citizens through the keyword network. As a case study, we performed a sentiment analysis experiment on ordinary citizens' responses to FMD in Korea from July 2010 to December 2011.

Implementation of a Content-Based Image Retrieval System with Color Assignments (칼라 지정을 이용한 내용기반 화상검색 시스템 구현)

  • Kim, Cheol-Won;Choi, Ki-Ho
    • The Transactions of the Korea Information Processing Society / v.4 no.4 / pp.933-943 / 1997
  • In this paper, a content-based image retrieval system with color assignments has been studied and implemented. The colors of images are extracted after converting the RGB color space to HSV (hue, saturation, value), the color space most compatible with human color perception. During color extraction, an image is divided into 9 areas, and 3 major colors are selected for each area by using color histograms. The class of images can be chosen by keywords. We evaluate four types of queries: an image input, keywords with color assignments, a combination of an image input and keywords with color assignments, and selection of a specific part of an image. Experimental results show that the four query types provide precision/recall of 0.55/0.37, 0.57/0.43, 0.59/0.45, and 0.63/0.61, respectively. With color assignments, the retrieval system achieved high performance and validity.
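
The color extraction step described in this abstract (RGB to HSV conversion, a split into 9 areas, and up to 3 major colors per area from a histogram) can be sketched as below. This is a simplified illustration, not the paper's implementation: it assumes a 3 x 3 grid for the 9 areas, builds the histogram over hue only (ignoring saturation and value), and uses 12 hue bins; the function name, parameters, and sample image are hypothetical.

```python
import colorsys
from collections import Counter


def dominant_colors(pixels, grid=3, top=3, hue_bins=12):
    """Find the most frequent hue bins in each region of an image.

    pixels: 2-D list of (r, g, b) tuples with 0-255 channels.
    Returns {(row, col): [hue_bin, ...]} with at most `top` bins per region.
    """
    height, width = len(pixels), len(pixels[0])
    histograms = {}
    for y, row in enumerate(pixels):
        for x, (r, g, b) in enumerate(row):
            h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            region = (y * grid // height, x * grid // width)
            bin_index = min(int(h * hue_bins), hue_bins - 1)
            histograms.setdefault(region, Counter())[bin_index] += 1
    return {region: [b for b, _ in hist.most_common(top)]
            for region, hist in histograms.items()}


# Hypothetical 6 x 6 image: red everywhere except a sky-blue bottom-right corner.
RED, SKY = (255, 0, 0), (0, 191, 255)
image = [[SKY if (y >= 4 and x >= 4) else RED for x in range(6)]
         for y in range(6)]
colors = dominant_colors(image)
print(colors[(0, 0)], colors[(2, 2)])
```

A query with color assignments could then be matched by comparing the requested hue bins against the stored bins of each region; the matching rule used in the paper is not given in the abstract.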

A Study on Graph-based Topic Extraction from Microblogs (마이크로블로그를 통한 그래프 기반의 토픽 추출에 관한 연구)

  • Choi, Don-Jung;Lee, Sung-Woo;Kim, Jae-Kwang;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.5 / pp.564-568 / 2011
  • Microblogs have become a popular means of information delivery with the spread of smartphones. They reflect the interests of users more quickly than other media. In particular, for a subject that attracts many users, microblogs can supply rich information originating from various sources. Nevertheless, obtaining useful information from microblogs has been considered a hard problem because they contain too much noise. Various methods have been proposed to extract and track subjects from particular documents, yet these methods do not work effectively on microblogs, which consist of short phrases. In this paper, we propose a graph-based topic extraction and partitioning method to understand the interests of users about a certain keyword. The proposed method consists of generating a keyword graph from the co-occurrences of terms in the microblogs and splitting the graph with a network partitioning method. When applied to several keywords, our method showed good performance in finding a topic about the keyword and partitioning the topic into sub-topics.
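
The two processes in this abstract, building a weighted keyword graph from term co-occurrences and splitting it into sub-topics, can be sketched as below. The abstract does not name the network partitioning method, so this sketch substitutes a deliberately simple stand-in: edges lighter than `min_weight` are dropped and the remaining connected groups are taken as sub-topics. The function names, threshold, and sample posts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(posts):
    """Count how often each pair of terms appears in the same post."""
    weights = Counter()
    for post in posts:
        for a, b in combinations(sorted(set(post)), 2):
            weights[(a, b)] += 1
    return weights


def partition(weights, min_weight=2):
    """Split the graph into sub-topics by keeping only edges >= min_weight.

    A stand-in for the unspecified partitioning method, using union-find.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), w in weights.items():
        root_a, root_b = find(a), find(b)  # registers both terms
        if w >= min_weight:
            parent[root_a] = root_b

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return sorted(groups.values(), key=sorted)


# Hypothetical tokenized posts that all mention the query keyword "apple"
# (the query keyword itself is removed before building the graph).
posts = [["iphone", "launch", "price"], ["iphone", "price"],
         ["iphone", "launch"], ["pie", "recipe"],
         ["pie", "recipe", "price"]]

graph = cooccurrence_graph(posts)
subtopics = partition(graph)
print(subtopics)
```

Here the weak links through "price" (weight 1) are cut, which separates the product sub-topic from the cooking sub-topic.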