• Title/Summary/Keyword: keyword learning

A SVM-based Method for Classifying Tagged Web Resources using Tag Stability of Folksonomy in Categories (범주별 태그 안정성을 이용한 태그 부착 자원의 SVM 기반 분류 기법)

  • Koh, Byung-Gul;Lee, Kang-Pyo;Kim, Hyoung-Joo
    • Journal of KIISE:Computing Practices and Letters / v.15 no.6 / pp.414-423 / 2009
  • Folksonomy, a collaborative classification built from freely chosen keywords, is one of the driving factors of Web 2.0. A folksonomy can be built at low cost, but unlike a taxonomy it lacks hierarchical or systematic structure. If we can build a classifier that uses the collective intelligence in a folksonomy to classify web resources into a taxonomy, we can build the taxonomy at low cost. In this paper, targeting the folksonomy of Slashdot.org, we define a general model and use a stability value to show that the collective intelligence needed to build such a classifier really exists in the folksonomy. We propose a method that builds an SVM classifier using the stability that results from this collective intelligence. The experiment shows that the proposed method builds a taxonomy from a folksonomy with high accuracy.

Title Generation Model for which Sequence-to-Sequence RNNs with Attention and Copying Mechanisms are used (주의집중 및 복사 작용을 가진 Sequence-to-Sequence 순환신경망을 이용한 제목 생성 모델)

  • Lee, Hyeon-gu;Kim, Harksoo
    • Journal of KIISE / v.44 no.7 / pp.674-679 / 2017
  • In big-data environments where large amounts of text documents are produced daily, titles are important clues that let readers quickly grasp the key ideas of documents; however, many document types, such as blog articles and social-media messages, have no titles. This paper proposes a title-generation model that uses sequence-to-sequence RNNs with attention and copying mechanisms. In the proposed model, input sentences are encoded by bi-directional GRU (gated recurrent unit) networks, and the title words are generated by decoding the encoded sentences together with keywords that are automatically selected from the input sentences. In experiments with 93,631 training documents and 500 test documents, the attention mechanism was more effective (ROUGE-1: 0.1935, ROUGE-2: 0.0364, ROUGE-L: 0.1555) than the copying mechanism; in addition, the former scored higher in the qualitative evaluation.

Academic Conference Categorization According to Subjects Using Topical Information Extraction from Conference Websites (학회 웹사이트의 토픽 정보추출을 이용한 주제에 따른 학회 자동분류 기법)

  • Lee, Sue Kyoung;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.22 no.2 / pp.61-77 / 2017
  • The amount of academic conference information on the Internet has recently increased rapidly, and automatically classifying this information by research subject enables researchers to find related conferences efficiently. The information provided by most conference listing services is limited to title, date, location, and website URL. Among these features, the only one containing topical words is the title, which causes an information insufficiency problem. Therefore, we propose methods that aim to resolve this problem by utilizing web content. Specifically, the proposed methods extract the main contents from an HTML document collected using the website URL. Based on the similarity between the title of a conference and its main contents, topical keywords are selected to reinforce the important keywords among the main contents. Experiments on a real-world dataset showed that the additional information extracted from conference websites improves conference classification performance. We plan to further improve classification accuracy by considering the structure of websites.
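
The abstract above does not specify how title-content similarity is computed. As a rough illustration only, the following sketch assumes bag-of-words vectors and cosine similarity: it ranks content sentences against the title and collects keywords from the most similar ones. The function names, the whitespace tokenization, and the sample strings are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topical_keywords(title, sentences, top_sentences=1):
    """Collect word counts from the sentences most similar to the title.

    Assumption: plain lowercase whitespace tokenization stands in for
    whatever preprocessing the original work used.
    """
    title_vec = Counter(title.lower().split())
    ranked = sorted(
        sentences,
        key=lambda s: cosine(title_vec, Counter(s.lower().split())),
        reverse=True,
    )
    keywords = Counter()
    for sentence in ranked[:top_sentences]:
        keywords.update(sentence.lower().split())
    return keywords


# Hypothetical conference title and main-content sentences.
title = "machine learning conference"
content = [
    "papers on machine learning and data mining",
    "hotel booking and travel information",
]
keywords = topical_keywords(title, content)
```

Here `keywords` contains words from the sentence that overlaps with the title, while words from the unrelated logistics sentence are left out.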

An Analysis of Keywords on 'School Space Innovation' Policies using Text Mining - Focused on News Articles - (텍스트 마이닝을 활용한 '학교 공간 혁신' 정책 키워드 분석 - 뉴스 기사를 중심으로 -)

  • Lee, Dongkuk
    • The Journal of Sustainable Design and Educational Environment Research / v.19 no.2 / pp.11-20 / 2020
  • The goal of this study was to use text mining to investigate the implementation of school space innovation and related issues as reported by major Korean mass media. To this end, the study collected 519 news articles on school space innovation published by 54 Korean mass media companies. Using this data, the study performed frequency analysis and network analysis of the keywords. Based on the findings, the characteristics of school space innovation are summarized as follows: First, school space innovation has progressed in response to future education. Second, users are actively participating in school space innovation. Third, experts are supporting the innovation of school space by establishing a cooperative system. Fourth, the community is actively considering the innovation of school space. Fifth, the main projects of the Ministry of Education and the Provincial Offices of Education are actively conducted in a mix of top-down and bottom-up approaches. The findings of this study will help provide a clear direction for contemporary school space innovation and offer implications for future research and implementation.
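
As a minimal sketch of the two analyses named in the abstract above, the following counts keyword frequencies and builds co-occurrence edge weights, which are the usual input to a keyword network. It assumes each article has already been reduced to a list of keywords; the function name and the sample keyword lists are hypothetical, not data from the study.

```python
from collections import Counter
from itertools import combinations


def keyword_stats(docs):
    """Return (keyword frequencies, co-occurrence edge weights).

    Each document is a list of keyword tokens. An edge (a, b) with a < b
    counts the number of documents in which both keywords appear.
    """
    freq = Counter()
    edges = Counter()
    for tokens in docs:
        freq.update(tokens)
        # Count each unordered keyword pair once per document.
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return freq, edges


# Hypothetical keyword lists standing in for news articles.
articles = [
    ["school", "space", "innovation", "students"],
    ["school", "space", "education"],
    ["education", "innovation", "school"],
]
freq, edges = keyword_stats(articles)
```

The `edges` counter can be passed to a graph library for centrality or clustering measures; that step is omitted here to keep the sketch dependency-free.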

A Distinction Technology for Harmful Web Documents by Rates (등급에 따른 웹 유해 문서 분류 기술)

  • Kim, Yong-Soo;Nam, Taek-Yong;Won, Dong-Ho
    • The KIPS Transactions:PartC / v.13C no.7 s.110 / pp.859-864 / 2006
  • The openness of the Web allows any user to access almost any type of information easily, at any time and from anywhere. However, along with easy access to useful information, the Internet has the adverse effect of exposing users to harmful content indiscriminately. Some information, such as adult content, is not appropriate for all users, notably children. Even for adults, some of the content on abnormal porn sites can harm ordinary people's mental health. Moreover, because the Internet is a worldwide open network, there is a limit to how far each country's national laws or systems can regulate users who provide harmful content. Developing classification technology specific to one particular system is also undesirable, because Internet users can encounter harmful content in diverse ways, for example through porn sites, harmful spam, or peer-to-peer networks. Therefore, research and development of context-based core technologies for classifying harmful content is being emphasized. In this paper, we propose an efficient text filter that blocks harmful text in web documents using context-based technologies.

A Study on Character Design through Successful Cases of OSMU in Early Childhood Educational Contents (유아 교육 콘텐츠에서 OSMU 성공사례를 통한 캐릭터 디자인 연구)

  • Lee, Yu-Seop;Chung, Jean-Hun
    • Journal of Digital Convergence / v.17 no.11 / pp.451-457 / 2019
  • Reflecting the current craze for early childhood education, the demand for learning content has soared, and kids' content is emerging as a key topic in the content industry. The industry that develops digital content for young learners is called the 'angel industry', and it is attracting a lot of attention because of the increased demand for early childhood education. This paper selects characters used in successful digital products in order to study character design for OSMU children's educational content. Analysis criteria were prepared through prior research and applied to derive general success strategies for character design. As a result, common design features were found in the analyzed characters, and the need for further research was confirmed. Hopefully, this study will contribute to OSMU character design and lead to improved development of educational content and the commercialization of various characters.

A Study on the Autonomous Decision Right of Emotional AI based on Analysis of 4th Wave Technology Availability in the Hyper-Linkage (무한연결시 4차 산업기술의 이용 가능성 분석을 통한 감성 인공 지능의 자율 결정권에 관한 연구)

  • Seo, Dae-Sung
    • Journal of Convergence for Information Technology / v.9 no.8 / pp.9-19 / 2019
  • Research on the effects of artificial intelligence technology is social science research, in that it examines the impact on industry, changes in daily life, and so on. This means that developing 'emotion AI' will prepare for 'next-generation 3D-vector-sensitive AI'. This suggests the main keywords of the tertiary AI decision-making power. Particularly important results will be achieved because of the importance of current unethical learning and the implementation of decision-making systems that reflect ethical value judgments. This is a data-based simulation, which requires (1) available data and (2) the technology for the goal of the simulation. This takes into account the general content of the intended simulation-based research. Existing research focuses on meaningful research motivation, whereas this study presents the direction of technology. The empirical analysis is thus consistent with the decision-making power of each country versus new technology firms regarding ethical responsibility for AI. As a result, there is a need for a concrete contribution and interpretation that can be achieved for ethical responsibility on the technical side of AI/ML. In AI decision making, analytic power of human empathy should be included tech own trust.

An Automatically Extracting Formal Information from Unstructured Security Intelligence Report (비정형 Security Intelligence Report의 정형 정보 자동 추출)

  • Hur, Yuna;Lee, Chanhee;Kim, Gyeongmin;Jo, Jaechoon;Lim, Heuiseok
    • Journal of Digital Convergence / v.17 no.11 / pp.233-240 / 2019
  • In order to predict and respond to cyber attacks, a number of security companies quickly identify the methods, types, and characteristics of attack techniques and publish Security Intelligence Reports (SIRs) on them. However, the SIRs distributed by each company are huge and unstructured. In this paper, we propose a framework that uses five analytic techniques to formalize a report and extract its key information, reducing the time required to extract information from large unstructured SIRs. Since the SIR data have no ground-truth labels, we propose four analysis techniques based on unsupervised learning: Keyword Extraction, Topic Modeling, Summarization, and Document Similarity. Finally, we built data for extracting threat information from SIRs and applied Named Entity Recognition (NER) to recognize words belonging to the IP, Domain/URL, Hash, and Malware types and to determine which type each word belongs to. In total, the proposed framework applies five analysis techniques, including NER.

Sensitivity of abacus and Chasdaq in the Chinese stock market through analysis of Weibo sentiment related to Corona-19 (코로나-19관련 웨이보 정서 분석을 통한 중국 주식시장의 주판 및 차스닥의 민감도 예측 기법)

  • Li, Jiaqi;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.1 / pp.1-7 / 2021
  • Investor mood expressed on social media is gaining increasing attention for its role in leading price movements in the stock market. Based on behavioral finance theory, this study argues that sentiment extracted from social media using big data techniques can predict real-time (short-run) price momentum in the Chinese stock market. Sina Weibo posts related to COVID-19 are collected using a keyword method, and a daily influence-weighted sentiment factor is extracted from the sizable raw data of over 2 million posts. We examine one supervised and four unsupervised sentiment analysis models, and use the best-performing word-frequency and BiLSTM models. The test results show similar movement between stock price changes and the sentiment factor. This indicates that public mood extracted from social media can to some extent represent investors' sentiment and affect stock market fluctuation when people are focused on a special event that can influence the stock market.

Analysis of whether the feeling of relative deprivation is shown in the comments of the Luxury Howl YouTube video - Focusing on modern sentiment analysis using TF-IDF, Word2vec, LDA and LSTM - (명품 하울 유튜브 영상 댓글에 나타난 상대적 박탈감 여부와 특징 분석 - TF-IDF, Word2vec, LDA, LSTM을 이용한 현대인의 감정 분석을 중심으로 -)

  • Choi, Jung Min;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.3 / pp.355-360 / 2021
  • YouTube has recently become more popular. As many studies have shown relative deprivation on social media, this study examines whether relative deprivation is also expressed in YouTube comments. It focuses on Luxury Haul content, videos showing large quantities of luxury products, in which YouTubers' economic status is on display. The comments on the videos are analyzed with LDA, TF-IDF, and Word2Vec. Additionally, the comments are classified into positive and negative groups by an LSTM model. The results show that, even though many comments were positive, the negative keywords were related to relative deprivation. It was also found that viewers compared themselves with the YouTubers. In particular, some YouTubers are criticized more if they are younger or do not seem able to afford the luxury products themselves. This study suggests that users express relative deprivation on YouTube just as they do on other social media.