• Title/Summary/Keyword: semantic-based information extraction (의미 기반 정보 추출)

A Study on the Outcomes Measurement of a Public Library's Reading Program for Children Using the Evaluation Framework Based-on the Logic Model (로직모델 기반 평가 프레임워크를 이용한 공공도서관 어린이 독서 프로그램 성과 측정 연구)

  • Han, Sang Woo;Park, Sung Jae
    • Journal of the Korean Society for Information Management / v.35 no.3 / pp.271-286 / 2018
  • The purpose of this study is to measure the outcomes of a program provided by a public library using an evaluation framework based on the logic model. A reading program for children operated by a public library in Seoul was selected. The outcome evaluation began with an analysis of the reading program process, including planning, operation, and evaluation. Based on this analysis, a logic model framework for outcome evaluation was developed. For the evaluation, user, bibliographic, and circulation data were collected from the library's KOLAS system. Additionally, participant information was extracted from the final report drafted after the program. The results show that the participants' circulation counts increased after the program and that the range of their reading topics expanded. These findings indicate that the reading program is effective in promoting children's reading habits and that outcome evaluation might be a valid tool for measuring the effectiveness of public library programs.

Robust Part-of-Speech Tagger using Statistical and Rule-based Approach (통계와 규칙을 이용한 강인한 품사 태거)

  • Shim, Jun-Hyuk;Kim, Jun-Seok;Cha, Jong-Won;Lee, Geun-Bae
    • Annual Conference on Human and Language Technology / 1999.10d / pp.60-75 / 1999
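  • Part-of-speech (POS) tagging is the most fundamental component of natural language processing. It serves as preprocessing for higher-level tasks such as syntactic and semantic analysis, and it is also used on its own in applications such as extracting linguistic information and information retrieval. Research on POS tagging is broadly divided into statistical methods, rule-based methods, and hybrid methods that use both. POSTAG, the POS tagging system of the natural language processing engine (SKOPE) of the POSTECH Natural Language Processing Laboratory, is a hybrid POS tagging system with strengthened unknown-word estimation. The system consists of a morphological analyzer, a statistical POS tagger, and an error-correcting rule-based post-processor. These components are not simply connected in series; they interact closely by generating and processing a morpheme connectivity graph during analysis, based on a morpheme connectivity table. In addition, a pattern dictionary for unknown words lets the system process unknown words in the same way as registered words, which yields efficient and robust POS tagging. Bidirectional tagset mapping between the POSTAG tagset and the standard tagset of the Electronics and Telecommunications Research Institute (ETRI) makes it possible to obtain large-scale training data for POSTAG from corpora tagged with the standard tagset and lets POSTAG output tagging results in either tagset. On the 30,000 eojeols provided at MATEC '99, the system achieved a morpheme-level accuracy of 95% when outputting the standard tagset, and POSTAG's own tagging results, excluding tagset mapping, reached an accuracy of 97%.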
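
The three-stage pipeline described in this abstract (statistical tagging, unknown-word handling through patterns, rule-based error correction) and the tagset mapping can be illustrated with a minimal sketch. This is not POSTAG's implementation: it uses a unigram most-frequent-tag model on English toy data instead of a morpheme connectivity graph, and every name, pattern, rule, and tag below is a hypothetical placeholder.

```python
from collections import Counter, defaultdict

# Toy tagged corpus (hypothetical; not POSTAG's data or tagset).
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("they", "PRON"), ("run", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
]

# Hypothetical suffix patterns standing in for an unknown-word pattern dictionary.
UNKNOWN_PATTERNS = [("ly", "ADV"), ("s", "NOUN")]

# Hypothetical error-correction rules: (previous tag, current tag) -> corrected tag.
CORRECTION_RULES = {("DET", "VERB"): "NOUN"}

# Hypothetical mapping from the internal tagset to a "standard" tagset.
TO_STANDARD = {"DET": "dt", "NOUN": "nn", "VERB": "vb", "PRON": "pr", "ADV": "ad"}


def train_unigram(corpus):
    """Statistical stage: remember the most frequent tag of each known word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def guess_unknown(word):
    """Unknown-word stage: fall back to suffix patterns, then to NOUN."""
    for suffix, tag in UNKNOWN_PATTERNS:
        if word.endswith(suffix):
            return tag
    return "NOUN"


def tag_sentence(words, model):
    """Tag with the statistical model, then apply correction rules left to right."""
    tags = [model.get(w) or guess_unknown(w) for w in words]
    for i in range(1, len(tags)):
        tags[i] = CORRECTION_RULES.get((tags[i - 1], tags[i]), tags[i])
    return tags


def to_standard(tags):
    """Map internal tags to the (hypothetical) standard tagset."""
    return [TO_STANDARD[t] for t in tags]


if __name__ == "__main__":
    model = train_unigram(TRAIN)
    tags = tag_sentence(["the", "run", "ends"], model)
    print(tags, to_standard(tags))
```

In the toy data "run" is most often a verb, so the statistical stage mislabels it after "the"; the correction rule repairs that case, which is the kind of division of labor a hybrid tagger relies on.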

A Review Study of the Success Factors Based the Information Systems Success Model (정보시스템 성공모델 기반 성공요인에 관한 문헌적 고찰)

  • Nam, Soo-Tai;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.05a / pp.123-125 / 2016
  • Big data analysis refers to the ability to store, manage, and analyze data collected from existing database management tools. It also denotes the technology for extracting value from large structured or unstructured data sets and analyzing the results. Meta-analysis is a statistical method for synthesizing the quantitative results of many known empirical studies. We conducted a meta-analysis and review of the success factors examined in research based on the information systems success model. This study covered a total of 14 research papers, published in Korean academic journals between 2000 and 2016, that established causal relationships between success factors based on the information systems success model. Based on the findings, several theoretical and practical implications are suggested and discussed, along with differences from previous research.

Performance Evaluation of a Machine Learning Model Based on Data Feature Using Network Data Normalization Technique (네트워크 데이터 정형화 기법을 통한 데이터 특성 기반 기계학습 모델 성능평가)

  • Lee, Wooho;Noh, BongNam;Jeong, Kimoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.4 / pp.785-794 / 2019
  • Deep learning, one of the technologies of the fourth industrial revolution, has recently been used in the security field to identify hidden meaning in network data that is difficult to detect and to predict attacks. Before a deep learning algorithm is selected for intrusion detection, the properties and quality of the data sources must be analyzed, because contamination of the training data affects detection. The characteristics of the data should therefore be identified and the relevant features selected. In this paper, the characteristics of malware were analyzed using a network data set, and the effect of each feature on performance was analyzed when a deep learning model was applied. A traffic classification experiment compared features according to network characteristics, and the selected features yielded a classification accuracy of 96.52%.

Smartphone-User Interactive based Self Developing Place-Time-Activity Coupled Prediction Method for Daily Routine Planning System (일상생활 계획을 위한 스마트폰-사용자 상호작용 기반 지속 발전 가능한 사용자 맞춤 위치-시간-행동 추론 방법)

  • Lee, Beom-Jin;Kim, Jiseob;Ryu, Je-Hwan;Heo, Min-Oh;Kim, Joo-Seuk;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.21 no.2 / pp.154-159 / 2015
  • Over the past few years, user needs in the smartphone application market have shifted from diversity toward intelligence. Here, we propose a novel cognitive agent that plans the daily routines of users using the lifelog data collected by their smartphones. The proposed method first employs a DPGMM (Dirichlet Process Gaussian Mixture Model) to automatically extract the users' POIs (Points of Interest) from the lifelog data. After extraction, the POIs and other meaningful features, such as GPS data and the user's activity labels extracted from the log data, are used to learn the patterns of the user's daily routine with a POMDP (Partially Observable Markov Decision Process). To determine the significant patterns within the user's time-dependent patterns, the SNS application Foursquare was used to record the locations visited by the user and the activities the user performed. The method was evaluated by predicting the daily routines of seven users with 3,300 feedback data items. Experimental results showed that daily routine scheduling can be established after seven days of lifelog and feedback data have been collected, demonstrating the potential of place-time-activity coupled daily routine planning systems in the intelligent application market.

Content-based Image Retrieving Using Color Histogram and Color Texture (컬러 히스토그램과 컬러 텍스처를 이용한 내용기반 영상 검색 기법)

  • Lee, Hyung-Goo;Yun, Il-Dong
    • Journal of the Korean Institute of Telematics and Electronics S / v.36S no.9 / pp.76-90 / 1999
  • In this paper, a color image retrieval algorithm based on color histogram and color texture is proposed. The representative color vectors of a color image are obtained by k-means clustering of its color histogram, and the color texture is generated by centering the color of each pixel on its color vector. The color texture thus represents texture properties emphasized by the color histogram, and it is analyzed with a Gaussian Markov Random Field (GMRF) model. The proposed algorithm works efficiently because it does not require any low-level image processing such as segmentation or edge detection, and it outperforms traditional algorithms that use only the color histogram or texture properties derived from image intensity.
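
The first two steps of this abstract, finding representative color vectors by k-means and centering each pixel on its representative color, can be sketched as follows. This is a simplified illustration under stated assumptions: it clusters raw RGB pixels rather than histogram bins, uses a deterministic initialization, and omits the GMRF analysis entirely. The function names are hypothetical and do not come from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two color vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(color, centers):
    """Index of the center closest to the given color."""
    return min(range(len(centers)), key=lambda i: sq_dist(color, centers[i]))


def kmeans_colors(pixels, k, iterations=10):
    """Plain k-means over RGB tuples; returns k representative color vectors."""
    # Deterministic initialization (an assumption): the first k distinct colors.
    centers = []
    for p in pixels:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in pixels:
            clusters[nearest(p, centers)].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            tuple(sum(channel) / len(cluster) for channel in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers


def center_pixels(pixels, centers):
    """Subtract from each pixel its nearest representative color."""
    residuals = []
    for p in pixels:
        c = centers[nearest(p, centers)]
        residuals.append(tuple(x - y for x, y in zip(p, c)))
    return residuals


if __name__ == "__main__":
    pixels = [(250, 0, 0), (255, 5, 0), (0, 0, 250), (5, 0, 255)]
    centers = kmeans_colors(pixels, k=2)
    print(centers)
    print(center_pixels(pixels, centers))
```

The residuals returned by `center_pixels` are the quantity a texture model would then be fitted to; how the paper defines that step precisely is not recoverable from the abstract.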

Personalized Media Control Method using Probabilistic Fuzzy Rule-based Learning (확률적 퍼지 룰 기반 학습에 의한 개인화된 미디어 제어 방법)

  • Lee, Hyeong-Uk;Kim, Yong-Hwi;Lee, Tae-Yeop;Park, Gwang-Hyeon;Kim, Yong-Su;Jo, Jun-Myeon;Byeon, Jeung-Nam
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2006.11a / pp.25-28 / 2006
  • Intention reading technology makes it possible to provide users with more convenient and personalized services in complex ubiquitous environments such as smart homes. Learning capability, from the viewpoint of knowledge discovery, has established itself as one of the core component technologies of intention reading. This paper addresses a personalized media control method as one of the personalized services that can be offered in a smart home environment. In particular, data such as human behavior patterns often carry insufficient input information relative to the classes to be distinguished in pattern classification and are therefore often inconsistent, so the concepts of fuzzy logic and probability must be combined effectively to extract meaningful knowledge. To this end, we describe a method for obtaining probabilistic fuzzy rules from given data patterns based on the Iterative Fuzzy Clustering with Supervision (IFCS) algorithm. We also describe how a learning control system that incorporates this method can recommend personalized media services.

A Recovery Technique of PDF File in the Unit of Page (PDF 파일의 페이지단위 복구 기법)

  • Jang, Jeewon;Bang, Seung Gyu;Han, Jaehyeok;Lee, Sang Jin
    • KIPS Transactions on Computer and Communication Systems / v.6 no.1 / pp.25-30 / 2017
  • Data deletion, one of the anti-forensic techniques, is simple to perform yet has a substantial influence on forensic analysis. In response, academia has continuously studied techniques for recovering deleted files, chiefly file system-based recovery and file format-based recovery. If the file system still holds the metadata of a deleted file, the file can easily be recovered from it; if no metadata remains, the file must be recovered with signature-based carving or with file format-based recovery. File format-based recovery, in turn, requires an analysis of the file structure and a recovery technique suited to it. This paper proposes a page recovery technique for deleted PDF files based on the structural characteristics of the PDF format. The technique uses the tag values of the page objects that make up each page of a PDF file. Objects are extracted by using each tag value as a kind of signature; the extracted objects are then analyzed to recombine the metadata of the PDF file, and the file is reconstructed page by page. Recovering by page means that even if a deleted PDF file is damaged, some of its pages can still be recovered. Generally, when file system-based recovery is not possible, deleted files are recovered by signature-based carving. The technique proposed in this paper can also recover PDF files that are damaged. From a digital forensics perspective, it can be used to recover more data than was previously possible.
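
The idea of treating page-object tags as carving signatures can be illustrated with a minimal sketch, assuming uncompressed PDF objects in a raw byte buffer. It scans for `N G obj ... endobj` spans, keeps those whose dictionary declares `/Type /Page`, and records the object number of each page's `/Contents` reference. The function name and sample bytes are hypothetical; the paper's actual procedure, including metadata recombination, is not reproduced here, and pages stored inside compressed object streams would not be found this way.

```python
import re

# An indirect object: "<number> <generation> obj ... endobj".
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# A page object declares /Type /Page; the lookahead excludes /Pages (the page tree).
PAGE_RE = re.compile(rb"/Type\s*/Page(?![A-Za-z])")
# Indirect reference to the page's content stream.
CONTENTS_RE = re.compile(rb"/Contents\s+(\d+)\s+\d+\s+R")


def find_page_objects(raw):
    """Return {page object number: content object number or None} found in raw bytes."""
    pages = {}
    for match in OBJ_RE.finditer(raw):
        number, body = int(match.group(1)), match.group(3)
        if PAGE_RE.search(body):
            contents = CONTENTS_RE.search(body)
            pages[number] = int(contents.group(1)) if contents else None
    return pages


# Hypothetical fragment of a damaged file: the header and page tree survive only
# in part, and unrelated bytes sit between two page objects.
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R 4 0 R] /Count 2 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R /Contents 5 0 R >>\nendobj\n"
    b"garbage bytes from an overwritten region\n"
    b"4 0 obj\n<</Type/Page/Parent 2 0 R/Contents 6 0 R>>\nendobj\n"
)

if __name__ == "__main__":
    print(find_page_objects(SAMPLE))
```

Because each page object is located independently, a page can be found even when the catalog or page tree around it has been overwritten, which matches the abstract's point about partial recovery.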

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been a continuous demand in various fields for market information at the level of specific products. However, such information has generally been provided at the industry level or in broad categories based on classification standards, making it difficult to obtain specific and appropriate information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than those previously offered. We applied the Word2Vec algorithm, a neural network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows. First, data related to product information is collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data of the extracted products is summed to estimate the market size of each product group. As experimental data, product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training. We performed parameter optimization for training and then applied a vector dimension of 300 and a window size of 15 as the optimized parameters for further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to the KSIC index words were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items; the Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of the market category can be adjusted easily and efficiently according to the purpose of the information by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application since it can meet unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis reports published by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic word embedding module can be advanced by imposing a proper order on the preprocessed dataset or by combining another measure, such as Jaccard similarity, with Word2Vec. Also, the product group clustering method can be replaced with other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect them to further improve the performance of the basic model conceptually proposed in this study.
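
The grouping and summation steps of this abstract can be sketched as follows: product names whose embedding is close enough to an index word's embedding (by cosine similarity) form one product group, and their sales are summed. The three-dimensional vectors, product names, sales figures, and thresholds below are hypothetical toy values standing in for trained Word2Vec embeddings and real microdata; the study itself reports a vector dimension of 300.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy embeddings (a trained model would supply these).
EMBEDDINGS = {
    "laptop": (0.9, 0.1, 0.0),
    "notebook computer": (0.8, 0.2, 0.1),
    "tablet pc": (0.7, 0.1, 0.4),
    "instant noodles": (0.0, 0.9, 0.3),
}

# Hypothetical sales per product name, in arbitrary units.
SALES = {
    "laptop": 120.0,
    "notebook computer": 80.0,
    "tablet pc": 50.0,
    "instant noodles": 200.0,
}


def estimate_market_size(index_vector, embeddings, sales, threshold):
    """Group products similar to the index word and sum their sales."""
    group = [
        name for name, vector in embeddings.items()
        if cosine(index_vector, vector) >= threshold
    ]
    return group, sum(sales[name] for name in group)


if __name__ == "__main__":
    computer = (1.0, 0.1, 0.1)  # hypothetical embedding of an index word
    for threshold in (0.95, 0.90):
        print(threshold, estimate_market_size(computer, EMBEDDINGS, SALES, threshold))
```

Lowering the threshold admits more loosely related products into the group, which is how the abstract describes adjusting the level of the market category.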

A Literature Review on Media-Based Learning in Science (과학과 미디어 기반 학습 관련 문헌 연구)

  • Byun, Taejin
    • Journal of The Korean Association For Science Education / v.37 no.3 / pp.417-427 / 2017
  • Media are the means of imparting information beyond time and space; the term refers to the characters or images that serve to convey information. From old media such as newspapers and television to new media such as the internet and smartphones, media have developed cumulatively along with technology. The goal of media education is to develop an understanding of the properties of media and the ability to interpret media critically and accept them selectively. It further aims to cultivate the ability to express meaning creatively and to communicate through media. From July 2016 to December 2016, I carried out a research project on media-based Korean classroom instruction models with researchers in Korean language and social studies education. This study is a foundational study for that project. Based on 58 research papers published between 2006 and 2016, research trends and factors were extracted through a literature review of media-based science learning. The results show that studies on media-based science learning are on the rise and that more than half of the studies focused on elementary school students. The studies were divided into research on students, research on teachers and pre-service teachers, research on smart devices or media content, and research on the development of digital textbooks. Among these four categories, many studies addressed students' cognitive and affective development and the development and application of media content.