• Title/Summary/Keyword: Text Summarize

Search Result 47, Processing Time 0.032 seconds

CAD system for 3D modeling using engineering drawings (도면을 이용한 3D 모델링 CAD 시스템)

  • 이창조;김창헌;황종선
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 1995.04a
    • /
    • pp.891-895
    • /
    • 1995
  • This paper describes a solid modeling system based on a systematic description of techniques for analyzing and understanding on engineering drawings. Main stress is placed on clarifying the difference between the drawing understanding and the drawing recognition. The former, in which we feel major interest, is intrinsically a difficult problem because it inherent contains combinatorial search to require more than polynomial time. Actually, understanding drawings is regarded as a process to recover the information lost in projection 3-D objects to 2-D drawings. But, solid modeling by automatic understanding of the given drawings is one of the promising approach, which is described precisely in the text. Reviewing the studies performed so far, we summarize the future direction of the project and inevitable open problems left.

  • PDF

Multi-Document Summarization Method of Reviews Using Word Embedding Clustering (워드 임베딩 클러스터링을 활용한 리뷰 다중문서 요약기법)

  • Lee, Pil Won;Hwang, Yun Young;Choi, Jong Seok;Shin, Young Tae
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.11
    • /
    • pp.535-540
    • /
    • 2021
  • Multi-document refers to a document consisting of various topics, not a single topic, and a typical example is online reviews. There have been several attempts to summarize online reviews because of their vast amounts of information. However, collective summarization of reviews through existing summary models creates a problem of losing the various topics that make up the reviews. Therefore, in this paper, we present method to summarize the review with minimal loss of the topic. The proposed method classify reviews through processes such as preprocessing, importance evaluation, embedding substitution using BERT, and embedding clustering. Furthermore, the classified sentences generate the final summary using the trained Transformer summary model. The performance evaluation of the proposed model was compared by evaluating the existing summary model, seq2seq model, and the cosine similarity with the ROUGE score, and performed a high performance summary compared to the existing summary model.

Topic change monitoring study based on Blue House national petition using a control chart (관리도를 활용한 국민청원 토픽 모니터링 연구)

  • Lee, Heeyeon;Choi, Jieun;Lee, Sungim;Son, Won
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.5
    • /
    • pp.795-806
    • /
    • 2021
  • Recently, as text data through online channels have become vast, there is a growing interest in research that summarizes and analyzes them. One of the fundamental analyses of text data is to extract potential topics. Although the researcher may read all the data and summarize the contents one by one, it is not easy to deal with large amounts of data. Blei and Lafferty (2007) and Blei et al. (2003) proposed topic modeling methods for extracting topics using a statistical model. Since the text data is generally collected over time, it is worthwhile to monitor the topic's changes. In this study, we propose a topic index based on the results of the topic model. In addition, a control chart, a representative tool for statistical process management, is applied to monitor the topic index over time. As a practical example, we use text data collected from Blue House National Petition boards between March 5, 2018, and March 5, 2020.

Text Mining and Visualization of Unstructured Data Using Big Data Analytical Tool R (빅데이터 분석 도구 R을 이용한 비정형 데이터 텍스트 마이닝과 시각화)

  • Nam, Soo-Tai;Shin, Seong-Yoon;Jin, Chan-Yong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.9
    • /
    • pp.1199-1205
    • /
    • 2021
  • In the era of big data, not only structured data well organized in databases, but also the Internet, social network services, it is very important to effectively analyze unstructured big data such as web documents, e-mails, and social data generated in real time in mobile environment. Big data analysis is the process of creating new value by discovering meaningful new correlations, patterns, and trends in big data stored in data storage. We intend to summarize and visualize the analysis results through frequency analysis of unstructured article data using R language, a big data analysis tool. The data used in this study was analyzed for total 104 papers in the Mon-May 2021 among the journals of the Korea Institute of Information and Communication Engineering. In the final analysis results, the most frequently mentioned keyword was "Data", which ranked first 1,538 times. Therefore, based on the results of the analysis, the limitations of the study and theoretical implications are suggested.

A Study on the Modeling of Teaching Methods of Acting Using Brecht's Acting Tools - An Alternative to the Loss of Presence of Repetitive Representational Acting - (브레히트 연기실행도구를 이용한 연기교수법 모형 개발 연구 - 반복적 재현연기의 현존성 상실의 대안으로 -)

  • Lee, Ji-Eun
    • Journal of Korea Entertainment Industry Association
    • /
    • v.14 no.8
    • /
    • pp.103-116
    • /
    • 2020
  • This paper starts with the recognition of the problem of the need for a link between text-centered acting and body-centered acting. This study is focused on Brecht's theory of acting to overcome loss of presence by repetition which have been discussed many times by not only actors, but also acting educators. Brecht's acting theory has already been mentioned by many researchers as an alternative to conventional actor training. However, not many studies have been conducted on practical applicable methods. The purpose of this study is to provide the basis for the actual practice of Brecht acting and possibility that his acting theory can serve as a link between text and body-centered acting theory. As a research method, we first conduct theoretical considerations on the concepts and limitations of text-centered representational acting and body-centered post-drama acting. Then distinguish between text and body-centered acting tools among Brecht's epic theatre, to summarize the terms and concepts he uses and to identify the existing effects he reaches while acting. Finally, this paper proposes an teaching model that transforms and develops Brecht's acting theory through the writer's teaching experience. However, there are limitations in generalizing its effectiveness because this study is based on the writer's experience. We hope that further research will help the diversity of acting education by developing in-depth insights on Brecht acting theory and various models of acting teaching methods.

Changes of prevalence of food allergy in elementary school student and perception of it in school nutritionist in Korea, 1995~2015 (우리나라 초등학생의 식품알레르기 현황과 영양(교)사의 식품알레르기 인식 변화에 대한 고찰, 1995~2015)

  • Han, Sun-Mi;Heo, Young-Ran
    • Journal of Nutrition and Health
    • /
    • v.49 no.1
    • /
    • pp.8-17
    • /
    • 2016
  • Purpose: The aim of this study is to summarize and report on the change of food allergy in elementary school students and perception and practices in school nutritionists in Korea from 1995 to 2015. Methods: The search strategy was "(food allergy AND elementary school AND Korea) AND (nutritionist OR perception OR practice)". The search was conducted via KISS, DBPIA, RISS, NDSL, PubMed, Scopus, and Google scholar and full text and abstracts on the topic of food allergy evaluating prevalence, allergen, symptom, perception and practices were included in this review. Results: Out of 1379 records found in the sources, 13 related studies were included in the final analysis. The results showed that the number of students who had experienced food allergy was increasing. The two frequent allergenic foods were eggs and milk. The perception and practices of food allergy in school nutritionists was gradually increased. Conclusion: Further objective evaluations are required to confirm the food allergy status and its management in school.

Feature-Based Summarization Method for a Large Opinion Documents Collection (대용량 오피니언 문서에 대한 특성 기반 요약 기법)

  • Chang, Jae-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.1
    • /
    • pp.33-42
    • /
    • 2016
  • Recently, an environment in which public opinions are expressed about various areas is expanded around SNSs or internet potals, thus, opinion documents get bigger rapidly. Under these circumstances, it is essential to utilize automatic summarization techniques for understanding whole contents of large opinion documents. However, it is hard to summarize efficiently those documents with traditional text summarization technologies since the documents include subject expressions as well as features of targets objects. Proposed method in this paper defines features of opinion documents, and designed to retrieve representative sentences expressing opinions of those features. In addition, through experiments, we prove the usefulness of proposed method.

Research on Brand Value Dimensions of Employers: Based on Online Reviews by the Employees

  • XU, Meng
    • The Journal of Asian Finance, Economics and Business
    • /
    • v.9 no.10
    • /
    • pp.215-225
    • /
    • 2022
  • This study investigates employees' online reviews, conducts in-depth text topic mining, effectively summarizes the dimensions of employer brand value, and seeks effective ways to build employer brands from a multi-dimensional perspective. This study employs samples of employer reviews, filter keywords according to word frequency-inverse document frequency, builds a review network containing the same keywords, explore the community and summarize the theme dimensions. Simultaneously, it makes a dynamic comparison and analysis of the employer brand value dimension of different industries and enterprises. The study shows that the community exploration theme can be summarized into 11 dimensions of employer brand value, and the dimensions of employer brand value are significantly different across industries and among different enterprises within the industry. The attention to the employer brand value dimension has a significant time change. Various industries pay increasing attention to the dimension of work intensity and career development, while employers pay steady attention to the dimension of welfare benefits. The findings of this study suggest that seeking the heterogeneity of employer brand resources from the multi-dimensional differences and changes is an effective way to improve the competitiveness of enterprises in the human capital market.

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.125-148
    • /
    • 2018
  • Recently, as the demand for big data analysis increases, cases of analyzing unstructured data and using the results are also increasing. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. In addition, many analysts are interested in the amount of data is very large and relatively easy to collect compared to other unstructured and structured data. Among the various text analysis applications, document classification which classifies documents into predetermined categories, topic modeling which extracts major topics from a large number of documents, sentimental analysis or opinion mining that identifies emotions or opinions contained in texts, and Text Summarization which summarize the main contents from one document or several documents have been actively studied. Especially, the text summarization technique is actively applied in the business through the news summary service, the privacy policy summary service, ect. In addition, much research has been done in academia in accordance with the extraction approach which provides the main elements of the document selectively and the abstraction approach which extracts the elements of the document and composes new sentences by combining them. However, the technique of evaluating the quality of automatically summarized documents has not made much progress compared to the technique of automatic text summarization. Most of existing studies dealing with the quality evaluation of summarization were carried out manual summarization of document, using them as reference documents, and measuring the similarity between the automatic summary and reference document. Specifically, automatic summarization is performed through various techniques from full text, and comparison with reference document, which is an ideal summary document, is performed for measuring the quality of automatic summarization. Reference documents are provided in two major ways, the most common way is manual summarization, in which a person creates an ideal summary by hand. Since this method requires human intervention in the process of preparing the summary, it takes a lot of time and cost to write the summary, and there is a limitation that the evaluation result may be different depending on the subject of the summarizer. Therefore, in order to overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. On the other hand, as a representative attempt to overcome these limitations, a method has been recently devised to reduce the size of the full text and to measure the similarity of the reduced full text and the automatic summary. In this method, the more frequent term in the full text appears in the summary, the better the quality of the summary. However, since summarization essentially means minimizing a lot of content while minimizing content omissions, it is unreasonable to say that a "good summary" based on only frequency always means a "good summary" in its essential meaning. In order to overcome the limitations of this previous study of summarization evaluation, this study proposes an automatic quality evaluation for text summarization method based on the essential meaning of summarization. Specifically, the concept of succinctness is defined as an element indicating how few duplicated contents among the sentences of the summary, and completeness is defined as an element that indicating how few of the contents are not included in the summary. In this paper, we propose a method for automatic quality evaluation of text summarization based on the concepts of succinctness and completeness. In order to evaluate the practical applicability of the proposed methodology, 29,671 sentences were extracted from TripAdvisor 's hotel reviews, summarized the reviews by each hotel and presented the results of the experiments conducted on evaluation of the quality of summaries in accordance to the proposed methodology. It also provides a way to integrate the completeness and succinctness in the trade-off relationship into the F-Score, and propose a method to perform the optimal summarization by changing the threshold of the sentence similarity.

Topic Modeling of News Article about International Construction Market Using Latent Dirichlet Allocation (Latent Dirichlet Allocation 기법을 활용한 해외건설시장 뉴스기사의 토픽 모델링(Topic Modeling))

  • Moon, Seonghyeon;Chung, Sehwan;Chi, Seokho
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.38 no.4
    • /
    • pp.595-599
    • /
    • 2018
  • Sufficient understanding of oversea construction market status is crucial to get profitability in the international construction project. Plenty of researchers have been considering the news article as a fine data source for figuring out the market condition, since the data includes market information such as political, economic, and social issue. Since the text data exists in unstructured format with huge size, various text-mining techniques were studied to reduce the unnecessary manpower, time, and cost to summarize the data. However, there are some limitations to extract the needed information from the news article because of the existence of various topics in the data. This research is aimed to overcome the problems and contribute to summarization of market status by performing topic modeling with Latent Dirichlet Allocation. With assuming that 10 topics existed in the corpus, the topics included projects for user convenience (topic-2), private supports to solve poverty problems in Africa (topic-4), and so on. By grouping the topics in the news articles, the results could improve extracting useful information and summarizing the market status.