• Title/Summary/Keyword: Text data


Methodology for Classifying Hierarchical Data Using Autoencoder-based Deeply Supervised Network (오토인코더 기반 심층 지도 네트워크를 활용한 계층형 데이터 분류 방법론)

  • Kim, Younha;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.3 / pp.185-207 / 2022
  • With recent advances in deep learning, research applying deep learning algorithms to unstructured data such as text and images is being actively conducted. Text classification has long been studied in academia and industry, and various attempts have been made to exploit data characteristics to improve classification performance. In particular, the hierarchical relationship among labels has been used for hierarchical classification. However, the top-down approach commonly used for hierarchical classification has a limitation: a misclassification at a higher level blocks the opportunity for correct classification at a lower level. Therefore, this study proposes a methodology for classifying hierarchical data using an autoencoder-based deeply supervised network, in which high-level classification does not block low-level classification while the hierarchical relationship of labels is still considered. The proposed methodology attaches a main classifier that predicts the low-level label to the autoencoder's latent variable, and an auxiliary classifier that predicts the high-level label to the hidden layer of the autoencoder. In experiments on 22,512 academic papers, the proposed model showed higher classification accuracy and F1-score than the traditional supervised autoencoder and DNN models.
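
The training objective implied by this abstract can be sketched as a sum of three terms: a reconstruction loss for the autoencoder, a main classification loss on the low-level label, and an auxiliary classification loss on the high-level label. The sketch below is only an illustration in plain Python. The function names, the use of mean squared error and softmax cross-entropy, and the `aux_weight` parameter are assumptions, not details taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Softmax cross-entropy for a single example."""
    return -math.log(softmax(logits)[target_index])

def mse(x, x_hat):
    """Mean squared reconstruction error between input and decoder output."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def deeply_supervised_loss(x, x_hat, low_logits, low_label,
                           high_logits, high_label, aux_weight=0.5):
    """Combine the three loss terms (the weighting scheme is an assumption).

    low_logits:  output of the main classifier fed by the latent variable
    high_logits: output of the auxiliary classifier fed by a hidden layer
    """
    recon = mse(x, x_hat)
    main = cross_entropy(low_logits, low_label)
    aux = cross_entropy(high_logits, high_label)
    return recon + main + aux_weight * aux

if __name__ == "__main__":
    # Hypothetical values for one example: 3 low-level classes, 2 high-level classes.
    loss = deeply_supervised_loss(
        x=[0.2, 0.8], x_hat=[0.25, 0.7],
        low_logits=[1.0, 0.1, -0.5], low_label=0,
        high_logits=[0.3, -0.3], high_label=0,
    )
    print(loss)
```

Because both classifiers read from the shared encoder rather than from each other's output, an error on the high-level label only adds to the loss and does not gate the low-level prediction.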

Quantification of Schedule Delay Risk of Rain via Text Mining of a Construction Log (공사일지의 텍스트 마이닝을 통한 우천 공기지연 리스크 정량화)

  • Park, Jongho;Cho, Mingeon;Eom, Sae Ho;Park, Sun-Kyu
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.1 / pp.109-117 / 2023
  • Schedule delays are a major risk factor because they can adversely affect construction projects, for example through increased construction costs, claims from a client, and/or a decrease in construction quality when stages are trimmed to catch up on lost time. Risk management has been conducted according to the importance and priority of schedule delay risk, but quantification of the depth of schedule delay tends to be inadequate due to limitations in data collection. Therefore, this research used the BERT (Bidirectional Encoder Representations from Transformers) language model to convert the contents of a construction log, which consists of unstructured data, into WBS (Work Breakdown Structure)-based structured data, and to form a model for classifying and quantifying risk. The process was applied to eight highway construction sites, and 75 cases of rain schedule delay risk were obtained from 8 out of 39 detailed kinds of work. Through a K-S test, a significant probability distribution was derived for four kinds of work, and the risk impact was compared. The process presented in this study can be used to derive various schedule delay risks in construction projects and to quantify their depth.
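
The K-S test step mentioned above can be illustrated with a one-sample Kolmogorov-Smirnov statistic, which measures the largest gap between the empirical distribution of observed delays and a candidate distribution. The sketch below is a minimal pure-Python version. The exponential candidate distribution and the delay values are hypothetical and are not taken from the paper.

```python
import math

def ks_statistic(sample, cdf):
    """One-sample K-S statistic: max distance between the empirical CDF and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare against the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def exponential_cdf(rate):
    """Return the CDF of an exponential distribution with the given rate."""
    return lambda x: (1.0 - math.exp(-rate * x)) if x >= 0 else 0.0

if __name__ == "__main__":
    # Hypothetical rain-delay durations in days for one kind of work.
    delays = [0.5, 1.0, 1.5, 2.5, 4.0, 6.5]
    rate = len(delays) / sum(delays)  # maximum-likelihood rate estimate
    print(ks_statistic(delays, exponential_cdf(rate)))
```

A smaller statistic indicates a closer fit. Note that when the distribution parameters are estimated from the same sample, as in this usage example, the standard K-S critical values are only approximate.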

A study on Korean tourism trends using social big data -Focusing on sentiment analysis- (소셜 빅데이터를 활용한 한국관광 트렌드에 관한연구 -감성분석을 중심으로-)

  • Youn-hee Choi;Kyoung-mi Yoo
    • The Journal of the Convergence on Culture Technology / v.10 no.3 / pp.97-109 / 2024
  • In the field of domestic tourism, trend analysis of tourism consumers, both international and domestic tourists, is essential not only for the Korean tourism market but also for local and governmental tourism policy makers. We explore keywords and sentiment on social media in order to establish a marketing strategy plan and revitalize the domestic tourism industry through communication and information from tourism consumers. This study used TEXTOM 6.0 to analyze recent trends in Korean tourism. Data were collected from September 31, 2022, to August 31, 2023, using 'Korean tourism' and 'domestic tourism' as keywords, targeting blogs, cafes, and news provided by Naver, Daum, and Google. Through text mining, 100 keywords and their TF-IDF values were extracted in order of frequency, and then CONCOR analysis and sentiment analysis were conducted. Among the Korean tourism keywords, words related to tourist destinations, travel companions and behaviors, tourism motivations and experiences, accommodation types, tourist information, and emotional connections ranked high. The CONCOR analysis produced five clusters: tourist destinations, tourist information, tourist activities/experiences, tourism motivation/content, and inbound-related topics. Finally, the sentiment analysis showed a high proportion of positive documents and vocabulary. This study analyzes the rapidly changing trends of Korean tourism through text mining and is expected to provide meaningful data for promoting domestic tourism not only to Koreans but also to foreigners visiting Korea.
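
To make the TF-IDF step concrete, the sketch below computes a common textbook variant, term frequency times log(N / document frequency), over pre-tokenized documents. The exact formula used by TEXTOM is not stated in the abstract, so this weighting is an assumption, and the example tokens are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document.

    docs: list of token lists. Score = (count / doc length) * log(N / df),
    where df is the number of documents containing the term.
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: (c / len(doc)) * math.log(n / df[t])
                       for t, c in tf.items()})
    return scores

def top_keywords(docs, k):
    """Rank terms by their summed TF-IDF score across all documents."""
    totals = Counter()
    for doc_scores in tf_idf(docs):
        totals.update(doc_scores)
    return [term for term, _ in totals.most_common(k)]

if __name__ == "__main__":
    # Hypothetical tokenized posts.
    posts = [["jeju", "travel", "family"], ["seoul", "travel", "hotel"]]
    print(tf_idf(posts))
```

Under this weighting, a term that appears in every document scores zero, which is why frequency rankings and TF-IDF rankings of the same corpus can differ.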

A Statistical Study on Sikryo-chanryo by Applying Database (데이터베이스를 이용한 식료찬요(食療纂要)의 통계적 연구)

  • Lee, Byung Wook;Kim, Ki Wook;Hwang, Su-Jung
    • Culinary science and hospitality research / v.21 no.4 / pp.251-270 / 2015
  • This study aimed to systemize traditional knowledge indigenous to Korea on how to improve health through diet, and to make the best use of it statistically. For this purpose, the knowledge in the Sikryo-chanryo (食療纂要; also romanized as Siglyochan-yo), an old text on diet therapy peculiar to Korea, was compiled into a database and analyzed statistically. A relational data model was used for data processing, and nine data tables were used to express the diet therapy described in the Siglyochan-yo. The software used for data construction was Microsoft Access 2014. As a result, the Sikryo-chanryo database was set up in a PC interface. It provides information on disease treatment by food, on medicines and gourmet ingredients applicable to each kind of symptom, and on the names of diseases. With the relational data model, research conducted by the conventional method can be replaced by use of the database.
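
As a rough illustration of how a relational data model supports this kind of statistical lookup, the sketch below builds a three-table schema in SQLite and runs a join and a grouped count. The paper's nine tables and its Microsoft Access schema are not described in the abstract, so the table names, columns, and placeholder rows here are entirely hypothetical.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical disease/food/remedy schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE disease (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE food (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE remedy (
            disease_id INTEGER NOT NULL REFERENCES disease(id),
            food_id INTEGER NOT NULL REFERENCES food(id),
            PRIMARY KEY (disease_id, food_id)
        );
    """)
    # Placeholder rows only; these are not entries from the historical text.
    conn.executemany("INSERT INTO disease VALUES (?, ?)",
                     [(1, "symptom_X"), (2, "symptom_Y")])
    conn.executemany("INSERT INTO food VALUES (?, ?)",
                     [(1, "food_A"), (2, "food_B")])
    conn.executemany("INSERT INTO remedy VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 2)])
    conn.commit()
    return conn

def foods_for(conn, disease_name):
    """List the foods linked to a disease, in alphabetical order."""
    rows = conn.execute(
        """SELECT food.name FROM remedy
           JOIN disease ON disease.id = remedy.disease_id
           JOIN food ON food.id = remedy.food_id
           WHERE disease.name = ? ORDER BY food.name""",
        (disease_name,),
    ).fetchall()
    return [r[0] for r in rows]

def remedy_counts(conn):
    """Count how many diseases each food is linked to."""
    rows = conn.execute(
        """SELECT food.name, COUNT(*) FROM remedy
           JOIN food ON food.id = remedy.food_id
           GROUP BY food.name ORDER BY food.name"""
    ).fetchall()
    return dict(rows)

if __name__ == "__main__":
    db = build_demo_db()
    print(foods_for(db, "symptom_X"))
    print(remedy_counts(db))
```

The link table keeps the many-to-many relation between diseases and foods separate from the entities themselves, so frequency statistics reduce to a single grouped query.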

Development of data collection education programs for lower grades in elementary school students (초등학교 저학년을 위한 데이터 수집 교육 프로그램 개발)

  • Yi, Seul;Ma, Daisung
    • Proceedings of the Korean Association of Information Education Conference (한국정보교육학회:학술대회논문집) / 2021.08a / pp.275-281 / 2021
  • Much of our lives is closely related to artificial intelligence, and society is changing ever more rapidly. Reflecting this era, the need for artificial intelligence education has emerged and various learning methods have been proposed, but guidance on artificial intelligence teaching and learning activities for lower-grade elementary school students is insufficient. Therefore, in this study, a data collection education program for the lower grades of elementary school was developed based on the content standards of the Korea Foundation for the Advancement of Science & Creativity. Within the principles of artificial intelligence and the detailed data area of the utilization area, the program focuses on expressing numbers and letters in various ways, such as colors and pictures, and on finding various types of data in everyday life in order to learn the principles of artificial intelligence. Through this program, lower-grade elementary school students are expected to understand the importance of data collection in artificial intelligence through the process of learning about data and collecting sound, picture, and text data.


Multimodal Approach for Summarizing and Indexing News Video

  • Kim, Jae-Gon;Chang, Hyun-Sung;Kim, Young-Tae;Kang, Kyeong-Ok;Kim, Mun-Churl;Kim, Jin-Woong;Kim, Hyung-Myung
    • ETRI Journal / v.24 no.1 / pp.1-11 / 2002
  • A video summary abstracts the gist from an entire video and also enables efficient access to the desired content. In this paper, we propose a novel method for summarizing news video based on multimodal analysis of the content. The proposed method exploits the closed caption data to locate semantically meaningful highlights in a news video and speech signals in an audio stream to align the closed caption data with the video in a time-line. Then, the detected highlights are described using MPEG-7 Summarization Description Scheme, which allows efficient browsing of the content through such functionalities as multi-level abstracts and navigation guidance. Multimodal search and retrieval are also within the proposed framework. By indexing synchronized closed caption data, the video clips are searchable by inputting a text query. Intensive experiments with prototypical systems are presented to demonstrate the validity and reliability of the proposed method in real applications.
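
The idea of searching video clips through indexed, time-aligned closed captions can be sketched with a small inverted index that maps each caption word to the clips whose caption contains it. This is only an illustrative sketch. The data layout of (start, end, text) tuples, the function names, and the sample captions are assumptions and do not reproduce the paper's system.

```python
from collections import defaultdict

def build_caption_index(captions):
    """Build an inverted index from word to clip ids.

    captions: list of (start_seconds, end_seconds, text) tuples, assumed to be
    already aligned with the video timeline.
    """
    index = defaultdict(list)
    for clip_id, (_, _, text) in enumerate(captions):
        for word in set(text.lower().split()):
            index[word].append(clip_id)
    return index

def search(index, captions, query):
    """Return (start, end) intervals of clips whose caption contains every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return [(captions[i][0], captions[i][1]) for i in sorted(hits)]

if __name__ == "__main__":
    # Hypothetical caption segments.
    captions = [
        (0.0, 4.5, "election results announced tonight"),
        (4.5, 9.0, "weather forecast for tonight"),
        (9.0, 12.0, "sports highlights"),
    ]
    index = build_caption_index(captions)
    print(search(index, captions, "weather tonight"))
```

Returning time intervals rather than caption text lets a player seek directly to the matching clip.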


The Development of Forest Fire Statistical Management System using Web GIS Technology

  • Jo, Myung-Hee;Kim, Joon-Bum;Kim, Hyun-Sik;Jo, Yun-Won
    • Proceedings of the KSRS Conference / 2002.10a / pp.183-190 / 2002
  • In this paper, a forest fire statistical information management system is constructed in a web environment using web-based GIS (Geographic Information System) technology. Through this system, general users with a web browser can easily access forest fire statistical information and obtain it in visual forms such as maps, graphs, and text. Moreover, officials responsible for forest fires can easily control and manage all domestic information through the input interface, retrieval interface, and output interface. To implement this system, Microsoft IIS 5.0 is used as the web server, and Oracle 8i and ASP (Active Server Page) are used for database construction and dynamic web page operation, respectively. In addition, ESRI Arc IMS is used to serve map data, with Java and HTML as the system development languages. Through this system, general users can obtain all information related to forest fires visually and in real time, and can also become more aware of forest fire prevention. In addition, forest officials can manage domestic forest resources and control areas at risk of forest fire efficiently and scientifically by analyzing and retrieving large volumes of forest data. They can thus save the manpower, time, and cost needed to collect and manage data.


A Study of the Method for Building up 3D Right Objects

  • Lee, Woo-Jin;Suh, Yong-Cheol
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.33 no.6 / pp.527-536 / 2015
  • Recently, the demand for three-dimensional spatial information has been increasing continuously, and studies of indoor/outdoor spatial information and data construction in particular have been actively conducted. However, the use of spatial information has not spread widely to the private sector and remains mostly limited to government offices. Thus, to further vitalize the private sector, this study deals with the creation of three-dimensional right objects and techniques for expressing them, aiming to create and express three-dimensional right spaces in a particular system or open platform more conveniently. Unlike existing maps, in which plain text or an apartment building is simply iconified and displayed, this study proposes a method of extracting data from the outer border of a building at the relevant level based on an existing structured three-dimensional building, a method of providing two-dimensional right spatial objects in XML, and a method of expressing them efficiently as three-dimensional right objects. In addition, this study discusses a method of creating right objects in which an owner who has been provided with a cross section of a building directly takes part in the additional production or reproduction of detailed right objects, so that the three-dimensional data (right objects) produced through this study can be utilized.

The improvement of maritime data communication systems for e-Navigation (e-Navigation 대응 해상 데이터통신 시스템 개선)

  • Jung, Sung-Hun;Yang, Gyu-Sik;Jeong, Gi-Ryong;Park, Dong-Kook;Kim, Jeong-Chang
    • Journal of Advanced Navigation Technology / v.15 no.6 / pp.939-945 / 2011
  • We present a new scheme and implementation of maritime data communication systems for GMDSS ships performing e-Navigation. The scheme removes the functional limitations of existing systems by comparing the service fee, call-processing reliability, and bit rate of all systems within communication range at sea. Experiments with the proposed system confirmed the communication and application services available in each frequency band at sea: the MF/HF band is suited to a short text message service, the VHF band to a 9600 bps email service, and the Fleet Broadband maritime satellite system to a multimedia service of one Mbps or more.

Customized recommendation system through product review analysis (상품 리뷰 분석을 통한 사용자 맞춤형 추천 시스템)

  • Hwang, Doyeun;Bae, Sangjung;Kim, Changsoo;Jung, Heokyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.05a / pp.460-461 / 2018
  • Traditional recommendation systems are developed on the assumption that users behave independently, and they suffer from poor readability and efficiency because they simply sort products or lack a function for associating product attributes with a user's taste. To solve this problem, this study proposes a system that provides user-customized information by crawling product review data, processing it into meaningful information using text mining with R, and analyzing the unstructured review data together with users' purchase histories. This helps users make decisions by providing only the necessary information, without requiring them to analyze massive amounts of product review data.
