• Title/Summary/Keyword: information collection and extraction (정보 수집 및 추출)


Performance Comparison of Neural Network and Gradient Boosting Machine for Dropout Prediction of University Students

  • Hyeon Gyu Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.8 / pp.49-58 / 2023
  • Student dropout not only causes financial loss to the university but also has negative effects on individual students and on society. To address this issue, various studies have used machine learning to predict student dropout. This paper presents models implemented with a DNN (Deep Neural Network) and LGBM (Light Gradient Boosting Machine) to predict dropout among university students and compares their performance. Academic record and grade data collected from 20,050 students at A University, a small-to-medium-sized four-year university in Seoul, were used for training. Of the 140 attributes in the collected data, only those with a correlation coefficient of 0.1 or higher with the attribute indicating dropout were extracted and used for training. Experimental results showed F1-scores of 0.798 for the DNN and 0.826 for LGBM, indicating that LGBM provided 2.5% better prediction performance than the DNN.
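
The attribute-filtering step described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the column names and toy values are hypothetical, and comparing the absolute value of the Pearson coefficient against 0.1 is an assumption, since the abstract does not say how negative correlations were treated.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def select_features(columns, target, threshold=0.1):
    """Return names of columns whose |r| with the target is >= threshold.

    Using the absolute value is an assumption made for this sketch.
    """
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]

# Hypothetical toy data: 1 = dropped out, 0 = stayed enrolled.
dropout = [1, 0, 1, 0, 1, 0, 1, 0]
columns = {
    "gpa": [1.8, 3.6, 2.1, 3.9, 2.4, 3.2, 2.0, 3.5],
    "semesters_on_leave": [2, 0, 1, 0, 3, 1, 2, 0],
    "admission_round": [1, 1, 1, 1, 0, 0, 0, 0],
}
selected = select_features(columns, dropout)
print(selected)
```

In this toy data `admission_round` is uncorrelated with the dropout flag and is filtered out, while the other two columns are kept for training.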

Development of Very Short-term Rainfall-Runoff Forecast system Using Radar and Rainfall Numerical Weather Prediction Data (레이더 및 강우수치예보자료를 이용한 초단기강우-유출예측시스템 개발)

  • Park, Jin-Hyeog;Kang, Boo-Sik
    • Proceedings of the Korea Water Resources Association Conference / 2007.05a / pp.281-285 / 2007
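  • This study presents a technique for blending radar rainfall with numerical rainfall forecast data to produce more reliable and accurate quantitative rainfall forecasts. Using recent water-resources IT, including a radar preprocessing and forecasting system and Vflo, a physically based distributed model linked with GIS, it aims to lay the foundation for a distributed runoff forecasting system that can later be applied in real time, for very short-term quantitative rainfall-runoff forecasting in response to flash floods during the flood season. The study area is the Yongdam Dam basin in the Geum River region, for which a QPM that accounts for local high-resolution topographic effects has been developed, and the forecast rainfall was evaluated on three storm events that occurred since 2005. For hydrological application of precipitation estimated from weather radar data, basic GIS data such as a DEM, a land cover map, and a soil map were collected and compiled; 12 spatially distributed hydrological parameters for input to the physically based distributed model (Vflo) were extracted using the widely used GIS software ArcGIS and ArcView; and the operational applicability of the Vflo model was verified offline. In the model verification, improving the model's initial setup with GIS-derived physical characteristics such as topography, soil, and land cover appears to have yielded satisfactory results for peak discharge, runoff volume, and time-to-peak difference. When four types of distributed input rainfall derived from radar and numerical forecast data (QPE, JQPF, QPM, BQPF) were applied, JQPF, which uses a nowcasting technique, gave relatively good results for the first 1 hour 30 minutes owing to the nature of the data, but the quality of its forecast rainfall began to degrade at around 3 hours; BQPF, produced by blending in QPM, gave more reliable and better results. These results are expected to help dam operators secure sufficient lead time in future real-time flood runoff forecasting based on quantitative distributed rainfall forecasts, supporting stable and predictable flood control. Providing such varied short-term reservoir inflow forecasts should increase the usefulness of multipurpose dam reservoir operation models and, when used for actual inflow forecasting, contribute to more efficient short-term reservoir operation.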


A Study on Research Trends of Library Science and Information Science Through Analyzing Subject Headings of Doctoral Dissertations Recently Published in the U.S. (학위논문 분석을 통한 미국 도서관학 및 정보과학 최근 연구 동향에 관한 연구)

  • Kim, Hyunjung
    • Journal of the Korean Society for information Management / v.35 no.3 / pp.11-39 / 2018
  • The study examines research trends in doctoral dissertations in Library Science and Information Science published in the U.S. over the last 5 years. Data collected from PQDT Global include 1,016 doctoral dissertations with "Library Science" or "Information Science" as subject headings, and keywords extracted from those dissertations were used for a network analysis, which helps identify the intellectual structure of the dissertations. The analysis of 103 subject heading keywords produced various centrality measures, including triangle betweenness centrality and nearest neighbor centrality, as well as 26 clusters of associated subject headings. The most frequently studied subjects are computer-related, education-related, and communication-related. A cluster with information science as the most central subject contains most of the computer-related keywords, while a cluster with library science as the most central subject contains many of the education-related keywords. Other related subjects include various user groups for user studies, and subjects related to information systems such as management, economics, geography, and biomedical engineering.

DNN Model for Calculation of UV Index at The Location of User Using Solar Object Information and Sunlight Characteristics (태양객체 정보 및 태양광 특성을 이용하여 사용자 위치의 자외선 지수를 산출하는 DNN 모델)

  • Ga, Deog-hyun;Oh, Seung-Taek;Lim, Jae-Hyun
    • Journal of Internet Computing and Services / v.23 no.2 / pp.29-35 / 2022
  • UV rays have beneficial or harmful effects on the human body depending on the degree of exposure, so accurate UV information is needed for each individual to manage exposure properly. In Korea, the Korea Meteorological Administration provides UV information as one component of daily weather information. However, because it reports a regional ultraviolet index (UVI), it does not give an accurate UVI at the user's location. Some people operate measuring instruments to obtain an accurate UVI, but this is costly and inconvenient. Studies that estimate the UVI from environmental factors such as solar radiation and cloud cover have been introduced, but they also could not serve individual users. This paper therefore proposes a deep learning model that calculates the UVI from solar object information and sunlight characteristics to provide an accurate UVI at an individual's location. Factors considered highly correlated with the UVI, such as the position, size, and illuminance of the sun, were selected through analysis of sky images and solar characteristics data, and a data set for the DNN model was constructed. The resulting DNN model calculates the UVI by taking as input the solar object information and sunlight characteristics extracted through Mask R-CNN. In a performance evaluation on days with UVI above and below 8, chosen in consideration of the domestic UVI recommendation standards, the model calculated the UVI to within an MAE of 0.26 of the standard equipment.

A Study on Analysis of Topic Modeling using Customer Reviews based on Sharing Economy: Focusing on Sharing Parking (공유경제 기반의 고객리뷰를 이용한 토픽모델링 분석: 공유주차를 중심으로)

  • Lee, Taewon
    • Journal of Korea Society of Industrial Information Systems / v.25 no.3 / pp.39-51 / 2020
  • This study examines the social issues and consumer awareness surrounding shared parking using text mining. Topics were extracted by keyword and analyzed using TF-IDF (term frequency-inverse document frequency) and LDA (Latent Dirichlet Allocation). Categorization by topic showed that citizens' complaints concerning local government agreements, parking space negotiations, parking culture improvement, citizen participation, and so on played an important role in implementing shared parking services. The study differs from previous exploratory studies based on corporate and regional cases, which constitutes its academic contribution. In addition, the results of the LDA analysis make a practical contribution in that they can be applied in establishing sharing economy policy for revitalizing the local economy.
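
The TF-IDF weighting named in this abstract can be illustrated with a short sketch. The abstract does not state which TF-IDF variant was used, so the unsmoothed form tf × log(N / df) below is an assumption, and the tokenized reviews are hypothetical examples rather than the study's data.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized, non-empty document.

    Assumed variant: (count / document length) * log(N / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical, already-tokenized customer reviews.
reviews = [
    ["parking", "space", "cheap"],
    ["parking", "app", "slow"],
    ["parking", "space", "safe"],
]
weights = tf_idf(reviews)
# Highest-weighted term per review (ties resolve to the first term seen).
top_terms = [max(w, key=w.get) for w in weights]
print(top_terms)
```

A term that appears in every review, such as `parking` here, receives a weight of zero, which is why TF-IDF is useful for surfacing the words that distinguish one review from another before topic modeling.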

The study on Quantitative Analysis of Emotional Reaction Related with Step and Sound (스텝과 사운드의 정량적 감성반응 분석에 관한 연구)

  • Jeong, Jae-Wook
    • Archives of design research / v.18 no.2 s.60 / pp.211-218 / 2005
  • With the arrival of digital information equipment, the field of design needs a new paradigm such as 'function exists but form does not'. Design activity is therefore focused on the relationship between human and machine rather than on visual form. That relationship involves emotional factors, which has led to the study of a new field, the emotional interface. The goal of this paper is to suggest an approach to emotional interfaces for searching multimedia data. The paper's main targets are sound effects and human steps, and its main method is visualization after numerically measuring and analyzing the level of similarity among emotion words. The paper presents the theoretical background, including the author's view, the characteristics of auditory information and human steps, and case studies of emotion research. The experimental content on sound is drawn from the author's previous research, and the main experimental content on human steps is built from a regression expression obtained by applying Quantification Method I to the stimulus values. A realistic prototype applying the research results will be proposed in subsequent research, after the search environment has been studied.


Multi-Modal Wearable Sensor Integration for Daily Activity Pattern Analysis with Gated Multi-Modal Neural Networks (Gated Multi-Modal Neural Networks를 이용한 다중 웨어러블 센서 결합 방법 및 일상 행동 패턴 분석)

  • On, Kyoung-Woon;Kim, Eun-Sol;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.23 no.2 / pp.104-109 / 2017
  • We propose a new machine learning algorithm that analyzes users' daily activity patterns from multi-modal wearable sensor data. The proposed model learns and extracts activity patterns from wearable device input in real time. Inspired by cue integration in humans, we constructed gated multi-modal neural networks that selectively integrate wearable sensor input using gate modules. For the experiments, sensor data were collected with multiple wearable devices in restaurant situations. We first show that the proposed model performs well in terms of prediction accuracy. We then explain how a knowledge schema could be constructed automatically by analyzing the activation patterns in the middle layer of the model.

Construction of Case-based System for the Cause Diagnosis of an Electrical Fires (전기화재 원인진단을 위한 사례기반 시스템 구축)

  • Lee, Jong-Ho;Kim, Doo-Hyun;Kim, Sung-Chul
    • Fire Science and Engineering / v.21 no.2 s.66 / pp.42-47 / 2007
  • This paper presents the development of a case-based system for diagnosing the causes of electrical fires using an entity-relationship database. The relational database, which provides a very simple but powerful way of representing data, is widely used. The system, which focuses on database construction and cause diagnosis, can diagnose the causes of electrical fires easily and efficiently. To store and access information on electrical fires, key index items that uniquely identify each electrical fire were derived. The case base consists of cases containing information from past fires. The system presents the cause of a newly occurred fire by searching the case base for a reasonable match. It offers not only search functions over multiple attributes using the various collected information (such as fire evidence, structure, and weather at the fire scene) but also improved diagnosis functions that are easy to use for electrical fire cause diagnosis.
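
The idea of diagnosing a new fire by searching past cases over multiple attributes can be sketched as below. This is an illustrative assumption, not the system described in the paper: it uses an in-memory list rather than a database, a simple exact-match score, and hypothetical attribute values and causes.

```python
def similarity(case_attributes, query):
    """Fraction of the query's attributes that the stored case matches exactly."""
    matches = sum(1 for key, value in query.items()
                  if case_attributes.get(key) == value)
    return matches / len(query)

def diagnose(case_base, query):
    """Return (cause, score) of the stored case most similar to the query."""
    best = max(case_base, key=lambda c: similarity(c["attributes"], query))
    return best["cause"], similarity(best["attributes"], query)

# Hypothetical case base; attribute names, values, and causes are illustrative only.
case_base = [
    {"attributes": {"evidence": "molten beads", "structure": "wooden",
                    "weather": "dry"},
     "cause": "short circuit"},
    {"attributes": {"evidence": "carbonized insulation", "structure": "concrete",
                    "weather": "humid"},
     "cause": "tracking"},
    {"attributes": {"evidence": "discolored terminal", "structure": "concrete",
                    "weather": "dry"},
     "cause": "poor contact"},
]

# A newly occurred fire to be diagnosed (hypothetical).
new_fire = {"evidence": "carbonized insulation", "structure": "concrete",
            "weather": "rainy"}
cause, score = diagnose(case_base, new_fire)
print(cause, round(score, 2))
```

A production system would more plausibly weight the attributes and run the search as a database query; the equal-weight exact match here only shows the retrieval principle.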

Heavy Snowfall Disaster Response using Multiple Satellite Imagery Information (다중 위성정보를 활용한 폭설재난 대응)

  • Kim, Seong Sam;Choi, Jae Won;Goo, Sin Hoi;Park, Young Jin
    • Journal of Korean Society for Geospatial Information Science / v.20 no.4 / pp.135-143 / 2012
  • Remote sensing, which repeatedly observes the whole Earth, and GIS-based decision-making technology have been widely used in disaster management, including early warning and monitoring, damage investigation, emergency rescue and response, and rapid recovery. In addition, various national-level measures for collecting timely satellite imagery in emergencies have been considered, such as operating a satellite carrying multiple onboard sensors and sharing satellite imagery in practice through collaboration with the world's space agencies. To respond to the heavy snowfall disaster that occurred on the east coast of the Korean Peninsula in February 2011, this study detected and analyzed snow-covered regions using the NDSI (Normalized Difference Snow Index), which considers reflectance by wavelength for the MODIS sensor, and a change detection algorithm applied to satellite imagery collected through the International Charter. We present the application case of the National Disaster Management Institute (NDMI), which supported timely decision-making through GIS spatial analysis with various spatial data and a snow cover map.
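
The NDSI mentioned in this abstract is conventionally computed as (green − SWIR) / (green + SWIR), which for MODIS is commonly taken from bands 4 and 6. The sketch below illustrates that calculation on a tiny grid. The abstract gives neither the bands nor a cutoff, so the 0.4 threshold (a commonly used value) and the reflectance numbers are assumptions for illustration only.

```python
def ndsi(green, swir):
    """Normalized Difference Snow Index from green and shortwave-infrared reflectance."""
    total = green + swir
    if total == 0:
        return 0.0  # avoid division by zero on no-data pixels
    return (green - swir) / total

def snow_mask(green_band, swir_band, threshold=0.4):
    """Flag pixels whose NDSI meets the threshold (0.4 is an assumed cutoff)."""
    return [[ndsi(g, s) >= threshold for g, s in zip(g_row, s_row)]
            for g_row, s_row in zip(green_band, swir_band)]

# Hypothetical 2x2 reflectance grids (values in 0..1).
green = [[0.80, 0.30], [0.75, 0.10]]
swir = [[0.10, 0.25], [0.15, 0.08]]
mask = snow_mask(green, swir)
print(mask)
```

Snow reflects strongly in the green band and weakly in the shortwave infrared, so snow-covered pixels produce a high index; comparing such masks between dates is one simple form of change detection.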

The Effectiveness of High-level Text Features in SOM-based Web Image Clustering (SOM 기반 웹 이미지 분류에서 고수준 텍스트 특징들의 효과)

  • Cho Soo-Sun
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.121-126 / 2006
  • In this paper, we propose an approach that improves the clustering of Web images by using high-level semantic features from text associated with the images together with low-level visual features of the images themselves. These high-level text features can be obtained from image URLs and file names, page titles, hyperlinks, and surrounding text. The self-organizing map (SOM) proposed by Kohonen is used as the clustering engine. In SOM-based clustering using both high-level text features and low-level visual features, 200 images from 10 categories were divided effectively into suitable clusters. To evaluate clustering power, we propose simple but novel measures that indicate the degree to which images from the same category are scattered and the degree to which they accumulate. The experimental results show that the high-level text features are more useful in SOM-based Web image clustering.