• Title/Summary/Keyword: data shortage problem (데이터 부족 문제)

Search Result 545, Processing Time 0.03 seconds

Prediction of Key Variables Affecting NBA Playoffs Advancement: Focusing on 3 Points and Turnover Features (미국 프로농구(NBA)의 플레이오프 진출에 영향을 미치는 주요 변수 예측: 3점과 턴오버 속성을 중심으로)

  • An, Sehwan;Kim, Youngmin
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.263-286 / 2022
  • This study collects 32 years of NBA statistics, from 1990 to 2022, by web crawling, examines the variables of interest through exploratory data analysis, and generates related derived variables. Unused variables were removed in a data-cleaning step, and correlation analysis, t-tests, and ANOVA were performed on the remaining variables. For each variable of interest, the difference in means between teams that advanced to the playoffs and teams that did not was tested; to complement this, the mean difference among three groups (upper/middle/lower) based on ranking was then checked. Only the current season's data was held out as a test set, and 5-fold cross-validation was performed by splitting the remaining data into training and validation sets for model training. Overfitting was addressed by comparing the cross-validation results with the final results on the test set and confirming that the performance metrics did not differ. Because the raw data is of high quality and the statistical assumptions are satisfied, most of the models performed well despite the small data set. This study not only predicts NBA game results or classifies playoff advancement using machine learning, but also examines, through the importance of the input attributes, whether the variables of interest are among the most important variables. Visualizing SHAP values made it possible to overcome the limits of interpreting feature importance alone and to compensate for the inconsistency of importance calculations when variables are added or removed. A number of variables related to three-pointers and turnovers, the subjects of interest in this study, were found to be among the major variables affecting playoff advancement in the NBA.
This study resembles existing sports data analysis research in that it covers topics such as match results, playoffs, and championship predictions and compares several machine learning models. It differs in that the features of interest are set in advance and statistically verified, and then compared with the machine learning results. It is also differentiated from existing studies by presenting explanatory visualizations using SHAP, one of the XAI techniques.

A Study on Image-Based Mobile Robot Driving on Ship Deck (선박 갑판에서 이미지 기반 이동로봇 주행에 관한 연구)

  • Seon-Deok Kim;Kyung-Min Park;Seung-Yeol Wang
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.7 / pp.1216-1221 / 2022
  • Ships are becoming larger to increase the efficiency of cargo transportation. Larger ships mean longer travel times on board for ship workers, higher work intensity, and lower work efficiency. Problems such as higher work intensity, together with the younger generation's avoidance of high-intensity labor, are reducing the influx of young people into the workforce. In addition, the rapid aging of the population and the shrinking young labor force aggravate the labor shortage in the maritime industry. To overcome this, the maritime industry has recently introduced technologies such as an intelligent production design platform and a smart production operation management system; a smart autonomous logistics system is one of these technologies. The smart autonomous logistics system delivers various goods using intelligent mobile robots and enables a robot to drive itself using sensors such as lidar and cameras. This paper therefore checked whether a mobile robot could drive autonomously to a stop sign by detecting the passageway on a ship deck. Autonomous driving was performed by detecting the ship deck passageway through a camera mounted on the mobile robot, based on data learned through Nvidia's end-to-end learning. The mobile robot was stopped by recognizing the stop sign using SSD MobileNetV2. The experiment, in which the mobile robot drives autonomously to the stop sign over a distance of about 70 m without leaving the ship deck passageway, was repeated five times. The experiment confirmed that the mobile robot drove without leaving the passageway. If a smart autonomous logistics system applying this result is used in the marine industry, it is expected to improve stability, reduce the labor required, and improve work efficiency for workers.

Forecasting Hourly Demand of City Gas in Korea (국내 도시가스의 시간대별 수요 예측)

  • Han, Jung-Hee;Lee, Geun-Cheol
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.2 / pp.87-95 / 2016
  • This study examined the characteristics of the hourly demand for city gas in Korea and proposed multiple regression models to obtain precise estimates of that demand. Forecasting the hourly demand for city gas accurately is essential in terms of safety and cost. If demand is underestimated, the pipeline pressure must be increased sharply to meet it, which raises safety concerns. In the opposite case, unnecessary inventory and operation costs are incurred. Data analysis showed that the hourly demand for city gas has a very high autocorrelation and that the 24-hour demand pattern of a day follows the 24-hour demand pattern of the same day of the previous week; that is, there is a weekly cycle. In addition, some conditions under which temperature affects the hourly demand level were found: the absolute value of the correlation coefficient between the hourly demand and temperature is about 0.853 on average, while the absolute value of the correlation coefficient on a specific day improves to 0.861 at worst and 0.965 at best. Based on this analysis, this paper proposes a multiple regression model incorporating the hourly demand 24 hours earlier and the hourly demand 168 hours earlier, and another multiple regression model with temperature as an additional independent variable. To show the performance of the proposed models, computational experiments were carried out using real data on domestic city gas demand from 2009 to 2013. The test results showed that the first regression model achieves a MAPE (Mean Absolute Percentage Error) of around 4.5% over the five years from 2009 to 2013, while the second regression model achieves a MAPE of 5.13% for the same period.
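
The first model described above, a multiple regression on the demand 24 hours and 168 hours earlier, can be sketched roughly as follows. This is a minimal illustration only: the demand series is synthetic, and the helper names (`synthetic_demand`, `design`, `fit_ols`, `mape`) are hypothetical and not taken from the paper. It fits ordinary least squares with the standard library and reports MAPE on a held-out final week.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least squares via the normal equations (X^T X) beta = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def synthetic_demand(hours, seed=0):
    """Assumed toy series: daily sine cycle, lower weekend level, Gaussian noise."""
    rng = random.Random(seed)
    series = []
    for t in range(hours):
        daily = 30.0 * math.sin(2 * math.pi * (t % 24) / 24)
        weekend = -15.0 if (t // 24) % 7 >= 5 else 0.0
        series.append(100.0 + daily + weekend + rng.gauss(0.0, 2.0))
    return series


def design(series, start, stop):
    """Rows of [intercept, demand 24 h earlier, demand 168 h earlier]."""
    rows = [[1.0, series[t - 24], series[t - 168]] for t in range(start, stop)]
    return rows, series[start:stop]


demand = synthetic_demand(24 * 7 * 6)          # six weeks of hourly values
split = len(demand) - 168                      # hold out the final week
train_x, train_y = design(demand, 168, split)
test_x, test_y = design(demand, split, len(demand))
beta = fit_ols(train_x, train_y)
forecast = [sum(b * x for b, x in zip(beta, row)) for row in test_x]
print(f"coefficients: {beta}")
print(f"hold-out MAPE: {mape(test_y, forecast):.2f}%")
```

The second model in the abstract would add a temperature column to each row of `design`; the figures printed here come from synthetic data and are unrelated to the MAPE values reported in the paper.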

Development of an Eye Patch-Type Biosignal Measuring Device to Measure Sleep Quality (수면의 질을 측정하기 위한 안대형 생체신호 측정기기 개발)

  • Changsun Ahn;Jaekwan Lim;Bongsu Jung;Youngjoo Kim
    • KIPS Transactions on Computer and Communication Systems / v.12 no.5 / pp.171-180 / 2023
  • The three major sleep disorders in Korea are snoring, sleep apnea, and insomnia. Lack of sleep is the root of all diseases. Some of the most serious potential problems associated with sleep deprivation are cardiovascular problems, cognitive impairment, obesity, diabetes, colitis, and prostate cancer. To address these problems, the Korean government began providing low-cost national health insurance benefits for polysomnography tests in July 2018. However, insomnia patients still face barriers to treatment in terms of time, space, and cost. It would therefore be better for insomnia patients to be able to test at home. The measuring device can measure six biosignals (eye movement, tossing and turning, body temperature, oxygen saturation, heart rate, and audio). A gyroscope sensor (MPU9250, InvenSense, USA) was used for eye movement and for tossing and turning. The input range of the sensor was 258°/sec to 460°/sec, and the data range was within the input range. Body temperature, oxygen saturation, and heart rate were measured by a sensor (MAX30102, Analog Devices, USA). The body temperature was measured from 30 ℃ to 45 ℃, and the oxygen saturation range was 0% for the unused state and 20% to 90% for the used state. The heart rate measurement range was 40 bpm to 180 bpm. The audio signal was measured by an audio sensor (AMM2742-T-R, PUIaudio, USA). The sensitivity was -42 dB ±1 dB, and the frequency range was 20 Hz to 20 kHz. The measured data was successfully received under wireless network conditions. The system consisted of a PC and a mobile app for biosignal measurement and data collection. The measured data was collected by mobile phones and desktops. The collected data can be used as preliminary data to determine the stage of sleep and to perform screening for sleep induction and sleep disturbances. In the future, this convenient sleep measurement device could be beneficial for treating insomnia.

Study on 3D Printer Suitable for Character Merchandise Production Training (캐릭터 상품 제작 교육에 적합한 3D프린터 연구)

  • Kwon, Dong-Hyun
    • Cartoon and Animation Studies / s.41 / pp.455-486 / 2015
  • 3D printing technology, which began with a patent registration in 1986, attracted little attention outside a few companies because of the lack of awareness at the time. Today, however, as patents expire after 20 years, the price of 3D printers has fallen to a level that individuals can afford, and the technology is attracting attention from industry as well as the general public, who now readily accept 3D and share 3D data thanks to widespread online information exchange and improved computer performance. The production capability of 3D printers, which is based on digital data that can be transmitted, revised, and supplemented digitally and requires no molds, may bring a groundbreaking change to manufacturing processes, and may have the same effect in the character merchandise sector. 3D printers are becoming a necessity in the production of figure merchandise, which is at the forefront of the kidult culture that has recently been gaining attention. Considering the expected demand from industrial sites related to character merchandise and the lower prices resulting from patent expiration and technology sharing, it seems essential to introduce courses that train people to use 3D printers, in order to expand employment opportunities and sectors and to cultivate a workforce capable of more creative work. However, there are limits to the information available when seeking to introduce 3D printers in school education.
The press and information media mention only general information, such as the growth of the industry or the promising future value of 3D printers, and academic research likewise remains at an introductory level, such as analyzing data on industry size, analyzing the scope of industrial application, or introducing the printing technology. This lack of information gives rise to problems at the education site. If the technology is introduced without examining practical information, such as a comparison of strengths and weaknesses, it can be used only after trial and error, which inevitably incurs time and opportunity costs. In particular, if expensive equipment turns out not to suit the features of school education, the losses would be significant. This research targets general users without a technical background rather than specialists. Rather than merely introducing 3D printer technology as earlier work has done, it compares the strengths and weaknesses of the representative technologies and analyzes the problems and points of caution in their use. It seeks to explain which features a 3D printer should have, in particular when used in education on figure merchandise development as elective cultural content in cartoon-related departments, and to provide information of practical help for future education using 3D printers. In the main body, the technologies are explained using a classification from a new perspective: the buttress method, types of materials, two-dimensional printing method, and three-dimensional printing method. This classification was chosen to make it easy to compare the practical problems that arise in use.
In conclusion, an FDM printer was selected as the most suitable 3D printer: it is comparatively cheap and has low repair, maintenance, and material costs, although the quality of its output is somewhat lacking. In addition, it is recommended to choose a vendor that provides good technical support.

Effective Maintenance of Urban Facilities via Smart Phones (스마트폰을 활용한 도시시설물 유지관리 효율화 방안)

  • Roh, Su-Sung;Sohn, Sae-Hyung;Kim, Do-Nyun
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.1 / pp.513-522 / 2017
  • Rapid urbanization is being portrayed around the world as causing difficulty in managing public hygiene and maintaining public order. Common problems in urban areas, including traffic congestion and the deterioration of the ecological environment, are also being exacerbated. Despite the efforts of the central and local governments to manage urban facilities, urban residents face various difficulties because the number of facilities to be managed exceeds the capacity of the management personnel and an adequate management system is lacking. The aim of this study is to propose a management method using smart phones to improve the maintenance efficiency of urban facilities. First, one-stop maintenance of urban facilities, including facility information management, maintenance management, and history management, is made possible by using smart phones to collect and transfer pictures and locations of the urban facilities. Second, maintenance management is based on the location and picture information of the subject, not the location of the smart phone, which enables the facility to be understood and acted on promptly. This method is especially effective because the smart phone application sends the facility status information directly to the maintenance personnel. Third, all of the information and figures relating to the facilities are managed using a database, which makes history management and data utilization easy. Fourth, all urban residents have access to this information via smart phone applications, so the role of the facility maintenance personnel can be expanded without any additional investment in infrastructure. Lastly, the location-based information enables the management of roads, trees, and trails.

Pivot Discrimination Approach for Paraphrase Extraction from Bilingual Corpus (이중 언어 기반 패러프레이즈 추출을 위한 피봇 차별화 방법)

  • Park, Esther;Lee, Hyoung-Gyu;Kim, Min-Jeong;Rim, Hae-Chang
    • Korean Journal of Cognitive Science / v.22 no.1 / pp.57-78 / 2011
  • Paraphrasing is the act of writing a text using other words without altering the meaning. Paraphrases can be used in many fields of natural language processing. In particular, paraphrases can be incorporated in machine translation to improve the coverage and quality of translation. Recent approaches to paraphrase extraction utilize bilingual parallel corpora, which consist of aligned sentence pairs. In these approaches, paraphrases are identified from the word alignment result by pivot phrases, which are phrases in one language to which two or more phrases in the other language are connected. However, word alignment is itself a very difficult task, so there can be many alignment errors, and these errors can lead to the selection of incorrect pivot phrases. In this study, we propose a paraphrase extraction method that discriminates good pivot phrases from bad ones. Each pivot phrase is weighted according to its reliability, which is scored by considering lexical and part-of-speech information. The experimental results show that the proposed method achieves higher precision and recall in paraphrase extraction than the baseline. We also show that the extracted paraphrases can increase the coverage of Korean-English machine translation.
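
The idea of weighting pivots can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: it uses a common pivot formulation in which the score of a paraphrase pair sums, over shared pivots, the probability of the pivot given the first phrase times the probability of the second phrase given the pivot, and multiplies each term by a reliability weight. The alignment counts, the pivot identifiers `f1` and `f2`, and the weights are all made-up placeholders; the paper derives reliability from lexical and part-of-speech information, which is not modeled here.

```python
from collections import defaultdict


def paraphrase_scores(aligned_pairs, pivot_weight):
    """Score paraphrase pairs through weighted pivot phrases.

    aligned_pairs: iterable of (phrase, pivot, count) from word alignment.
    pivot_weight:  function mapping a pivot phrase to a reliability weight.
    """
    count = defaultdict(float)
    phrase_total = defaultdict(float)
    pivot_total = defaultdict(float)
    for phrase, pivot, c in aligned_pairs:
        count[(phrase, pivot)] += c
        phrase_total[phrase] += c
        pivot_total[pivot] += c

    # Group the phrases that share each pivot.
    by_pivot = defaultdict(list)
    for phrase, pivot in count:
        by_pivot[pivot].append(phrase)

    scores = defaultdict(float)
    for pivot, phrases in by_pivot.items():
        weight = pivot_weight(pivot)
        for e1 in phrases:
            p_pivot_given_e1 = count[(e1, pivot)] / phrase_total[e1]
            for e2 in phrases:
                if e1 == e2:
                    continue
                p_e2_given_pivot = count[(e2, pivot)] / pivot_total[pivot]
                scores[(e1, e2)] += weight * p_pivot_given_e1 * p_e2_given_pivot
    return dict(scores)


# Hypothetical alignment counts; "f2" stands for a pivot produced by an alignment error.
pairs = [
    ("under control", "f1", 3),
    ("in check", "f1", 1),
    ("under control", "f2", 1),
    ("in the bill", "f2", 1),
]
reliability = {"f1": 1.0, "f2": 0.1}  # assumed weights, not computed from real features
scores = paraphrase_scores(pairs, lambda pivot: reliability.get(pivot, 1.0))
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 4))
```

With a uniform weight of 1.0 the spurious pair reached through `f2` would score ten times higher than it does here, so down-weighting an unreliable pivot pushes its candidates further down the ranking.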

Evaluation on real-time multi-point sensing performance of IoT-based hybrid measurement system (IoT 기반 하이브리드 계측시스템 실시간 다점 측정 성능 평가)

  • Kim, Heonyoung;Kang, Donghoon
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.4 / pp.543-550 / 2018
  • The rapid growth of IoT technology induced by the fourth industrial revolution has resulted in research into various types of wireless sensors, and applications based on this technology are prevalent in many areas. However, among the various sites where this technology is used, railway bridges and tunnels with lengths of tens of kilometers have problems with data acquisition, due to the signal noise induced by the long distance measurement and EMI induced by the high voltage power feeding system, when conventional electric sensors are used. To overcome these problems, many studies on fiber optic sensors have been conducted as a substitute for the conventional electric sensors. However, restrictions on the types of fiber optic sensors have limited their application in railways. For this reason, a hybrid measurement system with IoT based wireless data communication, in which both electric and fiber optic sensors can be applied simultaneously, has been developed. In this study, in order to evaluate the applicability of the hybrid measurement system developed in the previous study, a real-time test for 4 types of measurement environments, which reflect possible railway sites, is performed. As a result, it was confirmed that the signals from both the electric and fiber optic sensors, which were acquired at a remote area in real-time, showed good agreement with each other and that this measurement system has the potential to handle sensors with a sampling rate of 2.5 kHz. In the future, it is expected that the IoT-based hybrid measurement system will contribute to the improvement of structural safety by enabling real-time structural health monitoring when applied to various measurement sites.

Adaptive Mapping Information Management Scheme for High Performance Large Scale Flash Memory Storages (고성능 대용량 플래시 메모리 저장장치의 효과적인 매핑정보 캐싱을 위한 적응적 매핑정보 관리기법)

  • Lee, Yongju;Kim, Hyunwoo;Kim, Huijeong;Huh, Taeyeong;Jung, Sanghyuk;Song, Yong Ho
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.3 / pp.78-87 / 2013
  • NAND flash memory has been widely used as a storage medium in mobile devices, PCs, and workstations because of advantages over hard disk drives such as low power consumption, high performance, and random accessibility. However, NAND flash cannot support in-place updates, so the entire block must be erased before the corresponding page can be overwritten. To overcome this drawback, flash storage needs software support called the Flash Translation Layer. However, as high-performance, large-capacity NAND flash memory becomes widely used, the size of the mapping tables is growing beyond the limited DRAM size. In this paper, we propose an adaptive mapping information caching algorithm based on page mapping to solve this DRAM space shortage problem. Our algorithm uses a mapping information caching scheme that minimizes the flash memory access frequency, based on the analysis of several workloads. The experimental results show that the proposed algorithm can increase performance by up to 70% compared with the previous mapping information caching algorithm.
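
To make the DRAM shortage concrete, the sketch below caches page-mapping entries in a small table and counts how often a lookup must go to flash. It is a plain LRU cache and only a baseline illustration: the class name `MappingCache`, the toy mapping table, and the access trace are assumptions, and the adaptive policy proposed in the paper is not reproduced. Writes and dirty entries are ignored.

```python
from collections import OrderedDict


class MappingCache:
    """LRU cache of logical-to-physical page mappings held in limited DRAM."""

    def __init__(self, capacity, flash_table):
        self.capacity = capacity          # number of entries that fit in DRAM
        self.flash_table = flash_table    # full mapping table, assumed stored on flash
        self.cache = OrderedDict()
        self.flash_reads = 0              # extra flash accesses caused by cache misses

    def translate(self, lpn):
        """Return the physical page number for a logical page number."""
        if lpn in self.cache:
            self.cache.move_to_end(lpn)   # mark as most recently used
            return self.cache[lpn]
        self.flash_reads += 1             # miss: fetch the entry from flash
        ppn = self.flash_table[lpn]
        self.cache[lpn] = ppn
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return ppn


# Toy mapping table and access trace (hypothetical values).
flash_table = {lpn: 1000 + lpn for lpn in range(8)}
cache = MappingCache(capacity=2, flash_table=flash_table)
trace = [0, 1, 0, 2, 1, 0]
translated = [cache.translate(lpn) for lpn in trace]
print(f"flash reads for mapping lookups: {cache.flash_reads} of {len(trace)} accesses")
```

A caching scheme aimed at minimizing flash access frequency would try to lower the miss count on the same trace, for example by choosing what to keep based on observed workload patterns.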

Characteristic analysis for moving in and moving out of departments - Focused on the D university example - (학과 간 전출과 전입의 특성분석 - D대학교의 사례를 중심으로-)

  • Choi, Seungbae
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.105-115 / 2013
  • Universities in South Korea must adapt as the number of incoming students decreases because of the country's declining population. The Ministry of Education, Science and Technology is enforcing university restructuring by evaluating all universities in Korea using indices such as the employment rate and the supplement rate of students. Most universities in Korea broadly permit changes of major as one measure to improve the 'supplement rate of students'. These changes of major (moving in and moving out) can make a university difficult to manage, because some departments may be left with few students as students move out of low-level departments into high-level ones. Moreover, raising the rate of major changes causes no loss from the university's point of view, but it can put a department in a difficult situation. The purpose of this study is to identify the characteristics of major changes at D university using general statistical analysis and graphs produced by social network analysis. The results are as follows: students move (a) by category, from engineering to humanities and social sciences, (b) by entrance level, from low to high, and (c) by employment rate, from low to high as well.