• Title/Summary/Keyword: science big data (과학 빅데이터)

Search Result 519, Processing Time 0.023 seconds

Preliminary Inspection Prediction Model to select the on-Site Inspected Foreign Food Facility using Multiple Correspondence Analysis (차원축소를 활용한 해외제조업체 대상 사전점검 예측 모형에 관한 연구)

  • Hae Jin Park;Jae Suk Choi;Sang Goo Cho
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.121-142 / 2023
  • As the number and weight of imported food shipments steadily increase, safety management of imported food to prevent food safety accidents is becoming more important. The Ministry of Food and Drug Safety conducts on-site inspections of foreign food facilities before customs clearance as well as import inspections at the customs clearance stage. However, because time, cost, and resources are limited, a data-based safety management plan for imported food is needed. In this study, we sought to increase the efficiency of on-site inspection by building a machine learning prediction model that pre-selects the facilities expected to fail before the on-site inspection takes place. We collected basic information on 303,272 foreign food facilities and processing businesses from the Integrated Food Safety Information Network, together with 1,689 on-site inspection records from 2019 to April 2022. After preprocessing the foreign food facility data, only the records subject to on-site inspection were extracted using the foreign food facility_code, yielding a dataset of 1,689 records and 103 variables. Of the 103 variables, those scoring '0' on the Theil-U index were removed, and after dimensionality reduction with Multiple Correspondence Analysis, 49 feature variables were finally derived. We built eight different models, tuned their hyperparameters through 5-fold cross-validation, and evaluated the performance of the resulting models. The purpose of selecting facilities for on-site inspection is to maximize recall, the probability of judging nonconforming facilities as nonconforming. Among the machine learning algorithms applied, the Random Forest model, which had the highest Recall_macro, AUROC, Average PR, F1-score, and Balanced Accuracy, was evaluated as the best model. Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to present the reason each individual facility was selected as nonconforming, and discuss applicability to the on-site inspection facility selection system. The results of this study are expected to contribute to the efficient use of limited resources such as manpower and budget by establishing an imported food management system built on a data-based scientific risk management model.
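The abstract ranks models by Recall_macro. As a minimal sketch of that metric, assuming the usual definition (the unweighted mean of per-class recall), the following computes it from label lists. The function names and the toy labels are illustrative only and are not taken from the study.

```python
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Return {label: recall}, where recall = correct predictions / true instances."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == guess:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


def recall_macro(y_true, y_pred):
    """Unweighted mean of per-class recall, so the rare class counts as much as the common one."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy labels (hypothetical): 1 = nonconforming facility, 0 = conforming facility.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
print(per_class_recall(y_true, y_pred))
print(recall_macro(y_true, y_pred))
```

With two classes this value coincides with balanced accuracy, which is one reason both metrics tend to favor the same model on imbalanced inspection data.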

Analysis of Borrows Demand for Books in Public Libraries Considering Cultural Characteristics (문화적 특성을 고려한 공공도서관 도서 대출수요 분석 : 대구광역시 시립도서관을 사례로)

  • Oh, Min-Ki;Kim, Kyung-Rae;Jeong, Won-Oong;Kim, Keun-Wook
    • Journal of Digital Convergence / v.19 no.3 / pp.55-64 / 2021
  • Public libraries are spaces where residents encounter a wide range of knowledge and ideas, and because they are directly connected to daily life, various related studies have been conducted. Most previous studies found that variables such as population, traffic accessibility, and environment are highly relevant to library use. This study differs from previous work in that it analyzed the relationship with book borrowing demand by reflecting variables of cultural characteristics, based on book borrowing history (1,820,407 cases) and member information (297,222 persons). The analysis showed that book borrowing demand increased as borrowing of social science and literature books increased relative to technical science books. In addition, various descriptive statistical analyses were used to characterize library book borrowing demand, and policy implications and limitations of the study are presented based on the results. Considering that cultural characteristics change depending on location and time of day, related research should be continued in the future.

Performance Assessment of Two Horizontal Shroud Tidal Current Energy Converter using Hydraulic Experiment (수리실험을 통한 수평 2열 쉬라우드 조류에너지 변환장치 성능평가)

  • Lee, Uk-Jae;Choi, Hyuk-Jin;Ko, Dong-Hui
    • Journal of Korean Society of Coastal and Ocean Engineers / v.34 no.1 / pp.1-10 / 2022
  • In this study, a two horizontal shroud tidal current energy converter, which can generate power even under low flow speed conditions, was developed. To determine the shape of the shroud system, a three-dimensional numerical simulation was conducted, and a 1/6 scale-down model was built for a hydraulic model experiment. The hydraulic model experiment was performed under four flow conditions, and the flow speed, torque, and RPM were measured for each experimental case. The numerical simulation showed that the flow speed passing through the nozzle increased by about 2 to 3 times in the cylinder, and that the highest flow speed occurred when the extension ratio was 2:1. In addition, the flow speed increased 2.8 times when the diameter ratio between the nozzle and the cylinder was 1.5:1. Meanwhile, the hydraulic model experiment showed that the power coefficient was 0.32 to 0.34 when the tip speed ratio was between 1.75 and 2.
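The tip speed ratio and power coefficient reported above can be derived from the measured flow speed, torque, and RPM. The sketch below uses the standard textbook definitions and assumes the rotor swept area as the reference area, which may differ from the paper's choice for a shrouded device. The rotor radius, water density, and sample measurements are hypothetical values, not data from the experiment.

```python
import math


def angular_speed(rpm):
    """Convert revolutions per minute to radians per second."""
    return rpm * 2.0 * math.pi / 60.0


def tip_speed_ratio(rpm, rotor_radius_m, flow_speed_ms):
    """Blade tip speed divided by the free-stream flow speed."""
    return angular_speed(rpm) * rotor_radius_m / flow_speed_ms


def power_coefficient(torque_nm, rpm, rotor_radius_m, flow_speed_ms,
                      water_density=1000.0):
    """Shaft power divided by the kinetic power of the flow through the swept area."""
    shaft_power = torque_nm * angular_speed(rpm)
    swept_area = math.pi * rotor_radius_m ** 2  # assumed reference area
    available_power = 0.5 * water_density * swept_area * flow_speed_ms ** 3
    return shaft_power / available_power


# Hypothetical measurement: 60 RPM, 10 N*m torque, 0.5 m rotor radius, 1.0 m/s flow.
tsr = tip_speed_ratio(60.0, 0.5, 1.0)
cp = power_coefficient(10.0, 60.0, 0.5, 1.0)
print(tsr, cp)
```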

Analysis of Trends in Education Policy of STEAM Using Text Mining: Comparative Analysis of Ministry of Education's Documents, Articles, and Abstract of Researches from 2009 to 2020 (텍스트 마이닝을 활용한 융합인재교육정책 동향 분석 -2009년~2020년 교육부보도, 언론보도, 학술지 초록 비교분석-)

  • You, Jungmin;Kim, Sung-Won
    • Journal of The Korean Association For Science Education / v.41 no.6 / pp.455-470 / 2021
  • This study examines changes in the keywords and topics of STEAM education from 2009 to 2020 to derive future development directions and educational implications. From the collected data, 42 Ministry of Education documents, 1,534 news articles, and 880 research abstracts were selected as research subjects. Keyword analysis, keyword network analysis, and topic modeling were performed in Python for each stage of STEAM education policy. The analysis showed that the frequency and network of keywords related to STEAM education differed by medium across the policy stages. Because the keywords and topics emphasized by each medium differed, it was confirmed that interest in STEAM education policy also differed by medium. Most of the topics in the Ministry of Education's documents corresponded to topics derived from the news articles. The implications for the development direction of STEAM education derived from these results are as follows. First, STEAM education needs to consider ways to connect multiple topics, including the humanities. Second, since the media differ in their interest in STEAM education policy, a cooperative development direction should be sought based on an understanding of this difference. Third, the Ministry of Education's support for core competency reinforcement and convergence literacy for nurturing future talent, which is the goal of STEAM education, and the media's efforts to increase public understanding of STEAM education are both required. Lastly, it is necessary to continuously analyze the themes that appear in the evaluation process and to revise STEAM education policy accordingly.

Predicting stock movements based on financial news with systematic group identification (시스템적인 군집 확인과 뉴스를 이용한 주가 예측)

  • Seong, NohYoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.1-17 / 2019
  • Because stock price forecasting is an important issue both academically and practically, research on stock price prediction has been actively conducted. Stock price forecasting research is divided into studies using structured data and studies using unstructured data. With structured data such as historical stock prices and financial statements, past studies usually used technical analysis and fundamental analysis. In the big data era, the amount of information has rapidly increased, and artificial intelligence methodologies that can find meaning by quantifying text, an unstructured data type that accounts for a large share of information, have developed rapidly. With these developments, many attempts are being made to predict stock prices from online news by applying text mining to unstructured data. The methodology adopted in many papers is to forecast a stock price using news about the target company. However, according to previous research, a company's stock price is affected not only by news about that company but also by news about related companies. Finding highly relevant companies is not easy, however, because of market-wide effects and random signals. Thus, existing studies have identified highly relevant companies based primarily on pre-determined international industry classification standards. According to recent research, however, homogeneity within the sectors of the Global Industry Classification Standard varies, so forecasting stock prices by pooling all companies in a sector, rather than considering only the relevant ones, can adversely affect predictive performance. To overcome this limitation, we first used random matrix theory with text mining for stock prediction. When the dimension of the data is large, the classical limit theorems are no longer suitable because statistical efficiency is reduced. Therefore, a simple correlation analysis of the financial market does not reveal the true correlation. To solve this issue, we adopt random matrix theory, which is mainly used in econophysics, to remove market-wide effects and random signals and find the true correlation between companies. With the true correlation, we perform cluster analysis to find relevant companies. Based on the clustering result, we then used a multiple kernel learning algorithm, an ensemble of support vector machines, to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices from features of the financial news of the target firm and its relevant firms. The results of this study are as follows. (1) Following the existing research flow, we confirmed that using news from relevant companies is an effective way to forecast stock prices. (2) Identifying relevant companies in the wrong way can lower AI prediction performance. (3) The proposed approach with random matrix theory performs better than previous studies when cluster analysis is based on the true correlation obtained by removing market-wide effects and random signals. The contributions of this study are as follows. First, this study shows that random matrix theory, which is used mainly in econophysics, can be combined with artificial intelligence to produce good methodologies. This suggests that it is important not only to develop AI algorithms but also to adopt theory from physics, and it extends existing research that integrated artificial intelligence with complex system theory through transfer entropy. Second, this study stresses that finding the right companies in the stock market is an important issue. This suggests that it is important to study not only artificial intelligence algorithms but also how to adjust the input values on theoretical grounds. Third, we confirmed that firms classified together under the Global Industry Classification Standard (GICS) may have low relevance, and suggest that relevance should be defined theoretically rather than simply read from the GICS.
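The abstract does not spell out how random matrix theory separates market-wide effects and random signals from true correlations. A common approach in econophysics, assumed here rather than confirmed by the source, compares the eigenvalues of the return correlation matrix with the Marchenko-Pastur bounds for purely random data: eigenvalues inside the bounds are treated as noise, the largest eigenvalue as the market mode, and the remaining ones as group structure. The function names and the sample eigenvalues below are hypothetical.

```python
import math


def marchenko_pastur_bounds(n_series, n_observations):
    """Eigenvalue range expected for a correlation matrix of uncorrelated,
    standardized series (valid when n_series < n_observations)."""
    q = n_series / n_observations
    lower = (1.0 - math.sqrt(q)) ** 2
    upper = (1.0 + math.sqrt(q)) ** 2
    return lower, upper


def split_eigenvalues(eigenvalues, n_series, n_observations):
    """Split eigenvalues into market mode, group modes, and noise."""
    _, upper = marchenko_pastur_bounds(n_series, n_observations)
    ordered = sorted(eigenvalues, reverse=True)
    informative = [value for value in ordered if value > upper]
    market_mode = informative[:1]   # largest eigenvalue: market-wide effect
    group_modes = informative[1:]   # candidates for sector or cluster structure
    noise = [value for value in ordered if value <= upper]
    return market_mode, group_modes, noise


# Hypothetical case: 100 stocks observed over 400 trading days.
bounds = marchenko_pastur_bounds(100, 400)
modes = split_eigenvalues([30.0, 4.1, 2.9, 1.2, 0.6], 100, 400)
print(bounds)
print(modes)
```

A filtered correlation matrix rebuilt from the group modes alone is the kind of input a subsequent cluster analysis would typically use.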

Exploring Pre-Service Earth Science Teachers' Understandings of Computational Thinking (지구과학 예비교사들의 컴퓨팅 사고에 대한 인식 탐색)

  • Young Shin Park;Ki Rak Park
    • Journal of the Korean earth science society / v.45 no.3 / pp.260-276 / 2024
  • The purpose of this study is to explore whether pre-service teachers majoring in earth science improve their perception of computational thinking through STEAM classes focused on engineering-based wave power plants. The STEAM class involved designing the most efficient wave power plant model. A survey on computational thinking practices, developed from previous research, was administered to 15 earth science pre-service teachers to gauge their understanding of computational thinking. Each group developed an efficient wave power plant model based on the scientific principle of wave-driven turbine operation. The activities included problem recognition (problem solving), coding (coding and programming), creating a wave power plant model using a 3D printer (design and create model), and evaluating the output to correct errors (debugging). The pre-service teachers showed a high level of recognition of computational thinking practices, particularly in "logical thinking," with the top five practices out of 14 averaging five points each. However, participants lacked a clear understanding of certain computational thinking practices such as abstraction, problem decomposition, and using big data, and their comprehension of these decreased after the STEAM lesson. Although there was a significant reduction in the misconception that computational thinking is "playing online games" (from 4.06 to 0.86), some participants still equated it with "thinking like a computer" and "using a computer to do calculations". The study found slight improvements in "problem solving" (3.73 to 4.33), "pattern recognition" (3.53 to 3.66), and "best tool selection" (4.26 to 4.66). To enhance computational thinking skills, a practice-oriented curriculum should be offered. Additional STEAM classes on diverse topics could lead to a significant improvement in computational thinking practices. Therefore, establishing an educational curriculum for multisituational learning is essential.

Keyword Network Visualization for Text Summarization and Comparative Analysis (문서 요약 및 비교분석을 위한 주제어 네트워크 가시화)

  • Kim, Kyeong-rim;Lee, Da-yeong;Cho, Hwan-Gue
    • Journal of KIISE / v.44 no.2 / pp.139-147 / 2017
  • Most of the information on the Internet is textual. One of the main topics in the large-scale document analysis required in the "big data" era is therefore the development of automated systems for understanding textual data, and automatic keyword extraction for text summarization and abstraction is a typical research problem. However, simply listing a few keywords is insufficient to reveal the complex semantic structure of general texts. In this paper, we develop a text-visualization method that constructs a graph by computing the degree of relation among keywords selected from the target text. Two construction models that provide the edge relation are proposed for computing the degree of relation among keywords: the influence-interval model and the word-distance model. The graph visualized from the keyword-derived edge relations is more flexible and useful for displaying the meaning structure of the target text, and this abstract graph enables fast and easy understanding of the text. The authors' experiment showed that the proposed abstract-graph model is superior to a keyword list for attaining a semantic and comparative understanding of text.

Effect of Occurrence of Scion Root on the Growth and Root Nutrient Contents of 'Shiranuhi' Mandarin Hybrid grown in Plastic Film House (자근발생이 부지화 감귤나무의 수체 생육과 뿌리내 양분함량에 미치는 영향)

  • Kang, Seok-Beom;Moon, Young-Eel;Yankg, Gyeong-Rok;Joa, Jae-Ho;Han, Seong-Gap;Lee, Hae-Jin;Park, Woo-Jung
    • Korean Journal of Environmental Agriculture / v.38 no.3 / pp.154-158 / 2019
  • BACKGROUND: 'Shiranuhi' mandarin is a major cultivar among late-ripening citrus types and is widely cultivated in Korea. However, many farmers have reported scion root problems in their orchards, resulting in reduced flowering and fruiting. The physiology of scion-rooted 'Shiranuhi' mandarin trees needs to be better understood. METHODS AND RESULTS: This experiment was conducted to understand the growth response and physiology of scion-rooted 'Shiranuhi' mandarin hybrids. 'Shiranuhi' mandarin trees were divided into two groups: trees without scion roots (control) and trees with scion roots. The experiment was conducted in Seogwipo, Jeju, with ten replicates for each group. Trees with scion roots grew more vigorously and were taller than the controls. Tree height and trunk diameter of scion-rooted trees were significantly higher than those of control trees. Exposed length of rootstocks of scion-rooted trees was significantly lower (by about 2 cm) than that of control trees (8.6 cm). In terms of root nutrition, the carbon content of scion-rooted trees was significantly lower than that of control trees, but nitrogen and potassium concentrations in scion roots were significantly higher than those in control roots. CONCLUSION: Based on the results, we infer that scion-rooted trees grew very vigorously and that the nitrogen content of their roots was higher than that of the control tree roots. Thus, the carbon/nitrogen ratio of scion roots was significantly lower than that of the control roots.

Propagation of tidal wave and resulted tidal asymmetry upward tidal rivers (감조하천에서 조석 전파 및 조석비대칭)

  • Kang, Ju Whan;Cho, Hong-Yeon
    • Journal of Korea Water Resources Association / v.54 no.6 / pp.433-442 / 2021
  • To examine the characteristics of tidal waves propagating from the estuary to the upstream reaches of tidal rivers, tidal asymmetry was identified by analyzing the harmonic constants of the M2 and M4 tidal constituents in domestic western coastal regions. Because the shallow water tide is strongly developed in the estuaries, flood dominance develops in the Han River and Keum River, and ebb dominance develops in the Youngsan River. These tidal asymmetries are reconfirmed by analyzing tidal current data. Unlike the reciprocating tidal current patterns of the Keum and Youngsan estuaries, the Han River estuary shows a rotating tidal current pattern due to the complex topography and waterways around Ganghwa Island. However, when the residual current is removed, flood dominance appears, consistent with the tide data. The tidal asymmetry in the estuary tends to intensify as the shallow water tide grows while the tidal wave propagates upstream. In the shallow Han River and Keum River, classified as SD estuaries, energy dissipation is very large owing to bottom friction. On the other hand, the deep Youngsan River, classified as a WD estuary, shows less energy dissipation.
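The abstract does not state the criterion used to classify flood or ebb dominance from the M2 and M4 harmonic constants. A widely used convention for sea surface elevation, assumed here, takes the amplitude ratio a_M4/a_M2 as the strength of the asymmetry and the relative phase 2g_M2 - g_M4 as its direction: flood dominant between 0° and 180°, ebb dominant between 180° and 360°. The function name and the harmonic constants below are hypothetical.

```python
def tidal_asymmetry(amp_m2, phase_m2_deg, amp_m4, phase_m4_deg):
    """Classify tidal asymmetry from M2 and M4 elevation harmonic constants.

    Returns (amplitude_ratio, relative_phase_deg, dominance).
    """
    amplitude_ratio = amp_m4 / amp_m2
    relative_phase = (2.0 * phase_m2_deg - phase_m4_deg) % 360.0
    if 0.0 < relative_phase < 180.0:
        dominance = "flood"      # shorter rising tide, stronger flood
    elif 180.0 < relative_phase < 360.0:
        dominance = "ebb"        # shorter falling tide, stronger ebb
    else:
        dominance = "symmetric"  # exactly 0 or 180 degrees: equal durations
    return amplitude_ratio, relative_phase, dominance


# Hypothetical harmonic constants (amplitude in m, phase lag in degrees).
print(tidal_asymmetry(2.0, 100.0, 0.2, 110.0))
print(tidal_asymmetry(2.0, 150.0, 0.2, 30.0))
```

Tracking the amplitude ratio at successive stations upstream is one way to express the intensification of asymmetry that the abstract describes.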

A Statistical Study on the Result Analysis of CaPSPI, a Diagnostic System for Climacteric and Postmenopausal Syndrome Pattern Identification (CaPSPI(Diagnostic System for Climacteric and Postmenopausal Syndrome Pattern Identification) 업그레이드를 위한 검진용 치료용 진단 결과 분석에 대한 통계 연구)

  • Kim, Tae-Hee;Lee, In-Seon;Kim, Jong-Won;Jeon, Soo-Hyung;Chi, Gyoo-Yong;Kang, Chang-Wan
    • The Journal of Korean Obstetrics and Gynecology / v.35 no.3 / pp.105-121 / 2022
  • Objectives: This statistical analysis study examines the results of CaPSPI (Diagnostic System for Climacteric and Postmenopausal Syndrome Pattern Identification), developed for the objective diagnosis of climacteric and postmenopausal syndrome. Methods: Questionnaire responses from a total of 341 people were statistically analyzed: 275 people involved in developing CaPSPI 2018 (E) and 146 people involved in the 2019-2020 research studies (1,3). Results: In the examination version, diagnosis frequency was highest for liver depression (320 times, 93.8%) and lowest for heart heat (214 times, 62.8%). In the treatment version, diagnosis frequency was highest for liver depression (185 times, 54.3%) and lowest for dual deficiency of heart-spleen (57 times, 16.7%). The diagnosis ratio was lowest for dual deficiency of heart-spleen (19.72%) and highest for liver depression (57.81%). Comparisons of these diagnoses with Kupperman's index all showed significant differences, as did comparisons of the disease elements. The correlation between diagnosis and pattern identification elements was consistent with Korean medical pathology, and in the 7 patterns other than heart heat, the treatment version indicated a more severe condition, or one progressing toward a critical state, than the examination version. Conclusions: CaPSPI reflects the characteristics of Korean medicine well, and the highly correlated disease elements should be used to upgrade the system.