• Title/Summary/Keyword: Learning Data

Application of data mining and statistical measurement of agricultural high-quality development

  • Yan Zhou
    • Advances in nano research / v.14 no.3 / pp.225-234 / 2023
  • In this study, we use big data resources and statistical analysis to derive reliable guidance for achieving high-quality, high-yield agricultural production. To this end, soil type data, rainfall and temperature data, and annual wheat production are collected for a specific region. The acquired data are cleaned with statistical methods to remove incomplete and defective records. Several machine learning classification methods are then used to distinguish between the different factors and their influence on the final crop yield. The efficacy of the machine learning methods is discussed by comparing the models' predictions with the actual crop yields using two statistical quantities, the correlation factor and the mean squared error. The results show that the machine learning methods predict crop yield with high accuracy. Moreover, the random forest (RF) classification approach provides the best results among the classification methods used in this study.
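
The two evaluation quantities named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the "correlation factor" is the Pearson correlation coefficient, and the yield values are made up for the example.

```python
from math import sqrt


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pearson_correlation(actual, predicted):
    """Pearson correlation coefficient (assumed meaning of 'correlation factor')."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)


# Hypothetical crop yields, invented purely for illustration.
actual_yield = [3.0, 3.5, 4.0, 4.5]
predicted_yield = [3.5, 3.5, 4.0, 5.0]

mse = mean_squared_error(actual_yield, predicted_yield)
r = pearson_correlation(actual_yield, predicted_yield)
print(f"MSE = {mse:.3f}, correlation = {r:.3f}")
```

A lower MSE and a correlation closer to 1 both indicate predictions that track the observed yields more closely.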

Word Sense Disambiguation based on Concept Learning with a focus on the Lowest Frequency Words (저빈도어를 고려한 개념학습 기반 의미 중의성 해소)

  • Kim Dong-Sung;Choe Jae-Woong
    • Language and Information / v.10 no.1 / pp.21-46 / 2006
  • This study proposes a Word Sense Disambiguation (WSD) algorithm based on concept learning, with special emphasis on statistically meaningful lowest-frequency words. Previous work on WSD typically makes use of collocation frequency and its probability. Such probability-based WSD approaches tend to ignore the lowest-frequency words, which could be meaningful in context. In this paper, we present an algorithm to extract and make use of the meaningful lowest-frequency words in WSD. The learning method is adopted from the Find-Specific algorithm of Mitchell (1997), in which the search proceeds from the specific predefined hypothetical spaces to the general ones. In our model, this algorithm first finds contexts with the most specific classifiers and then moves to more general ones. We build a small set of seed data and apply it to relatively large test data. Following the algorithm in Yarowsky (1995), the classified test data are exhaustively included in the seed data, thus expanding the seed data. However, this may introduce a lot of noise into the seed data. We therefore introduce the 'maximum a posteriori hypothesis' based on the Bayes assumption to validate the noise status of the new seed data. We use the Naive Bayes classifier and show that applying the Find-Specific algorithm improves the accuracy of WSD.
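
The specific-to-general search mentioned in this abstract can be illustrated with the textbook Find-S procedure. The sketch below is only that textbook procedure, not the paper's WSD pipeline; the context features (left word, right word, topic) and the example data are hypothetical.

```python
def find_s(examples):
    """Textbook Find-S: start from the most specific hypothesis and
    generalize it just enough to cover each positive example.

    examples: list of (attribute_tuple, is_positive) pairs.
    Returns a list of attribute values, where "?" means "any value",
    or None if no positive example was seen.
    """
    hypothesis = None  # most specific hypothesis: covers nothing yet
    for attributes, is_positive in examples:
        if not is_positive:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)
        else:
            # Keep values that agree, generalize the rest to "?"
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis


def matches(hypothesis, attributes):
    """True if the hypothesis covers the given attribute tuple."""
    return hypothesis is not None and all(
        h == "?" or h == a for h, a in zip(hypothesis, attributes)
    )


# Hypothetical contexts for one sense of "bank": (left word, right word, topic)
seed = [
    (("river", "water", "nature"), True),
    (("central", "loan", "finance"), False),
    (("river", "mud", "nature"), True),
]
learned = find_s(seed)
print(learned)
```

The learned hypothesis stays as specific as the positive examples allow, which is why a rare but consistent context word can still be retained as a constraint.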

A Study on Application Method of Contour Image Learning to improve the Accuracy of CNN by Data (데이터별 딥러닝 학습 모델의 정확도 향상을 위한 외곽선 특징 적용방안 연구)

  • Kwon, Yong-Soo;Hwang, Seung-Yeon;Shin, Dong-Jin;Kim, Jeong-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.4 / pp.171-176 / 2022
  • A CNN is a type of deep learning model, a neural network used to process images or image data. Its filters traverse the image and extract features that are used to distinguish it. In deep learning, more data generally yields better models, so CNNs compensate for a shortage of data by artificially increasing it through data augmentation such as rotation, zoom, shift, and flip. In this study, we examine whether training a CNN on outline images improves performance compared with conventional data augmentation techniques.
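
To make the contrast concrete, the sketch below shows two conventional augmentations (flip and rotation) next to a very simple outline extraction. It is an illustration under stated assumptions: the abstract does not say which contour method the authors used, so the neighbour-difference threshold here is a stand-in, and the tiny image is invented.

```python
def flip_horizontal(image):
    """Mirror each row (a conventional augmentation)."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise (a conventional augmentation)."""
    return [list(row) for row in zip(*image[::-1])]


def outline(image, threshold=1):
    """Mark pixels whose value differs from the right or lower neighbour
    by at least `threshold`. A stand-in for a real contour detector."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            here = image[y][x]
            right = image[y][x + 1] if x + 1 < width else here
            below = image[y + 1][x] if y + 1 < height else here
            if abs(here - right) >= threshold or abs(here - below) >= threshold:
                edges[y][x] = 1
    return edges


# Hypothetical 4x4 image containing a 2x2 bright square.
square = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
for row in outline(square):
    print(row)
```

Flips and rotations keep every pixel and only move it, whereas the outline image discards interior texture and keeps only the boundaries.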

Prediction of Chest Deflection Using Frontal Impact Test Results and Deep Learning Model (정면충돌 시험결과와 딥러닝 모델을 이용한 흉부변형량의 예측)

  • Kwon-Hee Lee;Jaemoon Lim
    • Journal of Auto-vehicle Safety Association / v.15 no.1 / pp.55-62 / 2023
  • In this study, chest deflection is predicted by applying a deep learning technique to the results of USNCAP frontal impact tests conducted on 110 car models from MY2018 to MY2020. The 120 data records are divided into training data and test data, and the training data are further divided into training data and validation data to determine the hyperparameters. In this process, the deceleration data of each vehicle are averaged in units of 10 ms from crash pulses measured up to 100 ms. The performance of the deep learning model is measured by the mean squared error (MSE) and the mean absolute error (MAE) on the test data. A DNN (Deep Neural Network) model can give different predictions on every run even with the same hyperparameter values. To account for this, the mean and standard deviation of the MSE and the MAE are calculated. In addition, the effect of including the CVW (Curb Vehicle Weight) on model performance is reviewed.
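
The preprocessing and reporting steps described in this abstract can be sketched roughly as below. The function name, the 5 ms sampling interval, the pulse values, and the per-run MSE figures are all hypothetical; the abstract also does not say whether a sample or population standard deviation was used, so the sample version is assumed.

```python
from statistics import mean, stdev


def average_in_bins(times_ms, deceleration, bin_ms=10, end_ms=100):
    """Average a crash pulse in fixed-width time bins.

    Returns one value per bin for [0, end_ms); a bin with no samples
    yields None.
    """
    n_bins = end_ms // bin_ms
    bins = [[] for _ in range(n_bins)]
    for t, d in zip(times_ms, deceleration):
        if 0 <= t < end_ms:
            bins[int(t // bin_ms)].append(d)
    return [mean(b) if b else None for b in bins]


# Hypothetical pulse sampled every 5 ms, rising linearly (illustration only).
times = list(range(0, 100, 5))
pulse = [t / 5 for t in times]
features = average_in_bins(times, pulse)
print(features)

# Hypothetical test-set MSE from three training runs with identical hyperparameters.
run_mse = [4.0, 5.0, 6.0]
print(f"MSE mean = {mean(run_mse):.2f}, std = {stdev(run_mse):.2f}")
```

Binning reduces each pulse to ten input features, and reporting the mean and spread over repeated runs separates the effect of the hyperparameters from run-to-run randomness.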

Image-based rainfall prediction from a novel deep learning method

  • Byun, Jongyun;Kim, Jinwon;Jun, Changhyun
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.183-183 / 2021
  • Deep learning methods and their applications have become an essential part of prediction and modeling in water-related research areas, including hydrological processes and climate change. It is known that the application of deep learning leads to high availability of data sources in hydrology, which shows its usefulness in the analysis of precipitation, runoff, groundwater level, evapotranspiration, and so on. However, microclimate analysis and prediction with deep learning methods remain limited by a shortage of gauge-based data and the shortcomings of existing technologies. In this study, a real-time rainfall prediction model was developed from a sky image data set using convolutional neural networks (CNNs). The daily image data were collected at Chung-Ang University and Korea University. To achieve high accuracy, the proposed model considers data classification, image processing, and ratio adjustment of no-rain data. The predicted rainfall was compared with minute-by-minute rainfall data from rain gauge stations close to the image sensors. The results indicate that the proposed model could complement the current rainfall observation system by interpolation and has large potential to fill observation gaps. Information from small-scale areas can advance accurate weather forecasting and hydrological modeling at the micro scale.
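
One plausible reading of "ratio adjustment of no-rain data" is undersampling the no-rain images so that they do not dominate training. The sketch below assumes that reading; the function name, the default ratio, and the sample records are hypothetical and not taken from the study.

```python
import random


def undersample_no_rain(samples, no_rain_ratio=1.0, seed=0):
    """Keep every rain sample and at most `no_rain_ratio` times as many
    no-rain samples, chosen at random.

    samples: list of (image_id, rainfall_mm) pairs.
    """
    rain = [s for s in samples if s[1] > 0]
    no_rain = [s for s in samples if s[1] == 0]
    keep = min(len(no_rain), int(no_rain_ratio * len(rain)))
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rain + rng.sample(no_rain, keep)


# Hypothetical data set: 2 rain images and 8 no-rain images.
dataset = [("img0", 1.2), ("img1", 0.4)] + [(f"img{i}", 0.0) for i in range(2, 10)]
balanced = undersample_no_rain(dataset)
print(len(balanced))
```

With the default ratio the result contains as many no-rain images as rain images; a larger ratio keeps more of the no-rain class.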

Indoor Location Data Construction Technique using GAN (GAN을 이용한 실내 위치 데이터 구성 기법)

  • Yoon, Chang-Pyo;Hwang, Chi-Gon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.490-491 / 2021
  • Recently, technologies using Wi-Fi fingerprints and deep learning have been studied to provide accurate location-based services in indoor environments. In this setting, the composition of the learning data is very important, and it is essential to collect enough data for learning. However, the number of points at which radio signal data could be collected within the area requiring positioning is infinite, so it is impossible to collect all of these data. A way to make up for insufficient learning data is therefore needed. This study proposes a method of constructing a sufficient amount of location data for learning from insufficiently collected location data.

Learning Effects of Flipped Learning based on Learning Analytics in SW Coding Education (SW 코딩교육에서의 학습분석기반 플립러닝의 학습효과)

  • Pi, Su-Young
    • Journal of Digital Convergence / v.18 no.11 / pp.19-29 / 2020
  • This study examines the effectiveness of flipped learning teaching methods, using learning analytics to enable effective programming learning for non-major students. After a flipped learning programming class model was designed by applying the ADDIE model, learning-related data from the lecture support system operated by the school were collected by crawling. Presenting the crawled data through a dashboard that the instructor can easily understand allows the instructor to design classes more efficiently and to provide individually tailored learning. Analysis of the learning-related data collected over a one-semester class found that department, academic year, attendance, assignment submission, and preliminary/review attendance had an effect on academic achievement. In the survey, respondents reported that the instructor's individualized feedback through learning analytics was very helpful for self-directed learning. The approach is expected to give instructors a foundation for enhancing their teaching activities. In the future, the content of social network services related to learners' learning will be crawled to analyze learners' learning situations.

A Study on the learning behavior and the effect of on-line class using LMS data - Focusing on computer-practice classes (LMS 데이터를 활용한 온라인 러닝의 학습 행동 및 효과에 관한 연구 - 컴퓨터 실습수업을 위주로)

  • Jun Byoungho
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.2 / pp.79-87 / 2023
  • On-line learning has been adopted as a major educational method due to the COVID-19 pandemic. Students and faculty became accustomed to the on-line educational environment as they experienced it during the pandemic. The development of various technologies and social demand for educational innovation have also laid the groundwork for on-line learning. On-line or blended learning is therefore likely to continue after the end of the COVID-19 pandemic, and guidelines for using on-line learning effectively need to be prepared. The primary purpose of this study is to examine learning behaviors and learning effects using LMS data. Learning behaviors were measured in terms of learning time and access frequency for pre-recorded video lectures in computer-practice classes. The empirical analysis reveals that access frequency was a significant predictor of course achievement, but learning time was not. The findings suggest that effectively planning and designing on-line classes based on learning behaviors is key to enhancing learning effects and learner satisfaction.

Online Learning of Bayesian Network Parameters for Incomplete Data of Real World (현실 세계의 불완전한 데이타를 위한 베이지안 네트워크 파라메터의 온라인 학습)

  • Lim, Sung-Soo;Cho, Sung-Bae
    • Journal of KIISE:Computer Systems and Theory / v.33 no.12 / pp.885-893 / 2006
  • The Bayesian network (BN) has emerged in recent years as a powerful technique for handling uncertainty in complex domains. Parameter learning, which finds the most suitable network from a given data set, has been investigated to reduce the time and effort needed to design a BN. Off-line learning requires much time and effort to gather enough data, and because the real world is uncertain, it is hard to obtain complete data. In this paper, we propose an online learning method for Bayesian network parameters from incomplete data. It provides higher flexibility by learning from incomplete data and higher adaptability to the environment through online learning. A comparison with the Voting EM algorithm proposed by Cohen et al. confirms that the proposed method matches its performance on complete data sets and outperforms it on incomplete data sets.

Performance Evaluation of a Machine Learning Model Based on Data Feature Using Network Data Normalization Technique (네트워크 데이터 정형화 기법을 통한 데이터 특성 기반 기계학습 모델 성능평가)

  • Lee, Wooho;Noh, BongNam;Jeong, Kimoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.4 / pp.785-794 / 2019
  • Recently, deep learning technology, one of the technologies of the fourth industrial revolution, has been used in the security field to identify hidden meaning in network data that is difficult to detect and to predict attacks. The properties and quality of the data sources must be analyzed before selecting the deep learning algorithm to be used for intrusion detection, because contamination of the training data affects detection. Therefore, the characteristics of the data should be identified and the relevant features selected. In this paper, the characteristics of malware were analyzed using a network data set, and the effect of each feature on performance was analyzed when a deep learning model was applied. A traffic classification experiment comparing features according to network characteristics achieved 96.52% accuracy with the selected features.