Search | Korea Science

Application of data mining and statistical measurement of agricultural high-quality development

Yan Zhou
- Advances in nano research / v.14 no.3 / pp.225-234 / 2023
In this study, we use big data resources and statistical analysis to derive reliable guidance for achieving high-quality, high-yield agricultural production. To this end, soil type, rainfall, and temperature data, together with annual wheat production, are collected for a specific region. Using statistical methods, the acquired data were cleaned to remove incomplete and defective records. Several machine learning classification methods were then used to distinguish the factors involved and their influence on the final crop yield. The efficacy of the machine learning methods is discussed by comparing the models' predictions using two statistical quantities, the correlation factor and the mean squared error between predicted and actual crop yields. The results show that machine learning methods predict crop yields with high accuracy. Moreover, the random forest (RF) classification approach provides the best results among the classification methods used in this study.
https://doi.org/10.12989/anr.2023.14.3.225
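
The two evaluation quantities named in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes "correlation factor" means the Pearson correlation coefficient, and the yield values are made-up numbers used only to exercise the functions.

```python
import math


def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pearson_correlation(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_p)


# Hypothetical yields in tonnes per hectare (illustrative values only).
actual_yield = [3.1, 2.8, 3.6, 4.0, 3.3]
predicted_yield = [3.0, 2.9, 3.5, 4.2, 3.2]

mse = mean_squared_error(actual_yield, predicted_yield)
r = pearson_correlation(actual_yield, predicted_yield)
print(f"MSE = {mse:.4f}, Pearson r = {r:.4f}")
```

A lower mean squared error and a correlation closer to 1 both indicate predictions that track the observed yields more closely, which is how the abstract compares its models.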

Application of Bayesian Statistical Analysis to Multisource Data Integration

Hong, Sa-Hyun;Moon, Wooil-M.
- Proceedings of the KSRS Conference / 2002.10a / pp.394-399 / 2002
In this paper, multisource data classification methods based on the Bayesian formula are considered. In this decision fusion scheme, the individual data sources are handled separately by statistical classification algorithms, and a Bayesian fusion method is then applied to integrate the available data sources. This method combines the decisions of the individual experts, where the weight of each expert represents the reliability of its source. In previous work, the reliability measure used in the statistical approach was common to all pixels. In this experiment, the weight factors are allowed to take a different value for each pixel in order to improve the integrated classification accuracy. Although most implementations of Bayesian classification assume fixed a priori probabilities, we use adaptive a priori probabilities, iteratively calculating the local a priori probabilities so as to maximize the a posteriori probabilities. The effectiveness of the proposed method is first demonstrated on simulations with artificial data and then evaluated on real-world data sets. As a result, we show that the Bayesian statistical fusion scheme performs well on multispectral data classification.
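
The idea of combining per-source decisions with reliability weights can be sketched as follows. This is only an illustration under stated assumptions: it uses a log-linear (weighted product) pooling rule, which is one common way to weight sources and may differ from the rule used in the paper. The function name `fuse_posteriors`, the class labels, and the probabilities are hypothetical.

```python
import math


def fuse_posteriors(source_posteriors, weights):
    """Fuse per-source class posteriors for one pixel.

    source_posteriors: list of dicts mapping class label -> probability,
                       one dict per data source.
    weights: one reliability weight per source (may differ per pixel).
    Returns a dict of fused, normalized class probabilities.
    """
    if len(source_posteriors) != len(weights) or not source_posteriors:
        raise ValueError("need one weight per source and at least one source")
    classes = list(source_posteriors[0].keys())
    # Weighted sum of log-probabilities (log-linear pooling, an assumption).
    log_scores = {
        c: sum(w * math.log(max(p[c], 1e-12))
               for p, w in zip(source_posteriors, weights))
        for c in classes
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    unnormalized = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}


# Hypothetical posteriors from two sources for a single pixel.
optical = {"water": 0.7, "forest": 0.3}
radar = {"water": 0.4, "forest": 0.6}

fused = fuse_posteriors([optical, radar], weights=[1.0, 0.5])
label = max(fused, key=fused.get)
print(label, fused)
```

Passing a different `weights` list for each pixel corresponds to the per-pixel reliability the abstract describes; a weight of zero removes a source's influence entirely.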

Functional Data Classification of Variable Stars

Park, Minjeong;Kim, Donghoh;Cho, Sinsup;Oh, Hee-Seok
- Communications for Statistical Applications and Methods / v.20 no.4 / pp.271-281 / 2013
This paper considers the problem of classifying variable stars based on functional data analysis. For a better understanding of galaxy structure and stellar evolution, various approaches to the classification of variable stars have been studied. Several features that explain the characteristics of variable stars (such as color index, amplitude, period, and Fourier coefficients) have usually been used to classify them. Excluding other factors and focusing only on the curve shapes of variable stars, Deb and Singh (2009) proposed a classification procedure using multivariate principal component analysis. However, this approach is limited in its ability to accommodate features of light curve data that are unequally spaced in the phase domain and have functional properties. In this paper, we propose a light curve estimation method that is suitable for functional data analysis, and provide a classification procedure for variable stars that combines the features of a light curve with existing functional data analysis methods. To evaluate its practical applicability, we apply the proposed classification procedure to the data sets of variable stars from the project STellar Astrophysics and Research on Exoplanets (STARE).
https://doi.org/10.5351/CSAM.2013.20.4.271

One-dimensional CNN Model of Network Traffic Classification based on Transfer Learning

Lingyun Yang;Yuning Dong;Zaijian Wang;Feifei Gao
- KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.420-437 / 2024
Network traffic classification (NTC) faces problems such as complicated statistical features and insufficient training samples, which may lead to poor classification performance. An NTC architecture based on a one-dimensional Convolutional Neural Network (CNN) and transfer learning is proposed to tackle these problems and improve fine-grained classification performance. The key points of the proposed architecture are: (1) model classification, in which a normalized rate feature set extracted from the original data is combined with existing statistical features to optimize the CNN NTC model; and (2) the application of transfer learning to improve NTC performance. We collect two typical network flow data sets from Youku and YouTube and verify the proposed method through extensive experiments. The results show that, compared with existing methods, our method improves classification accuracy by around 3-5% for Youku and by about 7-27% for YouTube.
https://doi.org/10.3837/tiis.2024.02.008

A Study on Statistical Classification of Wear Debris Morphology

Cho, Unchung
- KSTLE International Journal / v.2 no.1 / pp.35-39 / 2001
In this paper, a statistical approach is taken to investigate the classification of wear debris, which is the key function of objective assessment of wear debris morphology. Wear tests are run to produce various kinds of wear debris. Images of the wear debris from the tests are captured with image acquisition equipment. Two-dimensional binary images of the debris are produced by thresholding, and morphological parameters are then used to quantify the images. Parametric and nonparametric discriminant methods are employed to classify wear debris into predefined wear conditions. It is demonstrated that the classification accuracy of the parametric and nonparametric discriminant methods is similar. Selecting morphological parameters by stepwise discriminant analysis can generally improve the classification accuracy of both methods.

Prediction of extreme PM_2.5 concentrations via extreme quantile regression

Lee, SangHyuk;Park, Seoncheol;Lim, Yaeji
- Communications for Statistical Applications and Methods / v.29 no.3 / pp.319-331 / 2022
In this paper, we develop a new statistical model to forecast the PM_2.5 level in Seoul, South Korea. The proposed model is based on the extreme quantile regression model with a lasso penalty. Various meteorological and air pollution variables are considered as predictors in the regression model, and the lasso quantile regression performs variable selection and solves the multicollinearity problem. The final prediction model is obtained by combining various extreme lasso quantile regression estimators, and we construct a binary classifier based on the model. Prediction performance is evaluated through the statistical performance measures of a binary classification test. We observe that the proposed method performs better than the other classification methods and predicts 'very bad' cases of the PM_2.5 level well.
https://doi.org/10.29220/CSAM.2022.29.3.319
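
As a rough illustration of the building block named in the abstract above, the objective of a lasso-penalized quantile regression can be written down directly. This sketch uses the standard textbook form (mean pinball loss plus an L1 penalty on the slopes) and only evaluates the objective for given coefficients; it does not fit a model, and it is not the paper's extreme-quantile estimator. All names and numbers are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for quantile level tau in (0, 1)."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def lasso_quantile_objective(X, y, intercept, coefs, tau, lam):
    """Mean pinball loss plus lam times the L1 norm of the slope coefficients.

    X: list of feature rows, y: list of targets, coefs: one slope per feature.
    The intercept is left unpenalized, as is conventional.
    """
    if len(X) != len(y) or not y:
        raise ValueError("X and y must be non-empty and of equal length")
    total = 0.0
    for row, target in zip(X, y):
        fitted = intercept + sum(b * x for b, x in zip(coefs, row))
        total += pinball_loss(target - fitted, tau)
    penalty = lam * sum(abs(b) for b in coefs)
    return total / len(y) + penalty


# Hypothetical toy data: one predictor, two observations.
X = [[1.0], [2.0]]
y = [3.0, 1.0]
objective = lasso_quantile_objective(X, y, intercept=0.0, coefs=[1.0],
                                     tau=0.9, lam=0.1)
print(objective)
```

With a high quantile level such as 0.9, under-predictions (positive residuals) cost far more than over-predictions, which pushes the fitted line toward the upper tail; the L1 term shrinks some slopes to exactly zero when the objective is minimized, which is the variable selection the abstract refers to.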

A New Approach to Statistical Analysis of Electrical Fire and Classification of Electrical Fire Causes

Kim, Doo-Hyun;Lee, Jong-Ho;Kim, Sung-Chul
- International Journal of Safety / v.6 no.2 / pp.17-21 / 2007
This paper aims at the statistical analysis of electrical fires and the classification of electrical fire causes in order to collect electrical fire data efficiently. Electrical fire statistics are produced to monitor the number and characteristics of fires attended by fire fighters, including the causes and effects of fire, so that action can be taken to reduce the human and financial cost of fire. Electrical fires make up a major share of fires in Korea (nearly 30% of total fires according to recent figures). Incorrect and biased knowledge about electrical fires has changed the classification of certain types of fires from non-electrical to electrical. It is both convenient and necessary to develop a standardized form that lets fire fighters assess the cause of an electrical fire either by ticking the appropriate box on the fire report form or by assessing a text description. Therefore, it is highly recommended to develop an electrical fire cause classification and an electrical fire assessment for the fire statistics in order to categorize and assess electrical fires accurately. In this paper, a newly developed electrical fire cause classification structure is suggested; it is a well-defined hierarchical structure with no relationship or overlap between cause categories. Fire statistics systems of foreign countries are also introduced and compared.

Optimal bandwidth in nonparametric classification between two univariate densities

Hall, Peter;Kang, Kee-Hoon
- Proceedings of the Korean Statistical Society Conference / 2002.05a / pp.1-5 / 2002
We consider the problem of optimal bandwidth choice for nonparametric classification, based on kernel density estimators, where the problem of interest is distinguishing between two univariate distributions. When the densities intersect at a single point, the optimal bandwidth choice depends on the curvatures of the densities at that point. The problems of empirical bandwidth selection and of classifying data in the tails of a distribution are also addressed.
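
The setting in the abstract above, classifying a point by comparing two kernel density estimates, can be sketched as follows. This is a minimal illustration assuming Gaussian kernels, equal prior probabilities, and hand-picked bandwidths; it does not implement the optimal bandwidth choice studied in the paper, and the samples are made up.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    if not sample or bandwidth <= 0:
        raise ValueError("need a non-empty sample and a positive bandwidth")
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2)
                   for xi in sample) / norm

    return density


def classify(x, sample_f, sample_g, bandwidth_f, bandwidth_g):
    """Assign x to 'f' or 'g', whichever estimated density is larger at x.

    Equal prior probabilities for the two populations are assumed.
    """
    f_hat = gaussian_kde(sample_f, bandwidth_f)
    g_hat = gaussian_kde(sample_g, bandwidth_g)
    return "f" if f_hat(x) >= g_hat(x) else "g"


# Hypothetical training samples from the two populations.
sample_f = [-1.2, -0.8, -1.0, -0.5, -1.5]
sample_g = [1.1, 0.9, 1.4, 0.6, 1.0]

print(classify(-0.9, sample_f, sample_g, 0.5, 0.5))
print(classify(1.2, sample_f, sample_g, 0.5, 0.5))
```

The bandwidths matter most near the point where the two estimated densities cross, because small changes in smoothing shift that crossing point and therefore the decision boundary.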

Classification of Microarray Gene Expression Data by MultiBlock Dimension Reduction

Oh, Mi-Ra;Kim, Seo-Young;Kim, Kyung-Sook;Baek, Jang-Sun;Son, Young-Sook
- Communications for Statistical Applications and Methods / v.13 no.3 / pp.567-576 / 2006
In this paper, we apply multiblock dimension reduction methods to the classification of tumors based on microarray gene expression data. The procedure involves clustering selected genes, multiblock dimension reduction, and classification using linear discriminant analysis and quadratic discriminant analysis.
https://doi.org/10.5351/CKSS.2006.13.3.567

On EM Algorithm For Discrete Classification With Bahadur Model: Unknown Prior Case

Kim, Hea-Jung;Jung, Hun-Jo
- Journal of the Korean Statistical Society / v.23 no.1 / pp.63-78 / 1994
For discrimination with binary variables, reformulated full and first-order Bahadur models with incomplete observations are presented. This allows the prior probabilities associated with multiple populations to be estimated for the sample-based classification rule. The EM algorithm is adopted to provide the maximum likelihood estimates of the parameters of interest. Some experiences with the models are evaluated and discussed.
