• Title/Summary/Keyword: optimal classification method


A Comparative Study of Different Color Space for Paddy Disease Segmentation (벼 병충해분할을 위한 색채공간의 비교연구)

  • Zahangir, Alom Md.;Lee, Hyo-Jong
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.3 / pp.90-98 / 2011
  • The recognition and classification of paddy rice diseases are of major technical and economic importance to the agricultural industry worldwide. Computer vision techniques are used to diagnose rice diseases and to manage crops efficiently. Segmentation of lesions is the most important step in detecting paddy rice disease early and accurately. A new Gaussian Mean (GM) method is proposed to segment paddy rice diseases in various color spaces. Because different color spaces produce different segmentation results, this empirical study was conducted to determine which color space is best for segmenting rice disease. It covered five color spaces: NTSC, CIE, YCbCr, HSV, and normalized RGB (NRGB). The results showed that YCbCr was the best color space for segmenting the disease lesions, with an accuracy of 98.0%. Furthermore, the proposed method demonstrated that disease lesions of paddy rice can be segmented automatically and robustly.
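
The color-space comparison above depends on converting RGB pixels before thresholding. The following is a minimal sketch, assuming the full-range BT.601 (JFIF) form of YCbCr, which the abstract does not specify. The `segment_by_chroma` helper and its `k` parameter are hypothetical: it is a plain mean-plus-deviation threshold on the Cr channel used as a stand-in, not the paper's Gaussian Mean method.

```python
import statistics

def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 / JFIF coefficients, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def segment_by_chroma(pixels, k=1.0):
    """Hypothetical stand-in: flag pixels whose Cr exceeds mean + k * std."""
    cr_values = [rgb_to_ycbcr(*p)[2] for p in pixels]
    threshold = statistics.mean(cr_values) + k * statistics.pstdev(cr_values)
    return [cr > threshold for cr in cr_values]

# Nine green "leaf" pixels and one brownish "lesion" pixel (made-up values).
pixels = [(0, 128, 0)] * 9 + [(150, 75, 0)]
mask = segment_by_chroma(pixels)
```

Working on a chroma channel separates color from brightness, which is one plausible reason a luma/chroma space could segment lesions well under uneven lighting.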

Self-Sampling Versus Physicians' Sampling for Cervical Cancer Screening - Agreement of Cytological Diagnoses

  • Othman, Nor Hayati;Zaki, Fatma Hariati Mohamad;Hussain, Nik Hazlina Nik;Yusoff, Wan Zahanim Wan;Ismail, Pazuddin
    • Asian Pacific Journal of Cancer Prevention / v.17 no.7 / pp.3489-3494 / 2016
  • Background: A major problem with cervical cancer screening in countries that have no organized national screening program is sub-optimal participation. Implementing a self-sampling method may increase coverage. Objective: We determined the agreement of cytological diagnoses made on samples collected by women themselves (self-sampling) versus samples collected by physicians (physician sampling). Materials and Methods: We invited women volunteers to undergo two procedures: cervical self-sampling using the Evalyn brush and physician sampling using a Cervex brush. Before the procedure, the women were shown a video presentation on how to take their own cervical samples. The physician samples were taken as per routine testing (gold standard). All samples were subjected to Thin Prep monolayer smears. The diagnoses were made according to the Bethesda classification. The results from the two sampling methods were analysed and compared. Results: A total of 367 women aged 22 to 65 years were recruited into the study. There was significant good agreement between the cytological diagnoses made on the samples from the two sampling methods, with a Kappa value of 0.568 (p=0.040). Using the cytological smears taken by physicians as the gold standard, the sensitivity of self-sampling was 71.9% (95% CI: 70.9-72.8), the specificity was 86.6% (95% CI: 85.7-87.5), the positive predictive value was 74.2% (95% CI: 73.3-75.1) and the negative predictive value was 85.1% (95% CI: 84.2-86.0). Self-sampling smears (22.9%) allowed better detection of micro-organisms than physician-collected samples (18.5%). Conclusions: This study shows that samples taken by women themselves (self-sampling) and by physicians have good diagnostic agreement. Self-sampling could be the method of choice in countries where the coverage of women attending clinics for cervical cancer screening is poor.
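
The agreement and accuracy figures reported above all derive from a 2x2 table of self-sampling results against the physician-sampling reference. The following sketch shows the standard formulas for sensitivity, specificity, predictive values and Cohen's kappa. The counts are invented for illustration and are not the study's data, and the function name is hypothetical.

```python
def agreement_metrics(tp, fp, fn, tn):
    """Diagnostic metrics from a 2x2 table (test result vs. reference standard)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }

# Illustrative counts only (not taken from the paper).
metrics = agreement_metrics(tp=40, fp=10, fn=10, tn=40)
```

Kappa discounts the agreement expected by chance, which is why it is lower than raw percentage agreement for the same table.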

The Study on matrix based high performance pattern matching by independence partial match (독립 부분 매칭에 의한 행렬 기반 고성능 패턴 매칭 방법에 관한 연구)

  • Jung, Woo-Sug;Kwon, Taeck-Geun
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.9B / pp.914-922 / 2009
  • In this paper, we propose a matrix-based real-time pattern matching method, called MDPI, for real-time intrusion detection on multi-Gbps network traffic. In particular, to minimize the overhead caused by buffering, reordering, and reassembling when the incoming packet sequence is disrupted, MDPI adopts independent partial matching on the pattern matching matrix. In experiments where the average length of the Snort rule set was kept at 9 bytes, we achieved performance improvements of 61% and 50% with respect to TCAM method efficiency for w=4 bytes and w=8 bytes, respectively. Moreover, the pattern scan speed of MDPI was 10.941 Gbps and its hardware resource consumption was 5.79 LC/Char in pattern classification. This means that MDPI provides optimal performance relative to hardware complexity. Therefore, by reducing the hardware cost through increased TCAM memory efficiency, MDPI is shown to be a cost-effective, high-performance intrusion detection technique.

An Image Contrast Enhancement Technique Using the Improved Integrated Adaptive Fuzzy Clustering Model (개선된 IAFC 모델을 이용한 영상 대비 향상 기법)

  • 이금분;김용수
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.9 / pp.777-781 / 2001
  • This paper presents an image contrast enhancement technique for improving low-contrast images using the improved IAFC (Integrated Adaptive Fuzzy Clustering) model. The poor pictorial information of a low-contrast image is due to the vagueness or fuzziness of the multivalued brightness levels rather than to randomness. Fuzzy image processing has three main stages: image fuzzification, modification of membership values, and image defuzzification. A new model of automatic crossover point selection chooses the optimal crossover point automatically. Crossover point selection can be treated as a two-category classification problem, and the improved IAFC model is used to classify the image into two classes. The proposed method is applied to several experimental images with 256 gray levels, and the results are compared with those of the histogram equalization technique. We used the index of fuzziness as a measure of image quality. The results show that the proposed method is better than the histogram equalization technique.
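
The three stages named above, and the index of fuzziness used as a quality measure, can be sketched as follows. This is a rough illustration under stated assumptions: a linear membership function, the classical contrast-intensification operator with a fixed crossover point of 0.5, and the linear form of the index of fuzziness. The paper instead selects the crossover point automatically with the improved IAFC model, which is not reproduced here, and all function names are hypothetical.

```python
def fuzzify(gray, max_level=255):
    """Stage 1: map a gray level to a membership value in [0, 1] (linear, assumed)."""
    return gray / max_level

def intensify(mu):
    """Stage 2: push memberships away from the 0.5 crossover point."""
    return 2 * mu * mu if mu <= 0.5 else 1 - 2 * (1 - mu) ** 2

def defuzzify(mu, max_level=255):
    """Stage 3: map the modified membership back to a gray level."""
    return round(mu * max_level)

def index_of_fuzziness(memberships):
    """Linear index of fuzziness: 0 for a crisp image, 1 when every value is 0.5."""
    return 2 * sum(min(mu, 1 - mu) for mu in memberships) / len(memberships)

before = [0.25, 0.75]
after = [intensify(mu) for mu in before]
```

A lower index after enhancement indicates that pixel memberships have moved toward 0 or 1, that is, toward a higher-contrast image.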

Combined Application of Data Imbalance Reduction Techniques Using Genetic Algorithm (유전자 알고리즘을 활용한 데이터 불균형 해소 기법의 조합적 활용)

  • Jang, Young-Sik;Kim, Jong-Woo;Hur, Joon
    • Journal of Intelligence and Information Systems / v.14 no.3 / pp.133-154 / 2008
  • The data imbalance problem, which can be encountered in data mining classification tasks, typically means that one class has far more or far fewer instances than the other classes. A number of techniques have been proposed to solve it, based on re-sampling with replacement, adjusting decision thresholds, and adjusting the costs of the different classes. In this paper, we study the feasibility of combining these previously proposed techniques and suggest a combination method that uses a genetic algorithm to find the optimal combination ratio of the techniques. To improve the prediction accuracy of the minority class, we use the F-value of the minority class as the fitness function of the genetic algorithm when determining the combination ratio. To compare its performance with that of the single techniques and of a matrix-style combination with random percentages, we performed experiments on four public datasets that have been widely used to compare methods for the data imbalance problem. The experimental results demonstrate the usefulness of the proposed method.
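
The fitness function described above scores a candidate combination by the F-value of the minority class. The following sketch computes that score, assuming "F-value" means the balanced F1 measure, which the abstract does not state explicitly. The function name and the example labels are hypothetical.

```python
def minority_f_value(y_true, y_pred, minority_label):
    """F1 score of the minority class; returns 0.0 when it is undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority_label and p == minority_label)
    fp = sum(1 for t, p in pairs if t != minority_label and p == minority_label)
    fn = sum(1 for t, p in pairs if t == minority_label and p != minority_label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative labels: class 1 is the minority.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0]
fitness = minority_f_value(y_true, y_pred, minority_label=1)
```

Unlike overall accuracy, this score cannot be raised by simply predicting the majority class everywhere, which makes it a reasonable objective for a genetic algorithm searching over combination ratios.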

Optimal Ratio of Data Oversampling Based on a Genetic Algorithm for Overcoming Data Imbalance (데이터 불균형 해소를 위한 유전알고리즘 기반 최적의 오버샘플링 비율)

  • Shin, Seung-Soo;Cho, Hwi-Yeon;Kim, Yong-Hyuk
    • Journal of the Korea Convergence Society / v.12 no.1 / pp.49-55 / 2021
  • With recent developments in database technology, it has become possible to store large amounts of data generated in finance, security, and networks. These data are analyzed with classifiers based on machine learning, and the main problem in doing so is data imbalance. When a classifier is trained on imbalanced data, classification accuracy may degrade because of over-fitting to the majority class. To overcome data imbalance, oversampling strategies that increase the amount of minority class data are widely used. Oversampling requires tuning to find a method and parameters suited to the data distribution. To improve this process, we propose a strategy that uses genetic algorithms to explore and optimize the combination and ratio of oversampling methods such as the synthetic minority oversampling technique and generative adversarial networks. We apply the proposed strategy and single oversampling strategies to credit card fraud detection data, a representative case of data imbalance, and compare the performance of the classifiers trained on each resulting dataset. The strategy optimized by searching for the ratio of each method with genetic algorithms outperformed the previous strategies.
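
The oversampling ratio that the genetic algorithm tunes can be made concrete with a small sketch. This is a simplified, SMOTE-style interpolation written with the standard library only, not the paper's implementation. The function names, the `ratio` definition (minority size divided by majority size after oversampling) and the sample points are all assumptions, and the minority class must contain at least two points.

```python
import math
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points between minority points and near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(p, x))[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nb)))
    return synthetic

def oversample_to_ratio(majority, minority, ratio, seed=0):
    """Grow the minority class until len(minority) / len(majority) reaches ratio."""
    target = round(ratio * len(majority))
    n_new = max(0, target - len(minority))
    return list(minority) + smote_like(minority, n_new, seed=seed)

majority = [(float(i), 5.0) for i in range(10)]
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
balanced = oversample_to_ratio(majority, minority, ratio=0.8)
```

In a search like the one described, `ratio` (and the share produced by each oversampling method) would be the genes of a candidate solution, with classifier performance as the fitness.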

Leakage Detection Method in Water Pipe using Tree-based Boosting Algorithm (트리 기반 부스팅 알고리듬을 이용한 상수도관 누수 탐지 방법)

  • Jae-Heung Lee;Yunsung Oh;Junhyeok Min
    • Journal of Internet of Things and Convergence / v.10 no.2 / pp.17-23 / 2024
  • Losses in the domestic water supply due to leaks, caused for example by fractures and defects in pipelines, are very large. Measures to prevent water leakage are therefore necessary. We propose the development of a leakage detection sensor utilizing vibration sensors and present an optimal leakage detection algorithm leveraging artificial intelligence. Vibrational sound data acquired from water pipelines undergo a preprocessing stage using FFT (Fast Fourier Transform), followed by leakage classification using an optimized tree-based boosting algorithm. Applying this method to approximately 260,000 experimental data points from various real-world scenarios resulted in 97% accuracy, a 4% improvement over existing SVM (Support Vector Machine) methods. The processing speed also increased approximately 80 times, confirming its suitability for edge device applications.
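
The preprocessing stage above turns a vibration signal into frequency-domain features for the classifier. The following sketch computes a magnitude spectrum with a naive discrete Fourier transform so that it runs with the standard library alone. A real pipeline would use an FFT routine for speed, and the synthetic 32-sample signal is an assumption made only for illustration. The boosting classifier itself is not shown.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Magnitudes of DFT bins 0..n/2 for a real-valued signal (naive O(n^2) DFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        spectrum.append(abs(coeff))
    return spectrum

# Synthetic test signal: a sine wave completing 4 cycles in 32 samples.
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
spectrum = magnitude_spectrum(signal)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

Each spectrum value can serve as one input feature, so a tree-based model can split directly on the energy in a given frequency band.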

A Study on the Visiting Areas Classification of Cargo Vehicles Using Dynamic Clustering Method (화물차량의 방문시설 공간설정 방법론 연구)

  • Bum Chul Cho;Eun A Cho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.6 / pp.141-156 / 2023
  • This study aims to improve understanding of freight movement, crucial for logistics facility investment and policy making. It addresses the limitations of traditional freight truck traffic data, aggregated only at city and county levels, by developing a new methodology. This method uses trip chain data for more detailed, facility-level analysis of freight truck movements. It employs DTG (Digital Tachograph) data to identify individual truck visit locations and creates H3 system-based polygons to represent these visits spatially. The study also involves an algorithm to dynamically determine the optimal spatial resolution of these polygons. Tested nationally, the approach resulted in polygons with 81.26% spatial fit and 14.8% error rate, offering insights into freight characteristics and enabling clustering based on traffic chain characteristics of freight trucks and visited facility types.

Fuzzy discretization with spatial distribution of data and Its application to feature selection (데이터의 공간적 분포를 고려한 퍼지 이산화와 특징선택에의 응용)

  • Son, Chang-Sik;Shin, A-Mi;Lee, In-Hee;Park, Hee-Joon;Park, Hyoung-Seob;Kim, Yoon-Nyun
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.2 / pp.165-172 / 2010
  • In clinical data mining, choosing the optimal subset of features is important, not only to reduce computational complexity but also to improve the usefulness of the model constructed from the given data. Moreover, the threshold values (i.e., cut-off points) of the selected features are used in the clinical decision criteria of experts for the differential diagnosis of diseases. In this paper, we propose a fuzzy discretization approach, based on the spatial distribution of data with continuous attributes, which is evaluated by measuring the degree of separation of redundant attribute values in the overlapping region. The weighted average of the redundant attribute values is then used to determine the threshold value for each feature, and rough set theory is used to select a subset of relevant features from the full feature set. To verify the validity of the proposed method, we compared experimental results on a classification problem involving 668 patients with a chief complaint of dyspnea, using three discretization methods (i.e., equal-width, equal-frequency, and entropy-based) and the proposed discretization method. The experimental results confirm that discretization methods with a fuzzy partition give better results than those with a hard partition on two evaluation measures, average classification accuracy and G-mean.
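
Two of the baseline discretization methods and the G-mean measure mentioned above are simple enough to sketch. The code below shows equal-width and equal-frequency cut points and assumes the usual binary definition of G-mean, the square root of sensitivity times specificity, which the abstract does not spell out. Function names and sample values are hypothetical, and the proposed fuzzy discretization is not reproduced.

```python
import math

def equal_width_cuts(values, bins):
    """Cut points that split the value range into bins of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(1, bins)]

def equal_frequency_cuts(values, bins):
    """Cut points that put roughly the same number of samples in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // bins] for i in range(1, bins)]

def g_mean(sensitivity, specificity):
    """Geometric mean of the per-class recalls for a binary problem."""
    return math.sqrt(sensitivity * specificity)

# Illustrative measurements with one outlier.
values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 100]
width_cuts = equal_width_cuts(values, 2)
freq_cuts = equal_frequency_cuts(values, 2)
```

The outlier drags the equal-width cut point far from most of the data, while the equal-frequency cut stays near the median; hard partitions of either kind still assign values near a cut point to exactly one interval, which is the limitation a fuzzy partition relaxes.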

A Study about Learning Graph Representation on Farmhouse Apple Quality Images with Graph Transformer (그래프 트랜스포머 기반 농가 사과 품질 이미지의 그래프 표현 학습 연구)

  • Ji Hun Bae;Ju Hwan Lee;Gwang Hyun Yu;Gyeong Ju Kwon;Jin Young Kim
    • Smart Media Journal / v.12 no.1 / pp.9-16 / 2023
  • Recently, convolutional neural network (CNN) based systems have been developed to overcome the limits of human resources in farmhouse apple quality classification. However, since convolutional neural networks accept only images of the same size, preprocessing such as sampling may be required, and oversampling causes loss of original image information, such as image quality degradation and blurring. To minimize this problem, in this paper we generate an image-patch-based graph of the original image and propose a random-walk-based positional encoding method for applying the graph transformer model. The method uses the random walk algorithm to continuously learn position embedding information for patches that have no positional information, and finds the optimal graph structure by aggregating useful node information through the self-attention technique of the graph transformer model. It is therefore robust and performs well even on a new graph structure with a random node order and on an arbitrary graph structure that depends on the location of an object in an image. In experiments on 5 apple quality datasets, the learning accuracy was higher than that of other GNN models by a minimum of 1.3% and a maximum of 4.7%, and the number of parameters was 3.59M, only about 15% of the 23.52M of the ResNet18 model. The reduced amount of computation therefore yields fast inference, demonstrating the effectiveness of the method.
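
A common way to give graph nodes positional information from random walks is to record, for each node, the probability that a walk returns to it after 1, 2, ..., k steps. The sketch below computes that fixed encoding for a small graph. It is an assumption about the general technique, not the paper's method, which learns the position embedding during training. The function names and the three-node path graph are illustrative only.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def random_walk_encoding(adj, k):
    """Per-node return probabilities of a random walk after 1..k steps."""
    walk = []
    for row in adj:
        degree = sum(row)
        # Row-normalise the adjacency matrix; isolated nodes keep a zero row.
        walk.append([v / degree if degree else 0.0 for v in row])
    power = walk
    encoding = [[] for _ in adj]
    for _ in range(k):
        for i in range(len(adj)):
            encoding[i].append(power[i][i])
        power = matmul(power, walk)
    return encoding

# Path graph 0 - 1 - 2, standing in for three neighbouring image patches.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
encoding = random_walk_encoding(adj, k=2)
```

Because the encoding depends only on graph structure, it is unchanged when the nodes are listed in a different order, which matches the robustness to random node order described above.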