• Title/Summary/Keyword: Combining Data

A Design of an Optimized Classifier based on Feature Elimination for Gene Selection (유전자 선택을 위해 속성 삭제에 기반을 둔 최적화된 분류기 설계)

  • Lee, Byung-Kwan;Park, Seok-Gyu;Tifani, Yusrina
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology, v.8 no.5, pp.384-393, 2015
  • This paper proposes an optimized classifier based on feature elimination (OCFE) for gene selection that combines two feature elimination methods, ReliefF and SVM-RFE. ReliefF is a filter feature selection method that ranks the features by their importance. SVM-RFE is a wrapper feature selection method that ranks the features by their weights. Combining the two methods yields a lower average error rate: 0.3016138 for OCFE versus 0.3096779 for SVM-RFE. The proposed method also achieves better accuracy: 70% for OCFE versus 69% for SVM-RFE.

Face Recognition Based on PCA and LDA Combining Clustering (Clustering을 결합한 PCA와 LDA 기반 얼굴 인식)

  • Guo, Lian-Hua;Kim, Pyo-Jae;Chang, Hyung-Jin;Choi, Jin-Young
    • Proceedings of the IEEK Conference, 2006.06a, pp.387-388, 2006
  • In this paper, we propose an efficient algorithm based on PCA and LDA combined with K-means clustering, which achieves better face recognition accuracy than Eigenface and Fisherface. In this algorithm, PCA is first used to reduce the dimensionality of the original face images. Second, the truncated face image data are sub-clustered by K-means clustering based on Euclidean distance, and all small subclusters are labeled in sequence. Then LDA projects the data into a low-dimensional feature space and groups the data so that they are easier to classify. Finally, we use the nearest neighbor method to determine the label of the test data. To show the recognition accuracy of the proposed algorithm, we performed several simulations using the Yale and ORL (Olivetti Research Laboratory) databases. Simulation results show that the proposed method achieves better recognition accuracy.

Robust Most Significant Periods of Developments In Time Dominated Data

  • Aboukalam, F.
    • International Journal of Reliability and Applications, v.7 no.2, pp.101-110, 2006
  • Let E be a set of n quantitative observations recorded over time. The time interval is to be split into several subintervals such that the observations within each subinterval are similar, whereas the observations in different subintervals are very dissimilar. The corresponding time subintervals become the periods or phases of development present in the underlying phenomenon. Aboukalam (2005) proposes a robust solution based on some initial subintervals and a technique for combining any two successive groups in that starting partition using a t-test at a fixed significance level ($\alpha$). The drawback is that the reliability of the technique depends on the level $\alpha$, which cannot be set independently of the number of periods, which is itself unknown. To avoid this, we propose the so-called most significant periods solution. The new technique constructs its own initial subintervals and uses another way of combining the groups. However, the way of determining and treating outliers has not changed. This paper conducts many empirical simulations using different possible time dominated data in order to illustrate the reliability of the proposed technique. Finally, we apply both techniques to some real time dominated data to show the advantage of the proposal.
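
The idea of combining successive groups with a t-test can be illustrated with a minimal Python sketch. It is not the procedure from the paper: it assumes fixed-size initial subintervals, uses Welch's t statistic, and replaces the critical value at level $\alpha$ with a hypothetical fixed cut-off `t_crit`. The function names and the sample series are invented for illustration, and outlier handling is omitted.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t statistic for two samples (each needs >= 2 points)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 0.0 if mean(a) == mean(b) else float("inf")
    return abs(mean(a) - mean(b)) / se


def merge_successive_groups(groups, t_crit=2.0):
    """Greedily merge neighbouring groups whose means are not clearly different.

    t_crit is a hypothetical fixed cut-off standing in for the critical
    value at level alpha; a real test would consult the t distribution.
    """
    merged = [list(groups[0])]
    for g in groups[1:]:
        if welch_t(merged[-1], g) < t_crit:
            merged[-1].extend(g)    # same phase: combine with the previous group
        else:
            merged.append(list(g))  # a new phase starts here
    return merged


# Invented series with one level shift in the middle.
series = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 5.1, 4.9, 5.2, 5.0, 4.8, 5.3]
initial = [series[i:i + 3] for i in range(0, len(series), 3)]
periods = merge_successive_groups(initial)
print([len(p) for p in periods])
```

The sketch also shows the dependence the abstract criticises: the number of periods found changes with the chosen cut-off.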

Development of lidar detection system for improvement of measurement range (Combined photon counting detection and analog-to-digital signal) (라이다 측정 거리 향상을 위한 통합 수신 시스템 개발 (아날로그방식과 광자계수방식 신호 접합))

  • Shin, Dong Ho;Noh, Young Min;Shin, Sung Kyun;Kim, Young J.
    • Korean Journal of Remote Sensing, v.30 no.2, pp.251-258, 2014
  • We upgraded a lidar detection system to use a novel method for combining analog-to-digital converter and photon-counting measurements of the backscattered photon signal. We improved the standard combining method for determining the conversion factors between the analog-to-digital converter data and the dead-time-corrected photon-counting data. The combining method and dead-time correction method presented here have been successfully applied to experimental data obtained in Gwangju, Korea.
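
A minimal Python sketch of how analog and photon-counting channels are commonly combined follows. It is an assumption-laden illustration, not the method of the paper: it assumes a non-paralyzable dead-time model, a linear conversion between the two channels fitted by least squares in an overlap region, and hypothetical overlap thresholds `lo_mhz` and `hi_mhz`. All function names and the synthetic data are invented.

```python
def dead_time_correct(rate_mhz, dead_time_us):
    """Non-paralyzable model: N_true = N_obs / (1 - N_obs * tau)."""
    return rate_mhz / (1.0 - rate_mhz * dead_time_us)


def fit_conversion(analog, photon):
    """Least-squares fit of photon ~= a * analog + b (the conversion factors)."""
    n = len(analog)
    mx, my = sum(analog) / n, sum(photon) / n
    sxx = sum((x - mx) ** 2 for x in analog)
    sxy = sum((x - mx) * (y - my) for x, y in zip(analog, photon))
    a = sxy / sxx
    return a, my - a * mx


def glue(analog, photon_obs, dead_time_us, lo_mhz, hi_mhz):
    """Combine both channels into one profile in photon-counting units."""
    photon = [dead_time_correct(p, dead_time_us) for p in photon_obs]
    # Fit only where both channels are assumed reliable (hypothetical thresholds).
    overlap = [(x, y) for x, y in zip(analog, photon) if lo_mhz <= y <= hi_mhz]
    a, b = fit_conversion([x for x, _ in overlap], [y for _, y in overlap])
    # Strong signal: converted analog data. Weak signal: photon counting.
    return [a * x + b if y > hi_mhz else y for x, y in zip(analog, photon)]


# Synthetic check: true rate is 10 MHz per analog unit, dead time 4 ns.
tau = 0.004  # microseconds
analog = [0.05, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
true_rate = [10.0 * x for x in analog]
observed = [t / (1.0 + t * tau) for t in true_rate]  # counts lost to dead time
glued = glue(analog, observed, tau, lo_mhz=0.8, hi_mhz=25.0)
print(glued)
```

Rates are in MHz and the dead time in microseconds, so their product is dimensionless.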

Quality Improvement of Greenhouse Gas Inventories by the Use of Bottom-Up Data (상향식 자료를 이용한 온실가스 인벤토리의 품질 개선 방향 - 화학, 금속 분야를 중심으로 -)

  • Choi, Eunhwa;Shin, Eunseop;Yi, Seung-Muk
    • Journal of Korean Society for Atmospheric Environment, v.30 no.2, pp.161-174, 2014
  • The methodology report '2006 IPCC Guidelines for National Greenhouse Gas Inventories' indicates that a higher-tier method, which uses country-specific or plant-specific data, can be good practice when calculating greenhouse gas emissions by country. We review the methodology report to present the principles of using plant-level data and examine examples of using plant-level data in the chemical and metal industries of 20 countries, with the aim of improving the quality of national greenhouse gas inventories. We propose that Korea consider using plant-level data, as reported under the 'Greenhouse gas and Energy Target Management Scheme', in the following order of preference. First, the data can be used for quality control of Korea's own parameters when the Tier 2 method is adopted and a bottom-up approach is not applicable. Second, plant-level data and IPCC default data can be used together, combining the Tier 1 method with the Tier 3 method. Third, acquired plant-level data and country-specific parameters can be used together, combining the Tier 2 method with the Tier 3 method. Fourth, if the plant-level data cover all categories of emissions and are proven to be representative, the Tier 3 method can be applied. In this case, the data still need to be checked for reliability within a consistent framework that includes appropriate quality control.

The Efficiency of Boosting on SVM

  • Seok, Kyung-Ha;Ryu, Tae-Wook
    • Journal of the Korean Data and Information Science Society, v.13 no.2, pp.55-64, 2002
  • In this paper, we introduce the support vector machine (SVM), developed to solve the generalization problem of neural networks. We also introduce the boosting algorithm, a general method for improving the accuracy of a given learning algorithm. We propose a new algorithm combining SVM and boosting to solve classification problems. In experiments with real and simulated data sets, the proposed algorithm achieves better performance.

Continuous Query Modelling for Various Kinds of Monitoring Services for Stream Data (다양한 응용의 스트림 데이터 모니터링을 위한 연속질의 모델링)

  • Cho, Dae-Soo
    • Journal of the Korea Institute of Information and Communication Engineering, v.15 no.7, pp.1525-1530, 2011
  • Techniques for processing continuous queries are required to develop various types of application services (monitoring services) in ubiquitous environments, where real-time acquisition, analysis, and processing of data from many sensors are required. Previous work on continuous queries represented all continuous queries as interval queries or region queries and proposed methods for processing these queries. The types of continuous queries, however, are highly varied and can be expressed by combining attribute conditions, spatial conditions, and temporal conditions. In this paper, I classify the types of continuous queries and propose a continuous query model that can be expressed by combining those conditions. The contributions of this paper are that it proposes a query model representing continuous queries and suggests future research directions.
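
A query expressed as a combination of attribute, spatial, and temporal conditions might be sketched in Python as below. This is only an illustration of the idea in the abstract, not the model defined in the paper. The class names, field names, rectangular region, and closed time window are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    value: float
    x: float
    y: float
    t: float  # timestamp in seconds


@dataclass(frozen=True)
class ContinuousQuery:
    """Hypothetical query model: any condition left as None is not applied."""
    attribute: Optional[Callable[[float], bool]] = None         # test on the value
    region: Optional[Tuple[float, float, float, float]] = None  # xmin, ymin, xmax, ymax
    window: Optional[Tuple[float, float]] = None                # t_start, t_end

    def matches(self, r: Reading) -> bool:
        if self.attribute is not None and not self.attribute(r.value):
            return False
        if self.region is not None:
            xmin, ymin, xmax, ymax = self.region
            if not (xmin <= r.x <= xmax and ymin <= r.y <= ymax):
                return False
        if self.window is not None:
            t0, t1 = self.window
            if not (t0 <= r.t <= t1):
                return False
        return True


# Invented stream of sensor readings.
stream = [
    Reading("s1", 35.0, 5.0, 5.0, 10.0),
    Reading("s2", 35.0, 20.0, 5.0, 10.0),  # outside the region
    Reading("s3", 25.0, 5.0, 5.0, 10.0),   # fails the attribute condition
    Reading("s4", 40.0, 1.0, 1.0, 120.0),  # outside the time window
]
hot = ContinuousQuery(attribute=lambda v: v > 30.0,
                      region=(0.0, 0.0, 10.0, 10.0),
                      window=(0.0, 60.0))
region_only = ContinuousQuery(region=(0.0, 0.0, 10.0, 10.0))
print([r.sensor_id for r in stream if hot.matches(r)])
print([r.sensor_id for r in stream if region_only.matches(r)])
```

In this sketch a pure region query or a pure interval query is the special case where the other conditions are left unset.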

Combining Machine Learning Techniques with Terrestrial Laser Scanning for Automatic Building Material Recognition

  • Yuan, Liang;Guo, Jingjing;Wang, Qian
    • International conference on construction engineering and project management, 2020.12a, pp.361-370, 2020
  • Automatic building material recognition has been a popular research interest over the past decade because it is useful for construction management and facility management. Currently, the extensively used methods for automatic material recognition are mainly based on 2D images. A terrestrial laser scanner (TLS) with a built-in camera can generate a set of coloured laser scan data that contains not only the visual features of building materials but also other attributes such as material reflectance and surface roughness. With more characteristics provided, laser scan data have the potential to improve the accuracy of building material recognition. Therefore, this research aims to develop a TLS-based building material recognition method by combining machine learning techniques. The developed method uses material reflectance, HSV colour values, and surface roughness as the features for material recognition. A database containing the laser scan data of common building materials was created and used for model training and validation with machine learning techniques. Different machine learning algorithms were compared, and the best algorithm showed an average recognition accuracy of 96.5%, which demonstrated the feasibility of the developed method.

Analysis and Compression of Spun-yarn Density Profiles using Adaptive Wavelets

  • Kim, Joo-Yong
    • Textile Coloration and Finishing, v.18 no.5 s.90, pp.88-93, 2006
  • A data compression system has been developed by combining adaptive wavelets and an optimization technique. The adaptive wavelets were made by optimizing the coefficients of the wavelet matrix. The optimization was performed with the criterion of minimizing the reconstruction error. The resulting adaptive basis outperformed conventional bases such as Daubechies-5 by 5-10%. It was also shown that the yarn density profiles could be compressed by over 95% without a significant loss of information.

Integrated System of On-Off Line in Agricultural Products Electronic Commerce Based on Data Mining (데이터 마이닝을 이용한 농산물 전자상거래의 온 오프라인 통합시스템)

  • 주종문;황승국
    • Journal of Korean Society of Industrial and Systems Engineering, v.25 no.3, pp.58-63, 2002
  • The Internet, as a commercial tool, has created a new market that connects producers with consumers through e-commerce. E-commerce has now spread through almost all industries, with a few exceptions. This research identifies the reasons why e-commerce has not taken hold in the agricultural industry, which is less developed than other industries, and presents a good example of e-commerce for agricultural products that combines online and offline markets. In addition, a data-mining technique is suggested for analyzing all the information in the system.